mirror of https://github.com/mtayfur/openwebui-memory-system.git
synced 2026-01-22 06:51:01 +01:00

Update README.md for improved clarity and accuracy; revise privacy notice, cache descriptions, and model support details.

README.md (14 changed lines)
@@ -2,6 +2,10 @@
 A long-term memory system that learns from conversations and personalizes responses without requiring external APIs or tokens.
 
+## Important Notice
+
+**Privacy Consideration:** This system shares user messages and stored memories with your configured LLM for memory consolidation and retrieval operations. All data is processed through Open WebUI's built-in models using your existing configuration. No data is sent to external services beyond what your LLM provider configuration already allows.
+
 ## Core Features
 
 **Zero External Dependencies**
@@ -21,7 +25,7 @@ Avoids wasting resources on irrelevant messages through two-stage detection:
 Categories automatically skipped: technical discussions, formatting requests, calculations, translation tasks, proofreading, and non-personal queries.
 
 **Multi-Layer Caching**
 
-Three specialized caches (embeddings, retrieval results, memory lookups) with LRU eviction keep responses fast while managing memory efficiently. Each user gets isolated cache storage.
+Three specialized caches (embeddings, retrieval, memory) with LRU eviction keep responses fast while managing memory efficiently. Each user gets isolated cache storage.
 
 **Real-Time Status Updates**
 
 Emits progress messages during operations: memory retrieval progress, consolidation status, operation summaries — keeping users informed without overwhelming them.
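The `+` line above trims the cache names; as a reference point, here is a minimal Python sketch of the multi-layer LRU pattern the paragraph describes. Class names and cache sizes are illustrative assumptions, not the repository's actual code.

```python
from collections import OrderedDict


class LRUCache:
    """Small LRU cache: evicts the least-recently-used entry when full."""

    def __init__(self, max_size: int = 256):  # size is an assumed default
        self.max_size = max_size
        self._data: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the oldest entry


class UserCaches:
    """Per-user isolation: each user id maps to its own three caches."""

    def __init__(self):
        self._users: dict[str, dict[str, LRUCache]] = {}

    def for_user(self, user_id: str) -> dict[str, LRUCache]:
        if user_id not in self._users:
            self._users[user_id] = {
                "embeddings": LRUCache(),
                "retrieval": LRUCache(),
                "memory": LRUCache(),
            }
        return self._users[user_id]
```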
@@ -32,7 +36,7 @@ All prompts and logic work language-agnostically. Stores memories in English but
 ## Model Support
 
 **LLM Support**
 
-Tested with Gemini 2.5 Flash Lite, GPT-4o-mini, Qwen2.5-Instruct, and Mistral-Small. Should work with any model that supports structured outputs.
+Tested with gemini-2.5-flash-lite, gpt-5-nano, and qwen3-instruct. Should work with any model that supports structured outputs.
 
 **Embedding Model Support**
 
 Uses OpenWebUI's configured embedding model (supports Ollama, OpenAI, Azure OpenAI, and local sentence-transformers). Configure embedding models through OpenWebUI's RAG settings. The memory system automatically uses whatever embedding backend you've configured in OpenWebUI.
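The retrieval behavior described here, embedding similarity with a cutoff, can be illustrated with a short sketch. It assumes a local sentence-transformers backend and the model name `all-MiniLM-L6-v2`, both chosen purely for illustration; the actual system uses whatever embedding backend OpenWebUI is configured with. The `threshold=0.5` and `limit=10` defaults mirror the valves listed in the next hunk.

```python
from sentence_transformers import SentenceTransformer, util

# Assumed local embedding model; the real backend comes from OpenWebUI's RAG settings.
model = SentenceTransformer("all-MiniLM-L6-v2")


def retrieve(
    query: str,
    memories: list[str],
    threshold: float = 0.5,
    limit: int = 10,
) -> list[tuple[str, float]]:
    """Return up to `limit` memories whose cosine similarity clears `threshold`."""
    query_emb = model.encode(query, convert_to_tensor=True)
    memory_embs = model.encode(memories, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, memory_embs)[0]  # one score per memory
    ranked = sorted(zip(memories, scores.tolist()), key=lambda p: p[1], reverse=True)
    return [(m, s) for m, s in ranked if s >= threshold][:limit]
```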
@@ -54,11 +58,13 @@ Uses OpenWebUI's configured embedding model (supports Ollama, OpenAI, Azure Open
 ## Configuration
 
 Customize behavior through valves:
 
-- **model**: LLM for consolidation and reranking (default: `gemini-2.5-flash-lite`)
+- **model**: LLM for consolidation and reranking (default: `google/gemini-2.5-flash-lite`)
+- **max_message_chars**: Maximum message length before skipping operations (default: 2500)
 - **max_memories_returned**: Context injection limit (default: 10)
 - **semantic_retrieval_threshold**: Minimum similarity score (default: 0.5)
+- **relaxed_semantic_threshold_multiplier**: Adjusts threshold for consolidation (default: 0.9)
 - **enable_llm_reranking**: Toggle smart reranking (default: true)
-- **llm_reranking_trigger_multiplier**: When to activate LLM (default: 0.5 = 50%)
+- **llm_reranking_trigger_multiplier**: When to activate LLM reranking (default: 0.5 = 50%)
 
 ## Performance Optimizations
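Open WebUI functions conventionally expose their settings as a Pydantic `Valves` model. A sketch of how the valve list above might be declared that way, with defaults taken from the README's new side and everything else assumed:

```python
from pydantic import BaseModel, Field


class Valves(BaseModel):
    """Illustrative declaration only; field names and defaults mirror the README."""

    model: str = Field(
        default="google/gemini-2.5-flash-lite",
        description="LLM for consolidation and reranking",
    )
    max_message_chars: int = Field(
        default=2500, description="Maximum message length before skipping operations"
    )
    max_memories_returned: int = Field(
        default=10, description="Context injection limit"
    )
    semantic_retrieval_threshold: float = Field(
        default=0.5, description="Minimum similarity score"
    )
    relaxed_semantic_threshold_multiplier: float = Field(
        default=0.9, description="Adjusts threshold for consolidation"
    )
    enable_llm_reranking: bool = Field(
        default=True, description="Toggle smart reranking"
    )
    llm_reranking_trigger_multiplier: float = Field(
        default=0.5, description="When to activate LLM reranking (0.5 = 50%)"
    )
```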