mirror of
https://github.com/mtayfur/openwebui-memory-system.git
synced 2026-01-22 06:51:01 +01:00
Update README.md for improved clarity and accuracy; revise privacy notice, cache descriptions, and model support details.
14
README.md
@@ -2,6 +2,10 @@
 
 A long-term memory system that learns from conversations and personalizes responses without requiring external APIs or tokens.
 
+## Important Notice
+
+**Privacy Consideration:** This system shares user messages and stored memories with your configured LLM for memory consolidation and retrieval operations. All data is processed through Open WebUI's built-in models using your existing configuration. No data is sent to external services beyond what your LLM provider configuration already allows.
+
 ## Core Features
 
 **Zero External Dependencies**
@@ -21,7 +25,7 @@ Avoids wasting resources on irrelevant messages through two-stage detection:
 Categories automatically skipped: technical discussions, formatting requests, calculations, translation tasks, proofreading, and non-personal queries.
 
 **Multi-Layer Caching**
-Three specialized caches (embeddings, retrieval results, memory lookups) with LRU eviction keep responses fast while managing memory efficiently. Each user gets isolated cache storage.
+Three specialized caches (embeddings, retrieval, memory) with LRU eviction keep responses fast while managing memory efficiently. Each user gets isolated cache storage.
 
 **Real-Time Status Updates**
 Emits progress messages during operations: memory retrieval progress, consolidation status, operation summaries — keeping users informed without overwhelming them.
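Editor's sketch (not part of the commit): the Multi-Layer Caching paragraph above describes three per-user caches with LRU eviction. The following is a minimal illustration of that idea, assuming a plain `OrderedDict`-based cache. The class names `LRUCache` and `UserCaches`, the `for_user` method, and the size limit of 128 are hypothetical and are not taken from the repository.

```python
from collections import OrderedDict
from typing import Any, Dict, Hashable, Optional


class LRUCache:
    """Small least-recently-used cache (hypothetical helper, not the project's class)."""

    def __init__(self, max_size: int) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.max_size:
            self._items.popitem(last=False)  # evict the least recently used entry

    def __len__(self) -> int:
        return len(self._items)


class UserCaches:
    """Keeps one set of the three named caches per user, so users never share entries."""

    CACHE_NAMES = ("embeddings", "retrieval", "memory")

    def __init__(self, max_size_per_cache: int = 128) -> None:  # 128 is an assumed limit
        self.max_size_per_cache = max_size_per_cache
        self._by_user: Dict[str, Dict[str, LRUCache]] = {}

    def for_user(self, user_id: str, cache_name: str) -> LRUCache:
        if cache_name not in self.CACHE_NAMES:
            raise KeyError(f"unknown cache: {cache_name}")
        if user_id not in self._by_user:
            self._by_user[user_id] = {
                name: LRUCache(self.max_size_per_cache) for name in self.CACHE_NAMES
            }
        return self._by_user[user_id][cache_name]
```

Keying the outer dictionary by user ID is one simple way to get the isolation the README mentions; the actual project may organize its caches differently.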
@@ -32,7 +36,7 @@ All prompts and logic work language-agnostically. Stores memories in English but
 ## Model Support
 
 **LLM Support**
-Tested with Gemini 2.5 Flash Lite, GPT-4o-mini, Qwen2.5-Instruct, and Mistral-Small. Should work with any model that supports structured outputs.
+Tested with gemini-2.5-flash-lite, gpt-5-nano, and qwen3-instruct. Should work with any model that supports structured outputs.
 
 **Embedding Model Support**
 Uses OpenWebUI's configured embedding model (supports Ollama, OpenAI, Azure OpenAI, and local sentence-transformers). Configure embedding models through OpenWebUI's RAG settings. The memory system automatically uses whatever embedding backend you've configured in OpenWebUI.
@@ -54,11 +58,13 @@ Uses OpenWebUI's configured embedding model (supports Ollama, OpenAI, Azure Open
 ## Configuration
 
 Customize behavior through valves:
-- **model**: LLM for consolidation and reranking (default: `gemini-2.5-flash-lite`)
+- **model**: LLM for consolidation and reranking (default: `google/gemini-2.5-flash-lite`)
 - **max_message_chars**: Maximum message length before skipping operations (default: 2500)
 - **max_memories_returned**: Context injection limit (default: 10)
 - **semantic_retrieval_threshold**: Minimum similarity score (default: 0.5)
 - **relaxed_semantic_threshold_multiplier**: Adjusts threshold for consolidation (default: 0.9)
 - **enable_llm_reranking**: Toggle smart reranking (default: true)
-- **llm_reranking_trigger_multiplier**: When to activate LLM (default: 0.5 = 50%)
+- **llm_reranking_trigger_multiplier**: When to activate LLM reranking (default: 0.5 = 50%)
 
 ## Performance Optimizations
 
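Editor's sketch (not part of the commit): the valve list above gives defaults but not how the two multipliers combine with the other settings. The example below shows one plausible reading, under two labeled assumptions: that the relaxed threshold is `semantic_retrieval_threshold` times `relaxed_semantic_threshold_multiplier`, and that LLM reranking starts once the candidate count exceeds `max_memories_returned` times `llm_reranking_trigger_multiplier`. The field names and defaults come from the README; the helper functions and the use of a dataclass instead of the plugin's actual valve class are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Valves:
    """Defaults copied from the README's valve list (post-commit values)."""

    model: str = "google/gemini-2.5-flash-lite"
    max_message_chars: int = 2500
    max_memories_returned: int = 10
    semantic_retrieval_threshold: float = 0.5
    relaxed_semantic_threshold_multiplier: float = 0.9
    enable_llm_reranking: bool = True
    llm_reranking_trigger_multiplier: float = 0.5


def relaxed_threshold(valves: Valves) -> float:
    # Assumption: the consolidation threshold is the base threshold scaled down.
    return (
        valves.semantic_retrieval_threshold
        * valves.relaxed_semantic_threshold_multiplier
    )


def reranking_trigger_count(valves: Valves) -> int:
    # Assumption: the multiplier is a fraction of the context injection limit.
    return math.ceil(
        valves.max_memories_returned * valves.llm_reranking_trigger_multiplier
    )


def should_rerank(valves: Valves, candidate_count: int) -> bool:
    # Assumption: reranking runs only when enabled and there are more
    # candidates than the trigger count.
    return valves.enable_llm_reranking and candidate_count > reranking_trigger_count(valves)


def should_skip_message(valves: Valves, message: str) -> bool:
    # Messages longer than max_message_chars are skipped, per the README.
    return len(message) > valves.max_message_chars
```

With the defaults, this reading gives a relaxed threshold of 0.45 and a reranking trigger of 5 candidates; check the plugin source before relying on either number.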