mirror of https://github.com/mtayfur/openwebui-memory-system.git
synced 2026-01-22 06:51:01 +01:00

Update README.md for improved clarity and accuracy; revise privacy notice, cache descriptions, and model support details.

README.md (14 changed lines)
@@ -2,6 +2,10 @@
 A long-term memory system that learns from conversations and personalizes responses without requiring external APIs or tokens.
 
+## Important Notice
+
+**Privacy Consideration:** This system shares user messages and stored memories with your configured LLM for memory consolidation and retrieval operations. All data is processed through Open WebUI's built-in models using your existing configuration. No data is sent to external services beyond what your LLM provider configuration already allows.
+
 ## Core Features
 
 **Zero External Dependencies**
@@ -21,7 +25,7 @@ Avoids wasting resources on irrelevant messages through two-stage detection:
 Categories automatically skipped: technical discussions, formatting requests, calculations, translation tasks, proofreading, and non-personal queries.
 
 **Multi-Layer Caching**
 
-Three specialized caches (embeddings, retrieval results, memory lookups) with LRU eviction keep responses fast while managing memory efficiently. Each user gets isolated cache storage.
+Three specialized caches (embeddings, retrieval, memory) with LRU eviction keep responses fast while managing memory efficiently. Each user gets isolated cache storage.
 
 **Real-Time Status Updates**
 
 Emits progress messages during operations: memory retrieval progress, consolidation status, operation summaries — keeping users informed without overwhelming them.
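The `+` line above trims the cache names; as a reference point, here is a minimal Python sketch of the multi-layer LRU pattern the paragraph describes. Class names and cache sizes are illustrative assumptions, not the repository's actual code.

```python
from collections import OrderedDict


class LRUCache:
    """Small LRU cache: evicts the least-recently-used entry when full."""

    def __init__(self, max_size: int = 256):  # size is an assumed default
        self.max_size = max_size
        self._data: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the oldest entry


class UserCaches:
    """Per-user isolation: each user id maps to its own three caches."""

    def __init__(self):
        self._users: dict[str, dict[str, LRUCache]] = {}

    def for_user(self, user_id: str) -> dict[str, LRUCache]:
        if user_id not in self._users:
            self._users[user_id] = {
                "embeddings": LRUCache(),
                "retrieval": LRUCache(),
                "memory": LRUCache(),
            }
        return self._users[user_id]
```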
@@ -32,7 +36,7 @@ All prompts and logic work language-agnostically. Stores memories in English but
 ## Model Support
 
 **LLM Support**
 
-Tested with Gemini 2.5 Flash Lite, GPT-4o-mini, Qwen2.5-Instruct, and Mistral-Small. Should work with any model that supports structured outputs.
+Tested with gemini-2.5-flash-lite, gpt-5-nano, and qwen3-instruct. Should work with any model that supports structured outputs.
 
 **Embedding Model Support**
 
 Uses OpenWebUI's configured embedding model (supports Ollama, OpenAI, Azure OpenAI, and local sentence-transformers). Configure embedding models through OpenWebUI's RAG settings. The memory system automatically uses whatever embedding backend you've configured in OpenWebUI.
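The retrieval behavior described here, embedding similarity with a cutoff, can be illustrated with a short sketch. It assumes a local sentence-transformers backend and the model name `all-MiniLM-L6-v2`, both chosen purely for illustration; the actual system uses whatever embedding backend OpenWebUI is configured with. The `threshold=0.5` and `limit=10` defaults mirror the valves listed in the next hunk.

```python
from sentence_transformers import SentenceTransformer, util

# Assumed local embedding model; the real backend comes from OpenWebUI's RAG settings.
model = SentenceTransformer("all-MiniLM-L6-v2")


def retrieve(
    query: str,
    memories: list[str],
    threshold: float = 0.5,
    limit: int = 10,
) -> list[tuple[str, float]]:
    """Return up to `limit` memories whose cosine similarity clears `threshold`."""
    query_emb = model.encode(query, convert_to_tensor=True)
    memory_embs = model.encode(memories, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, memory_embs)[0]  # one score per memory
    ranked = sorted(zip(memories, scores.tolist()), key=lambda p: p[1], reverse=True)
    return [(m, s) for m, s in ranked if s >= threshold][:limit]
```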
@@ -54,11 +58,13 @@ Uses OpenWebUI's configured embedding model (supports Ollama, OpenAI, Azure Open
 ## Configuration
 
 Customize behavior through valves:
 
-- **model**: LLM for consolidation and reranking (default: `gemini-2.5-flash-lite`)
+- **model**: LLM for consolidation and reranking (default: `google/gemini-2.5-flash-lite`)
+- **max_message_chars**: Maximum message length before skipping operations (default: 2500)
 - **max_memories_returned**: Context injection limit (default: 10)
 - **semantic_retrieval_threshold**: Minimum similarity score (default: 0.5)
+- **relaxed_semantic_threshold_multiplier**: Adjusts threshold for consolidation (default: 0.9)
 - **enable_llm_reranking**: Toggle smart reranking (default: true)
-- **llm_reranking_trigger_multiplier**: When to activate LLM (default: 0.5 = 50%)
+- **llm_reranking_trigger_multiplier**: When to activate LLM reranking (default: 0.5 = 50%)
 
 ## Performance Optimizations
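Open WebUI functions conventionally expose their settings as a Pydantic `Valves` model. A sketch of how the valve list above might be declared that way, with defaults taken from the README's new side and everything else assumed:

```python
from pydantic import BaseModel, Field


class Valves(BaseModel):
    """Illustrative declaration only; field names and defaults mirror the README."""

    model: str = Field(
        default="google/gemini-2.5-flash-lite",
        description="LLM for consolidation and reranking",
    )
    max_message_chars: int = Field(
        default=2500, description="Maximum message length before skipping operations"
    )
    max_memories_returned: int = Field(
        default=10, description="Context injection limit"
    )
    semantic_retrieval_threshold: float = Field(
        default=0.5, description="Minimum similarity score"
    )
    relaxed_semantic_threshold_multiplier: float = Field(
        default=0.9, description="Adjusts threshold for consolidation"
    )
    enable_llm_reranking: bool = Field(
        default=True, description="Toggle smart reranking"
    )
    llm_reranking_trigger_multiplier: float = Field(
        default=0.5, description="When to activate LLM reranking (0.5 = 50%)"
    )
```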