32 Commits

Author SHA1 Message Date
mtayfur
577f6d6406 Enhance skip detection logic with additional checks for structured content and code indicators 2025-10-28 03:44:26 +03:00
mtayfur
fe3c47f6e4 Update README and logging messages for model configuration clarity 2025-10-28 03:03:11 +03:00
M. Tayfur
551b0c571b ♻️ Skip detection now binary (#4) 2025-10-28 02:35:29 +03:00
M. Tayfur
e6e3f7ab99 Merge pull request #3 from GlisseManTV/dev
Way to use current model instead of dedicated model.
2025-10-27 23:39:06 +03:00
iTConsult4Care
cca8079b94 Fix formatting and add metadata to memory_system.py 2025-10-27 21:14:35 +01:00
GlissemanTV
55a8c70bac add support of current chat model 2025-10-27 21:06:35 +01:00
GlissemanTV
e12aa7b776 Merge branch 'dev' of https://github.com/GlisseManTV/openwebui-memory-system into dev 2025-10-27 21:05:26 +01:00
mtayfur
b5a4872096 📝 (memory_system): clarify and strengthen intent filtering and memory consolidation guidelines
Expand and clarify the "Filter for Intent" rule to ensure only direct,
personally significant facts are stored, explicitly excluding messages
where the user's primary intent is instructional, technical, or
analytical. Update processing and decision frameworks to reinforce
selectivity based on user intent. Revise and annotate examples to
demonstrate correct application of the new rules, making it clear that
requests for advice, recommendations, or technical tasks are ignored.
These changes improve the precision and reliability of memory
consolidation, reducing the risk of storing irrelevant or transient
information.
2025-10-27 00:57:26 +03:00
mtayfur
3f9b4c6d48 ♻️ (memory_system): refactor skip detection and add semantic deduplication
- Unify skip detection to a binary classifier (personal vs non-personal)
 for improved maintainability and clarity. Remove multiple technical/
 instruction/translation/etc. categories and consolidate into
 NON_PERSONAL and PERSONAL.
- Adjust skip detection margin for more precise classification.
- Add semantic deduplication for memory operations using embedding
 similarity, preventing duplicate memory creation and updates.
- Normalize and validate embedding dimensions for robustness.
- Add per-user async locks to prevent race conditions during memory
 consolidation.
- Refactor requirements.txt to remove version pinning for easier
 dependency management.
- Improve logging and error handling for embedding and deduplication
 operations.

These changes improve the reliability and accuracy of memory
classification and deduplication, reduce false positives in skip
detection, and prevent duplicate or conflicting memory operations in
concurrent environments. Dependency management is simplified for
compatibility.
2025-10-27 00:27:33 +03:00
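The semantic deduplication this commit describes — comparing embeddings before creating a memory — can be sketched as a cosine-similarity check against the stored embeddings. This is a minimal illustration, not the project's actual API; the function names and the 0.95 threshold are assumptions.

```python
import numpy as np

def _normalize(vec):
    # L2-normalize an embedding; guard against zero vectors.
    v = np.asarray(vec, dtype=float)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def is_duplicate(new_embedding, existing_embeddings, threshold=0.95):
    """Return True if the candidate memory is semantically close to any stored one.

    Embeddings are normalized first, so the dot product equals cosine
    similarity. `threshold` is an illustrative value, not the project's setting.
    """
    new_vec = _normalize(new_embedding)
    for existing in existing_embeddings:
        if float(np.dot(new_vec, _normalize(existing))) >= threshold:
            return True
    return False
```

A consolidation step would call `is_duplicate` before inserting, skipping creation (or routing to an update) when it returns True.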
mtayfur
bb1bd01222 ♻️ (memory_system.py): reformat code for consistency, readability, and maintainability
- Reorder and group imports for clarity and PEP8 compliance.
- Standardize string quoting and whitespace for consistency.
- Refactor long function signatures and dictionary constructions for better readability.
- Use double quotes for all string literals and dictionary keys.
- Improve formatting of multiline statements and function calls.
- Add or adjust line breaks to keep lines within recommended length.
- Reformat class and method docstrings for clarity.
- Use consistent indentation and spacing throughout the file.

These changes improve code readability, maintainability, and consistency, making it easier for future contributors to understand and modify the codebase. No functional logic is changed.
2025-10-27 00:20:35 +03:00
mtayfur
189c6d4226 🔧 (dev-check.sh, pyproject.toml, requirements.txt): add development tooling and configuration
Introduce a dev-check.sh script to automate code formatting and import
sorting using Black and isort. Add a pyproject.toml file to configure
Black and isort settings for consistent code style. Update
requirements.txt to include Black and isort as development dependencies
and remove version pinning for easier dependency management.

These changes streamline the development workflow, enforce code style
consistency, and make it easier for contributors to run formatting and
import checks locally.
2025-10-27 00:20:05 +03:00
iTConsult4Care
7630259ce1 Delete memory_system_ollama.py 2025-10-26 16:48:26 +01:00
GlissemanTV
a3a627339a Merge branch 'dev' of https://github.com/GlisseManTV/openwebui-memory-system into dev 2025-10-26 16:46:06 +01:00
iTConsult4Care
390747c1d1 Delete memory_system_ollama.py 2025-10-26 16:45:30 +01:00
iTConsult4Care
b6b4c5fde8 Add files via upload 2025-10-26 16:44:24 +01:00
iTConsult4Care
4328e4b79c Add current model usecase 2025-10-26 16:42:36 +01:00
GlissemanTV
fcae27e840 add current model usecase 2025-10-26 16:14:05 +01:00
GlissemanTV
1f779d86ec Merge branch 'dev' of https://github.com/GlisseManTV/openwebui-memory-system into dev 2025-10-26 16:12:12 +01:00
GlissemanTV
d07a853aeb add current model using case 2025-10-26 16:05:35 +01:00
GlissemanTV
89399f57cc add current model workflow with checkbox 2025-10-26 16:01:13 +01:00
mtayfur
c0bfb3927b Refactor memory creation guidelines for improved clarity and conciseness in contextual completeness section. 2025-10-19 05:13:45 +03:00
mtayfur
d05ed8a16e Update semantic retrieval thresholds in Constants class for improved accuracy 2025-10-19 04:57:36 +03:00
mtayfur
7e2209633d Refactor logger initialization in memory_system.py to use module name for better context in log messages. 2025-10-18 19:25:22 +03:00
mtayfur
505c443050 Update README.md to enhance clarity on privacy and cost considerations; restructure sections for better readability and add relevant details. 2025-10-15 14:33:33 +03:00
mtayfur
0726293446 Update README.md for improved clarity and accuracy; revise privacy notice, cache descriptions, and model support details. 2025-10-15 14:13:55 +03:00
mtayfur
e3709fe677 Refactor cache management in Filter class; reduce maximum cache entries and concurrent user caches for improved performance and clarity. Update cache management methods for consistency and better logging. 2025-10-15 14:05:01 +03:00
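The LRU-evicting caches this commit tunes can be illustrated with an `OrderedDict`-backed cache: reads refresh recency, and inserts evict the least recently used entry once the cap is hit. A generic sketch — the real class names and entry limits in memory_system.py may differ.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: get/set refresh recency; eviction drops the oldest."""

    def __init__(self, max_entries=128):
        self.max_entries = max_entries
        self._store = OrderedDict()

    def get(self, key, default=None):
        if key not in self._store:
            return default
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def set(self, key, value):
        self._store[key] = value
        self._store.move_to_end(key)
        while len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used
```

Per-user isolation, as described in the README, would then be one `LRUCache` instance per user rather than a shared one.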
mtayfur
2deba4fb2c Refactor Filter class to use async for pipeline context setup; implement locking mechanism for shared skip detector cache to enhance concurrency safety. 2025-10-12 23:24:58 +03:00
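The locking pattern referenced here — serializing access to shared state in async code — is commonly built from a dict of `asyncio.Lock` objects keyed by user or resource. An illustrative pattern; the actual Filter implementation may structure this differently.

```python
import asyncio
from collections import defaultdict

class PerUserLocks:
    """Hand out one asyncio.Lock per key so concurrent requests for the
    same user serialize, while different users proceed independently."""

    def __init__(self):
        self._locks = defaultdict(asyncio.Lock)

    def for_user(self, user_id):
        return self._locks[user_id]

async def consolidate(locks, user_id, log):
    # Critical section: only one consolidation per user runs at a time.
    async with locks.for_user(user_id):
        log.append(user_id)

async def main():
    locks = PerUserLocks()
    log = []
    await asyncio.gather(
        consolidate(locks, "u1", log),
        consolidate(locks, "u1", log),
        consolidate(locks, "u2", log),
    )
    return log
```

The same shape covers the shared skip-detector cache: guard reads-then-writes with the lock so two coroutines can't both miss the cache and recompute.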
mtayfur
849dd71a01 Refactor memory selection logic in LLMRerankingService for improved clarity; streamline response handling by directly using response.ids. 2025-10-12 23:03:36 +03:00
mtayfur
158f0d1983 Refactor memory operations in Filter class for improved readability and consistency; utilize statistics.median for score calculation and streamline operation details formatting. 2025-10-12 22:54:18 +03:00
mtayfur
2db2d3f2c8 Refactor SkipDetector to streamline skip detection logic and improve clarity; update method signature for better integration with memory system. 2025-10-12 21:44:51 +03:00
GlissemanTV
08155816ff add memory_system_ollama.py 2025-10-10 09:10:03 +02:00
mtayfur
840d4c59ca Refactor SkipDetector to use a callable embedding function instead of SentenceTransformer; update requirements to remove unnecessary dependencies. 2025-10-09 23:36:27 +03:00
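Accepting a callable embedding function, as this commit describes, decouples the detector from any one backend: anything that maps text to a vector can be injected. A sketch under that assumption — the factory name, anchor phrase, and threshold below are illustrative, not the project's actual values.

```python
import math

def make_skip_detector(embed_fn, personal_anchor="my name is", threshold=0.5):
    """Build a skip detector around an injected embedding callable.

    `embed_fn` maps text -> vector; any backend (Ollama, OpenAI, local
    sentence-transformers) works as long as it matches that signature.
    """

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    anchor_vec = embed_fn(personal_anchor)

    def should_skip(message):
        # Skip (return True) when the message is dissimilar to the
        # personal anchor, i.e. likely non-personal content.
        return cosine(embed_fn(message), anchor_vec) < threshold

    return should_skip
```

Swapping backends then means passing a different `embed_fn`, with no change to the detector itself — which is what dropping the hard SentenceTransformer dependency buys.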
6 changed files with 693 additions and 442 deletions

.gitignore vendored — 2 changes

@@ -1,4 +1,4 @@
__pycache__/
.github/instructions/*
.venv/
**AGENTS.md
tests/

README.md

@@ -2,6 +2,18 @@
A long-term memory system that learns from conversations and personalizes responses without requiring external APIs or tokens.
+## ⚠️ Important Notices
+**🔒 Privacy & Data Sharing:**
+- User messages and stored memories are shared with your configured LLM for memory consolidation and retrieval
+- If using remote embedding models (like OpenAI text-embedding-3-small), memories will also be sent to those external providers
+- All data is processed through Open WebUI's built-in models using your existing configuration
+**💰 Cost & Model Requirements:**
+- The system uses complex prompts and sends relevant memories to the LLM, which increases token usage and costs
+- Requires public models configured in OpenWebUI - you can use any public model ID from your instance
+- **Recommended cost-effective models:** `gpt-5-nano`, `gemini-2.5-flash-lite`, `qwen3-instruct`, or your local LLMs
## Core Features
**Zero External Dependencies**
@@ -21,7 +33,7 @@ Avoids wasting resources on irrelevant messages through two-stage detection:
Categories automatically skipped: technical discussions, formatting requests, calculations, translation tasks, proofreading, and non-personal queries.
**Multi-Layer Caching**
-Three specialized caches (embeddings, retrieval results, memory lookups) with LRU eviction keep responses fast while managing memory efficiently. Each user gets isolated cache storage.
+Three specialized caches (embeddings, retrieval, memory) with LRU eviction keep responses fast while managing memory efficiently. Each user gets isolated cache storage.
**Real-Time Status Updates**
Emits progress messages during operations: memory retrieval progress, consolidation status, operation summaries — keeping users informed without overwhelming them.
@@ -32,7 +44,7 @@ All prompts and logic work language-agnostically. Stores memories in English but
## Model Support
**LLM Support**
-Tested with Gemini 2.5 Flash Lite, GPT-4o-mini, Qwen2.5-Instruct, and Mistral-Small. Should work with any model that supports structured outputs.
+Tested with gemini-2.5-flash-lite, gpt-5-nano, and qwen3-instruct. Should work with any model that supports structured outputs.
**Embedding Model Support**
Uses OpenWebUI's configured embedding model (supports Ollama, OpenAI, Azure OpenAI, and local sentence-transformers). Configure embedding models through OpenWebUI's RAG settings. The memory system automatically uses whatever embedding backend you've configured in OpenWebUI.
@@ -54,11 +66,13 @@ Uses OpenWebUI's configured embedding model (supports Ollama, OpenAI, Azure Open
## Configuration
Customize behavior through valves:
-- **model**: LLM for consolidation and reranking (default: `gemini-2.5-flash-lite`)
+- **model**: LLM for consolidation and reranking. Set to "Default" to use the current chat model, or specify a model ID to use that specific model
- **max_message_chars**: Maximum message length before skipping operations (default: 2500)
- **max_memories_returned**: Context injection limit (default: 10)
- **semantic_retrieval_threshold**: Minimum similarity score (default: 0.5)
- **relaxed_semantic_threshold_multiplier**: Adjusts threshold for consolidation (default: 0.9)
- **enable_llm_reranking**: Toggle smart reranking (default: true)
-- **llm_reranking_trigger_multiplier**: When to activate LLM (default: 0.5 = 50%)
+- **llm_reranking_trigger_multiplier**: When to activate LLM reranking (default: 0.5 = 50%)
## Performance Optimizations

dev-check.sh Executable file — 25 additions

@@ -0,0 +1,25 @@
#!/usr/bin/env bash
# Development tools script for openwebui-memory-system
set -e
if [ -f "./.venv/bin/python" ]; then
PYTHON="./.venv/bin/python"
elif command -v python3 &> /dev/null; then
PYTHON="python3"
elif command -v python &> /dev/null; then
PYTHON="python"
else
echo "Python 3 is not installed. Please install Python 3 to proceed."
exit 1
fi
echo "🔧 Running development tools..."
echo "🎨 Formatting with Black..."
$PYTHON -m black .
echo "📦 Sorting imports with isort..."
$PYTHON -m isort .
echo "✅ All checks passed!"

memory_system.py: file diff suppressed because it is too large.

pyproject.toml Normal file — 24 additions

@@ -0,0 +1,24 @@
[tool.black]
line-length = 160
target-version = ['py38']
include = '\.pyi?$'
extend-exclude = '''
/(
\.eggs
| \.git
| \.hg
| \.mypy_cache
| \.tox
| \.venv
| build
| dist
)/
'''
[tool.isort]
line_length = 160
multi_line_output = 3
include_trailing_comma = true
force_grid_wrap = 0
use_parentheses = true
ensure_newline_before_comments = true

requirements.txt

@@ -1,5 +1,7 @@
-aiohttp>=3.12.15
-pydantic>=2.11.7
-numpy>=2.0.0
-open-webui>=0.6.32
-tiktoken>=0.11.0
+aiohttp
+pydantic
+numpy
+tiktoken
+black
+isort