Question:

I'm building a memory-augmented AI system using RAG with persistent vector storage, but I'm running into memory leaks and context contamination between sessions.

Problem:

- Vector embeddings aren't garbage collected after context switches
- Previous session embeddings bleed into new conversations
- FAISS index performance degrades after ~1000 retrievals

Current Implementation:

class MemoAI:
    def __init__(self):
        self.vector_store = FAISS.load_local("./embeddings", embeddings)
        self.memory_buffer = ConversationSummaryBufferMemory(
            llm=llm, max_token_limit=2000
        )
        
    def add_memory(self, text, metadata):
        chunks = self.recursive_splitter.split_text(text)
        embeddings = self.embedder.embed_documents(chunks)
        
        # Problem: These embeddings persist even after session ends
        self.vector_store.add_embeddings(
            [(chunk, embedding) for chunk, embedding in zip(chunks, embeddings)],
            metadatas=[metadata] * len(chunks)
        )
        
    def retrieve_context(self, query, k=5):
        # Issue: Returns stale chunks from previous sessions
        return self.vector_store.similarity_search_with_score(
            query, k=k, filter={"session_id": self.current_session}
        )

Reproducible Issue:

# Session 1
memo_ai.add_memory("User likes Python", {"session_id": "session_1"})
# Session 2 (new user)
memo_ai.switch_session("session_2")
result = memo_ai.retrieve_context("What programming language?")
# BUG: Returns "likes Python" from session_1 despite filter
# Memory usage: 2.3GB and growing (started at 500MB)

What I've Tried:

- Manual cleanup with del self.vector_store and gc.collect() - memory is not released
- Per-session FAISS indexes - too slow for real-time use
- Metadata filtering - inconsistent results
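
One guess about the inconsistent filtering, which I have not confirmed: if the filter is applied only after the nearest neighbors have been fetched from the shared index, rows from other sessions can crowd out the current session's rows before the filter runs. The sketch below illustrates that idea in plain Python. It does not use FAISS or LangChain; `ROWS`, `search_post_filter`, `search_pre_filter`, and the 2-D embeddings are all made up for illustration.

```python
# Hypothetical stand-in for a shared vector index; no FAISS or LangChain involved.
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


# (text, embedding, metadata) rows; the embeddings are made-up 2-D vectors.
ROWS = [
    ("User likes Python", [1.0, 0.0], {"session_id": "session_1"}),
    ("User likes Rust", [0.9, 0.1], {"session_id": "session_1"}),
    ("User likes Go", [0.5, 0.5], {"session_id": "session_2"}),
]


def search_post_filter(query_vec, session_id, k=2, fetch_k=2):
    """Rank every row, keep the top fetch_k, then drop rows from other sessions."""
    ranked = sorted(ROWS, key=lambda r: cosine(query_vec, r[1]), reverse=True)
    candidates = ranked[:fetch_k]
    return [r[0] for r in candidates if r[2]["session_id"] == session_id][:k]


def search_pre_filter(query_vec, session_id, k=2):
    """Drop rows from other sessions first, then rank what is left."""
    own = [r for r in ROWS if r[2]["session_id"] == session_id]
    ranked = sorted(own, key=lambda r: cosine(query_vec, r[1]), reverse=True)
    return [r[0] for r in ranked[:k]]


query = [1.0, 0.0]
post = search_post_filter(query, "session_2")
pre = search_pre_filter(query, "session_2")
print(post)  # session_1 rows fill the candidate window, so nothing survives the filter
print(pre)   # only session_2 rows are ranked
```

In this toy setup the post-filter variant returns nothing for session_2 while the pre-filter variant returns its one row, so results depend on how many rows from other sessions sit near the query. This would explain missing results, but not rows from session_1 coming back despite the filter.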

Environment:

- LangChain 0.1.0, FAISS-GPU 1.7.2, Python 3.10
- 32GB RAM, RTX 3090

Question: How can I properly isolate memory between sessions without rebuilding the entire vector store? Is there a pattern for efficient garbage collection of embeddings in production RAG systems?
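
To make the goal concrete, here is a rough sketch of the behavior I'm after, written as a plain in-memory stand-in rather than against FAISS or LangChain. `SessionScopedStore` and its methods are hypothetical names; how to get the same ownership tracking and cleanup on top of a real FAISS index is exactly what I'm asking about.

```python
from collections import defaultdict


class SessionScopedStore:
    """Hypothetical in-memory stand-in for a vector store with per-session cleanup."""

    def __init__(self):
        self._rows = {}                          # row id -> (text, embedding)
        self._ids_by_session = defaultdict(set)  # session id -> row ids it owns
        self._next_id = 0

    def add(self, session_id, text, embedding):
        """Store one row and record which session owns it."""
        row_id = self._next_id
        self._next_id += 1
        self._rows[row_id] = (text, embedding)
        self._ids_by_session[session_id].add(row_id)
        return row_id

    def search(self, session_id, query_vec, k=5):
        """Score only the rows owned by this session (dot-product similarity)."""
        ids = self._ids_by_session.get(session_id, set())
        scored = [
            (sum(q * e for q, e in zip(query_vec, self._rows[i][1])), self._rows[i][0])
            for i in ids
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]

    def end_session(self, session_id):
        """Drop every row the session owns so nothing is left to bleed over."""
        for row_id in self._ids_by_session.pop(session_id, set()):
            del self._rows[row_id]

    def __len__(self):
        return len(self._rows)


store = SessionScopedStore()
store.add("session_1", "User likes Python", [1.0, 0.0])
store.add("session_2", "User likes Go", [0.0, 1.0])

results_s2 = store.search("session_2", [1.0, 0.0])  # never sees session_1 rows
store.end_session("session_1")                      # frees session_1 rows only
remaining = len(store)
```

The point of the sketch is that rows are looked up by owner before any scoring, and ending a session removes its rows without touching the others or rebuilding anything.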
