I'm building a RAG-based document QA system using Python (no LangChain), LLaMA (50K context), PostgreSQL with pgvector, and Docling for parsing. Users can upload up to 10 large documents (300+ pages each), often containing numerous tables and charts.
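For context, initial retrieval is currently a plain pgvector KNN query. A minimal sketch of that path, assuming psycopg 3 and a `chunks` table with an `embedding vector` column (table and column names are illustrative, not my exact schema):

```python
import psycopg  # psycopg 3

def knn_retrieve(conn: psycopg.Connection, query_vec: list[float], k: int = 20) -> list[tuple[int, str]]:
    """Return the k chunks nearest to query_vec by cosine distance."""
    vec_literal = "[" + ",".join(str(x) for x in query_vec) + "]"  # pgvector text format
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT id, content
            FROM chunks
            ORDER BY embedding <=> %s::vector  -- <=> is pgvector's cosine-distance operator
            LIMIT %s
            """,
            (vec_literal, k),
        )
        return cur.fetchall()
```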
I'm facing a few specific challenges:

- 30K+ total chunks across all docs → KNN retrieval gets noisy.
- Tried LLM-based reranking, but it's too slow and expensive to run over all 30K chunks (sketch of what I tried below).
- Tried summarizing each chunk to improve retrieval, but generating LLM summaries for all 30K sections is too expensive.
- Table chunks are especially difficult:
  - Embeddings perform poorly on structured/numeric data.
  - Summary-style embeddings (e.g. first 300 tokens, or just the heading/caption; see the second sketch below) aren't sufficient for value-level lookups.

Looking for ideas or proven strategies to:

- Improve precision in initial retrieval at scale
- Handle table-heavy content more effectively
- Reduce cost while preserving accuracy
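For reference, the reranking I tried is roughly the pattern below. `llama_complete` is a placeholder for whatever call hits the local LLaMA server; the point is that it's one generation per candidate, which is exactly what blows up at this scale:

```python
def llm_rerank(query: str, candidates: list[str], top_n: int = 5) -> list[str]:
    """Score each candidate chunk with the LLM and keep the best top_n."""
    scored = []
    for chunk in candidates:
        prompt = (
            "On a scale of 0-10, how well does this passage answer the question?\n"
            f"Question: {query}\n"
            f"Passage: {chunk}\n"
            "Answer with a single number:"
        )
        # llama_complete is hypothetical; one LLM call per candidate is the cost driver
        score = float(llama_complete(prompt).strip())
        scored.append((score, chunk))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [chunk for _, chunk in scored[:top_n]]
```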
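And this is the gist of the summary-style text I embed for table chunks (field names illustrative; "first 300 tokens" is approximated with whitespace tokens here). Any cell value past the cutoff never reaches the embedding, which is why value-level lookups miss:

```python
def table_embed_text(heading: str, caption: str, table_markdown: str, max_tokens: int = 300) -> str:
    """Build the text that gets embedded for a table chunk."""
    # Whitespace split is a cheap stand-in for real tokenization
    tokens = table_markdown.split()
    truncated = " ".join(tokens[:max_tokens])
    return f"{heading}\n{caption}\n{truncated}"
```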
Any ideas, techniques, or tooling (besides LangChain) that worked for you?