Trending 'rag' questions

Advice

1 vote

1 reply

20 views

RAG with Pinecone + GPT-5 for generating new math problems: incoherent outputs, mixed chunks, and lack of originality

I’m building a tool that generates new mathematics exam problems using an internal database of past problems. My current setup uses a RAG pipeline, Pinecone as the vector database, and GPT-5 as the ...

Marc-Loïc Abena

11

asked 2 hours ago

Best practices

0 votes

1 reply

105 views

Regarding RAG for telephony with Deepgram

I'm building a voice-based calling system where users can create AI agents that make outbound phone calls. The agent uses Deepgram for real-time transcription and ElevenLabs/Cartesia for speech ...

Sarthak Sahu

1

asked Nov 15 at 9:35

Advice

0 votes

1 reply

58 views

How can I group transcribed phrases into meaningful chunks without using complex models?

I have a large set of phrases obtained via Azure Fast Transcription, and I need to group them into coherent semantic chunks (to use later in a RAG pipeline). Initially, I tried grouping phrases based ...

Daniel

13

asked Nov 6 at 10:18

0 votes

0 answers

54 views

Langchain RAG is not retrieving any document

This is my embedding code, which I run once only: embeddings = OpenAIEmbeddings(model="text-embedding-3-large") vector_store = MongoDBAtlasVectorSearch.from_connection_string( ...

Mingruifu Lin

161

asked Oct 29 at 17:00

0 votes

0 answers

24 views

How to exclude metadata from embedding?

I'm using LlamaIndex 0.14.7. I would like to embed document text without concatenating metadata, because I put a long text in metadata. Here's my code: table_vec_store: SimpleVectorStore = ...

Trams

421

asked Nov 6 at 9:31

1 vote

1 answer

122 views

Why does answer_relevancy return NaN when evaluating RAG with Ragas?

I’m trying to evaluate my Retrieval-Augmented Generation (RAG) pipeline using Ragas. Here’s a complete version of my code: """# RAG Evaluation""" from datasets import ...

Chandima

11

asked Sep 25 at 8:51

1 vote

0 answers

51 views

Why does my LangChain RAG chatbot sometimes miss relevant chunks in semantic search?

I built a RAG chatbot using LangChain + ChromaDB + OpenAI embeddings. The pipeline works, but sometimes the chatbot doesn’t return the most relevant PDF content, even though it exists in the vector DB....

Naitik Mittal

11

asked Sep 21 at 15:20

0 votes

0 answers

146 views

Zep Graphiti - core - Adding Episode fails the LLM structured output

On the ingestion part to the graph db, I pass a json file, as an episode, custom entities (and edges), using gemini api, but I get some discrepancy on the structured output, like so: LLM generation ...

George Petropoulos

448

asked Sep 7 at 21:11

0 votes

1 answer

64 views

Chroma not accepting lists in persistentClient collection?

My objective is to do keyword filtering in Chroma. I have a field called keywords with a list of strings and I want to filter with it, but Chroma won't let me add lists as a field. I checked my Chroma ...

Elena López-Negrete Burón

1

asked Sep 23 at 10:37

1 vote

0 answers

72 views

RAG Chatbot does not answer paraphrased questions

I built a RAG chatbot in Python with LangChain, using FAISS for the vector store. The data is stored as JSON. The chatbot sometimes refuses to answer when a question is rephrased. Here are two ...

SoftwareEngineer

11

asked Sep 20 at 13:54

0 votes

1 answer

55 views

module not found in haystack 2.17.1

I am trying to create a small starter LLM RAG project using Haystack. My project packages are below (I use uv): [project] name = "llm-project" version = "0.1.0" description = "...

femi

984

asked Sep 13 at 17:35

1 vote

0 answers

71 views

ragas with Ollama does not terminate

I am using the python package ragas with the goal of generating a testset for a RAG application. I am defining my BaseRagasLLM as: from langchain_ollama import OllamaLLM from ragas.llms import ...

oyster

21

asked Aug 29 at 14:41

0 votes

0 answers

131 views

How to Use Pytest Fixtures in a RAG-Based LangChain Streamlit App?

I'm building a RAG (Retrieval-Augmented Generation) chatbot using LangChain, Gemini API, and Qdrant, with a Streamlit frontend. I want to write unit tests for the app using pytest, and I’m trying to ...

Krishna Suthar

1

asked Jul 28 at 22:43

0 votes

0 answers

75 views

How to accelerate embedding my corpus into ChromaDB

I have a corpus.jsonl file of 6.5 GB, and I use a single H100 GPU to embed the corpus into ChromaDB, but it seems very slow. I want to find out how I can accelerate the process (GPU, CPU, I/O)....

YiJun Sachs

23

asked Aug 27 at 1:52

0 votes

1 answer

120 views

How do I prevent duplicate messages in the context window when using RAG and memory?

When using RAG and memory, multiple identical copies of the same information are sent to the AI when asking related questions. I have import java.util.ArrayList; import java.util.List; import dev....

MTilsted

5,535

asked Jul 28 at 21:59

1 vote

1 answer

260 views

Firecrawl self-hosted crawler throws Connection violated security rules error

I set up a self-hosted Firecrawl instance and I want to crawl my internal intranet site (e.g. https://intranet.xxx.gov.tr/). I can access the site directly both from the host machine and from inside ...

birdalugur

307

asked Aug 22 at 13:47

0 votes

0 answers

56 views

How to send extra headers from RAGFlow Agent to a Spring Boot MCP server tool call?

I am using RAGFlow connected to a Spring Boot MCP server. My agent flow is simple: Begin node → collects inputs (auth_token, tenant_id, x_request_status) Agent (gpt-4o) → connected to MCP Tool (server)...

Ishan Garg

729

asked Sep 4 at 17:45

2 votes

1 answer

186 views

Why is FAISS document retrieval slow and inconsistent on EC2 t3.micro instance?

I'm building a document Q&A system using FAISS for vector search on an AWS EC2 t3.micro instance (1 vCPU, 1GB RAM). My FAISS index is relatively small (8.4MB .faiss + 1.4MB .pkl files), but I'm ...

user29255210

65

asked Aug 22 at 11:01

-1 votes

1 answer

54 views

How to ensure all documents contribute to summary context after merging indexes?

I'm building a LangChain RAG pipeline using the FAISS vector store. I'm merging multiple FAISS indexes — each representing one document — and then querying them to generate summaries or answers via ...

Musab

54

asked Jul 22 at 6:21

-4 votes

1 answer

143 views

I want to know where the files I upload through the RAGFlow system are located and how to find them on Windows

I use Ollama and RAGFlow to manage my own knowledge files. I uploaded some files to a knowledge base, and they work well in the system. I start RAGFlow with Docker commands. Who can help me find the ...

Jinzhengxuan

13

asked Mar 19 at 1:22

1 vote

0 answers

193 views

How to handle follow-up confirmations in Spring AI 1.0.0 without losing context during tool selection using RAG?

I'm building a web application using Spring Boot 3.4.5 and Spring AI 1.0.0 with Llama3.2(Ollama) model integration. I've implemented tool calling, and because I have many tools in the application, I'm ...

Sarath Molathoti

81

asked Jul 1 at 12:27

-1 votes

1 answer

256 views

ImportError: cannot import name 'Client' from 'pinecone' (unknown location)

The problem with this piece of code is that I am unable to import Client from the pinecone library. I tried uninstalling and reinstalling different versions, but none of them worked. I also tried it ...

ACR

11

asked Jul 24 at 2:41

0 votes

1 answer

268 views

Deleting data points in Qdrant DB

I am trying to delete all the data points that are associated with a particular email Id, but I am encountering the following error. source code: app.get('/cleanUpResources', async (req, res) => { ...

Abhishek Prasad

11

asked Jul 24 at 19:05

0 votes

2 answers

545 views

GCP Vertex AI RAG Creation

Creating a RAG Corpus Using RAG and Storing Vector Search Information in the ragCorpus Endpoint Tried the Following Approaches: 1️⃣ Unable to Create ragCorpus with Vertex Vector Search Info Payload ...

suraj soni

1

asked Feb 14 at 12:28

1 vote

0 answers

680 views

Why is the upload of files to GCP Vertex AI RAG corpora so slow?

I am experimenting with RAG on GCP/Vertex AI, and tried to create a simple example. Here's what I came up with, creating small dummy files locally and then uploading them one by one to a newly-...

Davide Fiocco

6,039

asked May 9 at 11:58

0 votes

0 answers

172 views

LlamaIndex returns "Empty Response"

I have a RAG system using LlamaIndex. I am upgrading the library from 0.10.44 to 0.12.33, and I see different behaviour now. Before, when there were no results from the vector store, it seems it called the LLM ...

Deibys

669

asked May 6 at 14:16

0 votes

1 answer

139 views

llama-index RAG: how to display retrieved context?

I am using LlamaIndex to perform retrieval-augmented generation (RAG). Currently, I can retrieve and answer questions using the following minimal 5 line example, from https://docs.llamaindex.ai/en/...

Jeremy K.

1,802

asked Nov 21, 2024 at 5:15

1 vote

1 answer

698 views

How to keep updated information in a RAG system?

I have created a RAG system with documents I have (saved in chunks). I was wondering how to keep updated information in responses. For example, I have news articles about one subject. Those articles ...

Lefloch Had

97

asked Aug 29, 2024 at 11:18

0 votes

2 answers

491 views

Passing request context from FastAPI to Microsoft Semantic Kernel Plugin for OpenAI Integration

I am integrating Microsoft Semantic Kernel with OpenAI in my FastAPI application. I have a chat/ endpoint where I receive a session_id from the request, and I need to pass this session_id to a plugin ...

Prasanth Rao

1

asked Nov 26, 2024 at 12:51

1 vote

0 answers

94 views

Sentence similarity pipeline with @huggingface/transformers

I wanted to use the pipeline API from @huggingface/transformers (JS) for sentence similarity, but I do not see a specific pipeline for it. The closest things are text classification and feature extraction ...

Edv Beq

1,020

asked Jun 4 at 16:16

0 votes

0 answers

129 views

Efficient Retrieval Methods of Relevant Chunks for Pydantic BaseModel for RAG Structured Output

I need to generate structured outputs using Pydantic’s BaseModel. Specifically, I need to retrieve relevant text chunks for each field in my model to minimize errors and ensure accurate data ...

Jingyuan Liang

1

asked Jan 3 at 1:49

0 votes

1 answer

264 views

Score profiles in Azure AI Search not working

I have configured a default score profile on my index to use for all of my searches. I have a test index that has a field named 'source'; if the field is equal to 'reviewed', I want those docs to be moved ...

R_Student

809

asked Jul 22, 2024 at 22:45

1 vote

1 answer

137 views

Embedding model `all-mpnet-base-v2` not able to classify user prompt properly

I am using this model to embed a product catalog for a RAG system. In the product catalog, there are no red shirts for men, but there are red shirts for women. How can I make sure the model doesn't output ...

Advait Shendage

11

asked Apr 29 at 9:16

0 votes

0 answers

99 views

How to loop through text chunks created using AzureOpenAI `client.vector_stores.create`

I checked Azure's documentation on this topic here but I do not see anything related to this. My goal is to create a question and answer dataset for my RAG solution based on each chunk for a good ...

Mike B

3,629

asked May 5 at 13:03

-1 votes

1 answer

851 views

PDFSearchTool over multiple PDFs in CrewAI

How can I use the PDFSearchTool over multiple PDFs in CrewAI? I’m currently using the PDFSearchTool and it works well with a single PDF, but I didn’t find any example or manage to pass a list of ...

Gian

361

asked Jan 14 at 14:45

0 votes

1 answer

433 views

I am using LangChain4j to develop a knowledge base and encountered the error "different vector dimensions 1024 and 384"

I want to know if there are any other settings required for pgvector or what content needs to be set in the code to enable pgvector to support higher vector dimensions. I found on the official website ...

tom

3

asked Apr 5 at 13:12

1 vote

0 answers

61 views

Scaling RAG QA with Large Docs, Tables, and 30K+ Chunks (No LangChain)

I'm building a RAG-based document QA system using Python (no LangChain), LLaMA (50K context), PostgreSQL with pgvector, and Docling for parsing. Users can upload up to 10 large documents (300+ pages ...

Anton Lee

11

asked Jun 2 at 16:30

0 votes

1 answer

146 views

Is it possible to share an Azure-based RAG chatbot without requiring users to sign in with Azure?

I’ve been experimenting with Azure AI Foundry and created a Retrieval-Augmented Generation (RAG) chatbot that works great on its own. However, when I try to deploy the chatbot using Azure, I encounter ...

Liam Mason

3

asked Feb 24 at 7:06

0 votes

2 answers

75 views

SitemapLoader(sitemap_url).load() hangs

from langchain_community.document_loaders import SitemapLoader def crawl(self): print("Starting crawler...") sitemap_url = "https://gringo.co.il/sitemap.xml" ...

Gulzar

28.8k

asked Apr 18 at 19:38

0 votes

1 answer

85 views

Compatibility Issues with Library Versions for RAG Project Integration with Rasa

I want to create a RAG (Retrieval-Augmented Generation) project and integrate it with Rasa. However, Rasa requires older versions of some libraries, such as Pydantic (version 1.10.10). Meanwhile, ...

VJeet singh

1

asked Dec 3, 2024 at 12:33

1 vote

0 answers

47 views

How to deal with evolving information in RAG?

I'm trying to index a series of articles to use in a RAG knowledge base, but I cannot find any documented best practice or recommendation for dealing with information that changes or evolves over time. ...

weeanon

821

asked Apr 7 at 13:10

4 votes

2 answers

568 views

Can't make SafetySettings of VertexAI Gemini API work

I'm working on a RAG application using VertexAI API. One of my questions has the word "roubar" and it keeps triggering the safety filter, like: ... Candidate: { "index": 0, &...

Lucas Miranda de Sena

31

asked Oct 19, 2024 at 12:43

0 votes

0 answers

523 views

How to Build a Chatbot That Queries an SQL Database and Uses Vector Search for RAG?

I'm working on a chatbot that answers based on a department store's SQL database, and I need help. The database looks like this: If the user asks something like this: The chatbot should answer like ...

an honest observer

11

asked Dec 7, 2024 at 5:35

0 votes

1 answer

162 views

Upserting in Pinecone takes too long

I'm trying to upsert reviews that I've scraped into Pinecone. For the embedding model I'm using jina-embedding-v3. For 204 reviews this takes around 2.5 hours in Colab! I tried using a GPU but the ...

Daaku-C5

19

asked Mar 6 at 6:22

0 votes

0 answers

66 views

How to extract text, tables, and images from PDFs while maintaining structure

When I tried the open-source unstructured library on a PDF that has background images, design patterns, and XObjects in it, the library also treats those as images and stores their paths. So how can ...

Umair Ashraf

11

asked Apr 22 at 9:48

-3 votes

1 answer

603 views

CrewAI tool not using ChatGroq

Here is my tools.py file import os from dotenv import load_dotenv from langchain_groq import ChatGroq from crewai_tools import PDFSearchTool, SerperDevTool load_dotenv() # Initialize the ChatGroq ...

Priyanshu Khare

115

asked Aug 25, 2024 at 9:29

1 vote

3 answers

1k views

SemanticKernel with Plugin functions and vector database with C#

To start, I’d like to explain what I aim to achieve. My goal is to create an AI bot that will act as a hotel assistant, able to provide users with any hotel-related information they request. It should ...

Piotr

1,207

asked Oct 30, 2024 at 22:39

0 votes

1 answer

599 views

AttributeError: 'LlmAgent' object has no attribute 'invoke'

I am trying to call a Flask API which is already running on port 5000 on my system. I am designing agentic AI code, using google-adk, which will invoke GET and then POST based on some condition. I ...

witty_minds

79

asked Jun 20 at 6:18

0 votes

1 answer

1k views

Best approach for RAG using Azure OpenAI and AI Search with Python SDK [closed]

I struggle to understand the pros and cons of each of these approaches for implementing RAG using Azure OpenAI with AI Search as the source, with the Python SDK. Both work well, but option B ...

torete

60

asked Nov 30, 2024 at 9:50

0 votes

2 answers

880 views

How to add S3 bucket object metadata to a Bedrock knowledge base?

I am using AWS Bedrock for the first time. I have configured the data source, which is S3, along with an OpenSearch Serverless cluster for embeddings. However, I do not have any control over the mappings ...

Makarand

636

asked Apr 14 at 0:51
