I am trying to build a RetrievalQA.from_chain_type chain with my local LLM and a Pinecone vector database. However, it fails to create a retriever, raising an error about not being able to instantiate the abstract class BaseRetriever.
The error is a ValidationError, as shown below:
ValidationError Traceback (most recent call last)
/Users/dhruv/Desktop/Machine_Learning/Projects/Medical_ChatBot_Application/preprocessing/Experiment.ipynb Cell 28 line 1
----> 1 qa=RetrievalQA.from_chain_type(
2 llm=llm,
3 chain_type="stuff",
4 retriever=docsearch.as_retriever(search_kwargs={'k': 2}),
5 return_source_documents=True,
6 chain_type_kwargs=chain_type_kwargs)
File ~/anaconda3/envs/mchatbot/lib/python3.9/site-packages/langchain/chains/retrieval_qa/base.py:95, in BaseRetrievalQA.from_chain_type(cls, llm, chain_type, chain_type_kwargs, **kwargs)
91 _chain_type_kwargs = chain_type_kwargs or {}
92 combine_documents_chain = load_qa_chain(
93 llm, chain_type=chain_type, **_chain_type_kwargs
94 )
---> 95 return cls(combine_documents_chain=combine_documents_chain, **kwargs)
File ~/anaconda3/envs/mchatbot/lib/python3.9/site-packages/langchain/load/serializable.py:74, in Serializable.__init__(self, **kwargs)
73 def __init__(self, **kwargs: Any) -> None:
---> 74 super().__init__(**kwargs)
75 self._lc_kwargs = kwargs
File ~/anaconda3/envs/mchatbot/lib/python3.9/site-packages/pydantic/main.py:341, in pydantic.main.BaseModel.__init__()
ValidationError: 1 validation error for RetrievalQA
retriever
Can't instantiate abstract class BaseRetriever with abstract methods _aget_relevant_documents, _get_relevant_documents (type=type_error)
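As far as I understand, the error itself is standard Python behaviour for abstract classes: a class with unimplemented abstract methods cannot be instantiated. A stdlib-only sketch of the same mechanism (the class names mirror LangChain's but are simplified stand-ins, not the real library classes):

```python
from abc import ABC, abstractmethod

class BaseRetriever(ABC):
    """Simplified stand-in for LangChain's BaseRetriever."""

    @abstractmethod
    def _get_relevant_documents(self, query: str) -> list:
        ...

    @abstractmethod
    async def _aget_relevant_documents(self, query: str) -> list:
        ...

# Instantiating the abstract base fails, just like in the traceback:
try:
    BaseRetriever()
except TypeError as e:
    print(e)  # mentions the two missing abstract methods

# A concrete subclass implementing both methods instantiates fine:
class MyRetriever(BaseRetriever):
    def _get_relevant_documents(self, query: str) -> list:
        return []

    async def _aget_relevant_documents(self, query: str) -> list:
        return self._get_relevant_documents(query)

r = MyRetriever()
print(type(r).__name__)
```

So pydantic appears to be treating whatever as_retriever() returns as the abstract base itself rather than as a concrete subclass.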
I tried creating my own retriever class and checked the Pinecone API documentation, but I was not able to get the chain I wanted. My docsearch instance is a Pinecone vector store object of this type:
<langchain_pinecone.vectorstores.PineconeVectorStore at 0x147954250>
This is how I created my docsearch object:
from langchain_pinecone import PineconeVectorStore as PC

docsearch = PC.from_texts(
    [t.page_content for t in text_chunks],
    embeddings,
    index_name=index_name,
)
Kindly let me know how to proceed. My options are either this or falling back to ChromaDB as a last resort; however, given my memory constraints I would prefer to solve this with Pinecone itself.
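One hypothesis I have (an assumption on my part, not something the traceback confirms) is a version mismatch between langchain and langchain_pinecone, since the two are separate packages and an older langchain might validate the retriever against a different BaseRetriever class than the one my retriever actually subclasses. This stdlib-only snippet prints the installed versions so they can be compared:

```python
from importlib.metadata import version, PackageNotFoundError

# Package names use the pip (distribution) spelling.
for pkg in ("langchain", "langchain-core", "langchain-pinecone", "pydantic"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```

If these turn out to be mismatched, I can upgrade them together, but I would still like to understand why the retriever fails validation.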