I'm using LlamaIndex 0.14.7. I'd like to embed the document text without concatenating the metadata, because I store a long text in the metadata. Here's my code:
from llama_index.core import VectorStoreIndex
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.vector_stores import SimpleVectorStore

# embed_model and table_documents are defined elsewhere
table_vec_store: SimpleVectorStore = SimpleVectorStore()
pipeline: IngestionPipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=300, chunk_overlap=15, include_metadata=False),
        embed_model,
    ],
    vector_store=table_vec_store,
)
pipeline.run(documents=table_documents)
self._table_index = VectorStoreIndex.from_vector_store(table_vec_store)
Even though I configured the ingestion pipeline and told the sentence splitter not to include metadata, I still get this error:
ValueError: Metadata length (348) is longer than chunk size (300). Consider increasing the chunk size or decreasing the size of your metadata to avoid this.
I use the document text for indexing, but after retrieval I also need the long text stored in metadata, so I can't simply drop it. How should I fix my code? Thanks
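For context, my working mental model of the failure (an assumption on my part, not taken from the LlamaIndex source; the real splitter counts tokens while this sketch counts characters, and `metadata_str` here is a hypothetical stand-in for the library's metadata serialization) is that the splitter serializes the metadata to a string and subtracts its length from the chunk budget before splitting, roughly like this:

```python
def metadata_str(metadata: dict) -> str:
    # Assumed serialization: one "key: value" line per entry,
    # loosely mirroring LlamaIndex's default metadata template.
    return "\n".join(f"{k}: {v}" for k, v in metadata.items())

def check_chunk_budget(metadata: dict, chunk_size: int) -> int:
    # Mimics the guard that produces my error: the space left for
    # the actual text is chunk_size minus the metadata length, and
    # the split fails outright when the metadata alone exceeds it.
    m_len = len(metadata_str(metadata))
    if m_len > chunk_size:
        raise ValueError(
            f"Metadata length ({m_len}) is longer than chunk size "
            f"({chunk_size}). Consider increasing the chunk size or "
            "decreasing the size of your metadata to avoid this."
        )
    return chunk_size - m_len

# A short metadata entry leaves most of the 300-char budget free:
remaining = check_chunk_budget({"source": "table_3.csv"}, 300)  # → 281

# A long metadata value, like my 348-char text, blows the budget
# and raises before any text is split.
```

If this model is right, then `include_metadata=False` on the splitter alone isn't preventing the metadata from being counted against the chunk size in my setup, which is the part I don't understand.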