
I am using this model to embed a product catalog for a RAG pipeline. In the product catalog there are no red shirts for men, but there are red shirts for women. How can I make sure the model doesn't return women's shirts for men-oriented queries?

Here is an example product entry:

{
    "ColorDesc": "Brown",
    "styleName": "Scarf",
    "productType": "mens accessories",
    "tags": {
      "colorTag": [
        {
          "type": "color",
          "value": "Brown"
        }
      ],
      "newStyleTag": [
        {
          "type": "style",
          "value": "Scarves"
        }
      ],
      "depttTag": [
        {
          "type": "department",
          "value": "Men"
        }
      ]
    },
    "gender": "Men"
}

When I prompt "Looking for a brown scarf for women", the model returns this product instead of returning nothing. Is there any way to strictly apply filters in RAG so that it retrieves only matching products and outputs nothing when no product is available for the prompt? I am using FAISS for the vector store and Ollama for the LLM.

1 Answer


If you want to make sure the model doesn't return men's products when someone asks for women's items, run the semantic search first and then filter by gender in your application code.

So even if FAISS retrieves something similar, such as a men's brown scarf for a "brown scarf for women" query, you can filter it out afterwards.

Here's a simple example:

def filter_by_gender(results, gender):
    # If your FAISS store returns LangChain Documents rather than raw
    # dicts, use product.metadata.get("gender", "") here instead
    return [product for product in results
            if product.get("gender", "").lower() == gender.lower()]

# Step 1: Over-fetch similar results from FAISS so that post-filtering
# still leaves enough candidates
retrieved_results = vectorstore.similarity_search(query, k=100)

# Step 2: Keep only products with the correct gender
filtered_results = filter_by_gender(retrieved_results, gender="Women")

# Step 3: Handle the output
if not filtered_results:
    print("No matching product found.")
else:
    for product in filtered_results:
        print(product)
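Alternatively, if your FAISS index is wrapped in LangChain, `similarity_search` also accepts a `filter` argument that post-filters retrieved documents on their metadata, so the gender check happens inside the retriever call. A minimal sketch of the dict-filter semantics (the commented-out `vectorstore` call is hypothetical and assumes your product fields were stored as `Document` metadata when the index was built):

```python
def matches_filter(metadata, flt):
    # Rough sketch of what a dict filter does in LangChain's FAISS
    # wrapper: keep a document only if every key/value pair matches
    return all(metadata.get(key) == value for key, value in flt.items())

# Hypothetical retriever call with the built-in filter:
# results = vectorstore.similarity_search(
#     "brown scarf", k=100, filter={"gender": "Women"}
# )

print(matches_filter({"gender": "Men"}, {"gender": "Women"}))    # False
print(matches_filter({"gender": "Women"}, {"gender": "Women"}))  # True
```

With this approach an empty result list directly means "no matching product", and you can skip the separate filtering step.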

To extract the gender from a query, you can use substring matching or an LLM.

You can ask the LLM something like:

Extract structured filters from this user query:
"Looking for a brown scarf for women"

Return the result as JSON. Only include fields like gender if they are clearly mentioned.

The LLM might respond with something like:

{
  "gender": "Women"
}
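If the extra LLM call is a latency concern, a lightweight rule-based extractor is often enough for a fixed catalog vocabulary. A minimal sketch, assuming the catalog only uses the `Men`/`Women` department values (`GENDER_PATTERNS` is a hypothetical map you would extend to your actual tags):

```python
import re

# Hypothetical keyword map; extend to your catalog's department values.
# The \b word boundaries keep "men" from matching inside "women".
GENDER_PATTERNS = {
    "Women": r"\b(women|woman|womens|ladies|female)\b",
    "Men": r"\b(men|man|mens|male)\b",
}

def extract_gender(query):
    q = query.lower()
    for gender, pattern in GENDER_PATTERNS.items():
        if re.search(pattern, q):
            return gender
    return None  # no gender mentioned, so skip the filter

print(extract_gender("Looking for a brown scarf for women"))  # Women
```

This runs in microseconds per query, so it avoids the extra backend call entirely; returning `None` lets you fall back to unfiltered retrieval when the query mentions no gender.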

1 Comment

But this will increase the response time quite a bit: after fetching the products, we make another LLM call in the backend just for the gender filter, which is exactly what I want to avoid. I could use a simple NLP model for classification, but that is still another call that adds latency.
