Let's say I have a set of docs. Each doc is an unordered bag of strings:
{a, b, b, d}, {a, b}, {j, k, d, a}, ...
Is it possible to use GIN to find all docs that are similar to a given doc X? The similarity measure would be cosine similarity or Euclidean distance.
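To make the similarity concrete, here is a minimal Python sketch of what I mean by cosine similarity between two bags, treating each bag as a vector of element counts. The name `cosine_similarity` and the sample data are only illustrative assumptions; this shows the measure itself, not how PostgreSQL or GIN would compute it.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(bag_a, bag_b):
    """Cosine similarity of two bags, using element counts as vector components."""
    counts_a, counts_b = Counter(bag_a), Counter(bag_b)
    # Only elements present in both bags contribute to the dot product.
    dot = sum(counts_a[k] * counts_b[k] for k in counts_a.keys() & counts_b.keys())
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        # Assumption: an empty bag is treated as similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample data mirroring the bags above.
docs = [["a", "b", "b", "d"], ["a", "b"], ["j", "k", "d", "a"]]
query = ["a", "b", "c"]

# Brute-force ranking; this full scan is what I would like an index to avoid.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, d), reverse=True)
```

Ranking every doc this way is a full scan, which is exactly the part I am hoping an index could speed up.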
I know PostgreSQL provides trigram search. It's very similar to what I want, except that instead of trigrams I want to use my own vectors.
Something like SELECT * FROM docs WHERE content LIKE {a, b, c}.
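-- Assumes docs.content is an array column such as text[]
INSERT INTO docs (content) VALUES ('{i,j,k}');
INSERT INTO docs (content) VALUES ('{a}');
INSERT INTO docs (content) VALUES ('{b,c}');
-- ...
-- Somehow build a GIN index over the docs.content field
-- Pseudocode: LIKE here stands for "is similar to", not the real LIKE operator
SELECT * FROM docs WHERE content LIKE '{a,b,c}';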
Is it possible to do something like that with GIN?
If it helps, a bag of numbers could be used instead of a bag of strings.
As a similarity measure pretty much anything could be used - cosine, euclidean, etc.

That makes the question completely random. Please specify the kind of similarity you need.