7

My question is akin to Data structure for partial multi-keys mapping?.

I have key-value pairs where the key is composed of three components (strings).

I am looking for a data structure the can efficiently perform search queries over keys, where the queries could be complete or partial specification the key (one or more components omitted). For example:

(x, y, z)
(x, *, *)
(*, y, *)
etc.

The unspecified part of the key can be either at the front, the middle or the end of the key.

My current implementation (a hash map which maps all possible partial keys to a set of values that match this partial key) is horribly slow when inserting, deleting and updating values.

6
  • Trie comes to mind when using partial keys... Commented Jul 14, 2014 at 13:12
  • A trie seems to be limited to prefixes. This way, the first 'component' always needs to be specified. I also considered a suffix-tree, which seems to lift this restriction but it seems that this is the same as my current solution but (instead of using a hash) using a binary tree. Is this conclusion wrong? Commented Jul 14, 2014 at 13:36
  • Since you have a limited number of keys, why not just use three separate maps - one for each key? This will require slightly more space, but won't significantly impact time performance. Commented Jul 14, 2014 at 13:41
  • @blgt: 3 isn't enough. Commented Jul 14, 2014 at 13:43
  • 1
    I was thinking more along the lines of a directed graph where if the triple is (a,b,c) then the edges are (a,b), (b,c). This ends up being fairly similar to a trie over the dimensions rather than over each digit. So that (50,20,30) and (50,20,50) become basically `data[50][20] = {30,50}. You can certainly use only three (this is where the hasty not enough came from) but then you have to deal with set intersections which result in queries that aren't O(k), where k is the number of returned triples. Though that might be required if goal is to only increase speed by something like 5-10x. Commented Jul 14, 2014 at 14:40

1 Answer 1

3

My current implementation (a hash map which maps all possible partial keys to a list of values that match this partial key) is horribly slow when inserting, deleting and updating values.

If you are actually using a list then that is almost certainly the issue as they are slow on deleting/updating/inserting may also be an issue depending on how you are doing it. You are better off using a set (unordered) or balanced binary search tree (ordered).

Sign up to request clarification or add additional context in comments.

4 Comments

Is having 2^3 = 8 maps with (x, y, z), (x, y, ), (x, *, z), (x, *, *), (, y, z), (, y, *), (, , z), (, *, *) more efficient than one map with 8 entries for each key (like I have now)?
I'd have to see an example, but probably not now that I think about it. It seems like the actual issue that I missed is probably the data structure you are using for storing the triples. Lists are indeed rather slow at doing those things so I changed my answer based on that being more likely the issue. You also might want to include a language tag.
I am indeed using a set. Mentioning a list was a mistake. The process is slow in absolute terms, but you are right in the sense that 8 O(1) inserts are still O(1). Maybe this is as fast as it gets... Therefore I marked your answer as accepted.
@WernerdeGroot: If you are already using sets then it may be the case that this is a case of the XY-Problem, and there are a few additions you can make to your question. If the massive number of triples is what's slowing you down, then where are you getting the triples from? There might be a way to efficiently index the triples without actually iterating over every triple depending on how the list of triples is generated. Also how many triples do you have and how long does it take to add them to the data structure?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.