Encode interval as string for intersection query

Ask Question

Asked 4 years, 4 months ago

Modified 4 years, 4 months ago

Viewed 140 times

Part of AWS Collective

The goal is to check if two intervals intersect one another. Easy... right? Just plug this guy in and we can move on?

intersects = itemStart < queryEnd && queryStart < itemEnd

Not so fast, we have the following constraints:

We cannot use the candidate item's interval directly during the intersection check. We must pre-process the interval [itemStart, itemEnd] => encodedItem. The encoded item interval must be either a number or a string. We can however use the [queryStart, queryEnd] separately or not during the check.
When performing the intersection check we are limited to the following operations: beginsWith, between. No further boolean operators can be applied. (These are limitations of dynamodb).
- beginsWith(a, b) checks whether string a starts with string b.
- between(a, b, c) checks whether a is between b and c. Works with strings and numbers.
The start and end values of the items are in the range [0, 3254778254386].

In summary, the final intersection check must take the form of one of these two:

intersects = between(encodedItem, encodedQueryStart, encodedQueryEnd)

intersects = beginsWith(encodedItem, encodedQuery)

Perhaps there is a mathematical proof that flattening the 2 dimensional interval into 1 dimension prevents us from inspecting whether the intervals intersect.

The actual goal

While I am interested in random string problems as a pastime, there is a real world problem underlying this one. I am trying to perform date range intersection queries on a key value database, in this case dynamodb

What I've tried so far

Z-Ordering

I have been looking into geospatial indexes and Z-Ordering which has given me some faith that this is possible. [0][1][2][3] With Z-ordering you can work out if a query interval contains another by interleaving the interval into a string.

e.g.

            itemA = { start: 123, end: 456 }
     encodedItemA = "142536"
            itemB = { start: 111, end: 115 }
     encodedItemB = "111115"
            query = { start: 110, end: 119 }
encodedQueryStart = "111100" // interleave query.start with itself
  encodedQueryEnd = "111199" // interleave query.end with itself

Once this preprocessing is done I can do

isContained = between(encodedItem, encodedQueryStart, encodedQueryEnd)
isContained = between("142536", "111100", "111199") // returns false ✅
isContained = between("111115", "111100", "111199") // returns true ✅

However, "isContained" is not the same as "intersects".

Redundancy

The other thing I've explored is expanding every item so that it is no longer an interval but represented by many points facilitating prefix search. e.g.

itemA = { start: 123, end: 456 } is inserted as 334 rows:
           itemA1 = 123
           itemA2 = 124
              ...
           itemAN = 456
itemB = { start: 111, end: 115 } is inserted as 5 rows:
           itemB1 = 111
           itemB2 = 112
              ...
           itemBN = 115
query = { start: 110, end: 119 }
encodedQuery = commonPrefix(query.start, query.end) = "11"
intersects = beginsWith("111", "11") // returns false ✅
intersects = beginsWith("123", "11") // returns true ✅

This comes with its own trade offs.

Is there any other clever wizardry I could try to make this work? 🪄

edited Jul 28, 2021 at 13:51

asked Jul 27, 2021 at 13:24

david_adler

11.1k10 gold badges72 silver badges122 bronze badges

Out of interest, what solution did you settle on?

Oliver
– Oliver

2022-06-15 14:49:02 +00:00
Commented Jun 15, 2022 at 14:49
1

In the end we skipped this problem and decided it was easier for the client to download all the relevant data and do the query on the client

david_adler
– david_adler

2022-06-15 15:53:08 +00:00
Commented Jun 15, 2022 at 15:53

Add a comment |

0 Your Answer

Sign up or log in

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.

Collectives™ on Stack Overflow

Encode interval as string for intersection query

The actual goal

What I've tried so far

0

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

The actual goal

What I've tried so far

0

Know someone who can answer? Share a link to this question via email, Twitter, or Facebook.

Your Answer

Sign up or log in

Post as a guest

Linked