I'm trying to build a solution that takes start and end coordinates, along with timestamps, and finds the shortest path between them. It uses the UK road network pulled from OSM: given the start and end lat/lon, it finds the closest nodes in the network and then the shortest path between those nodes.
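
To make the node-snapping step concrete, here is a minimal plain-Python sketch. The `nodes` dictionary, the node names and the coordinates are made up for illustration, and the brute-force scan is only an assumption about how the lookup could work; on the full network a spatial index would presumably be needed.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6_371_000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_node(nodes, lat, lon):
    """Return the id of the node closest to (lat, lon) by brute-force scan."""
    return min(nodes, key=lambda n: haversine_m(lat, lon, *nodes[n]))

# Hypothetical node table: node id -> (lat, lon)
nodes = {
    "a": (51.5074, -0.1278),
    "b": (53.4808, -2.2426),
    "c": (55.9533, -3.1883),
}

start_node = nearest_node(nodes, 51.50, -0.12)
end_node = nearest_node(nodes, 55.95, -3.20)
```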

The current engineering pipeline runs on Apache Spark on Dataproc Serverless.

The main issue is that I have to calculate the shortest path for 12 million start/end lat-lon examples. They're not all unique, so I could precompute the distinct routes, but that would still be a large number (around 500,000), and I'd still need to calculate new timestamps for each example regardless.
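
A rough sketch of how the precomputed routes could be reused across examples is below. It assumes each example carries a start timestamp and that a route can be stored as node ids plus cumulative travel seconds; `route_cache`, `stamp_route` and the sample values are hypothetical names and data, not part of the existing pipeline.

```python
from datetime import datetime, timedelta

# Hypothetical cache built once per unique (start_node, end_node) pair:
# route -> (path node ids, cumulative travel time in seconds at each node)
route_cache = {
    ("a", "c"): (["a", "b", "c"], [0.0, 60.0, 150.0]),
}

def stamp_route(start_node, end_node, start_time):
    """Turn a cached route into (node, timestamp) pairs for one example."""
    path, offsets = route_cache[(start_node, end_node)]
    return [(node, start_time + timedelta(seconds=s)) for node, s in zip(path, offsets)]

# Two examples sharing the same route but starting at different times.
examples = [
    ("a", "c", datetime(2024, 2, 16, 8, 0, 0)),
    ("a", "c", datetime(2024, 2, 16, 17, 30, 0)),
]
stamped = [stamp_route(*example) for example in examples]
```

With this split, the expensive path search would run once per unique route, and the per-example work reduces to adding offsets to a start time.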

I'm looking for a solution that, given a graph network, can compute millions of shortest paths and return the points along each path with timestamps derived from the individual example.

Currently it's built as a pandas UDF. The linestrings needed to build the network are broadcast as an Arrow dataframe. For each example, the UDF builds a subgraph covering an area around the start and end points, computes the shortest path, and adds timestamps to the points on the route. This takes a long time and doesn't scale well.

I've explored GraphFrames and GraphX, but the shortest_path algorithm doesn't return the actual path, just the weight, and the bfs doesn't incorporate weights (which in this case are travel times based on the speed of each road). With either approach, I also can't easily iterate over multiple paths (at least not that I've found). Any ideas would be really appreciated.
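
For reference, this is the behaviour I'm after, written as a small plain-Python Dijkstra sketch rather than anything from GraphFrames or GraphX: it uses travel time as the edge weight and returns the actual node sequence with cumulative seconds. The toy `graph`, the `travel_seconds` helper and the lengths and speeds are illustrative assumptions only.

```python
import heapq

def travel_seconds(length_m, speed_kph):
    """Edge weight: time to traverse a road segment at its speed."""
    return length_m / (speed_kph / 3.6)

def shortest_path(graph, source, target):
    """Dijkstra with a predecessor map, so the path itself is recoverable.

    Returns (path, cumulative_seconds) or None if target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    # Walk predecessors back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, [dist[n] for n in path]

# Toy directed graph: node -> {neighbour: travel time in seconds}
graph = {
    "a": {"b": travel_seconds(1000, 60), "c": travel_seconds(1000, 20)},
    "b": {"c": travel_seconds(1500, 60)},
    "c": {},
}

path, cumulative = shortest_path(graph, "a", "c")
```

Here the direct a-to-c edge is shorter in distance but slower, so the weighted search goes via b, which is exactly what an unweighted bfs would miss.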

  • This is an interesting question, but people may vote to close it because it's a bit open-ended. Do you have any accompanying code so that someone trying to answer your question is able to reproduce the relevant part of your process on their own? Commented Feb 16 at 1:51
  • Maybe related: Link Commented Feb 16 at 14:55
