1

I need to list all the cities you can reach after stopping at exactly one other city, starting from any city of my choice. Along with each final city, I need the distance to the intermediate city and to the final city.

The tables in the database consist of cities, with the attributes:

| city_id | name       |
------------------------
| 1       | Edinburgh  |
| 2       | Newcastle  |
| 3       | Manchester |

citypairs:

| citypair_id | city_id |
-------------------------
| 1           | 1       |
| 1           | 2       |
| 2           | 1       |
| 2           | 3       |
| 3           | 2       |
| 3           | 3       |

and distances:

| citypair_id | distance |
--------------------------
| 1           | 1234     |
| 2           | 1324     |
| 3           | 1324     |

and trains:

| train_id | departure_city_id | destination_city_id |
------------------------------------------------------
| 1        | 1                 | 2                   |
| 2        | 2                 | 3                   |
| 3        | 1                 | 3                   |
| 4        | 3                 | 2                   |
  4          3                   2

I haven't put any of the data in, but basically: if I choose a city.name at random, I need to find out which cities I can get to from that city by going via one other city (i.e. in two journeys), and then the distance to the intermediate city and to the final city.

How would you, or how should I, go about forming a query to return the desired table?


Edited to include data and a missing table! As an example, you can go from Edinburgh (1) to Manchester (3) via Newcastle (2), and you can go from Edinburgh to Newcastle via Manchester. However, you cannot go from Manchester to Edinburgh via Newcastle (a train departs from 3 and arrives at 2, but no train from 2 arrives at 1), so this route should not be returned by the query. Apologies for any confusion beforehand.
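
For anyone who wants to reproduce this, the sample data above could be set up roughly as follows. This is only a sketch: the column types and constraints are assumptions, since only the column names and sample values are given here.

```sql
-- Assumed types and keys; only the table/column names and values come from the question.
CREATE TABLE cities (
  city_id integer PRIMARY KEY,
  name    varchar(50) NOT NULL
);

CREATE TABLE citypairs (
  citypair_id integer NOT NULL,
  city_id     integer NOT NULL REFERENCES cities (city_id),
  PRIMARY KEY (citypair_id, city_id)
);

CREATE TABLE distances (
  citypair_id integer PRIMARY KEY,
  distance    integer NOT NULL
);

CREATE TABLE trains (
  train_id            integer PRIMARY KEY,
  departure_city_id   integer NOT NULL REFERENCES cities (city_id),
  destination_city_id integer NOT NULL REFERENCES cities (city_id)
);

-- Sample rows exactly as listed in the tables above.
INSERT INTO cities    VALUES (1, 'Edinburgh'), (2, 'Newcastle'), (3, 'Manchester');
INSERT INTO citypairs VALUES (1, 1), (1, 2), (2, 1), (2, 3), (3, 2), (3, 3);
INSERT INTO distances VALUES (1, 1234), (2, 1324), (3, 1324);
INSERT INTO trains    VALUES (1, 1, 2), (2, 2, 3), (3, 1, 3), (4, 3, 2);
```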

4
  • Can you put in some sample data? It's not easy to see how cities are linked. Does the citypair table have two city records for each pair? Putting the schema and some sample data on sqlfiddle.com and posting the link would make things easier. Commented Mar 11, 2013 at 19:31
  • "I need to list all the cities you can get to after stopping off at exactly one other city . . ." Why, all of them, of course. Commented Mar 11, 2013 at 19:52
  • I don't understand why a table called "citypairs" would have rows with a spot for only one "city_id". Commented Mar 11, 2013 at 19:54
  • Sorry it really wasn't as clear as I thought, I also missed out a table, which was pretty special. I have edited the question, hopefully it is clearer now. Thanks Commented Mar 11, 2013 at 20:27

2 Answers

1

I've got a CTE that builds a tree of all the destinations.

WITH RECURSIVE trip AS (
  -- Non-recursive term: one zero-length "trip" starting at every city
  SELECT c.city_id AS start_city,
    ARRAY[c.city_id] AS route,
    cast(c.name AS varchar(100)) AS route_text,
    c.city_id AS leg_start_city,
    c.city_id AS leg_end_city,
    0 AS trip_count,
    0 AS leg_length,
    0 AS total_length
  FROM cities c
UNION ALL
  -- Recursive term: extend each trip by one train leaving its current end city
  SELECT
    trip.start_city,
    trip.route || t.destination_city_id,
    cast(trip.route_text || ',' || c.name AS varchar(100)),
    t.departure_city_id,
    t.destination_city_id,
    trip.trip_count + 1,
    d.distance,
    trip.total_length + d.distance
  FROM trains t
  INNER JOIN trip
    ON t.departure_city_id = trip.leg_end_city
  -- Find the citypair that contains both ends of this leg, to look up its distance
  INNER JOIN citypairs cps
    ON t.departure_city_id = cps.city_id
  INNER JOIN citypairs cpe
    ON t.destination_city_id = cpe.city_id AND
       cpe.citypair_id = cps.citypair_id
  INNER JOIN distances d
    ON cps.citypair_id = d.citypair_id
  INNER JOIN cities c
     ON t.destination_city_id = c.city_id
  -- Skip destinations that are already on the route
  WHERE NOT (array[t.destination_city_id] <@ trip.route))
-- Keep only two-train trips that start from the chosen city
SELECT *
FROM trip
WHERE trip_count = 2
AND start_city = (SELECT city_id FROM cities WHERE name = 'Edinburgh');

The CTE starts from each city (the non-recursive part at the start), then finds every destination city reachable from there. It keeps track of all the cities it has been to in an array (the route column), so it won't loop back on itself. As it progresses through the tree, it also keeps a running total of the trip distance (total_length) and the number of trains taken (trip_count).
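
The loop check relies on PostgreSQL's "is contained by" array operator, <@. As a small standalone illustration (the column aliases are made up for this example), this is how it behaves for a route that has already visited cities 1 and 2:

```sql
-- <@ is true when every element of the left array appears in the right array.
SELECT ARRAY[2] <@ ARRAY[1, 2] AS newcastle_already_visited,   -- true: 2 is on the route
       ARRAY[3] <@ ARRAY[1, 2] AS manchester_already_visited;  -- false: 3 can still be added
```

Wrapping the test in NOT (...) in the recursive term therefore keeps only trains whose destination has not been visited yet.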

This gives results of

| START_CITY | ROUTE |                     ROUTE_TEXT | LEG_START_CITY | LEG_END_CITY | TRIP_COUNT | LEG_LENGTH | TOTAL_LENGTH |
--------------------------------------------------------------------------------------------------------------------------------
|          1 | 1,2,3 | Edinburgh,Newcastle,Manchester |              2 |            3 |          2 |       1324 |         2558 |
|          1 | 1,3,2 | Edinburgh,Manchester,Newcastle |              3 |            2 |          2 |       1324 |         2648 |

If you remove the final WHERE clause, it'll show all the possible trips in the data. Likewise, you can change the trip_count value to find all single-train destinations, etc.

| START_CITY | ROUTE |                     ROUTE_TEXT | LEG_START_CITY | LEG_END_CITY | TRIP_COUNT | LEG_LENGTH | TOTAL_LENGTH |
--------------------------------------------------------------------------------------------------------------------------------
|          1 |     1 |                      Edinburgh |              1 |            1 |          0 |          0 |            0 |
|          2 |     2 |                      Newcastle |              2 |            2 |          0 |          0 |            0 |
|          3 |     3 |                     Manchester |              3 |            3 |          0 |          0 |            0 |
|          1 |   1,2 |            Edinburgh,Newcastle |              1 |            2 |          1 |       1234 |         1234 |
|          1 |   1,3 |           Edinburgh,Manchester |              1 |            3 |          1 |       1324 |         1324 |
|          2 |   2,3 |           Newcastle,Manchester |              2 |            3 |          1 |       1324 |         1324 |
|          3 |   3,2 |           Manchester,Newcastle |              3 |            2 |          1 |       1324 |         1324 |
|          1 | 1,2,3 | Edinburgh,Newcastle,Manchester |              2 |            3 |          2 |       1324 |         2558 |
|          1 | 1,3,2 | Edinburgh,Manchester,Newcastle |              3 |            2 |          2 |       1324 |         2648 |

The cast( ... AS varchar(100)) is a bit hacky, and I'm not sure why it was needed; I haven't had a chance to work around it yet.
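
One likely explanation (an assumption, since the declared type of cities.name isn't shown here): PostgreSQL requires each column of a recursive CTE to have the same type in the non-recursive and the recursive term, and the || concatenation produces text rather than a length-limited varchar. A minimal sketch of an alternative is to cast the starting value to text, so both terms agree without a length limit. The CTE name path_demo and its columns are made up for this illustration:

```sql
WITH RECURSIVE path_demo AS (
  -- Non-recursive term: cast to text so it matches the result of || below
  SELECT 1 AS depth,
         CAST('Edinburgh' AS text) AS route_text
  UNION ALL
  -- Recursive term: text || text yields text, so the column types agree
  SELECT depth + 1,
         route_text || ',next'
  FROM path_demo
  WHERE depth < 3
)
SELECT * FROM path_demo;
```

In the trip query, the equivalent change would be cast(c.name AS text) in the first SELECT, after which the cast in the recursive SELECT should no longer be needed.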

The SQL is here for testing: http://sqlfiddle.com/#!1/93964/24

0

The first part is easy:

SELECT c2.name
FROM cities AS c
JOIN trains t ON c.city_id=t.departure_city_id
JOIN trains t2 ON t.destination_city_id=t2.departure_city_id
JOIN cities AS c2 ON t2.destination_city_id=c2.city_id
WHERE c2.city_id!=c.city_id
AND c.name='Edinburgh';

http://sqlfiddle.com/#!12/a656f/14

In PG 9.1+ you could even do it with a recursive CTE for any number of cities in between. The distances are a little more complicated, and you would probably be better off transforming citypairs into actual pairs.
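
For the distances, one possible approach is to derive actual pairs from citypairs on the fly and join them to each leg. This is a sketch rather than something tested against the fiddle; pair_distance, c_via and the output column aliases are names introduced here for illustration.

```sql
-- pair_distance: one row per ordered pair of distinct cities sharing a citypair_id,
-- so each leg can be looked up in either direction.
WITH pair_distance AS (
  SELECT a.city_id AS city_a,
         b.city_id AS city_b,
         d.distance
  FROM citypairs a
  JOIN citypairs b
    ON a.citypair_id = b.citypair_id
   AND a.city_id <> b.city_id
  JOIN distances d
    ON d.citypair_id = a.citypair_id
)
SELECT c2.name                   AS final_city,
       c_via.name                AS via_city,
       d1.distance               AS distance_to_via,
       d1.distance + d2.distance AS distance_to_final
FROM cities AS c
JOIN trains t  ON c.city_id = t.departure_city_id
JOIN trains t2 ON t.destination_city_id = t2.departure_city_id
JOIN cities AS c_via ON c_via.city_id = t.destination_city_id
JOIN cities AS c2    ON t2.destination_city_id = c2.city_id
-- Distance of the first leg (start -> via)
JOIN pair_distance d1
  ON d1.city_a = t.departure_city_id
 AND d1.city_b = t.destination_city_id
-- Distance of the second leg (via -> final)
JOIN pair_distance d2
  ON d2.city_a = t2.departure_city_id
 AND d2.city_b = t2.destination_city_id
WHERE c2.city_id != c.city_id
  AND c.name = 'Edinburgh';
```

With the sample data in the question, this should give Manchester via Newcastle (1234, then 2558 in total) and Newcastle via Manchester (1324, then 2648 in total).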

3 Comments

Thanks, that does seem to make sense to me. However, when you set Manchester as the departing city as you did in your second query on sqlfiddle it returns Newcastle. It shouldn't return anything since you can only go to Newcastle from Manchester in one journey, the next journey would take you back to Manchester. I can't work out where it is going wrong though, the query seems good to me. Any ideas?
@HalfFrench the join was supposed to be between t2 and c2, but was between t and c2. Edited.
Just as a note, CTEs are supported on PostgreSQL 8.4 and higher. The new features in this area in 9.1 (the ability to write data in a CTE) don't have any bearing here.
