
I have origin and destination airports and the number of flights between each pair of airports in a certain year.

       ORIGIN_AIRPORT DESTINATION_AIRPORT  Counts
0               ABE                 ATL     170
1               ABE                 DTW     154
2               ABE                 ORD      69
3               ABI                 DFW     530
4               ABQ                 ATL     123
...             ...                 ...     ...
4293            XNA                 MSP      63
4294            XNA                 ORD     490
4295            YAK                 CDV      67
4296            YAK                 JNU      67
4297            YUM                 PHX     377

Is there a way to form an adjacency matrix in Python from this data? An entry should be 0 if there is no connection (no flights) between two airports and 1 if there is a connection.

The matrix should be N x N. It should look something like this:


Adjacency Matrix:
        ABE ABI ABQ ATL DTW ORD DFW
ABE     0   0   0   1   1   1   0
ABI     0   0   0   0   0   0   1
ABQ     0   0   0   1   0   0   0
ATL     1   0   1   0   0   0   0
DTW     1   0   0   0   0   0   0
ORD     1   0   0   0   0   0   0
DFW     0   1   0   0   0   0   0

...


  • Yes, just go through the data and fill in a matrix on the fly. What is the problem? You could even use a sparse matrix format to avoid storing the zeros. Commented Jan 16, 2023 at 23:45
  • There are 4297 rows and 315 distinct airports. I cannot go through the data manually, and I need an N x N matrix. Commented Jan 17, 2023 at 8:37
  • I guess that's what code is for: a loop will go through the 4297 rows in milliseconds. Of course, you can probably find a built-in solution that does the trick, but that is not always the case. A built-in solution should be a faster way to do something you already know how to do, not a way to do something you cannot do yourself. Otherwise you will quickly have trouble debugging your code. Commented Jan 17, 2023 at 10:24
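The loop suggested in the comments could look roughly like the sketch below. The `routes` list is a hypothetical stand-in for the first five rows of the question's data, and the names `airports`, `position` and `matrix` are made up for illustration. It also assumes the connection should count in both directions, as in the desired output shown in the question.

```python
import numpy as np

# Hypothetical sample: the first five origin/destination pairs from the question.
routes = [
    ("ABE", "ATL"),
    ("ABE", "DTW"),
    ("ABE", "ORD"),
    ("ABI", "DFW"),
    ("ABQ", "ATL"),
]

# Every airport that appears as an origin or a destination, in sorted order.
airports = sorted({code for pair in routes for code in pair})

# Map each airport code to its row/column position in the matrix.
position = {code: i for i, code in enumerate(airports)}

# Start from an N x N matrix of zeros and mark each connection with a 1.
matrix = np.zeros((len(airports), len(airports)), dtype=int)
for origin, destination in routes:
    i, j = position[origin], position[destination]
    matrix[i, j] = 1
    matrix[j, i] = 1  # assumption: a flight connects the airports in both directions
```

With the real DataFrame, the pairs could be taken from `zip(df["ORIGIN_AIRPORT"], df["DESTINATION_AIRPORT"])` instead of the hard-coded list.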

1 Answer


You can use pd.crosstab(), which counts how often each origin/destination pair occurs:

pd.crosstab(df["ORIGIN_AIRPORT"], df["DESTINATION_AIRPORT"])

On the five rows shown at the top of the question, this outputs:

DESTINATION_AIRPORT  ATL  DFW  DTW  ORD
ORIGIN_AIRPORT
ABE                    1    0    1    1
ABI                    0    1    0    0
ABQ                    1    0    0    0
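Note that the crosstab output above is not square: it only has rows for origins and columns for destinations. One possible way to get the N x N matrix the question asks for is to reindex both axes to the full airport list and then mirror the result. This is a sketch, not the only way to do it. The sample `df` is a hypothetical copy of the first five rows, the names `airports`, `directed` and `adjacency` are made up for illustration, and mirroring assumes a connection counts in both directions.

```python
import pandas as pd

# Hypothetical sample: the first five rows of the question's data.
df = pd.DataFrame({
    "ORIGIN_AIRPORT": ["ABE", "ABE", "ABE", "ABI", "ABQ"],
    "DESTINATION_AIRPORT": ["ATL", "DTW", "ORD", "DFW", "ATL"],
    "Counts": [170, 154, 69, 530, 123],
})

# Every airport that appears as an origin or a destination, in sorted order.
airports = sorted(set(df["ORIGIN_AIRPORT"]) | set(df["DESTINATION_AIRPORT"]))

# Cross-tabulate, then reindex both axes so the table becomes N x N.
directed = pd.crosstab(df["ORIGIN_AIRPORT"], df["DESTINATION_AIRPORT"]).reindex(
    index=airports, columns=airports, fill_value=0
)

# Reduce to 0/1 in case an origin/destination pair appears more than once.
values = (directed.to_numpy() > 0).astype(int)

# Assumption: treat connections as undirected by OR-ing with the transpose.
adjacency = pd.DataFrame(values | values.T, index=airports, columns=airports)
```

If the direction of the flight matters, `directed` on its own is already an N x N matrix and the mirroring step can be skipped.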