Python: Intersection of Two 2D Arrays

Question

I have data in a .csv file called 'Max.csv':

Valid Date  MAX
1/1/1995    51
1/2/1995    45
1/3/1995    48
1/4/1995    45

Another csv called 'Min.csv' looks like:

Valid Date  MIN
1/2/1995    33
1/4/1995    31
1/5/1995    30
1/6/1995    39

I want to generate two dictionaries, or any other suggested data structure, so that I have two separate variables, Max and Min, in Python as follows:

Valid Date  MAX
1/2/1995    45
1/4/1995    45

Valid Date  MIN
1/2/1995    33
1/4/1995    31

i.e. select the elements from Max and Min so that only the common elements are output.

I am thinking about using numpy.intersect1d, but that means I have to separately compare the Max and Min first on date column, find the index of common dates and then grab the second columns for Max and Min. This appears too complicated and I feel there are smarter ways to intersect two curves Max and Min.

Eelco Hoogendoorn · Accepted Answer · 2016-05-20 06:07:07Z

You mention that:

I have to separately compare the Max and Min first on date column, find the index of common dates and then grab the second columns for Max and Min. This appears too complicated...

Indeed this is fundamentally what you need to do, one way or the other; but using the numpy_indexed package (disclaimer: I am its author), this isn't complicated in the slightest:

import numpy_indexed as npi
common_dates = npi.intersection(min_dates, max_dates)
print(max_values[npi.indices(max_dates, common_dates)])
print(min_values[npi.indices(min_dates, common_dates)])

Note that this solution is fully vectorized (it contains no Python-level loops), and as such should be much faster on large inputs than the currently accepted answer.

Note 2: this assumes the values in each date column are unique; if they are not, replace 'npi.indices' with 'npi.in_'.
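For readers without numpy_indexed installed, the same idea can be sketched with plain NumPy. This swaps in np.intersect1d with return_indices=True rather than the npi calls above; the four arrays are hypothetical stand-ins for the columns of the two files, and the dates are assumed to be unique within each file.

```python
import numpy as np

# Hypothetical arrays standing in for the columns of Max.csv and Min.csv.
max_dates = np.array(["1/1/1995", "1/2/1995", "1/3/1995", "1/4/1995"])
max_values = np.array([51, 45, 48, 45])
min_dates = np.array(["1/2/1995", "1/4/1995", "1/5/1995", "1/6/1995"])
min_values = np.array([33, 31, 30, 39])

# return_indices=True also returns, for each input, the positions
# of the common dates, so the value columns can be indexed directly.
common_dates, max_idx, min_idx = np.intersect1d(
    max_dates, min_dates, return_indices=True
)

common_max = max_values[max_idx]
common_min = min_values[min_idx]
print(common_dates, common_max, common_min)
```

Because the dates here are strings, common_dates comes back in lexicographic rather than chronological order; parsing them to datetime64 first would be one way around that if ordering matters.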

JeanPaulDepraz · Accepted Answer · 2016-05-19 18:28:13Z

The set() builtin should be enough, as follows:

>>> # Named max_by_date / min_by_date to avoid shadowing the builtins max() and min()
>>> max_by_date = {"1/1/1995":"51", "1/2/1995":"45", "1/3/1995":"48", "1/4/1995":"45"}
>>> min_by_date = {"1/2/1995":"33", "1/4/1995":"31", "1/5/1995":"30", "1/6/1995":"39"}

>>> a = set(max_by_date)  # set of the dates (dict keys)
>>> b = set(min_by_date)
>>> {x: max_by_date[x] for x in a.intersection(b)}
{'1/4/1995': '45', '1/2/1995': '45'}
>>> {x: min_by_date[x] for x in a.intersection(b)}
{'1/2/1995': '33', '1/4/1995': '31'}
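To get from the files to dictionaries like these, something along the following lines might work. It is only a sketch: it assumes the files are comma-separated with a header row, uses in-memory strings in place of the real Max.csv and Min.csv, and read_column is a hypothetical helper, not part of any library.

```python
import csv
import io

# Assumed file contents (comma-separated, header row); stand-ins for Max.csv and Min.csv.
MAX_CSV = "Valid Date,MAX\n1/1/1995,51\n1/2/1995,45\n1/3/1995,48\n1/4/1995,45\n"
MIN_CSV = "Valid Date,MIN\n1/2/1995,33\n1/4/1995,31\n1/5/1995,30\n1/6/1995,39\n"


def read_column(handle, value_column):
    """Map each 'Valid Date' to the integer found in value_column."""
    return {row["Valid Date"]: int(row[value_column]) for row in csv.DictReader(handle)}


max_by_date = read_column(io.StringIO(MAX_CSV), "MAX")
min_by_date = read_column(io.StringIO(MIN_CSV), "MIN")

# Dict key views support set operations, so & yields the shared dates directly.
common = max_by_date.keys() & min_by_date.keys()

# sorted() gives a deterministic (lexicographic) order for the results.
common_max = {d: max_by_date[d] for d in sorted(common)}
common_min = {d: min_by_date[d] for d in sorted(common)}
print(common_max)
print(common_min)
```

With real files, io.StringIO(MAX_CSV) would be replaced by open("Max.csv", newline="").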

6 Comments

Zanam Over a year ago

Can you please provide a hint on how to create a set from the csv file? I use pandas to read the csv file into a dataframe.

JeanPaulDepraz Over a year ago

This might help you: chrisalbon.com/python/pandas_dataframe_importing_csv.html

JeanPaulDepraz Over a year ago

Please upvote and accept my answer; I delivered.

JeanPaulDepraz Over a year ago

Zanam Did you succeed?

Zanam Over a year ago

Yes, I did, but I prefer @Eelco's answer as it does not run a loop.
