I'm trying to select the values that have broken the record high or low for their day. I'm comparing against a DataFrame that holds the record high and low for each day as two separate columns. The end goal is a scatterplot of the (date, value) pairs that set new records, drawn over a line graph of the old record values (using matplotlib).
Here's an example dataset.
import pandas as pd

new_data = {'Date': ['1/1/2015', '1/2/2015', '1/3/2015', '1/4/2015', '1/5/2015'],
            'new_low': [10, 25, 24, 21, 15],
            'new_high': [35, 37, 38, 55, 47]}
record_data = {'Day': ['1/1', '1/2', '1/3', '1/4', '1/5'],
               'record_low': [12, 28, 21, 25, 15],
               'record_high': [30, 40, 36, 57, 46]}
df_new = pd.DataFrame(new_data)
df_new.set_index('Date', inplace=True)
df_record = pd.DataFrame(record_data)
df_record.set_index('Day', inplace=True)
So the two DataFrames look like this:

          new_low  new_high    (df_new)
Date
1/1/2015       10        35
1/2/2015       25        37
1/3/2015       24        38
1/4/2015       21        55
1/5/2015       15        47

      record_low  record_high    (df_record)
Day
1/1           12           30
1/2           28           40
1/3           21           36
1/4           25           57
1/5           15           46
I want the result to look something like this:
       Date  Record Value
0  1/1/2015            10
1  1/2/2015            25
2  1/4/2015            21
3  1/1/2015            35
4  1/3/2015            38
5  1/5/2015            47
Since I need to use the result with matplotlib to make a scatterplot, I will need a list of x-values and y-values to enter. My example result was a dataframe that I made, but it doesn't need to be. I could use two separate arrays or even a list of tuples that I could unzip into lists of x and y.
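To make the target concrete, here is a rough sketch of the comparison described above. It is one possible approach under stated assumptions, not a definitive solution: it assumes the dates are 'M/D/YYYY' strings, that every day in the new data has a matching row in the record data, and that a new value only counts when it is strictly below the record low or strictly above the record high. The names `day_key`, `records`, `broken`, and `result` are made up for illustration.

```python
import pandas as pd

new_data = {'Date': ['1/1/2015', '1/2/2015', '1/3/2015', '1/4/2015', '1/5/2015'],
            'new_low': [10, 25, 24, 21, 15],
            'new_high': [35, 37, 38, 55, 47]}
record_data = {'Day': ['1/1', '1/2', '1/3', '1/4', '1/5'],
               'record_low': [12, 28, 21, 25, 15],
               'record_high': [30, 40, 36, 57, 46]}
df_new = pd.DataFrame(new_data).set_index('Date')
df_record = pd.DataFrame(record_data).set_index('Day')

# Drop the year from each date so it matches the 'M/D' keys of df_record
# (assumption: dates are formatted as 'M/D/YYYY').
day_key = [d.rsplit('/', 1)[0] for d in df_new.index]

# Line the records up row by row with df_new, then reuse df_new's index.
records = df_record.reindex(day_key)
records.index = df_new.index

# Strict comparisons: a tie with the old record is not treated as a new record.
low_mask = df_new['new_low'] < records['record_low']
high_mask = df_new['new_high'] > records['record_high']

# Stack the record-breaking lows and highs into one Series indexed by date.
broken = pd.concat([df_new.loc[low_mask, 'new_low'],
                    df_new.loc[high_mask, 'new_high']])
result = broken.rename('Record Value').rename_axis('Date').reset_index()

# x/y sequences for a scatterplot; x is converted to real datetimes.
x = pd.to_datetime(result['Date'], format='%m/%d/%Y')
y = result['Record Value'].tolist()
```

With `x` and `y` in this shape, something like `plt.scatter(x, y)` should accept them directly, and `result` has the same layout as the example output above.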
I feel like there should be some simple/elegant way to do this with mapping, but I'm not experienced enough to find it and I haven't been able to find an answer elsewhere.
I'm also having some issues with how to enter the record data with just a month and day as a datestamp, so I've just set them all to the same year. It works for my visualization, but I would rather not do that to the data.
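One idea for the month/day problem, sketched here only as an assumption about what might work: key the records by an integer (month, day) pair instead of a timestamp, so no year is ever attached to them, and derive the same pair from the real dates when a lookup is needed. The names `record_by_day` and `lows_for_dates` are hypothetical, and only the record lows are shown to keep it short.

```python
import pandas as pd

record_days = ['1/1', '1/2', '1/3', '1/4', '1/5']
record_low = [12, 28, 21, 25, 15]

# Map each 'M/D' string to an integer (month, day) key; no year is involved.
record_by_day = {
    tuple(int(part) for part in day.split('/')): low
    for day, low in zip(record_days, record_low)
}

# Real dates keep their year; only the lookup key ignores it.
dates = pd.to_datetime(['1/1/2015', '1/4/2015'], format='%m/%d/%Y')
lows_for_dates = [record_by_day[(d.month, d.day)] for d in dates]
```

For plotting, the record line could then use the actual dates from the new data as x-values, with the y-values looked up this way, so the record data itself stays year-free.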