I have a large data frame with ~20 years of data. I would like to group this data frame by YEAR, and then add the same set of new X values to each group. I'm having trouble figuring out how to use pd.concat with groupby. How can I use pd.concat and df.groupby together?
Below is a subset of my data frame (I deleted a bunch of rows just to show that I have multiple years that I would like to group by).
My data frame:
XSNO YEAR X Z
5 LOL001 1978 0.22 -0.44
6 LOL001 1978 0.95 -0.55
7 LOL001 1978 1.70 -1.01
8 LOL001 1978 2.10 -1.22
9 LOL001 1978 2.68 -1.34
10 LOL001 1978 3.27 -1.41
48 LOL001 1978 17.60 -1.86
49 LOL001 1978 18.21 -1.77
50 LOL001 1978 18.41 -1.65
51 LOL001 1978 18.67 -1.54
52 LOL001 1978 19.00 -1.5
68 LOL001 1978 23.60 -0.31
78 LOL001 1980 0.40 -0.56
79 LOL001 1980 1.50 -0.91
80 LOL001 1980 2.50 -1.25
81 LOL001 1980 3.20 -1.43
82 LOL001 1980 3.90 -1.44
83 LOL001 1980 4.50 -1.55
84 LOL001 1980 5.80 -1.22
101 LOL001 1980 21.50 -0.96
102 LOL001 1980 22.50 -0.69
103 LOL001 1980 23.60 -0.43
104 LOL001 1980 25.10 -0.09
107 LOL001 1981 0.30 -0.40
108 LOL001 1981 0.60 -0.56
109 LOL001 1981 2.40 -1.20
110 LOL001 1981 4.40 -1.34
111 LOL001 1981 7.00 -1.10
112 LOL001 1981 8.60 -1.49
What I would like the output to be (just a subset of the added values for one year):
XSNO YEAR X Z
LOL004 1978 0 NaN
LOL003 1978 0.05 NaN
LOL002 1978 0.1 NaN
LOL001 1978 0.15 NaN
LOL000 1978 0.2 NaN
LOL001 1978 0.22 -0.44
LOL002 1978 0.25 NaN
LOL003 1978 0.3 NaN
LOL004 1978 0.35 NaN
LOL005 1978 0.4 NaN
LOL006 1978 0.45 NaN
LOL007 1978 0.5 NaN
LOL008 1978 0.55 NaN
LOL009 1978 0.6 NaN
LOL010 1978 0.65 NaN
LOL011 1978 0.7 NaN
LOL012 1978 0.75 NaN
LOL013 1978 0.8 NaN
LOL014 1978 0.85 NaN
LOL001 1978 0.95 -0.55
max = df.X.max()
x = np.arange(0, max, 0.05)
x = pd.DataFrame({'X': x})
concat_df = df.groupby(['YEAR']).apply(lambda x: x.concat([df1, x]))
# this doesn't work and gives me an error
concat = pd.concat([df1, x])
# this doesn't give me what I want; it just tacks all the 'x' values (new values) on at the end.
I'm not sure how to use the merge/join/concat functions with a grouped pandas data frame. I can't seem to find any other questions/answers on Stack Overflow that get at what I'm looking for.
pd.concat() is like .append(): it just adds the new data at the end of the first dataframe.
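Since pd.concat only appends, one way to get the per-year result is to loop over the groups, concatenate each group with the shared grid of X values, sort by X, and then concatenate the pieces back together. Below is a minimal sketch under some assumptions: the filler rows take the group's own XSNO and YEAR and leave Z as NaN, grid points that coincide with a measured X are kept as separate rows, and the small sample frame plus the names `grid`, `pieces` and `result` are illustrative rather than taken from the post. Note also that pd.concat is a top-level pandas function; data frames have no .concat method, which is probably one reason the apply attempt raises an error.

```python
import numpy as np
import pandas as pd

# Small illustrative sample with the same columns as the question's frame.
df = pd.DataFrame({
    "XSNO": ["LOL001"] * 6,
    "YEAR": [1978, 1978, 1978, 1980, 1980, 1980],
    "X": [0.22, 0.95, 1.70, 0.40, 1.50, 2.50],
    "Z": [-0.44, -0.55, -1.01, -0.56, -0.91, -1.25],
})

# Shared grid of new X values; rounding trims float noise from arange.
x_max = df["X"].max()
grid = pd.DataFrame({"X": np.arange(0, x_max, 0.05).round(2)})

pieces = []
for year, group in df.groupby("YEAR"):
    # Tag the grid rows with this group's identifiers; Z is absent, so it becomes NaN.
    filler = grid.assign(XSNO=group["XSNO"].iloc[0], YEAR=year)
    # Interleave measured and new rows by sorting on X within the year.
    combined = pd.concat([group, filler]).sort_values("X", kind="stable")
    pieces.append(combined)

# Stack the per-year pieces into a single frame with a fresh index.
result = pd.concat(pieces, ignore_index=True)
print(result.head(10))
```

If the grid should not repeat X values that were actually measured, the filler rows could be filtered against the group's X values before concatenating; whether that is wanted depends on what the downstream step (interpolation, for example) expects.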