Manipulating dataframe rows Python

Question

I have a dataframe (adjusted for simplicity) as follows:

                Location Code  Technology  Latitude  Longitude  ... Frequency
0                    ABLSERVP      Type A      11.1       11.1  ...       850
2                    ABLSERVP      Type A      11.1       11.1  ...       700
4                    ABLSERVP      Type B      11.1       11.1  ...       850
...                       ...         ...       ...        ...  ...       ...
1300                    CSEY3      Type A      22.2       22.2  ...      2100
1301                    CSEY3      Type A      22.2       22.2  ...       700
...                       ...         ...       ...        ...  ...       ...
265064                  CSEY1      Type A      33.3       33.3  ...       750
265065                  CSEY3      Type B      22.2       22.2  ...       850

What I'm trying to achieve:

                Location Code  Technologies  Latitude  Longitude  ...  Type A's  Type B's  ...  
0                    ABLSERVP      Type A,B      11.1       11.1  ...   700,850       850  ...
...                       ...         ...         ...       ...        
265064                  CSEY1        Type A      33.3       33.3  ...       750       n/a  ...
265065                  CSEY3      Type A,B      22.2       22.2  ...  700,2100       850  ...

Since the real dataframe has many more columns and rows, I used ellipses to stand in for them. Is there any way to do this without looping through the entire dataframe? I've read that looping is inefficient and should be a last resort.

My attempt: I first sorted based on location code as follows:

x = x.sort_values(by='Location Code')

I thought I could get the required result by doing:

df = x.groupby(['Location Code', 'Technology']).sum()

This obviously doesn't work as it sums the frequencies instead of listing them. Any help?

Since I don't want you guys to type everything out, I created a replica of the dataframe:

import pandas as pd

# creating lists
l1 = ["ABLSERVP", "ABLSERVP", "ABLSERVP", "CSEY3", "CSEY3", "CSEY1", "CSEY3"]
l2 = ["Type A", "Type A", "Type B", "Type A", "Type A", "Type A", "Type B"]
l3 = [850, 700, 850, 2100, 700, 750, 850]

# creating the DataFrame
x = pd.DataFrame(list(zip(l1, l2, l3)))
x.columns = ['Location Code', 'Technology', 'Frequency']
x = x.sort_values(by='Location Code')

– tareenmj, Commented Nov 18, 2021 at 20:06

not_speshal · Accepted Answer · 2021-11-19 14:10:04Z

Try with groupby, pivot_table and join:

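# Unique technologies per location, joined into one string
tech = x.groupby(["Location Code", "Latitude", "Longitude"])["Technology"].agg(lambda s: ", ".join(s.unique().tolist()))
# One column per technology, listing that technology's frequencies as a string
pivoted = (x.pivot_table(index=["Location Code", "Latitude", "Longitude"],
                         columns="Technology",
                         values="Frequency",
                         aggfunc=lambda s: ", ".join(s.astype(str)))
           )
# Both results share the same index, so join aligns them row by row
output = tech.to_frame().join(pivoted)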

>>> output
                                      Technology     Type A Type B
Location Code Latitude Longitude                                  
ABLSERVP      11.1     11.1       Type A, Type B   850, 700    850
CSEY1         33.3     33.3               Type A        750    NaN
CSEY3         22.2     22.2       Type A, Type B  2100, 700    850
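
The target layout in the question lists each technology's frequencies in ascending order, shows n/a where a location lacks a technology, and keeps the location columns as ordinary columns. A minimal sketch of that variation, assuming the sample rows from the question and its comment (`join_sorted` and `keys` are hypothetical names introduced here, not part of pandas):

```python
import pandas as pd

# Sample rows taken from the question (replica data plus the coordinates shown there)
x = pd.DataFrame({
    "Location Code": ["ABLSERVP", "ABLSERVP", "ABLSERVP", "CSEY3", "CSEY3", "CSEY1", "CSEY3"],
    "Technology": ["Type A", "Type A", "Type B", "Type A", "Type A", "Type A", "Type B"],
    "Latitude": [11.1, 11.1, 11.1, 22.2, 22.2, 33.3, 22.2],
    "Longitude": [11.1, 11.1, 11.1, 22.2, 22.2, 33.3, 22.2],
    "Frequency": [850, 700, 850, 2100, 700, 750, 850],
})

keys = ["Location Code", "Latitude", "Longitude"]


def join_sorted(s):
    """Join a group's values into a comma-separated string, in ascending order."""
    return ", ".join(str(v) for v in sorted(s))


# Unique technologies per location, in alphabetical order
tech = x.groupby(keys)["Technology"].agg(lambda s: ", ".join(sorted(s.unique())))

# One column per technology holding that technology's sorted frequencies
pivoted = x.pivot_table(index=keys, columns="Technology",
                        values="Frequency", aggfunc=join_sorted)

# Mark missing technologies as "n/a" and turn the index back into columns
output = tech.to_frame().join(pivoted.fillna("n/a")).reset_index()
print(output)
```

With this data the ABLSERVP row carries "700, 850" under Type A, and CSEY1 carries "n/a" under Type B.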

edited Nov 19, 2021 at 14:10

answered Nov 18, 2021 at 20:29


3 Comments

not_speshal Over a year ago

The answer should work on a DataFrame of any size as long as the structure is the same :)

tareenmj Over a year ago

I didn't realize this, but I forgot to include some extra columns (for example latitude and longitude, which I've now added to the original question). My apologies for that. They are not being included in the output. Is there any way to include these columns too?

not_speshal Over a year ago

@tareenmj - Add all the extra columns to the groupby and index of the pivot. See the edit.
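
When there are many extra columns, typing them all out twice gets tedious. One possible way to follow the advice above is to derive the key list from the DataFrame itself. This is only a sketch: it assumes every column other than Technology and Frequency holds a single value per location (otherwise a location would split into several rows), and `value_cols` and `keys` are hypothetical names introduced here.

```python
import pandas as pd

# Sample rows taken from the question (replica data plus the coordinates shown there)
x = pd.DataFrame({
    "Location Code": ["ABLSERVP", "ABLSERVP", "ABLSERVP", "CSEY3", "CSEY3", "CSEY1", "CSEY3"],
    "Technology": ["Type A", "Type A", "Type B", "Type A", "Type A", "Type A", "Type B"],
    "Latitude": [11.1, 11.1, 11.1, 22.2, 22.2, 33.3, 22.2],
    "Longitude": [11.1, 11.1, 11.1, 22.2, 22.2, 33.3, 22.2],
    "Frequency": [850, 700, 850, 2100, 700, 750, 850],
})

# Assumption: these two columns vary within a location; everything else is a key
value_cols = ["Technology", "Frequency"]
keys = [c for c in x.columns if c not in value_cols]

# Same groupby / pivot_table / join steps as in the answer, driven by `keys`
tech = x.groupby(keys)["Technology"].agg(lambda s: ", ".join(s.unique()))
pivoted = x.pivot_table(index=keys, columns="Technology",
                        values="Frequency",
                        aggfunc=lambda s: ", ".join(s.astype(str)))
output = tech.to_frame().join(pivoted)
print(output)
```

Any column added to the DataFrame later is then picked up automatically, as long as the per-location assumption holds for it.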
