I have a DataFrame that looks like this (the code to produce it is at the end):
... and I want to split the index column into its components, to get to this:
There can be a variable number of comma-separated numbers after each Type.ID. I've written a function that does the splitting for an individual string, but I don't know how to apply it to a whole column (I looked at apply).
Thank you for your help! Code to set up input DataFrame:
import pandas as pd

df = pd.DataFrame({
    'index': pd.Series(['FirstType.FirstID', 'OtherType.OtherID,1', 'OtherType.OtherID,4', 'LastType.LastID,1,1', 'LastType.LastID,1,2', 'LastType.LastID,2,3'], dtype='object', index=pd.RangeIndex(start=0, stop=6, step=1)),
    'value': pd.Series([0.23, 50, 60, 110.0, 199.0, 123.0], dtype='float64', index=pd.RangeIndex(start=0, stop=6, step=1)),
}, index=pd.RangeIndex(start=0, stop=6, step=1))
Code to split up index values:
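import re

def get_header_properties(header):
    # Everything before the first dot is the type.
    pf_type = re.match(r".*?(?=\.)", header).group()
    # The ID runs from just after "<type>." up to the first comma (or end of string).
    # re.escape keeps special characters in the captured text from being read as regex syntax.
    pf_id = re.search(rf"(?<={re.escape(pf_type)}\.).*?(?=(,|$))", header).group()
    # Whatever follows the ID is the comma-separated coordinate list (may be empty).
    pf_coords = re.search(rf"(?<={re.escape(pf_id)}).*", header).group()
    return pf_type, pf_id, pf_coords.split(",")[1:]

get_header_properties("Type.ID,0.625,0.08333")
#-> ('Type', 'ID', ['0.625', '0.08333'])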
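For illustration, here is a minimal sketch of how a per-string splitter like this could be run over the whole column with Series.apply. The helper split_header is a simplified stand-in for the function above (it uses plain str.split rather than regexes). The column names type, id and coords are placeholders I made up, and the coordinates are kept as a list per row because the exact target layout is an assumption.

```python
import pandas as pd

def split_header(header):
    """Hypothetical helper: split 'Type.ID,c1,c2' into (type, id, [c1, c2])."""
    head, *coords = header.split(",")      # first piece is 'Type.ID', rest are coordinates
    pf_type, pf_id = head.split(".", 1)    # split only on the first dot
    return pf_type, pf_id, coords

# Small sample in the same shape as the input DataFrame.
df = pd.DataFrame({
    "index": ["FirstType.FirstID", "OtherType.OtherID,1", "LastType.LastID,1,2"],
    "value": [0.23, 50.0, 199.0],
})

# Series.apply calls the function once per cell and returns a Series of tuples.
parsed = df["index"].apply(split_header)

# Unpack the tuples into separate columns (placeholder names), keeping the original row labels.
out = pd.DataFrame(parsed.tolist(), columns=["type", "id", "coords"], index=df.index)
out["value"] = df["value"]
print(out)
```

Keeping the coordinates as a list sidesteps the variable-length problem. If one column per coordinate is needed instead, the lists would have to be padded to a common length first.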

