I have an empty DataFrame with 30 columns. I am parsing each file and extracting its metadata into a dictionary. The dictionary keys match the DataFrame column headers, and the number of keys depends on what is available in the file. How do I insert a row into the DataFrame based on the values in the dictionary?

Data in File:

 Col1                 Col2    Col3
 PD  .                 DD:   PERMANENT DATUM
 LMF .                 RT:   LOG MEASURED FROM
 DAPD.FT               98:   FEET ABOVE PERMANENT DATUM
 DMF .                 RT:   DRILLING MEASURED FROM
 EKB .FT               100:   KELLY BUSHING
 EGL .FT             -500:   GROUND LEVEL
 DATE.           08/12/95:   RUN DATE
 RUN .                  3:   RUN NUMBER

DataFrame headers: PERMANENT DATUM, LOG MEASURED FROM, FEET ABOVE PERMANENT DATUM, DRILLING MEASURED FROM, KELLY BUSHING

Desired output: the values in Col2 should become a single row, with each value placed under the DataFrame header that matches its Col3 description.

I wrote code to parse the file and convert it to a dictionary: {'PERMANENT DATUM': 'DD', 'LOG MEASURED FROM': 'RT', 'FEET ABOVE PERMANENT DATUM': '98', 'DRILLING MEASURED FROM': 'RT', 'KELLY BUSHING': '100', 'GROUND LEVEL': '500', 'RUN DATE': '08/12/95', 'RUN NUMBER': '3'}

How do I append the values in this dictionary to the existing DataFrame? The keys in the dictionary match the DataFrame headers and are always a subset of them.

  • Why bother with the empty DataFrame? Use a dict of dicts {'file_name': dict_for_file} to store each file, then you can construct it all at once with pd.DataFrame.from_dict probably using orient='index'. reindex if you then want a certain order or fields that never appeared in any file. Commented Oct 10, 2019 at 19:41
  • Have you solved your problem? Commented Oct 12, 2019 at 12:02
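
The dict-of-dicts suggestion in the first comment could look roughly like the sketch below. The file names and the per-file dictionaries are made up for illustration; only the column headers come from the question. Missing keys end up as NaN, and `reindex` keeps columns that no file happened to supply.

```python
import pandas as pd

# Column headers taken from the question (the real frame has 30 of them).
columns = [
    "PERMANENT DATUM", "LOG MEASURED FROM", "FEET ABOVE PERMANENT DATUM",
    "DRILLING MEASURED FROM", "KELLY BUSHING", "GROUND LEVEL",
    "RUN DATE", "RUN NUMBER",
]

# Hypothetical file names mapped to the metadata parsed from each file.
# Each inner dict holds only the keys that file actually contained.
records = {
    "file_a.txt": {"PERMANENT DATUM": "DD", "LOG MEASURED FROM": "RT", "KELLY BUSHING": "100"},
    "file_b.txt": {"PERMANENT DATUM": "DD", "RUN NUMBER": "3"},
}

# One row per file; reindex enforces the column order and adds
# any header that never appeared in a file as an all-NaN column.
df = pd.DataFrame.from_dict(records, orient="index").reindex(columns=columns)
print(df)
```

Building the frame once at the end avoids growing a DataFrame row by row, which copies the data on every insertion.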

1 Answer

If I understand your problem correctly, given the following inputs:

df = pd.DataFrame(columns=['PERMANENT DATUM', 'LOG MEASURED FROM', 'FEET ABOVE PERMANENT DATUM', 'DRILLING MEASURED FROM', 'KELLY BUSHING', 'GROUND LEVEL', 'RUN DATE', 'RUN NUMBER'])

row = {'PERMANENT DATUM': 'DD', 'LOG MEASURED FROM': 'RT', 'FEET ABOVE PERMANENT DATUM': '98', 'DRILLING MEASURED FROM': 'RT', 'KELLY BUSHING': '100', 'GROUND LEVEL': '500', 'RUN DATE': '08/12/95', 'RUN NUMBER': '3'}

you want to add a row to the DataFrame, which you can do like this:

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df = pd.concat([df, pd.DataFrame([row])], ignore_index=True)

which gives:

  PERMANENT DATUM LOG MEASURED FROM FEET ABOVE PERMANENT DATUM  \
0              DD                RT                         98   

  DRILLING MEASURED FROM KELLY BUSHING GROUND LEVEL  RUN DATE RUN NUMBER  
0                     RT           100          500  08/12/95          3  
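
The question also says a file may supply only some of the headers. As a minimal sketch of that case, assuming a frame that already holds one complete row and a hypothetical second dictionary with just two keys (its values are invented), `pd.concat` aligns on column names and leaves the missing fields as NaN:

```python
import pandas as pd

columns = [
    "PERMANENT DATUM", "LOG MEASURED FROM", "FEET ABOVE PERMANENT DATUM",
    "DRILLING MEASURED FROM", "KELLY BUSHING", "GROUND LEVEL",
    "RUN DATE", "RUN NUMBER",
]

# Existing frame holding the complete row from the question.
full_row = {
    "PERMANENT DATUM": "DD", "LOG MEASURED FROM": "RT",
    "FEET ABOVE PERMANENT DATUM": "98", "DRILLING MEASURED FROM": "RT",
    "KELLY BUSHING": "100", "GROUND LEVEL": "500",
    "RUN DATE": "08/12/95", "RUN NUMBER": "3",
}
df = pd.DataFrame([full_row], columns=columns)

# Hypothetical metadata from another file that had only two of the fields.
partial_row = {"PERMANENT DATUM": "DD", "KELLY BUSHING": "120"}

# Wrap the dict in a one-row frame; concat matches columns by name
# and fills the headers the dict lacks with NaN.
df = pd.concat([df, pd.DataFrame([partial_row])], ignore_index=True)
print(df)
```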