I have a list with information about ~400 collected papers where each element of the list is organized as follows:

title: "A 281.6mW Real-Time Depth Signal Processing Unit for Deep Learning-Based Dense  ",
author: "Im, Park, Li, et al.",
affiliation: "KAI ",
year: 2022,
publisher: "IEEE International SState CC",
url: "-",
short: "ISSCC 22'",
fabric: "28nm CMOS @ 0.72-1.1V 250MHz ",
accl_s: 4500,
accl_p: 0.5447,
ee: "8261.4 GOPs/W",
s1: 2,
s2: 1,
s3: 2,
support: 1,
performance: 2,
benchmark: "Depth CNN(8-bit float)",
mul_p: 1,
w_p: 3,
a_p: 3,
sparse_support: 2,
sparse_policy: 1,
sparsity: 1

How can I convert such a list to a dataframe with multiple columns (title, author, affiliation, ...)? Here is the code I used to read in the data:

my_file = open('data.txt', 'r')
lines = my_file.read().split("},")
  • What kind of data structure is this? A list of dictionaries? Commented Oct 6, 2022 at 11:48
  • Are we talking about a list or a dictionary? Commented Oct 6, 2022 at 11:48
  • The data is stored as a text file. I will add a snippet of the code I used to read the data. Commented Oct 6, 2022 at 11:52

1 Answer

EDIT: Since you've posted:

my_file = open('data.txt', 'r')
lines = my_file.read().split("},")

You can first parse each raw string into a dict and store it back in the lines list:

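for i in range(len(lines)):
    data_dict = {}
    for line in lines[i].split('\n'):
        # Skip blank lines and lone braces that carry no "key: value" pair
        if ':' not in line:
            continue
        # Split on the first colon only, so values containing ':' stay intact
        k, v = line.strip().split(':', 1)
        # Drop the trailing comma and the surrounding quotes from the value
        data_dict[k.strip()] = v.strip().rstrip(',').strip('"')
    lines[i] = data_dict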
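The same parsing step can also be sketched as a small reusable function. This is only a sketch under assumptions: each field sits on its own line as `key: value`, string values are wrapped in double quotes, and the name `parse_record` is hypothetical. The sample text reuses a few fields from the question, except the `url` value, which is made up to show a value that itself contains a colon.

```python
def parse_record(chunk):
    """Turn one block of 'key: value' lines into a dict (hypothetical helper)."""
    record = {}
    for raw in chunk.splitlines():
        # Remove whitespace plus any braces or commas at the ends of the line
        line = raw.strip().strip("{},").strip()
        if ":" not in line:
            continue  # blank line or lone brace
        # Split on the first colon only, so colons inside values survive
        key, value = line.split(":", 1)
        key, value = key.strip(), value.strip()
        if len(value) >= 2 and value[0] == value[-1] == '"':
            # Quoted value: keep it as a string, without the quotes
            record[key] = value[1:-1].strip()
            continue
        # Unquoted value: try int first, then float, else keep the raw text
        for cast in (int, float):
            try:
                value = cast(value)
                break
            except ValueError:
                pass
        record[key] = value
    return record


sample = '''{
short: "ISSCC 22'",
year: 2022,
accl_p: 0.5447,
url: "https://example.org/paper",
'''
record = parse_record(sample)
print(record)
```

Converting the numbers at this stage means the resulting dataframe columns hold numeric values rather than strings, which may or may not be what you want for columns such as `ee`.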

Then you could use this dirty solution:

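import pandas as pd

# Use the keys of the first record as column names
keys = [str(k) for k in lines[0]]
df = pd.DataFrame(columns=keys)  # the keyword is "columns", not "column"
for i in range(len(lines)):
    for key in keys:
        # .get() leaves the cell as None if a record lacks this key
        df.at[i, key] = lines[i].get(key)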

First, the column headers are taken from the keys of the first entry in the list. Next, an empty dataframe is created with those keys as column names. Finally, the code iterates over the whole list and, for each key, reads the value and writes it into the cell at the matching row index and column.
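If the list already holds one dict per paper, a shorter route is to hand the whole list to the `pd.DataFrame` constructor, which takes the union of the keys as columns and fills missing values with NaN. A minimal sketch, assuming `records` stands in for the parsed list; the first record is abbreviated from the question and the second one is made up to show a missing key:

```python
import pandas as pd

records = [
    {"short": "ISSCC 22'", "year": 2022, "accl_s": 4500},
    {"short": "hypothetical entry", "year": 2021},  # made up; no 'accl_s' key
]

# One row per dict; keys become column names, missing keys become NaN
df = pd.DataFrame(records)
print(df)
```

This avoids the cell-by-cell loop, which gets slow as the number of rows grows, and it does not depend on the first entry containing every key.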

Hope it works! I don't know whether all of your entries have the same keys as this example. If not, any key that is missing from the first entry will be skipped.

Sorry for the chaos; I've edited this a few times. Tomorrow I can write a more coherent version.
