1

I've browsed a few answers but haven't found exactly what I'm looking for yet.

I have a pandas DataFrame with a single column structured as follows (example):

0 alex
1 7 
2 female
3 nora
4 3
5 female 
...
999 fred 
1000 15 
1001 male 

I want to split that single column into three columns holding name, age, and gender, so that it looks something like this:

  name  age  gender
0 alex  7    female
1 nora  3    female
...
333 fred 15  male

Is there a way to do this? I was thinking about using the index, but I'm not sure how to actually do it.

6
  • Where are the values for the three columns coming from? Commented Nov 3, 2022 at 15:41
  • It's an example of the data that results from splitting the original column. Commented Nov 3, 2022 at 15:44
  • I should have been more explicit. Those values are not in the question, so how will I be able to provide them in the answer? Commented Nov 3, 2022 at 15:45
  • There is no alex, nora, or fred in your example input column, so how are we meant to conjure them out of thin air? Commented Nov 3, 2022 at 15:46
  • And do they always follow that same order? Commented Nov 3, 2022 at 15:47

4 Answers

3

Not the most efficient solution perhaps, but you can use pd.concat() and put them all next to each other, if they're always in order:

import pandas as pd

df = pd.DataFrame({'Value':['alex',7,'female','nora',3,'female','fred',15,'male']})
# x = 0, 1, 2 selects rows 0,3,6 / 2,5,8 / 1,4,7, i.e. name, gender, age
df2 = pd.concat([df[(df.index + x) % 3 == 0].reset_index(drop=True) for x in range(3)],axis=1)
df2.columns = ["name", "gender", "age"]

Returns:

    name    gender  age
0   alex    female  7
1   nora    female  3
2   fred    male    15
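
The columns above come out as name, gender, age because of the `(df.index + x) % 3` test. If the name, age, gender order from the question is wanted, one possible variation is to subtract `x` instead and label the pieces through the `keys` argument of `pd.concat()`. This is only a sketch, and it assumes a default 0..n-1 index and the same `Value` column as above:

```python
import pandas as pd

df = pd.DataFrame({"Value": ["alex", 7, "female", "nora", 3, "female", "fred", 15, "male"]})

# (index - x) % 3 == 0 selects rows x, x + 3, x + 6, ...,
# so the pieces come out in the original name, age, gender order.
parts = [
    df.loc[(df.index - x) % 3 == 0, "Value"].reset_index(drop=True)
    for x in range(3)
]

# With Series inputs and axis=1, the keys become the column names.
df2 = pd.concat(parts, axis=1, keys=["name", "age", "gender"])
print(df2)
```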

2

Assuming 0 is your column name:

import numpy as np
import pandas as pd

list_a = list(df[0])
a = np.array(list_a).reshape(-1, 3).tolist()
df2 = pd.DataFrame(a, columns=["name", "age", "gender"])
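
One thing to watch for with this approach: calling `np.array()` on a plain list that mixes strings and integers produces a string array, so the ages end up as text. A possible way around that, sketched here under the assumption that the column is named 0 and holds the sample data from the question, is to reshape the column's own object array and cast the age column afterwards:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(["alex", 7, "female", "nora", 3, "female", "fred", 15, "male"])

# to_numpy() keeps the column's object dtype, so 7 stays an int
# instead of being turned into the string "7".
values = df[0].to_numpy().reshape(-1, 3)
df2 = pd.DataFrame(values, columns=["name", "age", "gender"])

# Columns built from an object array are not converted to numbers
# automatically, so cast age explicitly.
df2["age"] = df2["age"].astype(int)
print(df2.dtypes)
```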

1 Comment

I like this better than what I came up with. Great idea! I tested it and found you don't even need to go through the lists, so you can make this one line: df3 = pd.DataFrame(np.array(df.Value).reshape(-1,3),columns=["name", "age", "gender"])

0

Consider unstack:

import pandas as pd

df = pd.DataFrame(["alex", 7, "female", "nora", 3, "female", "fred", 15, "male"])

people = range(len(df) // 3)
attributes = ["name", "age", "gender"]

multi_index = pd.MultiIndex.from_product([people, attributes])

df.set_index(multi_index).unstack(level=1).droplevel(level=0, axis=1).reindex(columns=attributes)

Output:

   name age  gender
0  alex   7  female
1  nora   3  female
2  fred  15    male
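
Since the question mentions using the index, the same reshaping can also be sketched with plain index arithmetic and `pivot()` instead of a MultiIndex. This is an illustration rather than part of the answer above; it assumes a default 0..n-1 index, exactly three rows per person, and a column named `Value`. The helper column names `person` and `attribute` are made up for the example:

```python
import pandas as pd

df = pd.DataFrame({"Value": ["alex", 7, "female", "nora", 3, "female", "fred", 15, "male"]})
attributes = ["name", "age", "gender"]

tagged = df.assign(
    person=df.index // 3,                             # 0, 0, 0, 1, 1, 1, ...
    attribute=[attributes[i % 3] for i in df.index],  # name, age, gender, ...
)

# pivot() sorts the new columns alphabetically, so reorder them afterwards.
wide = tagged.pivot(index="person", columns="attribute", values="Value")[attributes]
wide.columns.name = None
print(wide)
```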


0

Here is one way to do it:

# step through the DF and get values for name, age and gender as arrays
# the slices start at positions 0, 1 and 2 respectively

name=df['Value'][::3].values
age=df['Value'][1::3].values
gender=df['Value'][2::3].values

# create a DF based on the values
out=pd.DataFrame({'name': name,
                  'age': age,
                  'gender': gender})
out
    name    age  gender
0   alex    7    female
1   nora    3    female
2   fred    15   male
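
The slicing idea above generalizes to any number of fields. Below is a minimal sketch of that, wrapped in a hypothetical helper called `split_every` (the name and the length check are additions for illustration, not part of the answer). It assumes the values always repeat in the same order:

```python
import pandas as pd

def split_every(series, names):
    """Turn a flat Series of repeating fields into one column per field."""
    n = len(names)
    if len(series) % n != 0:
        # A trailing partial record would give columns of unequal length.
        raise ValueError(f"expected a multiple of {n} rows, got {len(series)}")
    # iloc[i::n] takes every n-th value starting at position i.
    return pd.DataFrame(
        {name: series.iloc[i::n].to_numpy() for i, name in enumerate(names)}
    )

df = pd.DataFrame({"Value": ["alex", 7, "female", "nora", 3, "female", "fred", 15, "male"]})
out = split_every(df["Value"], ["name", "age", "gender"])
print(out)
```

Using `iloc` keeps the slicing positional, so it should behave the same even if the DataFrame does not have a default integer index.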
