I have a pandas Series of size 10240. Each value in the Series is a 2D array, and I reduce each one to a 1D array of size 143. After that, I convert the Series to a NumPy array, so I expected to get a 2D array of shape (10240, 143). Instead, I get an array of shape (10240,) and size 10240. I don't know what I am doing wrong. My code is given below.

def get_subjects(x):
  print(type(x)) #2d list
  print(len(x)) # 2, 143
  x = to_categorical(x, num_classes=len(subjects)+1).sum(axis=0)
  print(type(x)) # numpy array
  print(x.size) # 143
  return x

print(type(train_data["subject_id"])) # pandas series
print(train_data["subject_id"].size) # 10240
subject_train = train_data["subject_id"].apply(lambda x: get_subjects(x)).to_numpy()
print(type(subject_train)) # numpy array
print(subject_train.size) # 10240 
  • If your shape is (x,), you don't have a 2D array. Please fix. Commented Apr 15, 2020 at 8:09
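
To illustrate the comment's point, here is a minimal sketch using made-up data (the Series `s` below is a hypothetical stand-in, not the asker's `train_data`). It shows that converting a Series whose cells are arrays gives a 1D object array, not a 2D numeric one:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 4 rows, each cell holding a 1D array of length 3
s = pd.Series([np.zeros(3) for _ in range(4)])
arr = s.to_numpy()

print(arr.dtype)     # object: each element is a reference to a separate array
print(arr.ndim)      # 1
print(arr.shape)     # (4,)
print(arr[0].shape)  # (3,): the inner arrays keep their own shape
```

The outer array only knows it holds 4 Python objects, so its size is 4 regardless of how long the inner arrays are.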

1 Answer

You don't get the expected shape because 'subject_train' is an array of arrays: a 1D array of dtype object whose elements are themselves arrays. To avoid this, you can split the 1D array returned by 'get_subjects' into multiple columns and then convert the result to a NumPy array, as shown below.

import numpy as np
import pandas as pd

# df has 5 rows; each cell holds a 3x4 array
df = pd.DataFrame({'data': [np.random.randint(low=1, high=10, size=(3, 4))
                            for _ in range(5)]})

def get_subjects(x):
  # stand-in for: x = to_categorical(x, num_classes=len(subjects)+1).sum(axis=0)
  x = x.reshape(-1)  # flattens the 3x4 array into a 1D array of length 12
  return x

# apply(pd.Series) splits each row's length-12 array into 12 separate columns
print(df["data"].apply(get_subjects).apply(pd.Series).to_numpy().shape)

results in

(5, 12)
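
As an alternative to 'apply(pd.Series)', the per-row arrays can be stacked directly with `np.stack`. This is only a sketch under the same toy setup as above (5 rows of 3x4 arrays); the names `flat` and `stacked` are made up for illustration, and `reshape(-1)` again stands in for the `to_categorical(...).sum(axis=0)` step from the question.

```python
import numpy as np
import pandas as pd

# Toy data mirroring the answer: 5 rows, each cell a 3x4 integer array
rng = np.random.default_rng(0)
df = pd.DataFrame({"data": [rng.integers(1, 10, size=(3, 4)) for _ in range(5)]})

# Flatten each cell to a 1D array of length 12 (placeholder for the real transform)
flat = df["data"].apply(lambda a: a.reshape(-1))

# np.stack joins the equal-length 1D arrays along a new first axis
stacked = np.stack(flat.tolist())

print(stacked.shape)  # (5, 12)
```

This only works when every row yields an array of the same length; `np.stack` raises a ValueError otherwise. For many rows it may also be faster than 'apply(pd.Series)', since it avoids building one Series per row.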

1 Comment

Let me try this first
