I have a list of numpy arrays and I am trying to merge them into a 2d matrix in the following way:

[arr1, arr2, arr3....] 

arr1 = [0.24, 0.24, 0.56, 0.77]
arr2 = [0.1, 0.24]
arr3 = [0.6, 0.7, 0.72, 0.88]

This is what the output should look like:

 NaN, 0.24, 0.24, 0.56,  NaN,  NaN,  NaN, 0.77,  NaN
 0.1, 0.24,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN
 NaN,  NaN,  NaN,  NaN,  0.6,  0.7, 0.72,  NaN, 0.88

I use the following script to merge them:

import pandas as pd

# convert each array to a Series indexed by its own values
series = [pd.Series(arr, index=arr) for arr in arrs]

# concat column-wise, aligning on the index
pd.concat(series, axis=1)

But I run into the following error:

raise ValueError("cannot reindex from a duplicate axis")

ValueError: cannot reindex from a duplicate axis

Note that the input arrays have duplicates within them and I would like to keep those duplicates.

How do I go about fixing it?

EDIT:

Given the discussion in the comments, the error is most likely caused by the duplicate values, and I am hoping to find a workaround for that.
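A quick way to check that suspicion is to ask each series whether its index labels are unique. This is only a minimal sketch using the sample values from the question; `has_duplicate_labels` is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question
arrs = [
    np.array([0.24, 0.24, 0.56, 0.77]),
    np.array([0.1, 0.24]),
    np.array([0.6, 0.7, 0.72, 0.88]),
]

# Same construction as in the question: each value is its own index label
series = [pd.Series(arr, index=arr) for arr in arrs]

# True wherever a series carries a repeated index label
has_duplicate_labels = [not s.index.is_unique for s in series]
print(has_duplicate_labels)  # [True, False, False]
```

Only the first array repeats a value (0.24), so only its series has a non-unique index.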

  • What happened to 0.88, and what is the logic of placing 0.77 where it is? Commented Jul 9, 2019 at 16:43
  • The code should run fine with this data. You have repeated values within one of the arrays. You need to decide what to do in that case. Commented Jul 9, 2019 at 16:58
  • You should modify your sample data to show the expected output when there are duplicates, e.g. do you want to drop them or keep them? Commented Jul 9, 2019 at 17:30
  • I have modified the sample data and I would like to keep them. Commented Jul 9, 2019 at 17:33

1 Answer

Here's a workaround for repeated data: index each series by both the value and its order of occurrence, so that repeated values get distinct labels:

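import pandas as pd

# arrs is the list of NumPy arrays from the question
new_arrs = []
for a in arrs:
    a = pd.Series(a)
    # number each repeat of a value within this array: 0, 1, 2, ...
    occurrences = a.groupby(a).cumcount()
    # index by (value, occurrence) so repeated values get distinct labels
    a.index = pd.MultiIndex.from_arrays([a, occurrences])
    new_arrs.append(a)

# align the series on the (value, occurrence) index, one column per array
pd.concat(new_arrs, axis=1)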

Output:

           0     1     2
0.10 0   NaN  0.10   NaN
0.24 0  0.24  0.24   NaN
     1  0.24   NaN   NaN
0.56 0  0.56   NaN   NaN
0.60 0   NaN   NaN  0.60
0.70 0   NaN   NaN  0.70
0.72 0   NaN   NaN  0.72
0.77 0  0.77   NaN   NaN
0.88 0   NaN   NaN  0.88
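
The output above has one column per input array, whereas the question asks for one row per array. A possible way to get that layout is to sort the combined index and transpose. The following is a sketch under that assumption, using the question's sample values; `merge_keep_duplicates` and the level names `value` and `occurrence` are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question
arrs = [
    np.array([0.24, 0.24, 0.56, 0.77]),
    np.array([0.1, 0.24]),
    np.array([0.6, 0.7, 0.72, 0.88]),
]


def merge_keep_duplicates(arrays):
    """Hypothetical helper: one row per input array, one column per
    (value, occurrence) pair, NaN where an array lacks that pair."""
    columns = []
    for arr in arrays:
        s = pd.Series(arr)
        # 0 for the first time a value appears in this array, 1 for the second, ...
        occurrence = s.groupby(s).cumcount()
        s.index = pd.MultiIndex.from_arrays(
            [s.to_numpy(), occurrence.to_numpy()],
            names=["value", "occurrence"],
        )
        columns.append(s)
    # Align on the (value, occurrence) index, sort it explicitly, then transpose
    wide = pd.concat(columns, axis=1).sort_index()
    return wide.T


merged = merge_keep_duplicates(arrs)
print(merged)
```

With the sample data this gives a 3 x 9 frame whose rows follow the layout requested in the question; `merged.to_numpy()` returns the plain 2D matrix if the labels are not needed.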