combining Data from multiple excel files

Question

I am trying to combine data from 2 Excel files with each other, but its just not working how i'd like it to.. My Code:

import pandas as pd
import numpy as np
import xlsxwriter
import warnings
    
open_tradein_xlsx = "Z_results.xlsx"
open_keepa_xlsx =  "keepa_data.xlsx"

with warnings.catch_warnings(record=True):
    warnings.simplefilter("always")
    keepa_data = pd.read_excel(open_keepa_xlsx, usecols=['Used: Lowest'])
    tradein_data = pd.read_excel(open_tradein_xlsx, index_col=0,)

dataframe = pd.DataFrame =(tradein_data,keepa_data)
data = pd.concat(dataframe, ignore_index=True)

print(data)

#if dataframe['Used: Lowest'] < dataframe['Rebuy'] or tradein_data['Momox']:
    #print(x)

and the output:

             ISBN       Rebuy       Momox  Used: Lowest
0      Unnamed: 0  Unnamed: 1  Unnamed: 1           NaN
1             NaN         NaN         NaN           NaN
2   9783630876672       12.19         2.6           NaN
3   9783423282789       11.48         2.8           NaN
4   9783833879500       16.92       10.15           NaN
5   9783898798822        7.07        2.28           NaN
6   9783453281417       13.06        7.41           NaN
7             NaN         NaN         NaN          13.5
8             NaN         NaN         NaN          14.0
9             NaN         NaN         NaN          19.9
10            NaN         NaN         NaN           2.0
11            NaN         NaN         NaN          16.4

Process finished with exit code 0

I guess you can see what i am trying to do, the 'Used: Lowest' Data should be in lines 2-6.

I already tried to do data = pd.concat(dataframe, ignore_index=True, axis=1) but then i get following error:pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

the code that creates 'Z_results.xlsx':

import pandas as pd
import numpy as np
import xlsxwriter
import pandas as pd
from pathlib import Path

open_momox_xlsx = ("momox_ergebnisse.xlsx")
momox_data = pd.read_excel(open_momox_xlsx,usecols='B')

open_rebuy_xlsx = ("rebuy_ergebnisse.xlsx")
rebuy_data = pd.read_excel(open_rebuy_xlsx,usecols='B')

open_isbn_xlsx = ("momox_ergebnisse.xlsx")
isbn_data = pd.read_excel(open_rebuy_xlsx,usecols='A')

dataframe = pd.DataFrame =({'ISBN': isbn_data, 'Rebuy': rebuy_data, 'Momox': momox_data})
data = pd.concat(dataframe,axis=1)

data[['Rebuy','Momox']] = data[['Rebuy','Momox']].replace({"///": np.nan, ",": "."}, regex=True).astype(float)
data = data.loc[data[['Rebuy','Momox']].ge(1.).all(axis="columns")]

isbn_output = data['ISBN']

datatoexcel = pd.ExcelWriter("Z_results.xlsx", engine='xlsxwriter')

data.to_excel(datatoexcel)

datatoexcel.save()

np.savetxt("ISBN_output.txt",isbn_output,fmt = "%s")

I thought xlsx would be the best thing to use as storagetype but now i feel like its kind of complicated..

Can you check this line: dataframe = pd.DataFrame =(tradein_data,keepa_data) — Corralien
– Corralien, Commented Jun 3, 2021 at 9:56

williamr21 · Accepted Answer · 2021-06-03 09:58:04Z

1

you don't have to create a new dataframe since you are concatenating it in the next line of code. The below code should work

import pandas as pd
import numpy as np
import xlsxwriter
import warnings

open_tradein_xlsx = "Z_results.xlsx"
open_keepa_xlsx =  "keepa_data.xlsx"

with warnings.catch_warnings(record=True):
    warnings.simplefilter("always")
    keepa_data = pd.read_excel(open_keepa_xlsx, usecols=['Used: Lowest'])
    tradein_data = pd.read_excel(open_tradein_xlsx, index_col=0,)

data = pd.concat([tradein_data, keepa_data], axis=1, ignore_index=True)
print(data)

answered Jun 3, 2021 at 9:58

williamr21

1477 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

lasse3434 Over a year ago

thanks your the help! I already tried the axis=1 in pd.concat, but when i do that i get following error: pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

lasse3434 Over a year ago

the problem could be that the 'Z_results.xlsx' is created by another code using pd.Dataframe and pd.concat, aswell. I will attach the other code to my question too

williamr21 Over a year ago

the above code worked for me, can you put a snapshot of how your input data looks in excel

lasse3434 Over a year ago

and a part of keepa its too long for a screenshot ibb.co/CwWzfKb // unfortunately the Zresults looks really messy , i have tried hours to create a clean excel like the keepa_excel but it just didnt work.

Collectives™ on Stack Overflow

combining Data from multiple excel files

1 Answer 1

4 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

4 Comments

Your Answer

Sign up or log in

Post as a guest

Related