I have two CSV files. File 1 looks like:

Ticker  |    Date     |   Marketcap 
  A     |  2002-03-14 |    600000
  A     |  2002-06-18 |    520000
                   .
                   .
  ABB   |  2004-03-16 |    400000
  ABB   |  2005-07-11 |    800000
                   .
                   .
  AD    |  2004-03-16 |    680000
                   .
                   .

File 2 looks like:

Ticker  |    Date     |     Open    |    Close   |
  A     |  2002-03-14 |    580000   |    500000  |
  ABB   |  2002-03-14 |    500000   |    420000  |
  AD    |  2002-03-16 |    700000   |    670000  |
                          .
                          .
                          .
                          .

The periods indicate that the data continues for many more entries per ticker in both File 1 and File 2. File 1 is grouped by ticker, with every date for one ticker listed before the next ticker begins, whereas File 2 is grouped by date, with every ticker listed for each date.

What I want to do is merge Files 1 and 2 on both "Ticker" and "Date" so that the result looks like:

Ticker  |    Date     |   Marketcap |    Open     |    Close   |
  A     |  2002-03-14 |    600000   |    580000   |    500000  |
  ABB   |  2002-03-14 |    520000   |    500000   |    420000  |
                                 .
                                 .

I've tried merging files using something like:

a = pd.read_csv("File1.csv")
b = pd.read_csv("File2.csv")
merged = a.merge(b, on='Date')

But I don't think this accounts for both Date and Ticker at once.

3 Answers

2

I believe you need to pass on=['Ticker', 'Date'] instead of just on='Date'. You may also need to set the how argument (for example 'inner', 'left', 'right', or 'outer'; the default is 'inner'), depending on which rows you want to keep.
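
To illustrate the difference, here is a minimal sketch. It assumes comma-separated data and inlines a few sample rows from this thread with io.StringIO in place of the real File1.csv and File2.csv, so the snippet runs on its own; the variable names by_date and by_both are only for illustration.

```python
import io

import pandas as pd

# In-memory stand-ins for File1.csv and File2.csv (assumption: comma-separated)
file1 = io.StringIO(
    "Ticker,Date,Marketcap\n"
    "A,2002-03-14,600000\n"
    "A,2002-06-18,520000\n"
    "ABB,2004-03-16,400000\n"
    "ABB,2005-07-11,800000\n"
    "AD,2004-03-16,680000\n"
)
file2 = io.StringIO(
    "Ticker,Date,Open,Close\n"
    "A,2002-03-14,580000,500000\n"
    "ABB,2004-03-16,500000,420000\n"
    "AD,2004-03-16,700000,670000\n"
)
a = pd.read_csv(file1)
b = pd.read_csv(file2)

# Merging on Date alone matches every row of a with every row of b that
# shares the date, regardless of ticker, so ABB rows get paired with AD rows.
by_date = a.merge(b, on="Date")

# Merging on both keys only matches rows with the same ticker AND date.
by_both = a.merge(b, on=["Ticker", "Date"])

print(by_date)
print(by_both)
```

With this sample, the Date-only merge returns five rows and splits the ticker into Ticker_x and Ticker_y columns, while the two-key merge returns the three rows you would expect with a single Ticker column.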

2

You can try the following code:

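import pandas as pd

# sep="\t" assumes the files are tab-separated, as in the samples below
a = pd.read_csv("File1.csv", sep="\t")
b = pd.read_csv("File2.csv", sep="\t")
# An inner merge keeps only rows whose (Ticker, Date) pair appears in both files
merged = pd.merge(a, b, how='inner', on=['Ticker', 'Date'])
print(merged)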

If File1.csv is:

Ticker  Date    Marketcap 
A   2002-03-14  600000
A   2002-06-18  520000
ABB 2004-03-16  400000
ABB 2005-07-11  800000
AD  2004-03-16  680000

And File2.csv is:

Ticker  Date    Open    Close
A   2002-03-14  580000  500000
ABB 2004-03-16  500000  420000
AD  2004-03-16  700000  670000

Then the output of the above code will be:

  Ticker        Date  Marketcap     Open   Close
0      A  2002-03-14      600000  580000  500000
1    ABB  2004-03-16      400000  500000  420000
2     AD  2004-03-16      680000  700000  670000


If you want all rows from File1.csv and only matching rows from File2.csv, you can use this instead:

merged = pd.merge(a, b, how='left', on=['Ticker', 'Date'])

This will produce:

  Ticker        Date  Marketcap       Open     Close
0      A  2002-03-14      600000  580000.0  500000.0
1      A  2002-06-18      520000       NaN       NaN
2    ABB  2004-03-16      400000  500000.0  420000.0
3    ABB  2005-07-11      800000       NaN       NaN
4     AD  2004-03-16      680000  700000.0  670000.0
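
If it helps to see which File1 rows found no partner, one option is the indicator argument of merge, which adds a _merge column. The following is only a sketch: it assumes comma-separated data and inlines the sample rows with io.StringIO instead of reading the real files, and the name unmatched is just for illustration.

```python
import io

import pandas as pd

# In-memory stand-ins for File1.csv and File2.csv (assumption: comma-separated)
a = pd.read_csv(io.StringIO(
    "Ticker,Date,Marketcap\n"
    "A,2002-03-14,600000\n"
    "A,2002-06-18,520000\n"
    "ABB,2004-03-16,400000\n"
    "ABB,2005-07-11,800000\n"
    "AD,2004-03-16,680000\n"
))
b = pd.read_csv(io.StringIO(
    "Ticker,Date,Open,Close\n"
    "A,2002-03-14,580000,500000\n"
    "ABB,2004-03-16,500000,420000\n"
    "AD,2004-03-16,700000,670000\n"
))

# indicator=True adds a "_merge" column: "both" for matched rows,
# "left_only" for rows of a that have no (Ticker, Date) match in b.
merged = a.merge(b, how="left", on=["Ticker", "Date"], indicator=True)

# Rows from File1 whose Open/Close were filled with NaN
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched[["Ticker", "Date", "Marketcap"]])
```

The NaN values are also why Open and Close show up as floats (580000.0) in the left merge: an integer column cannot hold NaN, so pandas converts it to float64.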

0

Try this:

merged = a.merge(b, how='left', on=['Ticker', 'Date'])
