How to choose which columns to write to a .csv file in Python

Question

import csv

f = csv.reader(open('lmt.csv','r'))         # open input file for reading
Date, Open, Hihh, mLow, Close, Volume = zip(*f)    # split it into separate columns

ofile = open("MYFILEnew1.csv", "wb")            # output csv file
c = csv.writer(ofile)

item = Date   
item2 = Volume
rows = zip(item, item)
i = 0
for row in item2:
    print row
    writer = csv.writer(ofile, delimiter='\t')
    writer.writerow([row])

ofile.close()

Above is what I have produced so far.

As you can see in the 3rd line, I have extracted 6 columns from a spreadsheet. I want to create a .csv file under the name of MYFILEnew1.csv which only has two columns, Date and Volume.

What I have above creates a .csv that only writes Volume column into the first column of the new .csv file. How would you go about placing Date into the second column?

For example

Date       Open   High      Low     Close    Volume
17-Feb-16   210   212.97    209.1   212.74  1237731

is what I have, and I'd like to produce a new CSV file that contains only

Date           Volume
17-Feb-16      1237731

Cleb · Accepted Answer · 2016-02-21 12:26:42Z

3

If I understand your question correctly, you can achieve that very easily using pandas' read_csv and to_csv (@downvoter: could you explain your downvote, please?). The final solution to your problem is under EDIT2 below; the basic approach is:

import pandas as pd

# this assumes that your file is comma separated
# if it is e.g. tab separated you should use pd.read_csv('data.csv', sep = '\t')
df = pd.read_csv('data.csv')

# select desired columns
df = df[['Date', 'Volume']]

# write to the file (tab separated)
df.to_csv('MYFILEnew1.csv', sep='\t', index=False)

So, if your data.csv file looks like this:

Date,Open,Hihh,mLow,Close,Volume
1,5,9,13,17,21
2,6,10,14,18,22
3,7,11,15,19,23
4,8,12,16,20,24

Then MYFILEnew1.csv would look like this after running the script above:

Date    Volume
1   21
2   22
3   23
4   24

EDIT

Using your data (tab separated, stored in the file data3.csv):

Date    Open    Hihh    mLow    Close   Volume
17-Feb-16   210 212.97  209.1   212.74  1237731

Then

import pandas as pd

df = pd.read_csv('data3.csv', sep='\t') 

# select desired columns
df = df[['Date', 'Volume']]

# write to the file (tab separated)
df.to_csv('MYFILEnew1.csv', sep='\t', index=False)

gives the desired output

Date    Volume
17-Feb-16   1237731

EDIT2

Since the header in your input CSV file seems to be malformed (as discussed in the comments), you have to rename the first column. The following now works for me with your entire dataset:

import pandas as pd

df = pd.read_csv('lmt.csv', sep=',') 

# get rid of the wrongly formatted column name
df.rename(columns={df.columns[0]: 'Date' }, inplace=True)

# select desired columns
df = df[['Date', 'Volume']]

# write to the file (tab separated)
df.to_csv('MYFILEnew1.csv', sep='\t', index=False)
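
If the column selection still fails with "['Date'] not in index", it can help to look at the raw header before renaming anything. The following is a minimal sketch in Python 3 using only the standard csv module; the sample data, the stray leading space, and the helper name read_header are assumptions made for illustration, since the thread does not show what was actually wrong with the header in lmt.csv.

```python
import csv
import io

# Hypothetical stand-in for lmt.csv. The stray space before "Date" is an
# assumed defect; the real file's header problem is not shown in the thread.
raw = (
    " Date,Open,Hihh,mLow,Close,Volume\n"
    "17-Feb-16,210,212.97,209.1,212.74,1237731\n"
)


def read_header(fileobj):
    """Return the first row of an open CSV file as a list of strings."""
    return next(csv.reader(fileobj))


header = read_header(io.StringIO(raw))

# repr() puts quotes around each name, so hidden characters become visible.
print([repr(name) for name in header])

# Names that differ from their stripped form carry stray whitespace.
suspicious = [name for name in header if name != name.strip()]
print(suspicious)
```

With a real file, the same helper could be called on open('lmt.csv', newline='').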

edited Feb 21, 2016 at 12:26

answered Feb 20, 2016 at 19:39

Cleb


27 Comments

stratofortress Over a year ago

I implemented your script above, but I received the following error message: "['Date'] not in index"

Cleb Over a year ago

Is your input CSV file tab or comma separated? If it is tab separated, you need sep='\t' as an additional option in the read_csv call. But it should work fine with the example CSV I provided. What does df look like after reading in the file?

stratofortress Over a year ago

It's a .csv, so I guess that means it's comma separated? (Sorry, I'm not good with this kind of stuff.)

Cleb Over a year ago

Usually yes (CSV = comma-separated values). You could print df after reading in the file and see what it looks like. I edited my answer to account for the tab-separated option.

stratofortress Over a year ago

"print df" after "df = pd.read_csv('lmt.csv')" prints out all columns nicely, but I think it's "df = df[['Date', 'Volume']]" that fails.


cderwin · Accepted Answer · 2016-02-20 19:28:09Z

2

Here I would suggest using the csv module's csv.DictReader and csv.DictWriter classes to read from and write to the files. To read the file, you would do something like

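import csv

# Passing fieldnames means the first row of the file is treated as data;
# omit the argument if the file already has a header row.
fieldnames = ('Date', 'Open', 'High', 'mLow', 'Close', 'Volume')
with open('myfilename.csv') as f:
    reader = csv.DictReader(f, fieldnames=fieldnames)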

Beyond this, you will just need to filter out the keys you don't want from each row and similarly use the csv.DictWriter class to write to your export file.
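
A minimal sketch of that idea, written for Python 3. It assumes the input file has a header row (so fieldnames is left out and DictReader takes the names from the first row); the function name copy_columns and the in-memory sample data are hypothetical. Here the filtering is delegated to DictWriter's extrasaction='ignore' option, which skips any keys not listed in fieldnames; filtering each row dict by hand would work as well.

```python
import csv
import io


def copy_columns(src, dst, wanted=('Date', 'Volume')):
    """Copy only the wanted columns from src to dst (both open text files)."""
    reader = csv.DictReader(src)  # the header row supplies the field names
    writer = csv.DictWriter(dst, fieldnames=wanted, extrasaction='ignore')
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # keys not listed in `wanted` are dropped


# In-memory stand-ins for the input and output files (sample data only).
src = io.StringIO(
    "Date,Open,Hihh,mLow,Close,Volume\n"
    "17-Feb-16,210,212.97,209.1,212.74,1237731\n"
)
dst = io.StringIO(newline='')
copy_columns(src, dst)
result = dst.getvalue()
print(result)
```

With files on disk, both would be opened with newline='', for example open('MYFILEnew1.csv', 'w', newline='') for the output.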

answered Feb 20, 2016 at 19:28

cderwin


mechanical_meat · Accepted Answer · 2016-02-20 21:15:57Z

2

You were so close:

import csv

f = csv.reader(open('lmt.csv','rb')) # Python 2: the csv module expects binary mode
Date, Open, Hihh, mLow, Close, Volume = zip(*f)

rows = zip(Date, Volume)

ofile = open("MYFILEnew1.csv", "wb")  
writer = csv.writer(ofile)
for row in rows:
    writer.writerow(row) # row is already a tuple so no need to make it a list

ofile.close()
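
The code above is written for Python 2, where the csv module wants files opened in binary mode. In Python 3 the files are opened in text mode with newline='' instead. A sketch of the same zip-based approach under that assumption follows; the function name write_date_volume and the temporary demo files are made up for illustration, and it assumes every row has exactly six columns.

```python
import csv
import os
import tempfile


def write_date_volume(in_path, out_path):
    """Write the Date and Volume columns of in_path to out_path."""
    with open(in_path, newline='') as src:
        # Transpose the rows into six column tuples, as in the answer above.
        Date, Open, Hihh, mLow, Close, Volume = zip(*csv.reader(src))
    with open(out_path, 'w', newline='') as dst:
        # Each (date, volume) pair becomes one two-column output row.
        csv.writer(dst).writerows(zip(Date, Volume))


# Demo with throwaway files holding the sample row from the question.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, 'lmt.csv')
out_path = os.path.join(workdir, 'MYFILEnew1.csv')
with open(in_path, 'w', newline='') as f:
    f.write("Date,Open,Hihh,mLow,Close,Volume\n"
            "17-Feb-16,210,212.97,209.1,212.74,1237731\n")

write_date_volume(in_path, out_path)
with open(out_path, newline='') as f:
    rows = list(csv.reader(f))
print(rows)
```

Passing delimiter='\t' to csv.writer would produce tab-separated output, as in the question's original attempt.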

edited Feb 20, 2016 at 21:15

answered Feb 20, 2016 at 19:41

mechanical_meat


3 Comments

stratofortress Over a year ago

This just puts both columns into a single column of the new file.

mechanical_meat Over a year ago

That likely has something to do with how you're opening the file.

mechanical_meat Over a year ago

I've changed it to be comma-separated.
