I have a rather large string that I want to write to a Python file object. Currently, when I try to write this string, only the LAST row gets written to the file. I've checked that the variable holding the large string is in fact <type 'str'>. Here is the sample content:

"0008788014065251","Rush Running - Bentonville","1030643167","5,788.00","11.55","5.77" 
"0008788014065271","Rush Running - Fayetteville","1030643159","1,577.00","3.16","1.58" 
"0008788014108297","Snow Ball Express","2423373737","11.00","0.04","0.02" 
"0008788014108354","Snow Ball Express","2423378892","1,421.00","5.69","2.84" 
"0008788014108374","Snow Ball Express","2423378959","59.00","0.24","0.12" 
"0008788014110860","Sound Master","2423477231","135.00","0.54","0.27" 
"0008788014074301","The Baby's Room","1030669816","6,912.00","13.82","6.91" 
"0008788014110760","The Reserve","2423470822","715.00","2.86","1.43" 
"0008788014077339","Tool Town LLC","1171354079","438.00","0.88","0.44" 

I want to write this to a file, but every time I call file.write() I get only the last row. I'm using this simple file open and write procedure:

import urllib2

# link is a URL to a CSV file
export = urllib2.urlopen( link )
content = export.read()
with open("somefile.csv", "w") as file:
    try:
        file.write( content )
    except Exception, e:
        raise e

I read that I should be iterating over content with a for loop; but, since content is a string and not a list/tuple, the for loop will explode to each letter and write the letter on a separate row.

Any ideas how to write this type of content to a file?

  • Can you print len(content) just to verify? Can you tell us the result? Commented Oct 23, 2013 at 16:55
  • What do you get if you examine content using repr? Have you checked for carriage returns or other escape characters? Commented Oct 23, 2013 at 16:56
  • What you could do is first split by newline, and then split by comma, in order to get a list structure that is easy to write (so from string to one-level list to two-level list). EDIT: Actually, I think you would only have to split by newline. Commented Oct 23, 2013 at 17:02
  • What operating system are you using? Why do you use urllib2.urlopen() to open a file, when the normal open() function is intended for that? Commented Oct 23, 2013 at 17:04
  • I've just tried to open a CSV file with urllib2.urlopen() and it failed: ValueError: unknown url type: rada.csv. For the moment, your question means nothing. Commented Oct 23, 2013 at 17:18
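The checks suggested in the comments (length, repr, carriage returns) could be sketched roughly as follows. This is a minimal sketch in Python 3 rather than the Python 2 used in the thread; `describe_line_endings` is a hypothetical helper name, and `sample` is made-up data standing in for the downloaded content.

```python
def describe_line_endings(content):
    """Count each kind of line terminator in a text string."""
    crlf = content.count("\r\n")
    # Bare CR and bare LF are the ones that are not part of a CRLF pair.
    cr = content.count("\r") - crlf
    lf = content.count("\n") - crlf
    return {"crlf": crlf, "cr": cr, "lf": lf}


# Hypothetical sample: rows separated by bare carriage returns only.
sample = '"0001","A","1.00"\r"0002","B","2.00"\r"0003","C","3.00"\r'

print(len(sample))                    # total number of characters
print(repr(sample[:40]))              # repr makes \r and \n visible
print(describe_line_endings(sample))  # which terminators are present
```

If only bare `\r` terminators show up, one possible explanation is that a terminal or editor treats each `\r` as "return to the start of the line" and draws every row over the previous one, which can look as if only the last row were present. That is an assumption about the asker's setup, not something the thread confirms.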

3 Answers

You must analyze the data to see whether it has the expected format.
Could you execute this code:

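import re
import urllib2

# link is the CSV URL from the question
export = urllib2.urlopen( link )
content = export.read()

splt = content.splitlines(True) # True keeps the newlines
print 'len of splt : %d' % len(splt)
# number of comma-separated pieces on each line
print [len(line.split(',')) for line in splt]

# does each line start with a quoted run of digits followed by a comma?
print [re.match(r'"\d+",', line) for line in splt]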

UPDATE FROM SADMICROWAVE: Here is the output from the steps you asked me to execute:

len of splt : 48
[6, 8, 6, 6, 6, 6, 6, 6, 7, 6, 6, 7, 6, 7, 7, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 6, 7, 6, 7, 6, 6, 6, 6, 6, 7, 7, 7, 7, 6, 7, 6, 6, 7, 6, 6]
[None, <_sre.SRE_Match object at 0x1f578b8>, <_sre.SRE_Match object at 0x1f57bf8>, <_sre.SRE_Match object at 0x1f57e68>, <_sre.SRE_Match object at 0x1f57ed0>, <_sre.SRE_Match object at 0x1f57f38>, <_sre.SRE_Match object at 0x216e030>, <_sre.SRE_Match object at 0x216e098>, <_sre.SRE_Match object at 0x216e100>, <_sre.SRE_Match object at 0x216e168>, <_sre.SRE_Match object at 0x216e1d0>, <_sre.SRE_Match object at 0x216e238>, <_sre.SRE_Match object at 0x216e2a0>, <_sre.SRE_Match object at 0x216e308>, <_sre.SRE_Match object at 0x216e370>, <_sre.SRE_Match object at 0x216e3d8>, <_sre.SRE_Match object at 0x216e440>, <_sre.SRE_Match object at 0x216e4a8>, <_sre.SRE_Match object at 0x216e510>, <_sre.SRE_Match object at 0x216e578>, <_sre.SRE_Match object at 0x216e5e0>, <_sre.SRE_Match object at 0x216e648>, <_sre.SRE_Match object at 0x216e6b0>, <_sre.SRE_Match object at 0x216e718>, <_sre.SRE_Match object at 0x216e780>, <_sre.SRE_Match object at 0x216e7e8>, <_sre.SRE_Match object at 0x216e850>, <_sre.SRE_Match object at 0x216e8b8>, <_sre.SRE_Match object at 0x216e920>, <_sre.SRE_Match object at 0x216e988>, <_sre.SRE_Match object at 0x216e9f0>, <_sre.SRE_Match object at 0x216ea58>, <_sre.SRE_Match object at 0x216eac0>, <_sre.SRE_Match object at 0x216eb28>, <_sre.SRE_Match object at 0x216eb90>, <_sre.SRE_Match object at 0x216ebf8>, <_sre.SRE_Match object at 0x216ec60>, <_sre.SRE_Match object at 0x216ecc8>, <_sre.SRE_Match object at 0x216ed30>, <_sre.SRE_Match object at 0x216ed98>, <_sre.SRE_Match object at 0x216ee00>, <_sre.SRE_Match object at 0x216ee68>, <_sre.SRE_Match object at 0x216eed0>, <_sre.SRE_Match object at 0x216ef38>, <_sre.SRE_Match object at 0x216f030>, <_sre.SRE_Match object at 0x216f098>, <_sre.SRE_Match object at 0x216f100>, <_sre.SRE_Match object at 0x216f168>]

1 Comment

Well, I wanted to be sure that the content wasn't a web page with plenty of other kinds of data, but the result of my code doesn't bring much insight, except that the file obtained once all the content is correctly written won't have the same number of columns on every line. It is difficult to help without being able to run tests. I wonder what you mean by "only the LAST row gets written to the file". How did you observe that? Did you try to print the content before it is written? Did you compare the written data with the data that was read?
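The uneven counts from `line.split(',')` may simply come from commas inside quoted fields such as "5,788.00", which a plain split does not respect. A minimal sketch of that comparison, written for Python 3 and using two rows copied from the question (the variable names are illustrative only):

```python
import csv
import io

# Two rows taken from the sample content in the question.
rows_text = (
    '"0008788014065251","Rush Running - Bentonville","1030643167","5,788.00","11.55","5.77"\n'
    '"0008788014108297","Snow Ball Express","2423373737","11.00","0.04","0.02"\n'
)

# A plain split also breaks on the comma inside "5,788.00".
naive_counts = [len(line.split(",")) for line in rows_text.splitlines()]

# csv.reader honours the double quotes, so quoted commas stay inside one field.
parsed_rows = list(csv.reader(io.StringIO(rows_text)))
csv_counts = [len(row) for row in parsed_rows]

print(naive_counts)
print(csv_counts)
```

With a quote-aware parser both rows yield the same number of fields, so the differing counts in the output above do not by themselves show that the download is malformed.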

You could try using readlines and writelines instead, but the result should be the same. If the end-of-line encoding differs (Mac/Unix/Windows), though, this might yield the correct result.
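The end-of-line idea can be sketched as follows: split the content on whatever terminators it contains and write every line back with a single "\n". This is a Python 3 sketch under the assumption that mixed or bare-`\r` line endings are the cause; `write_normalized` is a hypothetical helper, and the sample string and temporary path are made up for illustration.

```python
import os
import tempfile


def write_normalized(content, path):
    """Write content to path with every line terminated by a single LF."""
    # str.splitlines() splits on \r, \n and \r\n (among other line boundaries).
    lines = content.splitlines()
    # newline="" stops Python from translating "\n" when writing.
    with open(path, "w", newline="") as f:
        for line in lines:
            f.write(line + "\n")
    return len(lines)


# Hypothetical sample with mixed terminators.
sample = "row1\rrow2\r\nrow3\n"
path = os.path.join(tempfile.mkdtemp(), "somefile.csv")
count = write_normalized(sample, path)

# Read the file back untranslated to inspect what was actually written.
with open(path, newline="") as f:
    written = f.read()
print(count, repr(written))
```

Opening with `newline=""` on both the write and the read is a deliberate choice here, so that the comparison shows the exact characters on disk rather than a platform-translated view.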


Is this what you are looking for?

import urllib2

export = urllib2.urlopen( link )
content = export.read()
content_list = content.split("\n")

with open("somefile.csv", "a") as f:      # note the "a" for (a)ppending
     for line in content_list:
         f.write(line + "\n")

As far as I understand it, the only problem is that you are iterating over the string character by character rather than line by line?
