Python issue outputting multiline string into single csv.writer row

Question

The below code works, but assigns each newline of a multiline string to a new row vs. the desired state of one cell.

While researching a possible solutions I read:

That I should try enclose the string in double quotes
That replacing newline with carriage return helps
Parsing as a list/tuple via [] declaration

Having tried all the above, I'm thoroughly confused as to why this isn't working. If I remove newline it assigns everything to a single cell in the row but then it's hard to read.

Is there no way to preserve format when outputting a string with csv.writer?

def run_rip():
os.chdir('/REGRIPPERDIR/RegRipper2.8')
for k in ntDict:
    run_command = "".join(["./rip.pl", " -r /APATH/NTUSER.DAT -p ", str(k)])
    process = subprocess.Popen(run_command,
                               shell=True,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    out, err = process.communicate()
    pattern = re.compile('lastwrite|(\d{2}:\d{2}:\d{2})|alert|trust|Value')
    grouping = re.compile('(?P<first>.+?)(\n)(?P<second>.+?)([\n]{2})(?P<rest>.+[\n])', re.MULTILINE | re.DOTALL)
    if pattern.findall(out):
        match = re.search(grouping, out)
        first = match.group('first')
        second =match.group('second')
        rest = ('"%s' % os.linesep.join([s for s in match.group('rest').splitlines() if s]))
        rest = rest.replace('\n','\r\n')
        headers = ['Name', 'Description', 'Found']
        f = csv.writer(open('/APATH/ntuser.csv', 'ab'),
                       dialect=csv.excel,
                       delimiter='\t')
        f.writerow(headers)
        f.writerow((first, second, rest))
        print(out)
    ntDict.popitem(last=False)

run_rip()

Sample Output:

/edit: I was requested in the comments below to post a sample of the third string rest as it is collected. The below text is what would be passed to csv.writer.

Baseline\n #First string as defined by the regex
(All) scans a hive\n #Second String as defined by the regex

Total values checked    : 64\n   #This and below is the rest string
Number of binary value lengths : 11\n
...\n
Value -5: 176 bytes [# times: 1]\n
Value -4: 712 bytes [# times: 5]\n

Desired State:

Can I suggest that you post some csv text (i.e. a string) that gives you the desired layout? The underlying problem is that csv isn't really a standard. What you're asking looks to be about whatever format the app that's producing your pasted images accepts. — demented hedgehog
– demented hedgehog, Commented Feb 20, 2016 at 6:04
@dementedhedgehog The app I'm interfacing with is Registry Ripper which outputs non-standard text and creates allot of excessive results that aren't needed. So my approach has been to call the app, read stdout, chuck the data into three variables as assigned above then output into a .csv. However the third variable has many newlines, which unfortunately are being read by csv.writer and output to separate cells as depicted in sample output. Is there a way to preserve newline/carriage return in output to .csv so that I can obtain the desired state? — ImNotLeet
– ImNotLeet, Commented Feb 21, 2016 at 16:35
Are there other .csv modules that would do the above? Am I not being clear in my question, is it lack of bounty? The above issue has been nagging me since I posted it. I've tried multiple hours of solutions and it's just out of my grasp (likely because I'm new to programming). — ImNotLeet
– ImNotLeet, Commented Feb 22, 2016 at 14:43
You can also write your own csv writer. It's pretty trivial (if you need to prepend a space to values that look like this: "2\n3"). I don't have excel so I can't test what formats excel imports successfully (and it's likely to vary by excel version anyway). — demented hedgehog
– demented hedgehog, Commented Feb 22, 2016 at 21:56

demented hedgehog · Accepted Answer · 2016-02-22 21:45:22Z

2

Not an answer... but I want code formatting.

import csv
x = ("1", "2\n3", "4")
f = csv.writer(open('foo', 'w'),
                   dialect=csv.excel,
                   delimiter='\t')
f.writerow(x)

Produces this:

$ less foo
1       "2
3"      4

And that's "valid" tab separated csv.. it's just that excel doesn't handle it "properly". Where things in quotes are really more implementation issues since it's not a standardized format. The extra double quotes around the 23 are kind of annoying.

Check out https://pypi.python.org/pypi/xlwt (pre xml excel formats) or http://xlsxwriter.readthedocs.org/ (xml excel formats) for 3rd party libraries to write excel directly.

edited Feb 22, 2016 at 21:45

answered Feb 22, 2016 at 21:40

demented hedgehog

7,6384 gold badges44 silver badges51 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

tagny · Accepted Answer · 2020-05-14 09:26:59Z

1

You can simply use \n to separate the items in the multiline cell. Just make sure the whole cell content appears between double-quote:

f = open(filename, "w", encoding='utf-8')

f.write('a,b,"c\nd",e\n')

This example writes a row in a csv file where the third cell has 2 lines c and d.

answered May 14, 2020 at 9:26

tagny

92 bronze badges

Comments

Community · Accepted Answer · 2017-05-23 12:01:19Z

0

With the guidance of the comments I found my answer. Simply excel borks up formatting (for reasons covered in the comments). However when opened in LibreOffice the formatting is retained.

The suggested thread in the comments (Importing CSV with line breaks in Excel 2007) has a purposed workaround which includes quoting the actual newlines vs. quoting the entire string, which is what I was doing.

edited May 23, 2017 at 12:01

CommunityBot

11 silver badge

answered Feb 22, 2016 at 22:12

ImNotLeet

3915 silver badges20 bronze badges

Comments

Hussain Bohra · Accepted Answer · 2019-11-21 23:39:19Z

this is how I solved this problem.

Input Data:

('[email protected]', 'firstName', 'lastName', 'Address1', 'Address1
Address2', 'IP1
IP2
IP3', 'PhoneNumber')

Desired CSV Format:

Here is a Python code to get the same:

try:
    cs.execute(sql)
    row = cs.fetchone()
    while row is not None:
        email = row[0]
        filename = '{0}.csv'.format(email)
        with open(filename, "w", encoding='utf-8-sig', newline='') as fp:
            writer = csv.writer(fp)
            writer.writerow(('REQUESTER EMAIL', 'FIRST NAME', 'LAST NAME', 'BILLING ADDRESSES', 'SHIPPING ADDRESSES', 'IP ADDRESSES', 'PHONE NUMBER'))
            writer.writerow((
                row[0],
                row[1],
                row[2],
                row[3],
                row[4],
                row[5],
                row[6],
            ))
        row = cs.fetchone()
finally:
    cs.close()
ctx.close()

See this line (#7):

open(filename, "w", encoding='utf-8-sig', newline='')

setting up a unicode to 'utf-8-sig' is doing all trick.

Thanks, Hussain Bohra

Collectives™ on Stack Overflow

Python issue outputting multiline string into single csv.writer row

4 Answers 4

Comments

Comments

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related