Replacing multiple characters in a string

Question

I have a csv file that looks like this:

Mon-000101,100.27242,9.608597,11.082,10.034,0.39,I,0.39,I,31.1,31.1,,double with 1355,,,,,,,,
Mon-000171,100.2923,9.52286,14.834,14.385,0.45,I,0.45,I,33.7,33.7,,,,,,,,,,
Mon-000174,100.27621,9.563802,11.605,10.134,0.95,I,1.29,I,30.8,30.8,,,,,,,,,,

...it's a few hundred lines long.

I just want to grab the Mon-000101 (not just that specific one, but all the Mon-######) items. I have this really really ugly little script I threw together:

import re

file_list1 = open(raw_input("Enter your list file: "))
file_lines = []
for line in file_list1:
    line.replace(' ','\n')
    for item in line.split('\n'):
        file_lines.append(item)
stringit = ''
for item in file_lines:
    stringit += item

IDs = re.findall('Mon-\d\d\d\d\d\d',stringit)
stringIDs = str(IDs)
new = stringIDs.replace(',','\n')

newer = new.replace('\'','')
newer2 = newer.replace('[\]','')
newer3 = newer2.replace(']','')
newer4 = newer3.replace('[','')
newer5 = newer4.replace(' ','')
file_write = open("Testit.txt","w+")
file_write.write(newer4)
print newer4
file_write.close()

I know it's ugly. Clearly I don't know what I'm doing with the regex stuff, but aside from that I want to know a more efficient way of replacing all the characters that I'm replacing. I know this isn't how it's done. I've tried something along the lines of

newer2 = newer.replace('([\',\[\] ])','')

which I sorta pieced together from various posts. That didn't work though, in fact it didn't do anything.
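
Note on why that call does nothing: str.replace treats its first argument as literal text, not as a pattern, so it searches for the exact characters ([',\[\] ]) and never finds them. A character class only works through the re module. Below is a minimal sketch of that idea, assuming Python 3; the helper name strip_chars and the sample list are made up for illustration and are not part of the original script.

```python
import re

def strip_chars(text, chars="',[] "):
    """Remove every character listed in `chars` from `text` (hypothetical helper)."""
    # re.escape protects characters such as [ and ] that are special inside a class.
    pattern = "[" + re.escape(chars) + "]"
    return re.sub(pattern, "", text)

# str() of a list gives the bracketed, quoted form the chained replace calls were cleaning up.
sample = str(["Mon-000101", "Mon-000171"])
print(strip_chars(sample))
```

This removes all the listed characters in a single pass. It also removes the commas, so if the IDs should end up one per line, joining the list with '\n' is simpler than converting it to a string and cleaning it up afterwards.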

I want to see what a more efficient way of doing this looks like.

Thanks.

I'm also aware that my variable naming is not sufficient/not up to the style guide. This is just something I quickly threw together.

What's supposed to get written to your file? I can't tell what you're trying to do with the multiple replace calls, but I'm almost certain this is trivially replaceable using the csv library.
– Peter DeGlopper, Commented Nov 27, 2013 at 21:17
I just want those Mon-###### IDs. This script works, but it's ridiculous.
– Matt, Commented Nov 27, 2013 at 21:18

Peter DeGlopper · Accepted Answer · 2013-11-27 21:26:23Z

3

Assuming the IDs are always the first part of the line, this is a simple way to do it:

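import csv

# Python 2: the csv module expects the input opened in binary mode ('rb').
# On Python 3, use open('some_list_file.txt', newline='') instead.
with open('some_list_file.txt', 'rb') as list_file:
    reader = csv.reader(list_file)
    with open('Testit.txt', 'w') as output_file:
        # The ID is the first field of every row.
        output_file.writelines(line[0] + '\n' for line in reader)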

If the position varies, it gets just a little more complicated:

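import csv

# Python 2 file modes; on Python 3, open the input with newline='' instead of 'rb'.
with open('some_list_file.txt', 'rb') as list_file:
    reader = csv.reader(list_file)
    with open('Testit.txt', 'w') as output_file:
        for line in reader:
            # Collect every field in this row that starts with the ID prefix.
            IDs = [part for part in line if part.startswith('Mon-')]
            if IDs:
                # Write only the first match; loop over IDs to accept several per row.
                output_file.write(IDs[0] + '\n')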

You can shorten that a little if you're sure there's a Mon- entry in every line:

    # Raises IndexError on any row that has no Mon- field.
    with open('Testit.txt', 'w') as output_file:
        output_file.writelines([part for part in line if part.startswith('Mon-')][0] + '\n' for line in reader)
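
The snippets above are written for Python 2. As a rough sketch of the same idea on Python 3, the version below wraps the row filtering in a generator so it can be tried on in-memory lines. The helper name extract_ids and the two sample rows (copied from the question) are assumptions for illustration only.

```python
import csv

def extract_ids(lines, prefix="Mon-"):
    """Yield the first field starting with `prefix` from each CSV row (hypothetical helper)."""
    # csv.reader accepts any iterable of strings, including an open text file.
    for row in csv.reader(lines):
        matches = [field for field in row if field.startswith(prefix)]
        if matches:  # skip rows that contain no ID at all
            yield matches[0]

sample = [
    "Mon-000101,100.27242,9.608597,11.082,10.034,0.39,I,0.39,I,31.1,31.1,,double with 1355,,,,,,,,",
    "Mon-000171,100.2923,9.52286,14.834,14.385,0.45,I,0.45,I,33.7,33.7,,,,,,,,,,",
]
print("\n".join(extract_ids(sample)))
```

With a real file on Python 3, pass the file object itself, opened with open(path, newline=''), which is the mode the csv documentation recommends for text-mode input.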

edited Nov 27, 2013 at 21:26

answered Nov 27, 2013 at 21:20


2 Comments

Matt Over a year ago

Assume that they aren't though...what then?

Matt Over a year ago

Nice. I'll look this over in finer detail when I have a bit more time. Thank you.

Ωmega · Accepted Answer · 2013-11-27 21:20:09Z

1

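Use the regex pattern ^Mon\-\d{6} with the m (multiline) modifier, which is re.MULTILINE in Python, so that ^ matches at the start of every line rather than only at the start of the string.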
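
A minimal sketch of how that pattern might be applied in Python 3, assuming the whole file has been read into one string. The names ID_PATTERN and find_ids and the shortened sample text are illustrative, not from the original post.

```python
import re

# re.MULTILINE makes ^ match at the start of each line, not only at the start of the text.
ID_PATTERN = re.compile(r"^Mon-\d{6}", re.MULTILINE)

def find_ids(text):
    """Return every Mon-###### ID that begins a line of `text` (hypothetical helper)."""
    return ID_PATTERN.findall(text)

sample = "Mon-000101,100.27242,9.608597\nMon-000171,100.2923,9.52286\n"
ids = find_ids(sample)
# Joining the list directly avoids str(ids) and the chain of replace calls.
print("\n".join(ids))
```

Because findall already returns a list of clean strings, no bracket or quote stripping is needed before writing them out. Note that this only finds IDs at the beginning of a line; drop the ^ anchor if they can appear in other columns.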

answered Nov 27, 2013 at 21:20
