Replacing specific characters in a Python list

Question

I have a text file named university_towns.txt; reading its lines gives a list like this:

     ['Alabama[edit]\n',
        'Auburn (Auburn University)[1]\n',
        'Florence (University of North Alabama)\n',
        'Jacksonville (Jacksonville State University)[2]\n',
        'Livingston (University of West Alabama)[2]\n',
        'Montevallo (University of Montevallo)[2]\n',
        'Troy (Troy University)[2]\n',
        'Tuscaloosa (University of Alabama, Stillman College, Shelton State)[3]      [4]\n',
        'Tuskegee (Tuskegee University)[5]\n']

I want to clean this text file so that everything in parentheses is replaced by ''. The result should look like:

['Alabama',
 'Auburn',
 'Florence',
 'Jacksonville',
 'Livingston',
 'Montevallo',
 'Troy',
 'Tuscaloosa',
 'Tuskegee',
 'Alaska',
 'Fairbanks',
 'Arizona',
 'Flagstaff',
 'Tempe',
 'Tucson']

I am trying to do this as follows:

import pandas as pd
import numpy as np
file = open('university_towns.txt','r')
lines = file.readlines()
for i in range(0,len(lines)):
    lines[i] = lines[i].replace('[edit]','')
    lines[i] = lines[i].replace(r' \(.*\)','')

With this, I am able to remove '[edit]' but I am not able to remove the string in '( )'.

Saurav Agarwal, it looks like your last edit rolled back some good edits from someone else. Please re-apply your own edits, but refresh your screen first so that the prior edits are preserved. I have rolled back. Thanks.
– halfer, Commented Dec 22, 2016 at 18:55

Moinuddin Quadri · Accepted Answer · 2016-12-20 11:13:42Z

1

You can use a regex together with a list comprehension:

import re

# r'\w+' matches the leading word; group(0) returns the matched text
new_list = [re.match(r'\w+', i).group(0) for i in my_list]

where my_list is the original list from the question. The final value held by new_list will be:

['Alabama', 
 'Auburn', 
 'Florence', 
 'Jacksonville', 
 'Livingston', 
 'Montevallo', 
 'Troy', 
 'Tuscaloosa', 
 'Tuskegee']


2 Comments

Saurav Agarwal Over a year ago

This just gives the output: [ ]

mVChr Over a year ago

Note this would fail for a city like Grand Rapids due to the whitespace.
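The point about whitespace can be seen with a small sketch. The 'Grand Rapids' entry below is a made-up line in the same format as the question's data, and the negated character class is only one possible way to keep multi-word names; it assumes every line starts with the town or state name rather than with a bracket.

```python
import re

# Hypothetical sample lines in the same format as the question's data
towns = ['Grand Rapids (Grand Valley State University)[1]\n',
         'Auburn (Auburn University)[1]\n']

# r'\w+' stops at the first non-word character, so the space ends the match
first_word = [re.match(r'\w+', s).group(0) for s in towns]

# Take everything before the first '(' or '[' instead, then trim whitespace
full_name = [re.match(r'[^(\[]+', s).group(0).strip() for s in towns]

print(first_word)  # ['Grand', 'Auburn']
print(full_name)   # ['Grand Rapids', 'Auburn']
```

If a line could begin with '(' or '[', re.match would return None here, so a real script would probably want to guard against that.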

doctorlove · Accepted Answer · 2016-12-20 11:17:51Z

1

The replace method on a string replaces a literal substring; it does not interpret regex patterns. You need to use a regex:

import re
#...
lines[i] = re.sub(r' \(.*\)', '', lines[i])
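To make the difference concrete, here is a minimal sketch using one line from the question. The variable names are just for illustration.

```python
import re

line = 'Auburn (Auburn University)[1]\n'

# str.replace searches for the literal text ' \(.*\)', which never occurs
# in the line, so the string comes back unchanged
unchanged = line.replace(r' \(.*\)', '')

# re.sub treats the same text as a pattern: a space, '(', anything, ')'
cleaned = re.sub(r' \(.*\)', '', line)

print(repr(unchanged))  # 'Auburn (Auburn University)[1]\n'
print(repr(cleaned))    # 'Auburn[1]\n'
```

Note that the footnote marker and the trailing newline are still there after this step; they need their own handling.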


mVChr · Accepted Answer · 2016-12-20 11:13:23Z

0

A simple regex should do the trick.

import re
output = [re.split(r'[\[(]', s)[0].strip() for s in your_list]
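As a rough illustration of why taking element [0] works, this sketch shows the pieces that the split produces for one line from the question, and then the result for a few of the question's lines. The name sample is only a stand-in for your_list.

```python
import re

# Splitting on '[' or '(' leaves the name as the first piece
pieces = re.split(r'[\[(]', 'Auburn (Auburn University)[1]\n')
print(pieces)  # ['Auburn ', 'Auburn University)', '1]\n']

# A few lines copied from the question, standing in for your_list
sample = ['Alabama[edit]\n',
          'Florence (University of North Alabama)\n',
          'Tuscaloosa (University of Alabama, Stillman College, Shelton State)[3]      [4]\n']

output = [re.split(r'[\[(]', s)[0].strip() for s in sample]
print(output)  # ['Alabama', 'Florence', 'Tuscaloosa']
```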


1 Comment

Saurav Agarwal Over a year ago

This is not showing the expected output. It shows [ ], which is an empty list. @mVChr

Oleksandr Dashkov · Accepted Answer · 2016-12-20 11:19:11Z

0

You can use re.sub instead of replace:

import re
# your code here
lines[i] = re.sub(r' \(.*\)','', lines[i])
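Putting it together, a complete script might look something like the sketch below. The helper names clean_line and load_towns are made up for this example, and the second pattern (for [edit], [1] and similar markers) goes beyond the single line shown above, so treat it as one possible approach rather than the only one.

```python
import re

def clean_line(line):
    """Return the town or state name from one raw line of the file."""
    line = re.sub(r'\s*\(.*\)', '', line)   # drop the ' (...)' part
    line = re.sub(r'\[[^\]]*\]', '', line)  # drop [edit], [1], [2], ...
    return line.strip()                     # drop the newline and spaces

def load_towns(path):
    """Read the file at path and return the cleaned lines as a list."""
    with open(path, 'r') as fh:
        return [clean_line(line) for line in fh]

# Checking the helper on two lines from the question
print(clean_line('Alabama[edit]\n'))  # Alabama
print(clean_line('Tuscaloosa (University of Alabama, Stillman College, '
                 'Shelton State)[3]      [4]\n'))  # Tuscaloosa
```

Calling load_towns('university_towns.txt') should then give the cleaned list, assuming the file is in the working directory.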


2 Comments

doctorlove Over a year ago

That wasn't all the code... just pointing out replace takes a substring but you need to use regex. What error are you getting?

Oleksandr Dashkov Over a year ago

@SauravAgarwal, as doctorlove already wrote, that is not the complete code; it only shows how to change your line lines[i] = lines[i].replace(r' \(.*\)','')
