I have a text file named university_towns.txt. Reading it with readlines() gives a list like this:
['Alabama[edit]\n',
'Auburn (Auburn University)[1]\n',
'Florence (University of North Alabama)\n',
'Jacksonville (Jacksonville State University)[2]\n',
'Livingston (University of West Alabama)[2]\n',
'Montevallo (University of Montevallo)[2]\n',
'Troy (Troy University)[2]\n',
'Tuscaloosa (University of Alabama, Stillman College, Shelton State)[3] [4]\n',
'Tuskegee (Tuskegee University)[5]\n']
I want to clean this text so that everything in parentheses (and the '[edit]' marker) is replaced by ''. The result should look like this:
['Alabama',
'Auburn',
'Florence',
'Jacksonville',
'Livingston',
'Montevallo',
'Troy',
'Tuscaloosa',
'Tuskegee',
'Alaska',
'Fairbanks',
'Arizona',
'Flagstaff',
'Tempe',
'Tucson']
I am trying to do this as follows:
import pandas as pd
import numpy as np
file = open('university_towns.txt', 'r')
lines = file.readlines()
for i in range(0, len(lines)):
    # this replacement works
    lines[i] = lines[i].replace('[edit]', '')
    # this replacement has no effect
    lines[i] = lines[i].replace(r' \(.*\)', '')
With this, I am able to remove '[edit]', but the text in parentheses is not removed.
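For reference, str.replace treats its first argument as a literal string, not a regular expression, which would explain why the second call changes nothing. Below is a minimal sketch of one possible approach using re.sub instead. The helper name clean_line and the short sample list are only illustrative, and the sketch assumes that everything from the first '(' to the end of the line can be dropped, as in the desired output above.

```python
import re

def clean_line(line):
    """Hypothetical helper: strip '[edit]', any ' (...)' tail, and the newline."""
    # Literal text, so plain str.replace is enough here.
    line = line.replace('[edit]', '')
    # Regex: drop optional whitespace plus everything from the first '(' onward.
    line = re.sub(r'\s*\(.*', '', line)
    # Remove the trailing newline and any leftover spaces.
    return line.strip()

# A few lines copied from the sample data in the question.
sample = [
    'Alabama[edit]\n',
    'Auburn (Auburn University)[1]\n',
    'Florence (University of North Alabama)\n',
    'Tuscaloosa (University of Alabama, Stillman College, Shelton State)[3] [4]\n',
]

cleaned = [clean_line(line) for line in sample]
print(cleaned)
```

For the real file, the same helper could be applied to each line returned by readlines().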