0

the thing is that i had names in my list that were of the format... ['john a, smith,william tell,jacob oram'] now i wanted to correct the first name (john a smith being one complete name) but due to the format of the raw data it came out like this...now using this method i found all such names using

ab=re.search((r'[A-Z][a-z]+ [A-Z]\,[A-Z][a-z]+'),zz[uy])

where zz[uy] is the location of the string...

now what i want is that i want is that i just want to replace the name in the correct format of john a. smith,william tell,jacob oram in the list...please help out basically all that i want to do is replace the "," after "a" in john a, smith with a "." and place it back in the string

2
  • also names like these are in a list...and can be multiple in a string and also they can be anywhere... Commented Jan 18, 2012 at 11:24
  • Rather than modify the list, why not see how you fill it? Commented Jan 18, 2012 at 11:27

1 Answer 1

1

You haven't defined exactly how to decide which commas to replace, and it's probably going to be very hard to find a rule that always works correctly, given how variable names can be.

So I suggest as a first approximation to replace a comma if and only if it follows a single letter:

>>> import re
>>> a = "John A, Smith, william tell, jacob oram"
>>> re.sub(r"(?<=\b\w),", ".", a)
'John A. Smith, william tell, jacob oram'

Caveat: \w also matches digits and underscores, but it's probably better than [A-Z], otherwise you'd miss out on names with non-ASCII characters)

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.