I have strings like 1k, 300k, 500k_cleaned and replaced, etc.

I want to use the re module to replace k with 000 and delete the rest of the characters.

My code always throws errors:

renamed=re.sub(r"\b[k]\b",'000',df_VSS[i][0])

This is the line of code I have and I would be grateful for any help.

2 Answers

The problem is that _ and digits are word chars, so there is no word boundary between k and _ and between 1 and k.
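
A quick check makes the word-boundary behaviour visible. This is only an illustrative sketch: the sample strings are made up, and the pattern is a simplified form of the one in the question.

```python
import re

# \b only matches between a word char (letter, digit, _) and a non-word char
# (or the start/end of the string).
pattern = re.compile(r"\bk\b")

# Illustrative sample strings, not taken from the asker's data.
samples = ["1k", "500k_cleaned", "1 k"]

# True where the pattern finds a standalone "k", False otherwise.
results = {s: bool(pattern.search(s)) for s in samples}
print(results)
# => {'1k': False, '500k_cleaned': False, '1 k': True}
```

Only the last string matches, because a space is the only neighbour here that is not a word char.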

You can match k in between a digit and a character other than an alphanumeric char:

import re
text = '1k, 300k, 500k_cleaned and replaced'
print( re.sub(r'(?<=\d)k(?![^\W_])', '000', text) )
# => 1000, 300000, 500000_cleaned and replaced

Details:

  • (?<=\d) - a positive lookbehind that requires a digit to appear immediately on the left
  • k - a k letter
  • (?![^\W_]) - a negative lookahead that fails the match if a letter or digit appears immediately on the right (it acts like a trailing \b that treats _ as a non-word char).
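
The question also asks to delete the rest of the characters, which re.sub alone does not do. A minimal sketch of one way to get both, assuming each string starts with the number: the helper name expand_k and the sample values are hypothetical, not from the original post.

```python
import re

# Leading digits, then a "k" that is not followed by a letter or digit.
K_SUFFIX = re.compile(r"(\d+)k(?![^\W_])")

def expand_k(value):
    """Return the leading number with k expanded to 000, dropping the rest.

    Hypothetical helper: strings that do not start with <digits>k
    are returned unchanged.
    """
    match = K_SUFFIX.match(value)  # match() anchors at the start of the string
    if match is None:
        return value
    return match.group(1) + "000"

print([expand_k(s) for s in ["1k", "300k", "500k_cleaned and replaced", "kiwi"]])
# => ['1000', '300000', '500000', 'kiwi']
```

If the number can appear anywhere in the string, K_SUFFIX.search(value) could be used instead of match.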

If you only have to deal with the problem you described, a (maybe) simpler solution could be to use

s = my_str.split("k")[0] # get everything before k
s += "000"
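
One caveat with the split approach: it appends 000 even when the string contains no k at all, and it cuts at the first k wherever it occurs. A slightly more defensive sketch using str.partition is shown here; the function name expand_k_split is hypothetical, and it assumes the part before the first k should consist of digits only.

```python
def expand_k_split(value):
    """Expand a leading '<digits>k' to '<digits>000' without using regex.

    Hypothetical helper: returns the input unchanged when there is no "k"
    or when the text before the first "k" is not purely digits.
    """
    head, sep, _rest = value.partition("k")  # sep is "" when no "k" is found
    if sep and head.isdigit():
        return head + "000"
    return value

print([expand_k_split(s) for s in ["1k", "500k_cleaned and replaced", "300", "kiwi"]])
# => ['1000', '500000', '300', 'kiwi']
```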

Regexes can be tricky, so I would advise you to use them only if no easier solution has been found. Also, if you do use regexes, the website https://regex101.com/ can come in handy.
