
As the title says, I am trying to add a comma at the end of a pattern or substring. I found three solutions that look logical and should do the job, but none of them changes anything. I will show all of the code below. The goal is to find out whether I am missing something, or whether there is something I can add to make it work.

By the way, most comma-related questions on Google are about inserting commas into a number string as digit-group separators, like this: 1,00,000. That is not what I am looking for.

Here is some of the code I tried:

import re
f = open('pizza.txt', 'r')
content = f.read()



for x in content:
  regex = r"\\d{2}/\\d{2}/\\d{4}"
  rep_str = regex+","
  sentence += re.sub(regex, rep_str, x)
   
print(sentence)


content="42/20/2021 every day is a good day 30.25

13/14/2015 today is saturday 24."

Here I tried reading it line by line from a text file. The content variable at the bottom shows what is inside the text file. Those are just test strings. Each string has a date pattern, followed by some text mixed with numbers, then a floating-point number.

Ideally I would like to put a comma right after the date and another one after the text and number mix, just before the floating point number.

To keep it simple I started by adding just the first comma after the date.

The variable regex holds the pattern for the date. The variable rep_str holds what I want the pattern to be replaced with, that is, the pattern followed by a comma. Then re.sub does the job. The output is the strings from the text file with no change. No comma, nothing.

Next code:

content = "13/14/2015 today 001 is saturday 24.34"
m = re.sub(r'(\\d{2}/\\d{2}/\\d{4})(.*)', r'\1 ,\2 ', content)
   
print(m)

Even simpler: no text file, just one string. The code has two patterns, one for the date and one for everything else. I tried to add a comma between them. Same result: no comma, no error, just the same string as output.

Third attempt:

content = "13/14/2015 today 001 is saturday 24.34"
result = re.sub('/(?<=\d\b)(?!,)/', ',', content); 
print(result)

This piece of code is collected from here. The first part of the pattern looks for an alphanumeric token that ends with a digit at a word boundary. The next group confirms that there is not a comma there already. Then the comma is placed. This code apparently solved a problem similar to mine, so I gave it a try. Surprisingly, the result was the same: no error, no change, same string as output.

If you spot anything or can think of working code, please advise.

  • You match a literal \d with r'\\d'; use regex = r"\d{2}/\d{2}/\d{4}" instead. You also read the whole file into a single variable, so do not use a for loop here; just use content = re.sub(regex, rep_str, content). If you need to process the file line by line, use content = f.readlines(). Commented Jan 28, 2022 at 9:14
  • I tried the pattern with a single backslash; that is what my initial code was. I got this error: error: bad escape \d at position 0. So I changed it as the system prompted. I am using Jupyter Notebook. I also tried the same code without the for loop. No luck. Thanks for your comment, by the way. Commented Jan 28, 2022 at 9:27
  • Sorry to ask but is run the code snippet function new? I am seeing it for the first time. Commented Jan 28, 2022 at 9:29
  • You got that error because you did not use the raw string literal. You should either use a raw string literal, and then use a single backslash with regex escapes, or use a normal string literal and use double backslash with regex escapes (i.e. to define a single literal backslash). Commented Jan 28, 2022 at 9:44
  • here is my approach to do so m = re.sub(r'[0-9]{4}',re.findall(r'[0-9]{4}',content)[0]+",", content) Commented Jan 28, 2022 at 10:23
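
To make the backslash discussion in the comments concrete, here is a minimal sketch using the test string from the question. It shows that a raw string with doubled backslashes looks for a literal backslash followed by "d", so it never matches a date. It also shows one likely source of the bad escape \d error, under the assumption that the original code passed the pattern itself (plus a comma) as the replacement, as rep_str = regex + "," does. The variable names below are only for illustration.

```python
import re

text = "13/14/2015 today 001 is saturday 24.34"

# A raw string with single backslashes and a normal string with doubled
# backslashes contain exactly the same characters.
assert r"\d{2}" == "\\d{2}"

single = r"\d{2}/\d{2}/\d{4}"      # regex \d: any digit
doubled = r"\\d{2}/\\d{2}/\\d{4}"  # regex \\ then d: a literal backslash, then "d"

match_single = re.search(single, text)
match_doubled = re.search(doubled, text)
print(match_single.group())  # the date is found
print(match_doubled)         # no match, so re.sub would change nothing

# Reusing the pattern text as the replacement: \d is not a valid escape
# in a replacement template, so re.sub raises re.error.
try:
    re.sub(single, single + ",", text)
    error_message = None
except re.error as exc:
    error_message = str(exc)
print(error_message)

# \g<0> in the replacement stands for the whole matched text.
fixed = re.sub(single, r"\g<0>,", text)
print(fixed)
```

With the doubled backslashes the pattern is valid but matches nothing in the text, which would explain why the output was unchanged and no error appeared.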

1 Answer

You need to use

import re

with open('pizza.txt', 'r') as f:
    for line in f:
        print(re.sub(r'\d{2}/\d{2}/\d{4}', r'\g<0>,', line))

See the Python demo:

import re

content="""42/20/2021 every day is a good day 30.25
13/14/2015 today is saturday 24."""

for line in content.splitlines(False):
    print(re.sub(r'\d{2}/\d{2}/\d{4}', r'\g<0>,', line))

Output:

42/20/2021, every day is a good day 30.25
13/14/2015, today is saturday 24.

Details:

  • with open('pizza.txt', 'r') as f: - opens the pizza.txt file for reading
  • for line in f: - reads the f file line by line
  • print(re.sub(r'\d{2}/\d{2}/\d{4}', r'\g<0>,', line)) - prints the result of the regex substitution: r'\d{2}/\d{2}/\d{4}' (mind the single backslashes in the raw string literal) finds all occurrences of two digits, /, two digits, / and four digits, and replaces each with the same matched value (the \g<0> backreference refers to the whole match) followed by a comma.
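
The question also asks for a second comma just before the trailing floating-point number. A minimal sketch of one way to do that in the same substitution follows. It assumes every line starts with the date and ends with the number, and that the comma should go directly after the text with a space before the number. The add_commas helper and its pattern are hypothetical, not part of the code above.

```python
import re

# Group 1: the date at the start of the line.
# Group 2: the text in the middle (lazy, so it stops before the last number).
# Group 3: the trailing number, with an optional fractional part.
LINE_PATTERN = re.compile(r'^(\d{2}/\d{2}/\d{4})(.*?)\s+(\d+(?:\.\d*)?)$')


def add_commas(line):
    """Insert a comma after the date and another before the trailing number.

    Lines that do not fit the assumed layout are returned unchanged.
    """
    return LINE_PATTERN.sub(r'\1,\2, \3', line)


content = """42/20/2021 every day is a good day 30.25
13/14/2015 today is saturday 24."""

for line in content.splitlines():
    print(add_commas(line))
```

For the two test lines this should print 42/20/2021, every day is a good day, 30.25 and 13/14/2015, today is saturday, 24. If a line can contain more than one date, or no trailing number, the pattern would need to be adjusted.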