Put a comma after a pattern in Python regex

Question

As the title says, I am trying to add a comma at the end of a pattern (substring). I found three solutions that look logical and should do the job, but none of them changes anything. I will show all three attempts below. The goal is to find out whether I am missing something or can add something to make them work.

By the way, most comma-related questions on Google are about inserting thousands separators into a number string, like 1,00,000. That is not what I am looking for.

Here is the code I tried:

import re
f = open('pizza.txt', 'r')
content = f.read()



for x in content:
  regex = r"\\d{2}/\\d{2}/\\d{4}"
  rep_str = regex+","
  sentence += re.sub(regex, rep_str, x)
   
print(sentence)


content="42/20/2021 every day is a good day 30.25

13/14/2015 today is saturday 24."

Here I tried reading the text file line by line. The content variable at the bottom shows what is inside the text file. Those are just test strings. Each string has a date pattern, followed by some text mixed with numbers, then a floating point number.

Ideally I would like to put a comma right after the date and another one after the text and number mix, just before the floating point number.

To keep it simple I started by adding just the first comma after the date.

The variable regex holds the pattern for the date. The variable rep_str contains what I want the pattern to be replaced with, that is, the pattern followed by a comma. Then re.sub should do the job. The output is the strings from the text file with no change. No comma, nothing.

Next code:

content = "13/14/2015 today 001 is saturday 24.34"
m = re.sub(r'(\\d{2}/\\d{2}/\\d{4})(.*)', r'\1 ,\2 ', content)
   
print(m)

Even simpler: no text file, just one string. The code has two groups, one for the date and one for everything else, and I tried to add a comma between them. Same result: no comma, no error, just the same string as output.

Third attempt:

content = "13/14/2015 today 001 is saturday 24.34"
result = re.sub('/(?<=\d\b)(?!,)/', ',', content); 
print(result)

This piece of code is collected from here. The first part looks for an alphanumeric token that ends with a digit at a word boundary. The next group confirms that there is not a comma already. Then it places the comma. This code apparently solved a problem similar to mine, so I gave it a try. Surprisingly, the result was the same: no error, no change, same string as output.

If you spot anything or can think of working code, please advise.

You match literal \d with r'\\d', use regex = r"\d{2}/\d{2}/\d{4}". You also read the file into a single variable, so do not use for loop here, use just content = re.sub(regex, rep_str, content). If you need to process line by line use content = f.readlines()
– Wiktor Stribiżew, Commented Jan 28, 2022 at 9:14
I tried the pattern with single slash, that's what my initial code was. Got this error :error: bad escape \d at position 0. So I changed as the system prompted. I am using jupyter notebook. Also tried the same code without for loop. No luck. Thanks for your comment by the way.
– user18004387, Commented Jan 28, 2022 at 9:27
Sorry to ask but is run the code snippet function new? I am seeing it for the first time.
– Abhyuday Vaish, Commented Jan 28, 2022 at 9:29
You got that error because you did not use the raw string literal. You should either use a raw string literal, and then use a single backslash with regex escapes, or use a normal string literal and use double backslash with regex escapes (i.e. to define a single literal backslash).
– Wiktor Stribiżew, Commented Jan 28, 2022 at 9:44
here is my approach to do so m = re.sub(r'[0-9]{4}',re.findall(r'[0-9]{4}',content)[0]+",", content)
– naif_d, Commented Jan 28, 2022 at 10:23

Wiktor Stribiżew · Accepted Answer · 2022-01-28 09:41:12Z

You need to use

import re

with open('pizza.txt', 'r') as f:
    for line in f:
        print(re.sub(r'\d{2}/\d{2}/\d{4}', r'\g<0>,', line))

See the Python demo:

import re

content="""42/20/2021 every day is a good day 30.25
13/14/2015 today is saturday 24."""

for line in content.splitlines(False):
    print(re.sub(r'\d{2}/\d{2}/\d{4}', r'\g<0>,', line))

Output:

42/20/2021, every day is a good day 30.25
13/14/2015, today is saturday 24.
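
For contrast, here is a minimal sketch of why the attempts from the question leave the string unchanged or raise an error. The variable names (doubled, fixed, third, error_message) are only illustrative, and the third pattern is written as a raw string here, which is an assumption made to avoid escape warnings:

```python
import re

line = "13/14/2015 today 001 is saturday 24.34"

# In a raw string, r"\\d" is the regex \\d: a literal backslash followed by
# the letter d. The sample text has no backslashes, so nothing matches.
doubled = re.sub(r"\\d{2}/\\d{2}/\\d{4}", r"\g<0>,", line)
print(doubled == line)  # True

# With single backslashes, \d means "a digit" and the date is found.
pattern = r"\d{2}/\d{2}/\d{4}"
fixed = re.sub(pattern, r"\g<0>,", line)
print(fixed)  # 13/14/2015, today 001 is saturday 24.34

# Reusing the pattern text as the replacement fails: \d is not a valid
# escape in a replacement template (an error on Python 3.7 and later).
error_message = None
try:
    re.sub(pattern, pattern + ",", line)
except re.error as exc:
    error_message = str(exc)
print("re.error:", error_message)

# The /.../ delimiters of the third attempt come from other languages.
# In Python they are literal slashes, so this pattern never matches here.
third = re.sub(r"/(?<=\d\b)(?!,)/", ",", line)
print(third == line)  # True
```

The third case suggests that the "bad escape \d" error mentioned in the comments may come from the replacement string rather than from the pattern, which is one reason to use \g<0> instead of repeating the pattern.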

Details:

with open('pizza.txt', 'r') as f: - opens the pizza.txt file for reading
for line in f: - reads the f file line by line
print(re.sub(r'\d{2}/\d{2}/\d{4}', r'\g<0>,', line)) - prints the result of the regex substitution: r'\d{2}/\d{2}/\d{4}' (mind the single backslashes in the raw string literal) finds all occurrences of two digits, /, two digits, / and four digits, and replaces each with the matched value followed by a comma (the \g<0> backreference refers to the whole match).
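
The question also asks for a second comma just before the trailing number. One possible sketch follows, assuming every line has the shape date, free text, then a final number. The helper name add_commas and the three-group pattern are assumptions for illustration and are not part of the answer above:

```python
import re

# Assumed line shape: a date, some text, then a final number such as 30.25 or 24.
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4})\s+(.*?)\s+(\d+(?:\.\d*)?)$")

def add_commas(line: str) -> str:
    """Insert a comma after the date and before the trailing number."""
    # Lines that do not fit the assumed shape are returned unchanged.
    return LINE_RE.sub(r"\1, \2, \3", line)

content = """42/20/2021 every day is a good day 30.25
13/14/2015 today is saturday 24."""

for line in content.splitlines():
    print(add_commas(line))
```

With the sample text this should print "42/20/2021, every day is a good day, 30.25" and "13/14/2015, today is saturday, 24.". The lazy middle group (.*?) lets numbers inside the text, such as 001, stay with the text, because only the number at the very end of the line is captured by the last group.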
