Newest 'python-re' Questions

2 votes

4 answers

330 views

regex for matching a pattern with an optional part

I need a regex pattern to find substrings of the form "a:<some integer>" and an optional "b:<some float>" in a large string. The "a" string may be preceded ...

user1479670

1,355

asked Apr 24 at 10:55

3 votes

3 answers

132 views

Python regular expression for text search

I am trying to extract wanted text from a given set of text. I have created below function. def extract_name(title): matches = re.findall(r'\b[A-Z0-9\s&.,()-]+(?:\s*$\d$)?\b', title) ...

Totura

167

asked Apr 4 at 2:30

0 votes

1 answer

103 views

Predicting `re` regexp memory consumption

I have a large (gigabyte) file where an S-expression appears, and I want to skip to the end of the S-expression. The depth of the S-expression is limited to 2, so I tried using a Python regexp (b'\\((?...

Erik Carstensen

910

asked Mar 28 at 20:05

4 votes

2 answers

189 views

Using re.sub and replace with overall match [duplicate]

I was just writing a program where I wanted to insert a newline after a specific pattern. The idea was to match the pattern and replace with the overall match (i.e. capture group \0) and \n. s = "...

DuesserBaest

3,215

asked Feb 18 at 12:55
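
A minimal sketch of the idea described in this excerpt, assuming the goal is to append a newline to every match. The input string and the `\d+` pattern are hypothetical, since the question's own string is truncated. In Python's `re`, the whole match is written `\g<0>` in a replacement template; `\0` is not a group reference.

```python
import re

# Hypothetical input; the question's own string is cut off in the excerpt.
s = "abc123def456"

# \g<0> in the replacement stands for the entire match, so each run of
# digits is replaced by itself followed by a newline.
result = re.sub(r"\d+", r"\g<0>\n", s)
```

A callable replacement such as `lambda m: m.group(0) + "\n"` would be an equivalent way to express the same thing.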

0 votes

1 answer

45 views

Python re.sub: backreference in replacement pattern followed by digit [duplicate]

I would like to match a regular expression in a string and add the character 0 after all occurrences. That is, each match will be replaced with itself followed by 0. But because 0 is a digit, I don'...

Ed Avis

1,622

asked Jan 24 at 16:05
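
A short sketch of the ambiguity this excerpt describes, using a hypothetical pattern and input. Writing `\1` followed by a literal `0` as `\10` would be read as a reference to group 10; the `\g<1>` form delimits the group number explicitly.

```python
import re

# Hypothetical input and pattern, chosen only to illustrate the syntax.
text = "ab12 cd34"

# r"\10" would be parsed as "group 10"; \g<1> ends the group number
# explicitly, so the following 0 is a literal character.
result = re.sub(r"(\d+)", r"\g<1>0", text)
```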

0 votes

1 answer

71 views

Match characters between square brackets but only if text inside brackets follows pattern [duplicate]

I want to match text inside of square brackets - but ONLY if it contains hashtag+digit+digit i.e [#18] or [hello #25 bye] NOT [25] (no hashtag) I ultimately want to remove these match strings (...

lolo

79

asked Jan 23 at 2:11
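
One possible pattern for the requirement stated in this excerpt, assuming brackets are not nested and that "hashtag+digit+digit" means a `#` followed by at least two digits somewhere inside the brackets. The sample text is hypothetical.

```python
import re

# Match a bracket pair whose contents include '#' followed by two digits.
# [^\[\]]* keeps the match inside a single pair of brackets.
pattern = re.compile(r"\[[^\[\]]*#\d\d[^\[\]]*\]")

# Hypothetical sample built from the forms named in the excerpt.
text = "keep [25] drop [#18] and [hello #25 bye]"

# Remove only the bracketed parts that satisfy the pattern.
cleaned = pattern.sub("", text)
```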

1 vote

2 answers

55 views

Counting the hashtags in a collection of tweets: two methods with inconsistent results

I'm playing around with a numpy dataframe containing two columns: 'tweet_text' and 'cyberbullying_type'. It was created through this dataset as follows: df = pd.read_csv('data/cyberbullying_tweets.csv'...

Sam

494

asked Jan 9 at 0:17

1 vote

1 answer

184 views

Static typing of Python regular expression: 'incompatible type "str"; expected "AnyStr | Pattern[AnyStr]" '

Just to be clear, this question has nothing to do with the regular expression itself and my code is perfectly running even though it is not passing mypy strict verification. Let's start from the basic,...

toto

367

asked Dec 24, 2024 at 9:10

0 votes

1 answer

44 views

re.findall with requests doesn't match copied and pasted html (generated by requests.text)

I'm trying to capture some elements from the html code of a certain url. When I copy and paste the contents of the html directly to into my python code it works well. import re # Sample HTML content ...

Addoodi

13

asked Dec 23, 2024 at 15:31

-1 votes

1 answer

49 views

How to create a regex with an optional group without merging it with another group? [duplicate]

I'm trying to write a regex pattern in Python to capture two groups, where the second group is optional, but I want the groups to remain distinct. Here are examples of the possible pattern I want ...

user1142252

29

asked Dec 10, 2024 at 17:29

1 vote

2 answers

78 views

re.sub eats next character when replacing with more [duplicate]

So I was trying to format my text for markdown v2. Basically I just want to replace a special character a with \a. When trying to do this with regex, it does so, but the new symbol eats up the next ...

George

27

asked Dec 9, 2024 at 4:17

-3 votes

2 answers

82 views

Extracting string from an email body

I'm using python to extract the information provided from the body of an email using imap. The part of the email that interests my code: "BOT ID: 4824CF8B-2986-11EC-80F0-84A93851B964" I can ...

HC DARK BOT

3

asked Dec 2, 2024 at 13:31
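
A sketch of one way to pull the identifier out of the line quoted in this excerpt, assuming the ID always follows the literal text `BOT ID:` and has the 8-4-4-4-12 hexadecimal layout shown. The variable names are illustrative.

```python
import re

# The line quoted in the excerpt, standing in for the full email body.
body = "BOT ID: 4824CF8B-2986-11EC-80F0-84A93851B964"

# Capture 8-4-4-4-12 hexadecimal characters after the "BOT ID:" label.
pattern = r"BOT ID:\s*([0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12})"

match = re.search(pattern, body)
bot_id = match.group(1) if match else None
```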

0 votes

1 answer

63 views

Regex get parse inet value

I want to parse ifconfig to get ip_address, net mask and broadcast, and these are optional fields. If a field is present, it should be returned, but if not, None should be returned. My below pattern works fine but if '...

premganesh

85

asked Nov 17, 2024 at 6:38

1 vote

1 answer

57 views

Issue with Toggling Sign of the Last Entered Number in Calculator Using ⁺∕₋ in Python

I am developing a calculator using Python. The problem I'm facing is that when I try to toggle the sign of the last number entered by the user using the ⁺∕₋ button, all similar numbers in the text get ...

Araz_devp

13

asked Nov 16, 2024 at 17:00

-1 votes

2 answers

77 views

RegEx: Python (findall). Order of elements in OR statement resulting in different output

I am trying to get my head around regular expressions and was playing with some examples trying to see what comes out. I am struggling to understand how the order of elements in OR (|) impacts ...

Martin S.

13

asked Nov 10, 2024 at 19:25

8 votes

2 answers

228 views

How to ignore case but not diacritics with Python regex?

I'm working with a set of regex patterns that I have to match in a target text. My problematic regex is something like this: (İg)[[:punct:][:space:]]+[[:alnum:]]+ Initially, I noticed that Python’s re ...

Paolo Magnani

709

asked Nov 8, 2024 at 10:01

1 vote

1 answer

135 views

Facing irregular format while extracting data from pdf invoice to transfer in excel file

I have irregular-format pdf invoice files with multiple pages. I want an excel file in return with data extracted from the pdf files. For this I wrote code with the plumberpdf library in python but I am able ...

Hannan Wali

11

asked Nov 7, 2024 at 15:04

0 votes

0 answers

33 views

How to make a non-greedy regex when in multiline mode [duplicate]

I have a text file (latin-1-encoded) with this content: 1 lorem ipsum 1 ... 1 OCTOBER 24, 2024 11/27/13 lorem ipsum 2 ... 1 ...

ostpoller

119

asked Oct 28, 2024 at 12:54

0 votes

0 answers

57 views

Regex pattern sanitization for wildcard replacement

I need a function to sanitize regex patterns in Python, specifically targeting strings that may contain wildcard characters (%). The goal is to replace these % wildcards with the regex equivalent .* ...

Abinash Biswal

3

asked Oct 16, 2024 at 22:27

2 votes

1 answer

106 views

Using re to match a digit + any contiguous duplicates and storing the duplicates, not just the digit as the result [duplicate]

I'm trying to use re.findall(pattern, string) to match all numbers and however many duplicates follow in a string. Eg. "1222344" matches "1", "222", "3", "...

Ethan

41

asked Oct 10, 2024 at 0:49
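
A small sketch for the behaviour described in this excerpt. With a capturing group in the pattern, `re.findall` returns only the group, so the example below uses `re.finditer` and reads the whole match instead. The input is the string quoted in the excerpt.

```python
import re

s = "1222344"

# (\d) captures one digit; \1* then consumes any further copies of it.
pattern = r"(\d)\1*"

# findall returns just the captured group (one digit per run) ...
digits_only = re.findall(pattern, s)

# ... while group(0) of each match is the full run of repeated digits.
runs = [m.group(0) for m in re.finditer(pattern, s)]
```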

-1 votes

2 answers

186 views

Is there a limit to the size of a string in Python's re.search? [duplicate]

I am extracting data from an API call and am using this code: if response.status_code == 200: ReportResponse = re.search('<return>(.+?)</return>', response.text) print(...

Steve

19

asked Oct 8, 2024 at 4:02

0 votes

1 answer

80 views

Regular expression for searching only natural numbers

It is necessary to write a regular expression to search for natural numbers in the text. Numbers can be inside words and any special characters. The main condition for the search is a sequence of ...

CollonelDain

31

asked Oct 7, 2024 at 19:49

0 votes

1 answer

49 views

python re identifiers not working with lookahead and lookbehind

I have the following string str = '2024-09-23 18:05:08,147 INFO [WatchDog_191084] (alloc:0MB, cpu:0%) 10 422' and I am trying to extract the numbers between the square brackets. so I am ...

Eliseo Di Folco

171

asked Oct 1, 2024 at 14:54

2 votes

1 answer

49 views

Grabbing a specific url from a webpage with re and requests [duplicate]

import requests, re r = requests.get('example.com') p = re.compile('\d') print(p.match(str(r.text))) This always prints None, even though r.text definitely contains numbers, but print(p.match('...

Red Dwarf

728

asked Sep 20, 2024 at 22:15
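
A sketch of the likely cause of the `None` result in this excerpt: `Pattern.match` only tries to match at the start of the string, whereas `Pattern.search` scans the whole string. The HTML string is a hypothetical stand-in for `r.text`.

```python
import re

# Hypothetical stand-in for the downloaded page text.
html = "<p>Order 42</p>"

p = re.compile(r"\d")

# match() anchors at position 0, where the text starts with '<'.
at_start = p.match(html)

# search() looks for the first match anywhere in the string.
anywhere = p.search(html)
first_digit = anywhere.group(0) if anywhere else None
```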

1 vote

1 answer

77 views

Python re.sub () replace content but replacement contains special characters

I'm working on auto replacing contents in a file. re.search() successfully gets the new_content, but it contains special characters and when I want to use re.sub() it shows : error: invalid ...

Gabriel Za

13

asked Sep 13, 2024 at 14:21
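
A sketch of one way around the situation in this excerpt, assuming the error comes from backslashes in the replacement being interpreted as escapes (the full error message is truncated). Passing a function as the replacement makes `re.sub` insert its return value literally. The `PATH=` pattern and sample strings are hypothetical; `new_content` is the name used in the excerpt.

```python
import re

# Hypothetical replacement text containing backslashes.
new_content = r"C:\new\path"

# Hypothetical file contents.
text = "PATH=old\nOTHER=1"

# A callable replacement is used as-is: no backslash or group-reference
# processing is applied to the string it returns.
result = re.sub(r"PATH=.*", lambda m: "PATH=" + new_content, text)
```

Escaping the backslashes first, for example with `new_content.replace("\\", r"\\")`, is another common option when a plain string replacement is preferred.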

0 votes

2 answers

85 views

Unexpected behaviour of the regex "{m, n}?$"

Consider the following example >>> import sys, re >>> sys.version '3.11.5 (main, Sep 11 2023, 13:23:44) [GCC 11.2.0]' >>> re.__version__ '2.2.1' >>> re.findall('a{1,...

rasul

1,129

asked Sep 8, 2024 at 5:16

1 vote

1 answer

92 views

Reformat complex file output from an old fortran program to csv using python

I want to convert complex file output into a simpler version, but I can't seem to get the regex right. I have tried using regex and pandas to convert this weird formatted code to something nicer but ...

Pad

911

asked Sep 6, 2024 at 14:34

-1 votes

1 answer

74 views

Text is split depending on the order of specific delimiter [duplicate]

The code is supposed to split the string without removing the delimiters. import re operations = '8-8/84' operations = re.split(r'([+,*,/,-])', operations) Executing the code, operations ends up with ...

eye egg

21

asked Sep 4, 2024 at 18:54
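
A sketch of why delimiter order can matter in a character class, which may be what this excerpt's title refers to. Inside `[...]` a hyphen between two characters forms a range, and commas are literal characters rather than separators. The first pattern below is a hypothetical reordering, not the one from the question.

```python
import re

operations = "8-8/84"

# Hypothetical ordering: ",-," is a range from ',' to ',', so '-' itself
# is not in the class and the string is only split on '/'.
range_variant = re.split(r"([+,-,*,/])", operations)

# With the hyphen last (or escaped) it is a literal; no commas needed.
literal_variant = re.split(r"([+*/-])", operations)
```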

-3 votes

1 answer

75 views

Summarize double for loop into list comprehension

I've been trying to translate these two for loops into list comprehension: with open(sourceFile, 'r+t') as file: for line in file: for key, value in patterns.items(): ...

ludovico

95

asked Sep 4, 2024 at 8:55

0 votes

1 answer

87 views

Why do certain regex functions return a match object and a few don't? [closed]

In Python common regex functions, re.match, re.search, re.fullmatch, etc. return a match object and to print the result we have to use match.group(): re.search(pattern, string): Searches for the first ...

NBS

49

asked Sep 4, 2024 at 7:06

0 votes

0 answers

39 views

How can I use python regex to find as many matches as possible, leaving out those that are concatenations? [duplicate]

I have this string "~/goofy.git$ /home/maria/L1-07-51.mdl /home/maria/L1-08-09.res" I want to find every occurrence of a string that starts /home and ends in either res or mdl. And: I want ...

Mikke Mus

155

asked Aug 26, 2024 at 10:50

-1 votes

2 answers

86 views

How to use regex to extract a set of particular substrings?

I want to extract all possible substrings which have all the vowels from a string. For example in the code: import re text = "thisisabeautifulsequencofwords" pattern = r"(?=.*a)(?=.*e)(?...

Arjo

25

asked Aug 25, 2024 at 12:38

1 vote

1 answer

95 views

How do I set the time zone format to abbreviated on Windows 10?

I've written a python script that uses strftime() from the time module. On my windows 10 computer I get the long form format for time zone when I call strftime("%Z"), and I want to ...

HeKaiNani

9

asked Aug 23, 2024 at 18:48

-1 votes

1 answer

40 views

how to find higher case followed by lower case or just higher case

I am trying to match either a higher-case letter followed by a lower-case letter or just a higher-case letter. Many questions were answered about how to get higher-case or lower-case letters, but I ...

Jabed A. Mohammed

1

asked Aug 22, 2024 at 15:13

0 votes

0 answers

54 views

Find text inside top-level brackets when they're nested [duplicate]

I have a file with nested brackets. I need to parse the text within the top-level brackets with Python regex. import re string = '{a {b} c} {d}' # desired output: ['a {b} c', 'd'] # non-greedy ...

zest16

679

asked Aug 22, 2024 at 12:32
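
The standard `re` module has no recursive patterns, so the sketch below swaps in a plain depth counter instead of a regex. It assumes braces are the only delimiters and ignores unbalanced closing braces; the function name `top_level_groups` is illustrative.

```python
def top_level_groups(s):
    """Return the text inside each top-level {...} pair."""
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(s):
        if ch == "{":
            if depth == 0:
                start = i + 1  # content begins after the opening brace
            depth += 1
        elif ch == "}":
            if depth == 0:
                continue  # unbalanced closing brace: skip it
            depth -= 1
            if depth == 0:
                groups.append(s[start:i])
    return groups


# The string and desired output quoted in the excerpt.
string = "{a {b} c} {d}"
result = top_level_groups(string)
```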

1 vote

1 answer

55 views

Using Regex to find instances of a header, then editing the lines below with some specifications

So I have this excerpt of the .msg file below. What I wish to do is for all the [sel xxx xxx] headers find them then read the lines below them. If any of the answers contain a (+3) or any (+x) then ...

Medusa

21

asked Aug 18, 2024 at 17:43

-2 votes

3 answers

73 views

"not" workaround when using Regular Expression in Python [duplicate]

What I want to do is validate user inputs. The criterion is only numeric inputs are allowed, no alpha, no characters like .,/?<> etc. Say a user inputs 1989, it will print true But if the user ...

cutelittlebunny

55

asked Aug 9, 2024 at 3:37
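
A sketch of the validation described in this excerpt, assuming "numeric" means one or more ASCII digits and nothing else. `re.fullmatch` requires the whole string to match, so no negation is needed. The function name `is_numeric` is illustrative.

```python
import re


def is_numeric(user_input):
    """Return True only if the input consists entirely of ASCII digits."""
    # [0-9] is used instead of \d because \d also matches non-ASCII
    # digits when applied to str patterns.
    return re.fullmatch(r"[0-9]+", user_input) is not None
```

For example, `is_numeric("1989")` is true, while inputs containing letters or punctuation are rejected.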

1 vote

1 answer

72 views

Replace characters before a number to a new character after the number python

I have some strings look like: *.rem.1.gz and *.rem.2.gz And I want to replace it into *.1.trim.gz and *.2.trim.gz The number 1 and number two files are paired with each other, which I want to create ...

Pluto Liu

27

asked Jul 29, 2024 at 3:53
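
A sketch of the renaming described in this excerpt, assuming the file names end in `.rem.<number>.gz`. The function name and the sample file name are hypothetical.

```python
import re


def to_trim_name(filename):
    """Turn '<prefix>.rem.<n>.gz' into '<prefix>.<n>.trim.gz'."""
    # The number is captured as group 1 and re-inserted before '.trim'.
    return re.sub(r"\.rem\.(\d+)\.gz$", r".\1.trim.gz", filename)


# Hypothetical pair of file names.
pair = [to_trim_name("sample.rem.1.gz"), to_trim_name("sample.rem.2.gz")]
```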

5 votes

0 answers

124 views

Why does re._compile exist?

Here is re.compile: >>> import re, inspect >>> print(inspect.getsource(re.compile)) def compile(pattern, flags=0): "Compile a regular expression pattern, returning a Pattern ...

wim

368k

asked Jul 26, 2024 at 18:14

0 votes

1 answer

44 views

Is there any situation where re.search could not be used instead of re.match? [duplicate]

The documentation seems clear but it begs the question, what is the purpose of re.match? Couldn't re.search with the caret (^) be used instead as long as the MULTILINE flag is not enabled? Is re.match ...

Kevin Eldurson

190

asked Jul 26, 2024 at 17:46

1 vote

2 answers

166 views

How to find all occurrences of a substring in a string while ignoring some characters in Python?

I'd like to find all occurrences of a substring while ignoring some characters. How can I do it in Python? Example: long_string = 'this is a t`es"t. Does the test work?' small_string = "test...

Franck Dernoncourt

84.7k

asked Jul 25, 2024 at 23:12

1 vote

1 answer

91 views

How to extract the volume from a string using a regular expression?

I need to extract the volume with regular expression from strings like "Candy BAR 350G" (volume = 350G), "Gin Barrister 0.9ml" (volume = 0.9ml), "BAXTER DRY Gin 40% 0.5 ml...

Veronica Isakova

13

asked Jul 25, 2024 at 12:26

2 votes

1 answer

72 views

How can I simplify this method to replace punctuation while keeping special words intact?

I am making a modulatory function that will take keywords with special characters (@&\*%) and keep them intact while all other punctuation is deleted from a sentence. I have devised a solution, ...

linkey apiacess

297

asked Jul 24, 2024 at 1:26

2 votes

1 answer

71 views

How do I fix this Reg ex so that it matches hyphenated words where the final segment ends in a consonant other than the letter m

I want to match all cases where a hyphenated string (which could be made up of one or multiple hyphenated segments) ends in a consonant that is not the letter m. In other words, it needs to match ...

Paige Cox

55

asked Jul 22, 2024 at 17:35

-3 votes

1 answer

24 views

Replacing part of string with re.sub with number and string [closed]

I want to replace part of a string based on re.sub method as below import re re.sub("([0-9]_F)$", '[0-9]_DO', 'sdsd3_F') However I fail to manage the numerical part of match which is also a ...

Bogaso

3,896

asked Jul 20, 2024 at 4:21
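
A sketch of how the digit can be carried into the replacement, assuming the intended output for the excerpt's input is `sdsd3_DO`. A character class such as `[0-9]` has no meaning in a replacement string; the matched digit is captured and referenced with `\1` instead.

```python
import re

# Capture only the digit; '_F' at the end of the string is matched but
# not captured, and is replaced by '_DO'.
result = re.sub(r"([0-9])_F$", r"\1_DO", "sdsd3_F")
```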

0 votes

1 answer

69 views

How can a number range and value be extracted from this complicated string using Python?

I have a complicated string that includes a kilometer range and a fee for users that fall into that range. Ideally, I would like to transform the string into something that I could use to easily ...

Feiznia

15

asked Jul 10, 2024 at 6:45

-2 votes

1 answer

40 views

regular expression to find pattern in the same word [duplicate]

There is a string "123:987 767687:99 145:986 156:876 ". My regex is (\d{3}):\1. I expect the result to be 123:987, 145:986, 156:876, but no result is found. I don't understand. ...

Анатолій

1

asked Jun 30, 2024 at 16:34
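
A sketch of why the pattern in this excerpt finds nothing: `\1` matches the same text that group 1 captured, not the same sub-pattern, so `(\d{3}):\1` only matches strings like `123:123`. Repeating `\d{3}` with word boundaries gives the result listed in the excerpt.

```python
import re

s = "123:987 767687:99 145:986 156:876 "

# \1 requires the exact captured digits to appear again after the colon.
with_backreference = re.findall(r"(\d{3}):\1", s)

# Repeat the sub-pattern instead; \b keeps 767687:99 from matching.
pairs = re.findall(r"\b\d{3}:\d{3}\b", s)
```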

0 votes

1 answer

54 views

Match a pattern with multiple entries in arbitrary order in Python with re [duplicate]

I try to catch values entered in syntax like this one name="Game Title" authors="John Doe" studios="Studio A,Studio B" licence=ABC123 url=https://example.com command=...

fauve

321

asked Jun 28, 2024 at 22:25

4 votes

1 answer

155 views

Why is `re.Pattern` generic?

import re x = re.compile(r"hello") In the above code, x is determined to have type re.Pattern[str]. But why is re.Pattern generic, and then specialized to string? What does a re.Pattern[...

bzm3r

4,664

asked Jun 26, 2024 at 0:11
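
A short illustration related to this excerpt, assuming Python 3.9 or later so that `re.Pattern[...]` can be subscripted at runtime. Patterns can be compiled from either `str` or `bytes`, and the type parameter records which kind of input the pattern accepts. The variable names are illustrative.

```python
import re

# A pattern compiled from str searches str; one compiled from bytes
# searches bytes. The type parameter distinguishes the two.
text_pattern: re.Pattern[str] = re.compile(r"hello")
bytes_pattern: re.Pattern[bytes] = re.compile(rb"hello")

text_hit = text_pattern.search("say hello")
bytes_hit = bytes_pattern.search(b"say hello")

# Mixing the two kinds is rejected at runtime with a TypeError.
try:
    text_pattern.search(b"say hello")
    mixed_ok = True
except TypeError:
    mixed_ok = False
```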

0 votes

1 answer

107 views

How does regex filtering work in Python re while logging sensitive info?

I am trying to write a python script which would redact/hide certain data present in a string before logging it out to the console. Below is my code snippet so far. import re from logging import DEBUG,...

John Bosman

1

asked Jun 24, 2024 at 13:38
