0

I would like to select some text in a file with Python and replace it with only the part of that text that comes before a certain delimiter.

import re

with open('searchfile.txt', 'r') as f:
    content = f.read()
    content_new = re.sub(r'^\S*', r'(.*?\/)', content, flags=re.M)
with open('searchfile.txt', 'w') as f:
    f.write(content_new)

searchfile.txt contains the below text:

abc/def/efg 212 234 asjakj
hij/klm/mno 213 121 ashasj

My aim is to select everything from the start of each line up to the first space, and then replace it with the text up to the first occurrence of a forward slash /

Example:

^\S* selects everything until the first space in my file which is "abc/def/efg".

I would like to replace this text with only "abc" on the first line and "hij" on the second.

My regexp (.*?\/) does not work for me here.

6
  • The second argument to re.sub() isn't a regular expression, it's the replacement text. Only the first argument is a regexp. Commented Oct 6, 2022 at 21:03
  • Just use content_new = content.split()[0].split('/')[0], why regex? Commented Oct 6, 2022 at 21:03
  • content_new = content.split()[0].split('/')[0] replaces all the lines in the file with the first match, i.e. abc, even if we had multiple lines. Commented Oct 6, 2022 at 21:08
  • See my answer with two solutions for the text you provided. You say "searchfile.txt contains the below text: abc/def/efg 212 234 asjakj" - it means there is only one line. Commented Oct 6, 2022 at 21:08
  • This is correct; you need to update the wording of the question if you are asking for something different. In future, please choose your wording more carefully so that the meaning or intention is unambiguous. Commented Oct 6, 2022 at 21:11

5 Answers

2

You can split the content on whitespace, take the first item, split that on / and take the first item again:

content_new = content.split()[0].split('/')[0]

See the Python demo.

If you plan to use a regex, you may use

match = re.search(r'^[^\s/]+', content, flags = re.M)
if match:
    content_new = match.group()

See the Python demo. Details:

  • ^ - start of a line (due to re.M)
  • [^\s/]+ - one or more chars other than whitespace and /.
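For a file with several lines, the same character class can be combined with a capturing group so that re.sub rewrites the first token of every line. This is only a minimal sketch: the sample data is inlined in a string instead of being read from searchfile.txt, and it assumes each line starts with the path token.

```python
import re

# Sample data from the question, inlined here as an assumption
content = "abc/def/efg 212 234 asjakj\nhij/klm/mno 213 121 ashasj\n"

# Group 1 captures the text before the first slash or whitespace;
# \S* then consumes the rest of the first token so it gets dropped.
content_new = re.sub(r"^([^\s/]+)\S*", r"\1", content, flags=re.M)

print(content_new, end="")
```

Lines that start with whitespace or with a slash do not match and are left unchanged.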

2

Try this:

>>> s = 'abc/def/efg 212 234 asjakj'
>>> p = s.split(' ', maxsplit=1)
>>> p
['abc/def/efg', '212 234 asjakj']
>>> p[0] = p[0].split('/', maxsplit=1)[0]
>>> p
['abc', '212 234 asjakj']
>>> s = ' '.join(p)
>>> s
'abc 212 234 asjakj'

One-liner solution:

>>> s.replace(s[:s.index(' ')], s[:s.index('/')], 1)
'abc 212 234 asjakj'
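Note that s.index() raises ValueError when the space or the slash is missing. A variant based on str.partition avoids that; the helper name first_segment_only below is hypothetical and the code is just a sketch of the same idea:

```python
def first_segment_only(line):
    """Keep only the part of the first token that precedes the first slash."""
    # partition() returns the whole string as `head` when no space is present
    head, sep, rest = line.partition(" ")
    # The same holds for the slash: without one, the token is kept as it is
    return head.partition("/")[0] + sep + rest


print(first_segment_only("abc/def/efg 212 234 asjakj"))
print(first_segment_only("abc 212"))
```

Calling it once per line, for example in a list comprehension over content.splitlines(), handles multi-line input.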


1

Maybe this can help:

import re

s = "abc/def/efg 212 234 asjakj"
pattern = r"^(.*?\/)"
replace = "xyz/"
op = re.sub(pattern, replace, s)
print (op)
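As in the snippet above, the second argument of re.sub is replacement text, not a pattern. The sketch below contrasts a pattern-looking replacement string, which is copied into the result as it is, with a callable replacement that computes the new text from each match. The variable names are illustrative only.

```python
import re

s = "abc/def/efg 212 234 asjakj"

# A replacement that looks like a regex is inserted verbatim
literal = re.sub(r"^\S*", "(.*?/)", s)

# A callable receives the match object and returns the replacement text
computed = re.sub(r"^\S*", lambda m: m.group().split("/")[0], s)

print(literal)
print(computed)
```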


1

Rephrased expected behavior

  1. Given a string that has this pattern: <path><space>.
  2. If the first part of given string (<path>) has at least one slash / surrounded by words.
  3. Then return the string before the slash.
  4. Else return empty string.

Here, a path is a sequence of words delimited by slashes, for example abc/de, but not any of these:

  • abc
  • /de
  • abc/file.txt
  • abc/

Solution

Matching lines

You could also check that the line matches the pattern first, and only then extract the first path element before the slash.

import re

line = "abc/def/efg 212 234 asjakj"

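extracted = ''  # default when the line does not start with a path
# One word followed by one or more "/word" parts, then a space
if re.match(r'^\w+(/\w+)+ ', line):
    extracted = line.split('/')[0]  # even simpler than Wiktor's split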

print(extracted)

Extraction

The extraction can be done in two ways:

(1) Just take the first path element, as in Wiktor's answer.

first_path_element = "abc/def/efg 212 234 asjakj".split('/')[0]
print(first_path_element)

(2) Some may find a regex shorter and more expressive:

import re

first_path_element = re.findall(r'^(\w+)/', "abc/def/efg 212 234 asjakj")[0]
print(first_path_element)
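Indexing the result with [0] raises IndexError when the line contains no slash. With re.M the same pattern can also be applied to multi-line content in one call, returning one entry per matching line. A small sketch, with the sample data inlined as an assumption:

```python
import re

# Sample data from the question, inlined here as an assumption
content = "abc/def/efg 212 234 asjakj\nhij/klm/mno 213 121 ashasj\n"

# With re.M, ^ anchors at the start of every line, not only of the string
first_elements = re.findall(r"^(\w+)/", content, flags=re.M)

print(first_elements)
```

Lines without a slash simply contribute nothing to the list.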


0

Here is a solution that reads the file, replaces the first token of each line with its first path element, and writes the result back to the same file.

file_name = "/home/searchfile.txt"

# Read every line of the file into memory
with open(file_name) as file:
    lines = file.readlines()

result_data = []
for line in lines:
    line = line.strip()
    space_split = line.split(" ")
    # Keep only the part of the first token before the first slash
    prefix = space_split[0].split("/")[0]
    result = prefix + " " + " ".join(space_split[1:])
    result_data.append(result)

# Overwrite the same file with the rewritten lines
with open(file_name, "w") as file:
    file.write("\n".join(result_data))
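An alternative for rewriting a file in place is the standard library's fileinput module with inplace=True, which redirects print() into the file being processed. The helper name keep_first_segment is hypothetical, and the demo runs on a temporary file rather than on /home/searchfile.txt; treat it as a sketch, not a drop-in replacement.

```python
import fileinput
import os
import tempfile


def keep_first_segment(path):
    """Rewrite `path` in place, trimming the first token of each line."""
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            head, sep, rest = line.rstrip("\n").partition(" ")
            # While inplace=True is active, print() writes into the file
            print(head.split("/")[0] + sep + rest)


# Demo on a temporary file holding the sample data from the question
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("abc/def/efg 212 234 asjakj\nhij/klm/mno 213 121 ashasj\n")

keep_first_segment(tmp.name)

with open(tmp.name) as f:
    rewritten = f.read()
os.remove(tmp.name)

print(rewritten, end="")
```

Because every line is re-emitted with print(), a final line that had no trailing newline gains one.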
