How to replace dash between characters with space using regex

Question

I want to replace dashes that appear between letters with a space using regex, for example turning ab-cd into ab cd.

The following matches the letter-dash-letter sequence, but it also replaces the letters (i.e. ab-cd becomes a d rather than the ab cd I want):

 new_term = re.sub(r"[A-z]\-[A-z]", " ", original_term)

How do I adapt the above to replace only the - part?

Can you do this by simply replacing - with a space in the given string? Is using regex necessary?
– Jeff B, Commented Oct 16, 2015 at 22:11
@JeffBridgman Yes, I only want to replace the dash when it occurs between characters, not when it sits between spaces, i.e. replace ab-cd but leave ab - cd unchanged (replace doesn't give that control).
– kyrenia, Commented Oct 16, 2015 at 22:13

Pedro Lobito · Accepted Answer · 2015-10-13 22:06:48Z

You need to capture the characters before and after the - to a group and use them for replacement, i.e.:

import re
subject = "ab-cd"
# Capture the letter on each side of the dash and write both back around a space
subject = re.sub(r"([a-z])\-([a-z])", r"\1 \2", subject, flags=re.IGNORECASE)
print(subject)
# ab cd

DEMO

http://ideone.com/LAYQWT

REGEX EXPLANATION

([a-z])\-([a-z]) with re.IGNORECASE

Match the regex below and capture its match into backreference number 1 «([a-z])»
   Match a single character in the range between “a” and “z” (either case, because of re.IGNORECASE) «[a-z]»
Match the character “-” literally «\-»
Match the regex below and capture its match into backreference number 2 «([a-z])»
   Match a single character in the range between “a” and “z” (either case, because of re.IGNORECASE) «[a-z]»

\1 \2

Insert the text that was last matched by capturing group number 1 «\1»
Insert the character “ ” literally « »
Insert the text that was last matched by capturing group number 2 «\2»
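To see what the groups hold for ab-cd, here is a small sketch. It assumes the lowercase class with re.IGNORECASE from the code above, and the variable names are only illustrative. It also shows why the question's original pattern produced a d: the whole match is b-c, and that is what gets replaced.

```python
import re

pattern = re.compile(r"([a-z])-([a-z])", re.IGNORECASE)
match = pattern.search("ab-cd")

whole = match.group(0)    # the full matched text, which re.sub replaces
before = match.group(1)   # the letter captured before the dash
after = match.group(2)    # the letter captured after the dash
rebuilt = match.expand(r"\1 \2")  # what the replacement template produces

print(whole, before, after, rebuilt)
```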

TigerhawkT3 · Accepted Answer · 2015-10-13 22:00:01Z

5

Use references to capturing groups:

>>> original_term = 'ab-cd'
>>> re.sub(r"([A-z])\-([A-z])", r"\1 \2", original_term)
'ab cd'

This assumes, of course, that you can't just do original_term.replace('-', ' ') for whatever reason. Perhaps your text uses hyphens where it should use en dashes or something.
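As a rough illustration of where the two approaches differ, the sketch below runs both on a made-up string containing the ab - cd case from the question comments. The sample text and variable names are assumptions for the example.

```python
import re

text = "ab-cd and ab - cd"

# str.replace changes every hyphen, including the one surrounded by spaces
plain = text.replace("-", " ")

# The regex only touches a hyphen that has a letter on each side
targeted = re.sub(r"([A-Za-z])-([A-Za-z])", r"\1 \2", text)

print(repr(plain))
print(repr(targeted))
```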

1 Comment

Federico Piazza Over a year ago

You shouldn't use [A-z], since regex ranges follow ASCII code order. This specific range matches A-Z[\]^_`a-z. However, you can use (?i) if you want a-z to be case insensitive, for instance (?i)([a-z])\-([a-z]). Anyway, I know that's the OP's original regex... just saying.
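A short sketch of the point in this comment, checking the six ASCII characters that sit between Z and a. The test string a_-_b is a made-up example.

```python
import re

# The characters between 'Z' and 'a' in ASCII: [ \ ] ^ _ `
extras = "[\\]^_`"

matched_by_wide_range = [ch for ch in extras if re.fullmatch(r"[A-z]", ch)]
matched_by_letters_only = [ch for ch in extras if re.fullmatch(r"(?i)[a-z]", ch)]

# With [A-z] an underscore counts as a "letter", so this dash gets replaced
loose = re.sub(r"([A-z])-([A-z])", r"\1 \2", "a_-_b")
# With (?i)[a-z] the dash is left alone because its neighbours are underscores
strict = re.sub(r"(?i)([a-z])-([a-z])", r"\1 \2", "a_-_b")

print(matched_by_wide_range, matched_by_letters_only, loose, strict)
```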

cg909 · Accepted Answer · 2015-10-13 22:02:51Z

1

re.sub() always replaces the whole matched sequence with the replacement.

One way to replace only the dash is to use lookahead and lookbehind assertions. They check the surrounding characters without including them in the matched sequence.

new_term = re.sub(r"(?<=[A-z])\-(?=[A-z])", " ", original_term)

The syntax is explained in the Python documentation for the re module.
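One practical consequence of the assertions not consuming characters shows up with chained dashes. The sketch below compares the two styles on a made-up input, a-b-c, using [A-Za-z] instead of [A-z] to keep non-letters out of the picture.

```python
import re

text = "a-b-c"

# Capturing groups consume the letters, so 'b' cannot also start the next match
with_groups = re.sub(r"([A-Za-z])-([A-Za-z])", r"\1 \2", text)

# Lookarounds only inspect the neighbours; just the dash itself is consumed
with_lookarounds = re.sub(r"(?<=[A-Za-z])-(?=[A-Za-z])", " ", text)

print(with_groups)
print(with_lookarounds)
```

With single-letter segments like these, the group-only version leaves the second dash in place, while the lookaround version replaces both.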

Wiktor Stribiżew · Accepted Answer · 2019-11-04 09:57:48Z

1

You need to use look-arounds:

 new_term = re.sub(r"(?<=[A-Za-z])-(?=[A-Za-z])", " ", original_term)

Or capturing groups:

 new_term = re.sub(r"([A-Za-z])-(?=[A-Za-z])", r"\1 ", original_term)

See IDEONE demo

Note that [A-z] also matches some non-letters (namely [, \, ], ^, _, and `), so I suggest replacing it with [A-Z] and using the case-insensitive modifier (?i).

Note that you do not have to escape a hyphen outside a character class.
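A minimal sketch of both notes, assuming a mixed-case sample string: the escaped and unescaped hyphen behave the same outside a character class, and the inline (?i) modifier is equivalent to passing flags=re.IGNORECASE.

```python
import re

text = "Ab-Cd"

# Hyphen escaped, inline case-insensitive modifier
escaped = re.sub(r"(?i)(?<=[a-z])\-(?=[a-z])", " ", text)

# Hyphen unescaped: outside a character class it is an ordinary literal
unescaped = re.sub(r"(?i)(?<=[a-z])-(?=[a-z])", " ", text)

# Same pattern, with the flag passed as an argument instead of (?i)
with_flag_arg = re.sub(r"(?<=[a-z])-(?=[a-z])", " ", text, flags=re.IGNORECASE)

print(escaped, unescaped, with_flag_arg)
```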

edited Nov 4, 2019 at 9:57

answered Oct 13, 2015 at 22:02

TylerH · Accepted Answer · 2023-03-23 14:23:24Z

-1

I think there's a simple way to replace dashes using Visual Basic, in a multiline textbox:

Regex.Replace(ReadText.Text, "[-]", " ")

edited Mar 23, 2023 at 14:23 by TylerH

answered Oct 15, 2021 at 12:52 by Cynthia Fridsma

1 Comment

TylerH Over a year ago

This may be accurate, but it's not Python, which is what this question is about.
