
So I'm pretty good with regular expressions, but I'm having some trouble with them on unix. Here are two things I'd love to know how to do:

1) Replace all text except letters, numbers, and underscore

In PHP I'd do this: (works great)

preg_replace('#[^a-zA-Z0-9_]#','',$text).

In bash I tried this (with limited success); it seems like it doesn't allow you to use the full set of regex:

text="my #1 example!"
${text/[^a-zA-Z0-9_]/''}

I tried it with sed but it still seems to have problems with the full regex set:

echo "my #1 example!" | sed s/[^a-zA-Z0-9\_]//

I'm sure there is a way to do it with grep, too, but it was breaking the output into multiple lines when I tried:

echo abc\!\@\#\$\%\^\&\*\(222 | grep -Eos '[a-zA-Z0-9\_]+'

And finally I also tried using expr but it seemed like that had really limited support for extended regex...


2) Capture (multiple) parts of text

In PHP I could just do something like this:

preg_match('#(word1).*(word2)#',$text,$matches);

I'm not sure how that would be possible in *nix...


3 Answers

14

Part 1

You are almost there with sed: just add the g modifier so that the replacement happens globally. Without the g, only the first match is replaced.

$ echo "my #1 example!" | sed 's/[^a-zA-Z0-9_]//g'
my1example
$

You made the same mistake with your bash pattern replacement: the replacement was not global:

$ text="my #1 example!"

# non-global replacement. Only the first space is deleted.
$ echo ${text/[^a-zA-Z0-9_]/''}
my#1 example!

# global replacement by adding an additional / 
$ echo ${text//[^a-zA-Z0-9_]/''}
my1example
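
The global form can be wrapped in a small function, as in the sketch below. The name strip_nonword is made up for illustration, and the sketch assumes bash, since the // expansion is not POSIX sh. Note that the pattern here is a shell glob bracket expression rather than a full regex, which is why [:alnum:] can stand in for the explicit ranges.

```shell
#!/usr/bin/env bash
# strip_nonword is a hypothetical helper name, not part of the answer above.
strip_nonword() {
  local input=$1
  # // replaces every match; [^[:alnum:]_] is a glob bracket expression,
  # so each character that is not a letter, digit, or underscore is removed.
  printf '%s\n' "${input//[^[:alnum:]_]/}"
}

result=$(strip_nonword 'my #1 example!')
echo "$result"
```

Because no external process is started for the replacement itself, this tends to be a reasonable choice inside loops.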

Part 2

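Capturing works the same way in sed as in PHP's regex: enclosing part of the pattern in parentheses captures it, and \1, \2, and so on refer back to the captures. The -r flag (GNU sed; -E is the more portable spelling) enables extended regex so the parentheses need no backslashes: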

# swap foo and bar's number using capturing and back reference.
$ echo 'foo1 bar2' | sed -r 's/foo([0-9]+) bar([0-9]+)/foo\2 bar\1/'
foo2 bar1
$ 
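
To get the captured parts into shell variables, closer to what preg_match does with $matches, one option is to have sed print only the captures and then split the result. This is a rough sketch; the variable names are illustrative, and it assumes the captures themselves contain no spaces.

```shell
text='foo1 bar2'

# -n suppresses default output and the trailing p prints only when the
# substitution matched; -E enables extended regex.
captures=$(printf '%s\n' "$text" |
  sed -nE 's/.*foo([0-9]+) bar([0-9]+).*/\1 \2/p')

# Split "1 2" on the single space that the replacement inserted.
first=${captures%% *}
second=${captures#* }

echo "first=$first second=$second"
```

If the pattern does not match, sed prints nothing and both variables end up empty, so checking for an empty value doubles as a match test.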

1

As an alternative to codaddict's nice answer using sed, you could also use tr for the first part of your question.

echo "my #1 _ example!" | tr -d -C '[:alnum:]_'

I've also made use of the [:alnum:] character class, just to show another option.
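
For a tr that may not support character classes, explicit ranges are a possible fallback. The sketch below assumes ASCII input and sets LC_ALL=C so that a-z and A-Z mean only the ASCII letters. Keep in mind that the complement also covers the newline, so tr deletes it as well.

```shell
text='my #1 _ example!'

# -c complements the set and -d deletes, so everything except
# letters, digits, and underscore is dropped (including any newline).
cleaned=$(printf '%s' "$text" | LC_ALL=C tr -cd 'a-zA-Z0-9_')

echo "$cleaned"
```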

1 Comment

Note that the [:class:] features in tr aren't consistent across implementations and may be missing altogether. For instance, busybox's tr lacks them (or did, the last time I checked).
0

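What do you mean you can't use the regex syntax in bash? (Strictly speaking, parameter expansion takes glob patterns rather than regexes, but a bracket expression like this one works in both.)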

$ text="my #1 example!"
$ echo ${text//[^a-zA-Z0-9_]/}
my1example

You have to use // to replace more than one match.

For your second question, with bash 3.2+, leave the regex unquoted (a quoted right-hand side of =~ is matched as a literal string):

$ [[ $text =~ (my).*(example) ]]
$ echo ${BASH_REMATCH[1]}
my
$ echo ${BASH_REMATCH[2]}
example
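
In a script it is usually worth checking whether the match succeeded before reading BASH_REMATCH. The sketch below keeps the regex in a variable (re is just an illustrative name), which is one common way to avoid quoting problems on the right-hand side of =~. It assumes bash 3.2 or later.

```shell
#!/usr/bin/env bash
text='my #1 example!'

# Single quotes protect the regex during assignment; the unquoted $re
# on the right of =~ is then treated as a regex, not a literal string.
re='(my).*(example)'

if [[ $text =~ $re ]]; then
  whole=${BASH_REMATCH[0]}    # entire matched portion
  first=${BASH_REMATCH[1]}    # first parenthesized group
  second=${BASH_REMATCH[2]}   # second parenthesized group
  echo "whole=$whole first=$first second=$second"
else
  echo "no match" >&2
fi
```

Index 0 holds the whole match and the numbered groups follow, which is similar to how $matches is laid out by preg_match.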
