2

Problem

I was trying to get a sed command to do the same thing I can do with the Python regex flavour, but I ran into some problems.

Python regex example (tested on regex101, where it worked fine):

find: (https.*?)

replace: "\1"

Unsuccessful code:

sed 's/\(https.*?\)[:space:]/\"\1\"/g' .\elenco.txt

elenco.txt file:

https://www.youtube.com/watch?app=desktop&v=Ot34P0yyQqI&t=984s https://www.youtube.com/watch?v=vviniZjvDQs  https://www.youtube.com/watch?v=Ih7qgkyo_oo  https://www.youtube.com/watch?v=X6UEDpwI3HI  https://www.youtube.com/watch?v=nShgaRMNlLw  https://www.youtube.com/watch?v=nd_jN-C_Juw  https://www.youtube.com/watch?v=aOtqox2uB3Y

Expected output:

"https://www.youtube.com/watch?app=desktop&v=Ot34P0yyQqI&t=984s" "https://www.youtube.com/watch?v=vviniZjvDQs"  "https://www.youtube.com/watch?v=Ih7qgkyo_oo"  "https://www.youtube.com/watch?v=X6UEDpwI3HI"  "https://www.youtube.com/watch?v=nShgaRMNlLw"  "https://www.youtube.com/watch?v=nd_jN-C_Juw"  "https://www.youtube.com/watch?v=aOtqox2uB3Y"

Actual output:

"https://www.youtube.com/watch?"pp=desktop&v=Ot34P0yyQqI&t=984s https://www.youtube.com/watch?v=vviniZjvDQs  https://www.youtube.com/watch?v=Ih7qgkyo_oo  https://www.youtube.com/watch?v=X6UEDpwI3HI  https://www.youtube.com/watch?v=nShgaRMNlLw  https://www.youtube.com/watch?v=nd_jN-C_Juw  https://www.youtube.com/watch?v=aOtqox2uB3Y

Info

OS

Name: Microsoft Windows 11 Home
Version: 10.0.26100 N/A build 26100

Installed sed through winget install bmatzelle.Gow

I've always avoided POSIX regex and the like, as I find it unnecessarily complicated and limited compared to the regex flavour available in Perl, Python, etc.

Are there any options other than installing Perl/Python? 200 MB for Strawberry Perl (Perl on Windows) seems like overkill and useless bloat just to get access to Perl-flavour regex, and sed, unlike Perl, doesn't support 'easy' regex...

https://askubuntu.com/questions/1050693/sed-with-pcre-like-grep-p

2
  • If you don't want a perl solution, you probably shouldn't have the perl tag on your question. Also, I am not familiar with windows, but perhaps adding such a tag can get you a native solution that requires no installs. Commented May 3 at 17:07
  • Try: sed 's#\(https\?://[^ ]\+\)#"\1"#g' (NB: GNU sed 4.7; the group needs \( and \) in a BRE for \1 to be valid). Commented May 3 at 17:10

4 Answers

3

If you can use sed's -E (--regexp-extended) option, the pattern syntax behaves more like you expect:

sed -E 's/(http[^ ]+)/"\1"/g' elenco.txt

3

The non-whitespace escape \S (uppercase, the opposite of \s), documented in the GNU sed manual as a regexp extension, fits here because we want to match everything up to the next whitespace character:

sed 's/\(https\S*\)/"\1"/g' elenco.txt

or, equivalently for this input, with -E:

sed -E 's/(https\S+)/"\1"/g' elenco.txt

2

It's pretty trivial to do something like this in Perl, and 200 MB seems pretty small these days. You can also do this with Windows Subsystem for Linux (WSL): install WSL, run bash from a command prompt, then run sudo apt install perl. I use WSL from the command line in Windows all the time. It's very small and incredibly useful. PCRE-style regular expressions are valuable because they are portable: you don't have to rewrite your regular expressions for every minor wrinkle in every language.

Basically, look for anything that is not whitespace, up to a space or the end of the line. Capture all of that and put quotes around it in a global match.

Here is the code golfed at 21 characters...

$ perl -pe 's/(\S+)( |$)+/"\1" /g' elenco.txt 
"https://www.youtube.com/watch?app=desktop&v=Ot34P0yyQqI&t=984s" "https://www.youtube.com/watch?v=vviniZjvDQs" "https://www.youtube.com/watch?v=Ih7qgkyo_oo" "https://www.youtube.com/watch?v=X6UEDpwI3HI" "https://www.youtube.com/watch?v=nShgaRMNlLw" "https://www.youtube.com/watch?v=nd_jN-C_Juw" "https://www.youtube.com/watch?v=aOtqox2uB3Y"

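The output matches your expected output apart from whitespace: runs of spaces between URLs are collapsed to a single space, and a trailing space is added at the end of the line. If you can install Perl, it is probably worth your time to do so. It gets really difficult to manage regular expressions across different languages unless they are all PCRE. IMO Perl is better in every way than both sed and awk.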

To modify the original input file with the quoted URLs you could run something like this...

$ perl -i -pe 's/(\S+)( |$)+/"\1" /g' elenco.txt  

However, this is dangerous during testing: the original file is overwritten and will be lost unless you have a backup. It is probably safer to run something like this...

$ perl -pe 's/(\S+)( |$)+/"\1" /g' elenco.txt  > updated_elenco.txt

2

Using any POSIX sed to simply wrap any sequence of 1 or more non-blanks in double quotes:

$ sed -E 's/[^ ]+/"&"/g' elenco.txt
"https://www.youtube.com/watch?app=desktop&v=Ot34P0yyQqI&t=984s" "https://www.youtube.com/watch?v=vviniZjvDQs"  "https://www.youtube.com/watch?v=Ih7qgkyo_oo"  "https://www.youtube.com/watch?v=X6UEDpwI3HI"  "https://www.youtube.com/watch?v=nShgaRMNlLw"  "https://www.youtube.com/watch?v=nd_jN-C_Juw"  "https://www.youtube.com/watch?v=aOtqox2uB3Y"

The same with any awk:

$ awk '{gsub(/[^ ]+/,"\"&\"")} 1' elenco.txt
"https://www.youtube.com/watch?app=desktop&v=Ot34P0yyQqI&t=984s" "https://www.youtube.com/watch?v=vviniZjvDQs"  "https://www.youtube.com/watch?v=Ih7qgkyo_oo"  "https://www.youtube.com/watch?v=X6UEDpwI3HI"  "https://www.youtube.com/watch?v=nShgaRMNlLw"  "https://www.youtube.com/watch?v=nd_jN-C_Juw"  "https://www.youtube.com/watch?v=aOtqox2uB3Y"

Change [^ ] to [^[:space:]] (or \S if you have GNU versions of the tools) if the white space can be more than just blanks.
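
To see the difference that change makes, here is a small sketch with a made-up line that contains a tab (the sample text and the function names are assumptions for illustration only). [^ ] treats only the space character as a separator, so a tab ends up inside the quotes, whereas [^[:space:]] splits on the tab as well.

```shell
#!/bin/sh
# Quote runs of characters that are not a literal space.
quote_non_blank() { sed -E 's/[^ ]+/"&"/g'; }

# Quote runs of characters that are not any kind of whitespace.
quote_non_space() { sed -E 's/[^[:space:]]+/"&"/g'; }

# Hypothetical input: a TAB between the first two fields, a space before the third.
printf 'one\ttwo three\n' | quote_non_blank   # TAB stays inside the first pair of quotes
printf 'one\ttwo three\n' | quote_non_space   # every field is quoted separately
```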

To do it the way you were trying to do it in the question using a BRE and including https and [:space:] in the regexp and then referring to the capture group \1 in the replacement would be:

$ sed 's/\(https[^[:space:]]*\)/"\1"/g' elenco.txt
"https://www.youtube.com/watch?app=desktop&v=Ot34P0yyQqI&t=984s" "https://www.youtube.com/watch?v=vviniZjvDQs"  "https://www.youtube.com/watch?v=Ih7qgkyo_oo"  "https://www.youtube.com/watch?v=X6UEDpwI3HI"  "https://www.youtube.com/watch?v=nShgaRMNlLw"  "https://www.youtube.com/watch?v=nd_jN-C_Juw"  "https://www.youtube.com/watch?v=aOtqox2uB3Y"
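
As for why the command in the question produced the output it did, here is a minimal sketch. It relies on two facts about BREs: an unescaped ? is a literal question mark (sed has no lazy .*?), and [:space:] written outside another pair of brackets is an ordinary bracket expression matching one of the characters :, s, p, a, c, e. Some GNU sed versions reject a bare [:space:] with an error, so the sketch spells that set out as [aceps:] instead. The function names broken and fixed and the shortened sample line are only illustrative.

```shell
#!/bin/sh
# Shortened, hypothetical sample line (two URLs instead of seven).
line='https://www.youtube.com/watch?app=desktop&t=984s https://www.youtube.com/watch?v=vviniZjvDQs'

# What the command in the question effectively asked for:
#   https, anything (greedy), a literal '?', then ONE of the characters a c e p s :
# Only the first URL has '?' followed by such a character ('a' in "app"),
# so the match ends there and that 'a' is swallowed by the replacement.
broken() { sed 's/\(https.*?\)[aceps:]/"\1"/g'; }

# Intended behaviour: https followed by every non-whitespace character.
fixed() { sed 's/\(https[^[:space:]]*\)/"\1"/g'; }

printf '%s\n' "$line" | broken
printf '%s\n' "$line" | fixed
```

The first call reproduces the stray quote after watch? and the missing a seen in the question; the second quotes each URL in full.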
