Grep for URL parsing - bash script programming

Question

I am trying to learn some bash scripting and I can't work out how to use grep to split a URL, for example:

blabla1.com         
blabla2.gov         
blabla3.fr

I just want to keep com, gov and fr (without the '.' character) and ignore whatever comes before the '.'.

Thanks in advance.

That is not what grep is for. grep stands for "Globally search for a Regular Expression and Print" the matching lines. What you are describing is a job for some other tool like sed or awk. If your URLs are included among other text in files, post some samples of that full text. Also post exactly what your expected output would be given your posted sample input.
– Ed Morton, Commented Apr 21, 2015 at 21:45
@Ed Morton You are right. grep is not what I wanted in the end because it only prints. I think awk is more suitable (I am studying it). Sorry for the late answer.
– User1911, Commented Apr 22, 2015 at 15:37
It depends on what you are really trying to do. If you're doing a simple substitution on individual lines then sed is the right tool. If it's more than that then you'd use awk. We can't tell from what you've posted so far.
– Ed Morton, Commented Apr 22, 2015 at 16:13

John Bollinger · Accepted Answer · 2015-04-23 18:21:58Z

2

Grep is a tool for matching text. You need something else if you want to transform text. If you have the values in question in a bash variable, then what you ask is pretty easy:

authority=blabla.com

# Here's the important bit:
domain=${authority/*./}

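echo "$domain"

The odd-looking syntax in the middle is a bash pattern substitution on the value of the variable authority: the longest match of the glob *. (everything up to and including the last dot) is replaced with nothing.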
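As a rough sketch, here is that substitution next to the prefix-removal form ${var##pattern}, which is another common way to get the same result. The variable names domain_subst and domain_strip are only illustrative, and the hostname is the first sample from the question.

```shell
#!/usr/bin/env bash
authority=blabla1.com

# Pattern substitution (bash): replace the longest match of the glob '*.' with nothing.
domain_subst=${authority/*./}

# Prefix removal: strip the longest leading match of the glob '*.'.
domain_strip=${authority##*.}

echo "$domain_subst"
echo "$domain_strip"
```

Both forms leave only the text after the last dot, so a name with several dots still yields just its final label.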

If you're trying to do this on lines of a file, then the sed program is your friend:

sed 's/.*\.//' < input.file

This is again a pattern substitution, but sed uses regular expression patterns, whereas bash uses shell glob patterns.
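A minimal sketch of the sed approach on the sample lines from the question. The trailing spaces on the first line and the extra expression that strips them are assumptions, added in case the real input carries trailing whitespace.

```shell
#!/usr/bin/env bash
# Sample lines; the first one has trailing spaces (an assumed input quirk).
input=$(printf '%s\n' 'blabla1.com   ' 'blabla2.gov' 'blabla3.fr')

# First expression: drop trailing whitespace.
# Second expression: delete everything up to and including the last dot.
tlds=$(printf '%s\n' "$input" | sed -e 's/[[:space:]]*$//' -e 's/.*\.//')

echo "$tlds"
```

Because .* is greedy, the second expression removes everything through the last dot on each line.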

edited Apr 23, 2015 at 18:21

answered Apr 21, 2015 at 20:07

John Bollinger


Sergey Grigoriev · Accepted Answer · 2015-04-21 20:15:07Z

1

grep -E -o '[^.]+$' < input

-o instructs grep to print only the matching part of the line

-E switches on extended regexp which is needed for + quantifier

[^.]+$ matches one or more characters that are not a dot, anchored at the end of the line, that is, everything after the last dot
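A quick sketch of how this command behaves, run on the sample input from the question plus one hypothetical line that contains no dot at all (localhost is an assumption, not part of the question).

```shell
#!/usr/bin/env bash
# Sample input taken from the question.
tlds=$(printf '%s\n' blabla1.com blabla2.gov blabla3.fr | grep -E -o '[^.]+$')
echo "$tlds"

# A line without any dot (hypothetical input) matches in full.
nodot=$(printf '%s\n' localhost | grep -E -o '[^.]+$')
echo "$nodot"
```

Since the pattern does not require a dot, a line without one is printed whole. The -o option is an extension rather than part of POSIX grep, although GNU and BSD grep both provide it.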

answered Apr 21, 2015 at 20:15

Sergey Grigoriev


1 Comment

Ed Morton Over a year ago

@User1911 a tool that produces the output you want from some specific sample input is just the starting point for a solution. The above is unnecessarily complex if all you want is the part after a . in a file, other tools can do it more simply and portably.

jherran · Accepted Answer · 2015-04-21 20:14:51Z

0

Try this way:

grep -o -E '[a-z]{2,3}\b' input > output

-o, --only-matching: Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

$ cat input
blabla1.com
blabla2.gov
blabla3.fr

$ cat output
com
gov
fr

answered Apr 21, 2015 at 20:14

jherran


Ed Morton · Accepted Answer · 2015-04-21 21:49:24Z

0

$ cut -d. -f2 file
com
gov
fr

If that's not all you need, post some more truly representative input and expected output so we can help you find the right solution.
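cut -d. -f2 selects the second dot-separated field, which fits the sample because each line contains exactly one dot. As a sketch of what changes with more dots, the example below uses a hypothetical hostname (www.example.co.uk is an assumption, not from the question) and compares cut with awk printing its last field, $NF.

```shell
#!/usr/bin/env bash
# Hypothetical hostname with several dots.
host=www.example.co.uk

# cut prints the second dot-separated field.
second=$(printf '%s\n' "$host" | cut -d. -f2)

# awk with '.' as the field separator prints the last field, whatever the field count.
last=$(printf '%s\n' "$host" | awk -F. '{ print $NF }')

echo "$second"
echo "$last"
```

For single-dot input such as blabla1.com both commands print the same thing. They differ only once a line has more than one dot.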

answered Apr 21, 2015 at 21:49

Ed Morton
