3

I am trying to work with a text file named itemlist.txt that contains:

http://example.com/item-a
http://example.com/item-b
http://example.com/item-c
http://example.com/item-d
http://example.com/item-e

I've tried many different variations of code. Some will return just the item but not the url. I can't figure out how to assign $url correctly. This is about the closest I've come to achieving the desired output.

#!/bin/bash

while read url; do 
for item in $(sed "s/http:\/\/example.com\///g"); do
echo $item $url; done
done < itemlist.txt

The desired output is:

item-a http://example.com/item-a
item-b http://example.com/item-b
item-c http://example.com/item-c
item-d http://example.com/item-d
item-e http://example.com/item-e

But instead I am getting:

item-b http://example.com/item-a
item-c http://example.com/item-a
item-d http://example.com/item-a
item-e http://example.com/item-a

Can someone shed some light on how to do this correctly?

3 Answers

7

Don't use sed; just use parameter expansion to remove everything up to and including the final / in the URL.

while IFS= read -r url; do
    item=${url##*/}
    echo "$item $url"
done < itemlist.txt

(Your problem, by the way, is that both sed and read are reading from itemlist.txt; read gets the first line, and sed consumes the rest. Your while loop exits after the first iteration.)
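
If you would rather keep sed inside the loop, one possible fix is to hand sed only the current line, so it never reads from the loop's input. This is a minimal sketch, not a recommendation over the parameter expansion above; the function name list_items is made up for illustration, and the here-document stands in for itemlist.txt.

```shell
#!/bin/bash

# list_items is a hypothetical helper name; it reads URLs from stdin.
list_items() {
    local url item
    while IFS= read -r url; do
        # Pipe just this one line into sed. Because sed's stdin is the
        # pipe, it cannot consume the remaining lines meant for read.
        item=$(printf '%s\n' "$url" | sed 's#.*/##')
        echo "$item $url"
    done
}

# Sample input standing in for itemlist.txt.
list_items <<'EOF'
http://example.com/item-a
http://example.com/item-b
EOF
```

This starts one sed process per line, which is why the parameter expansion is usually the cheaper choice.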


7 Comments

++ for the explanation; overall, I'd still recommend a single tool invocation over a shell loop.
On second thought: the title suggests that storing results in shell variables may be a requirement, though it's not clear whether that's a mere means to the end of printing the result; either way: use this answer if you need to store each line's results in shell variables; consider the awk solution if it's only stdout output that matters.
Agreed; shell has some rudimentary data processing abilities, but generally it should only be used in support of shell's main raison d'être, which is to execute other programs. Data processing in its own right should be done in a more suitable language, like awk.
@user556068: It doesn't stop at //, because ## means to strip the longest prefix that matches pattern */. For the same reason, a URL that ends in / will have everything stripped, leaving the empty string. To handle this case you either need an intermediate step that removes the trailing / - urlNoTerminator=${url%/} - or you can use [[ $url =~ /([^/]+)/?$ ]] && item="${BASH_REMATCH[1]}"
@mklement0: OK, I think I have a better understanding now of how parameter expansion works. I can see that both of the examples you give work, and I even understand the first one, which is pretty cool. As for the second one, I know it works, but why it works still eludes me. I'll get it eventually. I appreciate everyone answering my questions. I know we're not supposed to say thanks in the comments but, thanks to you and chepner and jaypal for helping me understand this stuff.
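
The two trailing-slash workarounds mentioned in the comments above can be sketched roughly as follows. Both function names are invented for this example, and the regex is stored in a variable only to keep the [[ ... =~ ... ]] test readable; treat it as an illustration rather than a definitive implementation.

```shell
#!/bin/bash

# Variant 1: remove one trailing "/" (if present), then strip the
# longest prefix that ends in "/".
last_segment_expansion() {
    local url=${1%/}
    printf '%s\n' "${url##*/}"
}

# Variant 2: the regex reads as: a "/", then one or more non-slash
# characters (captured), then an optional "/", then end of string.
# The captured group lands in BASH_REMATCH[1].
last_segment_regex() {
    local re='/([^/]+)/?$'
    [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

last_segment_expansion 'http://example.com/item-a/'
last_segment_regex 'http://example.com/item-a/'
```

Because the regex is anchored with $, only the final path segment can match, whether or not the URL ends in a slash.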
3

This answer assumes that printing the results to stdout is sufficient; if, by contrast, you need to store result components in shell variables for each input line, see chepner's helpful answer.

awk is probably the best tool to use here:

awk -F/ '{ print $NF, $0 }' itemlist.txt
  • -F/ splits each input line into fields by /
  • $NF is the last field on each input line
  • $0 is the full input line.
  • print separates its arguments with the value of the built-in variable OFS, which defaults to a single space; setting OFS changes the separator.
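
To illustrate the OFS point, here is a small sketch that sets OFS to a tab with -v. The function name items_tab_separated and the sample input are assumptions made for this example only.

```shell
#!/bin/bash

# Hypothetical wrapper: same awk program as above, but OFS is a tab,
# so print joins the last field and the full line with a tab.
items_tab_separated() {
    awk -F/ -v OFS='\t' '{ print $NF, $0 }'
}

# Sample input standing in for itemlist.txt.
printf '%s\n' 'http://example.com/item-a' 'http://example.com/item-b' |
    items_tab_separated
```

Note that OFS only takes effect where print's arguments are separated by commas; writing print $NF $0 would concatenate the two values with nothing in between.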

2

Well, awk is probably the best tool, as shown in mklement0's answer. However, there's no harm in having another option.

If your sed does not have the -r option, just escape all the parentheses. I have used # as the delimiter; you can use the conventional / instead by escaping the / that is part of the capture group.

The logic is pretty simple. You greedily capture everything until the last piece in a capture group. You capture the last piece in another capture group and just use them to suit your desired output.

$ sed -r 's#(.*/)(.*)$#\2 \1\2#' file
item-a http://example.com/item-a
item-b http://example.com/item-b
item-c http://example.com/item-c
item-d http://example.com/item-d
item-e http://example.com/item-e
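
For reference, the two variations described above (escaped parentheses without -r, and the conventional / delimiter) might look like the following. The function names are made up for this sketch, and the second variant assumes a sed that accepts -E.

```shell
#!/bin/bash

# Basic regular expressions: no -r/-E, so the group parentheses are
# written as \( and \).
swap_bre() {
    sed 's#\(.*/\)\(.*\)$#\2 \1\2#'
}

# Extended regular expressions with / as the delimiter: the literal /
# inside the first capture group must be escaped as \/.
swap_slash_delim() {
    sed -E 's/(.*\/)(.*)$/\2 \1\2/'
}

echo 'http://example.com/item-a' | swap_bre
echo 'http://example.com/item-a' | swap_slash_delim
```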

2 Comments

++; if you use -E instead of -r, it'll also work on OS X (-E isn't documented in GNU sed's man page, but it does work as an alias of -r).
Getting a bit rusty not answering on SO often. ;)
