2

I want to extract the data between td tags in a Unix shell script, in a generalized way.

For example, given the following:

<td style="padding:3px;" align="center">123.456</td>

How can I retrieve 123.456 in a generalized way?

Thanks

2
  • In order to help you, can you post what you have tried so far? Commented Apr 25, 2013 at 10:45
  • What system are you working with? Can you start/install XML Shell (xmlsh)? Commented Apr 25, 2013 at 11:04

4 Answers

2

You can try with sed:

sat:~# cat file
<td style="padding:3px;" align="center">123.456</td>
<td>sat</td>
sat:~#  
sat:~# sed 's/<td\(.*[^<>]\+\?>\)\(.*\)<\/td>/\2/g' file
123.456
sat
sat:~# 

I hope it will help you.
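
The expression above relies on GNU sed extensions (`\+` and `\?`). As a rough sketch of the same idea using only basic regular expressions, something like the following may be more portable. The function name `td_text` is made up for illustration, and it assumes each cell sits on a single line with no nested tags; if a line holds several cells, only the last one is printed.

```shell
# Hypothetical helper: print the text of the last <td>...</td> cell on each line.
# Assumption: one cell per line, no nested tags inside the cell.
td_text() {
    # -n with the trailing p flag prints only lines where the substitution matched
    sed -n 's/.*<td[^>]*>\([^<]*\)<\/td>.*/\1/p' "$@"
}

printf '%s\n' '<td style="padding:3px;" align="center">123.456</td>' '<td>sat</td>' | td_text
```

Called with no arguments it reads standard input; `td_text file` reads the file instead.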

0
sed 's/^.*<td.*>\(.*\)<.*$/\1/' file

0

For a proper, generalized solution, use a proper parser such as html-xml-utils.

For a non-proper, non-generalized way, use sed:

sed 's/^.*>\([0-9.]*\)<.*$/\1/'
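
For the parser route, a minimal sketch with html-xml-utils might look like this. It assumes the package is installed and provides `hxnormalize` and `hxselect`; the wrapper name `td_cells` is hypothetical, and the flags are worth checking against the man pages on your system.

```shell
# Hypothetical helper: print the content of every <td> element, one per line.
# Assumption: hxnormalize and hxselect from html-xml-utils are on the PATH.
td_cells() {
    command -v hxnormalize >/dev/null 2>&1 && command -v hxselect >/dev/null 2>&1 || {
        echo "html-xml-utils not found" >&2
        return 127
    }
    # -x: turn the input into well-formed XML
    # -c: print only the content of each match; -s '\n': newline after each match
    hxnormalize -x "$@" | hxselect -c -s '\n' td
}
```

Usage would be `td_cells page.html`, or pipe the HTML in on standard input. Unlike the sed one-liner, this selects by element name, so it does not depend on how the markup is split across lines.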

0

If for some reason you cannot use an XML parser,

grep was born to extract things. :)

grep -Po '(?<=>)[^<]*'
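
Note that the lookbehind above matches the text after any closing angle bracket, so on a full page it would also print the contents of other tags. A sketch that restricts the match to td cells could look like the following. It assumes GNU grep built with PCRE support (`-P`) and cells without nested tags; `td_grep` is just an illustrative name.

```shell
# Hypothetical helper: print the text of every <td> cell, one per line.
# Assumption: GNU grep with -P (PCRE) support; cells contain no nested tags.
td_grep() {
    # \K drops the opening tag from the match; (?=</td>) requires the closing tag
    grep -Po '<td[^>]*>\K[^<]*(?=</td>)' "$@"
}

printf '%s\n' '<p>skip</p>' '<tr><td>a</td><td align="center">123.456</td></tr>' | td_grep
```

Because `-o` prints each match on its own line, several cells on one input line come out as separate lines.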
