I am new to bash scripting. I have an HTML file, and I want to read it and display its contents in the terminal with formatting.

My HTML file:

<table>
<tr><th >Country Name</th><th >City1</th><th >City2</th><th>City3</th></tr>
<tr><td>CHINA</td><td>500</td><td>700</td><td>1200</td></tr>
<tr><td>USA</td><td>400</td><td>600</td><td>1000</td></tr>
</table>

How can I format the terminal output? I mean, how do I control the spacing between column 1 and column 2?

  • Don't put text in images. Commented Sep 25, 2020 at 19:54
  • It is not text; I just took a screenshot of the HTML file and the terminal so that I could explain my problem easily. Commented Sep 25, 2020 at 19:56
  • There is no such thing as "awk bash". Awk is one programming language. Bash is a different one. A bash script that calls awk (or the inverse) is a script that has different parts written in different languages, run by completely independent interpreters. Commented Sep 25, 2020 at 19:57
  • ...anyhow, if you want to extract content from XML or HTML, there are dedicated tools for that. I strongly recommend using something that leverages XPath, XSLT, and other standardized query languages; my preferred command-line tool (which generates XSLT under the hood in many of its modes) is xmlstarlet. Commented Sep 25, 2020 at 19:57
  • And if you're going to use printf in awk, use it for both values -- you can have it pad out the string to a specific column length. Commented Sep 25, 2020 at 20:00

1 Answer

Option 1: Using column to format your existing code's output

Use the column tool to align your script's output for you:

$ cat test.sh 
#!/bin/bash

pre="<tr><td>"
post="<\/td><\/tr>"
mid="<\/td><td>"

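# Keep only the data rows, strip the tags, then sum the three city columns.
# printf gets an explicit format string so a "%" in the data cannot break it.
grep "<td>" myfile.html | sed -e "s/^$pre//g;s/$post$//g;s/$mid/ /g" | awk '{ sum = $2 + $3 + $4; printf "%s %.0f\n", $1, sum }'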

$ cat myfile.html 
<table>
<tr><th >Country Name</th><th >City1</th><th >City2</th><th>City3</th></tr>
<tr><td>CHINA</td><td>500</td><td>700</td><td>1200</td></tr>
<tr><td>USA</td><td>400</td><td>600</td><td>1000</td></tr>
</table>

$ ./test.sh | column -t
CHINA  2400
USA    2000

Option 2: Updating your existing code's use of printf

If we know the longest possible country-name length, we can tell printf to pad to it. Changing only the awk part of your existing code (in this case, telling it to left-justify the name and pad it to 8 characters):

grep "<td>" myfile.html \
  | sed -e "s/^$pre//g;s/$post$//g;s/$mid/ /g" \
  | awk '{ sum = $2 + $3 + $4; printf "%-8s %.0f\n", $1, sum }'

...we get output:

CHINA    2400
USA      2000
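
If the longest name is not known in advance, one possible approach is to let awk measure it before printing anything. The following is only a sketch: sum_and_align is a hypothetical function name, the sed expression strips every tag rather than using the pre/post/mid variables above, and it assumes (like the original code) that country names contain no spaces.

```shell
#!/bin/bash

# Hypothetical helper: sum_and_align FILE
# Prints "<country> <sum of the numeric columns>" for each <td> row,
# padding the country column to the longest name found in the data.
sum_and_align() {
  grep '<td>' "$1" \
    | sed -e 's/<[^>]*>/ /g' \
    | awk '
        {
          sum = 0
          for (i = 2; i <= NF; i++) sum += $i   # add every column after the name
          name[NR] = $1
          total[NR] = sum
          if (length($1) > width) width = length($1)
        }
        END {
          # Build the format string from the measured width, e.g. "%-5s %.0f\n".
          fmt = "%-" width "s %.0f\n"
          for (r = 1; r <= NR; r++) printf fmt, name[r], total[r]
        }'
}
```

Calling sum_and_align myfile.html on the sample file should print CHINA and USA with their sums lined up in one column. Because the rows are buffered until the END block, this trades a little memory for not having to hard-code the width.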

8 Comments

This is already covered in "How can I align the columns of tables in bash?", which is a member of the duplicate list. (And as covered there, the column tool is not available everywhere bash is, so printf is often better.)
@CharlesDuffy in that case, should I remove my answer? I've only been here 2 days, so I'm not sure about all the "rules".
shrug. If it were me, I'd just flag it community-wiki, but that's a strictly voluntary action -- I can't tell you to do it (but disowning gaining any reputation from an answer, as a community-wiki flag does, tends to make answers that edge up against the rules more acceptable).
@CharlesDuffy done, in that case because I believe this answer can be improved to answer this specific scenario with the printf | awk tool. if someone wants to put more effort on it :)
Heh. Glad to demo using awk's printf for alignment.