
Dear highly appreciated community,

First, let me say thank you for years of valuable reading and learning potential. Searching has always turned up an answer to my questions. Unfortunately, I didn't find any clue this time.

I am writing what I thought would be a small and easy script to download several websites listed in a .csv file.

The file is structured as follows:

[email protected];http://www.url.com/?s=NUMBER&a=NUMBER&l=COUNTRY&c=NUMBER&h=NUMBER

where NUMBER is a number and COUNTRY is the two-letter country code, "uk" or "fr", for example.

The URL always has the same beginning, http://www.URL.com/?s=, followed by 4 settings.

I would be satisfied with just downloading those hundreds of websites as they are, because they do not contain any special images.

My script looks like this:

#!/bin/bash
while read line
do
    #echo $line
    #curl -o download/test.htm $line
    varA="$( echo $line|awk -F';' '{print $1}' )"
    varB="$( echo $line|awk -F';' '{print $2}' )"
    varB1="$( echo $varB|awk -F'&' '{print $2}' )"
    varB2="$( echo $varB|awk -F'&' '{print $3}' )"
    varB3="$( echo $varB|awk -F'&' '{print $4}' )"
    varB4="$( echo $varB|awk -F'&' '{print $5}' )"
    echo 'Downloading survey of:'
    echo $varA
    curl -o $varA.htm "http://www.url.com/?s=771223&"$varB1"&"$varB2"&"$varB3"&"$varB4
    echo "--------------------------------------------------------------"
    echo ""
done < Survey.csv

The downloaded page always contains an HTTP 400 error.

I already tried curl -o $varA.htm $varB, which also returned the HTTP 400 error.

Thinking the '&' was the culprit, I wrote the script you see above as my latest attempt.

Many thanks in advance! Andre

  • Repeated use of awk is a very inefficient way to parse a line. IFS=";" read varA varB and IFS="&" read _ varB1 varB2 varB3 varB4 <<< "$varB" are far superior. Commented Mar 9, 2014 at 15:20
  • Quote your variable expansions; see if curl -o "$varA.html" "$varB" works. Commented Mar 9, 2014 at 15:22
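
The read-based parsing suggested in the first comment can be sketched as follows. The sample line, including the address user@example.com and the numeric values, is made up for illustration; only the field layout comes from the question.

```bash
#!/bin/bash
# Hypothetical sample line in the format described in the question.
line='user@example.com;http://www.url.com/?s=1&a=2&l=uk&c=3&h=4'

# Split once on ';' into the address and the URL.
IFS=';' read -r varA varB <<< "$line"

# Split the URL on '&'; the first piece (base URL plus s=...) is discarded into _.
IFS='&' read -r _ varB1 varB2 varB3 varB4 <<< "$varB"

# Print each parsed field on its own line.
printf '%s\n' "$varA" "$varB1" "$varB2" "$varB3" "$varB4"
```

Both splits happen inside the shell, so no awk process is started per field.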

2 Answers


Similar to the remarks by @chepner, try something like:

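# Fields after splitting on ; ? &: address, base URL, s=..., a=..., l=..., c=..., h=...
# varS absorbs the original s=... value, which is replaced by the fixed s=771223.
while IFS=';?&' read -r varA varB0 varS varB1 varB2 varB3 varB4
do
  echo 'Downloading survey of:'
  echo "$varA"
  curl -o "$varA.htm" "http://www.url.com/?s=771223&${varB1}&${varB2}&${varB3}&${varB4}"
done < Survey.csv

or in this case where the last 4 variables are used unchanged:

# The last variable receives the remainder of the line, delimiters included.
while IFS=';?&' read -r varA varB0 varS rest
do
  echo 'Downloading survey of:'
  echo "$varA"
  curl -o "$varA.htm" "http://www.url.com/?s=771223&$rest"
done < Survey.csv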
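
Before starting hundreds of downloads, it may help to print the URLs the loop would request. The following is a dry-run sketch rather than part of the answer as posted: the helper name build_url, the variable varS (used to skip the original s= value), and the sample line are all assumptions made for illustration.

```bash
#!/bin/bash
# build_url is a hypothetical helper: it turns one CSV line into the output
# file name and the URL the loop would request, without contacting the server.
build_url() {
  local varA varB0 varS rest
  # Fields: address, base URL, original s=..., then the remaining settings unsplit.
  IFS=';?&' read -r varA varB0 varS rest <<< "$1"
  printf '%s -> %s\n' "$varA.htm" "http://www.url.com/?s=771223&$rest"
}

# Made-up sample line; with real data, call build_url for each line of Survey.csv.
build_url 'user@example.com;http://www.url.com/?s=1&a=2&l=uk&c=3&h=4'
```

Once the printed URLs look right, the printf line can be swapped for the quoted curl call.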



Rather than calling awk several times, you can do this with a single awk command:

s='[email protected];http://www.url.com/?s=NUMBER&a=NUMBER&l=COUNTRY&c=NUMBER&h=NUMBER'
awk -F '[;&?]' '{for (i=1; i<=NF; i++) print $i}' <<< "$s"
[email protected]
http://www.url.com/
s=NUMBER
a=NUMBER
l=COUNTRY
c=NUMBER
h=NUMBER

You can store the results in a Bash array:

arr=( $(awk -F '[;&?]' '{for (i=1; i<=NF; i++) printf "%s ", $i}' <<< "$s") )
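
If the goal is only a Bash array, the awk call is optional, since read -a can split on the same delimiters directly. This is an alternative sketch, not the method given in the answer, and the sample line is again made up.

```bash
#!/bin/bash
# Hypothetical sample line in the format described in the question.
s='user@example.com;http://www.url.com/?s=1&a=2&l=uk&c=3&h=4'

# Split on ; & ? straight into the array arr, without a subshell or awk.
IFS=';&?' read -r -a arr <<< "$s"

# Show the number of elements, then each element on its own line.
printf '%s\n' "${#arr[@]}" "${arr[@]}"
```

Because the expansion is never left unquoted, fields are not subject to glob expansion the way the output of an unquoted command substitution would be.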
