
I'm trying to use curl/wget to get the list of directory/file names available in a directory listing on a web server.

For example, from (randomly chosen) http://prodata.swmed.edu/download/, I'm trying to download:

bin
dev
etc
member
pub
usr
usr1
usr2

curl (curl http://prodata.swmed.edu/download/) gets me the whole HTML page, which I'd need to parse manually for all the file/directory entries.

Is there a way to download only the names of the available files/directories with curl/wget, without installing an additional parser?

2 Answers


The HTTP protocol has no feature for requesting a "list of files" from an HTTP server.

curl, wget, or a browser simply requests a URL containing an arbitrary request string, and the server sends back some arbitrary data.

However, you can extract the names with the following command:

curl --silent http://prodata.swmed.edu/download/ | grep -o 'href=".*">' | sed 's/href="//;s/\/">//'  

bin
dev
etc
member
pub
usr
usr1
usr2
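
If wget is preferred over curl, a roughly equivalent pipeline (the same approach, using wget's -q and -O - options to dump the page to stdout) would be:

wget -q -O - http://prodata.swmed.edu/download/ | grep -o 'href=".*">' | sed 's/href="//;s/\/">//'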

curl -s http://example.com/files/ | grep -o 'href=".*">' | sed -e "s/href=\"//g" | sed -e 's/">//g'

This gives me an ls-like view of the directory.
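
For convenience, this can be wrapped in a small shell function (a sketch only; the name lsurl and the example URL are arbitrary):

lsurl() { curl -s "$1" | grep -o 'href=".*">' | sed -e 's/href="//g' -e 's/">//g'; }
lsurl http://prodata.swmed.edu/download/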

  • curl -s http://10.10.1.2/WEBDATA/ | grep -o 'href=".*">' | sed -e "s/href=\"//g" | sed -e 's/">//g' is what works for me. Commented Jun 15, 2023 at 11:49
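
If GNU grep with PCRE support (-P) is available, a slightly more robust extraction that avoids the greedy .* (which can swallow several links sharing one line) would be the following; note that it prints the raw href values, trailing slashes included:

curl -s http://prodata.swmed.edu/download/ | grep -oP 'href="\K[^"]+'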
