
I need to URL encode a string using a shell function that will run in BusyBox Ash, Dash, Bash and ZSH.

It will run in different Docker containers, so it should need as few additional dependencies as possible.

Notes:

  • The question "URL encoding a string in bash script" asks for a Bash-specific script, and the only provided answer depends on PHP being installed on the container.

  • The question "How to urlencode data for curl command?" is specific to curl but is open to general answers. However, none of its 25 answers seems to apply: one only works for sending data with curl; some are specific to bash or ksh; others require Perl, PHP, Python, Lua, NodeJS, Ruby, gridsite-clients, uni2ascii, jq, awk or sed to be installed. One needs no additional dependencies but doesn't preserve characters like a, 1 and ~.

What I'd expect to have:

$> urlencode '/'
%2f

$> urlencode 'ç'
%c3%a7

$> urlencode '*'
%2a

$> urlencode abc-123~6
abc-123~6

$> urlencode 'a test ?*ç '
a%20test%20%3f%2a%c3%a7%20

1 Answer

The functions below have been tested in BusyBox Ash, Dash, Bash and ZSH.

They use only shell builtins and core utilities (fold, od and tr), so they should work in other shells as well.

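urlencodepipe() {
  local LANG=C; local c; while IFS= read -r c; do
    # Unreserved characters pass through unchanged
    case $c in [a-zA-Z0-9.~_-]) printf '%s' "$c"; continue ;; esac
    # Everything else: dump its bytes as hex and prefix each byte with %
    printf '%s' "$c" | od -An -tx1 | tr ' ' % | tr -d '\n'
  done <<EOF
$(fold -w1)
EOF
  echo
}

# printf '%s' keeps % and \ in the input from being read as format directives
urlencode() { printf '%s' "$*" | urlencodepipe ;}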

How it works:

  • Standard input is processed by fold -w1, which re-formats its input so that it is 1 column wide (in other words, it adds a \n after each input character so that each character is on its own line).
  • The here-document <<EOF feeds the output of fold to the while loop.
  • On each iteration, read takes one line (which is only 1 character wide) from that here-document and assigns it to the variable c; the loop ends when the input is exhausted.
  • case tests whether that character needs to be URL encoded. If it doesn't, it is printed as-is and the loop continues with the next character.
  • od -An -tx1 prints each byte of the character as two hex digits, without the offset column.
  • tr ' ' % replaces the space that od puts before each byte with %, and tr -d '\n' removes the newline that od appends.
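
The stages described above can also be run by hand to see what each one produces. This is only an illustrative sketch: encode_char is a hypothetical helper name that is not part of the answer, and the output noted in the comment assumes an od (such as the GNU one) that prints exactly one space before each byte.

```shell
# fold -w1 splits its input so that every character sits on its own line
printf 'a/b' | fold -w1
echo

# encode_char (hypothetical helper): the od | tr | tr stage for one character
encode_char() {
  printf '%s' "$1" | od -An -tx1 | tr ' ' % | tr -d '\n'
}

encode_char '/'   # prints %2f with GNU od
echo
```

With an od that pads its output differently, the last stage would print extra % signs, so the result is worth checking on each target platform.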

2 Comments

Nice, though it's not perfectly portable, as the output from od varies from what you expect in non-Linux environments. For me, urlencode ' ' returns %%%%%%%%%%%20%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%. Replacing the first tr with sed 's/  */%/g;s/%$//' seems to help, but it still fails to translate \n to %0a.
This unfortunately does not work in ZSH on macOS (as @ghoti illustrates above).
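
One way to sidestep both issues raised in these comments might be to let od dump the whole input at once and let the shell split its output into bytes, so that neither od's padding nor newlines in the input matter. The following is a sketch under assumptions, not the answer's tested code: urlencode_od is a hypothetical name, it relies on the shell's printf accepting 0x-prefixed hex arguments and octal escapes in the format string, and it has not been verified on macOS.

```shell
urlencode_od() {
  local byte
  # od -v dumps every byte as two lowercase hex digits; the unquoted
  # command substitution is split on whitespace, so padding is irrelevant.
  for byte in $(printf '%s' "$*" | od -An -v -tx1); do
    case $byte in
      # Hex codes of - . _ ~ 0-9 A-Z a-z: print the character itself
      2d|2e|5f|7e|3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a])
        printf "\\$(printf '%03o' "0x$byte")" ;;
      # Any other byte is percent-encoded
      *) printf '%%%s' "$byte" ;;
    esac
  done
  echo
}

urlencode_od 'a test ?*'
```

Because the input is never split into lines, a newline inside the argument reaches od like any other byte and comes out as %0a.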
