added 22 characters in body
Ed Morton
  • 208.7k
  • 18
  • 90
  • 212

See how-to-use-pseudo-arrays-in-posix-shell-script for a more efficient and arguably easier to understand awk implementation of this idea. Here's the performance difference between the 2 implementations (timed using bash):

$ cat tst.sh
#!/usr/bin/env bash

save_sed(){
    for i do
        printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"
    done
    echo " "
}

save_awk() {
    LC_ALL=C awk -v q=\' '
        BEGIN {
            for ( i=1; i<ARGC; i++ ) {
                gsub(q, q "\\" q q, ARGV[i])
                printf "%s ", q ARGV[i] q
            }
            print ""
        }
    ' "$@"
}

echo "calling sed in a loop:"
time save_sed >/dev/null $(seq 100)
echo ""
echo "calling awk once:"
time save_awk >/dev/null $(seq 100)

See how-to-use-pseudo-arrays-in-posix-shell-script for a more efficient and arguably easier to understand awk implementation of this idea. Here's the performance difference between the 2 implementations:

$ cat tst.sh
#!/usr/bin/env sh

save_sed(){
    for i do
        printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"
    done
    echo " "
}

save_awk() {
    LC_ALL=C awk -v q=\' '
        BEGIN {
            for ( i=1; i<ARGC; i++ ) {
                gsub(q, q "\\" q q, ARGV[i])
                printf q "%s" q, ARGV[i]
            }
            print ""
        }
    ' "$@"
}

echo "calling sed in a loop:"
time save_sed >/dev/null $(seq 100)
echo ""
echo "calling awk once:"
time save_awk >/dev/null $(seq 100)

See how-to-use-pseudo-arrays-in-posix-shell-script for a more efficient and arguably easier to understand awk implementation of this idea. Here's the performance difference between the 2 implementations (timed using bash):

$ cat tst.sh
#!/usr/bin/env bash

save_sed(){
    for i do
        printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"
    done
    echo " "
}

save_awk() {
    LC_ALL=C awk -v q=\' '
        BEGIN {
            for ( i=1; i<ARGC; i++ ) {
                gsub(q, q "\\" q q, ARGV[i])
                printf "%s ", q ARGV[i] q
            }
            print ""
        }
    ' "$@"
}

echo "calling sed in a loop:"
time save_sed >/dev/null $(seq 100)
echo ""
echo "calling awk once:"
time save_awk >/dev/null $(seq 100)
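
As a quick sanity check of the awk version above, here is a minimal round-trip sketch. The sample arguments and the variable name `saved` are made up for illustration; `save_awk` is copied from the script above.

```sh
#!/bin/sh
# save_awk as shown above: emits each argument single-quoted, separated by spaces.
save_awk() {
    LC_ALL=C awk -v q=\' '
        BEGIN {
            for ( i=1; i<ARGC; i++ ) {
                gsub(q, q "\\" q q, ARGV[i])
                printf "%s ", q ARGV[i] q
            }
            print ""
        }
    ' "$@"
}

# Illustrative arguments: an embedded quote, embedded whitespace, an empty string.
set -- "it's" "two words" ""
saved=$(save_awk "$@")

set -- something else      # clobber $@ ...
eval "set -- $saved"       # ... and restore it from the saved string

printf '<%s>\n' "$@"
# prints:
# <it's>
# <two words>
# <>
```

The trailing space after each quoted element is what keeps the elements separate when the string is later passed to eval.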
added 484 characters in body
Ed Morton

See how-to-use-pseudo-arrays-in-posix-shell-script for a more efficient and arguably easier to understand awk implementation of this idea. Here's the performance difference between the 2 implementations:

$ cat tst.sh
#!/usr/bin/env sh

save_sed(){
    for i do
        printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"
    done
    echo " "
}

save_awk() {
    LC_ALL=C awk -v q=\' '
        BEGIN {
            for ( i=1; i<ARGC; i++ ) {
                gsub(q, q "\\" q q, ARGV[i])
                printf q "%s" q, ARGV[i]
            }
            print ""
        }
    ' "$@"
}

echo "calling sed in a loop:"
time save_sed >/dev/null $(seq 100)
echo ""
echo "calling awk once:"
time save_awk >/dev/null $(seq 100)

$ ./tst.sh
calling sed in a loop:

real    0m3.403s
user    0m1.014s
sys     0m2.615s

calling awk once:

real    0m0.042s
user    0m0.015s
sys     0m0.031s

The awk version's performance hardly changes when the number of arguments changes, while the loop+sed version, which spawns one sed process per argument, gets slower or faster by about a factor of 10 when the number of arguments changes by a factor of 10.

See how-to-use-pseudo-arrays-in-posix-shell-script for a more efficient and arguably easier to understand awk implementation of this idea which can be made slightly more efficient still by moving the string concatenation out of the loop so it's only done once:

save() {
    LC_ALL=C awk -v q=\' '
        BEGIN {
            escq = q "\\" q q
            qfmt = q "%s" q
            for ( i=1; i<ARGC; i++ ) {
                gsub(q, escq, ARGV[i])
                printf qfmt, ARGV[i]
            }
            print ""
        }
    ' "$@"
}

See how-to-use-pseudo-arrays-in-posix-shell-script for a more efficient and arguably easier to understand awk implementation of this idea. Here's the performance difference between the 2 implementations:

$ cat tst.sh
#!/usr/bin/env sh

save_sed(){
    for i do
        printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"
    done
    echo " "
}

save_awk() {
    LC_ALL=C awk -v q=\' '
        BEGIN {
            for ( i=1; i<ARGC; i++ ) {
                gsub(q, q "\\" q q, ARGV[i])
                printf q "%s" q, ARGV[i]
            }
            print ""
        }
    ' "$@"
}

echo "calling sed in a loop:"
time save_sed >/dev/null $(seq 100)
echo ""
echo "calling awk once:"
time save_awk >/dev/null $(seq 100)

$ ./tst.sh
calling sed in a loop:

real    0m3.403s
user    0m1.014s
sys     0m2.615s

calling awk once:

real    0m0.042s
user    0m0.015s
sys     0m0.031s

The awk version's performance hardly changes when the number of arguments changes, while the loop+sed version, which spawns one sed process per argument, gets slower or faster by about a factor of 10 when the number of arguments changes by a factor of 10.

added 515 characters in body
Ed Morton

The only array-like structure in a POSIX shell is the $@ list whose elements can be accessed independently with $1,$2,$3,... The number of elements available is given by $#.

$@ elements can be modified using set and shift.

The top-level and each function call has its own separate $@ array which is initialised to the arguments received by the script/function when it is called. Using set or shift inside a function does not affect the top-level $@.
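
To make that concrete, here is a minimal sketch. The function name `show_local` and the sample values are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: it only ever touches its own positional parameters.
show_local() {
    set -- x y      # replaces this function's $@ with two elements
    shift           # drops "x"; the function's $@ is now just "y"
    echo "inside: $# element(s), first is $1"
}

set -- "one two" three four     # load the top-level $@ with three elements
echo "count: $#"                # prints: count: 3
echo "second: $2"               # prints: second: three

show_local ignored              # prints: inside: 1 element(s), first is y

# The top-level list is untouched by the set/shift inside the function.
echo "after call: $# element(s), first is $1"
# prints: after call: 3 element(s), first is one two
```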

It is possible to save all the original elements of $@ somewhere, but I don't believe there is a way to use the result in a form as simple as bash's "${arr[@]}" syntax. (Individual elements may be accessed without too much effort, but not, trivially, the array as a whole.)

However, by appropriately reloading/manipulating the elements of $@ it can be used directly, although performing the manipulation is likely to be rather tedious.
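
For illustration, a rough sketch of what that kind of manipulation can look like. The values and the variable names `n`, `item` and `out` are made up; the eval-based indexed access assumes `n` is a trusted integer.

```sh
#!/bin/sh
set -- alpha beta       # start with two elements

set -- "$@" gamma       # append: alpha beta gamma
set -- zero "$@"        # prepend: zero alpha beta gamma
shift                   # drop the first element: alpha beta gamma

# Indexed access: expand ${n} via eval. Only safe when n is a trusted integer.
n=2
eval "item=\${$n}"
echo "element $n is $item"      # prints: element 2 is beta

# Iterate over the whole list; "for x do" loops over "$@" by default.
out=""
for x do
    out="$out[$x]"
done
echo "$out"                     # prints: [alpha][beta][gamma]
```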

A quick search for ways to accomplish the saving found these approaches:

Rich's code from the second link is probably the simplest and looks like:

save(){
    for i do
        printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"   
    done
    echo " "
}

and is used as:

myarray=$(save "$@")

set -- foo bar baz boo
# ... do stuff with new $@ ...

eval "set -- $myarray"

The eval is safe since $myarray expands to a list of single-quoted strings.
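
A small self-contained sketch of that round trip, using awkward arguments. The sample values and the variable name `saved` are made up; `save` is Rich's function from above.

```sh
#!/bin/sh
# Rich's save function, as quoted above.
save(){
    for i do
        printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"
    done
    echo " "
}

# Arguments with a single quote, whitespace and a glob character.
set -- "it's" "two words" 'a*b'
saved=$(save "$@")

set -- replaced         # overwrite $@ with something else
eval "set -- $saved"    # restore the original three elements

printf '<%s>\n' "$@"
# prints:
# <it's>
# <two words>
# <a*b>
```

Each element comes back intact because every embedded ' is rewritten as '\'' before the element is wrapped in single quotes, so eval never sees an unquoted special character.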

See how-to-use-pseudo-arrays-in-posix-shell-script for a more efficient and arguably easier to understand awk implementation of this idea, which can be made slightly more efficient still by moving the string concatenation out of the loop so it's only done once:

save() {
    LC_ALL=C awk -v q=\' '
        BEGIN {
            escq = q "\\" q q
            qfmt = q "%s" q
            for ( i=1; i<ARGC; i++ ) {
                gsub(q, escq, ARGV[i])
                printf qfmt, ARGV[i]
            }
            print ""
        }
    ' "$@"
}

The only array-like structure in a POSIX shell is the $@ list whose elements can be accessed independently with $1,$2,$3,... The number of elements available is given by $#.

$@ elements can be modified using set and shift.

The top-level and each function call has its own separate $@ array which is initialised to the arguments received by the script/function when it is called. Using set or shift inside a function does not affect the top-level $@.

It is possible to save all the original elements of $@ somewhere, but I don't believe there is a way to use the result in a form as simple as bash's "${arr[@]}" syntax. (Individual elements may be accessed without too much effort, but not, trivially, the array as a whole.)

However, by appropriately reloading/manipulating the elements of $@ it can be used directly, although performing the manipulation is likely to be rather tedious.

A quick search for ways to accomplish the saving found these approaches:

Rich's code from the second link is probably the simplest and looks like:

save(){
    for i do
        printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"   
    done
    echo " "
}

and is used as:

myarray=$(save "$@")

set -- foo bar baz boo
# ... do stuff with new $@ ...

eval "set -- $myarray"

The eval is safe since $myarray expands to a list of single-quoted strings.

See how-to-use-pseudo-arrays-in-posix-shell-script for a more efficient awk implementation of this idea.

The only array-like structure in a POSIX shell is the $@ list whose elements can be accessed independently with $1,$2,$3,... The number of elements available is given by $#.

$@ elements can be modified using set and shift.

The top-level and each function call has its own separate $@ array which is initialised to the arguments received by the script/function when it is called. Using set or shift inside a function does not affect the top-level $@.

It is possible to save all the original elements of $@ somewhere, but I don't believe there is a way to use the result in a form as simple as bash's "${arr[@]}" syntax. (Individual elements may be accessed without too much effort, but not, trivially, the array as a whole.)

However, by appropriately reloading/manipulating the elements of $@ it can be used directly, although performing the manipulation is likely to be rather tedious.

A quick search for ways to accomplish the saving found these approaches:

Rich's code from the second link is probably the simplest and looks like:

save(){
    for i do
        printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"   
    done
    echo " "
}

and is used as:

myarray=$(save "$@")

set -- foo bar baz boo
# ... do stuff with new $@ ...

eval "set -- $myarray"

The eval is safe since $myarray expands to a list of single-quoted strings.

See how-to-use-pseudo-arrays-in-posix-shell-script for a more efficient and arguably easier to understand awk implementation of this idea which can be made slightly more efficient still by moving the string concatenation out of the loop so it's only done once:

save() {
    LC_ALL=C awk -v q=\' '
        BEGIN {
            escq = q "\\" q q
            qfmt = q "%s" q
            for ( i=1; i<ARGC; i++ ) {
                gsub(q, escq, ARGV[i])
                printf qfmt, ARGV[i]
            }
            print ""
        }
    ' "$@"
}
added 119 characters in body
Ed Morton
added 94 characters in body
jhnc
  • 18.7k
  • 2
  • 14
  • 33
example code
jhnc
jhnc