1

I have a string that represents a date like so:

    "May 5 2014"

I'd like to know how to extract the "5" from it.

What I've tried so far:

    echo "May 5 2014" | sed 's/[^0-9]*\s//'

That returns "5 2014"

Sorry for the remedial question; I'm just new to bash.

1

6 Answers

5

Use cut:

echo "May 5 2014" | cut -d' ' -f2

or awk:

echo "May 5 2014" | awk '{print $2}'
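One difference between the two is worth a quick sketch: cut treats every single space as a delimiter, while awk splits on runs of whitespace. The padded input below (two spaces before the day) is an assumption for illustration; it is not from the question.

```shell
padded="May  5 2014"   # assumed input: two spaces before a single-digit day

# cut counts the empty string between the two spaces as field 2
with_cut=$(echo "$padded" | cut -d' ' -f2)

# awk collapses the run of spaces, so field 2 is still the day
with_awk=$(echo "$padded" | awk '{print $2}')

echo "cut: '$with_cut'  awk: '$with_awk'"
```

So if the date string may come padded, awk is probably the safer of the two.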

In case you want to do it without external utilities, it's a two-step process:

s="May 5 2014"
t="${s#* }"
echo "${t% *}"
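For readers new to these expansions, here is a minimal sketch of the same two steps wrapped in a function. The name second_field is hypothetical, and it uses %% rather than % in the second step so that any extra trailing fields are dropped as well; that is a small variation on the snippet above, not the original code.

```shell
# Hypothetical helper: print the second space-separated field of its argument.
second_field() {
    local s=$1
    local t=${s#* }     # remove the shortest prefix ending in a space: "5 2014"
    echo "${t%% *}"     # remove the longest suffix starting with a space: "5"
}

day=$(second_field "May 5 2014")
echo "$day"
```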

2 Comments

devnull, my date string is stored in a variable called $a. When I do "newvar = $(echo $a|cut -d' ' -f2)" I get an error that says newvar not found.
@dot Eliminate spaces around =.
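A short sketch of the corrected assignment, using the variable names from the comment above. With spaces around =, the shell reads newvar as a command name, which is where the "not found" error comes from.

```shell
a="May 5 2014"

# No spaces around '=': this is an assignment, not a command called "newvar"
newvar=$(echo "$a" | cut -d' ' -f2)
echo "$newvar"
```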
4

Bash's builtin read command can split input into multiple variables. The '<<<' tells read to take input from the following string.

read first second remainder <<< "May 5 2014"

After which, "$first" will be "May", "$second" will be "5" and "$remainder" will be "2014"

It is common practice to use '_' as a placeholder for uninteresting fields as the shell automatically overwrites $_.

read _ day _ <<< 'May 5 2014 utc'
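Applied to a date held in a variable, this might look like the sketch below. The variable name a is borrowed from the comments on another answer, and -r is added as a general habit so backslashes in the input are kept literal.

```shell
a="May 5 2014"

# The trailing _ absorbs everything after the day
read -r _ day _ <<< "$a"
echo "$day"

# Naming all three fields works the same way
read -r month day year <<< "$a"
echo "$month / $day / $year"
```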

7 Comments

This is really neat :) +1
@dave sines Not only neat, but read month day year <<< "May 5 2014" is also much faster. I did some tests and found it to be over 20 times faster than day=$(echo "May 5 2014" | cut -d' ' -f2). If one were to do the same for month, day and year, it is over 60 times faster. Thank you!
@KeithReynolds @devnull's pure bash solution is 5 times faster and my pure-bash-regex solution is 3 times faster than this read solution. So, it is neat - but not the fastest :)
@jm666 I also found that your pure-bash-regex solution is 3 times faster than this read solution if you're only looking for the day. On the other hand read month day year <<< "May 5 2014" is about the same speed as re="(.*) (.*) (.*)"; [[ $aaa =~ $re ]]; month=${BASH_REMATCH[1]}; day=${BASH_REMATCH[2]};year=${BASH_REMATCH[3]}
@KeithReynolds On my system, for 100000 iterations: read solution 27 sec, regex with 3x assign 10 sec, regex with 1x assign 8 sec, and devnull's solution 5.4 sec. ;) Anyway, it is really not very important - all pure bash solutions are good. ;) :)
4

If you're writing a script that needs to parse date strings, you can surely do it using sed et al, and indeed there are already several answers here that do the trick nicely.

However, my advice would be to let the date program do the heavy lifting for you:

$ date -d "May 5 2014" +%-d
5

The maintainers of the date program have no doubt spent many hours and days getting their date-parsing code right. Why not leverage that work instead of rolling your own?

EDIT

Added a BSD solution (e.g. for Mac OS X):

date -j -f '%b %d %Y' 'May 5 2014' '+%d'

On BSD you need to tell date the format of the "incoming" date with -f format, and it will print the date in the format given by +format. The -j flag means: do not set the system date.
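If a script has to run on both kinds of system, one possible approach is a small wrapper that picks the syntax at run time. This is only a sketch: the function name day_of_month is hypothetical, and it assumes that only GNU date accepts --version. Since %d is zero-padded, a leading zero is stripped at the end.

```shell
# Hypothetical wrapper: print the day of month for a date like "May 5 2014".
day_of_month() {
    local d
    if date --version >/dev/null 2>&1; then
        # Assumed to be GNU date, which parses the string on its own
        d=$(date -d "$1" +%d) || return 1
    else
        # Assumed to be BSD date, which needs the input format spelled out
        d=$(date -j -f '%b %d %Y' "$1" +%d) || return 1
    fi
    echo "${d#0}"   # strip a leading zero: "05" becomes "5"
}

day_of_month "May 5 2014"
```

Note that on the BSD side %b matches month names in the current locale, so a non-English locale may need LC_ALL=C.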

5 Comments

Nice one! Unfortunately it doesn't work on BSD systems (OS X), which need a different syntax.
Note that although date doesn't accept any and all date formats you can think of (e.g., it complains about "May 5th 2014"), it still is much more flexible than assuming a single format. For example, date will accept dates such as "5/5/2014", "May 5", "2014-05-05", "2014-5-5", and others.
Doing anything other than this is a bit bizarre for date parsing (and a duplicate of so many other questions). No mention of BSD in the question. +1
@BroSlow Linux isn't mentioned either... You can't assume that everybody uses GNU date; there are many Mac users here too. Anyway, I agree with the answer - parsing dates with date is nice - but you need to take care with the different OS syntax.
@jm666 Nothing against bsd (though I dislike bsd variants of some tools like find, stat, etc...), gnu is just more prevalent, and questions where OP is asking about bsd tend to get tagged with something like osx, solaris, bsd, etc... But obviously nice to provide multiple solutions as you have.
3

with sed, one possibility is:

echo "May 5 2014" | sed 's/.* \([0-9]*\) .*/\1/'

another one

echo "May 5 2014" | sed 's/[^ ]* //;s/ [^ ]*//'

another

echo "May 5 2014" | sed 's/\(.*\) \(.*\) \(.*\)/\2/'

with grep

echo "May 5 2014" | grep -oP '\b\d{1,2}\b'

or perl

echo "May 5 2014" | perl -lanE 'say $F[1]'

as a curiosity

echo "May 5 2014" | xargs -n1 | head -2 | tail -1
echo "May 5 2014" | xargs -n1 | sed -n 2p
echo "May 5 2014" | xargs -n1 | egrep '^[0-9]{1,2}$'

and finally, a pure bash solution, without starting any external commands

aaa="May 5 2014"
[[ $aaa =~ (.*)[[:space:]](.*)[[:space:]](.*) ]] && echo ${BASH_REMATCH[2]}

or

aaa="May 5 2014"
re="(.*) (.*) (.*)"
[[ $aaa =~ $re ]] && echo ${BASH_REMATCH[2]}
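The greedy (.*) groups accept anything, so a string with extra fields can shift which text lands in group 2. If the input format is known, a stricter pattern is one option. The pattern below is an assumption about the format (month name, 1-2 digit day, 4-digit year), not part of the solutions above.

```shell
aaa="May 5 2014"

# Assumed format: alphabetic month, 1-2 digit day, 4-digit year
re='^([[:alpha:]]+) +([0-9]{1,2}) +([0-9]{4})$'

if [[ $aaa =~ $re ]]; then
    day=${BASH_REMATCH[2]}
    echo "$day"
else
    echo "unexpected date format: $aaa" >&2
fi
```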

EDIT

Because Keith Reynolds asked for some benchmarks, I tested the following script. Using time is not a perfect benchmarking tool, but it gives some insight.

  • each test outputs the result N times (the output lines are counted by wc)
  • NOTE: the external commands are executed only 10_000 times, while the pure bash solutions are executed 100_000 times

Here is the script:

xbench_with_read() {
    let i=$1; while ((i--)); do
        read _ day _ <<< 'May 5 2014'
        echo $day
    done
}

xbench_regex_3x_assign() {
    let i=$1; while ((i--)); do
        aaa="May 5 2014"
        re="(.*) (.*) (.*)"
        [[ $aaa =~ $re ]] && month="${BASH_REMATCH[1]}" && day="${BASH_REMATCH[2]}" && year="${BASH_REMATCH[3]}" && echo "$day"
    done
}

xbench_regex_1x_assign() {
    let i=$1; while ((i--)); do
        aaa="May 5 2014"
        re="(.*) (.*) (.*)"
        [[ $aaa =~ $re ]] && day=${BASH_REMATCH[2]} && echo "$day"
    done
}

xbench_var_expansion() {
    let i=$1; while ((i--)); do
        s="May 5 2014"
        t="${s#* }"
        echo "${t% *}"
    done
}

xbench_ext_cut() {
    let i=$1; while ((i--)); do
        echo "May 5 2014" | cut -d' ' -f2
    done
}

xbench_ext_grep() {
    let i=$1; while ((i--)); do
        echo "May 5 2014" | grep -oP '\b\d{1,2}\b'
    done
}

xbench_ext_sed() {
    let i=$1; while ((i--)); do
        echo "May 5 2014" | sed 's/\(.*\) \(.*\) \(.*\)/\2/'
    done
}

xbench_ext_xargs() {
    let i=$1; while ((i--)); do
        echo "May 5 2014" | xargs -n1 | sed -n 2p
    done
}

title() {
    echo '~ -'$___{1..20} '~' >&2
    echo "Timing $1 $2 times" >&2
}

for script in $(compgen -A function | grep xbench)
do
    cnt=100000
    #external programs run 10x less times
    [[ $script =~ _ext_ ]] && cnt=$(( $cnt / 10 ))
    title $script $cnt
    time $script $cnt | wc -l
done

and here are the raw results:

~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~
Timing xbench_ext_cut 10000 times
   10000

real    0m37.752s
user    0m14.587s
sys 0m25.723s
~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~
Timing xbench_ext_grep 10000 times
   10000

real    1m35.570s
user    0m21.778s
sys 0m34.524s
~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~
Timing xbench_ext_sed 10000 times
   10000

real    0m41.628s
user    0m15.310s
sys 0m26.422s
~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~
Timing xbench_ext_xargs 10000 times
   10000

real    1m42.235s
user    0m46.601s
sys 1m11.238s
~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~
Timing xbench_regex_1x_assign 100000 times
  100000

real    0m11.215s
user    0m8.784s
sys 0m0.907s
~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~
Timing xbench_regex_3x_assign 100000 times
  100000

real    0m14.669s
user    0m12.419s
sys 0m1.027s
~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~
Timing xbench_var_expansion 100000 times
  100000

real    0m5.148s
user    0m4.658s
sys 0m0.788s
~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~ - ~
Timing xbench_with_read 100000 times
  100000

real    0m27.700s
user    0m6.279s
sys 0m19.724s

So, sorted by real execution time:

pure bash solutions 100_000 times

  1. xbench_var_expansion - real 0m5.148s - 5.2 sec
  2. xbench_regex_1x_assign - real 0m11.215s - 11.2 sec
  3. xbench_regex_3x_assign - real 0m14.669s - 14.7 sec
  4. xbench_with_read - real 0m27.700s - 27.7 sec

No surprises here - the variable expansion is simply the fastest solution.

external programs only 10_000 times

  1. xbench_ext_cut - real 0m37.752s - 37.8 sec
  2. xbench_ext_sed - real 0m41.628s - 41.6 sec
  3. xbench_ext_grep - real 1m35.570s - 95.6 sec
  4. xbench_ext_xargs - real 1m42.235s - 102.2 sec

Two surprises here (at least for me):

  • the grep solution is 2x slower than sed
  • the xargs (curiosity) solution is only slightly slower than grep
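Because the two groups ran a different number of iterations, the totals are easier to compare per call. A rough sketch of that arithmetic, using the real times measured above converted to milliseconds (the helper name per_call_us is hypothetical, and integer division rounds down):

```shell
# Hypothetical helper: microseconds per iteration,
# given total milliseconds ($1) and iteration count ($2)
per_call_us() {
    echo $(( $1 * 1000 / $2 ))
}

cut_us=$(per_call_us 37752 10000)         # xbench_ext_cut: 37.752 s over 10_000 runs
read_us=$(per_call_us 27700 100000)       # xbench_with_read: 27.700 s over 100_000 runs
expansion_us=$(per_call_us 5148 100000)   # xbench_var_expansion: 5.148 s over 100_000 runs

echo "cut: ${cut_us} us, read: ${read_us} us, expansion: ${expansion_us} us per call"
```

This prints 3775, 277 and 51 microseconds respectively, so on this machine one cut pipeline costs roughly as much as 70 variable expansions.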

Environment:

$ uname -a
Darwin marvin.local 13.1.0 Darwin Kernel Version 13.1.0: Thu Jan 16 19:40:37 PST 2014; root:xnu-2422.90.20~2/RELEASE_X86_64 x86_64

$ LC_ALL=C bash --version
GNU bash, version 4.2.45(2)-release (i386-apple-darwin13.0.0)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

3 Comments

I like the variety here. I was actually just writing your first possibility myself but was forgetting the extra space before the [0-9]*
@SS781 - the best method is with cut already answered by devnull
cut is small, so it starts fast and is short to type in a script ;). But in reality the best is a pure bash solution, which nobody has shown yet, because pure bash doesn't start any external programs...
0

With awk:

echo "May 5 2014" | awk '{print $2}'


0

You could use bash substring expansion and apply an offset (:4) and a length (:1) value. Just adjust the offset and the length values in cases where the format of the string changes.

Here is an example:

$ date_format="May 5 2014"
$ echo "${date_format:4:1}"
5

$ date_format="2014 May 5"
$ echo "${date_format: -1:1}"    # <- Watch that space before the negative value
5

$ date_format="5 May 2014"
$ echo "${date_format:0:1}"
5
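One caveat with fixed offsets, sketched below with an assumed two-digit day: a length of 1 only captures the first digit, whereas splitting on whitespace captures the whole field.

```shell
date_format="May 15 2014"   # assumed input with a two-digit day

# Fixed offset 4, length 1: only the first digit of the day
fixed="${date_format:4:1}"

# Splitting on whitespace takes the whole second field
read -r _ by_field _ <<< "$date_format"

echo "fixed: $fixed  by_field: $by_field"
```

So substring expansion fits best when the field widths are guaranteed, for example with zero-padded days.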
