I have a couple of lines like these as part of a file:

the jdbc:mondrian:DataSource=abcd_datasource
the jdbc:mondrian:DataSource=efgh_datasource
the jdbc:mondrian:DataSource=hijk_datasource
the jdbc:mondrian:DataSource=lmno_datasource 

I want to extract the strings 'abcd', 'efgh', 'hijk', and 'lmno'.

How can I extract them? This is what I have tried so far:

datasource_delimiter="_datasource"

logFileName=${1}


errorLogLines=($(grep -i "_datasource" $logFileName))

  for errorLogLine in ${errorLogLines[@]}
  do
    if [[ "$errorLogLine"~="jdbc:mondrian:DataSource=([a-zA-Z0-9]+)_datasource"  ]]
    then
      # what should I put here?
    fi
  done

Thanks

  • BTW, why the -i in the grep? Your regex requires _datasource to be lowercase elsewhere, so making it case-insensitive in the first-pass filter doesn't buy much. Commented Aug 7, 2018 at 23:20
  • BTW, while one of the linked question's answers suggests expr, that's very bad form (not actually built into bash, much slower than using a builtin). Commented Aug 7, 2018 at 23:24

3 Answers

#!/usr/bin/env bash
logFileName=$1

datasource_re='jdbc:mondrian:DataSource=([[:alnum:]]+)_datasource'
while read -r errorLogLine; do
  if [[ "$errorLogLine" =~ $datasource_re ]]; then
    echo "Found source: ${BASH_REMATCH[1]}"
  fi
done < <(grep -i "_datasource" "$logFileName")

Note:

  • The quoting and spacing in [[ "$var" =~ $regex ]] is very deliberate.
    • You must have spaces surrounding the operators.
    • You must not quote the right-hand side if you want it to be parsed as a regex rather than a literal string.
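  • See BashFAQ #1, "How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?", for the while read loop used here.
  • See "Why you don't read lines with for" for why the original for loop iterates over words rather than lines.
  • See BashPitfalls #50 on why array=( $(...) ) is bad form.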
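
To make the quoting point above concrete, here is a minimal sketch that runs the same pattern unquoted and quoted. The function names extract_source and matches_quoted are made up for this illustration, and it assumes bash 3.2 or later, where a quoted right-hand side is matched as a literal string.

```bash
# Hypothetical helper: prints the captured name, or returns non-zero on no match.
extract_source() {
  local line=$1
  local re='jdbc:mondrian:DataSource=([[:alnum:]]+)_datasource'
  # Unquoted $re: the right-hand side is parsed as a regex.
  if [[ $line =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

# Hypothetical helper: same test, but with the right-hand side quoted,
# so the pattern text is searched for literally, parentheses and all.
matches_quoted() {
  local line=$1
  local re='jdbc:mondrian:DataSource=([[:alnum:]]+)_datasource'
  [[ $line =~ "$re" ]]
}

extract_source 'the jdbc:mondrian:DataSource=abcd_datasource'
matches_quoted 'the jdbc:mondrian:DataSource=abcd_datasource' \
  || echo "quoted pattern did not match"
```

The first call should print abcd; the second should report that the quoted pattern did not match, because the input line does not contain the literal text ([[:alnum:]]+).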

Using GNU grep you can do this:

grep -ioP 'DataSource=\K[a-z\d]+' file

abcd
efgh
hijk
lmno

If you don't have GNU grep then use this sed:

sed 's/.*DataSource=\([a-zA-Z0-9]*\).*/\1/' file

1 Comment

This is only conditionally available as part of GNU grep -- the libpcre dependency is a compile-time flag.
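
If -P is not available, a sed variant along these lines may be a safer fallback. It is only a sketch: the wrapper name extract_with_sed is invented here, and it assumes the names are purely alphanumeric and always followed by _datasource, as in the sample lines. Unlike the sed command above, it uses -n with the p flag so that lines without a match are not printed.

```bash
# Hypothetical wrapper: reads the named files, or stdin when none are given.
extract_with_sed() {
  # -n suppresses default output; the trailing p prints only lines
  # where the substitution actually happened.
  sed -n 's/.*DataSource=\([[:alnum:]]\{1,\}\)_datasource.*/\1/p' "$@"
}

printf '%s\n' \
  'the jdbc:mondrian:DataSource=abcd_datasource' \
  'unrelated line' | extract_with_sed
```

This should print abcd and skip the unrelated line.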

You could also use a simple awk one-liner as follows:

awk 'BEGIN{FS="DataSource=|_datasource"}{print $2}' file

output:

abcd
efgh
hijk
lmno
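
One caveat with splitting on a field separator is that it prints something (possibly an empty line) for every input line, including ones that do not contain a data source. If the file has other content, a match()-based variant might be preferable. This is a rough sketch under the same assumptions as the sample data; extract_with_awk is a made-up name.

```bash
# Hypothetical wrapper: reads the named files, or stdin when none are given.
extract_with_awk() {
  awk '
    # Only act on lines that contain the full DataSource=<name>_datasource token.
    match($0, /DataSource=[A-Za-z0-9]+_datasource/) {
      # Skip the 11-character "DataSource=" prefix and drop the
      # 11-character "_datasource" suffix from the matched text.
      print substr($0, RSTART + 11, RLENGTH - 22)
    }
  ' "$@"
}

printf '%s\n' \
  'the jdbc:mondrian:DataSource=efgh_datasource' \
  'unrelated line' | extract_with_awk
```

This should print efgh only.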

Hope that helps!