extract a string from a line using shell script [duplicate]

Question

I have a couple of lines like these in a file:

the jdbc:mondrian:DataSource=abcd_datasource
the jdbc:mondrian:DataSource=efgh_datasource
the jdbc:mondrian:DataSource=hijk_datasource
the jdbc:mondrian:DataSource=lmno_datasource

I want to extract the strings 'abcd', 'efgh', 'hijk', and 'lmno'.

How can I extract them? This is what I have tried so far:

datasource_delimiter="_datasource"

logFileName=${1}


errorLogLines=($(grep -i "_datasource" $logFileName))

  for errorLogLine in ${errorLogLines[@]}
  do
    if [[ "$errorLogLine"~="jdbc:mondrian:DataSource=([a-zA-Z0-9]+)_datasource"  ]]
    then
      # what should I put here?
    fi
  done

Thanks

BTW, why the -i in the grep? Your regex requires _datasource to be lowercase elsewhere, so making it case-insensitive in the first-pass filter doesn't buy much.
– Charles Duffy, Commented Aug 7, 2018 at 23:20
BTW, while one of the linked question's answers suggests expr, that's very bad form (not actually built into bash, much slower than using a builtin).
– Charles Duffy, Commented Aug 7, 2018 at 23:24

Charles Duffy · Accepted Answer · 2018-08-07 23:15:17Z

1

#!/usr/bin/env bash
logFileName=$1

datasource_re='jdbc:mondrian:DataSource=([[:alnum:]]+)_datasource'
while read -r errorLogLine; do
  if [[ "$errorLogLine" =~ $datasource_re ]]; then
    echo "Found source: ${BASH_REMATCH[1]}"
  fi
done < <(grep -i "_datasource" "$logFileName")

Note:

- The quoting and spacing in [[ "$var" =~ $regex ]] is very deliberate:
  - You must have spaces surrounding the operators.
  - You must not quote the right-hand side if you want it to be parsed as a regex rather than a literal string.
- BashFAQ #1: How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
- Why you don't read lines with for
- BashPitfalls #50, on why array=( $(...) ) is bad form.
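
To make the notes above concrete, here is a minimal sketch. The sample line and the helper names (capture_unquoted, matches_quoted, count_array_words, count_read_lines) are illustrative and not part of the original answer. It assumes bash 3.2 or later, where a quoted right-hand side of =~ is matched as a literal string.

```bash
#!/usr/bin/env bash
# Sketch only: the sample line and helper names are illustrative.
line='the jdbc:mondrian:DataSource=abcd_datasource'
re='DataSource=([[:alnum:]]+)_datasource'

# Unquoted right-hand side: $re is parsed as a regex, so the capture
# group is available in BASH_REMATCH[1].
capture_unquoted() {
  [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

# Quoted right-hand side: "$re" is compared as a literal string, so it
# only matches input that contains the pattern text verbatim.
matches_quoted() {
  [[ $1 =~ "$re" ]]
}

# array=( $(...) ) splits on whitespace, not on line boundaries.
count_array_words() {
  local -a words
  words=( $(printf '%s\n' "$1") )
  printf '%s\n' "${#words[@]}"
}

# while read -r consumes one whole line per iteration.
count_read_lines() {
  local n=0 l
  while IFS= read -r l; do n=$((n + 1)); done <<< "$1"
  printf '%s\n' "$n"
}

capture_unquoted "$line"                          # prints: abcd
matches_quoted "$line" || echo "no literal match" # prints: no literal match
count_array_words "$line"                         # prints: 2
count_read_lines "$line"                          # prints: 1
```

The last two calls show why the loop in the question misbehaves: each log line contains a space, so the array form yields two elements per line while the read loop sees the line intact.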

answered Aug 7, 2018 at 23:15

community wiki

Charles Duffy

anubhava (edited by Charles Duffy) · Accepted Answer · 2018-08-07 23:12:15Z

1

Using GNU grep you can do this:

grep -ioP 'DataSource=\K[a-z\d]+' file

abcd
efgh
hijk
lmno

If you don't have GNU grep, use this sed command instead:

sed 's/.*DataSource=\([a-zA-Z0-9]*\).*/\1/' file
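
Since the question says these lines are only part of a file, note that the sed command above passes non-matching lines through unchanged. A hedged variant that prints only the lines where the substitution succeeds might look like the sketch below. The function name extract_with_sed and the sample input are illustrative assumptions. It uses only the -n option and the p flag, which are standard sed features.

```bash
# Hypothetical helper: reads lines on stdin and prints only the captured names.
extract_with_sed() {
  # -n suppresses default output; the trailing p prints a line only
  # when the substitution actually matched.
  sed -n 's/.*DataSource=\([[:alnum:]]*\)_datasource.*/\1/p'
}

# Illustrative input with one unrelated line mixed in.
printf '%s\n' \
  'the jdbc:mondrian:DataSource=abcd_datasource' \
  'an unrelated line' \
  'the jdbc:mondrian:DataSource=efgh_datasource' |
  extract_with_sed
# prints:
# abcd
# efgh
```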

edited Aug 7, 2018 at 23:12

Charles Duffy

answered Aug 7, 2018 at 20:51

anubhava

1 Comment

Charles Duffy Over a year ago

The -P option is only conditionally available as part of GNU grep -- the libpcre dependency is a compile-time flag.

LeFlan · Accepted Answer · 2018-08-07 22:49:04Z

0

You could also use a simple awk one-liner, as follows:

awk 'BEGIN{FS="DataSource=|_datasource"}{print $2}' file

output:

abcd
efgh
hijk
lmno
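
The one-liner above prints the second field of every input line, including lines that do not mention a datasource. If the file contains other lines, one possible refinement is to guard the action with a pattern, as in this sketch. The function name extract_with_awk and the sample input are illustrative, and it assumes each matching line contains the two markers once, in that order.

```bash
# Hypothetical helper: reads lines on stdin and prints only the captured names.
extract_with_awk() {
  # The pattern before the action restricts printing to lines that
  # actually contain DataSource=<name>_datasource.
  awk -F 'DataSource=|_datasource' \
    '/DataSource=[A-Za-z0-9]+_datasource/ { print $2 }'
}

# Illustrative input with one unrelated line mixed in.
printf '%s\n' \
  'the jdbc:mondrian:DataSource=hijk_datasource' \
  'an unrelated line' \
  'the jdbc:mondrian:DataSource=lmno_datasource' |
  extract_with_awk
# prints:
# hijk
# lmno
```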

Hope that helps!

answered Aug 7, 2018 at 22:49

LeFlan
