If I have the text in a shell variable, say $a:
a="The cat sat on the mat"
How can I search for "cat" and return 4 using a Linux shell script, or -1 if not found?
With bash
a="The cat sat on the mat"
b=cat
strindex() {
x="${1%%"$2"*}"
[[ "$x" = "$1" ]] && echo -1 || echo "${#x}"
}
strindex "$a" "$b" # prints 4
strindex "$a" foo # prints -1
strindex "$a" "ca*" # prints -1
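To see why the needle is quoted inside the expansion, here is a minimal sketch comparing the two forms. The helper names strindex_quoted and strindex_unquoted are made up for this illustration; only the quoting of $2 differs between them.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: identical except for the quoting of the needle.
strindex_quoted() {
  local x="${1%%"$2"*}"   # needle is matched literally
  [[ "$x" = "$1" ]] && echo -1 || echo "${#x}"
}
strindex_unquoted() {
  local x="${1%%$2*}"     # needle is treated as a glob pattern
  [[ "$x" = "$1" ]] && echo -1 || echo "${#x}"
}

a="The cat sat on the mat"
strindex_quoted   "$a" "c?t"   # prints -1: there is no literal "c?t"
strindex_unquoted "$a" "c?t"   # prints 4: "?" matches the "a" in "cat"
```

With the quoted form, characters such as *, ? and [ in the search string need no manual escaping.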
[ "$x" = "$1" ] and pdksh wants x=$2; x="${1%%$x*}", however.
* in your search string will be interpreted as a wildcard unless it is manually escaped. I added a copy of your answer with automatic escaping, but all credit to you. stackoverflow.com/a/69960043/912236
You can use grep to get the byte offset of the matching part of a string:
echo $str | grep -b -o str
As per your example:
[user@host ~]$ echo "The cat sat on the mat" | grep -b -o cat
4:cat
You can pipe that to awk if you just want the first part:
echo $str | grep -b -o str | awk 'BEGIN {FS=":"}{print $1}'
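The grep pipeline above can be wrapped into a function that also prints -1 when nothing matches, as the question asks. This is only a sketch: grep_index is a made-up name, -b is assumed to be available (it is in GNU grep), and -F is added here so the needle is taken as a fixed string rather than a regular expression. The offset is counted in bytes, so it can differ from the character position for multi-byte text.

```shell
#!/usr/bin/env bash
# grep_index HAYSTACK NEEDLE  (hypothetical helper)
# Prints the zero-based byte offset of the first match, or -1 if there is none.
grep_index() {
  local offset
  # -F: fixed string, -b: print byte offset, -o: print only the matched part
  offset=$(printf '%s\n' "$1" | grep -F -b -o -- "$2" | head -n 1 | cut -d: -f1)
  # An empty result means grep found nothing, so fall back to -1
  echo "${offset:--1}"
}

grep_index "The cat sat on the mat" cat   # prints 4
grep_index "The cat sat on the mat" dog   # prints -1
```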
cut -d: -f1 is a bit more lightweight than piping through awk.
colrm 2 could also replace the awk portion.
I used awk for this:
a="The cat sat on the mat"
test="cat"
awk -v a="$a" -v b="$test" 'BEGIN{print index(a,b)}'
awk -v a="$a" -v b="$test" 'BEGIN{print index(a,b)}' | xargs expr -1 +
echo $a | grep -bo cat | sed 's/:.*$//'
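Since awk's index() is one-based and returns 0 when there is no match, subtracting 1 gives both the zero-based position and the -1 the question asks for. A minimal sketch, where awk_index is a made-up name; it passes the strings through the environment instead of -v, which is a substitution made here because awk processes backslash escapes in -v values.

```shell
#!/usr/bin/env bash
# awk_index HAYSTACK NEEDLE  (hypothetical helper)
# Prints the zero-based position of NEEDLE in HAYSTACK, or -1 if not found.
awk_index() {
  # HAYSTACK and NEEDLE are set only in awk's environment for this call
  HAYSTACK=$1 NEEDLE=$2 awk 'BEGIN {
    # index() is 1-based and yields 0 on no match, so "- 1" maps to 0-based / -1
    print index(ENVIRON["HAYSTACK"], ENVIRON["NEEDLE"]) - 1
  }'
}

awk_index "The cat sat on the mat" cat   # prints 4
awk_index "The cat sat on the mat" foo   # prints -1
```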
This is just a version of glenn jackman's answer with escaping, plus the complementary reverse function strrpos and Python-style startswith and endswith functions based on the same principle.
Edit: updating escaping per @bruno's excellent suggestion.
strpos() {
haystack=$1
needle=$2
x="${haystack%%"$needle"*}"
[[ "$x" = "$haystack" ]] && { echo -1; return 1; } || echo "${#x}"
}
strrpos() {
haystack=$1
needle=$2
x="${haystack%"$needle"*}"
[[ "$x" = "$haystack" ]] && { echo -1; return 1 ;} || echo "${#x}"
}
startswith() {
haystack=$1
needle=$2
x="${haystack#"$needle"}"
[[ "$x" = "$haystack" ]] && return 1 || return 0
}
endswith() {
haystack=$1
needle=$2
x="${haystack%"$needle"}"
[[ "$x" = "$haystack" ]] && return 1 || return 0
}
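Because strpos and strrpos set an exit status as well as printing a position, they can be used directly in conditionals. A short usage sketch, with the two functions repeated (using local variables, which is a small change from the versions above) so the snippet runs on its own:

```shell
#!/usr/bin/env bash
strpos() {
  local haystack=$1 needle=$2
  local x="${haystack%%"$needle"*}"   # strip the longest suffix starting at needle
  [[ "$x" = "$haystack" ]] && { echo -1; return 1; } || echo "${#x}"
}
strrpos() {
  local haystack=$1 needle=$2
  local x="${haystack%"$needle"*}"    # strip the shortest suffix starting at needle
  [[ "$x" = "$haystack" ]] && { echo -1; return 1; } || echo "${#x}"
}

a="The cat sat on the mat"
if first=$(strpos "$a" "at"); then
  last=$(strrpos "$a" "at")
  echo "first=$first last=$last"      # prints first=5 last=20
fi
# The non-zero exit status signals "not found" without parsing the output
strpos "$a" "dog" >/dev/null || echo "no dog here"
```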
* (? and [..]). The best way to prevent pathname expansion is to quote $2 in x=${haystack%%"$2"*}
strpos(), if the value of $needle is the empty string '' (null) or some other un-matchable pattern, then the result stored in $x would be the value of $haystack itself. There can be no match, and so therefore nothing is deleted from an expanded $haystack. The variable expands normally by bash rules, and the value is stored in $x. Prevent empty arguments upon execution with the someVariable="${1:?}" format. It makes your functions MUCH more type safe, so to speak. gnu.org/software/bash/manual/bash.html#Shell-Expansions
$needle is empty, then strpos will echo position 0. If $needle is not found, it will echo -1. Errorlevels will be set to 0 and 1 respectively. That is entirely correct for strpos, conforms to JavaScript's behaviour, and seems logical to me. It is also an essential component of gist.github.com/sfinktah/a432630706393d7bbe51f01508805cc6 (where I used these functions). strrpos should return the length of the string if $needle is '', but defaults won't help there.
This can be accomplished using ripgrep (aka rg).
❯ a="The cat sat on the mat"
❯ echo $a | rg --no-config --column 'cat'
1:5:The cat sat on the mat
❯ echo $a | rg --no-config --column 'cat' | cut -d: -f2
5
If you wanted to make it a function you can do:
function strindex() {
local str=$1
local substr=$2
echo -n "$str" | rg --no-config --column "$substr" | cut -d: -f2
}
...and use it as such: strindex <STRING> <SUBSTRING>
strindex "The cat sat on the mat" "cat"
5
You can install ripgrep on macOS with: brew install --formula ripgrep.
A variation (bash) on @Orwellophile's answer, done out the long way. However, it does the comparison with string lengths instead of comparing strings. You never know how long a string might be! :-) Note that this version takes the needle as the first argument and the haystack as the second. Hopefully, while longer, this answer will be clearer.
function strpos ()
{
local -r needle="${1:?}" ## Prevents empty strings
local -r haystack="${2:?}" ## Prevents empty strings
## From a copy, attempts to remove characters from the end of a string, greedily.
local -r remainingHaystack="${haystack%%"$needle"*}"
local -ir remainingHaystackLength="${#remainingHaystack}"
## When the needle is not found in haystack, these values will be equal.
if (( $remainingHaystackLength == ${#haystack} )); then
echo -n -1
return 1
fi
echo -n $remainingHaystackLength
}
If the pattern matches a trailing portion of the expanded value of parameter, then the result of the expansion is the value of parameter with the shortest matching pattern (the ‘%’ case) or the longest matching pattern (the ‘%%’ case) deleted.
Example:
If parameter = "/usr/bin/foo/bin", and word = "/bin*"
${parameter%word} ## /usr/bin/foo/bin --> /usr/bin/foo (non-greedy)
${parameter%%word} ## /usr/bin/foo/bin --> /usr (greedy)
If parameter is ‘@’ or ‘*’, the pattern removal operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with ‘@’ or ‘*’, the pattern removal operation is applied to each member of the array in turn, and the expansion is the resultant list.
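A small sketch of the difference between % and %%, using a pattern that ends in * so that more than one trailing portion can match. Without a wildcard in the pattern there is only one possible match, and both operators give the same result. The variable names are chosen for this illustration.

```shell
#!/usr/bin/env bash
parameter="/usr/bin/foo/bin"

shortest=${parameter%/bin*}    # removes the shortest match: the final "/bin"
longest=${parameter%%/bin*}    # removes the longest match: "/bin/foo/bin"
literal=${parameter%%/bin}     # no wildcard: only the final "/bin" can match

echo "$shortest"   # prints /usr/bin/foo
echo "$longest"    # prints /usr
echo "$literal"    # prints /usr/bin/foo
```

The strpos functions rely on the %% form: appending * to the needle makes the longest match start at the first occurrence of the needle, so what remains is everything before it.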
The simplest is expr index "The cat sat on the mat" cat
It will return 5.
-1 when the text value is not found. expr exits with status one (1) and prints zero (0) when a CHAR is not found, and so may cause ambiguity, depending on usage. Also, if string='Cat in the Hat Strikes Back', then expr index "$string" 'Hat' will print 2, because the form of the command is index STRING CHARS and not index STRING STRING. In this case it returns the position of the character a, because it is the second character in $string, meaning that expr uses one (1) based indexing, not zero (0) based indexing.
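The point about CHARS versus a substring can be checked with the example string from the comment. This sketch assumes an expr that supports the index operator (GNU coreutils does), and compares it with the parameter-expansion approach used elsewhere in this thread:

```shell
#!/usr/bin/env bash
string='Cat in the Hat Strikes Back'

# expr index: 1-based position of the first character that is H, a or t
char_pos=$(expr index "$string" 'Hat')

# Substring search: 0-based offset of the literal text "Hat"
prefix=${string%%Hat*}
substr_pos=${#prefix}

echo "expr index: $char_pos"          # prints expr index: 2
echo "substring offset: $substr_pos"  # prints substring offset: 11
```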