I have many files in a folder:

yyyymmdd_hhmmss.mp4
yyyymmdd_hhmmss_suffix1.mp4
yyyymmdd_hhmmss_suffix1_suffix2.mp4

The following filename formats are also possible (rarely):

yyyymmdd_hhmmss_$$$.mp4
yyyymmdd_hhmmss_$$$_suffix1.mp4
yyyymmdd_hhmmss_$$$_suffix1_suffix2.mp4
yyyymmdd_hhmmss_$$.mp4
yyyymmdd_hhmmss_$$_suffix1.mp4
yyyymmdd_hhmmss_$$_suffix1_suffix2.mp4
yyyymmdd_hhmmss_$.mp4
yyyymmdd_hhmmss_$_suffix1.mp4
yyyymmdd_hhmmss_$_suffix1_suffix2.mp4

where $ is a number 0-9

I am trying to capture "yyyymmdd_hhmmss" and use it as an argument. This is what I do when only one suffix is present:

for file in "$@"; do 
  file_nosuffix="${file%*_suffix1.mp4}.mp4"
  echo "$file and $file_nosuffix"
done

But I get lost when all of the filename formats mentioned above are present. Ideally, I would like to stick to the current pattern:

for file in "$@"; do 
   #catch "yyyymmdd_hhmmss"
   #do something on files yyyymmdd_hhmmss.mp4
   #do something else on files yyyymmdd_hhmmss_suffix1.mp4
   #etc.
done

Is that possible?

  • Why don't you just save the first 15 characters into a variable? Commented Apr 7, 2017 at 15:07
  • As an aside, "for file" iterates over "$@" by default. Commented Apr 7, 2017 at 15:09
  • Also, as an aside, it's helpful to provide test data that actually matches your stated format, rather than something that's descriptive to humans but can't be used for actual testing. yyyymmdd_hhmmss doesn't match something that expects all those characters to be digits, for example, which is presumably your actual format. Commented Apr 7, 2017 at 15:12
  • As a reference to get a substring: thegeekstuff.com/2010/07/bash-string-manipulation Commented Apr 7, 2017 at 15:12
  • Actually, for a discussion of string manipulation in bash, I'd suggest BashFAQ #100 (part of the Wooledge wiki, maintained by the denizens of the irc.freenode.org #bash channel) over some random website. wiki.bash-hackers.org/syntax/pe is also reputable (and actively maintained). Commented Apr 7, 2017 at 15:13

1 Answer

Bash has built-in regex support, if you want to confirm the format:

regex='^[[:digit:]]{8}_[[:digit:]]{6}' # POSIX ERE; can't use PCRE extensions here

for file; do
  if [[ $file =~ $regex ]]; then
    echo "${BASH_REMATCH[0]} is the substring for $file" >&2
  else
    echo "$file does not match the required format" >&2
  fi
done
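
If the goal is the dispatch loop from the question rather than only a format check, the same approach can be extended with capture groups. This is a minimal sketch, not the only way to do it: classify is a hypothetical helper name, the labels it prints are made up for illustration, and it assumes suffixes contain neither underscores nor dots.

```bash
#!/usr/bin/env bash
# Group 1: the timestamp; group 2: the optional 1-3 digit counter;
# group 3: whatever suffixes remain before the .mp4 extension.
regex='^([[:digit:]]{8}_[[:digit:]]{6})(_[[:digit:]]{1,3})?((_[^_.]+)*)\.mp4$'

# classify is a hypothetical helper: prints "<timestamp> <kind>" for one file.
classify() {
  local file=${1##*/}            # drop any leading directory components
  if [[ $file =~ $regex ]]; then
    local stamp=${BASH_REMATCH[1]} rest=${BASH_REMATCH[3]}
    case $rest in
      '')   echo "$stamp plain" ;;          # yyyymmdd_hhmmss.mp4
      _*_*) echo "$stamp multi-suffix" ;;   # two or more suffixes
      _*)   echo "$stamp one-suffix" ;;     # exactly one suffix
    esac
  else
    echo "no-match"
    return 1
  fi
}
```

Inside the loop, each branch of the case statement is where the "do something" and "do something else" steps would go. Note that a purely numeric suffix of up to three digits is treated as the counter, not as suffix1.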

One can also trivially take a prefix:

for file; do
  prefix=${file:0:15}
  echo "Prefix for $file is $prefix"
done
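
One caveat with the fixed-offset approach: the 15 characters are counted from the start of the string, so an argument such as videos/20170407_150700.mp4 would yield part of the directory name. A small sketch that strips the directory first (stamp_of is a hypothetical helper name, and it assumes the basename really starts with the timestamp):

```bash
#!/usr/bin/env bash
# stamp_of is a hypothetical helper: prints the first 15 characters
# of the file's basename, ignoring any leading directories.
stamp_of() {
  local name=${1##*/}            # remove everything up to the last slash
  printf '%s\n' "${name:0:15}"   # yyyymmdd_hhmmss is 8 + 1 + 6 = 15 characters
}
```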

...or, to delete everything from the second-to-last underscore onward:

prefix=${file%_*_*}
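
Be aware that this expansion always removes the shortest trailing part that contains two underscores, so it only gives the intended result for names that really end in _suffix1_suffix2.mp4. A short sketch of how it behaves on the other formats (strip_last_two is a hypothetical wrapper, and the sample names are made up):

```bash
#!/usr/bin/env bash
# strip_last_two is a hypothetical wrapper around the expansion above.
strip_last_two() {
  local file=$1
  printf '%s\n' "${file%_*_*}"
}

for f in anyname_suffix1_suffix2.mp4 20170407_150700_suffix1.mp4 20170407_150700.mp4; do
  printf '%s -> %s\n' "$f" "$(strip_last_two "$f")"
done
# anyname_suffix1_suffix2.mp4 -> anyname
# 20170407_150700_suffix1.mp4 -> 20170407             (the time part is lost)
# 20170407_150700.mp4 -> 20170407_150700.mp4          (no match, unchanged)
```

For a mix of all the formats in the question, the fixed-length prefix or the regex is the safer choice.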

4 Comments

Thank you all for your help. I think I am going to go ahead and take a prefix. However, I wonder if there is a way to write code that will work for anyname_suffix1_suffix2.mp4.
What logic do you actually want? Just kill _*_* from the end? prefix=${file%_*_*}, then.
prefix=${file%_*_*} is brilliant, that is exactly what I want!! Thank you @Charles Duffy
Amended, then. (You can mark your question answered via the checkbox next to the answer).
