
In a bash function, I want to list all the files in a given folder which correspond to a given set of file types. In pseudo-code, I am imagining something like this:

getMatchingFiles() {
  output=$1
  directory=$2
  shift 2
  _types_=("$@")

  file_array=find $directory -type f where-name-matches-item-in-_types_

  # do other stuff with $file_array, such as trimming file names to
  # just the basename with no extension

  eval $output="${file_array[@]}"
}

dir=/path/to/folder
types=(ogg mp3)
getMatchingFiles result dir types
echo "${result[@]}"

For your amusement, here are the multiple workarounds, based on my current knowledge of bash, that I am using to achieve this. I have a problem with the way the function returns the array of files: the final command tries to execute each file, rather than to set the output parameter.

getMatchingFiles() {
  local _output=$1
  local _dir=$2
  shift 2
  local _type=("$@")
  local _files=($_dir/$_type/*)
  local -i ii=${#_files[@]}
  local -a _filetypes
  local _file _regex

  case $_type in
    audio )
      _filetypes=(ogg mp3)
      ;;
    images )
      _filetypes=(jpg png)
      ;;
  esac

  _regex="^.*\.("
  for _filetype in "${_filetypes[@]}"
  do
     _regex+=$_filetype"|"
  done

  _regex=${_regex:0:-1}
  _regex+=")$"

  for (( ; ii-- ; ))
  do
    _file=${_files[$ii]}
    if ! [[ $_file =~ $_regex ]];then
      unset _files[ii]
    fi
  done

  echo "${_files[@]}"

  # eval $_output="${_files[@]}" # tries to execute the files
}

dir=/path/to/parent
getMatchingFiles result $dir audio
echo "${result[@]}"
  • Why not just return the result from the function instead of passing by reference? Commented Dec 5, 2017 at 11:07
  • @Inian Could you explain how I would do that? Commented Dec 5, 2017 at 11:08
  • file_array=find $directory -type f where-name-matches-item-in-_types_ will just assign the string to file array, nothing is being executed. Commented Dec 5, 2017 at 11:21
  • @JamesNewton: Did you try my suggestion below? Commented Dec 5, 2017 at 11:34

3 Answers

As a matter of fact, it is possible to use a nameref (this requires bash 4.3 or later) to reference an array. If you want to put the output of find into an array specified by name, you can reference it like this:

#!/usr/bin/env bash

getMatchingFiles() {

   local -n output=$1
   local dir=$2
   shift 2
   local types=("$@")
   local ext file
   local -a find_ext

   [[ ${#types[@]} -eq 0 ]] && return 1

   for ext in "${types[@]}"; do
      find_ext+=(-o -name "*.${ext}")
   done

   unset 'find_ext[0]'
   output=()

   while IFS=  read -r -d $'\0' file; do
      output+=("$file") 
   done < <(find "$dir" -type f \( "${find_ext[@]}" \) -print0)
}

dir=/some/path

getMatchingFiles result "$dir" mp3 txt
printf '%s\n' "${result[@]}"

getMatchingFiles other_result /some/other/path txt
printf '%s\n' "${other_result[@]}"

Don't pass your variable dir by reference; pass its value ("$dir") instead. That way you can pass a literal path as well.
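The question also mentions trimming each result to its base name without the extension. A minimal sketch of that step, assuming a hypothetical helper called stripToBasenames (it is not part of the function above) that rewrites the named array in place through a nameref:

```shell
#!/usr/bin/env bash

# Hypothetical helper: replace every element of the named array with its
# base name minus the final extension. Requires bash 4.3+ for namerefs.
stripToBasenames() {
   local -n sTB_arr=$1     # prefixed names avoid clashing with the caller's variable
   local -a sTB_out=()
   local sTB_item

   for sTB_item in "${sTB_arr[@]}"; do
      sTB_item=${sTB_item##*/}        # drop the directory part
      sTB_out+=("${sTB_item%.*}")     # drop the last .ext
   done

   sTB_arr=("${sTB_out[@]}")
}

result=("/some/path/a song.mp3" "/some/path/sub/notes.txt")
stripToBasenames result
printf '%s\n' "${result[@]}"
```

It could be called right after getMatchingFiles, on the same array name.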


4 Comments

Interesting, I did not know that. Do you know then why declaring local -na output=$1 raises the "local: output: reference variable cannot be an array" error?
@RenaudPacalet You don't reference the array per se. You reference the variable and any operation done on the ref will be actually done on the original. So, for example, referencing an element of an array will be done on the original variable. That's all there is to it.
@RenaudPacalet, ...you might give this answer a +1, as I have, if you've learned from it.
@CharlesDuffy Done, of course. Thanks for the reminder.

Update: namerefs can indeed be arrays (see PesaThe's answer)

Without spaces in file and directory names

I first assume you do not have spaces in your file and directory names. See the second part of this answer if you have spaces in your file and directory names.

In order to pass result, dir and types by name to your function, you need to use namerefs (local -n or declare -n, available in bash 4.3 and later).
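As a small illustration of what a nameref does (the function and variable names here are made up for the example), any change made through the reference lands in the caller's variable:

```shell
#!/usr/bin/env bash

# Hypothetical demo function; requires bash 4.3 or later.
appendGreeting() {
    local -n target=$1     # target is now another name for the caller's variable
    target+=( "hello" )    # appends to the caller's array
}

declare -a words=( "first" )
appendGreeting words
printf '%s\n' "${words[@]}"
```

Note that the caller's variable must not share its name with the local reference (target here), otherwise bash complains about a circular name reference.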

Another difficulty is to build the find command based on the types you passed but this is not a major one. Pattern substitutions can do this. All in all, something like this should do about what you want:

#!/usr/bin/env bash

getMatchingFiles() {
    local -n output=$1
    local -n directory=$2
    local -n _types_=$3
    local filter

    filter="${_types_[@]/#/ -o -name *.}"
    filter="${filter# -o }"
    set -f  # disable globbing so that *.ogg etc. reach find unexpanded
    output=( $( find "$directory" -type f \( $filter \) ) )
    set +f

    # do other stuff with $output, such as trimming file names to
    # just the basename with no extension
}

declare dir
declare -a types
declare -a result=()

dir=/path/to/folder
types=(ogg mp3)
getMatchingFiles result dir types
for f in "${result[@]}"; do echo "$f"; done

With spaces in file and directory names (but not in file suffixes)

If you have spaces in your file and directory names, things are a bit more difficult because you must assign your array such that names are not split into words. One way to do this is to use \0 as the file name separator instead of a space, via the -print0 option of find and the -d $'\0' option of read:

#!/usr/bin/env bash

getMatchingFiles() {
    local -n output=$1
    local -n directory=$2
    local -n _types_=$3
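    local filter file

    filter="${_types_[@]/#/ -o -name *.}"
    filter="${filter# -o }"
    output=()
    set -f  # disable globbing so that *.ogg etc. reach find unexpanded
    # IFS= and -r keep leading/trailing blanks and backslashes intact
    while IFS= read -r -d $'\0' file; do
        output+=( "$file" )
    done < <( find "$directory" -type f \( $filter \) -print0 )
    set +f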

    # do other stuff with $output, such as trimming file names to
    # just the basename with no extension
}

declare dir
declare -a types
declare -a result=()

dir=/path/to/folder
types=(ogg mp3)
getMatchingFiles result dir types
for f in "${result[@]}"; do echo "$f"; done
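On bash 4.4 or later, the read loop can probably be replaced by mapfile, which accepts a delimiter through -d. A sketch under that assumption, using a hypothetical function name, taking the directory and the types by value, and building the find arguments as an array so that no unquoted expansion is needed:

```shell
#!/usr/bin/env bash

# Hypothetical variant; assumes bash 4.4+ (mapfile -d) and a find with -print0.
getMatchingFilesMapfile() {
    local -n gMFM_out=$1
    local gMFM_dir=$2
    shift 2
    local gMFM_ext
    local -a gMFM_args=()

    (( $# > 0 )) || return 1

    for gMFM_ext in "$@"; do
        gMFM_args+=( -o -name "*.$gMFM_ext" )
    done

    # "${gMFM_args[@]:1}" skips the leading -o; -t strips the NUL delimiters
    mapfile -d '' -t gMFM_out < <(find "$gMFM_dir" -type f \( "${gMFM_args[@]:1}" \) -print0)
}

# Throwaway demo directory, created only for this example
demo_dir=$(mktemp -d)
touch "$demo_dir/a b.mp3" "$demo_dir/c.ogg" "$demo_dir/d.txt"
getMatchingFilesMapfile found "$demo_dir" ogg mp3
printf '%s\n' "${found[@]}"
rm -rf "$demo_dir"
```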

With spaces in file and directory names, even in file suffixes

Well, you deserve what happens to you... Still possible but left as an exercise.

8 Comments

Thanks! What exactly does local -n <var> do? I have tried googling for it, but the man pages I find don't mention an -n option. And is there any advantage in += for setting file_array? Is a simple = not good?
local -n directory=$1 declares directory as a named reference to whatever $1 is. So, in our example, when we call getMatchingFiles dir ..., after this declaration, the local directory variable and the global dir variable are the same. And you are right, in this case, file_array= would do the same. I modified my answer.
Why aren't you collecting content into an array?
@CharlesDuffy I do (in the file_array array), but as the OP wanted to pass the array by name and I did not know that a nameref could also be an array (thanks to PesaThe's answer I learned something today), I had to return a value.
Namerefs wouldn't be very useful if they couldn't point to arrays -- that's the primary thing they do one can't easily do without 'em (barring eval).
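For bash versions without namerefs, the eval route mentioned above can be sketched as follows. The helper name setArrayByName is hypothetical. The key point is that each element is quoted with printf %q before eval sees it, whereas the question's eval $output="${file_array[@]}" hands eval unquoted words, so the words after the first are parsed as a command to run.

```shell
#!/usr/bin/env bash

# Hypothetical helper: assign the remaining arguments to the global array
# whose name is given as the first argument, without using a nameref.
setArrayByName() {
    local sABN_name=$1
    shift
    local sABN_quoted=""

    # Refuse anything that is not a plain identifier, since it is passed to eval.
    [[ $sABN_name =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || return 1

    # %q quotes each element so eval reads it back as exactly one word.
    (( $# > 0 )) && printf -v sABN_quoted '%q ' "$@"
    eval "$sABN_name=( $sABN_quoted )"
}

setArrayByName result "/music/a song.mp3" "/music/b.ogg"
printf '%s\n' "${result[@]}"
```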

Supporting the original, unmodified calling convention, and correctly handling extensions with whitespace or glob characters:

#!/usr/bin/env bash

getMatchingFiles() {
  declare -g -a "$1=()"
  declare -n gMF_result="$1"  # variables are namespaced to avoid conflicts w/ targets
  declare -n gMF_dir="$2"
  declare -n gMF_types="$3"
  local gMF_args=( -false )   # empty type list not a special case
  local gMF_type gMF_item

  for gMF_type in "${gMF_types[@]}"; do
    gMF_args+=( -o -name "*.$gMF_type" )
  done

  while IFS= read -r -d '' gMF_item; do
    gMF_result+=( "$gMF_item" )
  done < <(find "$gMF_dir" -type f '(' "${gMF_args[@]}" ')' -print0)
}

dir=/path/to/folder
types=(ogg mp3)
getMatchingFiles result dir types

3 Comments

Hm, is it really working for extensions with glob chars? types=("t*xt") will output even .txt or .tsomethingxt etc. In order to process extensions with globs literally (I think that is what you wanted), you would have to escape them. Or maybe that was your intention?
PesaThe is right. Moreover, the find command is bogus: 1) the -type f is missing and 2) it outputs only the files with the last extension (mp3 here) because -name *.mp3 -print0 is interpreted as -name *.mp3 and -print0 while all other -name *.xxx are not anded with any action and do not produce anything.
@RenaudPacalet Yep, -type f is just a detail but \( \) are really needed.
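If the extensions are meant to be matched literally, one option (a sketch, not part of the answer above) is to backslash-escape the characters that find's -name treats as pattern syntax before building the arguments. The helper name escapeGlob is hypothetical:

```shell
#!/usr/bin/env bash

# Hypothetical helper: backslash-escape \ * ? [ so that find -name
# treats them as literal characters. Prints the escaped string.
escapeGlob() {
  local s=$1
  s=${s//\\/\\\\}   # escape backslashes first
  s=${s//\*/\\*}
  s=${s//\?/\\?}
  s=${s//\[/\\[}
  printf '%s' "$s"
}

# Throwaway demo directory, created only for this example
demo_dir=$(mktemp -d)
touch "$demo_dir/a.t*xt" "$demo_dir/b.txt" "$demo_dir/c.tfooxt"

escaped=$(escapeGlob 't*xt')
matches=()
while IFS= read -r -d '' item; do
  matches+=( "$item" )
done < <(find "$demo_dir" -type f -name "*.$escaped" -print0)

printf '%s\n' "${matches[@]##*/}"
rm -rf "$demo_dir"
```

With the escaping in place, only the file whose extension is literally t*xt should be reported; without it, b.txt and c.tfooxt would match as well.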
