184

In Bash, I would like to create a function that returns the filename of the newest file that matches a certain pattern. For example, I have a directory of files like:

Directory/
   a1.1_5_1
   a1.2_1_4
   b2.1_0
   b2.2_3_4
   b2.3_2_0

I want the newest file that starts with 'b2'. How do I do this in bash? I need to have this in my ~/.bash_profile script.


11 Answers

311

The ls command has a parameter -t to sort by time. You can then grab the first (newest) with head -1.

ls -t b2* | head -1

But beware: Why you shouldn't parse the output of ls

My personal opinion: parsing ls is dangerous when the filenames can contain funny characters like spaces or newlines.

If you can guarantee that the filenames will not contain funny characters (maybe because you are in control of how the files are generated) then parsing ls is quite safe.

If you are developing a script which is meant to be run by many people on many systems in many different situations then do not parse ls.
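To make the risk concrete, here is a small sketch that creates a file whose name contains a newline and runs the ls pipeline on it. The function name demo_ls_truncation and the file names are made up for illustration, and the sketch assumes GNU touch (for the -d date syntax) and mktemp. When ls writes to a pipe it emits the newline as-is, so head sees only the first half of the name.

```bash
# Hypothetical demo: shows how `ls -t | head` mangles a name containing a newline.
demo_ls_truncation() {
    local tmpdir
    tmpdir=$(mktemp -d) || return 1
    (
        cd "$tmpdir" || exit 1
        touch -d '2020-01-01' 'b2.old'       # older file (GNU touch date syntax)
        touch $'b2.new\nwith_newline'        # newest file; its name spans two lines
        ls -t b2* | head -n 1                # prints only the first line of that name
    )
    rm -rf "$tmpdir"
}
```

Calling demo_ls_truncation prints b2.new, which is not the name of any file in the directory.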

Here is how to do it safely:

unset -v latest
for file in "$dir"/*; do
  [[ $file -nt $latest ]] && latest=$file
done

For more explanation, read How can I find the latest (newest, earliest, oldest) file in a directory?

See also What is the difference between test, single square bracket and double square bracket?
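Since the question asks for a function to put in ~/.bash_profile, the loop above could be wrapped roughly as follows. This is only a sketch: the name newest_with_prefix and its two arguments (a directory and a name prefix) are illustrative choices, not part of the answer. The body runs in a subshell, so enabling nullglob does not leak into the calling shell.

```bash
# Hypothetical helper: print the most recently modified entry in DIR whose
# name starts with PREFIX. Prints nothing and returns 1 if nothing matches.
newest_with_prefix() (
    dir=${1:-.}
    prefix=${2-}
    shopt -s nullglob              # an unmatched glob expands to nothing
    latest=
    for file in "$dir"/"$prefix"*; do
        # The first match is taken as-is; later matches must be newer
        [[ -z $latest || $file -nt $latest ]] && latest=$file
    done
    [[ -n $latest ]] && printf '%s\n' "$latest"
)
```

For the directory in the question, this would be called as newest_with_prefix Directory b2.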


16 Comments

Note to others: if you are doing this for a directory, you would add the -d option to ls, like this 'ls -td <pattern> | head -1'
The parsing LS link says not to do this and recommends the methods in BashFAQ 99. I'm looking for a 1-liner rather than something bullet-proof to include in a script, so I'll continue to parse ls unsafely like @lesmana.
@Eponymous: If you're looking for a one liner without using the fragile ls, printf "%s\n" b2* | head -1 will do it for you.
@DavidOngaro The question does not say that the filenames are version numbers. This is about modification times. Even with the filename assumption b2.10_5_2 kills this solution.
Your one liner is giving me right answer, but the "right" way is actually giving me the oldest file. Any idea why?
35

The combination of find and ls works well for

  • filenames without newlines
  • a not very large number of files
  • not very long filenames

The solution:

find . -name "my-pattern" -print0 |
    xargs -r -0 ls -1 -t |
    head -1

Let's break it down:

With find we can match all interesting files like this:

find . -name "my-pattern" ...

then using -print0 we can pass all filenames safely to the ls like this:

find . -name "my-pattern" -print0 | xargs -r -0 ls -1 -t

additional find search parameters and patterns can be added here

find . -name "my-pattern" ... -print0 | xargs -r -0 ls -1 -t

ls -t sorts files by modification time (newest first) and prints them one per line. You can add -c to sort by ctime instead (the time of the last status change, which is not the creation time). Note: this will break with filenames containing newlines.

Finally head -1 gets us the first file in the sorted list.

Note: xargs respects the system limit on the size of the argument list. If the file list exceeds that limit, xargs will call ls multiple times. This breaks the sorting and probably the final output as well. Run

xargs  --show-limits

to check the limits on your system.

Note 2: use find . -maxdepth 1 -name "my-pattern" -print0 if you don't want to search files through subfolders.

Note 3: As pointed out by @starfry, the -r argument for xargs prevents the call to ls -1 -t if no files were matched by find. Thank you for the suggestion.
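The multiple-invocation problem from the first note can be sidestepped by letting find print the timestamp itself, so that no ls call is involved. The following is a sketch under stated assumptions: it needs GNU find (for -printf) and GNU coreutils 8.25 or later (for the -z options of sort, head and cut), and the function name newest_by_find is made up here. Records are NUL-terminated, so names containing newlines survive the pipeline.

```bash
# Hypothetical helper: newest regular file under DIR whose name matches PATTERN.
# Requires GNU find and GNU sort/head/cut with -z support.
newest_by_find() {
    local dir="${1:-.}" pattern="${2:-*}"
    find "$dir" -type f -name "$pattern" -printf '%T@\t%p\0' |
        sort -z -rn |      # newest (largest epoch time) first
        head -z -n 1 |     # keep only the first NUL-terminated record
        cut -z -f 2- |     # drop the timestamp field, keep the path
        tr '\0' '\n'       # end the output with a newline for display
}
```

Add -maxdepth 1 to the find call if subdirectories should not be searched, as in Note 2.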

11 Comments

This is better than the ls based solutions, as it works for directories with extremely many files, where ls chokes.
find . -name "my-pattern" ... -print0 gives me find: paths must precede expression: `...'
Oh! ... stands for "more parameters". Just omit it, if you don't need it.
I found that this can return a file that does not match the pattern if there are no files that do match the pattern. It happens because find passes nothing to xargs which then invokes ls with no file lists, causing it to work on all files. The solution is to add -r to the xargs command-line which tells xargs not to run its command-line if it receives nothing on its standard input.
@starfry thank you! Nice catch. I added -r to the answer.
15

This is a possible implementation of the required Bash function:

# Print the newest file, if any, matching the given pattern
# Example usage:
#   newest_matching_file 'b2*'
# WARNING: Files whose names begin with a dot will not be checked
function newest_matching_file
{
    # Use ${1-} instead of $1 in case 'nounset' is set
    local -r glob_pattern=${1-}

    if (( $# != 1 )) ; then
        echo 'usage: newest_matching_file GLOB_PATTERN' >&2
        return 1
    fi

    # To avoid printing garbage if no files match the pattern, set
    # 'nullglob' if necessary
    local -i need_to_unset_nullglob=0
    if [[ ":$BASHOPTS:" != *:nullglob:* ]] ; then
        shopt -s nullglob
        need_to_unset_nullglob=1
    fi

    local newest_file= file
    for file in $glob_pattern ; do
        [[ -z $newest_file || $file -nt $newest_file ]] \
            && newest_file=$file
    done

    # To avoid unexpected behaviour elsewhere, unset nullglob if it was
    # set by this function
    (( need_to_unset_nullglob )) && shopt -u nullglob

    # Use printf instead of echo in case the file name begins with '-'
    [[ -n $newest_file ]] && printf '%s\n' "$newest_file"

    return 0
}

It uses only Bash builtins, and should handle files whose names contain newlines or other unusual characters.

6 Comments

You could use nullglob_shopt=$(shopt -p nullglob) and then later $nullglob_shopt to put nullglob back how it was before.
The suggestion by @gniourf_gniourf to use $(shopt -p nullglob) is a good one. I generally try to avoid using command substitution ($() or backticks) because it is slow, particularly under Cygwin, even when the command only uses builtins. Also, the subshell context in which the commands get run can sometimes cause them to behave in unexpected ways. I also try to avoid storing commands in variables (like nullglob_shopt) because very bad things can happen if you get the value of the variable wrong.
I appreciate the attention to details that can lead to obscure failure when overlooked. Thanks!
I love that you went for a more unique way to solve the problem! It's a certainty that in Unix/Linux there is more than one way to 'skin the cat!'. Even if this takes more work it has the benefit of showing people concepts. Have a +1!
@gniourf_gniourf Your method implies one fork! Using if ! shopt -q nullglob; then shopt -s nullglob; need_to_unset_nullglob=1; fi would be nicer (faster ;-).
12

Use the find command.

Assuming you're using GNU find, you can use -printf '%T+ %p\n' for file timestamp value.

find . -type f -printf '%T+ %p\n' | sort -r | head -n 1 | cut -d' ' -f2-

Example:

find ~/Downloads -type f -printf '%T+ %p\n' | sort -r | head -n 1 | cut -d' ' -f2-

For a more useful script, see the find-latest script here: https://github.com/l3x/helpers

2 Comments

to work with file names that contains spaces change cut -d' ' -f2,3,4,5,6,7,8,9 ...
The version of Bash is unimportant. You need to have GNU find because the -printf option is non-standard (so typically, out of the box, this will only work on Linux).
7

You can use stat with a file glob and a decorate-sort-undecorate with the file time added on the front (this is the BSD/macOS stat syntax):

$ stat -f "%m%t%N" b2* | sort -rn | head -1 | cut -f2-
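The -f format option above is the BSD/macOS spelling; GNU stat uses -c with different format letters. As a rough sketch, a wrapper could pick one or the other. The function name newest_by_stat is invented for illustration, and the detection relies on the assumption that only GNU stat accepts --version, which may not hold for every stat implementation.

```bash
# Hypothetical wrapper: print the newest of the files given as arguments.
# Assumes file names contain no newlines (the pipeline is line-based).
newest_by_stat() {
    (( $# )) || return 1
    if stat --version >/dev/null 2>&1; then
        stat -c $'%Y\t%n' -- "$@"    # GNU: mtime in epoch seconds, tab, name
    else
        stat -f $'%m\t%N' -- "$@"    # BSD/macOS: the same two fields
    fi | sort -rn | head -n 1 | cut -f 2-
}
```

It would be called with a glob, for example newest_by_stat b2*.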

As stated in the comments, the best cross-platform solution may be a Python, Perl, or Ruby script.

For such things, I tend to use Ruby, since it is very awk-like in the ease of writing small, throwaway scripts, yet has the power of Python or Perl right from the command line.

Here is a Ruby version:

ruby -e '
# index [0] for oldest and [-1] for newest
newest=Dir.glob("*").
    reject { |f| File.directory?(f)}.
    sort_by { |f| File.birthtime(f) rescue File.mtime(f) 
    }[-1]
p newest'

That gets the newest file in the current working directory.

You can also make the glob recursive by using **/* in glob or limit to matched files with b2*, etc

3 Comments

nope. "stat: cannot read file system information for '%m%t%N': No such file or directory"
I think this might be for the Mac/FreeBSD version of stat, if I'm remembering its options correctly. To get similar output on other platforms, you could use stat -c $'%Y\t%n' b2* | sort -rn | head -n1 | cut -f2-
With "other platforms" you probably mean Linux. There are other platforms still which require different options or, in the worst case, don't easily provide this level of granularity of control over the behavior of stat. If you need a portable solution, paradoxically, maybe write a Perl or Python script.
7

A Bash function to find the newest file under a directory matching a pattern

#1.  Make a bash function:
newest_file_matching_pattern(){ 
    find "$1" -name "$2" -print0 | xargs -r -0 ls -1 -t | head -1
} 
 
#2. Setup a scratch testing directory: 
mkdir /tmp/files_to_move;
cd /tmp/files_to_move;
touch file1.txt;
touch file2.txt; 
touch foobar.txt; 
 
#3. invoke the function: 
result=$(newest_file_matching_pattern /tmp/files_to_move "file*") 
printf 'result: %s\n' "$result"

Prints:

result: /tmp/files_to_move/file2.txt

Or, if brittle Bash parlor tricks that subcontract to a Python interpreter are more your angle, this does the same thing:

#!/bin/bash

function newest_file_matching_pattern {
python - <<END
import glob, os
# max() with key=os.path.getmtime picks the most recently modified match
print(max(glob.glob("/tmp/files_to_move/file*"), key=os.path.getmtime))
END
}

result=$(newest_file_matching_pattern)
printf 'result: %s\n' "$result"

Prints:

result: /tmp/files_to_move/file2.txt

2 Comments

@tripleee All excellent bash tips and links. Cringe code from 3 years ago made less bad.
4

Unusual filenames (such as a file name containing the valid \n character) can wreak havoc with this kind of parsing. Here's a way to do it in Perl:

perl -le '@sorted = map {$_->[0]} 
                    sort {$a->[1] <=> $b->[1]} 
                    map {[$_, -M $_]} 
                    @ARGV;
          print $sorted[0]
' b2*

That uses a Schwartzian transform.

2 Comments

May the schwartz be with you!
this answer may work but i wouldn't trust it given the poor documentation.
2

For googlers:

ls -t | head -1

  • -t sorts by last modification datetime
  • head -1 only returns the first result

(Don't use in production)


1

Combine find, stat, sort, cut and tail.

  1. find files with -type f (add -name 'b2*' to match by name)
  2. xargs stat on those files, printing %Y (seconds since the epoch) and %n (the file name)
  3. sort the result
  4. cut from field 2 onwards (-f 2-, tab-separated)
  5. tail -n 1 to keep the last line, which is the newest file

Works in all shells with GNU coreutils

tab=$(printf '\t');
find . -type f -print0 |
  xargs -0 stat --format "%Y$tab%n" |
  sort -n |
  cut -f 2- |
  tail -n 1

Feel free to substitute a literal tab character for "$tab"; a literal tab does not survive being posted on SO, hence the variable.

The OP asked for files whose names start with b2, so that would be

tab=$(printf '\t');
find . -type f -name 'b2*' -print0 |
  xargs -0 stat --format "%Y$tab%n" |
  sort -n |
  cut -f 2- |
  tail -n 1

1 Comment

I was wrong. Doesn't work in FreeBSD
0

I prefer the simplest possible way, and IMHO that is ls -1 together with the option needed for sorting. The -1 instructs ls to give me only the filenames, one per line. That is safe for filenames with spaces, but not for filenames containing newlines, which still span several lines. So here on a Debian 12 this would be

NEWESTFILE=$(ls -t -1 <dir>/<pattern> | head -1)

to perform this task.

Maybe -1 is a newer addition to ls, idk, but today it seems the most elegant solution for me :-)


-2

There is a much more efficient way of achieving this. Consider the following command:

find . -cmin 1 -name "b2*"

This command finds files matching "b2*" whose status last changed exactly one minute ago. If you want files that were modified two days ago, use the command below (use -mtime -2 instead to get files modified within the last two days):

find . -mtime 2 -name "b2*"

The "." represents the current directory. Hope this helps.
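For readers unsure how the numeric arguments to find's time tests behave: a bare n means exactly n units, -n means less than n, and +n means more than n. Below is a minimal sketch of the "within the last N days" case. The function name modified_within_days and its arguments are made up for illustration. Note that it lists candidates and does not pick the single newest file.

```bash
# Hypothetical helper: list regular files under DIR matching PATTERN that
# were modified less than DAYS*24 hours ago.
modified_within_days() {
    local dir="${1:-.}" pattern="${2:-*}" days="${3:-2}"
    find "$dir" -type f -name "$pattern" -mtime "-$days"
}
```

For example, modified_within_days . 'b2*' 2 lists the b2 files touched in the last 48 hours.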

5 Comments

This doesn't actually find the "newest file matching pattern"... it just find all the files matching pattern created a minute ago, or modified two days ago.
This answer was based on the question posed. Also, you can tweak the command to look at the latest file that came in a day or so ago. It depends on what you're trying to do.
"tweaking" is not the answer. it's like posting this as an answer: "Just tweak the find command and find the answer depending on what you want to do" .
Not sure about the unnecessary comment. If you feel like my answer does not substantiate, then please provide proper reason to why my answer doesn't make sense with EXAMPLES. If unable to do so, then please refrain from commenting further.
Your solution requires you to know when the latest file was created. That was not in the question so no, your answer is not based on the question posed.
