
I recently learned that there is a special case of command substitution:

The command substitution $(cat file) can be replaced by the equivalent but faster $(< file).

I never use the cat variation, and I have seen extensive usage of read instead for the same, i.e.:

IFS='' read -r -d '' VAR < file

What is the difference in terms of the side effects (e.g. special characters in file), performance or any other aspects between the two and why don't I see the former used extensively in scripts that are otherwise using Bash-only features already?

Comments:
  • Are you asking for someone to provide a side-by-side comparison of VAR=$(cat file), VAR=$(< file), and IFS='' read -r -d '' VAR < file, or something else? When you say "why don't I see the former used extensively in scripts" - is the "former" $(cat file) (that's the first command in the question)?
  • (Performance cost is going to depend on the details -- in particular, read is more efficient when it's reading from a seekable source such as a regular file, and slower when it's reading from a pipe/FIFO/&c where it can't read more than it needs and rewind the file pointer after).
  • @AlbertCamu var=$(cat file; printf 'x'); var=${var%x} or similar to add a char after the file, thereby ensuring no trailing whitespace is stripped, then remove that char.
  • @Barmar, ...I just smoke-tested it with Apple's 3.2.57(1)-release build; it's definitely there.
  • @kojiro, fair, but again, cat is an external command, so even though $(...) may not require a subshell when all you're running is builtins, $(cat file) definitely calls a fork(), so there's a transient subshell. (And while theoretically, $(cat file) would be two subshells -- one for the $() and the other in the transient fork() -- making the other a savings, in practice the shell detects when a command substitution invokes only one command with no traps &c and performs an implicit exec, so when the conditions for the optimization exist it evens out).
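The sentinel trick mentioned in the comments can be sketched as follows (the file name wsfile is a demo-only choice, not from the question):

```shell
# Sentinel trick from the comments above: append a character inside the
# command substitution so trailing newlines survive, then strip it.
printf 'data\n\n\n' > wsfile      # demo file with three trailing newlines
var=$(cat wsfile; printf 'x')     # 'x' shields the newlines from stripping
var=${var%x}                      # drop the sentinel
rm -f wsfile
```

After this, var still ends in all three newlines that a plain $(cat wsfile) would have stripped.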

3 Answers


They are not equivalent. The value returned from a subshell has trailing newlines stripped:

See: https://pubs.opengroup.org/onlinepubs/9799919799/utilities/V3_chap02.html#tag_19_06_03

$(commands)
or (backquoted version):

`commands`
The shell shall expand the command substitution by executing commands in a subshell environment (see 2.13 Shell Execution Environment) and replacing the command substitution (the text of the commands string plus the enclosing "$()" or backquotes) with the standard output of the command(s); if the output ends with one or more bytes that have the encoded value of a <newline> character, they shall not be included in the replacement. Any such bytes that occur elsewhere shall be included in the replacement; however, they might be treated as field delimiters and eliminated during field splitting, depending on the value of IFS and quoting that is in effect. If the output contains any null bytes, the behavior is unspecified.

The cost of setting up a subshell environment is also higher than using read.

In recent versions of bash, mapfile (aka readarray) can also be used as a faster (~10%?) alternative to read (technically it is also slightly different as it creates an array):

Consider:

$ unset v1 v2 v3
$ printf '\n\nabc\t\t\n\n\n' > f
$ v1=$(<f)
$ IFS= read -r -d '' v2 <f
$ mapfile -d '' v3 <f
$ declare -p v1 v2 v3
declare -- v1=$'\n\nabc\t\t'
declare -- v2=$'\n\nabc\t\t\n\n\n'
declare -a v3=([0]=$'\n\nabc\t\t\n\n\n')
$

There's also a difference if the file actually contains NUL bytes:

$ unset v1 v2 v3
$ printf '\n\n\ta\t\n\n\n\tb\t\n\0\n\n\tc\t\n\n\0\n' >f
$ v1=$(<f)
bash: warning: command substitution: ignored null byte in input
$ IFS= read -r -d '' v2 <f
$ mapfile -d '' v3 <f
$ declare -p v1 v2 v3
declare -- v1=$'\n\n\ta\t\n\n\n\tb\t\n\n\n\tc\t'
declare -- v2=$'\n\n\ta\t\n\n\n\tb\t\n'
declare -a v3=([0]=$'\n\n\ta\t\n\n\n\tb\t\n' [1]=$'\n\n\tc\t\n\n' [2]=$'\n')
$


Setting a variable from file content: a little benchmark

To complement jhnc's correct answer, here is a little benchmark.

First, create a single long line containing only ASCII characters:

For this test I want to read only one line, so I use the shared-memory pseudo-filesystem /dev/shm to minimize filesystem overhead in the benchmark.

LANG=C man -Len -Pcol\ -b man |
    tr \\n \ |
    sed 's/[[:space:]]\+/ /g' >/dev/shm/file

On my host this produces a file containing one single 28 kB line.

wc /dev/shm/file
    0  4749 28670 /dev/shm/file

Define one function per approach under test:

getvar1() {  var=$(cat /dev/shm/file)       ;}
getvar2() {  var=$(< /dev/shm/file)         ;}
getvar3() {  read -r var < /dev/shm/file    ;}
getvar4() {  mapfile -t var < /dev/shm/file ;}

Test loops:

times=();for string in long short; do
    times+=($string)
    echo "Doing 4 tests with $string string:"
    for test in {1..4}; do
        started=${EPOCHREALTIME/.}  var=---
        for ((i=1000;i--;)); do
            getvar$test
        done
        elap=00000$(( ${EPOCHREALTIME/.}-started))
        printf -v "times[${#times[@]}]" %.5f  ${elap::-6}.${elap: -6}
        mapfile -t command < <(declare -f getvar$test)
        printf 'Test: %d: %s\n  var is %d len: "%s"\n  time: %ssec.\n' $test \
            "${command[2]}" "${#var}" "${var::4}...${var: -4}" "${times[-1]}"
    done
    echo Lorem Ipsum >/dev/shm/file
done

On my host this produces:

Doing 4 tests with long string:
Test: 1:     var=$(cat /dev/shm/file)
  var is 28670 len: "MAN(...(1) "
  time: 1.72581sec.
Test: 2:     var=$(< /dev/shm/file)
  var is 28670 len: "MAN(...(1) "
  time: 0.15679sec.
Test: 3:     read -r var < /dev/shm/file
  var is 28669 len: "MAN(...N(1)"
  time: 0.35542sec.
Test: 4:     mapfile -t var < /dev/shm/file
  var is 28670 len: "MAN(...(1) "
  time: 0.10624sec.
Doing 4 tests with short string:
Test: 1:     var=$(cat /dev/shm/file)
  var is 11 len: "Lore...psum"
  time: 1.55918sec.
Test: 2:     var=$(< /dev/shm/file)
  var is 11 len: "Lore...psum"
  time: 0.01891sec.
Test: 3:     read -r var < /dev/shm/file
  var is 11 len: "Lore...psum"
  time: 0.01821sec.
Test: 4:     mapfile -t var < /dev/shm/file
  var is 11 len: "Lore...psum"
  time: 0.01632sec.

Then

printf '%8s: %10s%10s%10s%10s\n' string test{1..4} ${times[@]}
  string:      test1     test2     test3     test4
    long:    1.72581   0.15679   0.35542   0.10624
   short:    1.55918   0.01891   0.01821   0.01632
  • Using $(cat file) took more than 1.5 seconds! Here we see the higher cost of setting up a subshell environment (plus forking the external cat).
  • Plain read is significantly slower than var=$(<file) or mapfile on the long string.
  • The read command dropped the trailing space (28669 vs 28670 characters) because IFS was not cleared.
  • The quickest is mapfile.
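For a quick spot-check of a single case without the full harness, bash's time keyword works too (timings depend on the machine; the path and file size below are arbitrary demo choices, not from the benchmark above):

```shell
# Quick spot-check with bash's time keyword (results vary by host).
printf 'x%.0s' {1..28670} > /tmp/benchfile   # ~28 kB on a single line
time for ((i = 0; i < 1000; i++)); do
    var=$(< /tmp/benchfile)
done
rm -f /tmp/benchfile
```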

Note about mapfile

Reading binary data under bash is possible, but for that you have to read byte by byte (see Yes, bash can read and write binary). That does the job, but slowly.

Using mapfile with a null separator can be very efficient. For example, the Linux kernel exposes the pseudo-filesystem /proc, where you can read the environment of any process (depending on your access rights). All entries are separated by a null byte (0x00).

Reading your own environment this way is pointless; this is just an example:

mapfile -d '' -t env </proc/$$/environ 

Now, in the array $env, you should be able to find your entire shell environment:

shopt -s extglob
printf 'Variable USER: real="%s", in $env var="%s"\n' "$USER" ${env[@]/#!(USER=?*)}

This should output something like:

Variable USER: real="john", in $env var="john"

Further with mapfile...

A little binary test using mapfile:

man man | md5sum
2953aa6314f6c27c4277d1731464f2ea  -
IFS= LANG=C mapfile -t -d '' binary < <(man man|zstd)
printf '%s\0' "${binary[@]}" | zstd -d | md5sum
zstd: /*stdin*\: unknown header 
2953aa6314f6c27c4277d1731464f2ea  -

zstd complains about the trailing null byte added after the last field (${binary[-1]}), but decompresses the binary correctly. Stripping that final byte silences the complaint:

printf '%s\0' "${binary[@]}"| head -c -1 | zstd -d | md5sum
2953aa6314f6c27c4277d1731464f2ea  -


I [...] have seen extensive usage of read instead for the same, i.e.:

IFS='' read -r -d '' VAR < file

Really? I see plenty of use of read, but I don't think I've ever seen it used in the particular manner you describe, to read an entire file into a variable. Of course, reading an entire file into a variable is itself something I don't see very often (and I would generally consider doing so an anti-pattern), so I guess my sample size is limited.

What is the difference in terms of the side effects (e.g. special characters in file),

cat and < both transfer raw bytes without interpretation, until the input is exhausted. On the other hand, read reads and processes just one line. Setting the line delimiter to the empty string does not prevent that. It instructs read that the line delimiter is the NUL character, not that there isn't any delimiter at all. That might be desirable, undesirable, or irrelevant, depending on the situation.
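A small demonstration of that difference (a sketch assuming bash, where recent versions warn about NUL bytes in command substitutions; the file name nulfile is demo-only):

```shell
# read -d '' stops at the first NUL byte; $(< file) silently drops NULs
# (recent bash prints a warning on stderr) and strips trailing newlines.
printf 'a\0b\n' > nulfile
IFS= read -r -d '' v1 < nulfile   # v1 holds everything up to the NUL: 'a'
v2=$(< nulfile)                   # NUL dropped, trailing newline stripped: 'ab'
rm -f nulfile
```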

Also, as another answer already observes, command substitutions remove trailing newlines. Because it does not treat newlines as line delimiters, your read command may store trailing newlines in the value of VAR, but VAR=$(< file) will not do so.

performance or any other aspects between the two

read by default interprets line continuations as the shell itself does, and it performs word splitting on the resulting line. The -r option will moot line continuations, and setting IFS to an empty string will neuter word splitting, but that does not necessarily mean that read's provision for these things does not still have a cost.
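These effects can be seen directly (a minimal sketch; the file name contfile is just for the demo):

```shell
# Without -r, a backslash-newline is treated as a line continuation and
# default IFS trims surrounding whitespace; -r plus IFS= keeps the
# physical line verbatim.
printf '  hello\\\nworld\n' > contfile
read v1 < contfile            # continuation joined, spaces trimmed: 'helloworld'
IFS= read -r v2 < contfile    # first physical line kept: '  hello\'
rm -f contfile
```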

and why don't I see the former used extensively in scripts that are otherwise using Bash-only features already?

It's not entirely clear what you mean by "the former", but since you say you do see the read variation, I guess you're asking why you don't see $(< file).

That has a lot to do with what scripts you read, so we're not in a position to answer definitively. As I already observed, I don't, myself, see the read usage you describe, which you can take as a data point supporting

  • it's a matter of style and personal preference

You could also consider that

  • Although a bit wasteful, $(cat file) is intuitive, and $(< file) is not much worse in that sense. Your read command, however, requires a lot more analysis to decipher, at least for me. Code clarity is important.

And of course,

  • there are semantic differences between these alternatives, as already described. It is conceivable that the read uses of this form that you observe yourself intentionally make use of that.

Furthermore, although shell script performance is a relevant consideration, it is rarely a primary consideration. If it's important for your program to run as fast as possible then writing it as a shell script at all is a mistake.

7 Comments

Bah. Anyone who knows how to read NUL-delimited streams already knows when they see IFS= read -r -d '' what it means -- it's an idiom at this point, because the several pieces need to be used together to get the standard / commonly-desired effect.
I have difficulty following your "read reads and processes just one line. Setting the line delimiter to the empty string does not prevent that." Say the term "line" means a unit of characters read. Then setting the line delimiter so that it effectively swallows everything (ignoring NULs for now) does prevent actual line-by-line reading.
@CharlesDuffy, I take you to be responding to my remarks about code clarity. You may be quite right that IFS= read -r -d '' will be taken as idiomatic among some group of people who are in the know about it, but I'm not prepared to throw out clarity to others, who seem to include the OP. Especially where anyone might consider the read usage in question as an alternative to $(< file).
@AlbertCamu, read's -d option determines the definition of "line" for its purposes. It never reads more than one line in this sense. This is a semantic difference from $(< file) that cannot be removed, though whether it makes an actual difference for reading any particular file depends on the contents of that file.
I actually thought that such use of read was the usual way to slurp a file. Now considering that cat with command substitution strips the newlines, it looks like it's the only way.
