added 472 characters in body
Source Link
Stéphane Chazelas
  • Since bash has no equivalent of zsh's (N) glob qualifier or ksh93's ~(N) glob operator, you need to set the nullglob option before using a glob in a for loop (at least):

    shopt -s nullglob
    for infile in *_1.fastq.Block*; do...
    

    If you don't and there's no matching file, you'll loop over a literal *_1.fastq.Block*

  • You can set IFS for read only with: IFS=_ read -ra ADDR <<< "$infile" (see also the quotes around $infile which are needed in older versions of bash). That way, $IFS is only changed while read runs¹ and it's restored to its previous value after read returns.

  • IFS=. read -ra <<< "$var" is a poor method for splitting. First, it only works for single-line $vars, which is not necessarily the case for file names; it's also quite inefficient. It involves either storing the contents of $var in a temporary file or feeding it via a pipe, depending on the version of bash and/or the size of $var, and then reading it one byte at a time until a newline is found.

    Here, you could use the split+glob operator instead:

    IFS=:; set -o noglob
    addr=( $infile )
    

    (or addr=( $infile'' ) to not ignore a trailing :.)

    Or switch to better shells with proper splitting operators.

    Another approach here would be to do:

    regex='^(.*)_1\.fastq\.(Block.*)$'
    if [[ $infile =~ $regex ]]; then
      outfile=${BASH_REMATCH[1]}_2.fastq.${BASH_REMATCH[2]}
      ...
    

    With the caveat that regex matching only works with valid text, which again is not a guarantee for file names.

    Here, you could also use standard sh parameter expansion operators:

    new_file=${infile%_1.fastq.Block*}_2.fastq.Block${infile##*_1.fastq.Block}
    

    Or the ksh93-style:

    new_file=${infile/_1.fastq.Block/_2.fastq.Block}
    

    (note the variations in behaviour among all those approaches if _1.fastq.Block occurs more than once in the file name).
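
As a rough illustration of those variations, here is a minimal sketch assuming bash and a hypothetical file name in which _1.fastq.Block occurs twice. The variable names via_regex, via_sh and via_subst are made up for the example:

```shell
# Hypothetical file name in which _1.fastq.Block occurs twice.
infile=a_1.fastq.Block_1.fastq.Block7

# 1. Regex: the greedy (.*) pushes the match to the last occurrence.
regex='^(.*)_1\.fastq\.(Block.*)$'
if [[ $infile =~ $regex ]]; then
  via_regex=${BASH_REMATCH[1]}_2.fastq.${BASH_REMATCH[2]}
fi

# 2. Standard sh operators: % strips the shortest matching suffix and
#    ## the longest matching prefix, so the last occurrence is replaced.
via_sh=${infile%_1.fastq.Block*}_2.fastq.Block${infile##*_1.fastq.Block}

# 3. ksh93-style substitution: only the first occurrence is replaced.
via_subst=${infile/_1.fastq.Block/_2.fastq.Block}

printf '%s\n' "$via_regex" "$via_sh" "$via_subst"
```

With that input, the first two approaches give a_1.fastq.Block_2.fastq.Block7 while the third gives a_2.fastq.Block_1.fastq.Block7.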


added 472 characters in body
Source Link
Stéphane Chazelas

In

echo $second_file

Since you forgot to quote the parameter expansion, it's subject to split+glob. So with IFS=., ABC_2.fastq.Block12 is first split into ABC_2, fastq and Block12, and each word is then subject to globbing, which has no effect here since none of the words contain glob operators.

So three arguments are passed to echo, which prints them separated by spaces.

To print the contents of a variable followed by a newline character, you need:

printf '%s\n' "$var"
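
The difference can be reproduced with a small sketch like the one below. It assumes $IFS is still set to a dot, as in the situation described above; the variable names unquoted and quoted are made up for the example:

```shell
# Hypothetical value, mirroring the file name discussed above.
second_file=ABC_2.fastq.Block12

IFS=.                                   # assumed: still set to . from earlier code
unquoted=$(echo $second_file)           # split+glob applies: echo gets 3 arguments
quoted=$(printf '%s\n' "$second_file")  # quoted: a single argument, printed as is
unset IFS                               # back to default splitting behaviour

printf '%s\n' "unquoted: $unquoted" "quoted:   $quoted"
```

This prints "unquoted: ABC_2 fastq Block12" and then "quoted:   ABC_2.fastq.Block12".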

For more details, see:


Now, a few more comments on your code:

  • Since bash has no equivalent of zsh's (N) glob qualifier or ksh93's ~(N) glob operator, you need to set the nullglob option before using a glob in a for loop (at least):

    shopt -s nullglob
    for infile in *_1.fastq.Block*; do...
    

    If you don't and there's no matching file, you'll loop over a literal *_1.fastq.Block*

  • You can set IFS for read only with: IFS=_ read -ra ADDR <<< "$infile" (see also the quotes around $infile which are needed in older versions of bash). That way, $IFS is only changed while read runs¹ and it's restored to its previous value after read returns.

  • IFS=. read -ra <<< "$var" is a poor method for splitting. First, it only works for single-line $vars, which is not necessarily the case for file names; it's also quite inefficient. It involves either storing the contents of $var in a temporary file or feeding it via a pipe, depending on the version of bash and/or the size of $var, and then reading it one byte at a time until a newline is found.

    Here, you could use the split+glob operator instead:

    IFS=:; set -o noglob
    addr=( $infile )
    

    Or switch to better shells with proper splitting operators.

    Another approach here would be to do:

    regex='^(.*)_1\.fastq\.(Block.*)$'
    if [[ $infile =~ $regex ]]; then
      outfile=${BASH_REMATCH[1]}_2.fastq.${BASH_REMATCH[2]}
      ...
    

    With the caveat that regex matching only works with valid text, which again is not a guarantee for file names.

    Here, you could also use standard sh parameter expansion operators:

    new_file=${infile%_1.fastq.Block*}_2.fastq.Block${infile##*_1.fastq.Block}
    

    Or the ksh93-style:

    new_file=${infile/_1.fastq.Block/_2.fastq.Block}
    

¹ Though beware that if a trap is handled whilst read is running, the code in that trap will see the modified $IFS.
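
To check that $IFS is only changed for the duration of read, something like the following sketch should do, assuming bash. The file name and the ifs_before and ifs_after variables are hypothetical:

```shell
infile=ABC_1.fastq.Block12         # hypothetical file name

ifs_before=$IFS
IFS=_ read -ra ADDR <<< "$infile"  # IFS=_ applies to this read invocation only
ifs_after=$IFS

printf '<%s>\n' "${ADDR[@]}"       # one line per array element
[ "$ifs_before" = "$ifs_after" ] && echo 'IFS unchanged'
```

Here ADDR ends up with two elements, ABC and 1.fastq.Block12, and the last line reports that $IFS was left untouched.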


added 1421 characters in body
Source Link
Stéphane Chazelas

In

echo $second_file

Since you forgot to quote the parameter expansion, it's subject to split+glob. So with IFS=., ABC_2.fastq.Block12 is first split into ABC_2, fastq and Block12, and each word is then subject to globbing, which has no effect here since none of the words contain glob operators.

So three arguments are passed to echo, which prints them separated by spaces.

To print the contents of a variable followed by a newline character, you need:

printf '%s\n' "$var"

For more details, see:


Now, a few more comments on your code:

  • Since bash has no equivalent of zsh's (N) glob qualifier or ksh93's ~(N) glob operator, you need to set the nullglob option before using a glob in a for loop (at least):

    shopt -s nullglob
    for infile in *_1.fastq.Block*; do...
    

    If you don't and there's no matching file, you'll loop over a literal *_1.fastq.Block*

  • You can set IFS for read only with: IFS=_ read -ra ADDR <<< "$infile" (see also the quotes around $infile which are needed in older versions of bash). That way, $IFS is only changed while read runs¹ and it's restored to its previous value after read returns.

  • IFS=. read -ra <<< "$var" is a poor method for splitting. First, it only works for single-line $vars, which is not necessarily the case for file names; it's also quite inefficient. It involves either storing the contents of $var in a temporary file or feeding it via a pipe, depending on the version of bash and/or the size of $var, and then reading it one byte at a time until a newline is found.

    Here, you could use the split+glob operator instead:

    IFS=:; set -o noglob
    addr=( $infile )
    

    Or switch to better shells with proper splitting operators.


¹ Though beware that if a trap is handled whilst read is running, the code in that trap will see the modified $IFS.
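
The effect of nullglob on the for loop can be seen with a sketch along these lines. It assumes bash and a mktemp that supports -d; tmpdir and the two counters are hypothetical names. The loop runs over an empty scratch directory so that the glob matches nothing:

```shell
tmpdir=$(mktemp -d) || exit 1    # empty scratch directory: the glob has no match

shopt -u nullglob                # bash default: an unmatched glob is left as is
without=0
for infile in "$tmpdir"/*_1.fastq.Block*; do
  without=$((without + 1))       # runs once, on the literal pattern
done

shopt -s nullglob                # an unmatched glob now expands to nothing
with=0
for infile in "$tmpdir"/*_1.fastq.Block*; do
  with=$((with + 1))             # never runs
done

rmdir "$tmpdir"
echo "iterations without nullglob: $without, with nullglob: $with"
```

This reports one iteration without nullglob and none with it.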


Source Link
Stéphane Chazelas