Gilles 'SO- stop being evil'

So you want to remove the files listed in workflows.txt, except for those that are listed in workflows-sorted.txt. You can obtain the list of files by stripping off the checksums, sorting the names and running comm to extract the lines that are only present in workflows.txt. In a shell that supports process substitution (ksh93, bash, zsh):

comm -23 <(<workflows.txt sed 's/^[^ ]*[ ][ ]*//' | sort) \
         <(<workflows-sorted.txt sed 's/^[^ ]*[ ][ ]*//' | sort)

comm -23 removes the lines that are only present in the second argument (-2) and the lines that are present in both files (-3), thus keeping only the lines that are present in the first argument but not the second argument. Keep in mind that comm requires the input files to be sorted.
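To see what comm -23 keeps, here is a minimal sketch on two tiny sorted lists. The names a.txt and b.txt and their contents are made up for illustration; they are not the files from the question.

```shell
#!/bin/sh
# Hypothetical sample data; a.txt and b.txt are illustrative names only.
LC_ALL=C; export LC_ALL          # fixed collation so sort and comm agree
tmpdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$tmpdir/a.txt"
printf '%s\n' beta delta > "$tmpdir/b.txt"

# Both inputs are already sorted, as comm requires.
# -2 drops lines unique to b.txt, -3 drops lines common to both.
only_in_first=$(comm -23 "$tmpdir/a.txt" "$tmpdir/b.txt")
printf '%s\n' "$only_in_first"

rm -rf -- "$tmpdir"
```

This should print alpha and gamma: beta is dropped because it appears in both lists, and delta is dropped because it appears only in the second one.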

To delete them:

comm -23 <(<workflows.txt sed 's/^[^ ]*[ ][ ]*//' | sort) \
         <(<workflows-sorted.txt sed 's/^[^ ]*[ ][ ]*//' | sort) |
xargs -I {} rm -- {}

You can change the last line to xargs rm to go slightly faster (by grouping calls to rm) if the file names don't contain any whitespace or any of the characters \'". Alternatively, you can make the last line tr '\n' '\0' | xargs -0 rm -- or xargs -d '\n' rm -- if your xargs supports these options. You don't need the -- if all your file names begin with / or ./ (or anything that's guaranteed not to begin with -).
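If you'd like to try this before touching real files, the following is a rough dry run in a scratch directory. Everything in it is an assumption made for illustration: the file names, the dummy checksums, and the intermediate files all.names and keep.names, which stand in for process substitution so that the sketch also runs in a plain sh. It uses the tr '\n' '\0' | xargs -0 variant, so it needs an xargs with -0.

```shell
#!/bin/sh
# Dry run in a scratch directory; all file names and checksums are made up.
LC_ALL=C; export LC_ALL
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# One name with a space and one starting with "-" to exercise the edge cases.
touch 'keep me.wf' 'drop me.wf' ./-odd.wf

# Checksum-style lists: a hash field, blanks, then the file name.
printf '%s\n' 'aaa111  keep me.wf' 'bbb222  drop me.wf' 'ccc333  -odd.wf' > workflows.txt
printf '%s\n' 'aaa111  keep me.wf' > workflows-sorted.txt

# Strip the checksum field and sort; temporary files replace <(...).
<workflows.txt sed 's/^[^ ]*[ ][ ]*//' | sort > all.names
<workflows-sorted.txt sed 's/^[^ ]*[ ][ ]*//' | sort > keep.names

# NUL-delimit the names so the space survives; -- guards the name starting with "-".
comm -23 all.names keep.names | tr '\n' '\0' | xargs -0 rm --

# Record the outcome before cleaning up.
kept=no;    [ -e 'keep me.wf' ] && kept=yes
dropped=no; [ ! -e 'drop me.wf' ] && [ ! -e ./-odd.wf ] && dropped=yes
printf 'kept=%s dropped=%s\n' "$kept" "$dropped"

cd / && rm -rf -- "$scratch"
```

Replacing rm with echo rm in the xargs line is a cheap way to preview what would be deleted.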
