I have a folder with several files. These files are either .xml or .zip files.
These .zip files contain .xml and/or .zip files, which can in turn contain .xml or .zip files, and so on, until we finally reach .xml files.
In other words, there can be several "levels" of ZIP nesting before I get to my .xml files (cf. example below).
My requirement is to detect which root ZIP files contain at least one XML file that is bigger than 100Mb.
When a root ZIP file is in that case, it should be moved to another directory (let's say ~/big-files).
Also, if a non-zipped .xml file is bigger than 100Mb, it should be moved to this directory as well.
For example:
foo1.xml
foo2.xml
baz.xml [MORE THAN 100Mb]
one.zip
+- foo.xml
+- bar.xml [MORE THAN 100Mb]
+- foo.xml
two.zip
+- foo.xml
+- zip-inside1.zip
| +- bar.xml [MORE THAN 100Mb]
+- foo.xml
three.zip
+- foo.xml
+- zip-inside2.zip
| +- zip-inside3.zip
|    +- foo.xml
|    +- bar.xml [MORE THAN 100Mb]
+- foo.xml
four.zip
+- foo.xml
+- zip-inside1.zip
+- foo.xml
In this example, baz.xml, one.zip, two.zip and three.zip should be moved to ~/big-files as they host at least one XML file bigger than 100Mb, but not four.zip.
How can I achieve that in bash shell?
Thanks.
find does not look inside zip files. You will need to write a script to do this in a more powerful language (Python, Ruby, Perl, etc.)
-exec sh -c "unzip -l $@ ... | grep" xxx. That's a (small) shell script.
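Following up on the suggestion to use a more powerful language, here is a minimal Python sketch of the recursive check. It is not a bash solution and only one possible approach. The function names (zip_has_big_xml, move_big_files) and the LIMIT constant are made up for illustration, "100Mb" is assumed to mean 100 * 1024 * 1024 bytes, and nested archives are read fully into memory, which may not be acceptable for very large ZIPs.

```python
import io
import shutil
import zipfile
from pathlib import Path

# Assumption: "100Mb" in the question means 100 * 1024 * 1024 bytes.
LIMIT = 100 * 1024 * 1024


def zip_has_big_xml(fileobj, limit=LIMIT):
    """Return True if the archive, or any archive nested inside it,
    holds an .xml entry whose uncompressed size exceeds `limit` bytes."""
    with zipfile.ZipFile(fileobj) as zf:
        for info in zf.infolist():
            name = info.filename.lower()
            # file_size is the uncompressed size recorded in the archive.
            if name.endswith(".xml") and info.file_size > limit:
                return True
            if name.endswith(".zip"):
                # Nested archive: load it into memory and recurse.
                nested = io.BytesIO(zf.read(info))
                if zip_has_big_xml(nested, limit):
                    return True
    return False


def move_big_files(src_dir, dest_dir, limit=LIMIT):
    """Move big .xml files and root .zip files hosting a big .xml
    from src_dir to dest_dir. Returns the names that were moved."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materialises the listing before anything is moved.
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".xml":
            big = path.stat().st_size > limit
        elif suffix == ".zip":
            big = zip_has_big_xml(path, limit)
        else:
            big = False
        if big:
            shutil.move(str(path), str(dest / path.name))
            moved.append(path.name)
    return moved
```

To run it on the current folder, something like move_big_files(".", Path.home() / "big-files") should do. The search stops at the first oversized XML it finds, so a root ZIP is not unpacked any further than necessary. A corrupt archive would raise zipfile.BadZipFile, which this sketch does not handle.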