deleted 3 characters in body
Source Link
Ed Morton

There's a lot I'm not sure how you want handled, so here's a starting point at least. This will find the files in your folders and print them prefixed with their size in bytes:

find folder1 folder2 -type f -printf '%s %P\n'

e.g. something like (just hand-editing the list from your question):

50000 a2021fileA123.txt
80000 a2022fileA123.txt
79000 a2021fileA124.txt
80000 a2022fileA124.txt
90000 a2021fileA125.txt
80000 a2022fileA125.txt

Now pipe that to this awk command and it'll output the difference in sizes between the 2022 and 2021 versions of the files:

$ cat tst.awk
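{
    size = $1                  # size in bytes, as printed by find's %s
    year = substr($2,2,4)      # characters 2-5 of the name, e.g. "2021"
    base = substr($2,6)        # the rest of the name, e.g. "fileA123.txt"
    bases[base]                # remember each base name once
    map[base,year] = size      # POSIX awk simulated 2-D array
}
END {
    for ( base in bases ) {    # iteration order is unspecified
        print base, map[base,2022] - map[base,2021]
    }
}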

$ find folder1 folder2 -type f -printf '%s %P\n' | awk -f tst.awk
fileA125.txt -10000
fileA123.txt 30000
fileA124.txt 1000
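
If you want to try this without real folders, here is a minimal sketch that feeds the hand-edited listing from above straight into the same awk logic. The function names `sample_sizes` and `size_diffs` are made up for this example, and the script body is inlined rather than read from `tst.awk`. Since awk's `for ( ... in ... )` order is unspecified, the result is piped through a plain `sort` so the lines come out in a stable order:

```shell
# Sample "size name" lines copied from the listing above; they stand in for
# the output of: find folder1 folder2 -type f -printf '%s %P\n'
sample_sizes() {
    printf '%s\n' \
        '50000 a2021fileA123.txt' \
        '80000 a2022fileA123.txt' \
        '79000 a2021fileA124.txt' \
        '80000 a2022fileA124.txt' \
        '90000 a2021fileA125.txt' \
        '80000 a2022fileA125.txt'
}

# Same logic as tst.awk, inlined so the snippet is self-contained.
size_diffs() {
    awk '
    {
        size = $1
        year = substr($2,2,4)
        base = substr($2,6)
        bases[base]
        map[base,year] = size
    }
    END {
        for ( base in bases ) {
            print base, map[base,2022] - map[base,2021]
        }
    }'
}

# Plain sort orders the lines by file name.
sample_sizes | size_diffs | sort
```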

Pipe it to sort to get the output sorted by the size difference:

$ find folder1 folder2 -type f -printf '%s %P\n' | awk -f tst.awk | sort -k2,2rn
fileA123.txt 30000
fileA124.txt 1000
fileA125.txt -10000
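
One of the things you might want handled differently is a file that exists for only one of the two years. As written, the missing size is treated as 0, so the file's full size shows up as the difference. The following is only a sketch of one possible approach, assuming you would rather skip such files; the function name `size_diffs_paired` and the unpaired file `a2022fileA999.txt` are hypothetical and not from your question:

```shell
# Variant of tst.awk that reports only files present in both years.
size_diffs_paired() {
    awk '
    {
        year = substr($2,2,4)
        base = substr($2,6)
        bases[base]
        map[base,year] = $1
    }
    END {
        for ( base in bases ) {
            # Skip any base name that lacks a 2021 or a 2022 version.
            if ( ((base,2021) in map) && ((base,2022) in map) ) {
                print base, map[base,2022] - map[base,2021]
            }
        }
    }'
}

# fileA999.txt has no 2021 version (hypothetical input), so it is skipped.
printf '%s\n' \
    '50000 a2021fileA123.txt' \
    '80000 a2022fileA123.txt' \
    '70000 a2022fileA999.txt' |
size_diffs_paired | sort -k2,2rn
```

The `(base,2021) in map` test checks for a key without creating it, which a plain `map[base,2021]` reference would do.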

Hope that helps.

Source Link
Ed Morton

There's a lot I'm not sure how you want handled, so here's a starting point at least. This will find the files in your folders and print them prefixed with their size in bytes:

find folder1 folder2 -type f -printf '%s %P\n'

e.g. something like (just hand-editing the list from your question):

50000 a2021fileA123.txt
80000 a2022fileA123.txt
79000 a2021fileA124.txt
80000 a2022fileA124.txt
90000 a2021fileA125.txt
80000 a2022fileA125.txt

Now pipe that to this awk command (using GNU awk for arrays of arrays) and it'll output the difference in sizes between the 2022 and 2021 versions of the files:

$ cat tst.awk
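{
    size = $1                  # size in bytes, as printed by find's %s
    year = substr($2,2,4)      # characters 2-5 of the name, e.g. "2021"
    base = substr($2,6)        # the rest of the name, e.g. "fileA123.txt"
    map[base][year] = size     # GNU awk true array of arrays
}
END {
    for ( base in map ) {      # iteration order is unspecified
        print base, map[base][2022] - map[base][2021]
    }
}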

$ find folder1 folder2 -type f -printf '%s %P\n' | awk -f tst.awk
fileA125.txt -10000
fileA123.txt 30000
fileA124.txt 1000

Pipe it to sort to get the output sorted by the size difference:

$ find folder1 folder2 -type f -printf '%s %P\n' | awk -f tst.awk | sort -k2,2rn
fileA123.txt 30000
fileA124.txt 1000
fileA125.txt -10000

Hope that helps.