I have a large space-delimited file with thousands of rows and columns. I would like to remove every line that has the same value in all columns except the first.
Input:
CHROM 108 139 159 265 350 351
SNP1 -1 -1 -1 -1 -1 -1
SNP2 2 2 2 2 2 2
SNP3 0 0 0 -1 -1 -1
SNP4 1 1 1 1 1 1
SNP5 0 0 0 0 0 0
Desired output:
CHROM 108 139 159 265 350 351
SNP3 0 0 0 -1 -1 -1
There is a similar question for pandas (Delete duplicate rows with the same value in all columns in pandas), and I found a partial solution that removes lines containing only zeros:
awk 'NR > 1{s=0; for (i=3;i<=NF;i++) s+=$i; if (s!=0)print}' input > outfile
but I want to do this for the values -1, 0, 1 and 2 in one go, keeping the header line and treating the first column as the identifier.
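To make the goal concrete, here is a rough sketch of the direction I have in mind. It assumes the first line is always the header and that fields are separated by whitespace. The header is printed unchanged, and every other row is kept only if some column from the third onward differs from the second. The file names `input` and `outfile` are just placeholders, and I have only tried this on the small sample above, not on the full file.

```shell
# Recreate the sample shown above (file names are placeholders).
cat > input <<'EOF'
CHROM 108 139 159 265 350 351
SNP1 -1 -1 -1 -1 -1 -1
SNP2 2 2 2 2 2 2
SNP3 0 0 0 -1 -1 -1
SNP4 1 1 1 1 1 1
SNP5 0 0 0 0 0 0
EOF

# Line 1 is the header: always print it.
# For every other line, compare columns 3..NF against column 2 and
# print the line as soon as one value differs; constant rows fall through
# the loop without being printed.
awk 'NR == 1 { print; next }
     { for (i = 3; i <= NF; i++) if ($i != $2) { print; next } }' input > outfile
```

Comparing against the second column rather than summing avoids dropping a row such as `1 -1 1 -1`, whose values add up to zero without being identical. Note that a row with only one value column would also be removed by this sketch.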
Any help will be highly appreciated.