^(.*)(\r?\n\1)+$
replace with \1
The above is a great way to remove duplicate lines using regex, but it requires the entire line to be a duplicate.
However, what would I use if I want to detect and remove dups when the line as a whole is not a dup, but just the first X characters are?
Example: Original File
12345 Dennis Yancey University of Miami
12345 Dennis Yancey University of Milan
12345 Dennis Yancey University of Rome
12344 Ryan Gardner University of Spain
12347 Smith John University of Canada
Dups Removed
12345 Dennis Yancey University of Miami
12344 Ryan Gardner University of Spain
12347 Smith John University of Canada
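For what it's worth, one way to adapt the pattern at the top is to capture only the first X characters and back-reference that group instead of the whole line. Below is a rough Python sketch of that idea, not a definitive answer. `PREFIX_LEN = 5` and the function name `drop_adjacent_prefix_dups` are assumptions made for illustration, and like the original pattern it only collapses duplicates that sit on consecutive lines.

```python
import re

PREFIX_LEN = 5  # assumed value of X; adjust to the real key width

# Group 1 is the whole first line, group 2 is its first PREFIX_LEN characters.
# Any directly following lines that start with group 2 are swallowed.
pattern = re.compile(
    r"^((.{%d}).*)(?:\r?\n\2.*)+$" % PREFIX_LEN,
    re.MULTILINE,
)

def drop_adjacent_prefix_dups(text: str) -> str:
    """Keep the first line of each run of lines sharing the same prefix."""
    return pattern.sub(r"\1", text)
```

In an editor with regex search and replace, the equivalent would be to search for `^((.{5}).*)(\r?\n\2.*)+$` and replace with `\1`. Lines shorter than X characters never match and are left alone.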
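What about 1 1 2 1, where the 1s are duplicates?
^(.{10}).*$[\s\S]*?\K^\1.* would handle that, but you'd have to run it until no more matches are found. It assumes multiline mode, so that ^ and $ match at line boundaries, and it only works in some regex flavors because of \K (e.g. PCRE).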
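If a script is an option, the non-adjacent case can be handled in a single pass without \K by remembering which prefixes have already been seen. This is a minimal Python sketch under that assumption. The function name `drop_prefix_dups` and the default `prefix_len=5` are made up for illustration.

```python
def drop_prefix_dups(lines, prefix_len=5):
    """Keep only the first line seen for each distinct prefix."""
    seen = set()   # prefixes encountered so far
    kept = []      # surviving lines, in original order
    for line in lines:
        key = line[:prefix_len]
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept
```

Typical use would be something like `drop_prefix_dups(text.splitlines())` followed by joining the result with newlines. Unlike the regex version, it does not need repeated runs and does not care whether the duplicates are next to each other.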