I'm trying to remove duplicates from a list of Jira tickets that use the following format:
XXXX-12345: a description
where 12345 stands for any number matching [0-9]+ and XXXX is constant. For example, the following list:
XXXX-1111: a description
XXXX-2222: another description
XXXX-1111: yet another description
should get cleaned up like this:
XXXX-1111: a description
XXXX-2222: another description
I've been trying to do this with sed, but while the command I had worked on macOS, it didn't on Linux. I think it would be easier with awk, but I'm not an expert in either of them.
I tried:
sed -r '$!N; /^XXXX-[0-9]+\n\1/!P; D' file
Replacing $0 with $1 in the accepted answer to this related question should do the trick.
awk splits each line into fields on whatever FS is set to; it defaults to sequences of spaces and tabs. This splitting sets the positional variables $1, $2, ... accordingly, so $1 is the first field, up to the first space/tab.
I used sed -r '$!N; /^XXXX-[0-9]+\n\1/!P; D' because I found another answer where it was used to delete duplicated lines. In the original answer, instead of XXXX-[0-9]+ there was (.*). But clearly I don't understand how it works, because it doesn't work.
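The awk suggestion above can be sketched as follows. The linked answer is not reproduced here, so this assumes it uses the common `!seen[$0]++` idiom. The function name `dedupe_tickets` and the array name `seen` are made up for illustration. With the default FS, `$1` is the ticket key including its trailing colon (e.g. `XXXX-1111:`), which is enough to tell tickets apart.

```shell
# Hypothetical helper: print only the first line seen for each ticket key.
# seen[$1]++ evaluates to 0 (false) the first time a key appears, so
# !seen[$1]++ is true and awk's default action prints the line.
# Later lines with the same first field are skipped.
dedupe_tickets() {
  awk '!seen[$1]++' "$@"
}

# Sample input taken from the question, fed on standard input.
dedupe_tickets <<'EOF'
XXXX-1111: a description
XXXX-2222: another description
XXXX-1111: yet another description
EOF
# prints:
# XXXX-1111: a description
# XXXX-2222: another description
```

Unlike the sed N/P/D approach, which only compares neighbouring lines, this keeps the first occurrence of each key even when the duplicates are not adjacent. It relies only on standard awk features, so it should behave the same on macOS and Linux.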