Remove Specific Duplicates in a Java stream with a single matching field

Question

Given a Java POJO containing the following fields:

public class Pojo { 
    // for brevity I am excluding the property getters / setters etc.
    private int V;
    private BigDecimal Amount;
    private String Operation;
    private String Tag;
    private String Code; 
}

I have a list of these POJOs that looks like the table below. Comparing only Tag, Amount, and Ver (the V field above) and ignoring Code, I would like to use a hash, a stream, or any other approach to remove all rows where there is exactly one "IU" and exactly one "DU" for the same Tag, Amount, and Ver.

Is there a way to join or group streams so that if there is exactly one insert update ("IU") and exactly one delete update ("DU") for the same values, both are removed?

Ver  Amount  Operation  Tag      Code
1    1       "IU"       6896450  5500
1    1       "DU"       6898103  5500
2    4       "IU"       6954561  5200
2    4       "DU"       6954561  5500
3    4       "IU"       7057717  5200
3    4       "DU"       7057717  5500
1    8       "IU"       7132952  5200
1    8       "DU"       7132952  5500

What if you have 2 IU and 2 DU for the same value: are they all deleted? And what if you have 2 IU and 1 DU: do you delete one IU for the same value?
– Karim, Commented Dec 23, 2019 at 14:29
I'd make a PojoKey class that references the Tag/Code/Op fields, then a couple of Map<PojoKey, List<Pojo>> instances to add the pojos to, one for IU and one for DU. From there you have a well-structured data model to work with, in this case comparing the two maps to find the union of them. You need to decide what to do when there are multiple IU/DU entries per item.
– Stik, Commented Dec 23, 2019 at 14:51
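
The two-map idea from the comment above could be sketched roughly as follows. This is only an illustration under several assumptions: the `Pojo` class is a simplified stand-in (the question omits its constructor and accessors), `PojoKey` and `TwoMapDemo` are hypothetical names, the key uses Tag, Amount, and Ver as the question requests rather than the fields named in the comment, and amounts that differ only in scale (1 vs 1.00) are treated as equal.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Simplified stand-in for the question's Pojo (assumed shape).
final class Pojo {
    final int ver;
    final BigDecimal amount;
    final String operation;
    final String tag;
    final String code;

    Pojo(int ver, BigDecimal amount, String operation, String tag, String code) {
        this.ver = ver;
        this.amount = amount;
        this.operation = operation;
        this.tag = tag;
        this.code = code;
    }
}

// Hypothetical key class: compares Tag, Amount, and Ver; Code is ignored.
final class PojoKey {
    private final String tag;
    private final BigDecimal amount;
    private final int ver;

    PojoKey(Pojo p) {
        this.tag = p.tag;
        // BigDecimal.equals is scale-sensitive, so normalise the scale first.
        this.amount = p.amount.stripTrailingZeros();
        this.ver = p.ver;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PojoKey)) return false;
        PojoKey other = (PojoKey) o;
        return ver == other.ver && tag.equals(other.tag) && amount.equals(other.amount);
    }

    @Override
    public int hashCode() {
        return Objects.hash(tag, amount, ver);
    }
}

final class TwoMapDemo {
    // Returns a new list without the IU/DU rows whose key has exactly one IU and one DU.
    static List<Pojo> removeMatchedPairs(List<Pojo> pojos) {
        Map<PojoKey, List<Pojo>> inserts = new HashMap<>();
        Map<PojoKey, List<Pojo>> deletes = new HashMap<>();
        for (Pojo p : pojos) {
            if ("IU".equals(p.operation)) {
                inserts.computeIfAbsent(new PojoKey(p), k -> new ArrayList<>()).add(p);
            } else if ("DU".equals(p.operation)) {
                deletes.computeIfAbsent(new PojoKey(p), k -> new ArrayList<>()).add(p);
            }
        }
        List<Pojo> kept = new ArrayList<>();
        for (Pojo p : pojos) {
            PojoKey key = new PojoKey(p);
            List<Pojo> iu = inserts.get(key);
            List<Pojo> du = deletes.get(key);
            boolean exactPair = iu != null && du != null && iu.size() == 1 && du.size() == 1;
            boolean isIuOrDu = "IU".equals(p.operation) || "DU".equals(p.operation);
            if (!(exactPair && isIuOrDu)) {
                kept.add(p);
            }
        }
        return kept;
    }
}
```

In this sketch, keys with multiple IU or DU rows are left untouched, which is one possible answer to the question raised in the comments; a different policy would change only the `exactPair` condition.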

Venkata Raju · Accepted Answer · 2019-12-23 16:59:25Z

// import java.math.BigDecimal;
// import java.util.*;
// import java.util.Map.Entry;
// import static java.util.stream.Collectors.groupingBy;

List<Pojo> pojos = new ArrayList<>(Arrays.asList(
    new Pojo(1, BigDecimal.ONE, "IU", "6896450", "5500"),
    // ...,
    new Pojo(1, BigDecimal.valueOf(8), "DU", "7132952", "5500")));

// Group the pojos by `Tag` (`opMapByTag`); within each tag,
// group them again by `Operation` (`pojosByOp`).
// Note: this groups by `Tag` only; `Amount` and `Ver` are not compared.
Map<String, Map<String, List<Pojo>>> opMapByTag =
    pojos.stream().collect(
        groupingBy(Pojo::getTag,
           groupingBy(Pojo::getOperation)));

Set<Pojo> toBeRemoved = new HashSet<>();

for (Entry<String, Map<String, List<Pojo>>> e : opMapByTag.entrySet())
{
  Map<String, List<Pojo>> pojosByOp = e.getValue();
  // get() returns null when a tag has no pojos for that operation,
  // so check for null before calling size().
  List<Pojo> l1 = pojosByOp.get("IU");
  List<Pojo> l2 = pojosByOp.get("DU");
  if ((pojosByOp.size() == 2) &&
      (l1 != null) && (l1.size() == 1) &&
      (l2 != null) && (l2.size() == 1))
  {
    Collections.addAll(toBeRemoved, l1.get(0), l2.get(0));
  }
}

pojos.removeIf(toBeRemoved::contains);

pojos.forEach(System.out::println);
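
The question asks for a match on Tag, Amount, and Ver together, whereas the code above groups on Tag alone. One way to group on all three fields might look like the sketch below. It is not part of the original answer: the `Pojo` class is a simplified stand-in, `PairRemovalDemo`, `key`, `isExactPair`, and `removeMatchedPairs` are hypothetical names, and it assumes that amounts differing only in scale (4 vs 4.00) should match.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Simplified stand-in for the question's Pojo (assumed shape).
final class Pojo {
    final int ver;
    final BigDecimal amount;
    final String operation;
    final String tag;
    final String code;

    Pojo(int ver, BigDecimal amount, String operation, String tag, String code) {
        this.ver = ver;
        this.amount = amount;
        this.operation = operation;
        this.tag = tag;
        this.code = code;
    }

    @Override
    public String toString() {
        return ver + " " + amount + " " + operation + " " + tag + " " + code;
    }
}

final class PairRemovalDemo {
    // Composite key of Tag, Amount, and Ver; Code is deliberately left out.
    static List<Object> key(Pojo p) {
        // stripTrailingZeros makes 4 and 4.00 produce equal keys,
        // because BigDecimal.equals also compares scale.
        return Arrays.<Object>asList(p.tag, p.amount.stripTrailingZeros(), p.ver);
    }

    // True when the group holds exactly one IU row and exactly one DU row.
    static boolean isExactPair(List<Pojo> group) {
        long iu = group.stream().filter(p -> "IU".equals(p.operation)).count();
        long du = group.stream().filter(p -> "DU".equals(p.operation)).count();
        return group.size() == 2 && iu == 1 && du == 1;
    }

    // Returns a new list; the input list is not modified.
    static List<Pojo> removeMatchedPairs(List<Pojo> pojos) {
        Map<List<Object>, List<Pojo>> byKey =
            pojos.stream().collect(Collectors.groupingBy(PairRemovalDemo::key));
        return pojos.stream()
            .filter(p -> !isExactPair(byKey.get(key(p))))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Pojo> pojos = Arrays.asList(
            new Pojo(1, BigDecimal.ONE, "IU", "6896450", "5500"),
            new Pojo(1, BigDecimal.ONE, "DU", "6898103", "5500"),
            new Pojo(2, BigDecimal.valueOf(4), "IU", "6954561", "5200"),
            new Pojo(2, BigDecimal.valueOf(4), "DU", "6954561", "5500"));
        removeMatchedPairs(pojos).forEach(System.out::println);
    }
}
```

With the four sample rows in `main`, the two rows for tag 6954561 form an exact pair and are dropped, while the first two rows remain because their tags differ. Because rows are filtered by group membership rather than through a `HashSet<Pojo>`, the result does not depend on how `Pojo` implements `equals` and `hashCode`.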

edited Dec 23, 2019 at 16:59

answered Dec 23, 2019 at 16:42


1 Comment

Jim Over a year ago

Yeah, thanks. I was being lazy and that's pretty much what I did; thanks for the tips. Hash maps are the way to go; unfortunately it's O(n^2) otherwise.
