
I have a list of objects in Java, each with two timestamps, like:

Obj (TimeStamp ts, TimeStamp generationTs, int value).

In the end, I don't want two items in the list with the same ts. If there are duplicates, I want to keep only the one with the most recent generationTs.

Currently I have the code below. It works, but I'd like to know whether streams would let me do this better.

list.sort(Collections.reverseOrder());
List<Obj> returnedList = Lists.newArrayList();
if (!list.isEmpty()) {
   returnedList.add(list.get(0));
   Iterator<Obj> i = list.iterator();
   while (i.hasNext()) {
       Obj lastObj = returnedList.get(returnedList.size() - 1);
       Obj nextObj = i.next();
       if (!lastObj.getTs().isEqual(nextObj.getTs())) {
           returnedList.add(nextObj);
       } else {
           if (lastObj.getGenerationTs().isBefore(nextObj.getGenerationTs())) {
             returnedList.remove(lastObj);
             returnedList.add(nextObj);
           }
        }
    }
}

If the list is:

{("2019-05-02T09:00:00Z", "2019-05-02T21:00:00Z", 1),
("2019-05-02T09:30:00Z", "2019-05-02T21:00:00Z", 2),
("2019-05-02T10:00:00Z", "2019-05-02T21:00:00Z", 3),
("2019-05-02T10:30:00Z", "2019-05-02T21:00:00Z", 4),
("2019-05-02T09:30:00Z", "2019-05-02T22:00:00Z", 5),
("2019-05-02T10:00:00Z", "2019-05-02T22:00:00Z", 6) }

It must return:

{("2019-05-02T09:00:00Z", "2019-05-02T21:00:00Z", 1),
("2019-05-02T09:30:00Z", "2019-05-02T22:00:00Z", 5),
("2019-05-02T10:00:00Z", "2019-05-02T22:00:00Z", 6),
("2019-05-02T10:30:00Z", "2019-05-02T21:00:00Z", 4) }
  • Try looking at List.removeIf, but make sure you don't remove an element just because it matches itself. Commented May 7, 2019 at 14:38
  • @Naman: removeIf() works when you have a predicate you know. In my case, I want to remove element x if: `list(x).getTs() == list(y).getTs() && list(x).getGenerationTs().isBefore(list(y).getGenerationTs())` Commented May 7, 2019 at 14:52
  • I would prefer not to go the stream route here, though. Still, one possible way is to collect with toMap, using the timestamp as the key and the Obj as the value, and have the mergeFunction enforce your generationTs condition. Finally, take the values of that map. Commented May 7, 2019 at 14:56
  • Better can be very subjective. If this is purely academic, fine. But if your code works, don't "fix" it. Commented May 7, 2019 at 14:59
  • Just wondering: you are not doing anything that could be done better using SQL, right? Commented May 7, 2019 at 15:05

4 Answers


You can try it like this:

Map<TimeStamp, Optional<Obj>> result = 
         list.stream().collect(Collectors.groupingBy(
                                Obj::getTs,
                                Collectors.maxBy(Comparator.comparing(Obj::getGenerationTs))
         ));

A more complete option, as @Naman suggested in a comment:

list.stream().collect(Collectors.groupingBy(
                       Obj::getTs,
                       Collectors.maxBy(Comparator.comparing(Obj::getGenerationTs))
              )).values().stream()
                .filter(Optional::isPresent) 
                .map(Optional::get)
                .collect(Collectors.toList());

1 Comment

What OP might be looking for could be something like: return list.stream().collect(Collectors.groupingBy(Obj::getTs, Collectors.maxBy(Comparator.comparing(Obj::getGenerationTs)))).values().stream().filter(Optional::isPresent).map(Optional::get).collect(Collectors.toList());

You can certainly do it with a Stream, using a toMap collector and then taking the map's values:

Collection<Obj> objects = list.stream()
    .collect(Collectors.toMap(Obj::getTs,
                              Function.identity(),
                              (o1, o2) -> o1.getGenerationTs().isBefore(o2.getGenerationTs()) ? o2 : o1))
    .values();

List<Obj> listOfObjects = new ArrayList<>(objects);

Or even shorter:

List<Obj> result = list.stream()
        .collect(Collectors.collectingAndThen(
                Collectors.toMap(Obj::getTs,
                        Function.identity(),
                        (o1, o2) -> o1.getGenerationTs().isBefore(o2.getGenerationTs()) ? o2 : o1),
                m -> new ArrayList<>(m.values())));
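
A comment below mentions sorting the result by ts afterwards. As a minimal sketch of one alternative, the four-argument Collectors.toMap overload accepts a map supplier, so a TreeMap can keep the keys in ascending ts order while merging. The Obj class here is a hypothetical stand-in that uses java.time.Instant for both timestamps, since the question does not say what TimeStamp is; the class and helper names are also just for illustration.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class LatestPerTsTreeMap {

    // Hypothetical stand-in for the question's Obj (assumption: Instant timestamps).
    static final class Obj {
        private final Instant ts;
        private final Instant generationTs;
        private final int value;

        Obj(String ts, String generationTs, int value) {
            this.ts = Instant.parse(ts);
            this.generationTs = Instant.parse(generationTs);
            this.value = value;
        }

        Instant getTs() { return ts; }
        Instant getGenerationTs() { return generationTs; }
        int getValue() { return value; }
    }

    // Keeps one Obj per ts (the one with the latest generationTs), ordered by ts ascending.
    static List<Obj> dedup(List<Obj> list) {
        Map<Instant, Obj> latestByTs = list.stream()
                .collect(Collectors.toMap(
                        Obj::getTs,
                        Function.identity(),
                        // On a key collision, keep whichever object was generated later.
                        (o1, o2) -> o1.getGenerationTs().isBefore(o2.getGenerationTs()) ? o2 : o1,
                        // TreeMap iterates its keys in natural (ascending) order.
                        TreeMap::new));
        return new ArrayList<>(latestByTs.values());
    }

    // The sample data from the question.
    static List<Obj> sample() {
        return Arrays.asList(
                new Obj("2019-05-02T09:00:00Z", "2019-05-02T21:00:00Z", 1),
                new Obj("2019-05-02T09:30:00Z", "2019-05-02T21:00:00Z", 2),
                new Obj("2019-05-02T10:00:00Z", "2019-05-02T21:00:00Z", 3),
                new Obj("2019-05-02T10:30:00Z", "2019-05-02T21:00:00Z", 4),
                new Obj("2019-05-02T09:30:00Z", "2019-05-02T22:00:00Z", 5),
                new Obj("2019-05-02T10:00:00Z", "2019-05-02T22:00:00Z", 6));
    }

    // Helper that extracts the value field so results are easy to compare.
    static List<Integer> values(List<Obj> list) {
        return list.stream().map(Obj::getValue).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(values(dedup(sample()))); // prints [1, 5, 6, 4]
    }
}
```

With the default three-argument toMap the result is backed by a HashMap, whose iteration order is not specified, which is why an explicit sort or an ordered map is needed if the order matters.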

3 Comments

You might want to wrap the output of the stream pipeline in new ArrayList<>().
You can do this right in the collector: List<Obj> result = list.stream().collect( Collectors.collectingAndThen(Collectors.toMap(…), m -> new ArrayList<>(m.values())));
That solution is close to what I was starting to arrive at, only better. Thanks! I only added .sorted(Comparator.comparing(LoadCurvePointBO::getTs)) to sort it by ts ascending.

Below is one way of doing it.

Group on the first timestamp, then use maxBy to find the object with the latest generation timestamp. Finally, sort on the first timestamp and print the results.

The fact that maxBy will produce an Optional is a bit ugly, but I couldn't find a way to avoid it.

import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.maxBy;

import java.time.Instant;
import java.util.Optional;
import java.util.stream.Stream;

import org.junit.jupiter.api.Test;

public class SortTest {

@Test
public void t() {
    final Stream<Obj> s = Stream.of(new Obj("2019-05-02T09:00:00Z", "2019-05-02T21:00:00Z", 1),
            new Obj("2019-05-02T09:30:00Z", "2019-05-02T21:00:00Z", 2),
            new Obj("2019-05-02T10:00:00Z", "2019-05-02T21:00:00Z", 3),
            new Obj("2019-05-02T10:30:00Z", "2019-05-02T21:00:00Z", 4),
            new Obj("2019-05-02T09:30:00Z", "2019-05-02T22:00:00Z", 5),
            new Obj("2019-05-02T10:00:00Z", "2019-05-02T22:00:00Z", 6));

    s.collect(groupingBy(o -> o.ts, maxBy((o1, o2) -> o1.generationTs.compareTo(o2.generationTs))))
    .values()
    .stream()
    .map(Optional::get)
    .sorted((o1, o2) -> o1.ts.compareTo(o2.ts))
    .forEach(System.out::println);

}

private class Obj {
    Instant ts;
    Instant generationTs;
    int i;

    Obj(final String ts, final String generationTs, final int i) {
        this.ts = Instant.parse(ts);
        this.generationTs = Instant.parse(generationTs);
        this.i = i;
    }

    @Override
    public String toString() {
        return String.format("%s %s %d", ts, generationTs, i);
    }
}
}
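
Regarding the Optional produced by maxBy: one possible way to hide it, sketched here under assumptions, is to wrap maxBy in Collectors.collectingAndThen with Optional::get as the finisher. That is safe inside groupingBy because a group is only created once it has at least one element. The class below is a self-contained variant, not the test above; the getter and helper names are hypothetical.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class LatestPerTsGrouping {

    // Hypothetical stand-in for Obj, with getters so method references can be used.
    static final class Obj {
        private final Instant ts;
        private final Instant generationTs;
        private final int value;

        Obj(String ts, String generationTs, int value) {
            this.ts = Instant.parse(ts);
            this.generationTs = Instant.parse(generationTs);
            this.value = value;
        }

        Instant getTs() { return ts; }
        Instant getGenerationTs() { return generationTs; }
        int getValue() { return value; }
    }

    // Groups by ts, keeps the latest generationTs per group, then sorts by ts ascending.
    static List<Obj> dedup(List<Obj> list) {
        Map<Instant, Obj> latestByTs = list.stream()
                .collect(Collectors.groupingBy(
                        Obj::getTs,
                        // Unwrap the Optional right in the downstream collector;
                        // every group has at least one element, so get() cannot fail here.
                        Collectors.collectingAndThen(
                                Collectors.maxBy(Comparator.comparing(Obj::getGenerationTs)),
                                Optional::get)));
        return latestByTs.values().stream()
                .sorted(Comparator.comparing(Obj::getTs))
                .collect(Collectors.toList());
    }

    // The sample data from the question.
    static List<Obj> sample() {
        return Arrays.asList(
                new Obj("2019-05-02T09:00:00Z", "2019-05-02T21:00:00Z", 1),
                new Obj("2019-05-02T09:30:00Z", "2019-05-02T21:00:00Z", 2),
                new Obj("2019-05-02T10:00:00Z", "2019-05-02T21:00:00Z", 3),
                new Obj("2019-05-02T10:30:00Z", "2019-05-02T21:00:00Z", 4),
                new Obj("2019-05-02T09:30:00Z", "2019-05-02T22:00:00Z", 5),
                new Obj("2019-05-02T10:00:00Z", "2019-05-02T22:00:00Z", 6));
    }

    // Helper that extracts the value field so results are easy to compare.
    static List<Integer> values(List<Obj> list) {
        return list.stream().map(Obj::getValue).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(values(dedup(sample()))); // prints [1, 5, 6, 4]
    }
}
```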


If your list is sorted by ts and, within each ts, by generationTs descending (the snippet below sorts it that way), you can use a HashSet and Collection.removeIf() to remove all duplicate timestamps from that list:

list.sort(Comparator.comparing(Obj::getTs)
        .thenComparing(Comparator.comparing(Obj::getGenerationTs)
                .reversed()));

Set<TimeStamp> keys = new HashSet<>();
list.removeIf(o -> !keys.add(o.getTs()));

With this solution you don't have to create a new list; you just modify the list you have. The set records every ts that has already been seen. Because the list is sorted, the first object encountered for each ts is the newest one, so it is retained while the later ones are removed.

The result with the data you shared will be:

Obj[ts=2019-05-02T09:00:00Z, generationTs=2019-05-02T21:00:00Z, value=1]
Obj[ts=2019-05-02T09:30:00Z, generationTs=2019-05-02T22:00:00Z, value=5]
Obj[ts=2019-05-02T10:00:00Z, generationTs=2019-05-02T22:00:00Z, value=6]
Obj[ts=2019-05-02T10:30:00Z, generationTs=2019-05-02T21:00:00Z, value=4]

If you already have a sorted list, this solution should be one of the fastest.
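
For completeness, here is a self-contained sketch of the same sort-then-removeIf idea that can be run as is. It assumes an Obj class with java.time.Instant fields and the getters used above (the question does not show the real class), and it assumes the list is mutable, since removeIf throws UnsupportedOperationException on fixed-size lists such as the one returned by Arrays.asList.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class LatestPerTsInPlace {

    // Hypothetical stand-in for the question's Obj (assumption: Instant timestamps).
    static final class Obj {
        private final Instant ts;
        private final Instant generationTs;
        private final int value;

        Obj(String ts, String generationTs, int value) {
            this.ts = Instant.parse(ts);
            this.generationTs = Instant.parse(generationTs);
            this.value = value;
        }

        Instant getTs() { return ts; }
        Instant getGenerationTs() { return generationTs; }
        int getValue() { return value; }
    }

    // Sorts the list and removes, in place, every element whose ts was already seen.
    static void dedupInPlace(List<Obj> list) {
        // ts ascending; within the same ts, newest generationTs first.
        list.sort(Comparator.comparing(Obj::getTs)
                .thenComparing(Comparator.comparing(Obj::getGenerationTs).reversed()));

        // Set.add returns false when the ts is already present, so only the
        // first (newest) element per ts survives.
        Set<Instant> seen = new HashSet<>();
        list.removeIf(o -> !seen.add(o.getTs()));
    }

    // The sample data from the question, copied into a mutable ArrayList.
    static List<Obj> sample() {
        return new ArrayList<>(Arrays.asList(
                new Obj("2019-05-02T09:00:00Z", "2019-05-02T21:00:00Z", 1),
                new Obj("2019-05-02T09:30:00Z", "2019-05-02T21:00:00Z", 2),
                new Obj("2019-05-02T10:00:00Z", "2019-05-02T21:00:00Z", 3),
                new Obj("2019-05-02T10:30:00Z", "2019-05-02T21:00:00Z", 4),
                new Obj("2019-05-02T09:30:00Z", "2019-05-02T22:00:00Z", 5),
                new Obj("2019-05-02T10:00:00Z", "2019-05-02T22:00:00Z", 6)));
    }

    // Helper that extracts the value field so results are easy to compare.
    static List<Integer> values(List<Obj> list) {
        return list.stream().map(Obj::getValue).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Obj> list = sample();
        dedupInPlace(list);
        System.out.println(values(list)); // prints [1, 5, 6, 4]
    }
}
```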
