3

I want to remove lines from a CSV file that contain the wrong date. The file should retain its header line in the process. I want to do all of this using Java 8 streams.

At first I came up with this:

try (Stream<String> linesUnfiltered = Files.lines(f.toPath(), StandardCharsets.UTF_8)) {
    Stream<String> firstLine = linesUnfiltered.limit(1);
    Stream<String> linesFiltered = linesUnfiltered
            .filter(e -> e.contains(sdfFileContent.format(fileDate)));
    Stream<String> result = Stream.concat(firstLine, linesFiltered);
    Files.write(f.toPath(), (Iterable<String>) result::iterator);
}

But this throws the exception java.lang.IllegalStateException: stream has already been operated upon or closed because linesUnfiltered is reused. The suggestion on the web is to use a Supplier<Stream<String>>, but my understanding is that the supplier would read the file for each supplier.get() call, which is not very efficient.

That's why I am asking whether there is another way that is more efficient than this. I am pretty certain that it should be possible to perform the two operations on the same stream...

EDIT:

It is NOT a duplicate, as the first item should not be removed. It should only be excluded from the filtering process but still be available in the result stream.

  • Use the skip(1) operator on the stream. Commented Feb 7, 2020 at 10:42
  • But skip(1) removes the first line from the result as well, which I don't want. Commented Feb 7, 2020 at 10:43
  • You basically want to do a stateful operation, which is something that clashes with wanting to use streams. You want special treatment for the first element, which means you'll have to resort to some kind of hack (they're ugly, like keeping a mutable boolean to check whether you're dealing with the first line or not). Commented Feb 7, 2020 at 10:47
  • What about splitting the stream into 2 new streams where one only contains the first line and the second contains everything else? Commented Feb 7, 2020 at 10:55
  • @XtremeBaumer you can't really split streams Commented Feb 7, 2020 at 10:57

2 Answers

6

You can use a reader and call its readLine method to consume the header, then filter on the result of lines() (after consuming the first line from the same reader):

try (BufferedReader reader = Files.newBufferedReader(f.toPath(), 
                                  StandardCharsets.UTF_8)) {

    Stream<String> firstLine = Stream.of(reader.readLine());
    Stream<String> linesFiltered = reader.lines()
            .filter(e -> e.contains(sdfFileContent.format(fileDate)));
    Stream<String> result = Stream.concat(firstLine, linesFiltered);

    ...
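
For reference, here is a minimal, self-contained sketch of how the reader-based approach above could be completed. The class name CsvHeaderFilter, the method keepHeaderAndMatching and the dateText parameter are hypothetical names chosen for illustration (dateText stands in for sdfFileContent.format(fileDate)). The sketch also assumes the result should be written back to the same file, so it collects the lines into a list first: Files.write truncates the target by default, and the lazily read lines might otherwise be lost while the reader is still consuming them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CsvHeaderFilter {

    // Hypothetical helper: keeps the header line plus every later line containing dateText.
    public static void keepHeaderAndMatching(Path file, String dateText) throws IOException {
        List<String> kept;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            // Consume the header directly from the reader; readLine() returns null for an empty file.
            String header = reader.readLine();
            Stream<String> firstLine = header == null ? Stream.<String>empty() : Stream.of(header);

            // lines() continues from the reader's current position, i.e. after the header.
            Stream<String> linesFiltered = reader.lines()
                    .filter(line -> line.contains(dateText));

            // Materialize the result before the file is reopened for writing.
            kept = Stream.concat(firstLine, linesFiltered).collect(Collectors.toList());
        }
        Files.write(file, kept, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Sample data invented for this demo.
        Path tmp = Files.createTempFile("demo", ".csv");
        Files.write(tmp, Arrays.asList("date,value", "2020-02-07,1", "2020-02-06,2", "2020-02-07,3"),
                StandardCharsets.UTF_8);
        keepHeaderAndMatching(tmp, "2020-02-07");
        Files.readAllLines(tmp, StandardCharsets.UTF_8).forEach(System.out::println);
        Files.delete(tmp);
    }
}
```

Collecting into a list trades memory for safety. For very large files, writing to a temporary file and moving it over the original afterwards would be an alternative.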

2 Comments

Nice solution that avoids the statefulness problem.
Not quite what I hoped for, but definitely good enough!
1

You can convert the Stream to an Iterator, take the first element, then convert back.

try (Stream<String> linesUnfiltered = Files.lines(f.toPath(), StandardCharsets.UTF_8)) {
    Iterator<String> it = linesUnfiltered.iterator();
    String firstLine = it.next();
    Stream<String> otherLines = StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, 0), false);
    Stream<String> linesFiltered = otherLines
            .filter(e -> e.contains(sdfFileContent.format(fileDate)));
    Stream<String> result = Stream.concat(Stream.of(firstLine), linesFiltered);
    Files.write(f.toPath(), (Iterable<String>) result::iterator);
}
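
The iterator idea above can also be wrapped in a small reusable helper. This is only a sketch: FirstElementFilter and filterAfterFirst are hypothetical names, and the hasNext() check is an addition so that an empty input does not fail with NoSuchElementException on it.next(). The caller is assumed to keep the source stream in its own try-with-resources, since the returned stream does not close it.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class FirstElementFilter {

    // Hypothetical helper: passes the first element through untouched and filters the rest.
    public static <T> Stream<T> filterAfterFirst(Stream<T> source, Predicate<? super T> predicate) {
        Iterator<T> it = source.iterator();
        if (!it.hasNext()) {
            return Stream.empty();              // nothing to keep for an empty source
        }
        T first = it.next();                    // e.g. the CSV header
        // Wrap the remaining elements of the same iterator in a new sequential stream.
        Stream<T> rest = StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
        return Stream.concat(Stream.of(first), rest.filter(predicate));
    }

    public static void main(String[] args) {
        // Sample data invented for this demo.
        List<String> result = filterAfterFirst(
                Stream.of("header", "a1", "b", "a2"), s -> s.startsWith("a"))
                .collect(Collectors.toList());
        System.out.println(result);             // prints [header, a1, a2]
    }
}
```

When the source comes from Files.lines on the file that is being rewritten, it is probably safer to collect the result before calling Files.write, because Files.write truncates the target by default while the lines are still being read lazily.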
