0

Basically I have to do the following:

1) Read the CSV input file.
2) Filter the CSV data based on the blacklist.
3) Sort the input based on the country names in ascending order.
4) Print the records which are not blacklisted.

The CSV file contains:

id,country-short,country
1,AU,Australia
2,CN,China
3,AU,Australia
4,CN,China

The blacklist file contains:

AU
JP

And the desired output is:

2,CN,China
4,CN,China

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamProcessing {
    public static void filterCsv(String fileName, String blacklistFile){


         try (Stream<String> stream1 = Files.lines(Paths.get(fileName))) {
             Stream<String> stream2 = Files.lines(Paths.get(blacklistFile));
             Optional<String> hasBlackList = stream1.filter(s->s.contains(stream2)).findFirst();
         } catch (IOException e) {
             e.printStackTrace();
         }
    }   



    public static void main(String args[])
    {
        StreamProcessing sp = new StreamProcessing();
        sp.filterCsv("Data.csv","blacklist.txt");
    }
}

How can I remove from the first Stream the entries that are present in the second Stream, without converting the second Stream into an array?

4
  • What exactly are you trying to do? Do you want to find if the first Stream contains a String that appears in the second Stream? Commented Sep 15, 2016 at 6:39
  • Try to add some information to the question, such as whether the CSV and TXT files have the same structure, and whether you really need the first entry or just the information that there is a 'violation'. Commented Sep 15, 2016 at 6:47
  • Please read stackoverflow.com/help/how-to-ask. Tell us what you already tried and what is not working. Commented Sep 15, 2016 at 8:19
  • Basically I have to do the following: 1)Read the CSV input file. 2)Filter the CSV data based on the blacklist. 3)Sort the input based on the country names in ascending order. 4)Print the records which are not blacklisted. The CSV file contains:id,country-short,country 1,AU,Australia 2,CN,China 3,AU,Australia 4,CN,China The blacklist file contains:AU JP And the desired output is 2,CN,China 4,CN,China Commented Sep 16, 2016 at 13:47

2 Answers

3

You can consume a stream only once. Since you need access to all the members of the blacklist while evaluating each member of the main file, you must first consume the blacklist stream in its entirety. For efficiency reasons, collect it into a HashSet rather than an array.

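// Needs: import static java.util.stream.Collectors.toSet;
boolean hasBlacklistedWord(String fileName, String blacklistFile) throws IOException {
    // try-with-resources closes both files once the check is done
    try (Stream<String> blacklistLines = Files.lines(Paths.get(blacklistFile));
         Stream<String> lines = Files.lines(Paths.get(fileName))) {
        Set<String> blacklist = blacklistLines.collect(toSet());
        return lines.anyMatch(blacklist::contains);
    }
}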
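The method above only tests whether a whole line of the main file equals a blacklist entry. For the task as described in the question (drop rows whose country code is blacklisted, sort by country name, print the rest), the same Set can be applied per column. What follows is only a sketch under some assumptions that the question does not confirm: the first CSV line is a header, fields are plain comma-separated values without quoting, and the blacklist codes are separated by whitespace or line breaks. The class name `BlacklistFilter` and the helpers `filterAndSort` and `readBlacklist` are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class BlacklistFilter {

    // Hypothetical helper: keeps rows whose second column is not blacklisted,
    // sorted by the third column (country name).
    static List<String> filterAndSort(Stream<String> csvLines, Set<String> blacklist) {
        return csvLines
                .skip(1)                                    // assumption: first line is the header
                .filter(line -> !line.trim().isEmpty())     // ignore blank lines
                .map(line -> line.split(",", -1))
                .filter(cols -> cols.length >= 3)           // ignore malformed rows
                .filter(cols -> !blacklist.contains(cols[1].trim()))
                .sorted(Comparator.comparing((String[] cols) -> cols[2].trim()))
                .map(cols -> String.join(",", cols))
                .collect(Collectors.toList());
    }

    // Hypothetical helper: reads the blacklist codes into a Set.
    // Assumption: codes are separated by whitespace and/or line breaks.
    static Set<String> readBlacklist(String blacklistFile) throws IOException {
        try (Stream<String> lines = Files.lines(Paths.get(blacklistFile))) {
            return lines.flatMap(line -> Stream.of(line.trim().split("\\s+")))
                        .filter(code -> !code.isEmpty())
                        .collect(Collectors.toSet());
        }
    }

    // File-based entry point with the same signature as the question's filterCsv.
    public static void filterCsv(String fileName, String blacklistFile) throws IOException {
        Set<String> blacklist = readBlacklist(blacklistFile);
        try (Stream<String> lines = Files.lines(Paths.get(fileName))) {
            filterAndSort(lines, blacklist).forEach(System.out::println);
        }
    }

    public static void main(String[] args) {
        // In-memory copy of the sample data from the question, so no files are needed.
        Stream<String> sample = Stream.of(
                "id,country-short,country",
                "1,AU,Australia", "2,CN,China", "3,AU,Australia", "4,CN,China");
        Set<String> blacklist = new HashSet<>(Arrays.asList("AU", "JP"));
        filterAndSort(sample, blacklist).forEach(System.out::println);
    }
}
```

With real files you would call `BlacklistFilter.filterCsv("Data.csv", "blacklist.txt")` instead. The blacklist is still read completely before the CSV is streamed, so each row costs a single Set lookup.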

1 Comment

This is the place where we should praise the Stream API for forcing the developer to collect into a Set instead of allowing an n×m operation…
1

Unfortunately, you can't use a stream more than once, so code that checks the blacklist stream for each line will fail. The easiest solution is to store the blacklist in a collection and then check each line against it.

// Both calls can throw IOException; fileName is the CSV path from the question
List<String> blacklist = Files.readAllLines(Paths.get(blacklistFile));
boolean hasBlacklist = Files.lines(Paths.get(fileName)).anyMatch(blacklist::contains);
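To illustrate why the blacklist has to live in a collection, here is a small sketch of what happens when one Stream object is used twice. The Stream documentation only says an implementation may throw `IllegalStateException` on reuse; the standard JDK streams do so. The class name `StreamReuseDemo` and the method `secondUseFails` are invented for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class StreamReuseDemo {

    // Returns true if a second terminal operation on the same Stream is rejected.
    static boolean secondUseFails() {
        Stream<String> codes = Stream.of("AU", "JP");
        codes.anyMatch("AU"::equals);           // first terminal operation consumes the stream
        try {
            codes.anyMatch("JP"::equals);       // second use of the same stream object
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second use rejected: " + secondUseFails());

        // A collection, by contrast, hands out a fresh stream on every call.
        List<String> blacklist = Arrays.asList("AU", "JP");
        System.out.println(blacklist.stream().anyMatch("AU"::equals));
        System.out.println(blacklist.stream().anyMatch("JP"::equals));
    }
}
```

This is why checking the blacklist stream inside the per-line filter cannot work, while a List or Set can be queried as often as needed.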

1 Comment

Perhaps a HashSet would be a more efficient choice than a List for repeatedly looking up items.
