I have some problems storing values in an ArrayList. The purpose of the program is to read one text file (A), read another text file (B), and then compute what percentage of the vocabulary from A is covered by B. For this reason, I store every word that occurs in both files in neuS. And here comes the problem: when I print the output, some values seem to be stored a random number of times! For example, I get output like:
elektrotechnik und
die bedeutendste
die bedeutendste
und simulation
erleben die
eine form
eine form
So some words (more precisely n-grams, because I always store two words together) appear only once in neuS, but others appear twice. I have also seen output where the same entry appears three times. I want every word to be stored only once in neuS. What am I doing wrong? The code isn't complete; I left out some code that I suppose is irrelevant to this issue.
Thanks!
BufferedReader in = new BufferedReader(new FileReader("informatik_test.txt"));
String str;
//
while ((sCurrentLine = in.readLine()) != null) {
    // System.out.println(sCurrentLine);
    arr = sCurrentLine.split(" ");
    for (int i = 0; i < arr.length - 1; i = i + 2) {
        String s = (arr[i].toString() + " " + arr[i + 1].toString())
                .toLowerCase();
        if (null == (hash.get(s))) {
            hash.put(s, 1);
        } else {
            int x = hash.get(s) + 1;
            hash.put(s, x);
        }
    }
    //
    ArrayList<String> words = new ArrayList<String>();
    ArrayList<String> neuS = new ArrayList<String>();
    ArrayList<Long> neuZ = new ArrayList<Long>();
    // Read all lines from a file
    for (String line = br.readLine(); line != null; line = br.readLine()) {
        String h[] = line.split(" ");
        words.add(h[0].toLowerCase());
    }
    //
    for (String x : hash.keySet()) {
        summe = summe + hash.get(x);
        long neu = hash.get(x);
        for (String s : words) {
            if (x.equals(s)) {
                neuS.add(x);
                neuZ.add(neu);
                disc = disc + 1;
            }
        }
    }
    // Testing which word for output -->! THE PROBLEM!!
    for (String m : neuS) {
        System.out.println(m);
    }
}
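
One possible cause, sketched under an assumption the snippet does not confirm: if `words` (the entries read from file B) contains the same entry more than once, the inner loop over `words` adds the key `x` to neuS once per matching entry. The example below reproduces that effect with made-up data and contrasts it with a `HashSet` lookup that adds each key at most once. The class name `NgramDedup`, both method names, and the sample data are hypothetical and only serve as illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class NgramDedup {

    // Mirrors the nested loop from the question: one add per matching entry in words,
    // so a key is added as many times as it occurs in words.
    public static List<String> matchWithDuplicates(Map<String, Integer> hash, List<String> words) {
        List<String> neuS = new ArrayList<>();
        for (String x : hash.keySet()) {
            for (String s : words) {
                if (x.equals(s)) {
                    neuS.add(x);
                }
            }
        }
        return neuS;
    }

    // Alternative: copy words into a Set first, then test membership once per key.
    // Each key of hash is added at most once, however often it occurs in words.
    public static List<String> matchOnce(Map<String, Integer> hash, List<String> words) {
        Set<String> vocabulary = new HashSet<>(words);
        List<String> neuS = new ArrayList<>();
        for (String x : hash.keySet()) {
            if (vocabulary.contains(x)) {
                neuS.add(x);
            }
        }
        return neuS;
    }

    public static void main(String[] args) {
        // Made-up bigram counts standing in for file A.
        Map<String, Integer> hash = new LinkedHashMap<>();
        hash.put("die bedeutendste", 3);
        hash.put("und simulation", 1);
        hash.put("eine form", 2);

        // Made-up entries standing in for file B; "die bedeutendste" occurs twice.
        List<String> words = Arrays.asList("die bedeutendste", "eine form", "die bedeutendste");

        System.out.println(matchWithDuplicates(hash, words));
        System.out.println(matchOnce(hash, words));
    }
}
```

If the counts in neuZ and the counter disc should also reflect each n-gram only once, they would need to be updated inside the same single membership check rather than inside a loop over `words`.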