Hi, I'm currently using Processing and learning Java. My code works its way through an ArrayList, finds the word that occurs most often, and prints it to the console. My code is below:

import java.util.Arrays;

ArrayList<String> words = new ArrayList<String>();
int[] occurrence = new int[2000];

void setup() {
  size(800, 480);
  smooth();

  String[] data = loadStrings("data/data.txt");
  Arrays.sort(data);

  // Put each word into the words ArrayList
  for (int i = 0; i < data.length; i++) {
    words.add(data[i]);
  }

  // For each position, count the word itself plus every later duplicate
  for (int i = 0; i < data.length; i++) {
    occurrence[i] = 1;
    for (int j = i + 1; j < data.length; j++) {
      if (data[i].equals(data[j])) {
        occurrence[i] = occurrence[i] + 1;
      }
    }
  }

  // Find the word with the highest count
  int max = 0;
  String most_talked = "";
  for (int i = 0; i < data.length; i++) {
    if (occurrence[i] > max) {
      max = occurrence[i];
      most_talked = data[i];
    }
  }

  println("The most talked keyword is " + most_talked + " occurring " + max + " times.");
}

I am wondering how I could alter it to also report the second most frequent word, the third, and so on.

I have looked into using a Map, as well as Collections.sort, but can't quite see how to move forward with this. I am fairly new to Java, so anything at all would be helpful.

2 Answers

Seems like Multisets from the Guava library would be perfect for this job. You could store all the words you've read into a Multiset and when you want to get occurrences (counts) out, you could simply iterate over the copy returned by Multisets.copyHighestCountFirst(myMultiset):

import com.google.common.collect.*;
...

// data contains the words from the text file
Multiset<String> myMultiset = ImmutableMultiset.copyOf(data);

for (String word : Multisets.copyHighestCountFirst(myMultiset).elementSet()) {
    System.out.println(word + ": " + myMultiset.count(word));
}

That should do it.
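
If you would rather not add a library, roughly the same ordering can be had with the Map and Collections.sort mentioned in the question. The following is only a sketch in plain Java, not part of the Guava API: the class name WordCounts and the methods count and byCountDescending are made up for illustration, and breaking ties alphabetically is an assumption of mine.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the names are illustrative only.
class WordCounts {
    // Build a word -> number of occurrences map.
    static Map<String, Integer> count(String[] data) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String word : data) {
            Integer seen = counts.get(word);
            counts.put(word, seen == null ? 1 : seen + 1);
        }
        return counts;
    }

    // Copy the entries into a list and sort it by count, highest first.
    // Assumption: words with equal counts are ordered alphabetically.
    static List<Map.Entry<String, Integer>> byCountDescending(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries =
            new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());
        Collections.sort(entries, new Comparator<Map.Entry<String, Integer>>() {
            public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
                int byCount = b.getValue().compareTo(a.getValue());
                return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
            }
        });
        return entries;
    }

    public static void main(String[] args) {
        // Stand-in for the lines read from the text file.
        String[] data = {"Tom", "Tom", "Dog", "fish"};
        for (Map.Entry<String, Integer> entry : byCountDescending(count(data))) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

The first entry of the sorted list is the most frequent word, the second entry is the runner-up, and so on. In a sketch, data would come from loadStrings as in the question.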

4 Comments

Do you know how I import Multisets?
I added the import statement, but more importantly, you need to download the Guava library (use the "guava-18.0.jar" link) and put it in your classpath.
I'm getting the error "cannot convert from immutableMultiset<String[]> to Multiset<String>". I retried using just Multiset.of, but it returns "The function of(String[]) does not exist".
@Nebbyyy Keep in mind that using a Java library will make it impossible to deploy as JavaScript through Processing.js, which is probably the best way of deploying Processing. Might not matter to you, but it's something to keep in mind.

The first thing that comes to mind is to save each word you have already seen in an auxiliary array and then, for each word you read, look it up in that array.

If it matches, increase a counter for that word (if there are many words, you can add an int[] to store the occurrences), and then just display each aux[index] together with its occurrence[index].

Example (only a sketch):
If the list is:

Tom Tom Dog fish

Then:

Aux[0] = Tom;
Aux[1] = Dog;
Aux[2] = fish;

and the occurrences for each are stored in the int array at the same index: Tom at index 0, Dog at 1, and fish at 2.
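
The scheme above could be sketched in plain Java roughly as follows. This is only an illustration under my own assumptions: the class name AuxArrayCount and the method tally are hypothetical, and both arrays are sized to data.length so they can never overflow.

```java
// Hypothetical class illustrating the auxiliary-array idea.
class AuxArrayCount {
    // Fills aux with each distinct word and occurrence with its count
    // at the same index. Returns the number of distinct words stored.
    static int tally(String[] data, String[] aux, int[] occurrence) {
        int used = 0;
        for (String word : data) {
            // Linear search for the word among those already stored.
            int index = -1;
            for (int i = 0; i < used; i++) {
                if (aux[i].equals(word)) {
                    index = i;
                    break;
                }
            }
            if (index == -1) {
                // First time we see this word: store it with a count of 1.
                aux[used] = word;
                occurrence[used] = 1;
                used++;
            } else {
                occurrence[index]++;
            }
        }
        return used;
    }

    public static void main(String[] args) {
        String[] data = {"Tom", "Tom", "Dog", "fish"};
        // Assumption: worst case is every word distinct, so data.length slots suffice.
        String[] aux = new String[data.length];
        int[] occurrence = new int[data.length];
        int used = tally(data, aux, occurrence);
        for (int i = 0; i < used; i++) {
            System.out.println(aux[i] + ": " + occurrence[i]);
        }
    }
}
```

The linear search makes this quadratic in the number of words, which is fine for a file of about a thousand words but slower than a map for much larger inputs.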

Hope it helps you!

3 Comments

The thing is, I'm pulling the words from a .txt file with around 1000+ words in it. Should I be looking to use a HashMap to do this?
Well, I'm also new here and I'm trying to help as much as I can. I would have done it the way I described. If you are worried about the size of the "words_used_array", you can give it a default size and then, if it overflows, create a copy of it with more capacity. Sorry, I couldn't do any better than that.
No, I really appreciate it :)
