I'm working on indexing a text file: I want to print out every word in the file, together with its page numbers, in alphabetical order. I'm running into a problem with the alphabetical sort, though. Here is what I currently have:

public void addWord(String word, int num) {
    boolean match = false;
    for (IndexEntry x : this) {
        String i = x.getWord();
        if (i.toUpperCase().equals(word.toUpperCase())) {
            x.add(num);
            match = true;
        }
    }
    if (match == false) {
        IndexEntry entry = new IndexEntry(word);
        int add = 0;
        int count = 0;
        boolean spot = false;
        while (count < this.size() && !spot) {
            String str = this.get(count).getWord();
            if (str.compareTo(word) > 0) {
                add = count;
                spot = true;
            }
            count++;
        }
        this.add(add, entry);
        this.get(indexOf(entry)).add(num);
    }
}

The output of this is:

BLUE[5, 8]
BLACK[7]
NEW[11]
OLD[10]
RED[4]
TWO[2]
FISH[1, 2, 4, 5, 7, 8, 10, 11]
ONE[1]
Done.

This is clearly not in alphabetical order. Any help on this would be greatly appreciated. Thank you.

Here is IndexEntry:

import java.util.List;
import java.util.ArrayList;

public class IndexEntry implements Comparable<IndexEntry>
{
  private String word;
  private List<Integer> numsList;  // contains Integer objects

  /**
   *  Constructs an IndexEntry for a given word
   *  (converted to upper case); stores the word and
   *  creates an empty ArrayList<Integer> for numsList
   *  @param aWord the word for this entry
   */ 
  public IndexEntry(String aWord)
  {
     word = aWord.toUpperCase();
     numsList = new ArrayList<Integer>();
  }

  /**
   *  Returns word of this IndexEntry object
   *  @return this entry's word
   */
  public String getWord()
  {
    return word;
  }

  /**
   *  Adds num at the end of this IndexEntry's numsList if
   *  num is not already in the list; otherwise makes no changes.
   */
  public void add(int num)
  {
    if(numsList.contains(num) == false)
      numsList.add(num); 
  }

  /**
   *  Compares this entry for equality to another IndexEntry;
   *  the entries are considered equal if their words are
   *  the same
   *  @param obj the other IndexEntry to be compared
   *  @return true if the words match, otherwise false
   */
  public boolean equals(IndexEntry obj)
  {
    if(word.equals(obj.getWord()))
      return true;
    return false;
  }

  /**
   *  Compares this entry to another IndexEntry
   *  by comparing their words
   *  @param obj the other IndexEntry to be compared
   *  @return negative if 'this' entry is smaller, 0 if equal, positive if 'this' is larger
   */
  public int compareTo(IndexEntry obj)
  {
    return obj.getWord().compareTo(word);
  }

  /**
   *  Converts this IndexEntry into a string
   *  @return the String representation of this entry: word and line numbers
   */
  public String toString()
  {
    return word + numsList;
  }
}

And here is DocumentIndex, which contains addWord:

import java.util.StringTokenizer;

public class DocumentIndex extends java.util.ArrayList<IndexEntry>
{

  /**
   *  Creates an empty DocumentIndex with the default
   *  initial capacity
   */
  public DocumentIndex()
  {
    super();
  }

  /**
   *  Creates an empty DocumentIndex with the capacity
   *  given by the parameter
   *  @param init the initial capacity of the list
   */
  public DocumentIndex(int init)
  {
    super(init);
  }

 /**
  *  If word is in this DocumentIndex and num is in its list, does nothing; 
  *  if word is in this DocumentIndex and num is not in its list, adds num 
  *  to this word's IndexEntry; otherwise creates a new entry with word and
  *  num and inserts it into this index in order
  *  @param word the word to look for
  *  @param num the line number this word is on
  */

  public void addWord( String word, int num )
    {
        boolean match = false;
        for ( IndexEntry x : this ){
            String i = x.getWord();
            if (i.toUpperCase().equals(word.toUpperCase())){
                x.add(num);
                match = true;}}
        if (match == false){
            IndexEntry entry = new IndexEntry(word);
            int add = 0;
            int count = 0;
            boolean spot = false;
            while (count < this.size() && !spot){
                String str = this.get(count).getWord();
                if (str.compareTo(word) > 0){
                    add = count;
                    spot = true;}
                count++;}
            this.add(add, entry);
            this.get(indexOf(entry)).add(num);}
    }

  /**
   *  For each word found in str, calls addWord(word, num)
   *  @param str a line of text
   *  @param num the line number for this line of text
   */
  public void addAllWords(String str, int num)
  {
    StringTokenizer tokens = new StringTokenizer(str, " .,-;?!");
           // " .,-;?!" lists delimiters that separate words 

    while(tokens.hasMoreTokens())
    {
      String word = tokens.nextToken();
      addWord(word, num);
    }
  }
}

  • Not that it solves anything, but i.toUpperCase().equals(word.toUpperCase()) can be rewritten as i.equalsIgnoreCase(word). Commented Apr 30, 2015 at 22:06
  • @user3808597 Please update the code in your OP with everything someone from the outside would need to answer your question. Commented Apr 30, 2015 at 22:07

3 Answers

1

EDIT: You need to add the following lines after the while loop in addWord:

if ( !spot && (count == this.size())){
    add = count;
}

That fixes the error when I tried it at my end.

Also, I think the following version is a cleaner and more efficient way of writing the addWord() method:

public void addWord( String word, int num ) {
    String upperCaseWord = word.toUpperCase();

    for ( IndexEntry x : this ) {
        String i = x.getWord();
        if (i.equals(upperCaseWord)){
            x.add(num);
            return;
        }
    }

    IndexEntry entry = new IndexEntry(word);
    entry.add(num);

    int currSize = this.size();     
    if (currSize == 0) {
        this.add(entry);
        return;
    }   

    int count = 0;
    while (count < currSize) {
        String str = this.get(count).getWord();
        if (str.compareTo(upperCaseWord) > 0){
            break;
        }   

        count++;
    }   

    this.add(count, entry);
}   
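
As a side note, the linear scan for the insertion index could in principle be replaced by a binary search. The following is only a minimal sketch on a plain List<String> of upper-cased words, not on the DocumentIndex class from the question; the class name BinaryInsertDemo and the method insertIfAbsent are hypothetical names chosen for illustration. It relies on the documented contract of Collections.binarySearch, which returns (-(insertion point) - 1) when the key is absent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BinaryInsertDemo {
    // Hypothetical helper: inserts the upper-cased word into an already
    // sorted list. Returns the index used, or -1 if the word was present.
    public static int insertIfAbsent(List<String> sorted, String word) {
        String key = word.toUpperCase();
        int found = Collections.binarySearch(sorted, key);
        if (found >= 0) {
            return -1; // already in the list, nothing to insert
        }
        // binarySearch returns (-(insertion point) - 1) for a missing key.
        int insertionPoint = -(found + 1);
        sorted.add(insertionPoint, key);
        return insertionPoint;
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>();
        for (String w : new String[] {"one", "fish", "Fish", "two"}) {
            insertIfAbsent(index, w);
        }
        System.out.println(index); // [FISH, ONE, TWO]
    }
}
```

The insertion point equals the list size when the key is larger than every element, so the append-at-end case needs no special handling here.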

12 Comments

Could you post the code you use to print out the results?
It is in the toString method in IndexEntry.
I've changed my previous answer. Try out the modified fix.
Also, check out the revised version of addWord() I've added to the answer.
BLACK[7] BLUE[5, 8] NEW[11] OLD[10] ONE[1] FISH[1, 2, 4, 5, 7, 8, 10, 11] TWO[2] RED[4] Done.
0

Java's String.compareTo orders strings by the numeric values of their UTF-16 characters, so every uppercase letter sorts before every lowercase one. That is not dictionary order, which is what you want. Use a java.text.Collator to do alphabetical ordering.
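
A minimal sketch of what that could look like, assuming English collation rules are acceptable. The class name CollatorSortDemo and the method sortAlphabetically are hypothetical names used only for this illustration; Collator.getInstance, setStrength and Collator.SECONDARY are part of java.text.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class CollatorSortDemo {
    // Hypothetical helper: returns a copy of words in dictionary order.
    public static List<String> sortAlphabetically(List<String> words) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        // SECONDARY strength treats upper and lower case as equal.
        collator.setStrength(Collator.SECONDARY);
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(collator); // Collator is a Comparator<Object>
        return sorted;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("red", "Blue", "fish", "One");

        // Natural String order: uppercase letters come first.
        List<String> plain = new ArrayList<>(words);
        Collections.sort(plain);
        System.out.println(plain); // [Blue, One, fish, red]

        // Collator order: case is ignored.
        System.out.println(sortAlphabetically(words)); // [Blue, fish, One, red]
    }
}
```

Because IndexEntry already upper-cases every word, a Collator only changes the result if the comparison mixes cases or the text contains non-English letters.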

0

I believe the issue is that, while the alphabetical DocumentIndex is being built, addWord does not always know where to place the new word.


For example, here:

IndexEntry entry = new IndexEntry(word);
int add = 0;
int count = 0;
boolean spot = false;
while (count < this.size() && !spot)
{
    String str = this.get(count).getWord();
    if (str.compareTo(word) > 0)
    {
        add = count;
        spot = true;
    }

    count++;
}

this.add(add, entry);
this.get(indexOf(entry)).add(num);

What happens if the loop ends without ever finding an index at which to place word? Because add is initialized to 0, your code inserts word at the beginning (index 0) of DocumentIndex whenever no existing entry compares greater than word. That breaks the ordering: a word that is greater than or equal to every existing entry belongs at the end of the list, not at the beginning.


I believe you need to add a condition that checks whether word needs to be added to the end of DocumentIndex:

IndexEntry entry = new IndexEntry(word);
int add = 0;
int count = 0;
boolean spot = false;
while (count < this.size() && !spot)
{
    String str = this.get(count).getWord();
    if (str.compareTo(word.toUpperCase()) > 0) // entries store upper-cased words, so compare in the same case
    {
        add = count;
        spot = true;
    }

    count++;
}

if (spot) // If the loop "spotted" an index, let's insert the entry there.
{
    this.add(add, entry);
}
else // Otherwise, let's add it to the end of the ArrayList.
{
    this.add(entry);
}

this.get(indexOf(entry)).add(num);
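
The same idea can be tried in isolation. The following is a minimal sketch on a plain List<String> rather than on DocumentIndex itself; the class name SortedInsertDemo and the method insertSorted are hypothetical names used only for this illustration. It lets the scan position run up to the list size, so a word larger than every existing element is appended rather than placed at index 0.

```java
import java.util.ArrayList;
import java.util.List;

public class SortedInsertDemo {
    // Hypothetical helper: inserts the upper-cased word into an already
    // sorted list so that the list stays sorted.
    public static void insertSorted(List<String> sorted, String word) {
        String key = word.toUpperCase();
        int pos = 0;
        // Skip every element that is not greater than key.
        while (pos < sorted.size() && sorted.get(pos).compareTo(key) <= 0) {
            pos++;
        }
        // If no larger element exists, pos == sorted.size() and add() appends.
        sorted.add(pos, key);
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>();
        for (String w : new String[] {"one", "fish", "two", "red", "blue"}) {
            insertSorted(index, w);
        }
        System.out.println(index); // [BLUE, FISH, ONE, RED, TWO]
    }
}
```

Unlike addWord, this sketch does not check for duplicates; it only shows the insertion-position logic.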
