I have the string below, which I got from opening a file containing a list of US states and their capitals:

    String text = "US_states"; // text file with US states and capitals

    byte[] buffer = null;

    InputStream is;
    try {
        is = getAssets().open(text);
        int size = is.available(); //size of the file in bytes
        buffer = new byte[size]; //declare the size of the byte array with size of the file
        is.read(buffer); //read file
        is.close(); //close file

    } catch (IOException e) {
        e.printStackTrace();
    }

    String str_data = new String(buffer); // Store text file data in the string variable
    }

Now I'd like to parse this string and insert it into a Map object (Map m = new HashMap();), but I am not sure how to parse/split the various elements:

State Capital ---------------- --------------- Alabama Montgomery Alaska Juneau Arizona Phoenix Arkansas Little Rock California Sacramento Colorado Denver Connecticut Hartford Delaware Dover Florida Tallahassee Georgia Atlanta Hawaii Honolulu Idaho Boise Illinois Springfield Indiana Indianapolis Iowa Des Moines Kansas Topeka Kentucky Frankfort Louisiana Baton Rouge Maine Augusta Maryland Annapolis Massachusetts Boston Michigan Lansing Minnesota Saint Paul Mississippi Jackson Missouri Jefferson City Montana Helena Nebraska Lincoln Nevada Carson City New Hampshire Concord New Jersey Trenton New Mexico Santa Fe New York Albany North Carolina Raleigh North Dakota Bismarck Ohio Columbus Oklahoma Oklahoma City Oregon Salem Pennsylvania Harrisburg Rhode Island Providence South Carolina Columbia South Dakota Pierre Tennessee Nashville Texas Austin Utah Salt Lake City Vermont Montpelier Virginia Richmond Washington Olympia West Virginia Charleston Wisconsin Madison Wyoming Cheyenne

3 Answers

0

You cannot write an efficient algorithm on highly disorganised data and expect results. Data organisation is as important as any data processing algorithm.

Step 1 should be organising your data. When you parse any plain-text data, keeping the following two things in mind will help your parsing algorithm work effectively.

  1. Delimiter - A character used as a boundary to separate two consecutive values, e.g. a comma (,).

For example: Alabama, Montgomery, Louisiana, Baton Rouge,

  2. Field Qualifier (Optional) - A character that encloses any value containing special characters, such as a space or the delimiter itself, so that the value is not split unintentionally. For example, Baton Rouge doesn't end up as Baton, Rouge.

With a qualifier you can have

"Alabama", "Montgomery", "Louisiana", "Baton Rouge"

Once your data is in that format, you can simply apply String's split method and proceed from there.
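As a rough sketch of that split-based approach, assuming the data has been rewritten as comma-delimited, double-quoted values that alternate state, capital: the class and method names here (QuotedCsvCapitals, parse, unquote) are made up for illustration, and the code assumes no value contains a comma or a double quote. If values can contain the delimiter, a proper CSV parser would be a safer choice than split.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original question or answer.
final class QuotedCsvCapitals {

    // Parses text such as: "Alabama", "Montgomery", "Louisiana", "Baton Rouge"
    // Assumption: values alternate state, capital and contain no comma or double quote.
    static Map<String, String> parse(String data) {
        Map<String, String> capitals = new LinkedHashMap<>();
        String[] fields = data.split(",");
        // Step two fields at a time: state first, then its capital.
        for (int i = 0; i + 1 < fields.length; i += 2) {
            capitals.put(unquote(fields[i]), unquote(fields[i + 1]));
        }
        return capitals;
    }

    // Trims surrounding whitespace and removes one pair of enclosing double quotes, if present.
    static String unquote(String field) {
        String trimmed = field.trim();
        if (trimmed.length() >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
            return trimmed.substring(1, trimmed.length() - 1);
        }
        return trimmed;
    }

    public static void main(String[] args) {
        String data = "\"Alabama\", \"Montgomery\", \"Louisiana\", \"Baton Rouge\"";
        System.out.println(parse(data)); // prints {Alabama=Montgomery, Louisiana=Baton Rouge}
    }
}
```

A LinkedHashMap is used only so the printed order matches the input order; a plain HashMap works just as well for lookups.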

One word of caution: when you use a String as a key, Alabama and alabama are treated as two separate keys. It is better to store the keys in a single case (all uppercase or all lowercase) so that each state has exactly one key.

As for HashMap and how to use it, I hope this Map Tutorial helps.
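One possible way to apply that advice is to normalise every key before it touches the map. This is only a sketch: the wrapper class CaseInsensitiveCapitals and its methods are hypothetical names, and lowercasing with Locale.ROOT is one reasonable choice among several.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical wrapper that stores every state name in lowercase.
final class CaseInsensitiveCapitals {
    private final Map<String, String> capitals = new HashMap<>();

    // Locale.ROOT keeps the lowercasing independent of the device's language settings.
    private static String normalize(String state) {
        return state.trim().toLowerCase(Locale.ROOT);
    }

    void put(String state, String capital) {
        capitals.put(normalize(state), capital);
    }

    String get(String state) {
        return capitals.get(normalize(state));
    }

    int size() {
        return capitals.size();
    }

    public static void main(String[] args) {
        CaseInsensitiveCapitals map = new CaseInsensitiveCapitals();
        map.put("Alabama", "Montgomery");
        map.put("alabama", "Montgomery"); // same key after normalisation
        System.out.println(map.size());         // prints 1
        System.out.println(map.get("ALABAMA")); // prints Montgomery
    }
}
```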

Comments

0

Here the data (state and capital) does not follow a regular pattern. The only thing separating the values is whitespace, but that is not enough because some states and some capitals themselves contain whitespace (e.g. South Carolina: 2 words, Salt Lake City: 3 words).
So, you cannot use a simple regex to parse the data.
If you want to handle that data, you need a separator character between the individual values (; for example) that is not a whitespace character, since whitespace is already used inside some state and capital names.

Without that, you are stuck...
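To illustrate the point, here is a small sketch, assuming the file were rewritten with ; between every value. The class SeparatorDemo and its methods are hypothetical names. It shows that whitespace splitting yields an ambiguous token count, while a dedicated separator keeps multi-word names intact.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not taken from the question.
final class SeparatorDemo {

    // Counts the tokens produced by splitting on single spaces.
    static int whitespaceTokens(String data) {
        return data.split(" ").length;
    }

    // Parses "state;capital;state;capital..." into a map.
    // Assumption: no state or capital name contains a semicolon.
    static Map<String, String> parse(String data) {
        Map<String, String> capitals = new HashMap<>();
        String[] values = data.split(";");
        for (int i = 0; i + 1 < values.length; i += 2) {
            capitals.put(values[i].trim(), values[i + 1].trim());
        }
        return capitals;
    }

    public static void main(String[] args) {
        // One state and one capital, yet three tokens: the pairing is ambiguous.
        System.out.println(whitespaceTokens("South Carolina Columbia")); // prints 3

        Map<String, String> capitals = parse("South Carolina;Columbia;Utah;Salt Lake City");
        System.out.println(capitals.get("South Carolina")); // prints Columbia
        System.out.println(capitals.get("Utah"));           // prints Salt Lake City
    }
}
```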

Comments

0

Step (1): Split the string (which contains the states & capitals) on whitespace (the delimiter)

Step (2): Collect the split result into an array

Step (3): Iterate over the array to collect the states & capitals into the HashMap

  Map<String, String> capitals = new HashMap<>();
  // Split on the delimiter " " to get all the elements.
  // Note: the pairing below is only correct if no state or capital contains the delimiter.
  String[] stateCapsArray = str_data.split(" ");
  // Step through the array two elements at a time: state first, then its capital
  for (int i = 0; i + 1 < stateCapsArray.length; i += 2) {
      String state = stateCapsArray[i];
      String capital = stateCapsArray[i + 1];
      capitals.put(state, capital);
  }

1 Comment

Thank you. As per the comments from others, I formatted the input file to use ":" as a delimiter and used your code to collect the states and capitals. Works great.
