1

I have multiple strings that are in the following format:
12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In]

From these strings I need to extract the date, time, first and last name of the person, and the card number. The word "Admitted" can be omitted, and anything following the final digit of the card number can be ignored.
I have a feeling I want to use StringTokenizer for this, but I'm not positive.
Any suggestions?

1
  • If this is a file you're reading from, I would be tempted to process it and save it in a second file, say in CSV format, that's easier to process. This is because field-relative information can contain spaces. Either that or change the way it's being encoded. Commented Dec 23, 2009 at 13:49

6 Answers

3

The String Tokenizer is great when you have a common delimiter, but in this case I'd opt for regular expressions.

1 Comment

So as an example for drawing out the date from the string, I'm trying the following: Pattern datePattern = Pattern.compile( "[0-9]{2}/[0-9]{2}/[0-9]{4}" ); Then using Matcher on the string, with that pattern, I get no result. How would I properly format this regular expression?
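The pattern in this comment is fine as written. One possible cause of the empty result (an assumption, since the calling code is not shown) is calling Matcher.matches(), which succeeds only when the entire string matches the pattern. Matcher.find() searches for the pattern anywhere in the input. A minimal sketch, using a hypothetical class name DateFind:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateFind {
    private static final Pattern DATE = Pattern.compile("[0-9]{2}/[0-9]{2}/[0-9]{4}");

    /** Returns the first MM/dd/yyyy-shaped substring, or null if there is none. */
    static String firstDate(String record) {
        Matcher m = DATE.matcher(record);
        // find() scans for a match anywhere in the input;
        // matches() would require the whole record to match the pattern.
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String record = "12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In]";
        System.out.println(firstDate(record));              // 12/18/2009
        System.out.println(DATE.matcher(record).matches()); // false
    }
}
```

Calling group() before a successful find() or matches() throws IllegalStateException, so the result of find() should always be checked first.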
2

Your record format is simple enough that I'd just use String's split method to get the date and time. As pointed out in the comments, having names that can contain spaces complicates things just enough that splitting the record by spaces won't work for every field. I used a regular expression to grab the other three pieces of information.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecordSummary {
    public static void main(String[] args) {
        String record1 = "12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In]";
        String record2 = "12/18/2009 02:08:26 Admitted Van Halen, Eddie (Card #222) at South Lobby [In]";
        String record3 = "12/18/2009 02:08:26 Admitted Thoreau, Henry David (Card #333) at South Lobby [In]";

        summary(record1);
        summary(record2);
        summary(record3);
    }

    public static void summary(String record) {
        // The date and time never contain spaces, so a plain split is enough for them.
        String[] tokens = record.split(" ");

        String date = tokens[0];
        String time = tokens[1];

        // Names can contain spaces, so capture them (and the card number) with groups.
        String regEx = "Admitted (.*), (.*) \\(Card #(.*)\\)";
        Pattern pattern = Pattern.compile(regEx);
        Matcher matcher = pattern.matcher(record);
        if (!matcher.find()) {
            // Without this check, group() would throw IllegalStateException.
            System.out.println("\nUnrecognized record: " + record);
            return;
        }

        String lastName = matcher.group(1);
        String firstName = matcher.group(2);
        String cardNumber = matcher.group(3);

        System.out.println("\nDate: " + date);
        System.out.println("Time: " + time);
        System.out.println("First Name: " + firstName);
        System.out.println("Last Name: " + lastName);
        System.out.println("Card Number: " + cardNumber);
    }
}

The regular expression "Admitted (.*), (.*) \\(Card #(.*)\\)" uses grouping parentheses to store the information you're trying to extract. The parentheses that exist in your record must be escaped.
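As a variant on the same idea, here is a minimal sketch that captures all five fields with one pattern and uses named groups (available since Java 7) so the group numbers do not have to be counted. The class name RecordPattern, the group names, and the choice to return null on a non-matching record are my own assumptions, not part of the answer above. Character classes such as [^,]+ replace .* so that each group stops at its own delimiter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RecordPattern {
    // [^,]+ stops the last name at the comma; [^(]+? stops the first name before " (Card #".
    private static final Pattern RECORD = Pattern.compile(
        "^(?<date>\\d{2}/\\d{2}/\\d{4}) (?<time>\\d{2}:\\d{2}:\\d{2}) \\w+ "
        + "(?<last>[^,]+), (?<first>[^(]+?) \\(Card #(?<card>\\d+)\\)");

    /** Returns {date, time, first, last, card}, or null when the record does not match. */
    static String[] parse(String record) {
        Matcher m = RECORD.matcher(record);
        if (!m.find()) {
            return null;
        }
        return new String[] {
            m.group("date"), m.group("time"), m.group("first"), m.group("last"), m.group("card")
        };
    }

    public static void main(String[] args) {
        String[] f = parse(
            "12/18/2009 02:08:26 Admitted Thoreau, Henry David (Card #333) at South Lobby [In]");
        // prints: 12/18/2009 | 02:08:26 | Henry David | Thoreau | 333
        System.out.println(String.join(" | ", f));
    }
}
```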

Running the code above gives me the following output:

Date: 12/18/2009
Time: 02:08:26
First Name: John
Last Name: Doe
Card Number: 111

Date: 12/18/2009
Time: 02:08:26
First Name: Eddie
Last Name: Van Halen
Card Number: 222

Date: 12/18/2009
Time: 02:08:26
First Name: Henry David
Last Name: Thoreau
Card Number: 333

3 Comments

Nice, but this breaks for names with spaces in them. For example "Van Halen, Eddie"
@Adriaan: Thanks for pointing that out. Real world data is such a pain sometimes! :) I changed my code to use regular expressions to pull out those pieces of data that were affected by the spaces in names.
Great answer. Might post a variant later on.
2

I'd go for java.util.Scanner. This code will get you started, though you should really use the Pattern form of the Scanner methods rather than the String form that I used.

import java.util.Scanner;

public class Main
{
    public static void main(String[] args)
        throws Exception
    {
        final String  str;
        final Scanner scanner;
        final String  date;
        final String  time;
        final String  word;
        final String  lastName;
        final String  firstName;

        str       = "12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In]";
        scanner   = new Scanner(str);
        date      = scanner.next("\\d+/\\d+/\\d+");
        time      = scanner.next("\\d+:\\d+:\\d+");
        word      = scanner.next();
        lastName  = scanner.next();
        firstName = scanner.next();
        System.out.println("date : " + date);
        System.out.println("time : " + time);
        System.out.println("word : " + word);
        System.out.println("last : " + lastName);
        System.out.println("first: " + firstName);
    }
}

1

A few things to keep in mind while you are parsing this line:

  • Last names can contain spaces, so read up to the comma (,)
  • First names can contain spaces too, so read up to the opening parenthesis

Because of this, I would work off of TofuBeer's answer and adjust the next() calls for the first and last name. Splitting the string on spaces is going to be messy because of the extra spaces inside the names.
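One way to make that adjustment, sketched under my own assumptions (the class name ScannerParse and the exact pattern are hypothetical, not taken from TofuBeer's answer): keep next() for the date, time, and the word "Admitted", then use Scanner.findInLine with a Pattern that reads the last name up to the comma and the first name up to the opening parenthesis. Scanner.match() then exposes the captured groups.

```java
import java.util.Scanner;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

class ScannerParse {
    // Group 1: last name (up to the comma). Group 2: first name (up to " (Card #").
    // Group 3: the digits of the card number.
    private static final Pattern NAME_AND_CARD =
        Pattern.compile("\\s*([^,]+),\\s*(.+?)\\s*\\(Card #(\\d+)\\)");

    /** Returns {date, time, first, last, card}; throws if the record has an unexpected shape. */
    static String[] parse(String record) {
        try (Scanner scanner = new Scanner(record)) {
            String date = scanner.next("\\d+/\\d+/\\d+");
            String time = scanner.next("\\d+:\\d+:\\d+");
            scanner.next(); // skip the word "Admitted"
            // findInLine ignores the whitespace delimiter, so names with spaces stay intact.
            if (scanner.findInLine(NAME_AND_CARD) == null) {
                throw new IllegalArgumentException("Unexpected record: " + record);
            }
            MatchResult m = scanner.match();
            return new String[] {date, time, m.group(2), m.group(1), m.group(3)};
        }
    }

    public static void main(String[] args) {
        String[] f = parse(
            "12/18/2009 02:08:26 Admitted Van Halen, Eddie (Card #222) at South Lobby [In]");
        // prints: 12/18/2009 | 02:08:26 | Eddie | Van Halen | 222
        System.out.println(String.join(" | ", f));
    }
}
```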

0

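Shortest regexp solution (with type conversion):

import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String stringToParse = "12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In] ";
// Groups: 1 = date and time, 2 = last name, 3 = first name, 4 = card number.
// (?:...) is non-capturing, so the repeated date/time parts do not shift the numbering.
Pattern pattern = Pattern.compile("((?:\\d{2}/){2}\\d{4}\\s(?:\\d{2}:){2}\\d{2})\\s\\w+\\s(.*),\\s(.*)\\s\\(Card #(\\d+)\\)");
Matcher matcher = pattern.matcher(stringToParse);
if (matcher.find()) {
    String lastName = matcher.group(2);
    String firstName = matcher.group(3);
    int cardNumber = Integer.parseInt(matcher.group(4));

    DateFormat df = new SimpleDateFormat("MM/dd/yyyy HH:mm:ss");
    Date date = df.parse(matcher.group(1)); // throws java.text.ParseException
}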

-1

Trust your guts... :) With StringTokenizer:

import java.io.*;
import java.util.StringTokenizer;

public class Test {
    public Test() {
    }

    public void execute(String str) {
        String date, time, firstName, lastName, cardNo;
        StringTokenizer st = new StringTokenizer(str, " ");
        date = st.nextToken();
        time = st.nextToken();
        st.nextToken(); // Admitted
        lastName = st.nextToken(",").trim();
        firstName = st.nextToken(",(").trim();
        st.nextToken("#"); // Card
        cardNo = st.nextToken(")#");
        System.out.println("date = " + date
                + "\ntime = " + time
                + "\nfirstName = " + firstName
                + "\nlastName = " + lastName
                + "\ncardNo = " + cardNo);
    }

    public static void main(String args[]) {
        Test t = new Test();
        String record1 = "12/18/2009 02:08:26 Admitted Doe, John (Card #111) at South Lobby [In]";
        String record2 = "12/18/2009 02:08:26 Admitted Van Halen, Eddie (Card #222) at South Lobby [In]";
        String record3 = "12/18/2009 02:08:26 Admitted Thoreau, Henry David (Card #333) at South Lobby [In]";
        t.execute(record1);
        t.execute(record2);
        t.execute(record3);
    }
}

1 Comment

Thanks but using StringTokenizer, how would I break the string up?
