6

I am parsing a file that contains time-based entries. The format looks like this:

00:02:10-XYZ:Count=10
00:04:50-LMK:Count=3

I want to extract the time value from each line.

I searched through many links but couldn't find what I was looking for, so I eventually wrote this code:

    Pattern pattern = Pattern.compile("((?i)[0-9]{1,2}:??[0-9]{0,2}:??[0-9]{0,2})"); //(?i)[0-9]{1,2}:??[0-9]{0,2}:??[0-9]{0,2}  //\\d{1,2}:\\d{1,2}:\\d{1,2}
    Matcher matcher;
    List<String> listMatches;

Below is the loop where I apply the pattern:

    for(int x = 0; x < file_content.size(); x++)
    {
            matcher= pattern.matcher(file_content.get(x));
            listMatches = new ArrayList<String>();
            while(matcher.find())
            {
                listMatches.add(matcher.group(1));
                break;
            }
     }

When "matcher.find()" returns true, I want to get [00:02:10] in the first iteration and [00:04:50] in the second iteration.

2
  • Have you considered using SimpleDateFormat instead of writing your own regex? Commented Sep 26, 2013 at 13:39
  • No sir, I hadn't. Thanks for sharing your valuable code. Commented Sep 26, 2013 at 13:57

4 Answers

6

That seems like an unnecessarily complicated pattern. If you are processing the file line by line, why not just use:

"^(\\d\\d:\\d\\d:\\d\\d)"

If you are doing multi-line processing you will want to use:

"(?m)^(\\d\\d:\\d\\d:\\d\\d)"

Here's some example code and output:

public static void main(String[] args) {
    final Pattern pattern = Pattern.compile("(?m)^(\\d\\d:\\d\\d:\\d\\d)");
    final Matcher matcher = pattern.matcher("00:02:10-XYZ:Count=10\n00:04:50-LMK:Count=3");
    while(matcher.find())
    {
        System.out.printf("[%s]\n", matcher.group(1));
    }        
}

outputs

[00:02:10]
[00:04:50]
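
For the line-by-line case, here is a minimal sketch of how the first pattern might be applied to a list of lines like the one in the question. The class name `LineByLineTimes`, the method `extractTimes`, and the sample `fileContent` list are assumptions made for illustration; they are not from the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineByLineTimes {
    // Anchored at the start of the input; each line is matched on its own,
    // so the (?m) flag is not needed here.
    private static final Pattern TIME = Pattern.compile("^(\\d\\d:\\d\\d:\\d\\d)");

    // Collects the leading time of every line that starts with one.
    // The list is created once, outside the loop, so earlier matches are kept.
    static List<String> extractTimes(List<String> lines) {
        List<String> times = new ArrayList<>();
        for (String line : lines) {
            Matcher matcher = TIME.matcher(line);
            if (matcher.find()) {
                times.add(matcher.group(1));
            }
        }
        return times;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the lines read from the file.
        List<String> fileContent = Arrays.asList(
                "00:02:10-XYZ:Count=10",
                "00:04:50-LMK:Count=3");
        System.out.println(extractTimes(fileContent)); // [00:02:10, 00:04:50]
    }
}
```

Lines that do not begin with a time are simply skipped.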

1 Comment

What about input like 99:99:99? Your regex matches non-time values :( p.s. and why group the match? Did you know you can just get group zero?
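
To illustrate both points in this comment, here is a rough sketch of a pattern that restricts the fields to plausible clock ranges and reads the match through group zero. It assumes a 24-hour clock (hours 00-23, minutes and seconds 00-59); the class name `StrictTimePrefix` and the method `leadingTime` are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StrictTimePrefix {
    // Assumed ranges: hours 00-23, minutes 00-59, seconds 00-59.
    // The non-capturing group (?:...) avoids creating an unused capture.
    private static final Pattern TIME =
            Pattern.compile("^(?:[01]\\d|2[0-3]):[0-5]\\d:[0-5]\\d");

    // Returns the time at the start of the line, or null if there is none.
    static String leadingTime(String line) {
        Matcher matcher = TIME.matcher(line);
        // group() with no argument is group(0), i.e. the whole match.
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(leadingTime("00:02:10-XYZ:Count=10")); // 00:02:10
        System.out.println(leadingTime("99:99:99-XYZ:Count=10")); // null
    }
}
```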
4

I did it this way:

00:02:10-XYZ:Count=10
00:04:50-LMK:Count=3

// matcher, start_time and end_time are assumed to be declared elsewhere
Pattern pattern = Pattern.compile("([2][0-3]|[0-1][0-9]|[1-9]):[0-5][0-9]:([0-5][0-9]|[6][0])");
// File beginning time: the first line, scanning from the top, that contains a time
for (int x = 0; x < file_content.size(); x++)
{
    matcher = pattern.matcher(file_content.get(x));
    if (matcher.find())
    {
        start_time = matcher.group();
        break;
    }
}
// File end time: the first line, scanning from the bottom, that contains a time
// (x >= 0 so that the first line of the file is also checked)
for (int x = file_content.size() - 1; x >= 0; x--)
{
    matcher = pattern.matcher(file_content.get(x));
    if (matcher.find())
    {
        end_time = matcher.group();
        break;
    }
}


3

Don't use regex for this; use a SimpleDateFormat. This has two massive advantages:

  1. The code in SimpleDateFormat is tested and robust
  2. The SimpleDateFormat can validate that you have real time numbers (parsing is lenient by default, so call setLenient(false) to reject out-of-range values)

This would look something like this:

public static void main(String[] args) throws Exception {
    final String s = "00:02:10-XYZ:Count=10\n"
            + "00:04:50-LMK:Count=3";
    final Scanner sc = new Scanner(s);
    final SimpleDateFormat dateFormat = new SimpleDateFormat("HH:mm:ss");
    while(sc.hasNextLine()) {
        final String line = sc.nextLine();
        final Date date = dateFormat.parse(line);
        final Calendar calendar = Calendar.getInstance();
        calendar.setTime(date);
        System.out.println(calendar.get(Calendar.HOUR_OF_DAY)); // HOUR is the 12-hour field; HOUR_OF_DAY matches "HH"
        System.out.println(calendar.get(Calendar.MINUTE));
        System.out.println(calendar.get(Calendar.SECOND));
    }
}

Output:

0
2
10
0
4
50

From the javadoc for DateFormat.parse:

Parses text from the beginning of the given string to produce a date. The method may not use the entire text of the given string.

So the SimpleDateFormat parses the String until it has consumed the whole specified pattern, then stops and ignores the rest of the line.
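
One caveat is that `DateFormat` parsing is lenient by default, so a value such as 99:99:99 is rolled over into a valid date instead of being rejected. The following is a small sketch, not a definitive implementation, of strict prefix parsing with `setLenient(false)`. The class name `StrictTimeParse` and the method `parseLeadingTime` are hypothetical names chosen for this example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

class StrictTimeParse {
    // Returns {hour, minute, second} read from the start of the line,
    // or null if the line does not begin with a valid HH:mm:ss time.
    static int[] parseLeadingTime(String line) {
        SimpleDateFormat format = new SimpleDateFormat("HH:mm:ss");
        format.setLenient(false); // reject out-of-range fields such as 99:99:99
        try {
            Date date = format.parse(line); // text after the time is ignored
            Calendar calendar = Calendar.getInstance();
            calendar.setTime(date);
            return new int[] {
                calendar.get(Calendar.HOUR_OF_DAY),
                calendar.get(Calendar.MINUTE),
                calendar.get(Calendar.SECOND)
            };
        } catch (ParseException e) {
            return null; // not a valid time prefix
        }
    }

    public static void main(String[] args) {
        int[] t = parseLeadingTime("00:02:10-XYZ:Count=10");
        System.out.println(t[0] + " " + t[1] + " " + t[2]);
        System.out.println(parseLeadingTime("99:99:99-XYZ:Count=10") == null);
    }
}
```

A new `SimpleDateFormat` is created on each call because the class is not thread-safe; in single-threaded code it could be created once and reused.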

2 Comments

seems like it should be ("KK:mm:ss")
@Anirudh, you're right - it shouldn't be hh, but I think it should be HH.
3
// Note: "KK" is the 0-11 hour-in-am/pm field; "HH" is the 0-23 hour-of-day field.
SimpleDateFormat dateFormat = new SimpleDateFormat("KK:mm:ss");
Pattern pattern = Pattern.compile("\\d+:\\d+:\\d+");
Matcher matcher;
List<Date> listMatches = new ArrayList<Date>();
for (int x = 0; x < file_content.size(); x++)
{
    matcher = pattern.matcher(file_content.get(x));
    while (matcher.find())
    {
        Date temp = null;
        try {
            temp = dateFormat.parse(matcher.group(0));
        } catch (ParseException p) {
            // Not parseable as a time; skip this match.
        }
        if (temp != null)
            listMatches.add(temp);
    }
}
