9

I'm trying to extract two dates from a string using a regex, but for some reason the regex doesn't match them. This is my code:

private  String[] getDate(String desc) {
    int count=0;
    String[] allMatches = new String[2];
    Matcher m = Pattern.compile("(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\\d\\d(?:,)").matcher(desc);
    while (m.find()) {
        allMatches[count] = m.group();
    }
    return allMatches;
}

My string desc is "coming from the 11/25/2009 to the 11/30/2009", and I get back an array containing only null values.

2
  • 3
    Your regex is for the format dd-MM-yyyy, and the string has MM-dd-yyyy. Solution: you need a consistent format everywhere, which is not always possible with user input. You can't accept both, because you wouldn't know what 01-02-2013 represents... Commented Sep 3, 2013 at 11:38
  • +1 Kobi. You should also not forget to increment count. Commented Sep 3, 2013 at 11:40
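
To make the ambiguity in the first comment concrete, here is a minimal sketch that parses the same string, 01-02-2013, under both conventions and gets two different dates. The class and method names are illustrative and not from the thread.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class AmbiguousDateDemo {
    // Interprets the text as day-month-year.
    static LocalDate parseDayFirst(String text) {
        return LocalDate.parse(text, DateTimeFormatter.ofPattern("dd-MM-uuuu"));
    }

    // Interprets the same text as month-day-year.
    static LocalDate parseMonthFirst(String text) {
        return LocalDate.parse(text, DateTimeFormatter.ofPattern("MM-dd-uuuu"));
    }

    public static void main(String[] args) {
        String input = "01-02-2013";
        System.out.println(parseDayFirst(input));   // prints 2013-02-01
        System.out.println(parseMonthFirst(input)); // prints 2013-01-02
    }
}
```

Both parses succeed, so nothing in the string itself tells you which reading was intended.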

5 Answers 5

15

Your regex matches day first and then month (DD/MM/YYYY), while your inputs start with month and then day (MM/DD/YYYY).

Moreover, your dates must be followed by a comma to be matched (the (?:,) part).

This one should suit your needs:

(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d
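
As a rough sketch of how this pattern could be used from Java, the following collects every match into a list instead of a fixed-size array. The class and method names are made up for illustration, and the backslashes are doubled because the pattern sits in a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MonthFirstDateExtractor {
    // Month-first pattern (MM/dd/yyyy) from the answer above, with no trailing comma group.
    private static final Pattern MM_DD_YYYY = Pattern.compile(
            "(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\\d\\d");

    // Returns every substring of text that matches the pattern, in order of appearance.
    static List<String> extractDates(String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = MM_DD_YYYY.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // prints [11/25/2009, 11/30/2009]
        System.out.println(extractDates("coming from the 11/25/2009 to the 11/30/2009"));
    }
}
```

Using a list avoids having to track an index by hand, which is where the original code went wrong.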

7

Three problems:

1) Your input uses the format MM/dd/yyyy, whereas your regex expects dd/MM/yyyy.

2) You forgot to increment count in the while loop.

3) The (?:,) part at the end of the regex requires a comma after each date, which your input does not have; remove it.

This code works on my computer (note that the test input is in dd/MM/yyyy order, to match the regex):

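import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static String[] getDate(String desc) {
  int count = 0;
  String[] allMatches = new String[2];
  // Day-first pattern (dd/MM/yyyy), without the trailing (?:,) that required a comma.
  Matcher m = Pattern.compile("(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\\d\\d").matcher(desc);
  // Stop after two matches so the fixed-size array cannot overflow.
  while (count < allMatches.length && m.find()) {
    allMatches[count] = m.group();
    count++;
  }
  return allMatches;
}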

Test:

public static void main(String[] args) throws Exception{
  String[] dates = getDate("coming from the 25/11/2009 to the 30/11/2009");

  System.out.println(dates[0]);
  System.out.println(dates[1]);

}

Output:

25/11/2009
30/11/2009


5

You've got the month and day of the month backwards, and (?:,) is requiring a comma at the end of each date. Try this instead:

(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\\d\\d


0

This is a date pattern recognition algorithm that not only identifies date patterns but also returns the probable date in Java date format. The algorithm is fast and lightweight: processing time is linear, and all dates are identified in a single pass. It resolves dates by tree traversal, using custom tree data structures built for the supported date, time, and month patterns.

The algorithm also accepts multiple space characters between date literals. E.g. DD DD DD and DD DD DD are both considered valid dates.

The following date patterns are considered valid and are identifiable by this algorithm:

dd MM(MM) yy(yy)
yy(yy) MM(MM) dd
MM(MM) dd yy(yy)

Here M is a month literal, which may be in alphabetic form such as Jan or January.

Allowed delimiters between dates are '/', '\', ' ', ',', '|', '-', ' '

It also recognizes trailing time pattern in following format hh(24):mm:ss.SSS am / pm hh(24):mm:ss am / pm hh(24):mm:ss am / pm

Resolution time is linear; no pattern matching or brute force is used. The algorithm is based on tree traversal and returns a list of dates, each with the following three components:

  • the date string identified in the text
  • the converted and formatted date string
  • the matching SimpleDateFormat pattern

Using the date string and the format string, users can convert the string into date objects as their requirements dictate.

The library is available on Maven Central.

<dependency>
    <groupId>net.rationalminds</groupId>
    <artifactId>DateParser</artifactId>
    <version>0.3.0</version>
</dependency>

The sample code to use this is below.

import java.util.List;
import net.rationalminds.LocalDateModel;
import net.rationalminds.Parser;

public class Test {
    public static void main(String[] args) throws Exception {
        Parser parser = new Parser();
        List<LocalDateModel> dates = parser.parse("Identified date :'2015-January-10 18:00:01.704', converted");
        System.out.println(dates);
    }
}

Output: [LocalDateModel{originalText=2015-january-10 18:00:01.704, dateTimeString=2015-1-10 18:00:01.704, conDateFormat=yyyy-MM-dd HH:mm:ss.SSS, start=18, end=46}]
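
As an illustration of the conversion step mentioned earlier, here is a small sketch that feeds the dateTimeString and conDateFormat values from the output above into java.text.SimpleDateFormat. It uses only the standard library and does not call the DateParser API; the class and method names are hypothetical, and the choice of Locale.ROOT is an assumption made to keep parsing independent of the machine's locale.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class ConvertParsedDate {
    // Converts a date string to java.util.Date using the given SimpleDateFormat pattern.
    // The result is interpreted in the JVM's default time zone.
    static Date toDate(String dateTimeString, String pattern) {
        try {
            return new SimpleDateFormat(pattern, Locale.ROOT).parse(dateTimeString);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + dateTimeString, e);
        }
    }

    public static void main(String[] args) {
        // Both values are copied from the sample output above.
        Date date = toDate("2015-1-10 18:00:01.704", "yyyy-MM-dd HH:mm:ss.SSS");
        System.out.println(date);
    }
}
```

SimpleDateFormat is lenient about digit counts when parsing, so the single-digit month in 2015-1-10 is accepted by the MM pattern.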

Detailed blog at http://coffeefromme.blogspot.com/2015/10/how-to-extract-date-object-from-given.html

The complete source is available on GitHub at https://github.com/vbhavsingh/DateParser

2 Comments

Hi & welcome to SO! I think some formatting got lost in your post, as the examples "DD DD DD" and "DD DD DD" look identical in your second paragraph. Maybe use more code markup to preserve spaces and display formats?
It's "DD DD DDDD" or "DDDD DD DD" or "DD DD DD".
0

LocalDate.parse instead of regex

Regex can be overkill for such a problem.

You could just split the string on the space character and attempt to parse each element as a LocalDate. If the parse fails, move on to the next element.

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

String input = "coming from the 11/25/2009 to the 11/30/2009" ;
String[] elements = input.split( " " ) ;
DateTimeFormatter f = DateTimeFormatter.ofPattern( "MM/dd/uuuu" ) ;
List<LocalDate> dates = new ArrayList<>() ;
for( String element : elements ) {
    try {
        LocalDate ld = LocalDate.parse( element , f ) ;
        dates.add( ld ) ;
    } catch ( DateTimeParseException e ) {
        // Ignore the exception. Move on to next element.
    }
}
System.out.println( "dates: " + dates ) ;

See this code run live at IdeOne.com.

dates: [2009-11-25, 2009-11-30]
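
Splitting on spaces misses a date that is immediately followed by punctuation, such as "11/30/2009.", because that token no longer parses. One possible variation, which is not part of the answer above, is to use a loose regex only to find candidates and let LocalDate.parse do the validation. The class name, the candidate pattern, and the use of ResolverStyle.STRICT are assumptions made for this sketch.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexThenParse {
    // Loose candidate pattern: two digits, slash, two digits, slash, four digits.
    private static final Pattern CANDIDATE = Pattern.compile("\\b\\d{2}/\\d{2}/\\d{4}\\b");

    // STRICT rejects impossible dates such as 02/31/2009 instead of adjusting them.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("MM/dd/uuuu").withResolverStyle(ResolverStyle.STRICT);

    // Returns every candidate in text that is also a real calendar date.
    static List<LocalDate> extractDates(String text) {
        List<LocalDate> dates = new ArrayList<>();
        Matcher m = CANDIDATE.matcher(text);
        while (m.find()) {
            try {
                dates.add(LocalDate.parse(m.group(), FORMAT));
            } catch (DateTimeParseException e) {
                // Looks like a date but is not a valid one; skip it.
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        // prints [2009-11-25, 2009-11-30]
        System.out.println(extractDates("from 11/25/2009, to 11/30/2009. Not 02/31/2009"));
    }
}
```

This keeps the regex simple, since range checks on the day and month are left to the date parser.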
