1

There are many websites which take a string as user input and allow you to create a regular expression (regex) from pieces of the string.

But I could not find any Java library that does the same. Is there a Java library available that generates a regular expression which exactly matches a string?

String inputString = "ABC345";
String regularExpression = Something.generateRegEx(inputString);

or something like that.

Note: I have a use case in which I want to take a string from the user, generate a regular expression from it, and then match that pattern against some data sets to extract similar patterns. I have written a small utility, but it is not reliable yet, so I am looking for a well-tested library.

EDIT:

Please visit txt2re.com. I want a Java library which performs the same function.

4
  • 2
    Your question is not clear. What would be the output? Commented Jul 30, 2012 at 14:19
  • Do you just want to escape the input? See this question: stackoverflow.com/questions/60160/… and the answer regarding quote method. Commented Jul 30, 2012 at 14:21
  • I once wrote a library that does this. It always returns .* Commented Jul 30, 2012 at 14:24
  • @ all - sorry for the inadequate description. I have just updated the question. Commented Jul 30, 2012 at 14:31

4 Answers

3

Pattern.quote(String) returns a (string) regex that matches the specified string exactly.
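
As a minimal sketch of what this looks like in practice (the class name QuoteExample and the helper literalRegex are made up for illustration), the quoted pattern matches only the exact input, and regex metacharacters inside it lose their special meaning:

```java
import java.util.regex.Pattern;

class QuoteExample {
    // Hypothetical helper: returns a regex that matches the input literally.
    static String literalRegex(String input) {
        return Pattern.quote(input);
    }

    public static void main(String[] args) {
        String regex = literalRegex("ABC345");
        System.out.println(regex);                             // \QABC345\E
        System.out.println(Pattern.matches(regex, "ABC345"));  // true
        System.out.println(Pattern.matches(regex, "ABC346"));  // false
        // The dot is treated as a literal dot, not as "any character".
        System.out.println(Pattern.matches(literalRegex("a.c"), "abc")); // false
    }
}
```

Note that this yields an exact-match pattern only; it does not generalize the input to "similar" strings.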


5 Comments

I have just used Pattern.quote(String) and the output was \QABC123\E
...Yes? What's the problem with that?
@Saurabh: Is this a problem? What was your input?
@DougRamsey : sorry for inadequate information. I have just updated my question.
I can't figure out what that site is doing. Can you clarify the question yourself, rather than pointing to another source?
2

I think txt2re.com has a database of known regular expressions, since the tool annotates its answers with semantics like "date" or "email" for date and email formats. Otherwise, it gives an expression that validates only a single string rather than a "regular language". Regular languages are described by regular expressions and recognized by finite-state machines. Every finite language is regular, but not every infinite language is. For example, a simple language like:

L = { (a^n)(b^n) | n >= 0 } is not regular. (This can be proved with the pumping lemma.)

L = {ε, ab, aabb, aaabbb, ...} (not regular)

If you consider that the input may be an infinite set of words (including natural languages), regular expressions cannot describe all such sets. In order to generate a regular expression for a language, you would first have to describe it with a (Type-3) grammar.

If your language contains only a single word like this:

L = { [email protected] }

then you can write a basic compiler that iterates over the characters and checks their types. In pseudocode:

s = size(input) 
result = ""
for (i = 0; i < s; i++) {
   if input[i] is numeric
      result += "\d"
   else if input[i] is word
      result += "\w"
   ...
}
return result
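
The pseudocode above could be sketched in Java roughly as follows. This is only an illustration, not a tested library: the class name ShapeRegex and the method generate are hypothetical, the checks are limited to ASCII, and matching every other character literally is an assumption standing in for the cases the pseudocode leaves open.

```java
import java.util.regex.Pattern;

class ShapeRegex {
    // Hypothetical helper: generalizes each character to a character class.
    static String generate(String input) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= '0' && c <= '9') {
                // Digits are checked first because \w would also match them.
                result.append("\\d");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_') {
                result.append("\\w");
            } else {
                // Assumption: any other character is matched literally.
                result.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String regex = generate("ABC345");
        System.out.println(regex);                             // \w\w\w\d\d\d
        System.out.println(Pattern.matches(regex, "XYZ789"));  // true
        System.out.println(Pattern.matches(regex, "XYZ78"));   // false
    }
}
```

Because \w also accepts digits and underscores, the generated pattern is looser than the input's shape suggests; a stricter variant might emit [A-Za-z] for letters instead.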

1 Comment

Thanks for your detailed answer. So there's no such library available yet, and to build one, it would need to include a database of known regular expressions, right? Thanks for the pseudocode; in fact, my current working code (a workaround) uses the same logic to generate the regular expression.
0

A genetic-algorithm-based Java library such as Regex++ (https://github.com/MaLeLabTs/RegexGenerator) can be used for the same purpose.


-1

If what you want is to find a regex matching a given String, this does not make sense, because there are infinitely many of them.
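
To illustrate the point, here is a small sketch listing a handful of different patterns that all match the sample input "ABC345" from the question. The class name ManyRegexes and the candidate list are chosen purely for illustration; many more such patterns could be written.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class ManyRegexes {
    // A few of the many regexes that match "ABC345", from exact to very loose.
    static final List<String> CANDIDATES = Arrays.asList(
            "ABC345", "[A-Z]{3}\\d{3}", "[A-Z]+[0-9]+", "\\w+", ".*");

    // True only if every candidate pattern matches the whole input.
    static boolean allMatch(String input) {
        for (String regex : CANDIDATES) {
            if (!Pattern.matches(regex, input)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String regex : CANDIDATES) {
            System.out.println(regex + " -> " + Pattern.matches(regex, "ABC345"));
        }
    }
}
```

Which of these is the "right" one depends on how much the caller wants to generalize, and that cannot be derived from the single input string alone.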

On the contrary, if you want to build a Pattern object from a regex that the user enters, use the standard Java API (java.util.regex.*) this way:

Pattern p = Pattern.compile(inputString);
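
Since user-supplied text may not be a valid regex, a slightly more defensive sketch could catch PatternSyntaxException, which Pattern.compile throws for malformed input. The class name CompileUserRegex and the helper tryCompile are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CompileUserRegex {
    // Hypothetical helper: returns the compiled pattern, or empty if the
    // user-supplied text is not a valid regular expression.
    static Optional<Pattern> tryCompile(String userRegex) {
        try {
            return Optional.of(Pattern.compile(userRegex));
        } catch (PatternSyntaxException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCompile("[A-Z]{3}\\d{3}").isPresent()); // true
        System.out.println(tryCompile("[A-Z").isPresent());           // false
    }
}
```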

7 Comments

He wants to get a regular expression from a given string.
@ErhanBagdemir yes, so what is the problem?
Pattern.compile takes the regular expression as a parameter, but it doesn't give you the expression itself for a given string.
How could that be possible? For any String there is an infinite number of regexes matching it.
An answer like your latest comment could get +1, but the current answer gets -1.