
I'm doing some upkeep on a monolithic Java codebase, where it was discovered that some of the @GET methods actually start a write session and should therefore be @POST methods. I wrote the following regex to aid my search:

@GET[^}]+startWrite

This gave me all occurrences of the @GET annotation that reached the string 'startWrite' (which is part of the names of the methods that start a write session) before reaching a '}', which closes a method in Java. This solution is not perfect, since a } can appear inside a method before a write session is started (for instance, at the end of an if-statement), but it proved effective enough to work with.
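For reference, here is a minimal Java sketch of how that pattern behaves. The class name `GetWriteScan`, the helper `findCandidates` and the two sample snippets are made up for illustration and are not part of the real codebase. It shows both the hit on a direct call and the miss when a '}' appears before the call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GetWriteScan {
    // Same pattern as above: @GET, then one or more non-'}' characters, then startWrite.
    private static final Pattern GET_THEN_WRITE = Pattern.compile("@GET[^}]+startWrite");

    /** Returns the start offset of every candidate match in the given source text. */
    public static List<Integer> findCandidates(String source) {
        List<Integer> offsets = new ArrayList<>();
        Matcher m = GET_THEN_WRITE.matcher(source);
        while (m.find()) {
            offsets.add(m.start());
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Hypothetical sample: the write session starts directly in the @GET method.
        String direct = "@GET\npublic ObjectName a(){\n    startWriteSession();\n}\n";
        // Hypothetical sample: an if-block closes with '}' before the write session starts.
        String nested = "@GET\npublic ObjectName b(){\n    if (x) { log(); }\n    startWriteSession();\n}\n";

        System.out.println(findCandidates(direct)); // prints [0]
        System.out.println(findCandidates(nested)); // prints [] (the known blind spot)
    }
}
```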

However, it has since come to my attention that a lot of the methods follow this format:

@GET
@Path("/methodName")
public ObjectName methodName(...){
    ...
    return methodNameInner(...);
}
private ObjectName methodNameInner(...){
    startWriteSession();
    ...
}

In other words, the write session command is moved to another method, which always bears the same name as the original method (and the path name), followed by 'Inner'. This inner method is always below the original method. I tried to write a regex that searches for occurrences of @GET, followed by some characters until either the path name or the method name (which I isolated in a separate group), followed by more characters, followed by \1Inner, followed by the same [^}]+startWrite, meaning the inner method reaches a 'startWrite' string before it reaches the end of the method. But I could not get it to work.

Could someone please assist me?

  • Regex is not the tool for the job. You want something like ClassGraph. There's even an example there of finding annotated methods. Commented Jul 5, 2024 at 9:49
  • @g00se is right, and not just subjectively, but objectively. It's not so much "regex is not the tool for the job"; it's "regexes are fundamentally incapable of doing this job, and what you want is therefore impossible". Remember: "Regular expressions were not created by Ms. Josephina Regular". They are named after the class of grammars they can parse: regular grammars. Java (the language) is not regular. What you're doing is akin to asking "I bought the services of a Japanese/Korean translator. How do I make them translate this Russian document?". Commented Jul 5, 2024 at 10:25
  • @rzwitserloot I'm not quite sure I follow. As I've stated, I have already used regex in a similar vein to find GET methods that start a write session, and I don't see how this request would be fundamentally different. To be clear, I expect to manually verify the results generated by the regex, and do not expect the results to be complete, similar to how I know they weren't complete in the original case. I also understand that adding another library might give a more complete result, but I do not currently have the power to add another dependency to this application. Commented Jul 5, 2024 at 12:25
  • @user9934848 Using regexes to parse Java code is not appropriate. What you did wouldn't have found all possible ways to write that method, and will not work to find this situation. Commented Jul 5, 2024 at 13:34

1 Answer


Personal thoughts

I completely agree with the comments on your question: a regular expression isn't the correct tool, because it will not handle all your cases and will only work if the code you are analysing is written the way you expect.

But in some situations, where you just want to correct or adapt some existing code, limited to a few files and methods, a regex can quickly do the job.

Regular expression, without warranty

If, and only if, your code is written the way you described, you could have a go with this commented pattern:

"
^@GET\r?\n                     # A line with @GET
@Path\(\"(?<path>[^\"]*)\"\)   # Followed by @Path, and capture this path.
.*?                            # Match anything, in an ungreedy way.
# Capture the name of the method and its parameters.
public\s+ObjectName\s+(?<method>\w+)\((?<params>[^)]+)\)\s*\{
.*?                            # Match anything, in an ungreedy way.
\bstartWriteSession\(\);       # match a call to startWriteSession().
"gmxsi

A little test online: https://regex101.com/r/4Jpg1B/2
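If you want to run the same pattern from Java rather than on regex101, a rough sketch could look like the one below. The class name `AnnotatedGetFinder`, the method `findMethods` and the sample input are hypothetical. The `x`, `m`, `s` and `i` flags map to `Pattern.COMMENTS`, `MULTILINE`, `DOTALL` and `CASE_INSENSITIVE`; the `g` flag has no Java equivalent and is replaced by looping over `Matcher.find()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnnotatedGetFinder {
    // COMMENTS mode ignores whitespace in the pattern and treats '#' to end of line as a comment.
    private static final Pattern GET_WITH_WRITE = Pattern.compile(
        "^@GET\\r?\\n                    # a line with @GET\n"
      + "@Path\\(\"(?<path>[^\"]*)\"\\)   # followed by @Path; capture the path\n"
      + ".*?                             # anything, lazily\n"
      + "public\\s+ObjectName\\s+(?<method>\\w+)\\((?<params>[^)]+)\\)\\s*\\{\n"
      + ".*?                             # anything, lazily\n"
      + "\\bstartWriteSession\\(\\);     # a call to startWriteSession()\n",
        Pattern.COMMENTS | Pattern.MULTILINE | Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    /** Returns the captured method name of every match in the given source text. */
    public static List<String> findMethods(String source) {
        List<String> names = new ArrayList<>();
        Matcher m = GET_WITH_WRITE.matcher(source);
        while (m.find()) {
            names.add(m.group("method"));
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical input, laid out exactly as the pattern expects.
        String source = "@GET\n@Path(\"/load\")\npublic ObjectName load(String id){\n"
                      + "    startWriteSession();\n}\n";
        System.out.println(findMethods(source)); // prints [load]
    }
}
```

Keep in mind that the lazy `.*?` combined with `DOTALL` does not stop at the closing brace of the method, so a `startWriteSession();` call in a later method can produce a false positive. The results still need a manual check.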


2 Comments

Cheers! I had to adapt it a bit to fit my exact use case, and ended up with "(?s)\@GET\n\t\@Path\(\"\/(?<path>[^\"]*)\"\).*\k<path>Inner[^}]+startWrite"
@user9934848 Great! Happy to know that it helped you solve it. Good idea to add the back reference of the function name to match the *Inner() method.
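
For anyone who wants to try the back-reference idea in Java, here is a small sketch under a few assumptions: the literal parentheses around the @Path argument are escaped, @Path sits on the line after @GET behind a single tab, and the class name `InnerWriteFinder`, the method `findPaths` and the sample source are invented for the example. `\k<path>` re-matches whatever the `path` group captured, so the pattern only succeeds when a method called `<path>Inner` reaches 'startWrite' before a '}'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InnerWriteFinder {
    // (?s) lets '.' match newlines; \k<path> is a back reference to the named group "path".
    private static final Pattern GET_WITH_INNER_WRITE = Pattern.compile(
        "(?s)@GET\\n\\t@Path\\(\"/(?<path>[^\"]*)\"\\).*\\k<path>Inner[^}]+startWrite");

    /** Returns the captured path name of every match in the given source text. */
    public static List<String> findPaths(String source) {
        List<String> paths = new ArrayList<>();
        Matcher m = GET_WITH_INNER_WRITE.matcher(source);
        while (m.find()) {
            paths.add(m.group("path"));
        }
        return paths;
    }

    public static void main(String[] args) {
        // Hypothetical source following the outer/Inner layout from the question.
        String source = "\t@GET\n\t@Path(\"/save\")\n"
                      + "\tpublic ObjectName save(String id){\n"
                      + "\t\treturn saveInner(id);\n\t}\n"
                      + "\tprivate ObjectName saveInner(String id){\n"
                      + "\t\tstartWriteSession();\n\t}\n";
        System.out.println(findPaths(source)); // prints [save]
    }
}
```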
