
I have the following two URLs:

https://docs.google.com/a/abc.com/spreadsheet/ccc?key=0Aj9Oa8x5fqsL678FNhOUF0ZEN5b25iVVZNdjdUQm9mM1E&usp=drive_web#gid=0

https://docs.google.com/a/abc.com/file/d/0Aj9Oa8x5fqsL678FNhOUF0ZEN5b25iVVZNdjdUQm9mM1E/edit

I am using the following regex:

Pattern.compile(".*key=|/d/(.[^&/])")

As a result, I want matcher.group() to return each URL up to and including the fileId (0Aj9Oa8x5fqsL678FNhOUF0ZEN5b25iVVZNdjdUQm9mM1E), and matcher.group(1) to return the fileId itself.

But I am not getting these results.

  • Do you absolutely need to use a regex? Using URI would make your job much easier. Commented Mar 28, 2014 at 11:03
  • Yes, I want to know the answer with a regex. Commented Mar 28, 2014 at 11:07
  • Maybe you want to, but here a regex is very far from being the ideal solution; see my answer. Commented Mar 28, 2014 at 11:09

3 Answers


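You fell victim to the precedence rules of regular expressions and forgot the repetition quantifier for your character class. Alternation (|) binds most loosely, so your pattern means either .*key= or /d/(.[^&/]), and .[^&/] matches exactly two characters. Try: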

Pattern.compile("(key=|/d/)([^&/]+)")

Your result will be in group 2, i.e. matcher.group(2) after a successful matcher.find().
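
As a minimal sketch of how the corrected pattern might be used (the class name FileIdRegexDemo and the method extractFileId are made up for illustration), the following runs it against both URLs from the question. It assumes find() rather than matches(), since the pattern covers only part of the URL:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FileIdRegexDemo {
    // Group 1 is the marker ("key=" or "/d/"); group 2 is the fileId.
    private static final Pattern FILE_ID = Pattern.compile("(key=|/d/)([^&/]+)");

    /** Hypothetical helper: returns the fileId, or null if no marker is found. */
    static String extractFileId(String url) {
        Matcher m = FILE_ID.matcher(url);
        // find() searches anywhere in the input; matches() would require
        // the pattern to cover the entire URL.
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String[] urls = {
            "https://docs.google.com/a/abc.com/spreadsheet/ccc?key=0Aj9Oa8x5fqsL678FNhOUF0ZEN5b25iVVZNdjdUQm9mM1E&usp=drive_web#gid=0",
            "https://docs.google.com/a/abc.com/file/d/0Aj9Oa8x5fqsL678FNhOUF0ZEN5b25iVVZNdjdUQm9mM1E/edit"
        };
        for (String url : urls) {
            System.out.println(extractFileId(url));
        }
    }
}
```

With find(), matcher.group() covers only the marker plus the fileId, not the part of the URL before it.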


1 Comment

Just a little change: Pattern.compile(".+(key=|/d/)([^&/]+)") and it works. I wanted the whole match too.

If you don't need to use a regex, then use URI:

import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static final Pattern PARAM_SEPARATOR = Pattern.compile("&");
private static final Pattern PATH_MATCHER = Pattern.compile("/file/d/([^/]+)");

// In query parameter...
public static String getKeyQueryParamFromURI(final String input)
{
    final URI uri = URI.create(input);
    // getQuery() returns the decoded query string, or null if there is none
    final String params = uri.getQuery();
    if (params == null)
        return null;
    // Split the query string (not the whole input) into its parameters
    for (final String param: PARAM_SEPARATOR.split(params))
        if (param.startsWith("key="))
            return param.substring(4); // skip the 4 characters of "key="
    return null;
}

// In path...
public static String getPathMatcherFromURI(final String input)
{
    final URI uri = URI.create(input);
    final String path = uri.getPath();
    if (path == null)
        return null;
    // Match against the path only, not the whole input
    final Matcher m = PATH_MATCHER.matcher(path);
    return m.find() ? m.group(1) : null;
}

Note that, unlike with a regex, you will receive the result unescaped. For instance, if the URI reads key=a%20b, this will return "a b"!

If you insist on using a regex (why?), then do this instead for the query parameter:

private static final Pattern PATTERN = Pattern.compile("(?<=[?&])key=([^&]+)");

public static String getKeyQueryParamFromURI(final String input)
{
    final Matcher m = PATTERN.matcher(input);
    return m.find() ? m.group(1) : null;
}

But you'll have to unescape the parameter value yourself...
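
One possible way to do that unescaping, as a sketch only: the class KeyParamDecoder and the method getDecodedKey are hypothetical names, the pattern is the one above, and the code assumes Java 10 or later for the URLDecoder.decode(String, Charset) overload. Be aware that URLDecoder follows form-encoding rules, so it also turns a literal + into a space.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeyParamDecoder {
    // Same pattern as above: "key=" must directly follow '?' or '&'.
    private static final Pattern KEY_PARAM = Pattern.compile("(?<=[?&])key=([^&]+)");

    /** Hypothetical helper: returns the percent-decoded key value, or null. */
    static String getDecodedKey(String input) {
        Matcher m = KEY_PARAM.matcher(input);
        if (!m.find()) {
            return null;
        }
        // Assumption: the value is UTF-8, percent-encoded.
        // Note that URLDecoder also converts '+' to a space.
        return URLDecoder.decode(m.group(1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(getDecodedKey("https://example.com/doc?key=a%20b&x=1"));
    }
}
```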

3 Comments

Your first code snippet wouldn't match the Google URI format, would it? Though I concede that in general URL segmentation with regexes is a bad idea, it appears well suited to this specific use case (known restricted URL format, known relevant context, known restricted content [assuming 7-bit ASCII keys are to be extracted, which it is safe to assume will not come percent-encoded]).
@collapsar What do you mean? That I don't check for the scheme, host etc.?
You only check for the key= syntax, not for the info being part of the URI's location segment - or am I missing some subtlety?

It is preferable to split the regex into two separate patterns instead of using | (OR). With separate patterns, the first capture group of each pattern holds the result you want.

Pattern1:

.*key=([^&]*).*

Pattern2:

.*\/file\/d\/([^\/]*)\/.*
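
A rough sketch of the two-pattern approach in Java follows. The class TwoPatternExtractor and the method extractFileId are hypothetical names, and the patterns are tightened variants chosen here as an assumption: [^&]* and [^/]* make each capture stop at the end of the fileId instead of running on greedily. Each pattern is tried in turn with matches(), so it has to cover the whole URL:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TwoPatternExtractor {
    // Assumed variants: one pattern for the query form, one for the path form.
    private static final Pattern KEY_PATTERN = Pattern.compile(".*key=([^&]*).*");
    private static final Pattern PATH_PATTERN = Pattern.compile(".*/file/d/([^/]*)/.*");

    /** Hypothetical helper: group 1 of the first pattern that matches, or null. */
    static String extractFileId(String url) {
        for (Pattern p : new Pattern[] {KEY_PATTERN, PATH_PATTERN}) {
            Matcher m = p.matcher(url);
            // matches() requires the pattern to cover the entire URL,
            // which the leading and trailing .* allow.
            if (m.matches()) {
                return m.group(1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extractFileId(
            "https://docs.google.com/a/abc.com/file/d/0Aj9Oa8x5fqsL678FNhOUF0ZEN5b25iVVZNdjdUQm9mM1E/edit"));
    }
}
```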
