3

This is in Java 7

I do not know regular expressions, so I was wondering if anybody knew how I could use the split method to get all usernames out of the string:

{tchristofferson=10, mchristofferson=50}

and then add the usernames to a String[] array? There are only two usernames in this example, but I want this to work for any number of usernames.

Usernames require the following format:

3-16 characters, no spaces, A-Z upper and lower case and 0-9, only special character is _ (underscore).

4
  • What are the valid characters in a username? Can it include numbers and special characters like _, - and so on? Commented Feb 25, 2017 at 19:21
  • Which version of Java do you need to use for this task? Commented Feb 25, 2017 at 19:32
  • I am doing this in Java 7. Commented Feb 25, 2017 at 19:51
  • The usernames have these requirements: 3-16 characters, no spaces, A-Z upper and lower case and 0-9, only special character is _ (underscore). Commented Feb 25, 2017 at 19:55

4 Answers

1

This looks like JSON, so the "right" answer would probably be to use a JSON parser. If that is not an option, you can remove the enclosing {}, split the string on ", ", and then split each piece on the = sign, taking the first item:

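// Requires Java 8 or later (streams and lambdas).
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

String input = "{tchristofferson=10, mchristofferson=50}";
List<String> users =
    Arrays.stream(input.substring(1, input.length() - 1).split(", "))
          .map(s -> s.split("=")[0])
          .collect(Collectors.toList());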
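
Since the question mentions Java 7, where streams and lambdas are not available, the same strip-and-split idea can be sketched with a plain loop. This is only an illustration: the class name `SplitUsernames` and the method `extract` are made up here, and the code assumes the input always has the shape "{name=value, name=value}".

```java
import java.util.ArrayList;
import java.util.List;

class SplitUsernames {
    // Hypothetical helper; assumes input looks like "{name=10, other=50}".
    static String[] extract(String input) {
        // Drop the enclosing braces.
        String body = input.substring(1, input.length() - 1).trim();
        if (body.isEmpty()) {
            return new String[0];
        }
        List<String> users = new ArrayList<String>();
        // Split on a comma followed by optional whitespace.
        for (String pair : body.split(",\\s*")) {
            // Everything before the first '=' is taken as the username.
            users.add(pair.split("=")[0].trim());
        }
        return users.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String name : extract("{tchristofferson=10, mchristofferson=50}")) {
            System.out.println(name);
        }
    }
}
```

Splitting on ",\\s*" rather than ", " is a small deviation from the snippet above, chosen so that missing or extra spaces after the comma do not end up in the names.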

3 Comments

It's not valid JSON: JSON should have quotes around String values (and, strictly, key names too). Most parsers would reject this input.
Actually, JSON should have colons instead of equals signs.
Pattern#splitAsStream may have been more concise.
1

Here is the wrong (job security) way:

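String str = "{tchristofferson=10, mchristofferson=50}";
// Drop the leading '{', then split on every "=<digits>," or "=<digits>}".
String[] usernames = str.substring(1)
                        .split("=\\d+[,}]\\s*");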

Why is this the wrong way? We are throwing out the stuff we don't want: the first character (whatever it is), plus anything that looks like "=#, " or "=#}", and hoping that is the only stuff we don't want. If the string began with "{ tchristofferson=10", then the first username would get a leading space.
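
To make that failure mode concrete, here is a small sketch that runs the same split on an input with a space after the opening brace. The class name `SplitPitfall` and the method `splitAway` are made up for illustration.

```java
class SplitPitfall {
    // Same approach as above: drop the first character, then split away
    // every "=<digits>," or "=<digits>}" plus any trailing whitespace.
    static String[] splitAway(String str) {
        return str.substring(1).split("=\\d+[,}]\\s*");
    }

    public static void main(String[] args) {
        String[] names = splitAway("{ tchristofferson=10, mchristofferson=50}");
        // The space after '{' is not part of any delimiter, so it survives.
        System.out.println("[" + names[0] + "]"); // prints [ tchristofferson]
        System.out.println("[" + names[1] + "]"); // prints [mchristofferson]
    }
}
```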

The better way is to match the stuff you do want. And now that I'm not trying to create the answer on an iPhone screen, here it is:

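    import java.util.ArrayList;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String input = "{tchristofferson=10, mchristofferson=50}";

    // Match "name=digits"; group 1 is the username, group 2 the value.
    Pattern USERNAME_VALUE = Pattern.compile("(\\w+)=(\\d+)");
    Matcher matcher = USERNAME_VALUE.matcher(input);

    // Collect the first capture group of every match.
    ArrayList<String> list = new ArrayList<>();
    while (matcher.find()) {
        list.add(matcher.group(1));
    }
    String[] usernames = list.toArray(new String[0]);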

This assumes each character of your usernames matches the \w pattern, which in Java is [a-zA-Z_0-9] by default (it only covers other Unicode word characters when Pattern.UNICODE_CHARACTER_CLASS is set). Modify it if your username requirements are more or less restrictive.
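
If you want the pattern itself to enforce the rules stated in the question (3-16 characters, letters, digits and underscore only), something along these lines might work. Treat it as a sketch: the class name `StrictUsernames` and the method `extract` are made up, and the \b word boundary is an addition that is not in the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StrictUsernames {
    // 3-16 characters from [A-Za-z0-9_], followed by "=<digits>".
    // The leading \b keeps a match from starting in the middle of a
    // longer word, so over-long names are skipped instead of truncated.
    private static final Pattern USERNAME_VALUE =
            Pattern.compile("\\b([A-Za-z0-9_]{3,16})=\\d+");

    static String[] extract(String input) {
        List<String> list = new ArrayList<String>();
        Matcher matcher = USERNAME_VALUE.matcher(input);
        while (matcher.find()) {
            list.add(matcher.group(1));
        }
        return list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String name : extract("{tchristofferson=10, mchristofferson=50}")) {
            System.out.println(name);
        }
    }
}
```

With this pattern, entries whose names are too short or too long are silently ignored. Whether that is preferable to reporting an error depends on where the string comes from.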

(\w+) is used to capture the username as matcher.group(1), which is added to the list which is eventually turned into your String[].

(\d+) is also being used to capture the number associated with this user as matcher.group(2). This capture group is not (presently) being used, so you could remove the parentheses for a small efficiency gain, i.e., "(\\w+)=\\d+". I included it in case you wanted to do something with those values as well.
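
As one possible use of that second capture group, the sketch below collects each username together with its number into a map. The class name `UsernameValues` and the method `parse` are hypothetical, and the code assumes every value fits in an int.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UsernameValues {
    private static final Pattern USERNAME_VALUE =
            Pattern.compile("(\\w+)=(\\d+)");

    static Map<String, Integer> parse(String input) {
        // LinkedHashMap keeps the usernames in the order they appear.
        Map<String, Integer> result = new LinkedHashMap<String, Integer>();
        Matcher matcher = USERNAME_VALUE.matcher(input);
        while (matcher.find()) {
            // group(1) is the username, group(2) the digits after '='.
            // Integer.valueOf throws NumberFormatException on overflow.
            result.put(matcher.group(1), Integer.valueOf(matcher.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> values =
                parse("{tchristofferson=10, mchristofferson=50}");
        System.out.println(values.get("mchristofferson")); // prints 50
    }
}
```

The usernames alone are then available as `values.keySet().toArray(new String[0])`.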


0

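If usernames can include numbers and special characters like =, split on the comma instead. Each element is then a full name=value entry rather than a bare username:

// Needs: import java.util.regex.Pattern;
String str = "tchristofferson=10,mchristofferson=50";
Pattern ptn = Pattern.compile(",");
String[] entries = ptn.split(str); // ["tchristofferson=10", "mchristofferson=50"]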


0

You can try splitting wherever there is a character that is not (^) a letter (A-Za-z):

String test = "{tchristofferson=10, mchristofferson=50}";
String[] tokens = test.split("[^A-Za-z]");

And if you don't mind using a List, try it like @Mureinik suggested with:

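    // Requires Java 8 or later; drops duplicate and empty tokens.
    import java.util.Arrays;
    import java.util.List;
    import java.util.stream.Collectors;
    List<String> tokens2 = Arrays.stream(test.split("[^A-Za-z]"))
            .distinct()
            .filter(w -> !w.isEmpty())
            .collect(Collectors.toList());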
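
On Java 7, where streams are not available, roughly the same thing can be sketched with a loop and a LinkedHashSet to drop duplicates. The class name `LetterTokens` and the method `tokens` are made up for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class LetterTokens {
    static String[] tokens(String test) {
        // LinkedHashSet removes duplicates but keeps first-seen order.
        Set<String> seen = new LinkedHashSet<String>();
        // Every non-letter character acts as a delimiter.
        for (String w : test.split("[^A-Za-z]")) {
            // Adjacent delimiters produce empty strings; skip them.
            if (!w.isEmpty()) {
                seen.add(w);
            }
        }
        return seen.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String t : tokens("{tchristofferson=10, mchristofferson=50}")) {
            System.out.println(t);
        }
    }
}
```

Because digits and underscores count as delimiters here, a name such as user_1 would come back as just "user", so this only suits usernames made purely of letters.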

Edit1:

If the usernames contain numbers, try (note that the backslash must be doubled inside a Java string literal):

String[] tokens = test.split("[^A-Za-z\\w]");

I highly recommend this site if you wish to experiment with regex:

http://regexr.com/

2 Comments

How could I do this if the username may contain numbers?
[^A-Za-z\\w] is equivalent to [^\\w], which is equivalent to \W. This gives an output of ["", "tchristofferson", "10", "", "mchristofferson", "50"], where "", "10", "" and "50" are not usernames.
