We have this string: String input1 = "abbccd";

Expected output: ab2c2d (note: a count of 1 should not appear in the output).

The following code prints a1, b2, c2, d1 on separate lines. Any suggestions to fix and improve it?

input1.chars()
      .mapToObj(s -> Character.toLowerCase(Character.valueOf((char) s)))
      .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()))
      .entrySet().stream()
      .forEach(n -> {System.out.println(n.getKey()+""+n.getValue());});

2 Answers

Make the last forEach a map instead.

Instead of always appending n.getValue(), append it only if it is not 1.

Then collect by joining.

At that point you will have a single string you can print.

So, assuming we don't want to change your first part:

"abbccd".chars()
        .mapToObj(s -> Character.toLowerCase((char)s)) // notice here Character.valueOf was redundant, we're already dealing with a char
        .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()))
        .entrySet().stream()
        .map(n -> n.getKey()+""+(n.getValue() == 1 ? "" : n.getValue()))
        .collect(Collectors.joining());

Results in ab2c2d.
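For anyone who wants to run it, here is a minimal self-contained sketch of the same pipeline with its imports. The class name CharCounts and the method name compress are hypothetical, chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CharCounts {

    // Hypothetical helper: wraps the pipeline shown above in a reusable method.
    static String compress(String input) {
        return input.chars()
                .mapToObj(c -> Character.toLowerCase((char) c))
                // LinkedHashMap keeps the characters in first-seen order.
                .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()))
                .entrySet().stream()
                // Append the count only when it is greater than 1.
                .map(e -> e.getKey() + (e.getValue() == 1 ? "" : String.valueOf(e.getValue())))
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(compress("abbccd")); // ab2c2d
    }
}
```

Note that groupingBy counts every occurrence in the whole string, not just consecutive runs, so compress("aabaa") gives a4b rather than a2ba2.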

5 Comments

Might as well do .mapToObj(s -> Character.toLowerCase((char) s)) too. No need to take the character value of a character.
@ElliottFrisch yeah, I only focused on the last part. But I'll make that change as well, while I'm at it.
Collectors.joining("") can just be Collectors.joining()
Your code breaks with most characters. See my Answer for an example of such failure, and for a solution. If you care to rework your Answer with code points, I'll gladly delete mine.
@BasilBourque right and noted. I won't change my answer because it still fits the given input. I upvoted yours, though, in the hope it will surface to the top spot and be accepted.
2

Unfortunately, the other two Answers both fail with most characters.

Avoid legacy type char

The char type has been essentially broken since Java 2, and legacy since Java 5. As a 16-bit value, char is physically incapable of representing most of the 144,697 characters defined in Unicode.
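A small sketch of that limitation, assuming the FACE WITH MEDICAL MASK emoji (U+1F637) as the test character. The class and method names are hypothetical.

```java
public class CharLimits {

    // Number of 16-bit char values the string occupies.
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Build the emoji from its code point to avoid source-encoding issues.
        String mask = new String(Character.toChars(0x1F637));

        System.out.println(charCount(mask));      // 2, a surrogate pair
        System.out.println(codePointCount(mask)); // 1, a single character
        // Each half on its own is a surrogate, not a printable character.
        System.out.println(Character.isSurrogate(mask.charAt(0))); // true
    }
}
```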

See how the code from one Answer breaks:

String input = "😷😷abbccd";
String output =
        input
                .chars()
                .mapToObj( s -> Character.toLowerCase( ( char ) s ) ) // notice here Character.valueOf was redundant, we're already dealing with a char
                .collect( Collectors.groupingBy( Function.identity() , LinkedHashMap :: new , Collectors.counting() ) )
                .entrySet().stream()
                .map( n -> n.getKey() + "" + ( n.getValue() == 1 ? "" : n.getValue() ) )
                .collect( Collectors.joining() );

System.out.println( "output = " + output );

output = ?2?2ab2c2d

Code point

Use code point integer numbers instead, when working with individual characters. A code point is the number permanently assigned to each character in Unicode. They range from zero to just over a million.

You will find code point related methods scattered around the Java classes. These include String, StringBuilder, Character, etc.

The String#codePoints method returns an IntStream of code points, the code point number for each character in the string.
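As a rough illustration, the sketch below splits a string into one String per code point. The class and method names are hypothetical. It uses new String(Character.toChars(cp)), which also works before Java 11, where Character.toString(int) is not available.

```java
import java.util.List;
import java.util.stream.Collectors;

public class CodePointsDemo {

    // Hypothetical helper: one String per code point, in order.
    static List<String> characters(String input) {
        return input.codePoints()
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // U+1F637 followed by two ASCII letters.
        String input = new String(Character.toChars(0x1F637)) + "ab";

        System.out.println(input.length());            // 4 char values
        System.out.println(characters(input).size());  // 3 code points
    }
}
```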

Here is a re-worked version of the clever code from Answer by Federico klez Culloca. Kudos to him, as I could not have come up with that approach.

String input = "😷😷abbccd";
String output =
        input
                .codePoints()
                .map( Character :: toLowerCase )
                .mapToObj( codePoint -> Character.toString( codePoint ) )
                .collect( Collectors.groupingBy( Function.identity() , LinkedHashMap :: new , Collectors.counting() ) )
                .entrySet().stream()
                .map( n -> n.getKey() + "" + ( n.getValue() == 1 ? "" : n.getValue() ) )
                .collect( Collectors.joining() );
System.out.println( "output = " + output );

output = 😷2ab2c2d

1 Comment

This solution is still incomplete. First, it still doesn't handle all characters: a character can span multiple code points, e.g., try "🏳️‍🌈🏳️‍🌈". Second, mapping to lowercase does not handle all characters for case-insensitive matching. In fact, it doesn't even handle (case-sensitive) equality for all characters.
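A small sketch of two of the points raised in this comment, assuming the rainbow flag is encoded as the sequence U+1F3F3 U+FE0F U+200D U+1F308. The class and method names are hypothetical, and this only demonstrates the problem; it is not a fix.

```java
import java.text.Normalizer;

public class BeyondCodePoints {

    // Number of code points in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    // Compares two strings after normalizing both to NFC.
    static boolean equalAfterNfc(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        // One visible flag, built from four code points.
        int[] flagPoints = {0x1F3F3, 0xFE0F, 0x200D, 0x1F308};
        String flag = new String(flagPoints, 0, flagPoints.length);
        System.out.println(codePointCount(flag)); // 4

        // Precomposed e-acute versus "e" plus a combining acute accent.
        String composed = "\u00E9";
        String decomposed = "e\u0301";
        System.out.println(composed.equals(decomposed));         // false
        System.out.println(equalAfterNfc(composed, decomposed)); // true
    }
}
```

Grouping by code point would therefore count the flag as four separate entries, and would treat the two spellings of the accented letter as different keys.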
