5

Is there a simpler way to compute the character count of a given string than the code below?

String word = "AAABBB";
Map<String, Integer> charCount = new HashMap<>();
for (String charr : word.split("")) {
    Integer added = charCount.putIfAbsent(charr, 1);
    if (added != null)
        charCount.computeIfPresent(charr, (k, v) -> v + 1);
}

System.out.println(charCount);
4
  • For ANSI characters, you can just have an array of size 256 and compute it. Commented Mar 12, 2019 at 19:10
  • @vivek_23 Which ANSI character set would that be? Or did you mean ASCII and 128? Commented Mar 12, 2019 at 19:39
  • 2
    @vivek_23 That is Windows code page 1252, not ANSI. The Unicode standard matches the ISO Latin-1 character set for the first 256 code points. Referring to Windows code page 1252 is an unnecessary complication, as that code page does not match in the 128-159 range. Commented Jun 6, 2020 at 12:59
  • @Holger Ahh! Thanks for the correction. Deleted my previous comment to avoid confusion. Commented Jun 6, 2020 at 14:41
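
The array idea from the comments above can be sketched as follows. This is a minimal illustration, assuming only the first 256 code points (ISO Latin-1) need to be counted; the class and method names are hypothetical, and any char outside that range is simply skipped.

```java
public class Latin1Counter {
    // Counts occurrences of chars in the range 0..255; anything else is ignored.
    static int[] count(String word) {
        int[] counts = new int[256];
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (c < 256) {
                counts[c]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = count("AAABBB");
        // Print only the characters that actually occur.
        for (int c = 0; c < counts.length; c++) {
            if (counts[c] > 0) {
                System.out.println((char) c + "=" + counts[c]);
            }
        }
    }
}
```

An array avoids boxing and hashing, but it only makes sense when the set of possible characters is small and known in advance.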

12 Answers

9

The simplest way to count occurrences of each character in a string, with full Unicode support (Java 11+)¹:

String word = "AAABBB";
Map<String, Long> charCount = word.codePoints().mapToObj(Character::toString)
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
System.out.println(charCount);

1) Java 8 version with full Unicode support is at the end of the answer.

Output

{A=3, B=3}

UPDATE: For Java 8+ (doesn't support characters from the supplementary planes, e.g. emoji):

Map<String, Long> charCount = IntStream.range(0, word.length())
        .mapToObj(i -> word.substring(i, i + 1))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

UPDATE 2: Also for Java 8+.

I was mistaken, thinking that codePoints() wasn't added until Java 9. It was added in Java 8 to the CharSequence interface, so it doesn't show in javadoc for String in Java 8, and shows as added in Java 9 for later versions of the javadoc.

However, the Character.toString(int codePoint) method wasn't added until Java 11, so to use the Character.toString(char c) method, we can use chars() in Java 8:

Map<String, Long> charCount = word.chars().mapToObj(c -> Character.toString((char) c))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

Or, for full Unicode support including the supplementary planes, we can use codePoints() and the String(int[] codePoints, int offset, int count) constructor in Java 8:

Map<String, Long> charCount = word.codePoints()
        .mapToObj(cp -> new String(new int[] { cp }, 0, 1))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
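
To make the difference between the chars() and codePoints() variants concrete, here is a small self-contained sketch. The class and method names are hypothetical; the emoji is written as an escaped surrogate pair so that the source file encoding does not matter.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CodePointCountDemo {
    // Per-char counting: a supplementary character is split into its two surrogates.
    static Map<String, Long> countByChar(String word) {
        return word.chars()
                .mapToObj(c -> Character.toString((char) c))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Per-code-point counting, using only Java 8 APIs.
    static Map<String, Long> countByCodePoint(String word) {
        return word.codePoints()
                .mapToObj(cp -> new String(new int[] { cp }, 0, 1))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // "A" followed by U+1F34E (red apple) twice.
        String word = "A\uD83C\uDF4E\uD83C\uDF4E";
        System.out.println(countByChar(word).size());      // 3 keys: "A" and two lone surrogates
        System.out.println(countByCodePoint(word).size()); // 2 keys: "A" and the apple
    }
}
```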

2 Comments

Sorry, is there a simple way for Java 8?
Speaking of “full Unicode support” and Emojis, it’s worth pointing out that even using codepoints is not necessarily providing the intended semantics. E.g. "ā̧👩‍🇮🇩" has 10 chars, 7 codepoints, but only three characters; the first one demonstrates that this is not only an Emoji issue. The only solution, I currently know of, is to process grapheme clusters, e.g. with Java 9+: Pattern.compile("\\X").matcher(example).results() .collect(Collectors.groupingBy(MatchResult::group, Collectors.counting())).
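
The grapheme-cluster approach from the comment above can be written out as a runnable sketch. It assumes Java 9 or later (for Matcher.results()); the class name and the sample string are illustrative choices, not from the thread.

```java
import java.util.Map;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class GraphemeCountDemo {
    // \X matches one extended grapheme cluster.
    private static final Pattern GRAPHEME = Pattern.compile("\\X");

    // Groups the input by grapheme cluster; Matcher.results() requires Java 9+.
    static Map<String, Long> count(String text) {
        return GRAPHEME.matcher(text).results()
                .collect(Collectors.groupingBy(MatchResult::group, Collectors.counting()));
    }

    public static void main(String[] args) {
        // "e" followed by U+0301 (combining acute accent), twice, then a plain "a".
        String text = "e\u0301e\u0301a";
        Map<String, Long> counts = count(text);
        System.out.println(text.length()); // 5 chars
        System.out.println(counts.size()); // 2 distinct user-perceived characters
    }
}
```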
2
    String str = "Hello Manash";
    Map<Character, Long> hm = str.chars()
            .mapToObj(c -> (char) c)
            .collect(Collectors.groupingBy(c -> c, Collectors.counting()));
    System.out.println(hm);

1 Comment

How does your answer differ from the mapping already posted by Andreas? Please explain the code as well.
2

Try the below approaches:

Approach 1:

    String str = "abcaadcbcb";
    
    Map<Character, Integer> charCount = str.chars()
            .boxed()
            .collect(toMap(
                    k -> (char) k.intValue(),
                    v -> 1,         // 1 occurrence
                    Integer::sum));
    System.out.println("Char Counts:\n" + charCount);

Approach 2:

    String str = "abcaadcbcb";
    Map<Character, Integer> charCount = new HashMap<>();
    for (char c : str.toCharArray()) {
        charCount.merge(c,          // key = char
                1,                  // value to merge
                Integer::sum);      // counting
    }
    System.out.println("Char Counts:\n" + charCount);

Output:

    Char Counts:
    {a=3, b=3, c=3, d=1}

1 Comment

Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.
1

Try this one:

    List<Character> chars = Arrays.asList('h', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd');
    Map<Character, Long> map = chars.stream()
            .collect(Collectors.groupingBy(c -> c, Collectors.counting()));
    System.out.println(map);

Output:

{r=1, d=1, e=1, w=1, h=1, l=3, o=2}


1
word.chars()
        .mapToObj(c -> (char) c)
        .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()));

This gives the character counts in order of each character's first appearance, because the result is collected into a LinkedHashMap.
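
As a minimal runnable sketch of the ordering behaviour described above (the class and method names are hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class OrderedCharCount {
    // Collects into a LinkedHashMap so keys keep the order in which they were first seen.
    static Map<Character, Long> count(String word) {
        return word.chars()
                .mapToObj(c -> (char) c)
                .collect(Collectors.groupingBy(
                        Function.identity(), LinkedHashMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(count("hello")); // prints {h=1, e=1, l=2, o=1}
    }
}
```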

2 Comments

Might you format your code snippet as code, to allow for greater readability? Thanks.
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.
1
String str = "abcaadcbcb";

Map<String, Long> charCount = Arrays.asList(str.split("")).stream()
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

1 Comment

Code-only answers are discouraged. How does your answer differ from the mapping already posted by Andreas? Please explain the code as well, e.g. what is the purpose of Arrays.asList(str.split(""))?
1

If you're open to using a third-party library that works with Java 8 or above, Eclipse Collections (EC) can solve this problem using a primitive Bag to count characters. Use a CharBag if char values are required, or an IntBag if code points (int values) are required. A Bag is a simpler data structure for counting things and may be backed by a primitive HashMap so as not to box the counts as Integer or Long objects. Unlike a HashMap in Java, a Bag does not return null for a missing key; it reports zero occurrences instead.

@Test
public void characterCountJava8()
{
    String word = "AAABBB";
    CharAdapter chars = Strings.asChars(word);
    CharBag charCounts = chars.toBag();

    Assertions.assertEquals(3, charCounts.occurrencesOf('A'));
    Assertions.assertEquals(3, charCounts.occurrencesOf('B'));
    Assertions.assertEquals(0, charCounts.occurrencesOf('C'));

    System.out.println(charCounts.toStringOfItemToCount());
}

Outputs:

{A=3, B=3}

CharAdapter and CharBag are primitive collection types available in EC. A CharBag is useful if you want to count char values. Notice that charCounts.occurrencesOf('C') returns 0, whereas a HashMap lookup would return null.

The following example counts code points, using emojis as visually appealing sample data. The code itself will work with Java 8, but I believe the emoji literal support wasn't added until Java 11.

@Test
public void codePointCountJava11()
{
    String emojis = "🍎🍎🍎🍌🍌";
    CodePointAdapter codePoints = Strings.asCodePoints(emojis);
    IntBag emojiCounts = codePoints.toBag();

    int appleInt = "🍎".codePointAt(0);
    int bananaInt = "🍌".codePointAt(0);
    int pearInt = "🍐".codePointAt(0);
    Assertions.assertEquals(3, emojiCounts.occurrencesOf(appleInt));
    Assertions.assertEquals(2, emojiCounts.occurrencesOf(bananaInt));
    Assertions.assertEquals(0, emojiCounts.occurrencesOf(pearInt));

    System.out.println(emojiCounts.toStringOfItemToCount());

    Bag<String> emojiStringCounts = emojiCounts.collect(Character::toString);

    System.out.println(emojiStringCounts.toStringOfItemToCount());
}

Outputs:

{127820=2, 127822=3}  // IntBag.toStringOfItemToCount()
{🍌=2, 🍎=3}          // Bag<String>.toStringOfItemToCount()

CodePointAdapter and IntBag are primitive collection types available in EC. An IntBag is useful if you want to count int values. Notice that emojiCounts.occurrencesOf(pearInt) returns 0, whereas a HashMap lookup would return null.

I converted the IntBag to a Bag<String> to show the differences when printing int vs. char. You need to convert int codePoints back to String if you want to print anything.

The comment Holger left on the accepted answer about grapheme clusters was insightful and helpful. Thank you! The codepoint solution here suffers from the same issue as all of the other codepoint solutions.

Eclipse Collections 11.1 was compiled and released with Java 8. I wouldn't recommend staying on Java 8 any more, but wanted to point out this is still possible.

Note: I am a committer for Eclipse Collections.


1

A Java stream solution for this; I hope the code is self-explanatory.

String s = "ccacbbaac";
Map<Character, Long> collect = s.chars()
        .mapToObj(y -> (char) y)
        .collect(Collectors.groupingBy(x -> x, Collectors.counting()));


0

Hope this helps. Java 8 Stream and Collector:

    String word = "AAABBB";
    Map<Character, Integer> charCount = word.chars().boxed().collect(Collectors.toMap(
                    k -> Character.valueOf((char) k.intValue()),
                    v -> 1,
                    Integer::sum));
    System.out.println(charCount);

Output:
    {A=3, B=3}

3 Comments

chars() requires Java 9, and a better solution using codePoints() instead of chars() was already posted 13 minutes earlier.
@Andreas I agree with the codePoints() solution, but chars() was introduced in Java 8: String.chars()
That would be CharSequence.chars(), not String.chars(), but I accept your correction. The Javadoc for Java 11 shows the method as added to String in Java 9, which is what led me astray.
0

Figured it out; below is another simple way.

Map<String, Integer> charCount = new HashMap();
for (String charr : s.split("")) {
    charCount.put(charr,charCount.getOrDefault(charr,0)+1);
}

1 Comment

charCount.put(charr,charCount.getOrDefault(charr,0)+1); can be simplified to charCount.merge(charr, 1, Integer::sum); By the way, you should use new HashMap<>()
0

A simple Java 8 solution I can think of is:

Map<String, Long> map = Arrays.stream(word.trim().toLowerCase().split(""))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));


-1
String str = "edcba";

Map<String, Long> counterMap1 = str.codePoints()
        .mapToObj(Character::toString)
        .collect(Collectors.groupingBy(e -> e, Collectors.counting()));

1 Comment

I get compilation error as "Cannot infer type argument(s) for <U> mapToObj(IntFunction<? extends U>)"
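
The compilation error reported in the comment is consistent with compiling on a Java version before 11, where Character.toString(int codePoint) does not exist. Assuming that is the cause, here is a minimal sketch of a variant that compiles on Java 8. It swaps the method reference for Character.toChars, and the class name is hypothetical.

```java
import java.util.Map;
import java.util.stream.Collectors;

public class Java8CodePointCount {
    static Map<String, Long> count(String str) {
        return str.codePoints()
                // Character.toChars(int) converts a code point to one or two chars.
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Each of the five letters has a count of 1.
        System.out.println(count("edcba"));
    }
}
```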
