0

This apparently simple question has been bugging me for a while; I thought somebody might be able to help.

I have a simple string

s = 'AAABCAA'

How do I find the number of consecutive repetitions of the first letter 'A' at the start of the string? The answer should be 3.

I have tried:

from collections import Counter
c = Counter(s)

But this counts every 'A' in the string and gives 5 instead of 3.

5
  • looks like a job for regex Commented Feb 20, 2019 at 16:57
  • You can't do it with Counter. Commented Feb 20, 2019 at 16:58
  • Chip in guys, time to make some difference! Commented Feb 20, 2019 at 17:00
  • Why should the answer be 3? there are 5 'A's overall in the list. Mind clarifying? Commented Feb 20, 2019 at 17:01
  • @LeKhan9 I only want the first run of repetitions; any A that appears after that first run should be ignored. Commented Feb 20, 2019 at 17:03

4 Answers

5

You could use a for loop with a break statement.

s = 'AAABCAA'
counter=0
firstletter=s[0]
for each in s:
    if each==firstletter:
        counter+=1
    else:
        break
print(counter)

This prints 3.
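The same loop can be wrapped in a small function so it can be reused and tested. This is only a sketch: the name leading_run_length is made up for illustration, and returning 0 for an empty string is an assumption, since the question does not say what should happen in that case.

```python
def leading_run_length(s):
    """Return how many times the first character repeats at the start of s."""
    if not s:
        # Assumption: an empty string has no first character, so report 0.
        return 0
    first = s[0]
    count = 0
    for ch in s:
        if ch != first:
            break  # stop at the first character that differs
        count += 1
    return count


print(leading_run_length('AAABCAA'))  # 3
print(leading_run_length('AAAA'))     # 4
print(leading_run_length(''))         # 0
```

The guard matters because s[0] raises an IndexError on an empty string.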

Alternatively, you could find the index of the first character in the string that differs from the first character:

import numpy as np
s = 'AAABCAA'
firstletter=s[0]
checklist=[(each==firstletter)*1 for each in s]
print(np.where(np.asarray(checklist)==0)[0][0])

In this case, the list comprehension ([(each==firstletter)*1 for each in s]) produces the list:

[1, 1, 1, 0, 0, 1, 1]

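The value is 1 wherever the character at that position is identical to the first character of the string. Then np.where(np.asarray(checklist)==0)[0][0] gives you the index of the first 0 (i.e. the first character that differs from the starting character) in this newly created list, which equals the length of the leading run. Note that if every character matches the first one, the list contains no 0 and the [0][0] lookup raises an IndexError.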


3

You can use the function groupby() to find all letter groups and then you can use next() to get the first group from the iterator:

from itertools import groupby

s = 'AAABCAA'

sum(1 for _ in next(groupby(s))[1])
# 3
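To see why taking the first group works, it may help to list every run that groupby() produces. The snippet below is a small illustration; the variable names runs, first_char and first_len are chosen here and are not part of the answer.

```python
from itertools import groupby

s = 'AAABCAA'

# groupby yields (character, iterator over one consecutive run) pairs.
runs = [(ch, sum(1 for _ in group)) for ch, group in groupby(s)]
print(runs)  # [('A', 3), ('B', 1), ('C', 1), ('A', 2)]

# The first run is the one the question asks about.
first_char, first_len = runs[0]
print(first_char, first_len)  # A 3
```

The trailing 'AA' forms its own group, which is why it is not counted with the leading run. For an empty string groupby() yields nothing, so next() on it raises StopIteration and runs[0] raises IndexError.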

Alternatively you can use the function takewhile():

from itertools import takewhile

sum(1 for _ in takewhile(lambda x: x == s[0], s))
# 3

And finally you can use regex:

import re

# (.) captures the first character; \1* matches any further repeats of it
len(re.match(r'(.)\1*', s, flags=re.DOTALL).group(0))
# 3


3

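Here's a short solution that uses a generator expression. Of course, readability isn't the goal here :) It returns the index of the first character that differs from letter, which is the length of the leading run (or the length of the whole string if every character matches).

repetitions = lambda s, letter: next((i for i, ch in enumerate(s) if ch != letter), len(s))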

Examples:

s = 'BBBBC'
letter = 'B'

repetitions(s, letter) # 4

s = 'AABC'
letter = 'A'

repetitions(s, letter) # 2
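Another compact option, not taken from this answer, is to strip the leading run with str.lstrip() and compare lengths. This is a sketch only: the function name leading_run_length_strip is hypothetical, and treating an empty string as 0 is an assumption.

```python
def leading_run_length_strip(s):
    """Length of the run of s[0] at the start of s, via str.lstrip."""
    if not s:
        # Assumption: no characters means a run length of 0.
        return 0
    # lstrip(s[0]) removes every leading copy of the first character,
    # so the difference in length is the size of the leading run.
    return len(s) - len(s.lstrip(s[0]))


print(leading_run_length_strip('AAABCAA'))  # 3
print(leading_run_length_strip('BBBBC'))    # 4
```

Because lstrip() only touches the start of the string, later occurrences of the same letter do not affect the result.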


1

If you are looking for patterns in strings in general, use a suffix tree.

