I have a character array, list, and want to count substring occurrences per group, where the group index of each character is held in a numeric vector, chr:
list =
CCNNCCCNNNCNNCN
chr =
1
1
1
1
2
2
2
2
2
2
2
2
2
2
2
Ordinarily I search for adjacent letter pairs, e.g. 'NN' or 'CC', and use this method (shown here for 'CC'):
Count(:,1) = accumarray(chr(intersect([strfind(list,'CC')],find(~diff(chr)))),1);
Here ~diff(chr) is true wherever an element and its successor share the same index, so intersecting with it ensures that a matched pair does not cross an index boundary.
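To make the pair-matching step concrete, here is the same one-liner broken into intermediate steps for the example data. The variable names starts, sameNext and valid are only illustrative, and chr is assumed to be a column vector as displayed above.

```matlab
% Example data from above (chr as a column vector)
list = 'CCNNCCCNNNCNNCN';
chr  = [1 1 1 1 2 2 2 2 2 2 2 2 2 2 2]';

starts   = strfind(list, 'CC');          % start position of every 'CC' in list
sameNext = find(~diff(chr));             % positions whose successor has the same index
valid    = intersect(starts, sameNext);  % matches lying wholly within one index
Count    = accumarray(chr(valid), 1);    % number of valid matches per index
```

Because diff(chr) has one element fewer than chr, and is non-zero at each boundary, sameNext can never contain the last position of an index. That is harmless for pairs, but it is why the same filter drops the final letter when matching single characters.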
However, I now want to match single-letter strings, e.g. 'N'. How can I accomplish this? With the method above, the last letter in each index is missed and not counted.
The desired result for the above example is a two-column matrix giving the number of 'C's and 'N's within each index:
C N
2 2
5 6
i.e. there are 2 C's and 2 N's within index 1 (as stored in chr); the count then restarts from 0 for index 2, where there are 5 C's and 6 N's.
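To pin down the desired output, here is a minimal sketch that reproduces it for the example data. It is one possible approach, not necessarily the best one: it assumes chr is a column vector of positive integers with the same length as list, and the names letters and isLetter are only illustrative. No boundary check is needed for single letters, since a single character cannot span two indices.

```matlab
% Example data from above (chr as a column vector)
list = 'CCNNCCCNNNCNNCN';
chr  = [1 1 1 1 2 2 2 2 2 2 2 2 2 2 2]';

letters = 'CN';                                 % letters to count, one output column each
Count   = zeros(max(chr), numel(letters));
for k = 1:numel(letters)
    isLetter   = double(list(:) == letters(k)); % 1 where the letter matches, 0 elsewhere
    Count(:,k) = accumarray(chr, isLetter);     % sum the matches within each index
end
```

For the example this gives the rows 2 2 and 5 6 shown above.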