Implementing tokenize function with CString

Question

For the sake of learning, I'm trying to implement my own simple Tokenize function with CStrings. I currently have this file:

11111
22222
(ENDWPT)


222222
333333
(ENDWPT)
6060606
ggggggg
hhhhhhh
(ENDWPT)
iiiiiii
jjjjjjj
kkkkkkk
lllllll
mmmmmmm
nnnnnnn

I would like to tokenize it using the delimiter (ENDWPT). I wrote the following function, which finds the delimiter position, adds the delimiter length, and extracts the text up to that position. It then updates a counter so that the next call starts searching from the previous index. The function looks like this:

bool MyTokenize(CString strText, CString& strOut, int& iCount)
{
    CString strDelimiter = L"(ENDWPT)";
    int iIndex = strText.Find(strDelimiter, iCount);

    if (iIndex != -1)
    {
        iIndex += strDelimiter.GetLength();
        strOut = strText.Mid(iCount, iIndex);
        iCount = iIndex;
        return true;
    }
    return false;
}

It is called like this:

int nCount = 0;

while ((MyTokenize(strText, strToken, nCount)) == true)
{
    // Handle tokenized strings here
}

Right now the function splits the strings incorrectly. I think it is because Find() may be returning the wrong index: I expected 12, but it actually returns 14. I have run out of ideas; if anyone can figure this out I would really appreciate it.

Barmak Shemirani · Accepted Answer · 2018-04-15 05:36:45Z


If the delimiter is found at iIndex, read iIndex - iCount characters starting from iCount, then advance iCount past the delimiter:

if(iIndex != -1)
{
    strOut = strText.Mid(iCount, iIndex - iCount);
    iCount = iIndex + strDelimiter.GetLength();
    return true;
}
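
To make the count-versus-end-index distinction concrete, here is a minimal portable sketch. It uses std::wstring::substr, whose (position, count) arguments play the same role as CString::Mid(nFirst, nCount), as a stand-in so it compiles without MFC. The helper Slice and the sample string are hypothetical, and the CRLF line endings are an assumption about the question's file; under that assumption the first delimiter sits at index 14 (5 + 2 + 5 + 2), which would be consistent with the value Find() reported.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns text[first, end) given an END INDEX,
// converting it to the COUNT that substr (like CString::Mid) expects.
std::wstring Slice(const std::wstring& text, std::size_t first, std::size_t end)
{
    assert(first <= end && end <= text.size());
    return text.substr(first, end - first);
}

// Assumed sample: the start of the question's file, with CRLF line endings.
const std::wstring kSample =
    L"11111\r\n22222\r\n(ENDWPT)\r\n\r\n222222\r\n333333\r\n(ENDWPT)\r\n";
const std::wstring kDelimiter = L"(ENDWPT)";
```

For the first token the two forms happen to agree because first is 0. From the second token on, passing the end index as the count asks for too many characters, so the extracted text runs past the delimiter.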

The source string may not end with the delimiter, so that needs a special case.

You can also pick names that match the parameters of CString::Mid(int nFirst, int nCount) to make the code easier to understand. MFC code conventionally prefixes variable names with type identifiers (Hungarian notation), which is unnecessary in C++, so I'll avoid it in this example:

bool MyTokenize(const CString& source, CString& token, int& first)
{
    CString delimiter = L"(ENDWPT)";
    int end = source.Find(delimiter, first);

    if(end != -1)
    {
        // Delimiter found: the token is everything between first and end.
        int count = end - first;
        token = source.Mid(first, count);
        // Skip past the delimiter for the next call.
        first = end + delimiter.GetLength();
        return true;
    }
    else
    {
        // No more delimiters: return whatever is left, if anything.
        int count = source.GetLength() - first;
        if(count <= 0)
            return false;

        token = source.Mid(first, count);
        first = source.GetLength();
        return true;
    }
}

...

int first = 0;
CString source = ...
CString token;
while(MyTokenize(source, token, first))
{
    // Handle tokenized strings here
}
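
For trying the same logic outside an MFC project, the following is a rough portable sketch using std::wstring in place of CString. NextToken and SplitAll are hypothetical names introduced only for illustration, and the delimiter is passed as a parameter rather than hard-coded; the control flow is meant to mirror the function above, including the special case for text after the last delimiter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Portable stand-in for the CString version: std::wstring::find / substr
// take the place of CString::Find / Mid.
bool NextToken(const std::wstring& source, const std::wstring& delimiter,
               std::wstring& token, std::size_t& first)
{
    assert(!delimiter.empty());
    if (first >= source.size())
        return false;  // nothing left to read

    const std::size_t end = source.find(delimiter, first);
    if (end != std::wstring::npos)
    {
        token = source.substr(first, end - first);  // count, not end index
        first = end + delimiter.size();             // skip the delimiter
        return true;
    }

    // No more delimiters: the remainder is the last token.
    token = source.substr(first);
    first = source.size();
    return true;
}

// Collects every token by calling NextToken until it reports no more input.
std::vector<std::wstring> SplitAll(const std::wstring& source,
                                   const std::wstring& delimiter)
{
    std::vector<std::wstring> tokens;
    std::wstring token;
    std::size_t first = 0;
    while (NextToken(source, delimiter, token, first))
        tokens.push_back(token);
    return tokens;
}
```

With this sketch, SplitAll(L"a(ENDWPT)bb(ENDWPT)ccc", L"(ENDWPT)") yields the three tokens a, bb and ccc, and a source that ends with the delimiter does not produce an extra empty token.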

edited Apr 15, 2018 at 5:36

answered Apr 15, 2018 at 4:33

Barmak Shemirani


2 Comments

Alexander Lopez Over a year ago

Thanks for the help! It's working perfectly now. Hopefully I will be able to see and correct my own dumb mistakes in the future.

Barmak Shemirani Over a year ago

No problem. I edited to make the code easier to follow.
