
For the sake of learning, I'm trying to implement my own simple Tokenize function with CStrings. I currently have this file:

11111
22222
(ENDWPT)


222222
333333
(ENDWPT)
6060606
ggggggg
hhhhhhh
(ENDWPT)
iiiiiii
jjjjjjj
kkkkkkk
lllllll
mmmmmmm
nnnnnnn

I would like to tokenize it using the delimiter (ENDWPT). I wrote the following function, which finds the delimiter position, adds the delimiter length, and extracts the text up to that position. It then updates a counter so that the next call starts searching from the previous index. The function looks like this:

bool MyTokenize(CString strText, CString& strOut, int& iCount)
{
    CString strDelimiter = L"(ENDWPT)";
    int iIndex = strText.Find(strDelimiter, iCount);

    if (iIndex != -1)
    {
        iIndex += strDelimiter.GetLength();
        strOut = strText.Mid(iCount, iIndex);
        iCount = iIndex;
        return true;
    }
    return false;
}

It is called like so:

int nCount = 0;

while ((MyTokenize(strText, strToken, nCount)) == true)
{
    // Handle tokenized strings here
}

Right now the function splits the strings incorrectly. I think it is because Find() may be returning the wrong index: I expected 12, but it actually returns 14. I have run out of ideas; if anyone can figure this out I would really appreciate it.

1 Answer

If the delimiter is found at iIndex, read iIndex - iCount characters starting at iCount, because the second argument of Mid() is a character count, not an end index. Then advance iCount past the delimiter:

if(iIndex != -1)
{
    strOut = strText.Mid(iCount, iIndex - iCount);
    iCount = iIndex + strDelimiter.GetLength();
    return true;
}
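
To make the difference concrete, here is a minimal sketch that uses std::string as a stand-in for CString, an assumption made only so that it compiles without MFC. std::string::substr(pos, count) takes a start and a length, as CString::Mid(nFirst, nCount) does. The helper names findDelimiter, wrongToken and rightToken are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Stand-in for CString::Find: index of the next "(ENDWPT)" at or after `from`.
std::size_t findDelimiter(const std::string& text, std::size_t from)
{
    return text.find("(ENDWPT)", from);
}

// The mistake in the question, simplified: an end index is passed
// where a character count is expected.
std::string wrongToken(const std::string& text, std::size_t first, std::size_t end)
{
    return text.substr(first, end);
}

// The fix: the count is the distance from `first` to the delimiter.
std::string rightToken(const std::string& text, std::size_t first, std::size_t end)
{
    return text.substr(first, end - first);
}
```

With the text "aa(ENDWPT)bb(ENDWPT)cc" and first = 10, the delimiter is found at index 12. wrongToken returns "bb(ENDWPT)cc" because it reads 12 characters, while rightToken returns "bb". The first token comes out fine either way because first is 0, which is probably why the bug only shows up from the second token on. As for Find() returning 14 rather than 12: that is what you would get if the file uses Windows CRLF line endings, since each of the two line breaks before the first delimiter then counts as two characters.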

The source string may not end with the delimiter, so that needs a special case.

You can also pick names that match the parameters of CString::Mid(int nFirst, int nCount), which makes the code easier to follow. MFC uses Hungarian notation, prefixing variable names with type identifiers, which is unnecessary in C++, so I'll avoid it in this example:

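bool MyTokenize(const CString& source, CString& token, int& first)
{
    CString delimiter = L"(ENDWPT)";
    // Search for the next delimiter, starting at the current position
    int end = source.Find(delimiter, first);

    if(end != -1)
    {
        // Mid() takes a start index and a character count, not an end index
        int count = end - first;
        token = source.Mid(first, count);
        // Skip past the delimiter for the next call
        first = end + delimiter.GetLength();
        return true;
    }
    else
    {
        // No delimiter left: return whatever remains, if anything
        int count = source.GetLength() - first;
        if(count <= 0)
            return false;

        token = source.Mid(first, count);
        first = source.GetLength();
        return true;
    }
}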

...

int first = 0;
CString source = ...
CString token;
while(MyTokenize(source, token, first))
{
    // Handle tokenized strings here
}
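
If you want to try the logic without an MFC project, the following is a rough portable sketch of the same idea. It assumes std::wstring in place of CString and passes the delimiter as a parameter. nextToken and splitAll are hypothetical names, not part of MFC or the standard library, and the delimiter is assumed to be non-empty.

```cpp
#include <string>
#include <vector>

// Same logic as MyTokenize, with std::wstring standing in for CString.
// Returns false once `first` has reached the end of `source`.
// Assumes `delimiter` is not empty.
bool nextToken(const std::wstring& source, const std::wstring& delimiter,
               std::wstring& token, std::size_t& first)
{
    if (first >= source.size())
        return false;

    std::size_t end = source.find(delimiter, first);
    if (end == std::wstring::npos)
    {
        // No delimiter left: the rest of the string is the last token
        token = source.substr(first);
        first = source.size();
        return true;
    }

    // substr() takes a start index and a count, like CString::Mid()
    token = source.substr(first, end - first);
    first = end + delimiter.size();
    return true;
}

// Convenience wrapper that collects every token into a vector.
std::vector<std::wstring> splitAll(const std::wstring& source,
                                   const std::wstring& delimiter)
{
    std::vector<std::wstring> tokens;
    std::wstring token;
    std::size_t first = 0;
    while (nextToken(source, delimiter, token, first))
        tokens.push_back(token);
    return tokens;
}
```

Calling splitAll(L"aa(ENDWPT)bb(ENDWPT)cc", L"(ENDWPT)") yields the three tokens "aa", "bb" and "cc". A source that ends with the delimiter does not produce a trailing empty token.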

2 Comments

Thanks for the help! It's working perfectly now. Hopefully I will be able to see and correct my own dumb mistakes in the future.
No problem. I edited to make the code easier to follow.
