
I have a file that I have opened with std::ifstream. It contains a line that I want to parse:

<image source="tileset/grass-tiles-2-small.png" width="384" height="192"/>

And let's say I am interested in the "384" found after width="

I am at a loss as to how best to extract "384" from that line, since the number is not constant; it differs from file to file.

void parseFile(const std::string &mfName)
{
    std::ifstream file(mfName);

    std::string line;


    if (file.is_open())
    {
        while (getline(file, line))
        {
            std::size_t found = line.find("width");

            if (found != std::string::npos)
            {
                std::cout << found << std::endl;
            }
        }
    }
    else
        std::cerr << "file failed to open" << std::endl;
} 

Could anyone give me a hint or a link to a good tutorial that covers this?

4
  • Take a look at boost::regex or std::regex if you are using C++11 Commented Mar 2, 2014 at 19:02
  • Do you have fixed points? That is, do you know which line your information is on, or is the image name always the same? Do you already use additional libraries in your project, or should it work with plain C++ and the STL? Commented Mar 2, 2014 at 19:03
  • If you use TR1, you can have a look at this link: johndcook.com/cpp_regex.html Commented Mar 2, 2014 at 19:04
  • Yes, I know or can easily find out which line my information is on, and I am using the C++11 standard library. Commented Mar 2, 2014 at 19:42

4 Answers 4

1

This is your file:

<image source="tileset/grass-tiles-2-small.png" width="384" height="192"/>

And since all you're interested in is the width, we should first get the entire line:

if (std::getline(file, line))
{

Now we need to find width. We do that using the find() method:

    std::size_t pos = line.find("width");

The string inside find() is the value we want to look for.

Next we check that the search actually found something:

    if (pos != std::string::npos)
    {

If it did, we put the rest of the line into a std::istringstream and parse out the data:

        std::istringstream iss(line.substr(pos));

The substr() call selects a part of the string. pos is the position where we found "width", so substr(pos) returns everything from that position to the end of the line. So far this is what is inside the std::istringstream:

 width="384" height="192"/>

Since we don't actually care about "width" but rather about the number inside the quotes, we have to ignore() everything up to and including the opening quote. That is done like this:

        iss.ignore(std::numeric_limits<std::streamsize>::max(), '"');

Now we use the extractor to extract the integer:

        int width;

        if (iss >> width)
        {
            std::cout << "The width is " << width << std::endl;
        }

I hope this helps. Here's a full example of the program:

#include <iostream>
#include <fstream>
#include <limits>   // std::numeric_limits
#include <string>
#include <sstream>

void parseFile(const std::string& mfName)
{
    std::ifstream file(mfName);
    std::string line;

    if (std::getline(file, line))
    {
        auto pos = line.find("width");
        if (pos != std::string::npos)
        {
            std::istringstream iss(line.substr(pos));
            int width;

            if (iss.ignore(std::numeric_limits<std::streamsize>::max(), '"') &&
                iss >> width)
            {
                std::cout << "The width is " << width << std::endl;
            }
        }
    }
}
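If you later need other attributes such as height, the same steps can be wrapped in a small helper. This is only a sketch: the name extractIntAttribute and its signature are made up for illustration, and it assumes the attribute is always written as name="digits" with no spaces around the equals sign.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of any library): looks for name="<int>"
// in `line`. On success it stores the number in `value` and returns true;
// otherwise it returns false and leaves `value` untouched.
bool extractIntAttribute(const std::string& line, const std::string& name, int& value)
{
    // Search for name=" rather than the bare name, so that a file name
    // that happens to contain the word is not mistaken for the attribute.
    const std::string key = name + "=\"";
    const std::string::size_type pos = line.find(key);
    if (pos == std::string::npos)
        return false;

    // Start the stream right after the opening quote, so no ignore() is needed.
    std::istringstream iss(line.substr(pos + key.size()));
    int parsed = 0;
    if (!(iss >> parsed))
        return false;

    value = parsed;
    return true;
}
```

Inside the getline() loop you would call it as extractIntAttribute(line, "width", width) and only use width when it returns true.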

3 Comments

I am trying to understand ignore(). After std::istringstream iss(line.substr(pos)); we have width="384" height="192"/>, but after if (iss.ignore(std::numeric_limits<std::streamsize>::max(), '"') && iss >> width) we get the number. I thought ignore() would skip all characters up to and including the '"', so how do we end up with just a number when we started from width="384" height="192"/>?
@user2299044 Inside the if statement, the ignore() call runs first. It will ignore (or rather "skip") all the characters up until and including ". When the ignore() call finishes, this is what will be the remaining content: 384" height="192"/>. When iss >> width runs, it will extract the integer until it finds a non-integral character, namely the other ".
Quite clever. Coming from Python, I expected to have to do manually what iss >> width does in C++.
0

Parse the string with a regular expression. Since you are using C++11, include the <regex> header and use std::regex_search to look for a match. The results go into a std::smatch object, which is iterable.

Reference: http://www.cplusplus.com/reference/regex/

Also see: Retrieving a regex search in C++
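As a rough sketch of what that could look like for the line in the question: the helper name findWidth is made up for illustration, and the pattern assumes the attribute is always written exactly as width="digits".

```cpp
#include <regex>
#include <string>

// Hypothetical helper: searches `line` for width="<digits>".
// On a match it stores the number in `width` and returns true.
bool findWidth(const std::string& line, int& width)
{
    // One capture group around the digits; built once and reused.
    static const std::regex expr("width=\"(\\d+)\"");

    std::smatch matches;
    if (!std::regex_search(line, matches, expr))
        return false;

    // matches[0] is the whole match, matches[1] the first capture group.
    // Note: std::stoi throws std::out_of_range if the digits do not fit in an int.
    width = std::stoi(matches[1].str());
    return true;
}
```

Calling findWidth(line, width) for each line read with getline() keeps the rest of your loop unchanged.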

Comments

0

If I were you, I'd use an XML library (if this is actually XML). This is one of the things you certainly don't want to reinvent but reuse! :)

In the past, I've successfully used TinyXML for smaller projects. Or google "c++ xml library" for alternatives.

1 Comment

I know there are libraries I could use, but I am doing this more as a means of learning. :)
0

With Boost.Regex, you can use something like the following in your function:

/* std::string line = "<image source= \
     \"tileset/grass-tiles-2-small.png\" width=\"384\" height=\"192\"/>";
*/

boost::regex expr ("width=\"(\\d+)\"");
boost::smatch matches;

if (boost::regex_search(line, matches, expr)) 
{
    std::cout << "match: " << matches[1] << std::endl;
}

2 Comments

Would using regex be the recommended way to parse a file that includes many lines like the one I have mentioned?
@user2299044 First of all, I'd use a scripting language for that, but if C++ is really needed here, boost::regex would be one of the preferred options. And yes, you can use it for many lines too; in your case you would apply it line by line. You can refer to any online regex documentation for help; the syntax is almost the same for Boost.
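For the many-lines case, a line-by-line loop might look roughly like this. Note that this sketch swaps in std::regex from C++11 instead of boost::regex so that it needs no extra library; the function name collectWidths is made up, and it only picks up the first width per line.

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: reads `in` line by line and collects the first
// width="<digits>" value found on each line. Lines without one are skipped.
std::vector<int> collectWidths(std::istream& in)
{
    static const std::regex expr("width=\"(\\d+)\"");

    std::vector<int> widths;
    std::string line;
    std::smatch matches;

    while (std::getline(in, line))
    {
        if (std::regex_search(line, matches, expr))
            widths.push_back(std::stoi(matches[1].str()));
    }
    return widths;
}
```

Because it takes a std::istream&, you can pass it the std::ifstream from the question directly, or a std::istringstream while experimenting.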
