0

I've looked through thousands of questions and answers related to what I'm about to ask, but I still haven't found a way to do what I'm going to explain.

I have a text file from which I have to extract information about several things, all of them with the following format:

"string1":"string2"

Each pair is surrounded by more data. The text file looks something like this:

LINE 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXX"string1":"string2"XXXXXXXXXXXXXXXXXXXXXXXXXX"string3":"string4"XXXXXXXXXXXXXXXXXXXXXXXXXXXX...('\n')

LINE 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXX"string5":"string6"XXXXXXXXXXXXXXXXXXXXXXXXXX"string7":"string8"XXXXXXXXXXXXXXXXXXXXXXXXXXXX...

XXX represents irrelevant information I do not need, and theEntireString (the string used in the code example) holds the contents of a single line, not the contents of the whole text file.

First I have to find string1, and then store the content of string2 in another string, without the quotes. The problem is that I have to stop when I reach the closing quote, and I don't know exactly how to do this. I suppose I have to use the functions find() and substr(), but despite trying repeatedly, I have not succeeded.

What I have done is something like this:

string extractInformation(string theEntireString)
{
  string s = "\"string1\":\"";    
  string result = theEntireString.find(s);
  return result;
}

But this way I suppose I store into the string the last quote and the rest of the string.

2
  • You really need a good programming book, and to start at the basics.... but, find takes an argument of exactly the string you want it to find (maybe "\":\"") and returns the position in the search string where the pattern was found. substr (not substring) takes the start position followed by the number of characters to select. So perhaps std::string string1 = theEntireString.substr(1, theEntireString.find("\":\"")-1); Commented Feb 22, 2014 at 17:10
  • 1
    O(n^2) if you scan everything, O(n) with a pattern matching algorithm or similar. Commented Feb 22, 2014 at 17:22

5 Answers

1

The "find" function only gives you the position of the matched string. To get the resulting string you need to use the "substr" function. Try this:

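// Assumes theEntireString holds exactly one pair, e.g. "string1":"string2".
string start, end;
size_t colon = theEntireString.find(":");
start = theEntireString.substr(1, colon - 2);                                 // key without its quotes
end = theEntireString.substr(colon + 2, theEntireString.size() - colon - 3);  // value without its quotes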

That should solve your problem.

0

This assumes that neither the key nor the value contains a quotation mark. The following will output the value after the ":". You can also use it in a loop to repeatedly extract the value field if you have multiple key-value pairs in the input string, provided that you keep a record of the position of the last instance found.

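#include <iostream>
#include <string>
using namespace std;

// Returns the value stored for `key`, searching from position p; "" if not found.
string extractInformation(size_t p, const string& key, const string& theEntireString)
{
  string s = "\"" + key + "\":\"";
  auto p1 = theEntireString.find(s, p);     // start the search at p
  if (string::npos == p1)
    return "";
  p1 += s.size();                           // first character of the value
  auto p2 = theEntireString.find('"', p1);  // closing quote of the value
  if (string::npos != p2)
    return theEntireString.substr(p1, p2 - p1);
  return "";
}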

int main() {
  string data = "\"key\":\"val\" \"key1\":\"val1\"";
  string res = extractInformation(0,"key",data);
  string res1 = extractInformation(0,"key1",data);
  cout << res << "," << res1 << endl;
}

Outputs:

val,val1

3 Comments

Your solution works perfectly, but in my txt file the format is string data = "XXXXX"key":"val"XXXXX"key1":"val1"XXXXXXX" instead of string data = "XXXXX\"key\":\"val\"XXXXX\"key1\":\"val1\"XXXXXXX"
Please note that the backslash is only needed to escape the quote mark " inside a C++ string literal. So what is XXXXX"key":"val"XXXXX"key1":"val1"XXXXXXX on disk is written as string data = "XXXXX\"key\":\"val\"XXXXX\"key1\":\"val1\"XXXXXXX" in C++ code. If you modify the code to read from a file or from stdin, you will see.
Equivalently, you can also use the c++11 raw string syntax, string data = R"(XXXXX"key":"val"XXXXX"key1":"val1"XXXXXXX)";, which does not require escaping special characters.
0

Two steps:

First we have to find the position of the : and split the string into two parts:

string first = theEntireString.substr(0, theEntireString.find(":"));
string second = theEntireString.substr(theEntireString.find(":") + 1);

Now we have to remove the surrounding quotes:

string final_first(first.begin() + 1, first.end() - 1);
string final_second(second.begin() + 1, second.end() - 1);

1 Comment

But this way the string "final_second" stores the rest of the line; it doesn't stop when it reaches the closing quote. I have edited the post to explain that, sorry.
0

You don't need any string operations. As long as the XXXXX parts don't contain any '"', you can read both strings directly from the file:

ifstream file("input.txt");
for( string s1,s2; getline( getline( file.ignore( numeric_limits< streamsize >::max(), '"' ), s1, '"' ) >> Char<':'> >> Char<'"'>, s2, '"' ); )
    cout << "S1=" << s1 << " S2=" << s2 << endl;

The small helper function Char is:

template< char C >
std::istream& Char( std::istream& in )
{
    char c;
    if( in >> c && c != C )
        in.setstate( std::ios_base::failbit );
    return in;
}

0
#include <regex>
#include <iostream>

using namespace std;

const string text = R"(
XXXXXXXXXXXXXXXXXXXXXXXXXXXX"string1":"string2"XXXXXXXXXXXXXXXXXXXXXXXXXX"string3"  :"string4" XXXXXXXXXXXXXXXXXXXXXXXXXXXX...
XXXXXXXXXXXXXXXXXXXXXXXXXXXX"string5":  "string6"XXXXXXXXXXXXXXXXXXXXXXXXXX"string7"  :  "string8" XXXXXXXXXXXXXXXXXXXXXXXXXXXX...
)";

int main() {
    const regex pattern{R"~("([^"]*)"\s*:\s*"([^"]*)")~"};
    for (auto it = sregex_iterator(begin(text), end(text), pattern); it != sregex_iterator(); ++it) {
        cout << it->format("First: $1, Second: $2") << endl;
    }
}

Output:

First: string1, Second: string2
First: string3, Second: string4
First: string5, Second: string6
First: string7, Second: string8
