I am new to C++ and have to process a text file. I decided to do this with a regex. The regex I came up with (shown with C++ string escaping):
(([^\\s^=]+)\\s*=\\s*)?\"?([^\"^\\s^;]+)\"?\\s*;[!?](\\w+)\\s*
I wrote my C++ code following this post:
c++ regex extract all substrings using regex_search()
Here is the C++ code:
#include "pch.h"
#include <algorithm>   // std::for_each
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <iterator>
#include <regex>
#include <string>

// Intentionally empty: only the matching itself is being timed.
void print(const std::smatch& match)
{
}

int main()
{
    std::ifstream file{ "D:\\File.txt" };
    if (!file)
    {
        std::cerr << "Could not open file\n";
        return 1;
    }

    // Read the whole file into one string.
    std::string fileData{};
    file.seekg(0, std::ios::end);
    fileData.reserve(static_cast<std::size_t>(file.tellg()));
    file.seekg(0, std::ios::beg);
    fileData.assign(std::istreambuf_iterator<char>(file),
                    std::istreambuf_iterator<char>());

    // Adjacent string literals are concatenated, so the pattern can span two lines.
    static const std::string pattern{
        "(([^\\s^=]+)\\s*=\\s*)?\"?"
        "([^\"^\\s^;]+)\"?\\s*;[!?](\\w+)\\s*" };
    std::regex reg{ pattern };

    // Note: constructing the iterator already performs the first search.
    std::sregex_iterator iter(fileData.begin(), fileData.end(), reg);
    std::sregex_iterator end;

    const auto before = std::chrono::high_resolution_clock::now();
    std::for_each(iter, end, print);
    const auto after = std::chrono::high_resolution_clock::now();

    std::chrono::duration<double, std::milli> delta = after - before;
    std::cout << delta.count() << "ms\n";
    file.close();
}
The file I am processing contains 541 lines. The program above needs 5 SECONDS to find all 507 matches. I have done things like this before in C# and never had a regex this slow, so I tried the same thing in C#:
var filedata = File.ReadAllText("D:\\File.txt", Encoding.Default);
const string regexPattern =
"(([^\\s^=]+)\\s*=\\s*)?\"?([^\"^\\s^;]+)\"?\\s*;[!?](\\w+)\\s*";
var regex = new Regex(regexPattern, RegexOptions.Multiline |
RegexOptions.Compiled );
var matches = regex.Matches(filedata);
foreach (Match match in matches)
{
Console.WriteLine(match.Value);
}
This needs only 500 MILLISECONDS to find all 507 matches and print them to the console. Since I have to work with C++, I need the C++ version to be faster.
How can I make my C++ program faster? What am I doing wrong?
-O2 or -O3? What is your compiler version?