0

I have a text file that contains nearly a million records in JSON format, like this:

[{"ev":"AM","sym":"TMHC","v":1000,"av":74917,"op":18.92,"vw":19.1305,"o":19.13,"c":19.15,"h":19.15,"l":19.13,"a":19.143,"z":90,"n":1,"s":1549380300000,"e":1549380360000},{"ev":"AM","sym":"AAPL","v":7103,"av":184266,"op":35.27,"vw":35.3148,"o":35.3264,"c":35.34,"h":35.34,"l":35.3258,"a":35.3345,"z":710,"n":1,"s":1549380300000,"e":1549380360000}]
[{"ev":"AM","sym":"VB","v":213,"av":98285,"op":149.75,"vw":150.0575,"o":150.2104,"c":150.2104,"h":150.2104,"l":150.2104,"a":150.1944,"z":35,"n":1,"s":1549380300000,"e":1549380360000}]

I need to find every JSON element in the file whose symbol is AAPL. For example, if I pass AAPL, it should return the list of AAPL elements from the whole file:

{"ev":"AM","sym":"AAPL","v":7103,"av":184266,"op":35.27,"vw":35.3148,"o":35.3264,"c":35.34,"h":35.34,"l":35.3258,"a":35.3345,"z":710,"n":1,"s":1549380300000,"e":1549380360000}

How can I find it? I am trying to use JSONPath for this, but converting to a JObject throws an error like:

Error reading JObject from JsonReader

I have used the code below:

const string filePath = @"D:\Aggregate_Minute_AAPL.json";
string text = System.IO.File.ReadAllText(filePath);
Newtonsoft.Json.Linq.JArray jsonArray = Newtonsoft.Json.Linq.JArray.Parse(text);
var json = Newtonsoft.Json.Linq.JObject.Parse(jsonArray.ToString());
var title = json.SelectToken("$.ev.sym[*]");
Console.WriteLine(title.First());
  • I think the first problem is that you've shown us that each line of your file is a unique JSON record, yet you're trying to parse the contents of the file as a single JSON record. Commented Feb 19, 2019 at 5:39
  • Does your file contain JSON data that starts directly with an array of objects, or could it be inside curly brackets like { ... }? Commented Feb 19, 2019 at 5:41
  • No, each record starts with [{...}]. Searching for this error suggests removing the square brackets from the records, but the file is very large, so looping over it to remove them would also take time. Commented Feb 19, 2019 at 5:43
  • @er-sho what is wrong with the JSON format? Each line of the file has a JSON array. OP needs to read and parse each line individually. Commented Feb 19, 2019 at 5:44
  • A JSON value must have a key, and in your JSON there is no key defined for each array. Copy and paste your JSON here to validate it: json2csharp.com Commented Feb 19, 2019 at 5:45

2 Answers

1

You need to extract every array, whether they are all on one line or each on its own line, parse each one into a JArray, and then select the objects whose "sym" property has the desired value.

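using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using Newtonsoft.Json.Linq;

public static List<JObject> GetObjectByValue(string filePath, string matchValue)
{
    var text = File.ReadAllText(filePath);

    // Non-greedy match of each [...] array; assumes no nested arrays, as in the sample data
    var pattern = @"\[(.*?)\]";

    var matches = Regex.Matches(text, pattern);

    var result = matches.Cast<Match>()
            .Select(a => JArray.Parse(a.Value))
            // Flatten the objects of every array into a single sequence
            .SelectMany(b => b.ToObject<JObject[]>())
            .Where(x => x.Properties()
                         .Any(y => y.Name == "sym" && y.Value.ToString() == matchValue))
            .ToList();

    return result;
}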

Usage:

var list_obj = GetObjectByValue(@"Path to your text file", "VB");

Edit 1:

If you want to collect the matching objects using Parallel.For, you can use the function below:

// Also requires: using System.Threading.Tasks;
public static List<JObject> GetObjectByValue(string filePath, string matchValue)
{
    var text = File.ReadAllText(filePath);

    var pattern = @"\[(.*?)\]";

    // Materialize the matches up front so worker threads only read from a plain array
    var matches = Regex.Matches(text, pattern).Cast<Match>().ToArray();

    List<JObject> jObjects = new List<JObject>();

    Parallel.For(0, matches.Length, i =>
    {
        JArray jArray = JArray.Parse(matches[i].Value);
        var res = jArray.ToObject<JObject[]>().Where(x => x.Properties().Any(y => y.Name == "sym" && y.Value.ToString() == matchValue)).ToList();
        // List<T> is not thread-safe, so guard the shared list with a lock
        lock (jObjects)
        {
            jObjects.AddRange(res);
        }
    });

    return jObjects;
}
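
PLINQ (AsParallel) may be another option here, since it avoids the shared list altogether. The following is only a rough sketch, assuming each line of the file holds exactly one JSON array, as described in the comments under the question. The names SymbolSearch and FindBySymbolParallel are made up for illustration, and whether this is actually faster depends on the file and the machine, so it would need measuring.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical helper class, not part of Newtonsoft.Json
public static class SymbolSearch
{
    public static List<JObject> FindBySymbolParallel(string filePath, string symbol)
    {
        return File.ReadLines(filePath)               // lazily enumerate the lines
            .AsParallel()                             // let PLINQ spread the work across cores
            .Where(line => !string.IsNullOrWhiteSpace(line))
            // Assumption: one JSON array per line; flatten its objects
            .SelectMany(line => JArray.Parse(line).OfType<JObject>())
            .Where(obj => (string)obj["sym"] == symbol)
            .ToList();                                // PLINQ merges the per-thread results
    }
}
```

It would be called like `SymbolSearch.FindBySymbolParallel(@"Path to your text file", "AAPL")`. Note that the results are not guaranteed to come back in file order unless `.AsOrdered()` is added after `.AsParallel()`.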

5 Comments

Thank you for your valuable help, it's working fine. Can we use AsParallel (PLINQ) on our file, and can we use multiple threads so that it runs much faster? We also need to get all AAPL symbols from the file, so it will take a lot of time.
Yes, you can run it in parallel to speed it up, and use .ToList() rather than FirstOrDefault() to get every match.
Answer updated; see the Edit 1 section in the answer and let me know :)
Edit 1 works exactly as needed. I just need to reduce its latency, because the text file is very large: there are nearly 8 lakh (800,000) records in it. Thank you very much for your help.
And once you've improved the latency, let me know how you did it.
0

Since the file is large it might be better to read it line by line rather than all at once.

Understand your structure: each line is a JSON array containing one or more JSON objects.

// Requires: using System; using System.IO; using Newtonsoft.Json.Linq;
// Read the file lazily, one line at a time
foreach (var line in File.ReadLines(filePath))
{
    // Parse the line into an array
    JArray jsonArray = JArray.Parse(line);
    // An array can hold more than one object (the first sample line has two),
    // so iterate instead of hardcoding index 0
    foreach (JObject json in jsonArray.Children<JObject>())
    {
        // Access the token
        var title = json["sym"]; // or json.SelectToken("sym");
        // Print the symbol value
        Console.WriteLine(title);
    }
}
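
If the arrays are not reliably one per line, Newtonsoft's JsonTextReader can also stream through several top-level arrays in one file when its SupportMultipleContent property is set, without loading the whole file into memory. This is a minimal sketch under that assumption. The names StreamingSymbolSearch and FindBySymbol are hypothetical, and it assumes only the record objects carry a "sym" property.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical helper class, not part of Newtonsoft.Json
public static class StreamingSymbolSearch
{
    public static IEnumerable<JObject> FindBySymbol(string filePath, string symbol)
    {
        using (var file = new StreamReader(filePath))
        // SupportMultipleContent lets the reader continue past the end of one
        // top-level array into the next one
        using (var reader = new JsonTextReader(file) { SupportMultipleContent = true })
        {
            while (reader.Read())
            {
                // Skip array brackets; only load complete objects
                if (reader.TokenType != JsonToken.StartObject) continue;

                JObject obj = JObject.Load(reader); // consumes exactly one object
                if ((string)obj["sym"] == symbol)
                {
                    yield return obj;
                }
            }
        }
    }
}
```

Because the method is an iterator, matches are produced as the file is read, for example `foreach (var o in StreamingSymbolSearch.FindBySymbol(filePath, "AAPL")) Console.WriteLine(o);`.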
