I am trying to extract values from the result the Azure Computer Vision API returns after reading text from an image. The output is JSON, but the structure of the result looks strange.

Ultimately, I want to pull out each "text" value and write it to a text file without any escape characters.

Here is the code I am using to parse the result.

static async Task MakeOCRRequest(string imageFilePath)
{
    try
    {
        HttpClient client = new HttpClient();
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        string requestParameters = "language=unk&detectOrientation=true";
        string uri = uriBase + "?" + requestParameters;
        HttpResponseMessage response;
        byte[] byteData = GetImageAsByteArray(imageFilePath);

        using (ByteArrayContent content = new ByteArrayContent(byteData))
        {
            content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
            response = await client.PostAsync(uri, content);
        }

        string contentString = await response.Content.ReadAsStringAsync();

        ///////  It is at this point that I want to get the values from the "text" field
        JToken token = JToken.Parse(contentString).ToString();
        String[] result = contentString.Split(',');
        Console.WriteLine("\nResponse:\n\n{0}\n", JToken.Parse(contentString).ToString());

    }
    catch (Exception e)
    {
        Console.WriteLine("\n" + e.Message);
    }
}

And here is the result that I get from the OCR process. I haven't included the full result, as it runs to over 1,700 lines.


{
  "language": "en",
  "textAngle": 0.0,
  "orientation": "Right",
  "regions": [
    {
      "boundingBox": "140,300,639,420",
      "lines": [
        {
          "boundingBox": "419,300,87,15",
          "words": [
            {
              "boundingBox": "419,300,87,15",
              "text": "0000175351"
            }
          ]
        },
        {
          "boundingBox": "140,342,337,47",
          "words": [
            {
              "boundingBox": "140,347,92,38",
              "text": "WE."
            },
            {
              "boundingBox": "241,347,13,36",
              "text": "1"
            },
            {
              "boundingBox": "266,342,211,47",
              "text": "0/1-1.9(2)"
            }
          ]
        },

With the current code, this line throws the error shown beneath it:

JObject textResult = token["regions"]["text"].Value<JObject>();

Cannot access child value on Newtonsoft.Json.Linq.JValue.

I wonder if I am requesting the wrong key?

  • The pasted result isn't valid JSON: it doesn't have commas, and strings can't be split over multiple lines. Plus, "regions" is an array, not an object. Commented May 17, 2019 at 8:39
  • Hi Tom - I have adjusted the code to how it was on prior builds - this included a split to include the comma. Please see revision :-) Commented May 17, 2019 at 8:49
  • You can check whether a JSON string is valid using this tool. Commented May 17, 2019 at 8:50
  • Ok - brilliant, thank you S. Oriolo. It says that this is now valid JSON - but what is the correct way to get the values of "text" out of the result? Commented May 17, 2019 at 8:57
  • The JSON you pasted contains multiple text values; can you be more specific and add the code that throws that error? From what you pasted, it seems that some code is missing. Commented May 17, 2019 at 8:59

2 Answers


If you need to retrieve every "text" property value regardless of its boundingBox, you can use the LINQ query below after parsing your JSON into a JToken.

JToken jToken = JToken.Parse(json);

var allTexts = jToken["regions"]
    .SelectMany(reg => reg["lines"])
    .SelectMany(line => line["words"])
    .Select(word => word["text"].ToString())
    .ToList();

Output (from the debugger): [screenshot not available]
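To get from that list to the text file the question asks for, something like the following sketch could work. It assumes the response has the regions / lines / words shape shown in the question; the names OcrTextExtractor and ExtractLines are hypothetical and not part of any library. Casting a JToken to string returns the raw value, so no JSON quoting or escaping ends up in the output.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of Newtonsoft.Json or the Azure SDK.
public static class OcrTextExtractor
{
    // Returns one string per OCR line, with that line's words joined by spaces.
    public static List<string> ExtractLines(string json)
    {
        JToken root = JToken.Parse(json);

        // Assumption: "regions" may be absent when no text was detected.
        JToken regions = root["regions"];
        if (regions == null)
        {
            return new List<string>();
        }

        return regions
            .SelectMany(region => region["lines"])
            .Select(line => string.Join(" ",
                line["words"].Select(word => (string)word["text"])))
            .ToList();
    }
}
```

Inside MakeOCRRequest, the result could then be written out with File.WriteAllLines("ocr-output.txt", OcrTextExtractor.ExtractLines(contentString)); where the file name is only a placeholder.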


Assuming you now have a valid JSON string, you can use the Newtonsoft.Json package to deserialize it into an object and then read the values from that object:

ResponseModel res = JsonConvert.DeserializeObject<ResponseModel>(contentString);

Your response model could be a POCO class like this:

public class ResponseModel
{
    public string language { get; set; }
    // textAngle is a JSON number (0.0 in the sample), so double fits better than string
    public double textAngle { get; set; }
    public string orientation { get; set; }
    // you have to create a POCO class for RegionModel
    public List<RegionModel> regions { get; set; }
    ....
}
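The nested classes are not shown in the answer. A possible sketch, based only on the JSON sample in the question, is given here. RegionModel is the name the answer uses; LineModel, WordModel, OcrModels and AllWords are hypothetical names, and ResponseModel is repeated only so the block compiles by itself. Any fields the real response carries beyond those in the sample are not modelled.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;

public class WordModel
{
    public string boundingBox { get; set; }
    public string text { get; set; }
}

public class LineModel
{
    public string boundingBox { get; set; }
    public List<WordModel> words { get; set; }
}

public class RegionModel
{
    public string boundingBox { get; set; }
    public List<LineModel> lines { get; set; }
}

public class ResponseModel
{
    public string language { get; set; }
    // A number in the sample JSON (0.0), hence double rather than string.
    public double textAngle { get; set; }
    public string orientation { get; set; }
    public List<RegionModel> regions { get; set; }
}

// Hypothetical helper showing how the typed model can be walked.
public static class OcrModels
{
    // Flattens regions -> lines -> words and returns every "text" value in order.
    public static List<string> AllWords(string json)
    {
        ResponseModel res = JsonConvert.DeserializeObject<ResponseModel>(json);
        return res.regions
            .SelectMany(region => region.lines)
            .SelectMany(line => line.words)
            .Select(word => word.text)
            .ToList();
    }
}
```

Compared with indexing into a JToken, the typed model gives compile-time checking of property names, at the cost of keeping the classes in step with the response format.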
