I'm working with another API that calls the Google Document AI API. I'm trying to read the JSON string from a file into a Document object. How should this be done?

I tried the following, but it does not work:

import com.google.cloud.documentai.v1.Document;
import java.io.FileInputStream;

Document document = Document.parseFrom(new FileInputStream("src/main/resources/responseFromAPICall.json"));
System.out.println(document.getText());

I'm getting this error:

Exception in thread "main" com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
    at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:129)
    at com.google.protobuf.CodedInputStream$StreamDecoder.checkLastTagWas(CodedInputStream.java:2124)
    at com.google.protobuf.CodedInputStream$StreamDecoder.readGroup(CodedInputStream.java:2358)
  • Have you tried changing the path of the file that you are reading? Commented Feb 7, 2022 at 22:00
  • Yes I tried reading from several response json files. It gives me the same error each time. Commented Feb 8, 2022 at 7:34
  • Were the files that you tried reading in the path "src/main/resources/filename.JSON"? Commented Feb 8, 2022 at 16:28
  • Yes. I did not get a FileNotFoundException Commented Feb 9, 2022 at 13:49

1 Answer

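I came across this issue today as well. This answer gave me the starting point for a solution. Document.parseFrom expects the binary protobuf wire format, so it cannot read a JSON file; JSON has to be parsed with JsonFormat from protobuf-java-util instead.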

If your JSON file was saved from a call to Document AI and looks like this:

{
  "document": {
    ...
    "text": "...",
    ...
  },
  "humanReviewStatus": {...}
}

you may use the following code snippet:

import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import com.google.cloud.documentai.v1.Document;
import com.google.cloud.documentai.v1.ProcessResponse;
import com.google.protobuf.util.JsonFormat;

Path filePath = Paths.get("src/main/resources/responseFromAPICall.json");
ProcessResponse.Builder responseBuilder = ProcessResponse.newBuilder();
// Parse the JSON into the builder; try-with-resources closes the reader.
try (Reader reader = Files.newBufferedReader(filePath)) {
    JsonFormat.parser().merge(reader, responseBuilder);
}
// The Document is nested under the "document" key of the response.
Document document = responseBuilder.getDocument();
System.out.println(document.getText());

If your JSON file only contains the "document" object:

{
  ...
  "text": "...",
  ...
}

This code will do the trick:

import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import com.google.cloud.documentai.v1.Document;
import com.google.protobuf.util.JsonFormat;

Path filePath = Paths.get("src/main/resources/responseFromAPICall.json");
Document.Builder docBuilder = Document.newBuilder();
// Parse the JSON straight into a Document builder; try-with-resources closes the reader.
try (Reader reader = Files.newBufferedReader(filePath)) {
    JsonFormat.parser().merge(reader, docBuilder);
}
System.out.println(docBuilder.getText());
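
As a side note on why the original call fails with an end-group error: in the binary wire format every field starts with a tag whose low three bits are the wire type and whose remaining bits are the field number. A JSON file normally starts with `{` (0x7B), which decodes as a start-group tag. Here is a minimal sketch using only the standard library; the class name WireTagPeek and its helpers are made up for illustration and are not part of protobuf:

```java
import java.nio.charset.StandardCharsets;

/** Decodes a single byte as if it were a one-byte protobuf wire-format tag. */
class WireTagPeek {

    // Wire type names as listed in the protobuf encoding guide (types 0 to 5).
    static final String[] WIRE_TYPES = {"VARINT", "I64", "LEN", "SGROUP", "EGROUP", "I32"};

    /** Field number: the tag with its low three bits shifted away. */
    static int fieldNumber(int tag) {
        return tag >>> 3;
    }

    /** Wire type: the low three bits of the tag. */
    static int wireType(int tag) {
        return tag & 0x7;
    }

    /** Describes a one-byte tag (only meaningful for bytes below 0x80). */
    static String describe(byte first) {
        int tag = first & 0xFF;
        int type = wireType(tag);
        String name = type < WIRE_TYPES.length ? WIRE_TYPES[type] : "invalid";
        return "field " + fieldNumber(tag) + ", wire type " + type + " (" + name + ")";
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the start of a saved JSON response.
        byte[] json = "{ \"text\": \"...\" }".getBytes(StandardCharsets.UTF_8);
        System.out.println(describe(json[0])); // prints "field 15, wire type 3 (SGROUP)"
    }
}
```

So the binary parser appears to treat the opening brace as the start of a group and then fails when the following bytes do not contain the matching end-group tag, which seems consistent with the readGroup frame in the stack trace.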