
How can I reliably handle exception traces that potentially contain binary data and store those traces in MongoDB from PHP?

We use MongoDB only for logging, and we have a dedicated collection for logging any uncaught exception. Recently, the fallback text log that we write to when storing a log entry in MongoDB fails caught the following message:

Detected invalid UTF-8 for field path "message":

We recently started using BINARY(16) UUIDs in MySQL for a high-write table, and some code working with that table threw an exception. The raw binary ended up in the getTrace() output, which caused the message above when we tried to store the trace in MongoDB. We currently store these logs like this:

$collection->insertOne([
  'errorMessage' => $exception->getMessage(),
  'errorFile'    => $exception->getFile(),
  'errorLine'    => $exception->getLine(),
  'message'      => htmlspecialchars(print_r($exception->getTrace(),true)),
]);

Would it be better to just store $exception->getTrace() without the wrapping functions and process on the other end, or would this result in the same scenario? We were doing the htmlspecialchars and the print_r as a quick and dirty way to convert from an array into a string for a different product to read in. At this point, we have no reservations against changing this methodology.

  • Base64 encode it; you can detect it with preg_match('/[^[:print:]]/', $data) Commented Jul 19, 2019 at 3:56
  • @ArtisticPhoenix Just base64 encode the entire message string based on that preg_match? Man, if it's really that easy, I'll just need to work on the processing side of things. :D Commented Jul 19, 2019 at 3:58
  • It is. It's because Mongo uses BSON, which is basically a binary form of JSON, and JSON barfs on non-UTF-8 strings. [:print:] matches printable chars, so [^[:print:]] finds non-printable chars. In PHP 7 you can do some more things with non-UTF-8 and JSON, but I don't know if that will help you with Mongo. Commented Jul 19, 2019 at 4:01
  • Thanks man, definitely saved my day! I can worry about making it searchable after pumping it into a different system. Could you please put your comment in as an answer so I can accept it? Commented Jul 19, 2019 at 4:03

1 Answer

You should be able to "armor" the binary with base64_encode. This is typical when storing things like images as text, for example when putting image data directly into the HTML instead of linking to a file.

The tricky part is detecting that it's binary. You can probably get away with just detecting non-UTF-8 characters, because that's the main issue; a similar problem exists with json_encode. Mongo uses BSON, a binary serialization of JSON-like documents, and its strings must be valid UTF-8.

   if(preg_match('/[^[:print:]]/', $data)) $data = base64_encode($data);
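A slightly stricter variant of that check, as a minimal sketch. Without the /u modifier, [:print:] typically covers only printable ASCII, so multi-line print_r output and valid accented text would be encoded as well. The helper name armorForMongo and the returned flag are hypothetical; the sketch relies on preg_match() returning false when /u is set and the subject is not valid UTF-8.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Returns [string $value, bool $isBase64].
function armorForMongo(string $data): array
{
    // With the /u modifier PCRE validates the subject as UTF-8;
    // preg_match() returns false (not 0) when the bytes are malformed.
    $isValidUtf8 = preg_match('//u', $data) === 1;

    if ($isValidUtf8) {
        return [$data, false];
    }

    // Only strings that are not valid UTF-8 get armored.
    return [base64_encode($data), true];
}

// A 16-byte value such as a BINARY(16) UUID, written as hex for readability.
$uuid = hex2bin('9f8b7c6d5e4f40318a2b1c0d9e8f7a6b');
[$stored, $isBase64] = armorForMongo('lookup failed for id ' . $uuid);
```

Storing the flag next to the value lets the reading side decide whether to call base64_decode.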

The only real downsides are the extra cost of converting and the increase in size: base64 output is roughly a third larger than its input.

But sometimes that's an acceptable trade-off.
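To put a number on the size cost, here is a small sketch using random bytes as stand-in data (the variable names are illustrative only). Base64 maps every 3 input bytes to 4 output characters, padding the last group with '='.

```php
<?php
// Stand-in payloads; random_bytes() is used only to get arbitrary binary.
$uuid  = random_bytes(16);    // size of one BINARY(16) value
$trace = random_bytes(3000);  // a larger blob

$encodedUuid  = base64_encode($uuid);
$encodedTrace = base64_encode($trace);

// Prints "16 -> 24 bytes" and "3000 -> 4000 bytes".
printf("%d -> %d bytes\n", strlen($uuid), strlen($encodedUuid));
printf("%d -> %d bytes\n", strlen($trace), strlen($encodedTrace));
```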


3 Comments

One other idea would be to encode it only when you catch the error from Mongo: try to insert it, catch the error, convert, and then try again. This will work too, as long as there is a specific exception class or code for this that you can use. The advantage is that there is no additional cost until the exception happens (because you're not testing every row).
I wound up going the try/catch route, including a base64 boolean to quickly determine whether the message is encoded, and patching our displays.
Kool, glad I could help
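
The try/catch route described in these comments could look roughly like the sketch below. The function name insertWithBase64Fallback and the 'base64' field name are hypothetical. It also assumes that the driver reports the encoding failure as a subclass of SPL's UnexpectedValueException whose message contains "invalid UTF-8", and that only the 'message' field can carry binary; check both against your driver version and documents.

```php
<?php
// $insert is any callable that writes one document, for example:
//   fn (array $doc) => $collection->insertOne($doc)
function insertWithBase64Fallback(callable $insert, array $doc): array
{
    $doc['base64'] = false;

    try {
        $insert($doc);
    } catch (\UnexpectedValueException $e) {
        // Assumption: the UTF-8 failure surfaces as this exception type
        // with "invalid UTF-8" in its message. Anything else is rethrown.
        if (stripos($e->getMessage(), 'invalid UTF-8') === false) {
            throw $e;
        }

        // Armor the offending field, flag it, and retry once.
        $doc['message'] = base64_encode($doc['message']);
        $doc['base64']  = true;
        $insert($doc);
    }

    // Return the document as it was actually stored.
    return $doc;
}
```

Passing the insert as a callable keeps the retry logic independent of the collection object, so it can be exercised with a stub before pointing it at MongoDB.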
