I get this error:

    "message": "Malformed UTF-8 characters, possibly incorrectly encoded",

This is the actual id (a UUID) as stored in the database:

944CC79D-5980-4587-8A52-000A2F11D7D1

And this is the UUID when I fetch it in php artisan tinker:

b"ØÃLöÇYçEèR\0\n/\x11ÎÐ" // ____4C__-__59-__45-__52-000A2F11____

Why is this happening? I have tried casting the id to string/UUID in the models and cleared the cache, but to no avail.

My environment:

  • macOS Tahoe 26.1
  • PHP 8.4 (also tested with PHP 8.2)
  • Laravel 11
  • Database: SQL Server

Previously my environment was on Windows, and the program worked fine there because the UUIDs were not malformed.

So I'm pretty sure there is nothing wrong with the code. I think it's something related to the Mac. But on my coworker's Mac the UUIDs are not malformed, which makes this so weird to me.

I want the UUID to be passed correctly, not as: b"ØÃLöÇYçEèR\0\n/\x11ÎÐ"

Here's the php artisan tinker console. All UUIDs are malformed, not only on users:

$ php artisan tinker
Psy Shell v0.12.7 (PHP 8.2.29 — cli) by Justin Hileman
> User::first()
[!] Aliasing 'User' to 'App\Models\User' for this Tinker session.
= App\Models\User {#6624
    id: b"ØÃLöÇYçEèR\0\n/\x11ÎÐ",
    //rest of the data

Regarding the suggestion to add a cast on the model: I believe I already did that, but it does not fix the problem. Here's the cast:

protected $casts = [
    'email_verified_at' => 'datetime',
];

And here's the database description of users for the id:

Column_name | Type             | Computed | Length | Prec | Scale | Nullable | TrimTrailingBlanks | FixedLenNullInSource | Collation
id          | uniqueidentifier | no       | 16     |      |       | no       | (n/a)              | (n/a)                |
  • Please add relevant code on how to generate, send, fetch, and debug this uuid. Commented Nov 17 at 6:17
  • I don't generate it; the data is already there. I debug/fetch with php artisan tinker, like select * from X, and I see that the id comes back malformed, although in the DB it is in the correct UUID format. I don't know why it turns malformed on the Mac. Commented Nov 17 at 6:23
  • Can you share your tinker console? We would like to know the exact code that you executed for retrieving the data. Commented Nov 17 at 7:32
  • I've updated the question Commented Nov 17 at 8:32
  • The id column is probably stored as binary in the db, and there is a mechanism configured that turns it to a readable format. Commented Nov 18 at 8:35

2 Answers


I found that when I run
`\DB::connection()->getPdo()->getAttribute(\PDO::ATTR_DRIVER_NAME);`

in tinker, it returns dblib, but it should return sqlsrv, even though I had set sqlsrv in .env and config/database.php. It turned out that I needed to install pdo_sqlsrv.so and sqlsrv.so and enable the extensions in php.ini.

Then I restarted the PHP service (brew services restart [email protected]) and cleared Laravel's config cache. I had assumed that installing via Homebrew would install everything. This is my first time using a Mac, so yeah, I'm stupid.

What I still don't get is why using dblib causes the error. From what I found on the internet, dblib is also an extension for SQL Server, but an open-source one, whereas sqlsrv is from Microsoft. Either way, switching drivers solved it.
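
To check which SQL Server drivers a given PHP binary can actually use, before involving Laravel at all, something like the following may help. It is only a sketch: the helper name sqlServerDriverReport is made up for illustration, it opens no database connection, and it needs to be run with the same PHP binary that runs artisan.

```php
<?php
// Hypothetical helper (not part of Laravel): reports which SQL Server
// related PDO drivers this PHP binary has available.
function sqlServerDriverReport(): array
{
    // PDO::getAvailableDrivers() lists driver names such as "sqlsrv" or "dblib".
    $drivers = class_exists('PDO') ? PDO::getAvailableDrivers() : [];

    return [
        // Microsoft's driver, provided by the pdo_sqlsrv extension.
        'sqlsrv' => in_array('sqlsrv', $drivers, true),
        // The driver provided by the pdo_dblib extension.
        'dblib' => in_array('dblib', $drivers, true),
        // Whether the pdo_sqlsrv extension is loaded in this process.
        'pdo_sqlsrv_loaded' => extension_loaded('pdo_sqlsrv'),
    ];
}

foreach (sqlServerDriverReport() as $name => $available) {
    echo str_pad($name, 20), $available ? 'yes' : 'no', PHP_EOL;
}
```

If sqlsrv shows "no" here, the extension is not installed or not enabled in the php.ini that this binary reads, whatever .env and config/database.php say.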


5 Comments

As I also commented again: the UTF-8 encoding problem in the diagnostic message is not the actual problem. The actual problem (in the end) is that the raw binary value of the database's UUID column is assigned as a string to the model's id property. You want it assigned as a string, true, but not as raw binary; rather as the human-readable UUID string (36 characters). You're not stupid when you stay curious. This is complicated even to explain, let alone to discuss online. Thanks for also posting an answer, this clarifies a lot.
dblib does not become the error; the error is actually encoded within your code. This is not a judgment about the code. Using dblib merely has the consequence that uniqueidentifier columns are put into the model attributes as raw binary strings, whereas sqlsrv turns uniqueidentifier columns into human-readable UUID strings in the model. Your code just does not handle those two behaviors specifically (the difference is transparent to it, it "sees through it"), so any database configuration setting that affects this can make a difference. And because it is transparent, it's hard to see until it isn't.
And we should give some credit here: in essence, @shingo commented about this two days ago. Their description was: "The id column is probably stored as binary in the db, and there is a mechanism configured that turns it to a readable format." The work of finding that setting naturally remained; for your configuration it is now resolved by configuring the PHP/Laravel code to use the sqlsrv driver instead of the dblib driver.
Yes, the hint that @shingo shared was a lifesaver. Thanks, guys!
Thanks! And if you want to make SO an even better place, consider answering the earlier question "Fail generate UUID datatype (uniqueidentifier) from sql server using laravel" in your own words there, too. E.g. "I had the same problem and for me changing the driver to X from Y resolved the readable UUIDs on my models". Just a suggestion; do it in your own style.

Context matters for such questions. While you provided a couple of details, the problem statement remains pretty generic, which is often not a good fit for Stack Overflow (SO). I mention this only to explain why you may dislike this answer, as it most likely can't tell you how to fix this in every instance. I guess this is also a reason why others downvoted the question: they are missing an anchor to actually answer it.

Nevertheless, such questions do come up, and the best we can do is work from what we currently know. So here are some more details, based on the information you've provided, to help you dive deeper and find the culprit.

Free your mind, though, from the stance that this can't be the code or must be X/Y/Z. You can only say that after you have fixed it; before the fix it is only an assumption. (You have documented why you say so, so it came with context, but remain open to surprises.)


First of all, it is essential to know that you actually have the 16-byte uniqueidentifier column type and that the relational database management system (RDBMS) is Microsoft SQL Server.

Let's start with the more common places. This comes with both bad and good news.

First the diagnostic message:

Malformed UTF-8 characters, possibly incorrectly encoded

From your question alone, there is not much detail about where it comes from; we only know that the context is a UUID and that the message appears in JSON text.

The good/bad news is that, by my very rough estimate, about 50% of all UUIDs yield that message in their binary form.

The UUID you give as an example in its human-readable representation is only one of many whose binary representation is not valid UTF-8 (more about that in a moment):

    944CC79D-5980-4587-8A52-000A2F11D7D1

This is also the reason why, even without that message, you would still see a similar picture in the Laravel/PHP shell for the model's id property.

For that to work in your favour with Laravel 11, IIRC, you have to tell Eloquent that the property is a UUID key, per the Eloquent Model Conventions. Now you may argue that this can't be it because the code works on the other system. Then I'd have to tell you that you're barking up the wrong tree, unless you want to swap computers with your colleague. If not, read on.

So again good/bad news: check the requirements for UUID columns in the Eloquent model at hand. There might be a migration missing, or you might have forgotten to set it up. Double-checking might already put things in order, but even if everything is 100% correct in this regard, there are still things that can go wrong given the error picture you show. I can't tell, because the picture is not that clear to me (as written, roughly 50% of all UUIDs have the potential to yield that message).

Either way, your Eloquent User model's id property is not yet in the human-readable form documented by the Eloquent Model Conventions (they use the id property as the primary key for articles, not users). Compare:

$article = Article::create(['title' => 'Traveling to Europe']);
 
$article->id; // "8f8e8478-9035-4d23-b9a7-62f4d2612ce5"

If you have configured and coded everything correctly, you should expect the id property to be a string in the non-binary, human-readable representation of the 128-bit number, which distributes its 16 bytes over five segments of (lowercase) hexadecimal characters (0-9, a-f):

    XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
         4  - 2  - 2  - 2  - 6

This is the human-readable representation. We can quickly identify it both in the example listing from the Eloquent documentation and in the UUID you provided:

    8f8e8478-9035-4d23-b9a7-62f4d2612ce5   -- Eloquent Sample
    944CC79D-5980-4587-8A52-000A2F11D7D1   -- Your Sample
         4  - 2  - 2  - 2  - 6
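
A quick way to tell the two forms apart in a debugging session is to test a value against the 8-4-4-4-12 hexadecimal layout. The following is a minimal sketch; the function name looksLikeTextUuid is hypothetical and not part of Laravel or PHP.

```php
<?php
// Hypothetical helper: true if $value is in the 8-4-4-4-12 hexadecimal
// text form, in either lowercase or uppercase.
function looksLikeTextUuid(string $value): bool
{
    return preg_match(
        '/\A[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\z/i',
        $value
    ) === 1;
}

// Both readable samples match, regardless of letter case.
var_dump(looksLikeTextUuid('8f8e8478-9035-4d23-b9a7-62f4d2612ce5')); // bool(true)
var_dump(looksLikeTextUuid('944CC79D-5980-4587-8A52-000A2F11D7D1')); // bool(true)

// The same 128-bit number as 16 raw bytes is not in the text form.
var_dump(looksLikeTextUuid(hex2bin('944cc79d598045878a52000a2f11d7d1'))); // bool(false)
```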

That Eloquent uses the lowercase letters a-f and you used the uppercase letters A-F plays no role here. Both forms are common for hexadecimal numbers; only the case of the hexadecimal symbols differs. I'll therefore continue with your example, after stressing one point: your Laravel code and its configuration on that system are insufficient. Otherwise the Laravel PHP shell in which you "tinkered" would show a string in that human-readable format (probably in lowercase, as in the Laravel docs, unless their example is specific to PostgreSQL (Q&A)).

We now go deeper. So far we have only seen the human-readable form of the UUID, but you mention a problem specific to your RDBMS. Again, your sample of the 128-bit (16-byte) number, with its multi-byte segments in readable form:

    944CC79D-5980-4587-8A52-000A2F11D7D1
         4  - 2  - 2  - 2  - 6

The database does not store this as a string but as 16 bytes, and not simply as those bytes in the order shown: each of the multi-byte segments is stored in a specific byte order:

    944CC79D-5980-4587-8A52-000A2F11D7D1    -- byte-order in readable form
         4  - 2  - 2  - 2  - 6
    9DC74C94-8059-8745-8A52-000A2F11D7D1    -- byte-order in SQL Server
         4* - 2* - 2* - 2  - 6

In this small listing, the segments marked with a star "*" are stored in reversed byte order. This results in the mixed-endian format used for GUIDs: the first three segments are stored little-endian, while the last two keep the byte order of the readable form.
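
The reordering can be sketched in a few lines of PHP. This assumes the driver hands over the 16 bytes exactly in the SQL Server order from the listing; the function name guidBytesToText is made up for illustration and is not how Laravel or either driver does it internally.

```php
<?php
// Hypothetical helper: converts 16 raw bytes in SQL Server (GUID) byte order
// into the readable 8-4-4-4-12 text form, in uppercase.
function guidBytesToText(string $bytes): string
{
    if (strlen($bytes) !== 16) {
        throw new InvalidArgumentException('Expected exactly 16 bytes.');
    }

    $hex = strtoupper(bin2hex($bytes));

    // Reverses the byte order of a hex segment (two hex digits per byte).
    $reverse = static fn (string $segment): string =>
        implode('', array_reverse(str_split($segment, 2)));

    return implode('-', [
        $reverse(substr($hex, 0, 8)),  // 4 bytes, stored little-endian
        $reverse(substr($hex, 8, 4)),  // 2 bytes, stored little-endian
        $reverse(substr($hex, 12, 4)), // 2 bytes, stored little-endian
        substr($hex, 16, 4),           // 2 bytes, order unchanged
        substr($hex, 20, 12),          // 6 bytes, order unchanged
    ]);
}

// The SQL Server byte order from the listing yields the readable sample.
echo guidBytesToText(hex2bin('9DC74C94805987458A52000A2F11D7D1')), PHP_EOL;
// prints 944CC79D-5980-4587-8A52-000A2F11D7D1
```

A conversion like this is a workaround at best; when the driver already returns the text form, applying it would corrupt the value.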

We can also see this in your binary string, which encodes 16 byte values. While some of its data got lost during the translation of the bytes through the various systems (e.g. when posting it as a question here on Stack Overflow), values below 128 (0-127) are stable, and we can see their matching positions:

         4* - 2* - 2* - 2  - 6
    9DC74C94-8059-8745-8A52-000A2F11D7D1    -- Your sample in SQL Server byte order
    ____4C__-__59-__45-__52-000A2F11____    -- b"ØÃLöÇYçEèR\0\n/\x11ÎÐ"

This at least allows the educated guess that the UUID and the id property string in artisan-tinker(1) are related.
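
The masked comparison line can be reproduced mechanically: keep every byte below 128 as hex and blank out the rest. A small sketch follows, with the made-up helper name maskNonAsciiBytes, fed with the 16 bytes in SQL Server byte order.

```php
<?php
// Hypothetical helper: renders 16 raw bytes as hex in the 4-2-2-2-6 layout,
// replacing every byte >= 0x80 (not stable across encodings) with "__".
function maskNonAsciiBytes(string $bytes): string
{
    if (strlen($bytes) !== 16) {
        throw new InvalidArgumentException('Expected exactly 16 bytes.');
    }

    $out = '';
    foreach (str_split($bytes) as $index => $byte) {
        // A dash starts each new segment (before bytes 4, 6, 8 and 10).
        if (in_array($index, [4, 6, 8, 10], true)) {
            $out .= '-';
        }
        $out .= ord($byte) < 0x80 ? strtoupper(bin2hex($byte)) : '__';
    }

    return $out;
}

echo maskNonAsciiBytes(hex2bin('9DC74C94805987458A52000A2F11D7D1')), PHP_EOL;
// prints ____4C__-__59-__45-__52-000A2F11____
```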

And finally, it allows the educated guess that your Eloquent model gets the data from the database, which holds it as a 128-bit binary number ("GUID"), and that this is then assigned to the id property as raw binary data, which is what strings in PHP are.

Whenever that string is JSON-encoded, PHP's json_encode() function will yield the JSON error, or throw a JsonException, with the message "Malformed UTF-8 characters, possibly incorrectly encoded", which is what you were able to summon within the JSON text in your question.
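
This can be reproduced without Laravel or a database. The sketch below assumes the raw value is the 16 bytes in SQL Server byte order; the variable names are illustrative only.

```php
<?php
// The sample UUID as 16 raw bytes in SQL Server byte order (assumed input).
$raw = hex2bin('9DC74C94805987458A52000A2F11D7D1');

// Without JSON_THROW_ON_ERROR, json_encode() returns false on invalid UTF-8.
$json = json_encode(['id' => $raw]);
$message = json_last_error_msg();

var_dump($json);        // bool(false)
echo $message, PHP_EOL; // Malformed UTF-8 characters, possibly incorrectly encoded

// With JSON_THROW_ON_ERROR, the same problem surfaces as a JsonException.
try {
    json_encode(['id' => $raw], JSON_THROW_ON_ERROR);
} catch (JsonException $e) {
    echo get_class($e), ': ', $e->getMessage(), PHP_EOL;
}

// The readable text form is plain ASCII and encodes without error.
$textJson = json_encode(['id' => '944CC79D-5980-4587-8A52-000A2F11D7D1']);
echo $textJson, PHP_EOL; // {"id":"944CC79D-5980-4587-8A52-000A2F11D7D1"}
```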

So double-check your model's configuration, the related migrations, their representation in the database, and the related database model in your RDBMS.

As users are involved, create a new model just for testing, unrelated to your existing app functionality, to verify that your mental model (what and how you understand all of this) is aligned with the reality of the system on your computer.

Who knows, maybe it's just a database driver option whose default differs between the two systems.


Edit (after): Your answer shows that in your case it was the concrete driver for SQL Server: dblib gave your code the raw binary UUID strings, while sqlsrv yields the human-readable UUID strings.

2 Comments

Hmm, thanks for the explanation, and sorry for the poor information in the question. I didn't know what was causing this specific problem, so I just threw in whatever information might be useful. Usually I ask about code, but this time it was not about the code. Anyway, I have found the solution.
Yes, a question that is not about a concrete segment of code is hard to ask, because we don't have a minimal reproducer and, with it, the more direct correlations. Then this is all we can do: throw more and more information at it, in the hope that it remains an invitation to unlock the "secret code" behind it. This often suggests a configuration issue, so the interactive tinkering you did in the Laravel PHP shell is a good approach, even if the reproduced picture itself did not lead to the fix immediately (which we naturally hope for).
