1

I am currently considering an implementation of a service, that needs to store a lot of variable user data. That is: in one query a user might search for a,b, c, in the next a search goes over q, x, a and then someone comes up with additional data introducing z and y, which then again should be used for certain lookups.

If I would implement that as a standard SQL table, I had columns like a, b, c, ..., z, where most of those columns are null, and every few weeks someone would add another column for a new special case.

My solution for this issue is to describe those objects not through columns but as XML objects, which then only include those pieces of data that are relevant for the given object:

<myObject>
  <a>ValueA</a>
  <d>ValueD</d>
    ...
  <q>ValueQ</q>
</myObject>

While this solves the issue from a data perspective, reducing clutter and making the data much more readable, I wonder how the performance of MySQL compares to the usage of columns.

I cannot find any information about an XML index that allows me to search for all objects, where i.E. <a>value</a>. As look-up performance will be important for this service I need to know about what magnitude of performance hit implementing my data values as XML I have to face. Sadly I could not find any reliable performance analysis either.

What is the performance difference of using columns vs. saving data as XML in MySQL?

1 Answer 1

1

Data is not automatically indexed, if you query on ExtractValue results the comparison is done in-memory and no index is used. You need to manually define these as "STORED" columns and this only works on 5.7+.

ALTER TABLE foo ADD COLUMN some_node_of_my_xml TEXT AS (ExtractValue(doc, "/some/path/to/my/node/@name")) STORED;

Then you can query over some_node_of_my_xml like you'd do on any other column. But in essence it's exactly the same as extracting the data and storing it at an application level instead of DB level.

What I'd suggest instead is to store these "dynamic" fields in a separate table, name/value pairs, and JOIN. Or implement ElasticSearch, or switch to a "traditional" document database like Mongo.

Sign up to request clarification or add additional context in comments.

2 Comments

Putting them in tables again would bring me back to my initial problem, so that is not an option. I would choose a different database, but there is a clear requirement to use MySQL and nothing else, sadly.
Then moving them to a different table where "name" and "value" are indexed columns is your only option probably.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.