
Earlier today I posted a question about converting a JSON array into a table, and was lucky enough to find the solution after some further searching.

Now, after searching quite a bit more than the previous time, I'm still stuck (I saw some entries in this forum, but they do not specifically resolve my problem).

There are some cases in which I need to respond to a request with the result of a select that has a variable number of records (could be thousands), each having about 20 columns.

Now, the way I found to build JSON out of the select (by adding FOR JSON AUTO) works very nicely and indeed creates an array of records, each having every column value preceded by its column name.

This, however, makes the result several times larger than needed (I'm thinking about network traffic, especially when it is not over a LAN).

To overcome this, I split the response into two, a Header and a Body, where the Header contains the list of the column names in the response (in the correct order) while the body contains, for each record, the list of values (matching the number and order of the Header).

Example:

If the source table would look like this:

  A     |      B      |         C
--------+-------------+--------------------
 11     |     22      |   2018-04-07 12:44
 33     |     44      |   2017-02-21 18:55
 55     |     66      |   2016-11-12 00:03

and the Body of the response should contain the values of columns "A", "B" and "C" from that table (all converted to strings), the response would look as follows:

{"Response": {
               "Header":["A","B","C"],
                "Body":[["11","22","2018-04-07 12:44"],
                        ["33","44","2017-02-21 18:55"],
                        ["55","66","2016-11-12 00:03"]
                       ]
             }
}
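
For reference, this is roughly what the default FOR JSON AUTO output would look like for the same three rows (the table variable below is just for illustration); every column name is repeated in every record, which is exactly the overhead I'm trying to avoid:

DECLARE @Source TABLE (A INT, B INT, C DATETIME);
INSERT INTO @Source (A, B, C) VALUES
    (11, 22, '2018-04-07 12:44'),
    (33, 44, '2017-02-21 18:55'),
    (55, 66, '2016-11-12 00:03');

-- Produces (dates are serialized as ISO 8601 by FOR JSON), roughly:
-- [{"A":11,"B":22,"C":"2018-04-07T12:44:00"},
--  {"A":33,"B":44,"C":"2017-02-21T18:55:00"},
--  {"A":55,"B":66,"C":"2016-11-12T00:03:00"}]
SELECT A, B, C FROM @Source FOR JSON AUTO;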

Unfortunately, I'm not finding a way to get the contents of the Body without the "A","B" and "C" names.

Update

I want to stress the fact that the different columns within the table record may be of different types, so I would convert them all to strings. See the updated example table and the expected result.

  • You are asking for a completely different result, not the JSON representation of the table data. A meaningful result would be [{"A":11,"B":22}, {"A":33,"B":44},...]. It looks like you are trying to construct some web page response in SQL? Why not do that on the service itself? I won't say REST service because that shape is unsuitable for REST; it only makes sense as data for producing table reports. Commented Jul 18, 2018 at 12:24
  • @PanagiotisKanavos, thank you for your comment. What I'm looking for is to get, for each table record, an array of strings (I'll update the question to represent that - thanks for triggering this correction). Commented Jul 18, 2018 at 12:27
  • Do so on the client. The result will defeat the very purpose of JSON though. Instead of the representation of objects you'll have something that will require extra processing to convert it back to objects. Commented Jul 18, 2018 at 12:28
  • Network bandwidth shouldn't be a concern - after all, responses are compressed. If you have a lot of data though, perhaps you shouldn't be using JSON at all. Instead of inventing a new format, check protobuf, Avro, or one of the many big data formats. Commented Jul 18, 2018 at 12:32
  • @PanagiotisKanavos, well, I'm used (for many years already) to considering everything, including network load. As for your assumption that I'm inventing something new, I must say that I completely disagree. JSON is great for data transfer, and the only question is how to best MODEL the data to ease the load on all the parts forming the final system (and yes, in some cases you can distribute the processing load over the client applications; in my case, the JavaScript handling a large amount of data -several MB- in the above-mentioned format completes the task in a fraction of a second). Commented Jul 18, 2018 at 12:43

2 Answers


As @Jeroen-Mostert notes, this is pretty simple to do in procedural code. You can even have SQL Server do it using SQLCLR. Because this shape is not natural for FOR JSON queries, a CLR-based solution is probably better than a T-SQL one.

The code below concatenates the results into a single string, but you could change that to stream the results over multiple rows (like FOR JSON queries do), or add in GZip compression.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.Text;
using Microsoft.SqlServer.Server;

public partial class StoredProcedures
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void RunQuery (string sql)
    {
        /*
         {"Response":{"Header":["A","B","C"],
                "Body":[["11","22","2018-04-07 12:44"],
                        ["33","44","2017-02-21 18:55"],
                        ["55","66","2016-11-12 00:03"]
                       ]
             }
         }
         * */
        using (var con = new SqlConnection("Context Connection=true"))
        {
            var sb = new StringBuilder();
            con.Open();
            var cmd = con.CreateCommand();
            cmd.CommandText = sql;
            using (var rdr = cmd.ExecuteReader())
            {
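                // Header: emit the column names in result-set order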
                sb.Append("{\"Response\":{\"Header\":[");
                for (int i = 0; i < rdr.FieldCount; i++)
                {
                    var fn = rdr.GetName(i);
                    sb.Append('"').Append(fn).Append('"');
                    if (i + 1 < rdr.FieldCount)
                        sb.Append(',');
                }
                sb.Append("],\"Body\":[");

                //SqlContext.Pipe.Send(sb.ToString());

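                // Body: one JSON array of (stringified) values per record, comma-separated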
                if (rdr.Read())
                {
                    while (true)
                    {

                        sb.Append('[');
                        for (int i = 0; i < rdr.FieldCount; i++)
                        {
                            var val = rdr[i].ToString();
                            sb.Append('"').Append(val).Append('"');
                            if (i + 1 < rdr.FieldCount)
                                sb.Append(',');
                        }
                        sb.Append(']');
                        if (rdr.Read())
                        {
                            sb.Append(',');
                        }
                        else
                        {
                            break;
                        }

                    }
                }
                sb.Append("]}}");

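                // Return the assembled document as a single NVARCHAR(MAX) column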
                var md = new SqlMetaData("JSON", SqlDbType.NVarChar, -1);
                var r = new SqlDataRecord(md);
                r.SetString(0, sb.ToString());
                SqlContext.Pipe.SendResultsStart(r);
                SqlContext.Pipe.SendResultsRow(r);
                SqlContext.Pipe.SendResultsEnd();

            }
        }
    }
}
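
If it helps, here is a minimal sketch of how the procedure above might be registered and called from T-SQL. The assembly name (JsonUtils), the DLL path, and dbo.SourceTable are placeholders, not part of the answer, and on SQL Server 2017+ the assembly would additionally need to be signed or whitelisted because of the "clr strict security" setting:

-- Register the compiled assembly (path and name are assumptions)
CREATE ASSEMBLY JsonUtils
    FROM N'C:\clr\JsonUtils.dll'
    WITH PERMISSION_SET = SAFE;
GO

-- Expose the CLR method as a T-SQL procedure
CREATE PROCEDURE dbo.RunQuery
    @sql NVARCHAR(MAX)
AS EXTERNAL NAME JsonUtils.StoredProcedures.RunQuery;
GO

-- Returns one row with a single "JSON" column containing the {"Response":{...}} document
EXEC dbo.RunQuery N'SELECT A, B, C FROM dbo.SourceTable;';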

Comments

Except that SQLCLR is effectively deprecated in SQL Server 2017 due to the security restrictions. One of the reasons is ill-behaved code that caches a lot of data, stealing it from the server's buffers, like this code does. For 6MB worth of data this code would have to cache at least that much in the StringBuilder. The actual behaviour would be far worse: the StringBuilder would have to reallocate its memory several times, resulting in severe memory waste.
Both values and column names ought to be escaped to ensure the result is always valid JSON. (Obviously this is usually more relevant for values than for column names, but still.)
On the other hand, generating this on the client, in an MVC or Web API controller, would be far easier and cheaper. One could use libraries like JSON.NET to easily generate individual elements and even emit each record to the response stream instead of caching everything before responding.
David, thank you for the proposed solution (and the code, of course). I thought of attaching a JAVA stored procedure, but (independently of what @PanagiotisKanavos states about deprecation), I would rather pay the price in transferred data size than add a sort of foreign element to the overall solution. Still, I will keep this proposal in mind in case it turns out to be the best. Once again, thanks!!
@FDavidov inventing a new file format is like adding another untested link. The servlet can easily generate any format, including what you posted.

This, however, makes the result several times larger than needed (I'm thinking about network traffic, especially when it is not over a LAN).

Given that:

  1. your goal is to reduce network traffic,
  2. you are using a servlet to handle the DB communication, and
  3. there are multiple datatypes involved,

you can try...

  1. simply not using JSON to return the result set to the servlet. Converting the result set to JSON is resource-intensive, requiring additional CPU, RAM, and time. And while JSON is a more efficient text format than XML, it is not necessarily more efficient than sticking with the native TDS format. Consider that integers of 1 to 3 digits will be more efficient as text, 4 digits will be equally efficient, and 5 digits or more will be less efficient. So for INT fields, which format is more efficient depends on the actual results. On the other hand, DATETIME fields are 8 bytes natively, but when serialized as text for JSON they become 23 bytes each. This, of course, assumes that you are converting the JSON to VARCHAR. If you are unable to convert to VARCHAR to return the JSON to the servlet due to having Unicode strings in the results, then those string dates take up 46 bytes each (SQL Server uses UTF-16 only for Unicode; there is no UTF-8 option). And that also changes the arithmetic for INT values such that only values of 1 digit are more efficient as text, 2 digits are equivalent, and 3 or more digits are less efficient.

    The overall point being: you need to thoroughly test both approaches because I (and apparently others here) suspect that you would, at best, end up with an equally efficient transport, yet would have paid the price of more convoluted code and more system resources; a net negative. More than likely, though, this approach will be slower, on top of paying that price.

    So, as has already been suggested in various comments by @PanagiotisKanavos, @JeroenMostert, and @DavidBrowne: just convert to JSON in the servlet.

  2. compressing the returned, standard JSON via the built-in COMPRESS function (introduced in SQL Server 2016). By simply adding a single function call to the query you can reduce the returned JSON by over 90%. For example:

    DECLARE @Results VARCHAR(MAX);
    SET @Results = (SELECT * FROM sys.objects FOR JSON AUTO);
    SELECT @Results AS [JSON],
           DATALENGTH(@Results) AS [UncompressedBytes],
           DATALENGTH(COMPRESS(@Results)) AS [CompressedBytes];
    

    returns (JSON column truncated):

    JSON:              [{"name":"sysrscols","object_id":3,"schema_id":4,"parent_object_id":0,"type":"S","type_desc":"SYSTEM_TABLE","create_date":"2017-08-22T19:38:02.860","modify_date":"2017-08-22T19:38:02.867","is_ms_shipped":true,"is_published":false,"is_schema_published":false},...

    UncompressedBytes: 29521
    CompressedBytes:   2594

    This dramatically reduces the network traffic without convoluting the code or coming up with a proprietary format.

    The compression is done using GZip, which appears to be rather easy to handle in Java:

    Java GZIP Example – Compress and Decompress File

    Proving that you really can have your JSON and compress it too 😸
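
    As a quick server-side sanity check (just a sketch; the variable names are made up), DECOMPRESS reverses COMPRESS, so all the servlet has to do is GUnzip the received varbinary to get the original JSON text back:

    DECLARE @Results VARCHAR(MAX);
    SET @Results = (SELECT * FROM sys.objects FOR JSON AUTO);

    DECLARE @Packed VARBINARY(MAX) = COMPRESS(@Results);

    -- Casting back to the original type restores the exact JSON text
    SELECT CAST(DECOMPRESS(@Packed) AS VARCHAR(MAX)) AS [RestoredJSON],
           DATALENGTH(@Packed) AS [CompressedBytes];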


P.S. Regarding your comment on David's answer:

I thought of attaching a JAVA stored procedure

No, this is not an option with SQL Server.

P.P.S. Regarding the comment that SQLCLR is deprecated, even just "effectively", please see my post:

SQLCLR vs SQL Server 2017, Part 8: Is SQLCLR Deprecated in Favor of Python or R (sp_execute_external_script)?

Comments

Solomon, thanks for your post. About being "resource-intensive" (conversion to JSON), I thought that was the case but, to my surprise, Microsoft did a great job when coding the new functions (SQL Server 2016). For instance, converting a 40,000-record table into a JSON array (almost a 7MB string!!!) takes about 0.15 seconds. As far as I'm concerned, this is quite efficient indeed.
@FDavidov Even if it is rather efficient, that still doesn't mean that it is more efficient than the native transport method. But again, if you are inclined to go the JSON route, then my option #2 would still apply.
Solomon, thank you again for your inputs. As for option #2, the Java servlet sits next to the DB, so I would be saving bandwidth only on the connections within the server farm, not on the connection between the server farm and the client. I'll stay with the current approach for the time being and see what happens once we go live. Again, I appreciate your time and effort. Cheers!!!
@FDavidov While I was not expecting that issue, I am not sure that it is not solvable easily enough. It seems like you can pass on the GZipped binary (possibly Base64 encoded first) to the browser. Take a look at this S.O. answer, JavaScript implementation of Gzip, and this GitHub project, gzip-js. And there might be other, similar projects / answers that are even better.
Solomon, Base64, by definition, makes a binary string at least 25% bigger (or perhaps 33%? Not sure about that). In any case, ZIP -> Convert (to Base64) -> transport -> Convert back (to binary) -> UNZIP... Hmmm... I don't know about that. In any case, I'm up-voting your answer as it also gives ideas to others. One more time, thank you Sir!!!