-2

I try to map a Point[] to a SqlServer database.

"Point" is a struct made of 2 doubles: X,Y coordinates.

I need to save an array of Point with EF Core into a SQL Server database. I can save them in different ways but I think those 2 ways would probably be more efficient if possible:

  1. Point[]
  2. X values as double[] and Y values as double[] separately.

Actually I have chosen version 2 (double[]) but EF Core maps it to nvarchar(max)

Does anyone know if nvarchar can receive a double[] and/or if there is a better way to map a double[] to SQL Server database and keep the dvalues in binary data?

Note: it would be nice to avoid turning the data into readable values, like JSON, because there are thousands of points and it takes a lot more space. It would be better to keep the data in binary format. We do not need to be able to search for specific point.

I saw: How to store double[] array to database with Entity Framework Code-First approach but the selected answer turn the data into string which take a lot more spaces.

Update: About double[] mapped to nvarchar, see: https://devblogs.microsoft.com/dotnet/array-mapping-in-ef-core-8/, that is normal since EfCore 8 which convert any primitive array to Json by default. But I will try Fabian answer on previous paragraph link to see if I can left the data in binary format instead (better speed and space). Or any better solution... if any suggested...

7
  • 4
    Why not normalize it into a proper table Map_Point (MapId, X, Y, PRIMARY KEY (MapId, X, Y)) Or have you considered using the Geography type learn.microsoft.com/en-us/ef/core/modeling/spatial Do you have a polygon shape, or is it just a list of points? Commented May 9 at 13:50
  • 1
    @charlieface, It is only a list of points. It represent a series (signal coordinates) with fixed offset bewteen point. I did not consider Geography. The only need for binary data is the one express into the question. I do not want to introduce a key for each point which is not necessary and just add more space. Commented May 9 at 14:01
  • 1
    SQL Server already has spatial data types that would allow you to query the data with SQL and index it. What you describe is a Multipoint. Depending on what the coordinates actually are, you should use either the geometry or geography type. Commented May 9 at 14:08
  • And even if you only really need a blob I'd still use NetTopologySuite to save the data using the Well-Known-Binary format in a varbinary(max) column, using the WKBWriter and WKBReader classes. All spatial libraries can read that format. Although just using EF Core+NetTopologySuite would take care of everything automatically Commented May 9 at 14:12
  • 1
    Worst case, use a custom type converter for this specific property. The big problem with this is that only your application will be able to read this. Should anything change, the data may become useless unless you modify your type converter to handle all changes. Forever Commented May 9 at 14:18

2 Answers 2

4

Ideally you would keep these in a separate normalized table. For example, if you have a Map with multiple points on it then:

CREATE TABLE Map_Point (
    MapId int REFERENCES Map (Id),
    X float,
    Y float,
    PRIMARY KEY (MapId, X, Y)
);

If you are really set on keeping them as one blob in the parent table then SQL Server supports the geometry and geography types. You could use NetTopologySuite in C#, which EF Core supports.

options.UseSqlServer(yourConnectionString,
    x => x.UseNetTopologySuite());

Then the class would be

public class Map
{
    public int Id { get; set; }

    public MultiPoint Points { get; set; }
}

And just to be safe, a CHECK constraint

modelBuilder.Entity<Map>(entity =>
    entity.HasCheckConstraint("CK_Map_MultiPoint", "Points.InstanceOf('MULTIPOINT') = 1");
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks but I do not want to save the data directly into a field with less possible conversion. I think "Fabian" answer is probably the best in the related post: stackoverflow.com/questions/15220921/… ... At least, I think I will try it! It seems great!
Thanks a lot! See my answer. I'm not sure at all it will works but I will give a try. It appears to me to be the best way to go and by far! It appears weird to me that it is not implemented by default.
3

Code at the end...

My solution is to implement a value conversion like Microsoft documentation: https://learn.microsoft.com/en-us/ef/core/modeling/value-conversions?tabs=data-annotations

Any solution related to Geometry of geography were rejected because I do not need to search for any element of the array independently. We do not want to add Database package in order to keep things simple. For our cases, where we only want to save a struct of Points, we prefer to keep a more easy solution to implement and maintain.

After pretty much a lot of work and tests, I got a pretty nice solution, I think. Efficient in speed and size. It is also pretty much flexible supporting any struct type arrays or simple type arrays. The only downside, like 'Panagiotis Kanavos' highlighted is saving data in binary format is less flexible than Json where Json seems to be the most popular choice theses days. But I choose binary for speed and space.

About the following SO question (related): How to store double[] array to database with Entity Framework Code-First approach: ... I found some negative aspect on the accepted answer (Jonas answer) and few others answers too:

  1. The addition of a "fake" property (only for persistence to the database) seems to me as a last resort. Adding a property only for persistence to the DB add code which could make hard to understand why it is there and prone to error. My opinion is that is should be avoided when possible. Because EFCore does enable us to do conversion at its level, I prefer to do so. The code stay cleaner.

  2. Also some peoples pointed out about conversion that should occur on each access to the property if it is not cached. See other comment on the same previous link. Very important to do caching when using an additional property (or field).

  3. There is also a memory issue when using string split on very large array where it creates a string object for each and every values. When your array have many thousands of points, it is preferable to avoid "string.split". So split should be avoided if possible and use of Span is highly recommanded (at least I think). Span<T> creates lots less temporary objects.

Stats of my solution (binary) vs Json (Microsoft Json serializer) vs string (values concatenation/less flexible - the way proposed in the accepted answer of the previous link):

Result for saving array of struct Point in SqServer database
Release build, 
Each result is the average of 4 test results, 
SqlServer 2019

Binary
Time to save 1000000 points: 1727 ms
Time to load 1000000 points: 1611 ms
Time to save 1000000 points: 600 ms
Time to load 1000000 points: 320 ms
Time to save 1000000 points: 470 ms
Time to load 1000000 points: 261 ms
Time to save 1000000 points: 330 ms
Time to load 1000000 points: 245 ms
size = 16 000 000 in DB

Json
Time to save 1000000 points: 3883 ms
Time to load 1000000 points: 4149 ms
Time to save 1000000 points: 2702 ms
Time to load 1000000 points: 2294 ms
Time to save 1000000 points: 2716 ms
Time to load 1000000 points: 1877 ms
Time to save 1000000 points: 2320 ms
Time to load 1000000 points: 3357 ms
size = ~48 540 581

String concatenation
Time to save 1000000 points: 3295 ms
Time to load 1000000 points: 1713 ms
Time to save 1000000 points: 1801 ms
Time to load 1000000 points: 1981 ms
Time to save 1000000 points: 1711 ms
Time to load 1000000 points: 1445 ms
Time to save 1000000 points: 1765 ms
Time to load 1000000 points: 1329 ms
size = ~38 538 490

enter image description here

Code:

DbContext

    public class DbContextAlgo : DbContext
    {
        protected override void ConfigureConventions(ModelConfigurationBuilder configurationBuilder)
        {
            base.ConfigureConventions(configurationBuilder);
            configurationBuilder.Properties<double[]>()
                .HaveConversion<DoubleArrayConverter, ArrayStructuralComparer<double>>();

            base.ConfigureConventions(configurationBuilder);
            configurationBuilder.Properties<SamplePoint[]>()
                .HaveConversion<StructArrayToByteArrayConverter<SamplePoint>, ArrayStructuralComparer<SamplePoint>>();

            //base.ConfigureConventions(configurationBuilder);
            //configurationBuilder.Properties<SamplePoint[]>()
            //  .HaveConversion<StructArrayToJsonConverter<SamplePoint>, ArrayStructuralComparer<SamplePoint>>();

            //base.ConfigureConventions(configurationBuilder);
            //configurationBuilder.Properties<SamplePoint[]>()
            //  .HaveConversion<SamplePointArrayToStringConverter, ArrayStructuralComparer<SamplePoint>>();
        }
...



StructArrayConverter


using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Runtime.InteropServices;
using System.Text.Json;
using System.Reflection.Metadata.Ecma335; // Required for MemoryMarshal and StructLayout

namespace General.Converter
{
    public static class StructArrayConverter<T> where T : struct
    {
        public static byte[] StructArrayToByteArray_MemoryMarshal(T[] points)
        {
            if (points == null || points.Length == 0)
            {
                return Array.Empty<byte>();
            }

            // Get a read-only span of bytes representing the points array's memory.
            // This operation itself is very fast as it doesn't copy memory.
            ReadOnlySpan<byte> byteSpan = MemoryMarshal.AsBytes<T>(points.AsSpan());

            // If you need a byte[], you then copy the span.
            return byteSpan.ToArray();
        }

        public static T[] ByteArrayToStructArray_MemoryMarshal(byte[] byteArray)
        {
            if (byteArray == null || byteArray.Length == 0)
            {
                return Array.Empty<T>();
            }

            // Get a span of MyPoint reinterpreting the byte array's memory.
            Span<T> pointSpan = MemoryMarshal.Cast<byte, T>(byteArray.AsSpan());

            return pointSpan.ToArray();
        }

        public static string StructArrayToJson(T[] points)
        {
            return JsonSerializer.Serialize<T[]>(points);
        }

        public static T[] JsonToStructArray(string json)
        {
            return JsonSerializer.Deserialize<T[]>(json) ?? Array.Empty<T>();
        }
    }

}


StructArrayToByteArrayConverter.cs

using Microsoft.EntityFrameworkCore.Storage.ValueConversion;

namespace GeneralDb.Converter
{
    /// <summary>
    /// Converter for converting a struct array to a byte array and vice versa. Used for SqlServer and others.
    /// </summary>
    /// <typeparam name="T">T is the struct type without array</typeparam>
    public class StructArrayToByteArrayConverter<T> : ValueConverter<T[], byte[]> where T : struct
    {
        public StructArrayToByteArrayConverter() : base(v => ConvertStructArrayToByteArray(v), v => ConvertByteArrayToStructArray(v), true) { }

        public static byte[] ConvertStructArrayToByteArray(T[] points)
        {
            return General.Converter.StructArrayConverter<T>.StructArrayToByteArray_MemoryMarshal(points);
        }

        public static T[] ConvertByteArrayToStructArray(byte[] byteArray)
        {
            return General.Converter.StructArrayConverter<T>.ByteArrayToStructArray_MemoryMarshal(byteArray);
        }
    }
}

StructArrayToJsonConverter.cs

using Microsoft.EntityFrameworkCore.Storage.ValueConversion;


namespace GeneralDb.Converter
{
    /// <summary>
    /// This class is used to convert a struct array to a JSON string and vice versa. Used for SqlServer and others.

    /// </summary>
    /// <typeparam name="T">T is the struct type without array</typeparam>
    public class StructArrayToJsonConverter<T> : ValueConverter<T[], string> where T : struct
    {
        public StructArrayToJsonConverter() : base(v => ConvertStructArrayToJson(v), v => ConvertJsonToStructArray(v), true) { }

        public static string ConvertStructArrayToJson(T[] points)
        {
            return General.Converter.StructArrayConverter<T>.StructArrayToJson(points);
        }

        public static T[] ConvertJsonToStructArray(string json)
        {
            return General.Converter.StructArrayConverter<T>.JsonToStructArray(json);
        }
    }
}

SamplePointArrayToStringConverter.cs

using General.Collection;
using Microsoft.EntityFrameworkCore.Storage.ValueConversion;
using Standard;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace GeneralDb.Converter
{
    /// <summary>
    /// This class is used to convert a struct array to a JSON string and vice versa. Used for SqlServer and others.

    /// </summary>
    /// <typeparam name="T">T is the struct type without array</typeparam>
    public class SamplePointArrayToStringConverter : ValueConverter<SamplePoint[], string>
    {
        public SamplePointArrayToStringConverter() : base(v => ConvertSamplePointArrayToString(v), v => ConvertStringToSamplePointArray(v), true) { }

        public static string ConvertSamplePointArrayToString(SamplePoint[] points)
        {
            StringBuilder sb = new StringBuilder();

            sb.Append(points.Length);
            sb.Append(';');

            foreach (SamplePoint point in points)
            {
                sb.Append(point.X);
                sb.Append(';');
                sb.Append(point.Y);
                sb.Append(';');
            }

            return sb.ToString();
        }

        public static SamplePoint[] ConvertStringToSamplePointArray(string str)
        {
            try
            {
                ReadOnlySpan<char> span = str.AsSpan();
                int index = span.IndexOf(';');
                if (index == -1)
                {
                    return default;
                }

                int length;
                int indexStart = 0;
                int indexEnd = span.IndexOf(';');
                var part = span.Slice(indexStart, indexEnd - indexStart);

                if (!int.TryParse(part, out length))
                {
                    throw new InvalidDataException("Invalid string format");
                }

                int count = 0;
                if (count == length)
                {
                    return default;
                }

                SamplePoint[] samplePoints = new SamplePoint[length];
                
                while(count < length)
                {
                    indexStart = indexEnd + 1;
                    indexEnd = span.GetNextIndex(';', indexStart);
                    part = span.Slice(indexStart, indexEnd - indexStart);

                    double x = double.Parse(part);

                    indexStart = indexEnd + 1;
                    indexEnd = span.GetNextIndex(';', indexStart);
                    part = span.Slice(indexStart, indexEnd - indexStart);

                    double y = double.Parse(part);

                    samplePoints[count] = new SamplePoint(x, y);
                    count++;
                }

                return samplePoints;
            }
            catch (Exception ex)
            {
                Debug.WriteLine("Ooops");
                throw new InvalidDataException("Invalid string format");
            }
        }
    }
}


SpanExtension.cs

using System;
using System.Collections.Generic;
using System.Dynamic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace General.Collection
{
    public static class SpanExtension
    {
        /// <summary>
        /// Search for the first occurence of item and return it's index in the span
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="span">span source searched</param>
        /// <param name="item">item searched in the span</param>
        /// <param name="index">starting index where the search begin</param>
        /// <returns>index of the first occurence found or -1 if not found</returns>
        public static int GetNextIndex<T>(this ReadOnlySpan<T> span, T item, int index = 0)
        {
            if (index < 0)
            {
                throw new ArgumentOutOfRangeException(nameof(index), "Index cannot be negative");
            }

            while(index < span.Length)
            {
                if (span[index].Equals(item))
                {
                    return index;
                }
                index++;
            }

            return -1;
        }

        /// <summary>
        /// Search for the first occurence of item and return it's index in the span
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="span">span source searched</param>
        /// <param name="item">item searched in the span</param>
        /// <param name="index">starting index where the search begin</param>
        /// <returns>index of the first occurence found or -1 if not found</returns>
        public static int GetNextIndex<T>(this Span<T> span, T item, int index = 0)
        {
            if (index < 0)
            {
                throw new ArgumentOutOfRangeException(nameof(index), "Index cannot be negative");
            }

            while (index < span.Length)
            {
                if (span[index].Equals(item))
                {
                    return index;
                }
                index++;
            }

            return -1;
        }

    }
}

My tests

private void OnCommandTest()
{
    const int PointCount = 1000000;
    Random rand = new Random();

    TestSamplePointArray t = new TestSamplePointArray();
    Guid id = default;

    // EO: test Database SamplePoint[] 
    using (var ctx = new DbContextAlgo())
    {
        t.SamplePoints = new SamplePoint[PointCount];
        foreach(var index in Enumerable.Range(0, PointCount))
        {
            t.SamplePoints[index] = new SamplePoint(rand.NextDouble(), rand.NextDouble());
        }

        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();
        ctx.Add(t);
        ctx.SaveChanges();
        stopwatch.Stop();   

        id = t.TestSamplePointArrayId;  

        Trace.WriteLine($"Time to save {PointCount} points: {stopwatch.ElapsedMilliseconds} ms");
    }

    using (var ctx = new DbContextAlgo())
    {
        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();
        var t2 = ctx.TestSamplePointArrays.Find(id);
        stopwatch.Stop();
        Debug.Assert(t.SamplePoints != null);
        Trace.WriteLine($"Time to load {t.SamplePoints.Length} points: {stopwatch.ElapsedMilliseconds} ms");

        if (t2 != null)
        {
            for (int index = 0; index < t2.SamplePoints.Length; index++)
            {
                Debug.Assert(t2.SamplePoints[index].Equals(t.SamplePoints[index]));
            }
        }
    }
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.