0

I have a large array of doubles that is part of an object. These objects are to be stored in a database, and I have created a table with a column for each of the object's other fields. I am stuck on how to store the array itself. I can't simply make another table with one column per double, because the array size varies per object and the arrays are very large. Adding thousands of float columns would also make setting up the database very time-consuming.

How can I store this array of doubles into a column?

7
  • 2
    Are you going to be querying these values directly -- or would a blob suffice? Commented Jul 18, 2013 at 19:21
  • If you need to access that array exactly as it was before, you could construct a comma-separated string of the values and save it into the database as such. When needed again, you can parse that string back out into an array. Commented Jul 18, 2013 at 19:22
  • Not too sure. How would I know if I would need to query them directly or write them as a blob? Totally new to databases. Commented Jul 18, 2013 at 19:26
  • What do the arrays of doubles actually represent? What are they used for? Commented Jul 18, 2013 at 19:27
  • The array holds the y values of a spectroscopy object, which contains absorption information across the infrared spectrum. Each y value pairs with an x value in the object that corresponds to some frequency. Commented Jul 18, 2013 at 19:30

4 Answers 4

3

I would simply have more identifying columns.

  • ArrayId
  • ArrayIndexX
  • ArrayIndexY (If necessary for 2 dimensional arrays)
  • Value

This gives you one table that never needs additional columns, regardless of how much data you need to store.

Storing thousands or even millions of records really isn't a big deal, assuming the columns are properly indexed.
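
As a rough illustration of the row-per-element layout, the sketch below flattens one array into rows and rebuilds it. The `ArrayRow` and `ArrayRows` names are hypothetical and only mirror the columns listed above; the actual database writes are left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row shape mirroring the ArrayId / ArrayIndexX / Value columns.
public sealed class ArrayRow
{
    public int ArrayId { get; set; }
    public int ArrayIndexX { get; set; }
    public double Value { get; set; }
}

public static class ArrayRows
{
    // Produces one row per array element, keyed by the owning array's id.
    public static List<ArrayRow> ToRows(int arrayId, double[] values)
    {
        var rows = new List<ArrayRow>(values.Length);
        for (int i = 0; i < values.Length; i++)
        {
            rows.Add(new ArrayRow { ArrayId = arrayId, ArrayIndexX = i, Value = values[i] });
        }
        return rows;
    }

    // Rebuilds the array from the rows of a single ArrayId.
    // Row order does not matter because each row carries its own index.
    public static double[] FromRows(IReadOnlyCollection<ArrayRow> rows)
    {
        var values = new double[rows.Count];
        foreach (var row in rows)
        {
            values[row.ArrayIndexX] = row.Value;
        }
        return values;
    }
}
```

Because every row carries its index, a single `SELECT ... WHERE ArrayId = @id` would be enough to load the whole array back, in whatever order the database returns the rows.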

good luck


1

The first thing you'll need to decide is whether the array is atomic from the data management perspective:

  • If yes (i.e. you never need to search for, read or write individual array elements while they are in the database), just serialize it and write it to a BLOB. You could even compress it if appropriate.
  • If no (i.e. you do need to access the individual elements), then create a separate table which is in N:1 relationship with the "main" table. For example:

    [Image: diagram of the N:1 table relationship, not preserved]
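
For the "atomic" case, one possible way to serialize the array is sketched below. It is only an illustration: `DoubleBlob` is a made-up helper name, the raw copy uses the machine's native byte order (so it assumes reader and writer share the same endianness), and GZip is just one choice of compression.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class DoubleBlob
{
    // Copies the raw 8-byte representation of each double into a byte array.
    public static byte[] ToBytes(double[] values)
    {
        var bytes = new byte[values.Length * sizeof(double)];
        Buffer.BlockCopy(values, 0, bytes, 0, bytes.Length);
        return bytes;
    }

    // Reverses ToBytes; expects a length that is a multiple of 8.
    public static double[] FromBytes(byte[] bytes)
    {
        var values = new double[bytes.Length / sizeof(double)];
        Buffer.BlockCopy(bytes, 0, values, 0, values.Length * sizeof(double));
        return values;
    }

    // Optional step: GZip the bytes before writing them to the BLOB column.
    public static byte[] Compress(byte[] raw)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // The gzip stream must be disposed first so that it flushes its footer.
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

The resulting `byte[]` would go into a single binary column, and the whole array comes back with one query. Whether compression pays off depends on the data; measured values with noise often compress poorly.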

3 Comments

I am not sure what you mean by atomic.
@CoreyBerigan It's "atomic from the data management perspective". Look at the parentheses...
Well, of course I would need to access the elements in the array from the database, but not all the time. If I did want to access them, wouldn't I have to make thousands of queries to get all of them at once? They are only useful when I use them all together.
1

Pack the array of doubles into a separate property on the model and save the serialized data. A fast serializer for this is protobuf-net (http://nuget.org/packages/protobuf-net). I use the following in a production app that stores more than 4.5 million values in the array.

Note: I stripped this down to the bare minimum. You would probably want to optimize the Pack/Unpack calls so they don't happen on every property access.

In the example below, we save Surface to the database, which contains an array of Scans, which contains an array of samples. The same concept applies to properties of double[].

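using System.ComponentModel.DataAnnotations.Schema; // [NotMapped]
using System.IO;                                    // MemoryStream

[ProtoBuf.ProtoContract]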
public class Sample
{
    public Sample()
    {
    }

    [ProtoBuf.ProtoMember(2)]
    public double Max { get; set; }

    [ProtoBuf.ProtoMember(3)]
    public double Mean { get; set; }

    [ProtoBuf.ProtoMember(1)]
    public double Min { get; set; }
}

[ProtoBuf.ProtoContract]
public class Scan
{
    public Scan()
    {
    }

    [ProtoBuf.ProtoMember(1)]
    public Sample[] Samples { get; set; }
}

public class Surface
{
    public Surface()
    {
    }

    public int Id { get; set; }

    public byte[] ScanData { get; set; }

    [NotMapped]
    public Scan[] Scans
    {
        get
        {
            return this.Unpack();
        }
        set
        {
            this.ScanData = this.Pack(value);
        }
    }

    private byte[] Pack(Scan[] value)
    {
        using (var stream = new MemoryStream())
        {
            ProtoBuf.Serializer.Serialize(stream, value);
            return stream.ToArray();
        }
    }

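    private Scan[] Unpack()
    {
        // Nothing has been packed yet (e.g. a freshly constructed Surface).
        if (this.ScanData == null)
        {
            return null;
        }

        using (var stream = new MemoryStream(this.ScanData))
        {
            return ProtoBuf.Serializer.Deserialize<Scan[]>(stream);
        }
    }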
}
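
One way the Pack/Unpack optimization mentioned above might look for a plain double[] property is sketched here. It is an assumption-laden illustration rather than the production code: `Spectrum` is a hypothetical entity, and a raw `Buffer.BlockCopy` stands in for protobuf-net so the example has no external dependencies.

```csharp
using System;

// Hypothetical entity: Data is the persisted column, Values is the in-memory view.
public class Spectrum
{
    private byte[] _data;
    private double[] _values; // cached result of unpacking _data

    public int Id { get; set; }

    // The binary column an ORM would map.
    public byte[] Data
    {
        get { return _data; }
        set
        {
            _data = value;
            _values = null; // raw bytes changed, so the cached array is stale
        }
    }

    // Would carry [NotMapped] under Entity Framework, like Scans above.
    public double[] Values
    {
        get
        {
            // Unpack once, then keep returning the same array.
            if (_values == null && _data != null)
            {
                _values = new double[_data.Length / sizeof(double)];
                Buffer.BlockCopy(_data, 0, _values, 0, _values.Length * sizeof(double));
            }
            return _values;
        }
        set
        {
            _values = value;
            if (value == null)
            {
                _data = null;
                return;
            }
            // Pack once, at assignment time.
            _data = new byte[value.Length * sizeof(double)];
            Buffer.BlockCopy(value, 0, _data, 0, _data.Length);
        }
    }
}
```

One caveat of this design: changing an element of the array returned by `Values` does not update `Data`. The array has to be assigned back to `Values` before saving.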

Comments

1

You should use a separate table

You shouldn't be trying to do hacky formatting tricks with your data. They make it much harder to tell what you are trying to do, and they muddle what the code does at a functional level.

Why you need another table

The reason you need a linking table is that you have no idea how much data you are going to be holding. Even if you put it into a blob, there is a chance you might exceed the maximum amount of data the blob can hold.
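
To put a number on the blob-size concern, here is a small back-of-the-envelope sketch. It assumes SQL Server's varbinary(max) type, whose limit is 2^31 - 1 bytes, and 8 bytes per double with no serialization overhead; `BlobSize` is a hypothetical helper name.

```csharp
using System;

public static class BlobSize
{
    // Maximum size of a SQL Server varbinary(max) value: 2^31 - 1 bytes.
    public const long VarBinaryMaxBytes = int.MaxValue;

    // Raw size of an array of doubles, ignoring any serialization overhead.
    public static long BytesFor(long elementCount)
    {
        return elementCount * sizeof(double);
    }

    public static bool FitsInVarBinaryMax(long elementCount)
    {
        return BytesFor(elementCount) <= VarBinaryMaxBytes;
    }
}
```

Under these assumptions, the 4.5 million values mentioned in another answer come to 36,000,000 bytes, while the limit is only reached somewhere above 268 million doubles. Other database engines and blob types have different limits, so the check is worth repeating for the actual target.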

The Code

I would make a separate object for the absorption point and store the points in a Dictionary. That way, if some default case occurs frequently (such as when nothing is found at a point), you don't need to store every point.

It would also be best to make a separate class to represent the collection, so that someone who reuses the class without knowing what is going on won't add unnecessary data.

public class SpectroscopyObject
{
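     // The collection has no parameterless constructor; these wide-open bounds are placeholders.
     private FrequencyAbsorptionPointCollection _freqs = new FrequencyAbsorptionPointCollection (int.MinValue, int.MaxValue, int.MaxValue, int.MinValue);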
     public FrequencyAbsorptionPointCollection FrequecyAbsorption {get{ return _freqs;}}
     public int Id {get;set;}

     //other stuff...
} 

public struct Point 
{
    public int X {get;set;}
    public int Y {get;set;}

    public Point ( int x , int y ) 
    {
            X = x;
            Y = y;
    }
}

public class FrequencyAbsorptionPoint
{
    public int Id { get; set; } // referenced by the collection's indexer
    public double Frequency { get; set; }
    public Point Location { get; set; }
}

public class FrequencyAbsorptionPointCollection : IEnumerable<FrequencyAbsorptionPoint>
{
    private readonly Dictionary<int , Dictionary<int , FrequencyAbsorptionPoint>> _points = new Dictionary<int , Dictionary<int , FrequencyAbsorptionPoint>> ( ); 

    int _xLeftMargin , _xRightMargin , _yTopMargin , _yBottomMargin;
    public FrequencyAbsorptionPointCollection (int xLeftBound,int xRightBound,int yTopBound,int yBottomBound)
    {
        _xLeftMargin = xLeftBound;
        _xRightMargin = xRightBound;
        _yTopMargin = yTopBound;
        _yBottomMargin = yBottomBound;
    }

    private bool XisSane(int testX)
    {
        return testX>_xLeftMargin&&testX<_xRightMargin;
    }


    private bool YisSane(int testY)
    {
        return testY>_yBottomMargin&&testY<_yTopMargin;
    }

    private bool PointIsSane(Point pointToTest)
    {
        return XisSane(pointToTest.X)&&YisSane(pointToTest.Y);
    }

    private const double DEFAULT_ABSORB_VALUE= 0.0;
    private bool IsDefaultAbsorptionFrequency(double frequency)
    {
        return frequency.Equals(DEFAULT_ABSORB_VALUE);
    }

    //I am assuming default to be 0

    public FrequencyAbsorptionPointCollection 
        (int xLeftBound,
         int xRightBound,
         int yTopBound,
         int yBottomBound,
         IEnumerable<FrequencyAbsorptionPoint> collection )
        :this(xLeftBound,xRightBound,yTopBound,yBottomBound)
    {
        AddCollection ( collection );
    }

    public void AddCollection ( IEnumerable<FrequencyAbsorptionPoint> collection ) 
    {
        foreach ( var point in collection )
        {
            Dictionary<int , FrequencyAbsorptionPoint> _current = null;
            if ( !_points.ContainsKey ( point.Location.X ) )
            {
                _current = new Dictionary<int , FrequencyAbsorptionPoint> ( );
                _points.Add ( point.Location.X , _current );
            }
            else
                _current = _points [ point.Location.X ];

            if ( _current.ContainsKey ( point.Location.Y ) )
                _current [ point.Location.Y ] = point;
             else
                _current.Add ( point.Location.Y , point );
        }
    }

    public FrequencyAbsorptionPoint this [ int x , int y ] 
    {
         get 
         {
            if ( XisSane ( x ) && YisSane ( y ) )
            {
                if ( _points.ContainsKey ( x ) && _points [ x ].ContainsKey ( y ) )
                    return _points [ x ] [ y ];
                else
                    return new FrequencyAbsorptionPoint
                {
                    Id = 0 ,
                    Location = new Point ( x , y ) ,
                    Frequency = DEFAULT_ABSORB_VALUE
                };
            }
            throw new IndexOutOfRangeException (
                string.Format( "Selection ({0},{1}) is out of range" , x , y ));
        }
        set 
        {
            if ( XisSane ( x ) && YisSane ( y ) ) 
            {
                if ( !IsDefaultAbsorptionFrequency ( value.Frequency ) ) 
                {
                    Dictionary<int,FrequencyAbsorptionPoint> current = null;
                    if ( _points.ContainsKey ( x ) )
                        current = _points [ x ];
                    else
                    {
                        current = new Dictionary<int,FrequencyAbsorptionPoint>();
                        _points.Add ( x , current );
                    }

                    if ( current.ContainsKey ( y ) )
                        current [ y ] = value;
                    else
                    {
                        current.Add ( y , value );
                    }
                }
            }
        }
    }

    public FrequencyAbsorptionPoint this [ Point p ] 
    {
        get 
        {
            return this [ p.X , p.Y ];
        }
        set 
        {
            this [ p.X , p.Y ] = value;
        }
    }

    public IEnumerator<FrequencyAbsorptionPoint> GetEnumerator ( )
    {
        foreach ( var i in _points.Keys )
            foreach ( var j in _points [ i ].Keys )
                yield return _points [ i ] [ j ];
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator ( )
    {
        return GetEnumerator ( );
    }
}

Now, the SQL code

CREATE TABLE SpectroscopyObject
(
    Id INT PRIMARY KEY NOT NULL
    -- other columns go here
)

CREATE TABLE FrequencyAbsorptionInfo
(
    Id INT PRIMARY KEY NOT NULL IDENTITY,
    XCoord INT NOT NULL,
    YCoord INT NOT NULL,
    -- FLOAT matches a C# double; NUMERIC(5,5) cannot hold values of magnitude 1 or greater
    AbsorptionInfo FLOAT NOT NULL,
    SpectroscopyObjectId INT NOT NULL FOREIGN KEY REFERENCES SpectroscopyObject(Id)
)

Now all you need to do is store the points and reference the related object by its Id. Reading it back would look like this:

string commandStringSObjs = 
@"
SELECT Id, ..other attributes... FROM SpectroscopyObject
";
string commandStringCoords = 
@"
SELECT XCoord,YCoord,AbsorptionInfo 
FROM FrequencyAbsorptionInfo
WHERE SpectroscopyObjectId = @Id
";

var streoscopicObjs = new List<SpectroscopyObject>();
using(var connection = new SqlConnection(CONNECTION_STRING))
{
    using(var cmd = connection.CreateCommand())
    {
        cmd.CommandText = commandStringSObjs;
        connection.Open();
        using(var rdr = cmd.ExecuteReader())
        {
            while(rdr.Read())
            {

                streoscopicObjs.Add(new SpectroscopyObject
                {
                    Id = Convert.ToInt32(rdr["Id"])
                    //populate your other stuff
                });
            }
        }
    }
    //to read the absorption info
    foreach(var obj in streoscopicObjs)
    {
        var current = obj.FrequecyAbsorption;
        using(var cmd = connection.CreateCommand())
        { 
            cmd.CommandText = commandStringCoords;
            cmd.Parameters.Add(
                new SqlParameter("@Id",SqlDbType.Int){ Value = obj.Id});
            using(var rdr = cmd.ExecuteReader())
            {
                while(rdr.Read())
                {
                    var x = Convert.ToInt32(rdr["XCoord"]);
                    var y = Convert.ToInt32(rdr["YCoord"]);
                    var freq = Convert.ToDouble(rdr["AbsorptionInfo"]);

                    current[x, y] = new FrequencyAbsorptionPoint
                    {
                        Location = new Point(x,y),
                        Frequency = freq
                    };
                }
            }
        }
    }

    //do some stuff
    ...
   // assuming you update 
    string updatefreq = 
@"


INSERT INTO FrequencyAbsorptionInfo(XCoord,YCoord,
                   AbsorptionInfo,SpectroscopyObjectId )
VALUES(@xvalue,@yvalue,@freq,@Id) ";
    //other point already

    //to write the absorption info
    foreach(var obj in streoscopicObjs)
    {
        using(var cmd = connection.CreateCommand())
        {
            cmd.CommandText = 
@"
DELETE FROM FrequencyAbsorptionInfo 
WHERE  SpectroscopyObjectId = @Id
";
            cmd.Parameters.Add(new SqlParameter("@Id",SqlDbType.Int){ Value = obj.Id});
            cmd.ExecuteNonQuery();
        }
        var current = obj.FrequecyAbsorption;
        foreach(var freq in current)
        {
            using(var cmd = connection.CreateCommand())
            { 
                cmd.CommandText = updatefreq ;
                // Parameter names must match the placeholders in the INSERT statement.
                cmd.Parameters.AddRange(new[]
                {
                    new SqlParameter("@Id",SqlDbType.Int){ Value = obj.Id},
                    new SqlParameter("@xvalue",SqlDbType.Int){ Value = freq.Location.X},
                    new SqlParameter("@yvalue",SqlDbType.Int){ Value = freq.Location.Y},
                    new SqlParameter("@freq",SqlDbType.Float){ Value = freq.Frequency },
                });
                cmd.ExecuteNonQuery();
            }
        }
    }
}
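
Issuing one INSERT per point can get slow with thousands of points per object. A possible alternative, sketched under assumptions, is to collect the points into a `DataTable` and hand it to a bulk loader such as `SqlBulkCopy`. The `AbsorptionRows` helper and the tuple-based input are hypothetical; the column names follow the FrequencyAbsorptionInfo table above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class AbsorptionRows
{
    // Builds an in-memory table whose columns match FrequencyAbsorptionInfo
    // (minus the IDENTITY Id column, which the database generates).
    // Each tuple is (x, y, absorption value).
    public static DataTable Build(int spectroscopyObjectId,
                                  IEnumerable<Tuple<int, int, double>> points)
    {
        var table = new DataTable("FrequencyAbsorptionInfo");
        table.Columns.Add("XCoord", typeof(int));
        table.Columns.Add("YCoord", typeof(int));
        table.Columns.Add("AbsorptionInfo", typeof(double));
        table.Columns.Add("SpectroscopyObjectId", typeof(int));

        foreach (var p in points)
        {
            table.Rows.Add(p.Item1, p.Item2, p.Item3, spectroscopyObjectId);
        }
        return table;
    }
}
```

The table could then be passed to `SqlBulkCopy.WriteToServer(DataTable)` with the destination table name set and column mappings added by name, so the identity column is skipped. That part is not shown because it needs a live connection.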

2 Comments

All of the class information for the objects have been standardized.
@CoreyBerigan why not just wrap the objects before you save and after you load? You didn't include your actual code so you couldn't expect me to give you a perfectly correct answer
