
I have a tab-delimited text file of around 100 MB. I want to store the data from this file in a SQL Server table; it comes to about 1 million records once loaded. What is the best way to achieve this?

I can create an in-memory DataTable in C# and then upload it to SQL Server, but that loads the entire 100 MB file into memory. What if the file gets bigger?

  • Why not just the BULK INSERT command? Commented Aug 7, 2012 at 6:39
  • @pst If I'm not mistaken, that requires the file to be located on the same server as SQL Server, which may be an issue for the poster Commented Aug 7, 2012 at 6:42

2 Answers


No problem; CsvReader will handle most delimited text formats, and it implements IDataReader, so it can be used to feed SqlBulkCopy. For example:

using (var file = new StreamReader(path))
using (var csv = new CsvReader(file, true)) // true = first row is headers
using (var bcp = new SqlBulkCopy(connectionString))
{
    bcp.DestinationTableName = "Foo";
    bcp.WriteToServer(csv);
}

Note that CsvReader has lots of options for more subtle file handling (specifying the delimiter rules, etc.). SqlBulkCopy is the high-performance bulk-load API, so it is very efficient, and this is a streaming reader/writer approach: it does not load all the data into memory at once.
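If I remember right, one of the CsvReader constructor overloads lets you pass the delimiter character directly, so for the tab-delimited file in the question it would look something like this (treat the overload signature as an assumption and check the CsvReader docs):

using (var file = new StreamReader(path))
using (var csv = new CsvReader(file, true, '\t')) // assumed overload: (reader, hasHeaders, delimiter); '\t' = tab-delimited input
using (var bcp = new SqlBulkCopy(connectionString))
{
    bcp.DestinationTableName = "Foo";
    bcp.WriteToServer(csv);
}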


2 Comments

I understand. This is developed by a third party and is under the MIT open-source licence. What I am looking for is the best way to achieve the same thing using only the Microsoft-provided SDK/APIs that come with the framework, with no extra licence required.
@SamirLakhani the MIT license is very open, and grants you a license to use the code without cost, provided you include their license in works that you distribute: en.wikipedia.org/wiki/MIT_License

You should read the file line by line, so you don't have to load the whole file into memory:

using (var file = System.IO.File.OpenText(filename))
{
    while (!file.EndOfStream)
    {
        string line = file.ReadLine();

        // TODO: Do your INSERT here
    }
}

* Update *

"This will make 1 million separate insert commands to sql server. Is there any way to make it in bulk"

You could use parameterised queries; that still issues 1 million individual inserts, but it would still be quite fast.
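A rough sketch of that approach (the table name Foo and columns A/B here are just placeholders, and wrapping the loop in a transaction is optional but usually speeds it up): create the command and its parameters once, then just update the parameter values for each line:

using (var conn = new SqlConnection(connectionString))
using (var file = System.IO.File.OpenText(filename))
{
    conn.Open();

    using (var tx = conn.BeginTransaction())
    using (var cmd = new SqlCommand("INSERT INTO Foo (A, B) VALUES (@a, @b)", conn, tx))
    {
        // create the parameters once; only their values change per row
        cmd.Parameters.Add("@a", SqlDbType.NVarChar, 100);
        cmd.Parameters.Add("@b", SqlDbType.NVarChar, 100);

        while (!file.EndOfStream)
        {
            var parts = file.ReadLine().Split('\t'); // tab-delimited, per the question

            cmd.Parameters["@a"].Value = parts[0];
            cmd.Parameters["@b"].Value = parts[1];
            cmd.ExecuteNonQuery();
        }

        tx.Commit();
    }
}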

Alternatively, you can use SqlBulkCopy, but that's going to be rather difficult if you don't want to use 3rd party libraries. If you are more amenable to the MS license, you could use the LINQ Entity Data Reader (distributed under Ms-PL license), which provides the AsDataReader extension method:

void MyInsertMethod()
{
    using (var bulk = new SqlBulkCopy("MyConnectionString"))
    {
        bulk.DestinationTableName = "MyTableName";
        bulk.WriteToServer(GetRows().AsDataReader());
    }
}

class MyType
{
    public string A { get; set; }
    public string B { get; set; }
}

IEnumerable<MyType> GetRows()
{
    using (var file = System.IO.File.OpenText("MyTextFile"))
    {
        while (!file.EndOfStream)
        {
            // NB: naive splitting on ',' - the file in the question is tab-delimited, so '\t' may be what you need
            var splitLine = file.ReadLine().Split(',');

            yield return new MyType() { A = splitLine[0], B = splitLine[1] };
        }
    }
}

If you don't want to use the MS-licensed code either, you could implement IDataReader yourself, but that is going to be a PITA (a rough sketch is below). Note that the CSV handling above (Split(',')) is not at all robust, and also that the column names in the table must match the property names on MyType. TBH, I'd recommend you go with Marc's answer on this one.
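For what it's worth, a bare-bones IDataReader might look something like the sketch below. TabFileDataReader is just a made-up name, the two columns match MyType above, and it assumes SqlBulkCopy only needs Read(), FieldCount and GetValue() for this scenario - I haven't verified every code path, so treat it as a starting point rather than something battle-tested:

using System;
using System.Data;

class TabFileDataReader : IDataReader
{
    private readonly System.IO.StreamReader _file;
    private string[] _current = new string[0];

    public TabFileDataReader(string path) { _file = new System.IO.StreamReader(path); }

    // The members SqlBulkCopy actually uses here (to the best of my knowledge):
    public bool Read()
    {
        string line = _file.ReadLine();
        if (line == null) return false;
        _current = line.Split('\t');
        return true;
    }

    public int FieldCount { get { return 2; } }   // A and B, matching MyType above
    public object GetValue(int i) { return _current[i]; }
    public string GetName(int i) { return i == 0 ? "A" : "B"; }
    public int GetOrdinal(string name) { return name == "A" ? 0 : 1; }
    public bool IsDBNull(int i) { return _current[i] == null; }

    public void Close() { _file.Dispose(); }
    public void Dispose() { Close(); }
    public bool IsClosed { get { return false; } }
    public int Depth { get { return 0; } }
    public int RecordsAffected { get { return -1; } }
    public bool NextResult() { return false; }
    public DataTable GetSchemaTable() { return null; }

    // The rest of IDataRecord shouldn't be needed for this scenario, so just stub it out:
    public object this[int i] { get { return GetValue(i); } }
    public object this[string name] { get { return GetValue(GetOrdinal(name)); } }
    public string GetString(int i) { return (string)GetValue(i); }
    public Type GetFieldType(int i) { return typeof(string); }
    public bool GetBoolean(int i) { throw new NotSupportedException(); }
    public byte GetByte(int i) { throw new NotSupportedException(); }
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length) { throw new NotSupportedException(); }
    public char GetChar(int i) { throw new NotSupportedException(); }
    public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length) { throw new NotSupportedException(); }
    public IDataReader GetData(int i) { throw new NotSupportedException(); }
    public string GetDataTypeName(int i) { throw new NotSupportedException(); }
    public DateTime GetDateTime(int i) { throw new NotSupportedException(); }
    public decimal GetDecimal(int i) { throw new NotSupportedException(); }
    public double GetDouble(int i) { throw new NotSupportedException(); }
    public float GetFloat(int i) { throw new NotSupportedException(); }
    public Guid GetGuid(int i) { throw new NotSupportedException(); }
    public short GetInt16(int i) { throw new NotSupportedException(); }
    public int GetInt32(int i) { throw new NotSupportedException(); }
    public long GetInt64(int i) { throw new NotSupportedException(); }
    public int GetValues(object[] values) { throw new NotSupportedException(); }
}

You would then pass it straight to the bulk copy: bulk.WriteToServer(new TabFileDataReader("MyTextFile"));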

4 Comments

@pst I sort of assumed that the poster knew how to do that part from the wording of the question
This will make 1 million separate insert commands to SQL Server. Is there any way to do it in bulk?
@Cocowalla Why would SqlBulkCopy be difficult without a third-party library? I am doing it the following way and it looks good: SqlBulkCopy bulkCopy = new SqlBulkCopy(...); bulkCopy.BulkCopyTimeout = 0; bulkCopy.DestinationTableName = "<table>"; bulkCopy.WriteToServer(dt); where dt is a DataTable which I am populating by parsing the file with a StreamReader.
Because you said you were concerned about memory usage. If you are using a DataTable then you are loading the whole file into memory. If you don't want to do that, and don't want to use any 3rd party libraries, then you are probably going to have to implement an IDataReader yourself
