3

I have an Excel table with about 50 columns and over 6000 rows.

I found the following solution to read the data: https://coderwall.com/p/app3ya/read-excel-file-in-c

It uses Microsoft.Office.Interop.Excel to read the file.

Sadly, it is really slow. Reading a file with only 50 rows already takes about a minute. I never finished loading the 6000-row file.

I then thought about using CSV, but the cells contain both , and ; characters, so this won't be an option.

Can anyone suggest another method?

10
  • Which Excel format will you be working on? Commented Jan 17, 2019 at 17:44
  • What does this have to do with storing an Excel table in an array? Why not read the Excel file into a DataTable and then manipulate it from there? Commented Jan 17, 2019 at 17:44
  • 1
    Tim, C# is a beautiful thing. DataTables will do wonders for you. Reference: learn.microsoft.com/en-us/dotnet/api/… Commented Jan 17, 2019 at 17:47
  • 2
    How sparse is the table (for example, in your 300,000 cells, what percentage is full)? Do all 6000 rows have the same "shape" (i.e., in each row, do the cells correspond to the same column/meaning). If it's very sparse, you might want to use a sparse matrix technique. If it's completely regular, you probably want to read it into something like a List<TypeThatRepresentsOneRowOfData>, where TypeThatRepresentsOneRowOfData is a POCO class with properties that match the columns Commented Jan 17, 2019 at 17:51
  • 2
    You'll be much happier if you read the data into a list using a type that matches your columnar layout. I don't know if you can mix and match the ExcelDataReader and Dapper (github.com/StackExchange/Dapper), but if you can, it would make for a very clean solution. You can NuGet Dapper into your project. Dapper is pretty fast. Commented Jan 17, 2019 at 17:54
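
The list-of-typed-rows idea from the comments above could look roughly like the sketch below. It assumes the sheet has already been loaded into a DataTable. The class ProductRow and the column names Name and Price are hypothetical placeholders for the real 50 columns.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

// Hypothetical row type; the real one would have one property per Excel column.
public class ProductRow
{
    public string Name { get; set; } = "";
    public double Price { get; set; }
}

public static class RowMapper
{
    // Converts each DataRow into a typed object so later code can use
    // properties instead of string-based column lookups.
    public static List<ProductRow> ToList(DataTable table)
    {
        var rows = new List<ProductRow>(table.Rows.Count);
        foreach (DataRow row in table.Rows)
        {
            rows.Add(new ProductRow
            {
                // Empty cells arrive as DBNull; map them to defaults here.
                Name = row.IsNull("Name") ? "" : Convert.ToString(row["Name"]),
                Price = row.IsNull("Price")
                    ? 0.0
                    : Convert.ToDouble(row["Price"], CultureInfo.InvariantCulture)
            });
        }
        return rows;
    }
}
```

Mapping empty cells to defaults is only one possible choice; nullable properties would keep the difference between "empty" and "zero" visible.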

2 Answers 2

3

Apart from my comment-

Here is the method I use in order to read from an Excel file and into a table. You will need to have:

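using System.Data; and using System.Data.OleDb; statements at the top of your file. The method does not use Microsoft.Office.Interop.Excel, but the Microsoft.ACE.OLEDB.12.0 provider (Microsoft Access Database Engine) must be installed on the machine.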


Method:

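// Requires: using System.Data; using System.Data.OleDb;
public DataTable ReadExcel(string fileName, string TableName)
{
    DataTable dt = new DataTable();

    // "Excel 8.0" targets .xls files; for .xlsx the usual value is "Excel 12.0 Xml".
    string connectionString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 8.0\"";

    // With OLE DB a worksheet is normally referenced as [SheetName$], e.g. [Sheet1$].
    // The using blocks close the reader and connection even if an exception is thrown.
    using (OleDbConnection conn = new OleDbConnection(connectionString))
    using (OleDbCommand cmd = new OleDbCommand("SELECT * FROM " + TableName, conn))
    {
        conn.Open();
        using (OleDbDataReader reader = cmd.ExecuteReader())
        {
            // Load reads every row of the result set into the table.
            dt.Load(reader);
        }
    }

    return dt;
}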

Explanation:

fileName is the path to the Excel file you want to read the data from.

TableName is the name of the Excel sheet you want to read data from.

It is written this way because the OLE DB provider treats the Excel file like a database, where each sheet is exposed as a table.


You may need to alter the connection string to match your file format and installed provider: OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 8.0\"");

You can find the proper/correct Provider here: https://www.connectionstrings.com/excel/
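
As a rough illustration of adapting the connection string, the sketch below picks the Extended Properties value from the file extension. The class ExcelConnection and the method Build are hypothetical names, and the extension-to-format mapping follows the values commonly listed for the ACE provider; treat it as an assumption and check it against the page linked above and the provider actually installed.

```csharp
using System;
using System.IO;

public static class ExcelConnection
{
    // Builds an ACE OLE DB connection string whose Extended Properties
    // value matches the workbook's file extension (assumed mapping).
    public static string Build(string fileName)
    {
        string ext = Path.GetExtension(fileName).ToLowerInvariant();
        string format;
        switch (ext)
        {
            case ".xls":  format = "Excel 8.0"; break;
            case ".xlsx": format = "Excel 12.0 Xml"; break;
            case ".xlsm": format = "Excel 12.0 Macro"; break;
            case ".xlsb": format = "Excel 12.0"; break;
            default:
                throw new NotSupportedException("Unsupported Excel extension: " + ext);
        }

        // HDR=YES asks the provider to treat the first row as column names.
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName +
               ";Extended Properties=\"" + format + ";HDR=YES\"";
    }
}
```

The resulting string could be passed to new OleDbConnection(...) in place of the hard-coded one.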


7 Comments

Thank you, gonna test that. Is this function dangerous? -> Should I create a temporary copy of the file?
All this method does is read from an Excel file, fill a DataTable, and return that DataTable. You can call it like this: DataTable testTable = ReadExcel("C:\TestExcel", "Sheet1"); . If it would make you feel more at ease, you can absolutely create a test copy of your Excel file before running. This method does not write to Excel; it only reads from it.
Thanks, I know it only reads :) But even then, some systems "open" the file to read it, and if the program crashes, the data can get damaged :)
I don't understand how to use the DataTable. On the internet I found that I can use: dt.Rows[rownumber]["columnname"] But what is the column name? Is it the same as in Excel, i.e. taken from the first row? Also, the program will run on a different machine. Is there a way to get the provider programmatically?
Keep in mind that everything in the sheet your data is in will get captured using this method. So if you have anything in the Worksheet outside of just the data, it will get imported as well. If all that's on the Worksheet is the data, then this is a great way of capturing it.
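
On the column-name question raised in the comments: DataTable cells can be reached by column name or by column index. The sketch below uses a small in-memory table in place of the ReadExcel result, and the column names Id and City are made up. With the OLE DB provider, the names are normally taken from the first row when the header option (HDR=YES) is in effect; otherwise the provider typically generates names such as F1, F2, and so on.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DataTableDemo
{
    // Builds a small table in memory to stand in for the result of ReadExcel.
    public static DataTable BuildSample()
    {
        var dt = new DataTable();
        dt.Columns.Add("Id", typeof(int));
        dt.Columns.Add("City", typeof(string));
        dt.Rows.Add(1, "Berlin");
        dt.Rows.Add(2, "Oslo");
        return dt;
    }

    // Lists the column names, useful when unsure what the provider produced.
    public static List<string> ColumnNames(DataTable dt)
    {
        var names = new List<string>();
        foreach (DataColumn col in dt.Columns)
        {
            names.Add(col.ColumnName);
        }
        return names;
    }

    // Reads one cell by row index and column name.
    public static string CityAt(DataTable dt, int rowIndex)
    {
        return (string)dt.Rows[rowIndex]["City"];
    }
}
```

Printing the result of ColumnNames once after loading is a quick way to see which names to use in dt.Rows[i]["..."].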
0

If you're only going to read the Excel file, I suggest ExcelDataReader instead of the interop.

2 Comments

I'm not familiar with importing and using such code. Could you give me a keyword to google how to handle such stuff?
Use the NuGet package manager. If you're using Visual Studio, right-click on the project and select Manage NuGet Packages. You can search for ExcelDataReader there and install it. The best thing about this library is that it is standalone and doesn't need Office installed.
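
For orientation, a minimal sketch of reading the first sheet with ExcelDataReader might look like the following. It assumes both the ExcelDataReader and ExcelDataReader.DataSet NuGet packages are installed (AsDataSet comes from the latter); the class and method names ExcelDataReaderSketch and ReadFirstSheet are hypothetical. The encoding registration line is needed on .NET Core and later and may require the System.Text.Encoding.CodePages package on older targets.

```csharp
using System.Data;
using System.IO;
using System.Text;
using ExcelDataReader;

public static class ExcelDataReaderSketch
{
    // Reads the first worksheet of an .xls/.xlsx file into a DataTable.
    public static DataTable ReadFirstSheet(string path)
    {
        // Makes legacy code pages available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        using (var stream = File.Open(path, FileMode.Open, FileAccess.Read))
        using (var reader = ExcelReaderFactory.CreateReader(stream))
        {
            var config = new ExcelDataSetConfiguration
            {
                // Use the first row of each sheet as column names.
                ConfigureDataTable = _ => new ExcelDataTableConfiguration
                {
                    UseHeaderRow = true
                }
            };

            DataSet result = reader.AsDataSet(config);
            return result.Tables[0];
        }
    }
}
```

Because the file is opened with FileAccess.Read and no Excel process is started, this approach does not depend on Office being installed.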
