0

I'm trying to build a SSIS package to read from Excel. I've found an excellent example here that is very close to what I need.

My question here is: Starting with a blank solution, what are the bare minimum steps required to get a C# Script component to read from an .xlsx file?

Here is my code:

using System;
using System.Data;
using System.Data.OleDb;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{    
    public override void CreateNewOutputRows()
    {
        string fileName = @"E:\SFTP\RSS\Results.xlsx";    
        string cstr = "Provider.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 12.0;HDR=YES;IMEX=1\"";

        using (OleDbConnection xlConn = new OleDbConnection(cstr))
        {
            xlConn.Open();
            OleDbCommand xlCmd = xlConn.CreateCommand();
            xlCmd.CommandText = "Select top 10 * from Responses";
            xlCmd.CommandType = CommandType.Text;
            using (OleDbDataReader rdr = xlCmd.ExecuteReader())
            {
                while (rdr.Read())
                {
                    for (int i = 2; i < rdr.FieldCount; i++) //loop from 3 column to last
                    {
                        Output0Buffer.AddRow();
                        Output0Buffer.Question = rdr.GetName(i);
                        Output0Buffer.Response = rdr.ToString();
                    }

                }
            }
            xlConn.Close();
        }
    }

}

These are the kind of issues I'm getting & questions I have: Do I need to configure the connection manager at the solution level? On the script component or does the code contain everything I need?

What datatype/function/method do I use to read a cell as a number, a date or a string?

Here are some of the errors I'm getting:

(2,14): error CS0234: The type or namespace name 'Data' does not exist in the namespace 'System' (are you missing an assembly reference?)
(3,14): error CS0234: The type or namespace name 'Data' does not exist in the namespace 'System' (are you missing an assembly reference?)
(4,17): error CS0234: The type or namespace name 'SqlServer' does not exist in the namespace 'Microsoft' (are you missing an assembly reference?)
(5,17): error CS0234: The type or namespace name 'SqlServer' does not exist in the namespace 'Microsoft' (are you missing an assembly reference?)
(8,27): error CS0246: The type or namespace name 'UserComponent' could not be found (are you missing a using directive or an assembly reference?)
 + additional 10 errors

I'm using: VS2019, SQL Server 2016 SP2, Office 2016.

Here's a screen shot of my Excel Sheet:

Responses

It contains responses to surveys provided by a 3rd party call centre. There's a few columns to identify the call, And then each column header contains the question and the clumn data is the response. It currently has 189 Columns and will grow/change over time. My idea is to select the first 4 columns which reliably identify each row plus 1 additional column For each Q&A, and iterate through the sheet picking up each column header and contents so that can be inserted into a table. As more columns get added they'd just get appended as additional rows. The example code I pasted just shows 2 columns as I was trying to keep the example simple.

This is what I've tried. 1. Add a DataFlow task to Control Flow. ControlFlow

  1. Add Script Component to DataFlow.

When prompted select Source.

Double click and open the editor.

Select Connection Manager. Click Add

On new row select New connection,

Add new SSIS Connection Manager appears.

This is about where I get stuck. I've been picking OLEDB Connection.

  1. In the OLEDB Connection Manager what should I select?

I've tried Access and MS Oledb Simple Provider. And things start to unravel about here. Given that the filename, path and driver details are provided in the script what details do I need for the connection manager?

I started playing with the code, but I have no idea if what I have done upstream is working or is correct. So i'm trying to start from the beginning. I'm asking for help walking through the setup steps to get to the C# point.

Thanks to all for thier responses.

Cheers

Pete

11
  • 2
    This looks like a script task in your control flow. Any reason you're not using an Excel data source in a Data Flow task? Commented Apr 30, 2020 at 6:51
  • @thinkOfaNumber the code is a script task within a data flow. (Well thats what im trying to do) I want to use C# like the example because the xls layout will change over time. The sample code will allow me to loop thru the columns and get the column name and contents. Commented Apr 30, 2020 at 11:48
  • I think you mean script task within control flow. If you have within dataflow, it is script component. Please clarify Commented Apr 30, 2020 at 11:55
  • From your code, it appears that you are using script component. not script task Commented Apr 30, 2020 at 11:57
  • 1
    Don't choose a connection manager. you do this in the code. Commented May 1, 2020 at 17:31

1 Answer 1

1

There's nothing that needs to be added to the script component GUI except for the outputs (remember to get the data types correct.)

Do not set up a connection manager, you are doing that in code because you do not want SSIS trying to figure out your column names and frankly, we don't know the column names after column 4.

Here is your code. I only changed a few things.

public override void CreateNewOutputRows()
    {
        //Change this to your filename you do not need a connection manager
        string fileName = @"E:\SFTP\RSS\Results.xlsx";  
        string SheetName = "Sheet1";  
        string cstr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 12.0;HDR=YES;IMEX=1\"";

        using (System.Data.OleDb.OleDbConnection xlConn = new System.Data.OleDb.OleDbConnection(cstr))
        {
           xlConn.Open();
           System.Data.OleDb.OleDbCommand xlCmd = xlConn.CreateCommand();
           xlCmd.CommandText = "Select * from [" + SheetName + "$]"; //I assume this is the data you want
           xlCmd.CommandType = CommandType.Text;
           using (System.Data.OleDb.OleDbDataReader rdr = xlCmd.ExecuteReader())
           {
              while (rdr.Read())
              {
                 for (int i = 4; i < rdr.FieldCount; i++) //loop from 5th column to last
                 {
                    //The first 4 columns are static and added to every row
                    Output0Buffer.AddRow();
                    Output0Buffer.UniqueID = Int32.Parse(rdr[0].ToString());
                    Output0Buffer.Year = Int32.Parse(rdr[1].ToString());
                    Output0Buffer.ReportingWave = rdr.GetString(2);
                    Output0Buffer.SubmissionDate = rdr.GetString(3);
                    Output0Buffer.Question = rdr.GetName(i);
                    Output0Buffer.Answer = rdr.GetString(i);
                 }

            }
          }
          xlConn.Close();
       }
    }

This code successfully imported a file that looks like this:

enter image description here

Sign up to request clarification or add additional context in comments.

6 Comments

Wow. Thank you Keith that is awesome. I will get onto it in the morning.
So. Leave the connection manager blank. What about the output columns?
you need to define the output
Progress of sorts.:) I started with a new script component and added your code block. I've set up the Output columns (Do they need to be numbered?) and I can now run the dataflow. when I do I get: An OLE DB Provider was not specified in the ConnectionString. An example would be, 'Provider=SQLOLEDB;'.
@SirSwears-a-lot This is fully tested now. I changed cstr (I messed it up when typing), how i was reading ints, and changed the date to a string (you can cast later)
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.