
I am currently working on a prototype package to crunch some data. There is a nightly CSV output from an Informix box which is intended to be made redundant. My plan is to read this CSV using BIDS, do stuff with the data, such as some basic cleaning and calculations, and then insert the results into a SQL Server 2008 table.

I have only had mild exposure to SSIS, not quite enough to know which approach is best. I currently have a script task that reads the data into a DataTable object, but I have stopped for fear that this may not be the best approach.

To summarise:

Import CSV > do stuff (calcs etc.) > insert into a new table.

Which combination of components would quickly and easily achieve this?

EDIT:

The data has nothing unique about it.

20140722|0000771935|000000000000012654|0000012775|      40.000-|      289.20-|       346.800 |        346.80 |       346.800 |GBP  |0|
20140722|0000771935|000000000000012654|0000012775|      40.000-|      289.20-|       346.800 |        346.80 |       346.800 |GBP  |0|
20140722|0000771935|000000000000012654|0000012775|      40.000-|      289.20-|       346.800 |        346.80 |       346.800 |GBP  |0|

That is a snippet of some of the rows. The format of some fields can vary.

000000000000012654 can become F021 or X00F5

This refers to SKU data and order/pallet quantity. These three rows are for a particular customer order/SKU/date/order quantity/price/discounts/currency etc.

As you can see, the rows are all the same. The data has been like this for 15 years, and why it has not been grouped is beyond my understanding. I'm fairly new to this business and this is a task they gave me. I expect these columns are SELECTed from a view that makes the rows unique. This is all I have to work with. Strange requirement.

2 Answers


Personally I wouldn't use SSIS, as I prefer scripting solutions over UI solutions.

You can import the CSV into a staging table using any number of methods: BCP.EXE, BULK INSERT, OPENROWSET (and SSIS, of course).

Then you can run the required UPDATE/INSERT statements on your staging table, writing log rows to a table if required.

Then move the data to the final table, again using UPDATE/INSERT.

If you use BULK INSERT then this could all be written inside one stored procedure.
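
For example, a minimal sketch of what that stored procedure might look like, assuming a hypothetical staging table dbo.OrderStaging and final table dbo.OrderFact shaped roughly like the pipe-delimited sample in the question (all names, column types and the file path are assumptions to adapt):

-- Hypothetical staging table matching the pipe-delimited layout; load everything as text first.
CREATE TABLE dbo.OrderStaging
(
    OrderDate   CHAR(8),
    CustomerNo  VARCHAR(10),
    Sku         VARCHAR(20),
    OrderNo     VARCHAR(10),
    Qty         VARCHAR(20),   -- trailing-minus values such as '40.000-'
    Discount    VARCHAR(20),
    GrossValue  VARCHAR(20),
    NetValue    VARCHAR(20),
    LineValue   VARCHAR(20),
    Currency    CHAR(5),
    Flag        CHAR(1),
    Trailer     VARCHAR(1)     -- absorbs the empty field created by the trailing '|'
);
GO

CREATE PROCEDURE dbo.LoadNightlyCsv
    @FilePath NVARCHAR(260)    -- e.g. N'C:\Downloads\nightly.csv' (assumed location)
AS
BEGIN
    SET NOCOUNT ON;

    TRUNCATE TABLE dbo.OrderStaging;

    -- BULK INSERT requires a literal file name, so build the statement dynamically.
    DECLARE @sql NVARCHAR(MAX) =
        N'BULK INSERT dbo.OrderStaging
          FROM ''' + @FilePath + N'''
          WITH (FIELDTERMINATOR = ''|'', ROWTERMINATOR = ''\n'');';
    EXEC (@sql);

    -- Clean up in the staging table, e.g. convert the trailing minus sign on Qty.
    UPDATE dbo.OrderStaging
    SET Qty = CASE WHEN Qty LIKE '%-'
                   THEN '-' + LTRIM(RTRIM(REPLACE(Qty, '-', '')))
                   ELSE LTRIM(RTRIM(Qty)) END;

    -- Move to the final table; DISTINCT collapses the duplicate rows shown in the question.
    INSERT INTO dbo.OrderFact (OrderDate, CustomerNo, Sku, OrderNo, Qty, Currency)
    SELECT DISTINCT OrderDate, CustomerNo, Sku, OrderNo,
           CAST(Qty AS DECIMAL(18, 3)), Currency
    FROM dbo.OrderStaging;
END
GO

You would then point a SQL Server Agent job (or the SSIS package) at dbo.LoadNightlyCsv once the nightly file has landed.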

If you would like more details, post back and I will expand further.

I like the staging table approach because all of your workings can be seen in the staging table, as opposed to SSIS where calculations are performed on the fly and get pushed straight into the final table.



The way I have done it is to call a REST web service to get the CSV. This is done through the first script task in the SSIS package. You may also want to update a schedule table as one of the first steps, to record batch run info if more than one file is downloaded.

Once the CSV files are in a folder on the server, e.g. C:\Downloads, you add a flat file connection to the downloaded CSV file. Then create a data flow task containing the flat file source alongside an OLE DB source (for the database table that holds the data).

Then what you want is a Sort underneath each source, followed by a Merge Join with a left outer join on ID. You can create a Conditional Split underneath this that routes rows to new or existing depending on whether the ID is null or not (it will be null due to the outer join). Then, underneath the Conditional Split, you use an OLE DB Command for updates to existing rows (ID not null) and an OLE DB Destination for inserts (ID null, i.e. new records); see the update sketch after the diagram below.

Here is the structure of one that I have done, scheduled through SQL Server Agent:

flat file source (csv file)                       oledbsource (db table)
           |                                                 |
           |                                                 |
sort (by ID)                                      sort (by ID)
       |                                                 |
       |--------------------------------------------------
                               |
                           merge join (left outer)
                               |
                               |
                           conditional split (ID null = new, not null = existing)
                               |
                   [**your calculations here]
                               |
       existing-----------------------------------------new
       |                                                  |
oledb command (update table command)               oledb destination (insert)   
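
On the existing branch, the OLE DB Command runs a parameterised statement; each ? marker is bound to a pipeline column on the component's column mappings page. A rough example, with hypothetical table and column names:

-- Hypothetical update run by the OLE DB Command for existing rows;
-- the ? markers are mapped to pipeline columns in the component.
UPDATE dbo.OrderFact
SET    Qty      = ?,
       Price    = ?,
       Currency = ?
WHERE  ID = ?;

The new branch simply maps the pipeline columns to the table columns in the OLE DB Destination.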

Regards, Rob

3 Comments

Good suggestion. Pardon the lack of clarity on my part here, but this data does not always have an ID. It's data based on views which needs to go into its own table. So with that in mind...
Are there any candidate key combinations you could do the merge join on? e.g. a combination of fields that would give a unique value? If not, maybe you could extract this data into a staging table first, then call a TRUNCATE SQL command on the staging table to clear the data out once you have your data in your operational/reporting table. Then you could add some data flow tasks for data cleansing and data translation if required.
I'm afraid not. This is what I looked for in the data. See my edit
