I am currently working on a prototype package to crunch some data. There is a nightly CSV output from an Informix box that is due to be made redundant. My plan is to read this CSV using BIDS, do some work on the data (basic cleaning and calculations) and then insert the results into a SQL Server 2008 table.
I have only had mild exposure to SSIS, not quite enough to know what the best approach is. I currently have a Script Task that reads the data into a DataTable object, but I have stopped for fear that this may not be the best approach.
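For reference, this is roughly what the Script Task does at the moment. It is only a minimal sketch; the file path and column names are placeholders, not the real feed.

```csharp
using System;
using System.Data;
using System.IO;

class LoadCsvIntoDataTable
{
    static void Main()
    {
        // Placeholder path and column names -- the real feed has more columns.
        var table = new DataTable();
        table.Columns.Add("TradeDate", typeof(string));
        table.Columns.Add("OrderNumber", typeof(string));
        table.Columns.Add("Sku", typeof(string));

        foreach (var line in File.ReadLines(@"C:\feeds\nightly.csv"))
        {
            // Fields are pipe-delimited; the trailing '|' yields an empty final element.
            var fields = line.Split('|');

            var row = table.NewRow();
            row["TradeDate"]   = fields[0].Trim();
            row["OrderNumber"] = fields[1].Trim();
            row["Sku"]         = fields[2].Trim();
            table.Rows.Add(row);
        }

        Console.WriteLine("Loaded {0} rows.", table.Rows.Count);
    }
}
```

Inside the package this sits in a Script Task rather than a console app, but the logic is the same.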
To summarise:
Import CSV > do stuff (calcs etc.) > insert into a new table.
Which combination of components would quickly and easily achieve this?
EDIT:
The data has nothing unique about it.
20140722|0000771935|000000000000012654|0000012775| 40.000-| 289.20-| 346.800 | 346.80 | 346.800 |GBP |0|
20140722|0000771935|000000000000012654|0000012775| 40.000-| 289.20-| 346.800 | 346.80 | 346.800 |GBP |0|
20140722|0000771935|000000000000012654|0000012775| 40.000-| 289.20-| 346.800 | 346.80 | 346.800 |GBP |0|
That is a snippet of some of the rows. The format of some fields can vary, and the negative amounts carry a trailing minus (see the parsing sketch at the end). For example:
000000000000012654 can become F021 or X00F5
This refers to SKU data and order/pallet quantities. These three rows are for a particular customer order/SKU/date/order quantity/price/discounts/currency, etc.
As you can see, they are all identical. The data has been like this for 15 years, and why it has never been grouped is beyond my understanding. I'm fairly new to this business and this is a task I've been given. I expect these columns come from a SELECT on a view that makes the rows unique. This is all I have to work with. Strange requirement.
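One detail worth noting: the negative amounts in the file carry a trailing minus (e.g. 289.20-), so a straight conversion throws. This is roughly how I am handling that in the script at the moment (method and class names are just for illustration):

```csharp
using System;
using System.Globalization;

class TrailingSignExample
{
    // Converts values like " 289.20-" (trailing minus) into -289.20.
    // NumberStyles.Number already includes AllowTrailingSign, which is what
    // accepts the suffix minus.
    static decimal ParseAmount(string field)
    {
        return decimal.Parse(field.Trim(), NumberStyles.Number, CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        Console.WriteLine(ParseAmount(" 289.20-"));  // -289.20
        Console.WriteLine(ParseAmount(" 346.800 ")); // 346.800
    }
}
```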