0

I have to scan a 120 GB network drive with over 100,000 folders. I am looking for .ini and .par files. My initial thought was to list all files from all directories and then throw out what I don't need.

I put a Foreach Loop with *.* on the whole drive. Inside the loop, an Execute SQL task inserts the full name of each file found into a table.

I realize that writing to SQL for every record is a big performance issue, but I have been unable to write the list to an SSIS Object variable. Ideally I would write to an in-memory table and push everything into the SQL database at once, only when the scan is finished.

All ideas are welcome. If there is a way to write to the SSIS Object variable, good; if you have a better solution, that is very welcome too.

5
  • SQL Server 2014 Enterprise has in-memory tables that you could use for this, but even then there would still be a row-by-row performance penalty. Commented May 27, 2015 at 15:04
  • Could you give a little more context on what you are trying to do? I would imagine a small C# or VB app would give much better performance if you are just trying to add a list of files to a database (even as an SSIS script task!). Commented May 27, 2015 at 15:09
  • You can probably build and populate your object variable in a script task. I've never done it before, but you can do pretty much anything in a script task. Commented May 27, 2015 at 15:35
  • Creating the list of files is the first step of the process. Afterwards I will fetch and open the files found and take parameters out of them, but I'm already having a big performance issue building the list. I'm running VS 2010 with SQL Server 2012, so no in-memory tables, and due to a faulty install, no script component available either. Now that I read this again, I feel like I have to get IT to upgrade to 2014 and get SSIS working properly. That is out of my control, though. Commented May 27, 2015 at 15:39
  • We do things like this using a C# script that writes all of the file data to XML, then a simple SSIS package to load the XML data into a table all at once. You could use VS Express to make the XML if SSIS scripting is broken. Commented May 27, 2015 at 19:18

2 Answers

1

SSIS can only list files on the network that sit in shared folders. Given this, you can do the following in an SSIS package to get a list of all files with a specific extension. The example below handles the .ini files, but you can easily add a second process in the same package for the .par files, reusing the same two variables.

  1. Create an Object variable called FileList and a String variable called File.
  2. Create a Script Task that reads the .ini files from all subfolders into an array and then saves that array into the object variable. Make sure User::FileList is listed under ReadWriteVariables when setting up the Script Task.

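    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
    using System.IO;

    namespace xxxxxx
    {
        [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
        public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
        {
            public void Main()
            {
                // Collect every .ini file under the share, including all subfolders.
                string[] ini_files = Directory.GetFiles(@"\\servername\sharedfolder", "*.ini", SearchOption.AllDirectories);

                // Store the whole array in the object variable so the Foreach Loop can enumerate it.
                // (Appending each name with += would concatenate them into one long string.)
                Dts.Variables["User::FileList"].Value = ini_files;

                // Report success back to the SSIS runtime.
                Dts.TaskResult = (int)DTSExecResult.Success;
            }
        }
    }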

  3. Create a Foreach Loop container that uses the Foreach From Variable Enumerator on the FileList object variable, mapping each enumerated item to the File string variable. Inside the container, add an Execute SQL Task or a Data Flow Task to save the value to a database table.

This is just one of many ways to approach the task. It keeps the package modular and uses a fast C# call to gather the files.
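
If scanning the share twice (once per extension) is too slow, the two passes can probably be folded into one. The following is only a minimal sketch under some assumptions: the class name FileScan, the method names FindFiles and ToTable, and the table and column names FoundFiles and FullPath are all made up for illustration, and the code assumes .NET 4.0 or later for Directory.EnumerateFiles. It walks the tree once, keeps only .ini and .par files, and loads the result into an in-memory DataTable instead of inserting row by row.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of SSIS: the names here are illustrative only.
public static class FileScan
{
    // Extensions to keep, compared case-insensitively.
    private static readonly HashSet<string> Wanted =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".ini", ".par" };

    // Walks the tree once and returns the full paths of the matching files.
    public static List<string> FindFiles(string root)
    {
        return Directory
            .EnumerateFiles(root, "*", SearchOption.AllDirectories)
            .Where(path => Wanted.Contains(Path.GetExtension(path)))
            .ToList();
    }

    // Builds an in-memory table with one row per file path.
    public static DataTable ToTable(IEnumerable<string> paths)
    {
        var table = new DataTable("FoundFiles");
        table.Columns.Add("FullPath", typeof(string));
        foreach (string path in paths)
        {
            table.Rows.Add(path);
        }
        return table;
    }
}
```

In a Script Task, the list returned by FindFiles could be assigned to User::FileList in place of the array above. Alternatively, the DataTable could be handed to SqlBulkCopy.WriteToServer to push all rows in one operation, which is closer to the "collect in memory, write once" idea from the question. Note that a recursive enumeration like this throws UnauthorizedAccessException when it reaches a folder the account cannot read, so a real scan of a large share may need its own directory walk with error handling.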

0

Based on your comment that the Script Task is not available, here is one approach I can think of:

  1. Create a batch file containing the command "dir %1 /s /b /o:n > %2", which writes the list of matching file names to a text file. %1 (the search pattern) and %2 (the output file) are arguments.
  2. Add two Execute Process Tasks to your package, both with the batch file as the Executable. Set the Arguments value to "Z:\*.ini,C:\tempSSIS\iniList.txt" for one task and "Z:\*.par,C:\tempSSIS\parList.txt" for the other (assuming Z:\ is your network drive; the second argument is the file in which you want to store the list of file names).
  3. Add a Data Flow Task after each Execute Process Task to read the text file and insert the records into the same table or into different tables.
