I have to apply simple formatting to 10,000+ Excel files. I already have a multicore program working. It opens a new Excel instance for every Excel file. If it matters, this code is currently inside of my Form code.
I was hoping to use one Excel instance with many workbooks. Is it possible to make use of multicore capabilities if I only use one instance of Excel? How?
-If the answer to the above is no, a more complicated question might be: How many instances of Excel should I spawn, and how can I split the workbooks between each instance?
Current code here:
private void SelectFilesButtonClick(object sender, EventArgs e)
{
var listOfExcelFiles = OpenExcel.FileNames.ToList();
Parallel.ForEach(listOfExcelFiles, TrivialExcelEditFunction);
}
private void TrivialExcelEditFunction(string file)
{
//Open instance of Excel
//Do processing
//Close instance of Excel
}
Updated code below, but still doesn't limit itself to the proper number of cores. Not sure why.
private void SelectFilesButtonClick(object sender, EventArgs e)
{
var listOfExcelFiles = OpenExcel.FileNames.ToList();
int cores = Environment.ProcessorCount;
//Split one list into list of lists. Number of lists based on number of cpu cores
List<List<object>> listOfLists = Split(listOfExcelFiles, cores);
//Limits number of threads to number of cores
Parallel.ForEach(listOfLists, new ParallelOptions { MaxDegreeOfParallelism = cores }, EditExcel);
}
private void TrivialExcelEditFunction(string file)
{
//Open instance of Excel
foreach (string file in files)
{
//Do processing
}
//Close instance of Excel
}
Lets say I have 4 cores. My thinking was to divide list of files into 4 equal lists, limit threads to 4, then I'd be able to process files in 4 instances of Excel. I thought that would mean TrivialExcelEditFunction would only run 4 times. Instead this function is being run anywhere between 14 and 27 times. Please tell me where I went wrong.