
I am working on a new Dataflow Gen 2 to import Excel files from SharePoint.

The problem I am running into is that our vendor, who supplies the files, is not setting the worksheet dimension element correctly in the XML of the XLSX file.

As downloaded from our vendor's site, the file has xl\worksheets\sheet1.xml (inside the XLSX zip archive) defined with <dimension ref="A1:A56812" />.

The Dataflow Gen 2 reads that and shows only the first column.

However, there are columns A - Y in the sheet. If we open the sheet in Excel, change any of the data and save the file, the dimension is defined as <dimension ref="A1:Y56812" /> and the Dataflow Gen 2 pulls in all the columns.

Is there a way to force the Dataflow Gen 2 to ignore the dimension attribute and import columns A - Y?

Or do I need to have "Open the sheet, change something, save the sheet" as part of the manual download tasks? I want to automate as much as possible.

  • "The problem I am running into is our vendor, who is supplying the files, is not properly setting the worksheet dimension field" - Then get back to your vendor and tell them to properly set the worksheet dimension field (whatever that means). Don't try to "fix" bad data. Get back to the source of the data and get them to fix it. Commented Oct 28 at 20:21
  • @JesperJuhl This vendor is a pain in the ass. Since XLSX files are just zipped XML files, the first property in the worksheet XML is the dimension property, which defines the sheet's range. I ran into this issue at a previous job, where DocumentFormat.OpenXml was doing the same thing because I didn't add columns correctly to the worksheet in the application code. It's a straightforward fix, but getting our vendor to bother with it most likely won't happen. Commented Oct 28 at 20:25
  • As a means of potentially ignoring a file's metadata: Power BI Options -> Global -> Data Load -> Never detect column types & headers for unstructured sources ? Commented Oct 29 at 1:00
  • @SpectralInstance The issue is happening before Power BI attempts to detect the column types and headers. It's detecting only one column (from the dimension XML property) when initially opening the file. Commented Oct 31 at 14:58

1 Answer


Welp, finally found the right incantation to summon the result from the great wizard of Google.

The option I was looking for is InferSheetDimensions. However, the documentation for the Excel.Workbook Power Query function is a bit too terse to follow easily.

Calling Excel.Workbook(File.Contents("filePath.xlsx"), [InferSheetDimensions=true, UseHeaders=false], null) ignores the dimension metadata in the Excel file and infers the data area by reading the worksheet itself.

To use InferSheetDimensions, you need to pass an options record as the useHeaders argument AND pass null as the delayTypes argument.
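
For context, here is a minimal sketch of how that call might sit inside a full query. The file path is the same placeholder as above (in a Dataflow Gen 2 the binary would normally come from a SharePoint connector step instead), and the sheet name "Sheet1" and the presence of a header row are assumptions, so adjust them to your workbook.

```powerquery
let
    // Placeholder source; swap in the binary from your SharePoint step.
    Source = File.Contents("filePath.xlsx"),

    // Pass the options as a record in the second argument and null as the third.
    // InferSheetDimensions = true makes the reader scan the sheet for its
    // real extent instead of trusting the <dimension> element.
    Workbook = Excel.Workbook(
        Source,
        [UseHeaders = false, InferSheetDimensions = true],
        null
    ),

    // Assumption: the worksheet is named "Sheet1".
    Sheet = Workbook{[Item = "Sheet1", Kind = "Sheet"]}[Data],

    // Assumption: the first row holds the column names.
    Promoted = Table.PromoteHeaders(Sheet, [PromoteAllScalars = true])
in
    Promoted
```

Promoting headers in a separate step keeps UseHeaders = false in the options record, matching the call above; setting UseHeaders = true in the record instead should work as well if you prefer a single step.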
