
I am trying to bulk load data from Azure Blob Storage into an Azure SQL Database. The file content is:

 customer,age,gender
'C1093826151','4','M'
'C352968107','2','M'
'C2054744914','4','F'

The file is in a container called silver. The same container also holds File1.fmt, whose content is:

14.0  
3
1       SQLCHAR       0       7       ","      1     customer       ""  
2       SQLCHAR       0       100     ","      2     age            SQL_Latin1_General_CP1_CI_AS 
3       SQLCHAR       0       100     "\r\n"   3     gender         SQL_Latin1_General_CP1_CI_AS

I have an extra blank line at the end of the fmt file.

I have created a SAS token with all permissions enabled and allowed, as in the screenshot below:

[screenshot: SAS token settings]

The firewall rules on the data lake are as in the picture below:

[screenshot: firewall rules]

Below are my sql scripts (I removed the ? at the beginning of the SAS token, as my silver container is public, I know I should need the SAS token):

CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'safepassword';
go
DROP EXTERNAL DATA SOURCE MyAzureInvoices

DROP DATABASE SCOPED CREDENTIAL UploadInvoices

CREATE DATABASE SCOPED CREDENTIAL UploadInvoices
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sv=2019-12-12**********************************88%3D'; -- dl

--DROP EXTERNAL DATA SOURCE MyAzureInvoices

CREATE EXTERNAL DATA SOURCE MyAzureInvoices
    WITH (
        TYPE = BLOB_STORAGE,
        LOCATION = 'https://mydatalake.blob.core.windows.net/silver',
        CREDENTIAL = UploadInvoices
    );
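
To confirm that the credential and the data source were created as intended, the catalog views can be queried. This is a minimal sketch that only reads metadata; it does not test whether the SAS token itself is valid or has the right permissions:

```sql
-- List database scoped credentials. For a SAS-based credential,
-- credential_identity should read 'SHARED ACCESS SIGNATURE'.
SELECT name, credential_identity, create_date
FROM sys.database_scoped_credentials;

-- List external data sources and the credential each one is bound to.
SELECT ds.name, ds.location, ds.type_desc, c.name AS credential_name
FROM sys.external_data_sources AS ds
LEFT JOIN sys.database_scoped_credentials AS c
    ON c.credential_id = ds.credential_id;
```

A NULL credential_name for MyAzureInvoices would mean the data source is not bound to any credential.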

Landing table:

CREATE TABLE [ext].[customer](
    [customer_id] [int] IDENTITY(1,1) NOT NULL,
    [customer] [varchar](100) NOT NULL,
    [age] [int] NOT NULL,
    [gender] [varchar](50) NOT NULL
) ON [PRIMARY]
GO

These are the ways I tried to load the file into the SQL database:

-- 1
SELECT * FROM OPENROWSET(
    BULK 'bs140513_032310-demo.csv',
    DATA_SOURCE = 'MyAzureInvoices',
    FORMAT = 'CSV',
    FORMATFILE = 'File1.fmt',
    FORMATFILE_DATA_SOURCE = 'MyAzureInvoices'
) AS DataFile;
go
-- 2
SELECT * FROM OPENROWSET(
    BULK 'bs140513_032310-demo.csv',
    DATA_SOURCE = 'MyAzureInvoices',
    SINGLE_CLOB
) AS DataFile;
go
-- 3
BULK INSERT ext.customer
FROM 'bs140513_032310-demo.csv'
WITH (
    DATA_SOURCE = 'MyAzureInvoices',
    FORMAT = 'CSV'
);

They all give the same error:

Msg 4861, Level 16, State 1, Line 2
Cannot bulk load because the file "bs140513_032310-demo.csv" could not be opened. Operating system error code 5(Access is denied.).

I have tried for 3 days and I am lost. Thanks for your help.

NB: the files can be accessed even without being logged in. (mydatalake is a fake name, but access works with the real one.)

  • See if the file is open in any spreadsheet application. Many times I have saved an Excel file as CSV and tried to upload it to SQL, but it failed because the file was still open in the spreadsheet application. Commented Nov 14, 2020 at 16:09
  • Thanks Codeek, all file editors are closed and the file is in an Azure Data Lake container. Commented Nov 14, 2020 at 16:13
  • You can try to delete bs140513_032310-demo.csv, it will show the file doesn't exist. This error is confirmed on the side, SQL engine can access the file. Commented Nov 16, 2020 at 7:03
  • Hi Joseph, I am able to download the file. Commented Nov 17, 2020 at 1:00

1 Answer

I think this error message is misleading.
I set up the same test as you and hit the same error.
But after I edited bs140513_032310-demo.csv and File1.fmt, it worked.

  1. I changed bs140513_032310-demo.csv like this: [screenshot: edited CSV file]

  2. I changed File1.fmt like this, changing the customer column length from 7 to 100 and the age column length from 100 to 7:

14.0  
3
1       SQLCHAR       0       100       ","      1     customer       ""
2       SQLCHAR       0       7         ","      2     age            SQL_Latin1_General_CP1_CI_AS
3       SQLCHAR       0       100       "\r\n"   3     gender         ""
  3. I used the following statement to query:
   SELECT * FROM OPENROWSET(
   BULK 'bs140513_032310-demo.csv',
   DATA_SOURCE = 'MyAzureInvoices',
   FORMAT = 'CSV',
   FORMATFILE='File1.fmt',
   FORMATFILE_DATA_SOURCE = 'MyAzureInvoices'
   ) AS DataFile; 

The result shows:
[screenshot: query result]

  4. Don't BULK INSERT into your real tables directly.
  • I would always insert into a staging table ext.customer_Staging (without the IDENTITY column) from the CSV file
  • possibly edit / clean up / manipulate your imported data
  • and then copy the data across to the real table with a T-SQL statement like:
INSERT INTO ext.customer_Staging WITH (TABLOCK) (customer, age, gender)
SELECT * FROM OPENROWSET(
    BULK 'bs140513_032310-demo.csv',
    DATA_SOURCE = 'MyAzureInvoices',
    FORMAT = 'CSV',
    FORMATFILE = 'File1.fmt',
    FORMATFILE_DATA_SOURCE = 'MyAzureInvoices'
) AS DataFile;
go

-- Copy from staging into the real table; customer_id is an IDENTITY column and is generated automatically.
INSERT INTO ext.customer (customer, age, gender)
SELECT customer, age, gender
FROM ext.customer_Staging;
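
The staging table has to exist before the first INSERT runs. A minimal sketch, assuming it mirrors ext.customer without the IDENTITY column; the column types are copied from the landing table in the question and are an assumption for the staging table:

```sql
-- Hypothetical staging table: same columns as ext.customer,
-- but without the customer_id IDENTITY column.
CREATE TABLE ext.customer_Staging (
    customer varchar(100) NOT NULL,
    age      int          NOT NULL,
    gender   varchar(50)  NOT NULL
);
```

If the raw file may contain values that do not convert cleanly, the staging columns could instead be declared as varchar and converted in the final INSERT ... SELECT.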