0

I have created following things:

  1. External Stage (pointing to folder in Data Lake Gen2)
  2. File Format (Parquet)

And I want to do a select from a stage but I want to read only those files that have .part. in there filename. Using copy into command I am able to mention this in pattern but I do not want to copy data in a table. I only want to do a select for .part. files. Currently I have tried this:

SELECT 
    $1:Country::String as Country,
    $1:FeatureStr::String as FeatureStr,
    $1:Machineid::String as Machineid,
    $1:ProductId::number as ProductId
FROM @DB.RAW_SCHEMA.FEATURE_STAGE/ClientDetails/2020.03.ClientDetails/ (file_format => DB.RAW_SCHEMA.PARQUET_FORMAT);

But it also looks for files which do not have 'part' in them and so it fails. So then I tried the pattern functionality:

SELECT 
    $1:Country::String as Country,
    $1:FeatureStr::String as FeatureStr,
    $1:Machineid::String as Machineid,
    $1:ProductId::number as ProductId
FROM @DB.RAW_SCHEMA.FEATURE_STAGE/ClientDetails/2020.03.ClientDetails/ (file_format => DB.RAW_SCHEMA.PARQUET_FORMAT)
pattern => '.*part.*';

But it gives me syntax error. Any guidance will be appreciated. :)

1 Answer 1

1

The syntax you are after is:

select * 
from @STAGE_NAME/PATH
(FILE_FORMAT => PARQUET_FORMAT, PATTERN => '.*[.]part[.].*' )

Square brackets in the regex are there to also match the dots in .part.. The regex you used will work to just match all the files with the word part

Sign up to request clarification or add additional context in comments.

1 Comment

thanks, Yeah this is what i was looking for. Can you please also look into this issue: stackoverflow.com/questions/62823080/…

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.