1

I'm trying to build some stats on a table in snowflake and am using the kurtosis() and skew() functions. Since the calculations are under the hood (the divisions) I cannot use nullif to account for cases where I don't have enough distinct values which will return a Division by zero error. Does Snowflake have some sort of iferror() or is_error() function that I can use to return null or exclude from my table? Or could I build a UDF to do handle this scenario?

EDIT

This is the script I am using

select
    f.key colname,
    f.path path,
    typeof(f.value) type,
    count(distinct {player_col}) player_cnt,
    count(*) event_cnt,
    count(distinct f.value) unique_val_cnt,
    min(f.value) min,
    round(percentile_cont(.25) within group (order by iff(typeof(f.value) in ('DECIMAL', 'INTEGER'), f.value / 1000000, null)) * 1000000, 2) p25,
    round(avg(iff(typeof(f.value) in ('DECIMAL', 'INTEGER'), f.value / 1000000, null)) * 1000000, 2) mean,
    round(median(iff(typeof(f.value) in ('DECIMAL', 'INTEGER'), f.value / 1000000, null)) * 1000000, 2) med,
    round(percentile_cont(.75) within group (order by iff(typeof(f.value) in ('DECIMAL', 'INTEGER'), f.value / 1000000, null)) * 1000000, 2) p75,
    max(f.value) max,
    mode(f.value) mode,
    iff(typeof(f.value) in ('DECIMAL', 'INTEGER'), max(f.value)-min(f.value), null) range,
    round(percentile_cont(.75) within group (order by iff(typeof(f.value) in ('DECIMAL', 'INTEGER'), f.value / 1000000, null)) * 1000000, 2) -
    round(percentile_cont(.25) within group (order by iff(typeof(f.value) in ('DECIMAL', 'INTEGER'), f.value / 1000000, null)) * 1000000, 2) iqr,
    round(stddev(iff(typeof(f.value) in ('DECIMAL', 'INTEGER'), f.value / 1000000, null)) * 1000000, 2) std,
    kurtosis(iff(typeof(f.value) in ('DECIMAL', 'INTEGER'), f.value, null)) kurtosis,
    skew(iff(typeof(f.value) in ('DECIMAL', 'INTEGER'), f.value, null)) skewness
from
    {json_table}, 
    lateral flatten({json_col}, recursive=>true) f
where
    typeof(f.value) not in ('OBJECT', 'NULL_VALUE', 'ARRAY') and f.path not like '%[%]%'
group by 1, 2, 3
order by 1, 2, 3

I get the below error

enter image description here

So I am guessing either some keys have less than 3 values or 1 distinct value across so it would return a DIV0# error.

3
  • @MikeWalton I've added the suggested edit Commented Nov 1, 2021 at 14:23
  • @AndreiBudaes well snowflake has div0 function to handle division by zero error , but doesn't apply to your case , you better look into your data and filter out bad data before doing your math operations Commented Nov 1, 2021 at 14:44
  • 1
    Related: Kurtosis function not playing nice Commented Nov 1, 2021 at 14:50

1 Answer 1

1

There is not a function like you asked in Snowflake. I think the SKEW and the KURTOSIS functions are causing the error because the total of the values is 0 for some groups.

You may try to use IFF and check the SUM of the data. Here is a simple test case:

create or replace table aggr(g number, k number);

insert into aggr values
    (1, 0),
    (1, 0),
    (2, 0);

select g,
IFF(sum(k) >0, skew(k), 0) skewness,
IFF(sum(k) >0, kurtosis(k), 0) kurtosis
from aggr
group by g;


+---+----------+----------------+
| G | SKEWNESS |    KURTOSIS    |
+---+----------+----------------+
| 1 |        0 | 0.000000000000 |
| 2 |        0 | 0.000000000000 |
+---+----------+----------------+

-- inserting more data to be able to calculate skew and kurtosis

insert into aggr values
    (1, 5),
    (1, 2),
    (2, 5),    
    (2, 2),
    (2, 3);

select g,
IFF(sum(k) >0, skew(k), 0) skewness,
IFF(sum(k) >0, kurtosis(k), 0) kurtosis
from aggr
group by g;


+---+--------------+-----------------+
| G |   SKEWNESS   |    KURTOSIS     |
+---+--------------+-----------------+
| 1 | 0.8076203007 | -1.272191860958 |
| 2 | 0.1364722935 | -0.775965134125 |
+---+--------------+-----------------+
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.