2

I found usefull JS function that I want to implement into snowflake's UDF in order to move complex computation to the dwh.

http://trentrichardson.com/2010/04/06/compute-linear-regressions-in-javascript/

Attempt to rework above as SQL procedure:

CREATE OR REPLACE PROCEDURE JS_LNR_REG(a array, b array)
RETURNS ARRAY
LANGUAGE javascript
AS
$$
function linearRegression(y,x){
var lr = {};
var n = y.length;
var sum_x = 0;
var sum_y = 0;
var sum_xy = 0;
var sum_xx = 0;
var sum_yy = 0;

for (var i = 0; i < y.length; i++) {

    sum_x += x[i];
    sum_y += y[i];
    sum_xy += (x[i]*y[i]);
    sum_xx += (x[i]*x[i]);
    sum_yy += (y[i]*y[i]);
} 

lr['slope'] = (n * sum_xy - sum_x * sum_y) / (n*sum_xx - sum_x * sum_x);
lr['intercept'] = (sum_y - lr.slope * sum_x)/n;

return lr;}      ; $$;

Execution fails with error: SQL compilation error: Invalid identifier which is odd since procedure has been created.

SELECT JS_LNR_REG('[2,4,5,3,1]', '[5.2, 5.7, 5.0, 4.2]');

2 Answers 2

1

You're almost there. To pass an array, try using ARRAY_CONSTRUCT or PARSE_JSON, like so:

SELECT JS_LNR_REG(array_construct(1,2,3,4), array_construct(5,4,3,2));
SELECT JS_LNR_REG(parse_json('[1,2,3,4]'), parse_json('[5,4,3,2]'));

To get the code working, make it a FUNCTION that returns OBJECT, and be sure to actually invoke linearRegression() from within your UDF:

CREATE OR REPLACE FUNCTION JS_LNR_REG(A array, B array)
  RETURNS OBJECT
  LANGUAGE JAVASCRIPT
  STRICT
AS
$$
function linearRegression(y,x){
  var lr = {};
  var n = y.length;
  var sum_x = 0;
  var sum_y = 0;
  var sum_xy = 0;
  var sum_xx = 0;
  var sum_yy = 0;

  for (var i = 0; i < y.length; i++) {

      sum_x += x[i];
      sum_y += y[i];
      sum_xy += (x[i]*y[i]);
      sum_xx += (x[i]*x[i]);
      sum_yy += (y[i]*y[i]);
  } 

  lr['slope'] = (n * sum_xy - sum_x * sum_y) / (n*sum_xx - sum_x * sum_x);
  lr['intercept'] = (sum_y - lr.slope * sum_x)/n;

  return lr;
  }      

  return linearRegression(A,B)
; $$;

Hope that's helpful.

Sign up to request clarification or add additional context in comments.

4 Comments

Would return linearRegression(A,B) work as positional parameter?
A and B are positional parameters in the call to lineanRegression(), is that what you mean?
I reparaphrase ;) CREATE OR REPLACE FUNCTION JS_LNR_REG(A array, B array). How A and B in CREATE statement know to pass these arrays into return linearRegression(A,B)
Ah yes, they are indeed positional.
1

Snowflake natively supports REGR_INTERCEPT and REGR_SLOPE functions:

Input as arrays:

SELECT REGR_SLOPE(f2.value, f1.value) AS slope
      ,REGR_INTERCEPT(f2.value, f1.value) AS intercept 
FROM TABLE(FLATTEN(input => parse_json('[2,4,5,3]'))) AS f1
    ,TABLE(FLATTEN(input => parse_json('[5.2, 5.7, 5.0, 4.2]'))) AS f2
WHERE f1.INDEX = f2.INDEX;
-- slope    intercept
-- 0.09      4.71

Input as table:

CREATE OR REPLACE TABLE tab(x NUMBER(18,2), y NUMBER(18,2));
INSERT INTO tab(x, y) VALUES (2,5.2), (4, 5.7), (5, 5.0),(3, 4.2);

SELECT REGR_SLOPE(y, x) AS slope, REGR_INTERCEPT(y, x) AS intercept
FROM tab;
-- slope    intercept
-- 0.09      4.71

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.