Snowflake : IN operator

Question

I want to do something like the following in my query:

select * from table a
where a.id in(select id, max(date) from table a group by id)

I am getting an error here, because IN is equivalent to =.

How can I do this?

Example:

id	date
1	2022-31-01
1	2022-21-03
2	2022-01-01
2	2022-02-01

I need to get only one record per id, based on max(date). The table has more columns than just id and date.

So I need something like this in Snowflake:

select * from table a
where id in(select id,max(date) from table a group by id)
-----------------------
All the solutions work if I select directly from the table.

But I have a CASE expression in a view, and the view returns duplicate records.

Example:

create or replace view v_test
as
select * from

(

select id,lastdatetime,*,
case when start_date < timestamp and timestamp < end
and move_date = '9999-12-31' then 'Y'
else 'N' end as IND

from table a
) a


So if anyone selects from the view where IND = 'Y', more than one record comes back.
What I want is to select the latest record for each ID where IND = 'Y', that is, the one with max(lastdatetime).

How can I incorporate this logic into the view?

Mike Walton · Accepted Answer · 2022-05-20 22:16:49Z

1

I think you are trying to get the latest record for each id?

select * 
from table a
qualify row_number() over (partition by id order by date desc) = 1
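
For the view in the updated question, the same QUALIFY pattern can probably be applied on top of the flagged rows. This is only a sketch under assumptions: my_table stands in for the real table, and event_ts and end_date are hypothetical names for the columns the question writes as timestamp and end. It reads the requirement as "the latest row among those flagged 'Y'", so it partitions by both id and ind.

```sql
-- Sketch only: my_table, event_ts and end_date are placeholder names,
-- not the asker's real identifiers.
create or replace view v_test as
select *
from (
    select
        a.*,
        case
            when a.start_date < a.event_ts
             and a.event_ts < a.end_date
             and a.move_date = '9999-12-31' then 'Y'
            else 'N'
        end as ind
    from my_table as a
) as flagged
-- keep the newest row per (id, ind); ties on lastdatetime are broken arbitrarily
qualify row_number() over (
    partition by id, ind
    order by lastdatetime desc
) = 1;
```

With this shape, selecting from v_test where IND = 'Y' returns at most one row per id. If the requirement is instead "the latest row overall, but only when it is flagged 'Y'", partition by id alone and filter on ind afterwards.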


Simeon Pilgrim · Accepted Answer · 2022-05-20 23:56:03Z

So let's look at your sub-select, using this "data" CTE for the examples:

with data (id, _date) as (
select column1, to_date(column2, 'yyyy-dd-mm') from values
    (1, '2022-31-01'),
    (1, '2022-21-03'),
    (2, '2022-01-01'),
    (2, '2022-02-01')
)

select id, max(_date) 
from data
group by 1;

it gives:

ID	MAX(_DATE)
1	2022-03-21
2	2022-01-02

which makes it seem you want "the last date, per id",

which can classically (ANSI SQL) be written:

select d.* 
from data as d
join (
    select 
        id, 
        max(_date) as max_date
    from data
    group by 1
) as c
    on d.id = c.id and d._date = c.max_date
;

ID	_DATE
1	2022-03-21
2	2022-01-02

which gives you all the column values of the matching rows. BUT if several rows share the same last date for an id, you will get all of them in the output.

Another method is to use ROW_NUMBER to pick one and only one row per id, which is the style of answer Mike has given:

with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
    (1, '2022-31-01', 'extra_a'),
    (1, '2022-21-03', 'extra_b_double_a'),
    (1, '2022-21-03', 'extra_b_double_b'),
    (2, '2022-01-01', 'extra_c'),
    (2, '2022-02-01', 'extra_d')
)
select *
from data
qualify row_number() over (partition by id order by _date desc) =1 ;

gives:

ID	_DATE	EXTRA
1	2022-03-21	extra_b_double_a
2	2022-01-02	extra_d
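
Note that id 1 has two rows tied on the latest date, and ROW_NUMBER on its own does not promise which of them survives. A minimal sketch of making the pick repeatable is to add a tie-breaker to the ORDER BY. Sorting on extra is just an illustrative choice here; any column that is unique within the tie would do.

```sql
with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
    (1, '2022-31-01', 'extra_a'),
    (1, '2022-21-03', 'extra_b_double_a'),
    (1, '2022-21-03', 'extra_b_double_b'),
    (2, '2022-01-01', 'extra_c'),
    (2, '2022-02-01', 'extra_d')
)
select *
from data
-- extra is an arbitrary tie-breaker chosen for illustration
qualify row_number() over (partition by id order by _date desc, extra) = 1;
```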

Now, if you want "all rows of the last day", your method works, although QUALIFY with a window function is faster. You can use RANK (or DENSE_RANK, as here; the two agree for rank 1):

with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
    (1, '2022-31-01', 'extra_a'),
    (1, '2022-21-03', 'extra_b_double_a'),
    (1, '2022-21-03', 'extra_b_double_b'),
    (2, '2022-01-01', 'extra_c'),
    (2, '2022-02-01', 'extra_d')
)
select *
from data
qualify dense_rank() over (partition by id order by _date desc) =1 ;

ID	_DATE	EXTRA
1	2022-03-21	extra_b_double_a
1	2022-03-21	extra_b_double_b
2	2022-01-02	extra_d

Now, the last thing it almost seems you are asking for is "how do I find the ID with the most recent data (here 1) and get all rows for that ID":

with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
    (1, '2022-31-01', 'extra_a'),
    (1, '2022-21-03', 'extra_b_double_a'),
    (1, '2022-21-03', 'extra_b_double_b'),
    (2, '2022-01-01', 'extra_c'),
    (2, '2022-02-01', 'extra_d')
)
select *
from data
qualify id = last_value(id) over (order by _date);
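
That query relies on the window frame LAST_VALUE uses when only ORDER BY is given. As a hedged variant, the frame can be spelled out so the intent does not depend on defaults. With the sample data this keeps the three rows for id 1, since 2022-03-21 is the most recent date overall. If two different ids shared that latest date, which one wins would be arbitrary.

```sql
with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
    (1, '2022-31-01', 'extra_a'),
    (1, '2022-21-03', 'extra_b_double_a'),
    (1, '2022-21-03', 'extra_b_double_b'),
    (2, '2022-01-01', 'extra_c'),
    (2, '2022-02-01', 'extra_d')
)
select *
from data
-- explicit frame: look at the whole result set, not just the rows up to the current one
qualify id = last_value(id) over (
    order by _date
    rows between unbounded preceding and unbounded following
);
```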

Anika Shahi · Accepted Answer · 2022-05-20 22:18:35Z

0

Here is an example of how to use the in operator with a subquery:

select * from table1 t1 where t1.id in (select t2.id from table2 t2);


1 Comment

coool_sweet Over a year ago

I need the id based on max(date) here.

Lukasz Szozda · Accepted Answer · 2022-05-21 09:23:34Z

0

IN can be used to match on both columns at once:

select * 
from tab AS a
where (a.id, a.date) in (select id, max(date) from tab group by id);

For sample data:

CREATE TABLE tab (id, date)
AS 
SELECT column1, to_date(column2, 'yyyy-dd-mm')
FROM VALUES
    (1, '2022-31-01'),
    (1, '2022-21-03'),
    (2, '2022-01-01'),
    (2, '2022-02-01');

Output:

ID	DATE
1	2022-03-21
2	2022-01-02
