Select max value from column for every value in other two columns

Question

I'm working on a webapp that tracks tvshows, and I need to get all episodes id's that are season finales, which means, the highest episode number from all seasons, for all tvshows.

This is a simplified version of my "episodes" table.

 id   tvshow_id   season   epnum
 ---|-----------|--------|-------
 1  |     1     |    1   |   1
 2  |     1     |    1   |   2
 3  |     1     |    1   |   3
 4  |     1     |    2   |   1
 5  |     1     |    2   |   2
 6  |     2     |    1   |   1
 7  |     2     |    1   |   2
 8  |     2     |    1   |   3
 9  |     2     |    1   |   4
10  |     2     |    2   |   1
11  |     2     |    2   |   2

The expect output:

 id
 ---|
 3  |
 5  |
 9  |
11  |

I've managed to get this working for the latest season but I can't make it work for all seasons.

I've also tried to take some ideas from this but I can't seem to find a way to add the tvshow_id in there.

I'm using Postgres v10

If you say you already got parts of it working, it'd be good to see your source code. It's easier to fix a small error than writing it all anew. — Matthias Bö
– Matthias Bö, Commented Jun 29, 2018 at 21:34
I don't think it will be useful because its limited to the latest season and the tvshow_id is hardcoded, so the correct answer is unlikely to use any of this.. but here you go pastebin.com/DdVQMTx2 — nip
– nip, Commented Jun 29, 2018 at 21:38

Ajay Gupta · Accepted Answer · 2018-06-29 21:42:02Z

2

SELECT Id from
(Select *, Row_number() over (partition by tvshow_id,season order by epnum desc) as ranking from tbl)c
Where ranking=1

answered Jun 29, 2018 at 21:42

Ajay Gupta

1,8651 gold badge14 silver badges22 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

nip Over a year ago

How would this relate to @Vivek 's answer in terms of performance ? I got the same result (both output and operation time) and the test table that im using is quite small (~150 rows), so I dont have an idea.

Shawn Over a year ago

@nip I'm not sure about PostgreS, but window functions are usually very performant in most databases I've used. Plus it keeps things set-based. Also, WHERE x IN (1,2,3) is pretty much WHERE x = 1 OR x=2 OR x=3. For large lists, that can get pretty slow.

nip Over a year ago

@Shawn Well, now I'm concerced because this table is going to get somewhat big and I've added another WHERE IN to filter the tvshow_id's, since I only want the season finales from a certain number of tvshows. How would this work (the filtering) with this answer without using WHERE IN ? Is this even possible ?

Shawn Over a year ago

@nip I'd use the ROW_NUMBER() method. Use Ajay Gupta's answer. postgresqltutorial.com/postgresql-row_number

Vivek · Accepted Answer · 2018-06-29 21:40:22Z

2

You can use the below SQL to get your result, using GROUP BY with sub-subquery as:

select id from tab_x
where (tvshow_id,season,epnum) in (
select tvshow_id,season,max(epnum)
from tab_x
group by tvshow_id,season)

answered Jun 29, 2018 at 21:40

Vivek

8036 silver badges13 bronze badges

Comments

Sabari · Accepted Answer · 2018-07-01 02:56:36Z

0

Below is the simple query to get desired result. Below query is also good in performance with help of using distinct on() clause

  select
   distinct on (tvshow_id,season)
   id
  from your_table
  order by tvshow_id,season ,epnum desc

edited Jul 1, 2018 at 2:56

answered Jul 1, 2018 at 2:51

Sabari

2442 silver badges6 bronze badges

Collectives™ on Stack Overflow

Select max value from column for every value in other two columns

3 Answers 3

4 Comments

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

4 Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related