I'm trying to order a list of records by type, with a limit of 5 records per type.

For example:

I have 25 different GICS codes, such as 2050, 4010, 2540, and so on. Each GICS code represents a different industry: for example, 2050 is banks, 4010 is automobiles, and 2540 is cosmetics.

Each GICS code is assigned to multiple companies, and each company is given a score. I want to select the bottom 5 companies for each GICS code.

Is this possible in a single query, or do I need multiple SQL statements?

Below is my SQL:

select g.4digits, c.company_name, os.* from overall_scores os
join company c
on c.company_id = os.company_id
join gics g
on g.company_id = c.company_id
where g.4digits in ((2550), (4010), (2540))
and os.overall_score <> 'NA'
and os.overall_score <> 'NaN'
order by os.overall_score asc limit 5;

1 Answer

MySQL versions before 8.0 don't support analytic (window) functions such as ROW_NUMBER, which would be the natural tool here. We can emulate it with user variables:

SELECT T.*
  FROM (SELECT g.4digits, c.company_name, os.*,
               -- Restart the counter whenever the GICS code changes
               CASE
                 WHEN @gistype != g.4digits THEN @rownum := 1
                 ELSE @rownum := @rownum + 1
               END AS seq,
               -- Remember this row's GICS code for the next comparison
               @gistype := g.4digits AS var_gistype
          FROM overall_scores os
          JOIN company c
            ON c.company_id = os.company_id
          JOIN gics g
            ON g.company_id = c.company_id
          -- Initialise the user variables once
          JOIN (SELECT @rownum := NULL, @gistype := '') r
         WHERE g.4digits IN (2550, 4010, 2540)
           AND os.overall_score <> 'NA'
           AND os.overall_score <> 'NaN'
         ORDER BY g.4digits, os.overall_score ASC) T
 WHERE T.seq <= 5
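
For comparison, MySQL 8.0 and later added window functions, so the same per-group numbering can be written with ROW_NUMBER() directly. The sketch below is not the asker's real setup: it runs the query against an in-memory SQLite database from Python (SQLite 3.25 or newer is assumed, for window-function support) purely so it is self-contained. The miniature schema, the column name gics_code (standing in for 4digits), and the sample rows are all made up for illustration, and the 'NA'/'NaN' filters are left out for brevity.

```python
import sqlite3

# Hypothetical miniature schema; gics_code stands in for the 4digits column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (company_id INTEGER PRIMARY KEY, company_name TEXT);
    CREATE TABLE gics (company_id INTEGER, gics_code INTEGER);
    CREATE TABLE overall_scores (company_id INTEGER, overall_score REAL);
""")

# Made-up data: 7 companies in each of two GICS codes, with scores 1..7.
company_id = 0
for code in (2540, 4010):
    for score in range(1, 8):
        company_id += 1
        conn.execute("INSERT INTO company VALUES (?, ?)",
                     (company_id, f"co_{code}_{score}"))
        conn.execute("INSERT INTO gics VALUES (?, ?)", (company_id, code))
        conn.execute("INSERT INTO overall_scores VALUES (?, ?)",
                     (company_id, float(score)))

# Number the rows within each GICS code by ascending score,
# then keep only the first 5 of each partition.
BOTTOM_N_SQL = """
SELECT gics_code, company_name, overall_score
  FROM (SELECT g.gics_code, c.company_name, os.overall_score,
               ROW_NUMBER() OVER (PARTITION BY g.gics_code
                                  ORDER BY os.overall_score ASC) AS seq
          FROM overall_scores os
          JOIN company c ON c.company_id = os.company_id
          JOIN gics g ON g.company_id = c.company_id) t
 WHERE seq <= 5
 ORDER BY gics_code, overall_score
"""

rows = conn.execute(BOTTOM_N_SQL).fetchall()
for row in rows:
    print(row)
```

The inner SELECT should carry over to MySQL 8.0+ largely unchanged once the real column names are substituted; switching ASC to DESC in the OVER clause would pick the 5 highest scores per code instead.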
4 Comments

What does the T stand for in T.* and T.seq? Can you explain to me what the SQL is actually doing especially the CASE part?
We maintain two variables, @rownum and @gistype. When the GICS type is the same as on the previous row, we increment @rownum; otherwise we reset @rownum to 1. Because the results are ordered by g.4digits, all rows with the same type are numbered sequentially until the next type appears, where the numbering starts again from 1. This partitions the data by GICS type and assigns a row number within each partition. T is the alias for the subquery, which contains the row-number column (seq) along with the other columns. The outer SELECT then keeps only the first 5 rows in each bucket.
Thank you for the clarification. But it seems to me that it's pulling the top 5 scores. What would I change to get the bottom 5?
Use ORDER BY g.4digits, os.overall_score ASC instead of ORDER BY g.4digits.
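
The counter logic described in the comments can be sketched outside SQL as well. The following plain-Python version is only an illustration of the idea: the function name bottom_n_per_group and the sample tuples are hypothetical, with current_group playing the role of @gistype and rownum the role of @rownum.

```python
def bottom_n_per_group(rows, n=5):
    """Keep the n lowest-scoring rows per group code.

    rows is an iterable of (code, name, score) tuples.
    """
    # Sort by group code, then by score ascending (the subquery's ORDER BY).
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    result = []
    current_group = None  # plays the role of @gistype
    rownum = 0            # plays the role of @rownum
    for code, name, score in ordered:
        if code != current_group:
            # New group: restart the numbering at 1.
            rownum = 1
            current_group = code
        else:
            rownum += 1
        # Equivalent of the outer WHERE T.seq <= n.
        if rownum <= n:
            result.append((code, name, score, rownum))
    return result


# Made-up sample data: (GICS code, company name, score).
sample = [
    (4010, "A", 3.0), (4010, "B", 1.0), (4010, "C", 2.0),
    (2540, "D", 9.0), (2540, "E", 4.0),
]
picked = bottom_n_per_group(sample, n=2)
print(picked)
# [(2540, 'E', 4.0, 1), (2540, 'D', 9.0, 2), (4010, 'B', 1.0, 1), (4010, 'C', 2.0, 2)]
```

Sorting by score descending instead would number the highest scores first, which is the only change needed to switch between the two ends of the ranking.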
