MySQL table join with MAX

Question

Table t1 contains a series of basic data and is unique on Id.

Table t2 contains a large amount of time series data that I need to scope down to just a subset. I am only interested in rows matching somevalue and yetanothervalue, and I am struggling to find the cleanest way to do that in this context.

The query below runs, but I have used MAX incorrectly. I am studying the MySQL docs on greatest-n-per-group and trying to get that solved.

I am also interested in where usage and efficiency: what is the best pattern for adding those where clauses?

select t1.*,
    t2.lastdate as lastdate
    from t1
    left join
    ( select Id,
            max(LastDate) as lastdate
            from t2table
            where
            somecolumn like '%somevalue%'
            group by Id
    ) t2
    on t1.Id = t2.Id
    where yetanothercolumn = "yetanothervalue";

Also - any links to docs or other threads and examples appreciated.

If you like, consider following this simple two-step course of action: 1. If you have not already done so, provide proper DDLs (and/or an sqlfiddle) so that we can more easily replicate the problem. 2. If you have not already done so, provide a desired result set that corresponds with the information provided in step 1.
– Strawberry, Commented Aug 21, 2015 at 13:38
I don't understand, it looks correct, what is the problem exactly?
– rlanvin, Commented Aug 21, 2015 at 13:42
Not sure why the left join and not just a join. Focus on indexes in place.
– Drew, Commented Aug 21, 2015 at 13:42

Gordon Linoff · Accepted Answer · 2015-08-21 13:42:59Z

Your query is reasonable:

select t1.*,
       t2.lastdate as lastdate
from t1 left join
     (select Id, max(LastDate) as lastdate
      from t2table
      where somecolumn like '%somevalue%'
      group by Id
     ) t2
     on t1.Id = t2.Id
where yetanothercolumn = 'yetanothervalue';

However, the derived table aggregates every id in t2table that matches the like filter, including ids that are not in the final result set. So, under many circumstances, a correlated subquery will be faster:

select t1.*,
       (select max(LastDate)
        from t2table t2
        where t2.Id = t1.Id and t2.somecolumn like '%somevalue%'
       ) as lastdate
from t1
where yetanothercolumn = 'yetanothervalue';

For performance, you want indexes on t1(yetanothercolumn) and t2table(id, somecolumn, LastDate).
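
As a rough sketch, those indexes might be created as follows. The index names are made up for illustration, and this assumes somecolumn is short enough to be indexed whole (a TEXT column or a very long VARCHAR would need a prefix length). A like pattern with a leading wildcard cannot drive an index range scan on somecolumn. The composite index helps anyway, because the lookup by Id can seek and the other two columns can be read from the index itself.

```sql
-- Hypothetical index names; table and column names are taken from the question.
create index idx_t1_yetanothercolumn
    on t1 (yetanothercolumn);

-- Id comes first so the correlated lookup (t2.Id = t1.Id) can seek on it;
-- somecolumn and LastDate follow so the subquery can be answered from the index alone.
create index idx_t2table_id_somecolumn_lastdate
    on t2table (Id, somecolumn, LastDate);
```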

3 Comments

Strawberry Over a year ago

That said, it will probably be slower than an uncorrelated one ;-)

DRapp Over a year ago

@Strawberry, not necessarily slower. If the "t1" table has 200 records with "yetanothervalue", but t2 has 500k records matching the LIKE condition, you are only going to correlate on the 200 records, not the 500k.

Strawberry Over a year ago

I'd like to see the stats for that.
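
One way to gather such stats, sketched here on the assumption that the table and column names match the question, is to run EXPLAIN on each form and compare the estimated rows examined for t2table:

```sql
-- Derived-table form: check how many t2table rows the plan expects to read.
explain
select t1.*, t2.lastdate
from t1 left join
     (select Id, max(LastDate) as lastdate
      from t2table
      where somecolumn like '%somevalue%'
      group by Id
     ) t2
     on t1.Id = t2.Id
where yetanothercolumn = 'yetanothervalue';

-- Correlated form: the inner select is expected to show up as a DEPENDENT SUBQUERY.
explain
select t1.*,
       (select max(LastDate)
        from t2table t2
        where t2.Id = t1.Id and t2.somecolumn like '%somevalue%'
       ) as lastdate
from t1
where yetanothercolumn = 'yetanothervalue';
```

EXPLAIN reports only the optimizer's estimates, so timing both queries against representative data is the more reliable check.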
