2

I have a data set that looks like this example data set.

There are multiple users under one domain. I want only one row per email_domain and the row should correspond to the max(last_login) value. In short I only want the user from the email_domain who was the last to log in among all users from the same domain.

Ive tried a query which looks like this

select *
FROM
(
select LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) + 1)) AS email_domain, last_login, last_name, first_name, email, phone  
from user_with_address ua 
order by email_domain
) as A
group by email_domain, last_login, last_name, first_name, email, phone  
having last_login = max(last_login)
order by email_domain   

I still get a list with multiple values for each email domain, what am I doing wrong? Please help.

Disclaimer: I have basic->intermediate knowledge of SQL.

1
  • You're grouping by last_login, probably want to select max(last_login) and not group by it? Also grouping by name, email, phone etc will separate records from the domain. Commented Mar 1, 2017 at 6:33

3 Answers 3

2

Use distinct on ()

select distinct on (email_domain) *
FROM (
  select lower(split_part(email, '@', 2)) AS email_domain, 
         last_login, 
         last_name, 
         first_name, 
         email, 
         phone  
  from user_with_address
) as A
order by email_domain, last_login desc;

I also incorporated Patrick's suggested to simplify the expression to extract the domain from the email.

Sign up to request clarification or add additional context in comments.

1 Comment

Thank you! this worked perfectly, ordering by last_login desc and picking the first row using distinct.
0

One option is to use ROW_NUMBER() and retain the most recent login record for each group of same email domain records.

SELECT t.email_domain, t.last_login, t.last_name, t.first_name, t.email, t.phone
FROM
(
    SELECT a.*,
           ROW_NUMBER() OVER (PARTITION BY a.email_domain ORDER BY a.last_login DESC) rn
    FROM
    (
        SELECT LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) + 1)) AS email_domain,
               last_login, last_name, first_name, email, phone
        FROM user_with_address ua 
    ) a
) t
WHERE t.rn = 1
ORDER BY t.email_domain

Note that I actually subquery twice here, to avoid having to repeat your code which computes the email domain. If not for that, we could have accomplished this with just a single subquery. We can use a single subquery here, but the query would be a bit harder to read.

2 Comments

Use split_part(ua.email, '@', 2) instead of substring(ua.email FROM position(...) + 1).
Thank you. this makes more sense and is simpler.
0

I like Tim Biegeleisen's answer, but this is a little more simple, SQL wise. Don't know about performance differences though.

select 
  LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) + 1)) AS email_domain, 
  last_login, 
  last_name, 
  first_name, 
  email, 
  phone  
from user_with_address ua 
where last_login = (select max(last_login) 
                    from user_with_address ua2
                    where LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) =
                          LOWER(SUBSTRING(ua2.email FROM POSITION ('@' IN ua2.email))
order by email_domain;

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.