I'm trying to do a JOIN like the one below. I only want records that have at least X email addresses in the property_res table. When I change the acount threshold from 10 to 20, for example, the number of returned rows stays at 949. It should drop dramatically, since there should be far fewer matches where r.EmailAddress is found 20 times. Is there a limitation to using COUNT on a varchar column? What is the best way to achieve this?

SELECT 
    r.FirstName AS ag_fname,
    r.LastName AS ag_lname,     
    r.EmailAddress AS ag_email, 
    COUNT(r.EmailAddress) AS `acount`
    FROM property_res e
    LEFT JOIN ActiveAgent_Matrix r 
    ON e.ListAgentMLSID=r.MemberNumber
    WHERE e.ListPrice >= 50000
    GROUP BY r.EmailAddress
    HAVING acount >=20

A sample of the output shows an odd value for acount. I expected it to be the count for each email address, but every row has the same value:

ag_fname | ag_lname | ag_email          | acount
Jane     | Doe1     | [email protected] | 3390
Jane     | Doe3     | [email protected] | 3390
Jane     | Doe4     | [email protected] | 3390
Jane     | Doe5     | [email protected] | 3390
  • I guess your group by should be GROUP BY e.ListAgentMLSID Commented May 7, 2014 at 14:30
  • try changing your group by to group by r.FirstName, r.LastName, r.EmailAddress Commented May 7, 2014 at 14:30
  • Try COUNT(1) instead. Commented May 7, 2014 at 14:52

1 Answer

What's happening is that your join condition is not specific enough (or multiple emails really can be associated with the same id, or vice versa, in which case your GROUP BY is not specific enough). I suspect it is the former and that your result set is exploding: not quite a Cartesian join, but similar.
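To make the "exploding result set" concrete, here is a minimal sketch. It uses Python's sqlite3 module as a stand-in for MySQL, and the rows are invented for illustration; only the table and column names come from the question. One agent id appears twice in ActiveAgent_Matrix, so each of that agent's three listings is joined twice.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE property_res (ListAgentMLSID TEXT, ListPrice INTEGER);
    CREATE TABLE ActiveAgent_Matrix (MemberNumber TEXT, EmailAddress TEXT);

    -- Hypothetical data: three listings, all for agent A1.
    INSERT INTO property_res VALUES
        ('A1', 100000), ('A1', 200000), ('A1', 300000);

    -- Hypothetical duplicate: A1 appears twice in the agent table.
    INSERT INTO ActiveAgent_Matrix VALUES
        ('A1', 'a1@example.com'), ('A1', 'a1@example.com');
""")

# Same join and grouping as in the question, without the HAVING filter.
fanout_rows = conn.execute("""
    SELECT r.EmailAddress, COUNT(*) AS acount
    FROM property_res e
    LEFT JOIN ActiveAgent_Matrix r
        ON e.ListAgentMLSID = r.MemberNumber
    GROUP BY r.EmailAddress
""").fetchall()

print(fanout_rows)  # [('a1@example.com', 6)]
```

The agent has 3 listings, but the count is 6 because every listing matches both agent rows. On real data the same multiplication can push counts well past any threshold you put in HAVING.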

Try to troubleshoot with the following two queries:

SELECT 
    r.EmailAddress,
    COUNT(*)
    FROM property_res e
    LEFT JOIN ActiveAgent_Matrix r 
    ON e.ListAgentMLSID=r.MemberNumber
    GROUP BY r.EmailAddress
    HAVING COUNT(*) > 1;

SELECT 
    e.ListAgentMLSID,
    COUNT(*)
    FROM property_res e
    LEFT JOIN ActiveAgent_Matrix r 
    ON e.ListAgentMLSID=r.MemberNumber
    GROUP BY e.ListAgentMLSID
    HAVING COUNT(*) > 1;

One (or both) of these result sets will be non-empty. That's important because it means that this join condition, ON e.ListAgentMLSID=r.MemberNumber, is not specific enough. Either there are multiple emails per ListAgentMLSID, or there are multiple ListAgentMLSIDs per email address, or both.

To troubleshoot this, I'd start by trying to identify where the "multiple X's per Y" cases are. The queries above should help you do that. The first one will identify emails associated with multiple IDs. The second will help you identify IDs associated with multiple emails. The question you need to ask yourself is: should multiple emails be associated with any given id? Or should multiple ids be associated with any given email? If that's permissible, change your GROUP BY. If it's not, change your join condition.
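If the two queries are hard to read on real data, counting distinct values on the agent table alone can be a more direct check. This is only a sketch: it runs against SQLite through Python's sqlite3 module rather than MySQL, and the rows are hypothetical. COUNT(DISTINCT ...) itself is standard SQL and is also available in MySQL.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ActiveAgent_Matrix (MemberNumber TEXT, EmailAddress TEXT);

    -- Hypothetical rows: one email shared by two ids, one id with two emails.
    INSERT INTO ActiveAgent_Matrix VALUES
        ('A1', 'shared@example.com'),
        ('A2', 'shared@example.com'),
        ('A3', 'solo@example.com'),
        ('A4', 'x@example.com'),
        ('A4', 'y@example.com');
""")

# Emails that map to more than one MemberNumber.
emails_with_many_ids = conn.execute("""
    SELECT EmailAddress, COUNT(DISTINCT MemberNumber) AS id_count
    FROM ActiveAgent_Matrix
    GROUP BY EmailAddress
    HAVING COUNT(DISTINCT MemberNumber) > 1
    ORDER BY EmailAddress
""").fetchall()

# MemberNumbers that map to more than one email.
ids_with_many_emails = conn.execute("""
    SELECT MemberNumber, COUNT(DISTINCT EmailAddress) AS email_count
    FROM ActiveAgent_Matrix
    GROUP BY MemberNumber
    HAVING COUNT(DISTINCT EmailAddress) > 1
    ORDER BY MemberNumber
""").fetchall()

print(emails_with_many_ids)  # [('shared@example.com', 2)]
print(ids_with_many_emails)  # [('A4', 2)]
```

Whichever list comes back non-empty tells you which direction of the id/email relationship is one-to-many in your data.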

It may be as simple as joining on both id and email. If it is not, then you either need to group by the id as well as the email (along the lines suggested in the comments above; this is fine if multiple emails really should be allowed per id, or vice versa), or you need to add an additional join condition that is specific enough to keep rows that shouldn't be joined from being joined.
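As a rough illustration of the "change your GROUP BY" option, the sketch below compares grouping by email alone with grouping by id and email. It again uses Python's sqlite3 module in place of MySQL, the rows are invented, and min_listings is a hypothetical threshold name. Whether grouping by id is the right fix depends on what your data is supposed to mean.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE property_res (ListAgentMLSID TEXT, ListPrice INTEGER);
    CREATE TABLE ActiveAgent_Matrix (MemberNumber TEXT, EmailAddress TEXT);

    -- Hypothetical data: A1 has three listings, A2 has one.
    INSERT INTO property_res VALUES
        ('A1', 100000), ('A1', 150000), ('A1', 90000), ('A2', 75000);

    -- Hypothetical data: both ids share one email address.
    INSERT INTO ActiveAgent_Matrix VALUES
        ('A1', 'shared@example.com'), ('A2', 'shared@example.com');
""")

min_listings = 3  # hypothetical threshold, plays the role of 10 or 20

# Grouping by email only: listings of A1 and A2 are pooled together.
by_email = conn.execute("""
    SELECT r.EmailAddress, COUNT(*) AS acount
    FROM property_res e
    JOIN ActiveAgent_Matrix r
        ON e.ListAgentMLSID = r.MemberNumber
    WHERE e.ListPrice >= 50000
    GROUP BY r.EmailAddress
    HAVING COUNT(*) >= ?
""", (min_listings,)).fetchall()

# Grouping by id and email: each agent id is counted on its own.
by_id_and_email = conn.execute("""
    SELECT r.MemberNumber, r.EmailAddress, COUNT(*) AS acount
    FROM property_res e
    JOIN ActiveAgent_Matrix r
        ON e.ListAgentMLSID = r.MemberNumber
    WHERE e.ListPrice >= 50000
    GROUP BY r.MemberNumber, r.EmailAddress
    HAVING COUNT(*) >= ?
    ORDER BY r.MemberNumber
""", (min_listings,)).fetchall()

print(by_email)         # [('shared@example.com', 4)]
print(by_id_and_email)  # [('A1', 'shared@example.com', 3)]
```

The sketch uses an inner JOIN rather than the LEFT JOIN from the question, so listings without a matching agent are dropped instead of being collected into a NULL email group.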

Hope this helps.

1 Comment

Best answer I've ever gotten. There were multiple ListAgentMLSIDs per email. Thanks!
