1

I have a problem I'm working on with Oracle SQL that goes something like this.

TABLE

PurchaseID  CustID  Location
1           1       A
2           1       A
3           2       A
4           2       B
5           2       A
6           3       B
7           3       B

I'm interested in querying the Table to return all instances where the same customer makes a purchase in different locations. So, for the table above, I would want:

OUTPUT

PurchaseID  CustID  Location
3           2       A
4           2       B
5           2       A

Any ideas on how to accomplish this? I haven't been able to think of how to do it, and most of my ideas seem like they would be pretty clunky. The database I'm using has 1MM+ records, so I don't want it to run too slowly.

Any help would be appreciated. Thanks!

3
  • How many different locations, how many different customers? Commented Aug 23, 2013 at 17:39
  • The question is a simplified version of what I'm really doing at work, but in the real database there are 5 different values for the variable I'm calling Location here (also some nulls), and there are about 500,000 different "customers." Commented Aug 23, 2013 at 18:01
  • Then it might be best in performance terms to construct all five sets for different locations and intersect them. Commented Aug 23, 2013 at 20:07

6 Answers

8
SELECT *
FROM YourTable T
WHERE CustId IN (SELECT CustId
                 FROM YourTable
                 GROUP BY CustId
                 HAVING MIN(Location) <> MAX(Location))

8 Comments

That was fast! Thanks! What is that Min(Location) <> MAX(Location) doing that is making it work?
@user1895076 It is for making sure that it has at least 2 different locations. You could also use HAVING COUNT(DISTINCT Location)>1
Ah, gotcha. Min would be the minimum number of locations by CustID? Also, I was going to tackle this next, maybe you can help. I have a fourth column with a purchase date. The next step was I wanted to reduce the OUTPUT table above down to just those instances where there were purchases made in different locations within 2 years of each other. It should return all instances where one customer made at least two purchases in different locations within 2 years of each other.
@user1895076 No, in this case MIN(Location) is the minimum value of Location (in your example, 'A'). As for your next question, it really is a different question from the one you have now.
Deleted the edit. Will post as a new question if I can't find the answer.
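To see what the HAVING condition is comparing, it can help to run the grouped subquery on its own with the aggregates exposed. This is only a diagnostic sketch; the column aliases are made up for illustration, and YourTable is the placeholder name used in the answer.

```sql
-- Diagnostic sketch: show the values that the HAVING clause compares.
-- Aliases (min_location, max_location, distinct_locations) are hypothetical.
SELECT CustId,
       MIN(Location)            AS min_location,
       MAX(Location)            AS max_location,
       COUNT(DISTINCT Location) AS distinct_locations
FROM YourTable
GROUP BY CustId
ORDER BY CustId;

-- For the sample rows in the question:
--   CustId 1: min A, max A, 1 distinct location
--   CustId 2: min A, max B, 2 distinct locations
--   CustId 3: min B, max B, 1 distinct location
```

Only CustId 2 has a minimum that differs from its maximum, which is the same customer that has more than one distinct location, so both forms of the HAVING condition pick the same rows here.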
7

You should be able to use something similar to the following:

select purchaseid, custid, location
from yourtable
where custid in (select custid
                  from yourtable
                  group by custid
                  having count(distinct location) >1);

See SQL Fiddle with Demo.

The subquery in the WHERE clause returns every custid that has more than one distinct location.
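One thing to watch for, since the asker mentions NULLs in the real data: COUNT(DISTINCT location) ignores NULLs, so a customer with purchases in 'A' and in NULL counts as having one location. Whether a NULL should count as a separate location is a business decision, so the following is only a sketch under the assumption that it should.

```sql
-- Sketch, assuming a NULL location should count as one extra "location".
-- COUNT(DISTINCT ...) skips NULLs, so add 1 when the customer has any NULL row.
select purchaseid, custid, location
from yourtable
where custid in (select custid
                 from yourtable
                 group by custid
                 having count(distinct location)
                        + max(case when location is null then 1 else 0 end) > 1);
```

If NULLs should simply be ignored, the original query already behaves that way and needs no change.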

6

In English:

Select a row if another row exists with the same customer and a different location.

In SQL:

SELECT *
FROM atable t
WHERE EXISTS (
  SELECT *
  FROM atable
  WHERE CustID = t.CustID
    AND Location <> t.Location
);
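With 1MM+ rows, the correlated lookup will probably benefit from an index that covers both columns it tests. The following is a sketch rather than a tuning recommendation: the index name is made up, and whether the optimizer uses the index depends on your data and statistics, so it is worth checking the plan.

```sql
-- Hypothetical index name; covers both columns the EXISTS subquery filters on.
CREATE INDEX atable_cust_loc_ix ON atable (CustID, Location);

-- Inspect the plan Oracle chooses for the query.
EXPLAIN PLAN FOR
SELECT *
FROM atable t
WHERE EXISTS (
  SELECT *
  FROM atable
  WHERE CustID = t.CustID
    AND Location <> t.Location
);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Note that Location <> t.Location is never true when either side is NULL, so rows with a NULL location are not returned by this query and do not count as a "different" location for other rows.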

0

Here's one approach using a sub-query:

SELECT T1.PurchaseID
        ,T1.CustID
        ,T1.Location
FROM    YourTable T1
INNER JOIN
        (SELECT T2.CustID
                ,COUNT (DISTINCT T2.Location ) AS LocationCount
        FROM    YourTable T2
        GROUP BY
                T2.CustID
        HAVING  COUNT (DISTINCT T2.Location )>1
        ) SQ
ON      SQ.CustID = T1.CustID

0

This is meant to need only one full table scan, although the CTE is referenced twice, so check the execution plan to confirm.

create table test (PurchaseID number, CustID number, Location varchar2(1));
insert into test values (1,1,'A');
insert into test values (2,1,'A');
insert into test values (3,2,'A');
insert into test values (4,2,'B');
insert into test values (5,2,'A');
insert into test values (6,3,'B');
insert into test values (7,3,'B');

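with repeatCustDiffLocations as (
    select PurchaseID, custid, location,
           dense_rank() over (partition by custid order by location) r
    from test)
-- a rank above 1 means the customer has at least two distinct locations;
-- DISTINCT avoids duplicates when several rows of a customer have r > 1
select distinct b.PurchaseID, b.custid, b.location
from repeatCustDiffLocations a, repeatCustDiffLocations b
where a.r > 1
and a.custid = b.custid;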
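If the goal is to read the table only once, an analytic variant of the MIN/MAX comparison from an earlier answer is another option, because it avoids the self-join. This is a sketch against the test table created above; the aliases min_loc and max_loc are made up, and the actual number of scans should still be confirmed in the execution plan.

```sql
-- Sketch: compare each customer's lowest and highest location in one pass.
-- min_loc / max_loc are hypothetical aliases; NULL locations are ignored by MIN/MAX.
SELECT PurchaseID, CustID, Location
FROM (
    SELECT PurchaseID, CustID, Location,
           MIN(Location) OVER (PARTITION BY CustID) AS min_loc,
           MAX(Location) OVER (PARTITION BY CustID) AS max_loc
    FROM test
)
WHERE min_loc <> max_loc;
```

With the rows shown in the question, this returns purchases 3, 4 and 5, each exactly once, so no DISTINCT is needed.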

0

This approach makes the most sense to me, because I was trying to return rows that share the same values across the table, specifically in two columns, as shown in this Stack Overflow answer.

The answer to your problem in this format is:

SELECT DISTINCT a.*
FROM TEST a
INNER JOIN TEST b
ON a.CUSTOMERID = b.CUSTOMERID AND
a.LOCATION <> b.LOCATION;

However, the query for a problem like mine, where two columns must have matching values in multiple rows, is below. In this instance it returns no rows, because every PurchaseID is unique:

SELECT DISTINCT a.*
FROM TEST a
INNER JOIN TEST b
ON a.CUSTOMERID = b.CUSTOMERID AND
a.PURCHASEID = b.PURCHASEID AND
a.LOCATION <> b.LOCATION;

Although this would not return the correct results for what needs to be queried here, it shows that the query logic works:

SELECT DISTINCT a.*
FROM TEST a
INNER JOIN TEST b
ON a.CUSTOMERID = b.CUSTOMERID AND
a.PURCHASEID <> b.PURCHASEID AND
a.LOCATION = b.LOCATION;

If anyone wants to try in Oracle here is the table and values to insert:

CREATE TABLE TEST (
PurchaseID integer,
CustomerID integer,
Location varchar(1));

INSERT ALL
  INTO TEST VALUES (1, 1, 'A')
  INTO TEST VALUES (2, 1, 'A')
  INTO TEST VALUES (3, 2, 'A')
  INTO TEST VALUES (4, 2, 'B')
  INTO TEST VALUES (5, 2, 'A')
  INTO TEST VALUES (6, 3, 'B')
  INTO TEST VALUES (7, 3, 'B')
SELECT * FROM DUAL;
