2

I have the following data in a table:

    Employee id    Status    email              partition
    A              P         [email protected]    1
    A              P         [email protected]    2

    D              T         [email protected]    1
    D              T         [email protected]    2

    G              P         [email protected]    1
    G              T         [email protected]    2

For each employee, the three columns (Employee id, Status, email) are expected to be identical in partitions 1 and 2. If any of these columns differs between partition 1 and partition 2 for an employee, both of that employee's records should be returned.

For the above data, the query should return the two records for employee G. Could anyone please help with the query?

3
  • Show your desired output. Commented Jun 19, 2015 at 5:29
  • @NareshK You could use analytic LAG(). Commented Jun 19, 2015 at 5:46
  • 3
    You need this query because you have denormalized screwed up data, admit it ;) Commented Jun 19, 2015 at 5:48

6 Answers

2

This query returns one combined row for each employee whose partition 1 row either has no partition 2 counterpart (a single record) or differs from it in Status or email.

select t1.*, t2.* 
from tbl t1
    left join tbl t2 
    on t2.Employee_id = t1.Employee_id 
    -- pair partition 1 with partition 2 only, so the partition 2 row
    -- itself is not reported as "missing a counterpart"
    AND t2.partition = 2
where t1.partition = 1
AND (t2.Employee_id is null
     OR t1.Status != t2.Status 
     OR t1.email != t2.email)
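
To try the self-join idea without an Oracle instance, here is a minimal sketch that runs it in SQLite through Python's built-in sqlite3 module. Several things are assumptions for illustration only: the column is called part_no instead of partition, the e-mail values are placeholders, and employee X (with no partition 2 row) is a hypothetical extra record added to show the "single record" case.

```python
import sqlite3

# Sample rows mirroring the question. E-mail values are placeholders, and
# employee X is a hypothetical extra row that has no partition-2 record.
ROWS = [
    ("A", "P", "a@example.com", 1),
    ("A", "P", "a@example.com", 2),
    ("D", "T", "d@example.com", 1),
    ("D", "T", "d@example.com", 2),
    ("G", "P", "g@example.com", 1),
    ("G", "T", "g@example.com", 2),
    ("X", "P", "x@example.com", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (employee_id TEXT, status TEXT, email TEXT, part_no INTEGER)"
)
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", ROWS)

# Pair each partition-1 row with the partition-2 row of the same employee and
# keep the pairs that differ, or that have no partition-2 row at all.
pairs = conn.execute("""
    SELECT t1.employee_id, t1.status, t2.status, t1.email, t2.email
    FROM tbl t1
    LEFT JOIN tbl t2
           ON t2.employee_id = t1.employee_id
          AND t2.part_no = 2
    WHERE t1.part_no = 1
      AND (t2.employee_id IS NULL
           OR t1.status <> t2.status
           OR t1.email <> t2.email)
    ORDER BY t1.employee_id
""").fetchall()

for row in pairs:
    print(row)
```

Each result row holds both sides of the pair, so employee G appears once with status P next to status T, and X appears with NULLs on the partition 2 side.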

3 Comments

For each employee, there should be two records with partition=1 and partition=2 respectively.
Just wrap it in an outer query: Select * from tbl t3, (SELECT t1.Employee_id as Employee_id1, t2.Employee_id as Employee_id2 <..> ) as t2 where t3.Employee_id = t2.Employee_id1 or t3.Employee_id = t2.Employee_id2
@NareshK You say "there should be two records", but you need this query precisely because your data isn't how it should be. Once you start down the denormalization route, it becomes impossible to predict how the data will become corrupted.
0

This should give you the expected result:

select * from tablename where employee_id in (
     select t1.employee_id 
     from tablename t1 
       left outer join tablename t2 
         on t1.employee_id = t2.employee_id 
        and t1.status = t2.status 
        and t1.email = t2.email 
        and t2.partition = 2 
     -- filter t1 here, not in the ON clause, or every partition 2 row counts as unmatched
     where t1.partition = 1 and t2.employee_id is null )
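
The anti-join approach can be checked the same way. The following is a sketch only, run in SQLite via Python's sqlite3 rather than in Oracle; the column name part_no (standing in for partition) and the e-mail values are assumptions. In this sketch the partition 1 filter sits in the WHERE clause, so partition 2 rows of t1 are never treated as unmatched.

```python
import sqlite3

# Sample rows mirroring the question; e-mail values are placeholders.
ROWS = [
    ("A", "P", "a@example.com", 1),
    ("A", "P", "a@example.com", 2),
    ("D", "T", "d@example.com", 1),
    ("D", "T", "d@example.com", 2),
    ("G", "P", "g@example.com", 1),
    ("G", "T", "g@example.com", 2),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (employee_id TEXT, status TEXT, email TEXT, part_no INTEGER)"
)
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?)", ROWS)

# Inner query: partition-1 rows with no identical partition-2 row (anti-join).
# Outer query: return every row belonging to those employees.
mismatched = conn.execute("""
    SELECT employee_id, status, email, part_no
    FROM tablename
    WHERE employee_id IN (
        SELECT t1.employee_id
        FROM tablename t1
        LEFT JOIN tablename t2
               ON t2.employee_id = t1.employee_id
              AND t2.status = t1.status
              AND t2.email = t1.email
              AND t2.part_no = 2
        WHERE t1.part_no = 1
          AND t2.employee_id IS NULL)
    ORDER BY employee_id, part_no
""").fetchall()

for row in mismatched:
    print(row)
```

With the sample data this yields the two rows for employee G, which is the output the question asks for.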

Comments

0

You could use the analytic LAG() function.

Setup

CREATE TABLE t
  (
    Employee_id VARCHAR2(1),
    Status      VARCHAR2(1),
    email       VARCHAR2(10),
    partition   INT
  );

INSERT ALL 
    INTO t (Employee_id, Status, email, partition)
         VALUES ('A', 'P', '[email protected]', 1)
    INTO t (Employee_id, Status, email, partition)
         VALUES ('A', 'P', '[email protected]', 2)
    INTO t (Employee_id, Status, email, partition)
         VALUES ('D', 'T', '[email protected]', 1)
    INTO t (Employee_id, Status, email, partition)
         VALUES ('D', 'T', '[email protected]', 2)
    INTO t (Employee_id, Status, email, partition)
         VALUES ('G', 'P', '[email protected]', 1)
    INTO t (Employee_id, Status, email, partition)
         VALUES ('G', 'T', '[email protected]', 2)
SELECT * FROM dual;
COMMIT;

Query

WITH t1 AS (
  SELECT t.*, LAG(status) OVER (PARTITION BY employee_id, email ORDER BY status) rn FROM t
),
t2 AS (
  SELECT Employee_id, Status, email, PARTITION FROM t1
  WHERE status <> rn
)
SELECT t.Employee_id,
  t.Status,
  t.email,
  t.partition
FROM t,
  t2
WHERE t.Employee_id = t2.Employee_id
ORDER BY t.partition;

EMPLOYEE_ID STATUS EMAIL       PARTITION
----------- ------ ---------- ----------
G           P      [email protected]          1
G           T      [email protected]          2


4 Comments

This fits the specific sample data but won't find rows which match on STATUS but have a different EMAIL
@APC OP stated "We expect all three columns for one employee should be same for partition 1 and 2", which means EMAIL would be the same.
Well, "all three columns" also means the STATUS should be the same. Why test for one column failing the rule and not the other?
@APC, Ah! I see. Reading OP's question again, I find that any of the three columns other than the partition column could be different, not just the status column.
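
Following up on the point raised in these comments, here is a rough sketch of a LAG() variant that compares both status and email against the previous partition's row. It is not the answer author's query. It runs in SQLite (3.25 or later, for window functions) through Python's sqlite3; the column name part_no, the e-mail values, and employee H (same status, different e-mail) are hypothetical and only there to exercise the e-mail check.

```python
import sqlite3

# Sample rows mirroring the question; e-mail values are placeholders and
# employee H is a hypothetical case: same status, different e-mail.
ROWS = [
    ("A", "P", "a@example.com", 1),
    ("A", "P", "a@example.com", 2),
    ("D", "T", "d@example.com", 1),
    ("D", "T", "d@example.com", 2),
    ("G", "P", "g@example.com", 1),
    ("G", "T", "g@example.com", 2),
    ("H", "P", "h@example.com", 1),
    ("H", "P", "h.new@example.com", 2),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (employee_id TEXT, status TEXT, email TEXT, part_no INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", ROWS)

# For each employee, look back one row (ordered by partition) and compare both
# columns. LAG() is NULL on the first row, so that row never satisfies "<>".
flagged = conn.execute("""
    WITH prev AS (
        SELECT employee_id, status, email,
               LAG(status) OVER (PARTITION BY employee_id ORDER BY part_no) AS prev_status,
               LAG(email)  OVER (PARTITION BY employee_id ORDER BY part_no) AS prev_email
        FROM t
    )
    SELECT t.employee_id, t.status, t.email, t.part_no
    FROM t
    WHERE t.employee_id IN (
        SELECT employee_id
        FROM prev
        WHERE status <> prev_status OR email <> prev_email)
    ORDER BY t.employee_id, t.part_no
""").fetchall()

for row in flagged:
    print(row)
```

Partitioning by employee_id alone and ordering by the partition number means a change in either column shows up as a difference from the previous row.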
0

Try this:

Select 
    t1.* 
from 
    tablename t1, tablename t2
where 
    t1.partition < t2.partition 
and t1.employee_id = t2.employee_id
and (t1.status != t2.status or t1.email != t2.email)

Union all

Select 
    t2.* 
from 
    tablename t1, tablename t2
where 
    t1.partition < t2.partition 
and t1.employee_id = t2.employee_id
and (t1.status != t2.status or t1.email != t2.email)

1 Comment

Sorry, I forgot to add the temp table to the query; I have updated the answer.
0

This is fairly general and reads the table only once:

select employee_id, status, email, partition 
  from ( 
    select test.*, 
        count(1) over (partition by employee_id, status, email) cnt1,
        count(1) over (partition by employee_id) cnt2
      from test )
  where cnt1 <> cnt2

SQLFiddle

This query also handles the case where an employee has three or more rows that do not all match. If an employee with only a single row should also be reported as an anomaly, add or cnt2 = 1 to the last line.
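
To illustrate both remarks, here is a small sketch of the counting query with the or cnt2 = 1 variant applied. It runs in SQLite (3.25 or later) via Python's sqlite3 rather than Oracle. The column name part_no, the e-mail values, employee K (three rows, one differing) and employee S (a single row) are all hypothetical additions for the demonstration.

```python
import sqlite3

# A and G mirror the question. K (three rows, one differing) and S (a single
# row) are hypothetical extras; all e-mail values are placeholders.
ROWS = [
    ("A", "P", "a@example.com", 1),
    ("A", "P", "a@example.com", 2),
    ("G", "P", "g@example.com", 1),
    ("G", "T", "g@example.com", 2),
    ("K", "P", "k@example.com", 1),
    ("K", "P", "k@example.com", 2),
    ("K", "T", "k@example.com", 3),
    ("S", "P", "s@example.com", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (employee_id TEXT, status TEXT, email TEXT, part_no INTEGER)"
)
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", ROWS)

# cnt1 counts rows sharing the full (employee, status, email) combination,
# cnt2 counts all rows of the employee. They are equal only if every row of
# that employee is identical; cnt2 = 1 additionally flags single-row employees.
anomalies = conn.execute("""
    SELECT employee_id, status, email, part_no
    FROM (
        SELECT test.*,
               COUNT(*) OVER (PARTITION BY employee_id, status, email) AS cnt1,
               COUNT(*) OVER (PARTITION BY employee_id) AS cnt2
        FROM test)
    WHERE cnt1 <> cnt2 OR cnt2 = 1
    ORDER BY employee_id, part_no
""").fetchall()

for row in anomalies:
    print(row)
```

Employee A is left out because all of its rows agree, while all three rows of K are reported even though two of them match each other.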

Comments

0

You can change the structure of your table instead of writing complex queries.

The simple solution is to have two tables:

  1. Table Employee (EmployeeID, Status, Email) with Employee Id as Primary Key
  2. Table Partition (EID Foreign key, Partition number).

This gives you a better design and non-redundant tables.
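
As a rough sketch of that two-table layout, the following creates it in SQLite through Python's sqlite3. The table and column names (employee, employee_partition, part_no), the composite primary key, and the e-mail values are assumptions made for the example, not part of the proposal above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Status and e-mail are stored once per employee, so they cannot disagree
# between partitions. Names below are illustrative assumptions.
conn.execute("""
    CREATE TABLE employee (
        employee_id TEXT PRIMARY KEY,
        status      TEXT NOT NULL,
        email       TEXT NOT NULL)
""")
conn.execute("""
    CREATE TABLE employee_partition (
        employee_id TEXT NOT NULL REFERENCES employee (employee_id),
        part_no     INTEGER NOT NULL,
        PRIMARY KEY (employee_id, part_no))
""")

conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("A", "P", "a@example.com"), ("G", "P", "g@example.com")],
)
conn.executemany(
    "INSERT INTO employee_partition VALUES (?, ?)",
    [("A", 1), ("A", 2), ("G", 1), ("G", 2)],
)

# A join rebuilds the original four-column layout when it is needed.
rows = conn.execute("""
    SELECT e.employee_id, e.status, e.email, p.part_no
    FROM employee e
    JOIN employee_partition p ON p.employee_id = e.employee_id
    ORDER BY e.employee_id, p.part_no
""").fetchall()

# A partition row for an unknown employee is rejected by the foreign key.
try:
    conn.execute("INSERT INTO employee_partition VALUES ('Z', 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows)
print(orphan_rejected)
```

With this layout the mismatch from the question cannot be stored in the first place, so no detection query is needed.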

8 Comments

EmailAddr is typically a horrible idea for a PK.
@DrewPierce I admit that, given the many fake accounts, but it would be quite good for an organization's internal database.
I wasn't thinking of fake accounts at all, but rather of normalized data and third normal form. I wouldn't even keep the email address in the employee table; that is foolish. Google third normal form.
Yes @DrewPierce, when 3NF is considered, eliminating either EID or Email will suffice. :-)
The eid ought to be in employee. Then other tables hang under it using eid