
I have searched a lot but couldn't find anything related to my problem. I have a dataset like this:

Column1    Column2
   A          B    
   A          B
   A          C
   X          B
   X          B
   Y          C
   Y          B
   T          A
   T          A
   T          A

I can get the distinct values of Column1 with their total number of occurrences, but what I actually want is to remove the rows whose Column1 value always appears with the same Column2 value. When I run the query, the result should look like this:

Column1    Column2
   A          B    
   A          B
   A          C
   Y          C
   Y          B

As shown above, A and Y each appear with different values in Column2. How do I query this? I am using SQL Server 2014.

  • So you want to remove the rows where the value in column 1 only ever has one value in column 2? If they have multiple column 2 possibilities you want to leave them? Commented Nov 14, 2016 at 19:40
  • Yes. That is what exactly I want to query. Commented Nov 14, 2016 at 19:40

2 Answers


You want to count the occurrences of each pairing before you do anything else:

SELECT ColumnA, ColumnB, count(*)
FROM [source]
GROUP BY ColumnA, ColumnB

That will give you a list of each pairing and how often it occurs. Next you want to count how many pairings each value in ColumnA has, and cut out the ones with only one option:

SELECT ColumnA, count(*) AS pairing_count
FROM
    (
    SELECT ColumnA, ColumnB, count(*) AS occurrences
    FROM [source]
    GROUP BY ColumnA, ColumnB
    ) AS pairings  -- SQL Server requires an alias on a derived table
GROUP BY ColumnA
HAVING count(*) > 1

That will give you the list of ColumnA values you're looking for. From there, filter your original data down to those values of ColumnA with a WHERE ... IN clause:

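SELECT ColumnA, ColumnB
FROM [source]
WHERE ColumnA IN
    (
    SELECT ColumnA  -- an IN subquery must return exactly one column
    FROM
        (
        SELECT ColumnA, ColumnB, count(*) AS occurrences
        FROM [source]
        GROUP BY ColumnA, ColumnB
        ) AS pairings  -- SQL Server requires an alias on a derived table
    GROUP BY ColumnA
    HAVING count(*) > 1
    )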
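
One way to sanity-check this approach is to run it against the question's sample data. The sketch below does that in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here and is not SQL Server, so treat this as a check of the logic rather than of T-SQL syntax. The function name rows_with_varying_pairs and the ORDER BY (added so the output order is deterministic) are my own additions. The inner subquery selects only ColumnA and the derived table carries an alias, since SQL Server needs both.

```python
import sqlite3

# Sample rows from the question; table and column names follow the answer.
ROWS = [("A", "B"), ("A", "B"), ("A", "C"), ("X", "B"), ("X", "B"),
        ("Y", "C"), ("Y", "B"), ("T", "A"), ("T", "A"), ("T", "A")]

QUERY = """
SELECT ColumnA, ColumnB
FROM [source]
WHERE ColumnA IN
    (
    SELECT ColumnA              -- IN needs a single-column subquery
    FROM
        (
        SELECT ColumnA, ColumnB, count(*) AS occurrences
        FROM [source]
        GROUP BY ColumnA, ColumnB
        ) AS pairings           -- derived tables need an alias in SQL Server
    GROUP BY ColumnA
    HAVING count(*) > 1
    )
ORDER BY ColumnA, ColumnB       -- added only to make the output order stable
"""


def rows_with_varying_pairs(rows):
    """Load rows into a throwaway table and return the filtered result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE [source] (ColumnA TEXT, ColumnB TEXT)")
    conn.executemany("INSERT INTO [source] VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = rows_with_varying_pairs(ROWS)
for row in result:
    print(row)
```

Only the A and Y rows should come back, because X and T each pair with a single ColumnB value.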

1 Comment

Thank you, Fritz. With some adjustments for SQL Server I got it to work.

COUNT(DISTINCT...) might work with a CTE:

; WITH CTE AS (
    SELECT Column1
    FROM [my_table]
    GROUP BY Column1
    HAVING COUNT(DISTINCT Column2) > 1
)
SELECT t.*
FROM [my_table] t
JOIN CTE ON CTE.Column1 = t.Column1;
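
To see why the DISTINCT matters, here is a small sketch that compares the two HAVING conditions on the question's sample data. It again uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server, and the helper name kept_keys is hypothetical. It returns only the Column1 values that pass the filter, not the joined rows.

```python
import sqlite3

# Sample rows from the question.
ROWS = [("A", "B"), ("A", "B"), ("A", "C"), ("X", "B"), ("X", "B"),
        ("Y", "C"), ("Y", "B"), ("T", "A"), ("T", "A"), ("T", "A")]


def kept_keys(having_clause):
    """Return the Column1 values that survive the given HAVING condition."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (Column1 TEXT, Column2 TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", ROWS)
    sql = ("SELECT Column1 FROM my_table GROUP BY Column1 "
           "HAVING " + having_clause + " ORDER BY Column1")
    keys = [row[0] for row in conn.execute(sql)]
    conn.close()
    return keys


# Counts distinct Column2 values per Column1 value, as in the CTE above.
distinct_keys = kept_keys("COUNT(DISTINCT Column2) > 1")
# Counts rows per Column1 value, which is not what the question asks for.
plain_keys = kept_keys("COUNT(*) > 1")

print(distinct_keys)
print(plain_keys)
```

A plain COUNT(*) counts rows, so it would also keep X and T, which repeat the same Column2 value. COUNT(DISTINCT Column2) keeps only A and Y.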
