
My query returns a dataset that looks like this:

+-----------+--------+-----------+-------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+---------+------------+--------+---------+------------+---------+---------+------------+--------+
| CLIENT_ID | count1 | TestFreq1 | stdv1 | count2 | TestFreq2 |  stdv2  | count3 | TestFreq3 |  stdv3  | count4 | TestFreq4 |  stdv4  | count5 | TestFreq5 |  stdv5  | count6 | TestFreq6 |  stdv6  | count7 | TestFreq7 |  stdv7  | count8 | TestFreq8 |  stdv8  | count9 | TestFreq9 |  stdv9  | count10 | TestFreq10 | stdv10 | count11 | TestFreq11 | stdv11  | count12 | TestFreq12 | stdv12 |
+-----------+--------+-----------+-------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+---------+------------+--------+---------+------------+---------+---------+------------+--------+
|    210893 |    136 |         0 |     0 |     81 |        41 | 79.2685 |     19 |        63 | 58.321  |     24 |        21 | 20.4896 |      5 |        25 | 8.228   |      6 |        24 | 24.0638 |      4 |        25 | 24.6103 | 2      | 25        | 2.12132 |      2 |        23 | 21.9203 | 1       | 33         | NULL   |       2 |         29 | 7.77817 | 1       | 38         | NULL   |
|    123321 |     50 |         0 |     0 |      5 |        26 | 7.87401 |     14 |        45 | 51.8002 |      3 |        25 | 14.7422 |      2 |        22 | 17.6777 |      4 |        36 | 21.4942 |      3 |        36 | 22.2711 | NULL   | NULL      | NULL    |      4 |        35 | 9.30949 | NULL    | NULL       | NULL   |       1 |         31 | NULL    | NULL    | NULL       | NULL   |
|    454322 |    232 |         0 |     0 |    173 |        10 | 33.8487 |     36 |        36 | 36.6602 |     32 |        15 | 17.485  |     10 |        38 | 22.4809 |     13 |        23 | 20.0477 |      7 |        18 | 11.4143 | 3      | 32        | 24.5425 |      6 |        25 | 16.8602 | 3       | 28         | 21.166 |       2 |         25 | 4.94975 | 1       | 34         | NULL   |
+-----------+--------+-----------+-------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+--------+-----------+---------+---------+------------+--------+---------+------------+---------+---------+------------+--------+

Instead of extending the output to count13, stdv13, TestFreq13, then 14, 15, and so on, how can I aggregate all the values for 12 and over into the same set of columns?

Here's my query, and thank you so much for your guidance:

;WITH counted AS (
  SELECT
    client_id,
    COUNT(*) AS TimesTested,
    (datediff(day,MIN(received_date),max(received_date)))
  /COUNT(*) as TestFreq
  FROM f_accession_daily
  GROUP BY
    client_id,
    patient_id
),
counted2 as (
  SELECT
    client_id,
    TimesTested,
    CAST(COUNT(*) AS varchar(30)) AS count,
    CAST(AVG(testfreq) as varchar(30)) as TestFreq,
    CAST(STDEV(TestFreq) as varchar(30)) Stdv
  FROM counted
  GROUP BY
    client_id,
    TimesTested
),
unpivoted AS (
  SELECT
    client_id,
    ColumnName + CAST(TimesTested AS varchar(10)) AS ColumnName,
    ColumnValue
  FROM counted2
  UNPIVOT (
    ColumnValue FOR ColumnName IN (count, TestFreq,stdv)
  ) u
),
pivoted AS (
  SELECT
    client_id clientid,
    count1, TestFreq1,stdv1,
    count2, TestFreq2,stdv2,
    count3, TestFreq3,stdv3,
    count4, TestFreq4,stdv4,
    count5, TestFreq5,stdv5,
    count6, TestFreq6,stdv6,
    count7, TestFreq7,stdv7,
    count8, TestFreq8,stdv8,
    count9, TestFreq9,stdv9,
    count10, TestFreq10,stdv10,
    count11, TestFreq11,stdv11,
    count12, TestFreq12,stdv12
  FROM unpivoted
  PIVOT (
    MAX(ColumnValue) FOR ColumnName IN (
      count1,TestFreq1,stdv1,
      count2,TestFreq2,stdv2,
      count3,TestFreq3,stdv3,
      count4,TestFreq4,stdv4,
      count5,TestFreq5,stdv5,
      count6,TestFreq6,stdv6,
      count7,TestFreq7,stdv7,
      count8,TestFreq8,stdv8,
      count9,TestFreq9,stdv9,
      count10,TestFreq10,stdv10,
      count11,TestFreq11,stdv11,
      count12,TestFreq12,stdv12
    )
  ) p
)
select * from pivoted

Just to clarify: I want to return exactly the same results, except that the last set of columns should aggregate all values that fall into the 12+ bucket. All the fields will stay the same except the last three, which will be:

+----------+-------------+---------+
| count12+ | TestFreq12+ | stdv12+ |
+----------+-------------+---------+
| 353      | 32423       | NULL    |
| NULL     | NULL        | NULL    |
| 342      | 25324       | NULL    |
+----------+-------------+---------+

Please note that the numbers above are much greater than the rest because the 12+ values have been aggregated.

Thank you so much for your guidance!

  • Wouldn't you just need to change your CTE counted2 so the column TimesTested is like this: CASE WHEN TimesTested >= 12 THEN 12 ELSE TimesTested END? Commented Oct 23, 2012 at 15:52
  • Thanks! Would I need to make changes to the pivot? Commented Oct 23, 2012 at 15:54
  • @АртёмЦарионов - Why don't you try it ;-) Commented Oct 23, 2012 at 15:57
  • Actually, part of the reason is that validation takes about 1 hour for this :) I can definitely execute the query, though! Commented Oct 23, 2012 at 16:00
  • I'm not gonna stand by 1 hour ;-). You can tell me afterwards how it went. Commented Oct 23, 2012 at 16:06

1 Answer

It seems that the fastest way to do what you want is to change your counted2 CTE so that the TimesTested column takes your 12+ logic into account. The same CASE expression has to appear in both the SELECT list and the GROUP BY. So it should be:

counted2 as (
  SELECT
    client_id,
    CASE WHEN TimesTested >= 12 THEN 12 ELSE TimesTested END TimesTested,
    CAST(COUNT(*) AS varchar(30)) AS count,
    CAST(AVG(testfreq) as varchar(30)) as TestFreq,
    CAST(STDEV(TestFreq) as varchar(30)) Stdv
  FROM counted
  GROUP BY
    client_id,
    CASE WHEN TimesTested >= 12 THEN 12 ELSE TimesTested END
    )
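
To see the effect of the bucketing in isolation, here is a minimal sketch that replaces the counted CTE with a few made-up rows (the VALUES data is hypothetical and not taken from f_accession_daily). The CASTs to varchar are left out here because nothing is unpivoted afterwards.

```sql
-- Hypothetical sample rows standing in for the `counted` CTE:
-- one row per patient, with that patient's TimesTested and TestFreq.
WITH counted AS (
  SELECT *
  FROM (VALUES
    (210893, 11, 29),
    (210893, 12, 38),
    (210893, 13, 40),
    (210893, 15, 36)
  ) AS v (client_id, TimesTested, TestFreq)
)
SELECT
  client_id,
  -- Everything tested 12 or more times collapses into bucket 12
  CASE WHEN TimesTested >= 12 THEN 12 ELSE TimesTested END AS TimesTested,
  COUNT(*)        AS [count],
  AVG(TestFreq)   AS TestFreq,
  STDEV(TestFreq) AS Stdv
FROM counted
GROUP BY
  client_id,
  CASE WHEN TimesTested >= 12 THEN 12 ELSE TimesTested END
ORDER BY TimesTested;
-- Expected with the sample rows above:
--   210893 | 11 | 1 | 29 | NULL
--   210893 | 12 | 3 | 38 | 2
```

Because the bucket is still labelled 12, the unpivot step in the question keeps producing the names count12, TestFreq12, and stdv12, so the PIVOT column list should not need to change.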

4 Comments

@АртёмЦарионов - Not really. Since those are CTEs and not physical tables, I don't see a benefit in changing the first CTE. The lazy man in me prefers changing the second CTE, because I don't have to use CASE WHEN COUNT(*)... and I can use the column name directly. (Note that this could be different if those were physical or temporary tables.)
@Lamak: Interestingly, my reasoning might have been very similar to yours, except I thought it easier to write out COUNT(*) than TimesTested, as the former is shorter. :) Seriously, though, I can't see a difference in this case, and your choice is by no means worse than mine.
Except... you've changed a GROUP BY expression in the SELECT clause, and it would be logically more correct to change it in the GROUP BY as well. I'm not sure the results will be unaffected if you don't.
@AndriyM - You are right. I actually tested this with that part changed; I don't know why I didn't post it that way in my answer. Updated now.
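
The GROUP BY point raised in these comments can be illustrated with a small sketch, again using made-up rows in place of the counted CTE (the VALUES data is hypothetical). Here the CASE appears only in the SELECT list while the GROUP BY still uses the raw column:

```sql
-- Hypothetical sample rows standing in for the `counted` CTE
WITH counted AS (
  SELECT *
  FROM (VALUES
    (210893, 12, 38),
    (210893, 13, 40),
    (210893, 13, 36)
  ) AS v (client_id, TimesTested, TestFreq)
)
SELECT
  client_id,
  CASE WHEN TimesTested >= 12 THEN 12 ELSE TimesTested END AS TimesTested,
  COUNT(*) AS [count]
FROM counted
GROUP BY
  client_id,
  TimesTested;  -- raw column: 12 and 13 remain separate groups
-- Returns two rows that are both labelled 12 (counts 1 and 2)
-- instead of a single row with count 3.
```

With two rows sharing the label 12, the later PIVOT with MAX(ColumnValue) would presumably keep only one value per column rather than the combined figure, which is why the CASE expression belongs in the GROUP BY as well.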
