I am trying to aggregate data from one table into another. I've inherited this project; I did not design this database nor will I be able to change its format.

The [RawData] table will have 1 record per account, per ChannelCodeID. This table (where I currently have data) has the following fields:

[Account] int
[ChannelCodeID] int
[ChannelCode] varchar(10)

The [AggregatedData] table will have 1 record per account. This table (into which I need to insert data) has the following fields:

[Account] int
[Count] int
[Channel1] int
[Channel2] int
[Channel3] int
[Names] varchar(250)

For example, I might have the following records in my [RawData] table:

Account          ChannelCodeID     ChannelCode
12345            2                 ABC
12345            4                 DEF
12345            6                 GHI
54321            2                 ABC
54321            6                 GHI
99999            2                 ABC

And, after aggregating them, I would need to produce the following records in my [AggregatedData] table:

Account     Count    Channel1  Channel2   Channel3    Names
12345       3        2         4          6           ABC.DEF.GHI
54321       2        2         6          0           ABC.GHI    
99999       1        2         0          0           ABC

As you can see, the count is how many records exist in my [RawData] table, Channel1 is the first ChannelCodeID, Channel2 is the second, and Channel3 is the third. If there are not enough ChannelCodeIDs from my [RawData] table, extra Channel columns get a '0' value. Furthermore, I need to concatenate the 'ChannelCode' column and store it in the 'Names' column of the [AggregatedData] table, but (obviously) if there is only one record, I don't want to add the '.'

I can't figure out how to do this without using a cursor and a bunch of variables, but I'm guessing there HAS to be a better way. This doesn't have to be super-fast since it will only run once a month, but it will have to process at least 10,000-15,000 records each time.

Thanks in advance...

EDIT:

ChannelCodes and ChannelCodeIDs map directly to each other and are always the same. For example, ChannelCodeID 2 is ALWAYS 'ABC'

Also, in the [AggregatedData] table, Channel1 is ALWAYS the lowest value, although this is incidental.

  • Are channel codes consistent in the real world (for example, is channel 2 always the same, ABC in your example)? Also, is Channel1 in the new table always the lowest value? Commented Oct 28, 2014 at 18:53
  • Yes, that is correct. ChannelCodes and IDs map directly to each other. It is also correct that Channel1 in [AggregatedData] is always the lowest value. Commented Oct 28, 2014 at 18:54
  • Do you clear out the aggregate table and rebuild it each time? Commented Oct 28, 2014 at 19:00
  • Nope, I don't clear it out, but this will always create a new record. There are other columns, month and year (which I intentionally omitted as to not unnecessarily obfuscate my question) - and these extra columns ensure that I'm always writing a new record to the [AggregatedData] table. Commented Oct 28, 2014 at 19:03

3 Answers

3

Test Data

DECLARE @TABLE TABLE (Account INT, ChannelCodeID INT, ChannelCode VARCHAR(10))
INSERT INTO @TABLE VALUES 
(12345 ,2 ,'ABC'),
(12345 ,4 ,'DEF'),
(12345 ,6 ,'GHI'),
(54321 ,2 ,'ABC'),
(54321 ,6 ,'GHI'),
(99999 ,2 ,'ABC')

Query

SELECT Account
      ,[Count]
      ,ISNULL([Channel1], 0) AS [Channel1]
      ,ISNULL([Channel2], 0) AS [Channel2]
      ,ISNULL([Channel3], 0) AS [Channel3]
      ,Names
FROM 
  (
    SELECT t.Account, t.ChannelCodeID, c.[Count]
          -- Number each account's channels from the lowest ID up: Channel1, Channel2, ...
          ,'Channel' + CAST(ROW_NUMBER() OVER 
                      (PARTITION BY t.Account ORDER BY t.ChannelCodeID ASC) AS VARCHAR(10)) AS Channels
          -- Concatenate the codes with '.'; STUFF strips the leading separator
          ,STUFF((SELECT '.' + ChannelCode
                  FROM @TABLE 
                  WHERE Account = t.Account
                  ORDER BY ChannelCodeID
                  FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)'),1,1,'') AS Names
    FROM @TABLE t INNER JOIN (SELECT Account , COUNT(*) AS [Count]
                              FROM @TABLE 
                              GROUP BY Account) c
    ON t.Account = c.Account
  ) a
PIVOT (MAX(ChannelCodeID)
       FOR Channels
       IN ([Channel1],[Channel2],[Channel3])
      )  p

Result

╔═════════╦═══════╦══════════╦══════════╦══════════╦═════════════╗
║ Account ║ Count ║ Channel1 ║ Channel2 ║ Channel3 ║    Names    ║
╠═════════╬═══════╬══════════╬══════════╬══════════╬═════════════╣
║   12345 ║     3 ║        2 ║        4 ║        6 ║ ABC.DEF.GHI ║
║   54321 ║     2 ║        2 ║        6 ║        0 ║ ABC.GHI     ║
║   99999 ║     1 ║        2 ║        0 ║        0 ║ ABC         ║
╚═════════╩═══════╩══════════╩══════════╩══════════╩═════════════╝
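
If it helps, the same result can probably be written straight into the target table with conditional aggregation instead of PIVOT. This is only a sketch: #RawData and #AggregatedData are stand-in temp tables I made up for illustration (the real tables have extra columns such as month and year), and the Ranked CTE name is my own.

```sql
-- Stand-in tables for illustration only (hypothetical, not the real schema).
CREATE TABLE #RawData (Account INT, ChannelCodeID INT, ChannelCode VARCHAR(10));
CREATE TABLE #AggregatedData (Account INT, [Count] INT, Channel1 INT,
                              Channel2 INT, Channel3 INT, Names VARCHAR(250));

INSERT INTO #RawData VALUES
(12345, 2, 'ABC'), (12345, 4, 'DEF'), (12345, 6, 'GHI'),
(54321, 2, 'ABC'), (54321, 6, 'GHI'),
(99999, 2, 'ABC');

-- Rank each account's channels from the lowest ChannelCodeID up.
WITH Ranked AS (
    SELECT Account, ChannelCodeID,
           ROW_NUMBER() OVER (PARTITION BY Account ORDER BY ChannelCodeID) AS rn
    FROM #RawData
)
INSERT INTO #AggregatedData (Account, [Count], Channel1, Channel2, Channel3, Names)
SELECT r.Account,
       COUNT(*),
       -- Pick the ID at each rank; a missing rank yields NULL, which becomes 0.
       ISNULL(MAX(CASE WHEN r.rn = 1 THEN r.ChannelCodeID END), 0),
       ISNULL(MAX(CASE WHEN r.rn = 2 THEN r.ChannelCodeID END), 0),
       ISNULL(MAX(CASE WHEN r.rn = 3 THEN r.ChannelCodeID END), 0),
       -- Dot-separated codes in ID order; STUFF removes the leading '.'.
       STUFF((SELECT '.' + d.ChannelCode
              FROM #RawData d
              WHERE d.Account = r.Account
              ORDER BY d.ChannelCodeID
              FOR XML PATH(''), TYPE).value('.', 'VARCHAR(250)'), 1, 1, '')
FROM Ranked r
GROUP BY r.Account;

SELECT * FROM #AggregatedData ORDER BY Account;
```

Note that an account with more than three channels would still get every code in Names while only the three lowest IDs land in the Channel columns, so adjust that if it matters for your data.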

3 Comments

I see it is comma and space delimiting the ChannelCodeList in your example. Is there a way to have it period-delimit without the spaces (e.g. 'ABC.DEF' instead of 'ABC, DEF')? If not, I can always run an update to replace ', ' with '.' on that column...
@Stan sorted ....... it looked a pretty simple query until I started to write it :)
Excellent - THANK YOU. Once I get this working in my environment, I'll accept the answer.
1

-- Back up raw data into temp table

select * into #rawData FROM RawData

-- First, populate the lowest channel and base records

INSERT INTO AggregatedData (Account,[Count],Channel1,Channel2,Channel3)
   SELECT Account,1,Min(ChannelCodeID),0,0
   FROM #rawData
   GROUP BY Account

-- Gives you something like this

 Account     Count    Channel1  Channel2   Channel3    Names
    12345       1        2         0          0           NULL
    54321       1        2         0          0           NULL
    99999       1        2         0          0           NULL

-- Remove the raw rows that have already been placed in a Channel column

DELETE r
FROM #rawData r
INNER JOIN AggregatedData a
   ON a.Account = r.Account
  AND r.ChannelCodeID IN (a.Channel1, a.Channel2, a.Channel3)

-- Now do an update

UPDATE AggregatedData SET Channel2 = xx.NextLowest, [Count] = [Count] + 1
FROM
    (  SELECT Account,Min(ChannelCodeID) as NextLowest
       FROM #rawData
       GROUP BY Account ) xx
WHERE AggregatedData.Account = xx.Account

-- Repeat above for Channel3

You then need an update statement against the final aggregated table based on the channel IDs. If this is not run often, I would suggest a UDF which takes 3 parameters and returns a string, something like

UPDATE AggregatedData SET [names] = dbo.BuildNameList(channel1,channel2,channel3)

It will run a bit slowly, but still not bad overall
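
For what it's worth, here is a minimal sketch of what such a function could look like. The body is an assumption on my part, not something defined above: it assumes the ID-to-code mapping can be read from [RawData] (the question says each ID always maps to the same code) and that the channel columns hold IDs lowest first, so ordering by ID reproduces the Channel1 to Channel3 order.

```sql
-- Hypothetical implementation of dbo.BuildNameList; only the name and the
-- three-parameter shape come from the UPDATE statement above.
CREATE FUNCTION dbo.BuildNameList (@Channel1 INT, @Channel2 INT, @Channel3 INT)
RETURNS VARCHAR(250)
AS
BEGIN
    DECLARE @Names VARCHAR(250);

    -- Look up the code for each supplied ID and join them with '.'.
    -- Unused slots hold 0, which is assumed not to be a real ChannelCodeID.
    SET @Names = STUFF((
            SELECT '.' + m.ChannelCode
            FROM (SELECT DISTINCT ChannelCodeID, ChannelCode
                  FROM dbo.RawData) AS m
            WHERE m.ChannelCodeID IN (@Channel1, @Channel2, @Channel3)
            ORDER BY m.ChannelCodeID
            FOR XML PATH(''), TYPE).value('.', 'VARCHAR(250)'), 1, 1, '');

    -- NULL is returned when none of the IDs match a row in RawData.
    RETURN @Names;
END
```

Called as dbo.BuildNameList(2, 6, 0) against the sample data in the question, this should give 'ABC.GHI'. The function runs one lookup per row of the UPDATE, which is where the slowness comes from.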

Hope this gives you some ideas

3 Comments

Looks like you've got some better options already. Mine is based on generic SQL, but if your version of SQL is newer, the other solutions are better...
This instance is 2008 R2. Thanks, again! Which answer do you recommend?
M.Ali's answer looks best for SQL 2008 R2. Good luck
0

Query:

WITH CTE AS (
    SELECT Account, ChannelCodeID, ChannelCode,
           RANK() OVER (PARTITION BY Account ORDER BY ChannelCodeID) [ChRank]
    FROM RawData
)
SELECT A.Account, COUNT(A.Account) [Count],
       ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=1),0) [Channel1],
       ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=2),0) [Channel2],
       ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=3),0) [Channel3],
       STUFF((SELECT '.'+ChannelCode FROM CTE WHERE A.Account=CTE.Account ORDER BY ChannelCodeID FOR XML PATH('')),1,1,'') [Names]
FROM RawData A
GROUP BY A.Account

This uses a Common Table Expression to rank each account's ChannelCodeIDs from lowest to highest; the outer query then groups by account and picks out ranks 1 to 3.

7 Comments

This returns an error "Subquery returned more than 1 value. This is not permitted..."
Is it possible for each Account to have duplicates in the ChannelCodeID field?
If you're asking if it's possible (for example) for account 12345 to have multiple records in the [RawData] table for ChannelCode2, the answer is no. As I stated in OP, 'The [RawData] table will have 1 record per account, per ChannelCodeID'. Thanks again for help!
What version of SQL Server are you using?
Thanks for the help - Check out M.Ali's answer.