I am very new to databricks SQL. And I need some help for a certain task.
I have a table POLICY which has the following columns:
| PolicyId | CustomerId | AgentId |
|---|---|---|
| P123 | C123 | A123 |
| P124 | C124 | A124 |
| P125 | C123 | A125 |
| P126 | C124 | A124 |
I need to determine the policies belonging to the same customer that has a different agent from the other policies from that respective customer.
In this example , policies with policyid P124 and P126 under customer C124 are safe since they have the same agent - agentId A124. But, policies P123 P125 for customer C123 is flagged for having different agents.
Basically, a customer can have more than 1 policy as long as they are under the same agent. Any policies belonging to the same customer but has a different servicing agent is flagged.
How can I achieve this in Databricks SQL? As of now , I have only determined policies belonging to the same customer by using collect_list() function aggregate.
select customerid,collect_list(distinct policyid) from Policy group by customerid
| CustomerId | collect_list(PolicyId) |
|---|---|
| C123 | ["P123","P125"] |
| C124 | ["P124","P126"] |
EDIT: Tried the solution below and it worked. But the requirement had a minor change. Using the dataset from the query below, I need now to determine the policies under the same customer, having different agents , and/while these agents belonging on the same group.
I have a look up table for the agents that has a groupcode
| AgentId | GroupCode |
|---|---|
| A123 | 1 |
| A124 | 2 |
| A125 | 2 |
| A126 | 2 |