Does anybody know a statement that would remove duplicates like these from the output?

DEPARTMENT_ID  DEPARTMENT_NAME  FULL_NAME         JOB_TITLE
50             Shipping         Alana Walsh       Sales Representative
50             Shipping         Alana Walsh       Sales Representative
50             Shipping         Winston Taylor    Sales Manager
50             Shipping         Winston Taylor    Sales Manager
60             IT               Alexander Hunold  Sales Representative
60             IT               Alexander Hunold  Sales Representative

Here's what I have so far:

select employees.department_id, departments.department_name, first_name || ' ' || last_name as full_name, job_title
from departments, employees, jobs
where employees.department_id = departments.department_id
and job_title like '%Sales%'
order by job_title, full_name;

3 Answers

2

You can use SELECT DISTINCT to remove duplicates:

select distinct employees.department_id, departments.department_name, first_name || ' ' || last_name as full_name, job_title
from departments, employees, jobs
where employees.department_id = departments.department_id
and job_title like '%Sales%'
order by job_title, full_name;

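Alternatively, you can GROUP BY every selected column (note that a column alias such as full_name cannot be declared in the GROUP BY clause, so the expression is repeated without it):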

select employees.department_id, departments.department_name, first_name || ' ' || last_name as full_name, job_title
from departments, employees, jobs
where employees.department_id = departments.department_id
and job_title like '%Sales%'
group by employees.department_id, departments.department_name, first_name || ' ' || last_name, job_title
order by job_title, full_name;

It also looks like you're missing a join condition on jobs. That may be the reason your query is returning duplicate results.

I'd also recommend using explicit join syntax when you can.

FROM departments
 INNER JOIN employees ON employees.department_id = departments.department_id
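
Putting that together, a minimal sketch of the whole query with explicit joins might look like the following. It assumes that employees and jobs share a job_id column, as they do in Oracle's sample HR schema; that relationship is a guess, since the question does not show the table definitions.

```sql
select e.department_id,
       d.department_name,
       e.first_name || ' ' || e.last_name as full_name,
       j.job_title
  from employees e
 inner join departments d on d.department_id = e.department_id
 -- assumed relationship: adjust if your jobs table is keyed differently
 inner join jobs j on j.job_id = e.job_id
 where j.job_title like '%Sales%'
 order by j.job_title, full_name;
```

With every table tied to the others by a join condition, each employee should appear once, so neither DISTINCT nor GROUP BY is needed.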

0

You can use a DISTINCT as Matt Busche indicates.

In many, many cases, however, people incorrectly add a DISTINCT to cover up the fact that they are missing a join condition. In your case, you are joining three tables together (departments, employees, and jobs) but you only have one join condition. You are, therefore, doing a Cartesian join to the jobs table, which is almost certainly not what you want. How does the jobs table relate to the other two tables? If we assume that there is a job_id in both the jobs and the employees table, you would want something like:

select employees.department_id, 
       departments.department_name, 
       first_name || ' ' || last_name as full_name, 
       job_title
  from departments, employees, jobs
 where employees.department_id = departments.department_id
   and jobs.job_id             = employees.job_id
   and job_title            like '%Sales%'
 order by job_title, full_name;
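
To see why the missing condition multiplies rows, here is a small sketch using hypothetical miniature tables. The table contents and the job_id values are made up for illustration and are not taken from the question.

```sql
-- Hypothetical miniature tables, for illustration only.
create table departments (department_id int, department_name varchar(30));
create table jobs (job_id varchar(10), job_title varchar(35));
create table employees (employee_id int, first_name varchar(20),
                        last_name varchar(25), department_id int,
                        job_id varchar(10));

insert into departments values (80, 'Sales');
insert into jobs values ('SA_REP', 'Sales Representative');
insert into jobs values ('SA_MAN', 'Sales Manager');
insert into jobs values ('IT_PROG', 'Programmer');
insert into employees values (1, 'Alana', 'Walsh', 80, 'SA_REP');

-- No condition ties jobs to employees, so the single employee is paired
-- with every job whose title contains 'Sales': this counts 2 rows.
select count(*)
  from departments, employees, jobs
 where employees.department_id = departments.department_id
   and job_title like '%Sales%';

-- With the join condition the employee is paired only with her own job:
-- this counts 1 row.
select count(*)
  from departments, employees, jobs
 where employees.department_id = departments.department_id
   and jobs.job_id = employees.job_id
   and job_title like '%Sales%';
```

The extra rows in the first query come from the Cartesian product, which is why DISTINCT only hides the symptom rather than fixing the query.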

2 Comments

@user2145903 - Assuming that they both have a job_id, yes, you should include that join condition. You didn't tell us what your tables looked like so I guessed at the relationship.
@user2145903 - Bear in mind that we don't have your tables or your data. If you happen to be using the tables in the sample HR schema, then the query I posted should work. Frequently, though, your instructor will create a schema for you to work in that has different tables and different data.
0

I guess UNION ALL can produce a result like that. From what I understand, you want to duplicate each distinct row of your dataset.

For example, if the query is like this: Find employees in the HR department and output the result with one duplicate. Output the first name and the department of employees.

Table name: Worker
Columns: worker_id int, first_name varchar, last_name varchar, salary int, joining_date datetime, department varchar

select first_name, department from worker where department = 'HR'
union all
select first_name, department from worker where department = 'HR';

This will give each row twice like you have it in your output.

So, in your case, you could fetch the distinct records into a temp table and then self-join that table to shape the output as you want. It is a lot of work, but this is the direction I would look in.
