2

Hi, I am trying to insert values from an Excel sheet into a SQL database in Java. The database already has some rows that were inserted by other means. Now I need to insert new rows from the Excel sheet while eliminating duplicates, whether they already exist in the database or are repeated within the Excel sheet. This is what I have written so far.

First I inserted the records from the Excel sheet into the SQL database using an insert query:

    statement.executeUpdate("INSERT INTO dbo.Company(CName, DateTimeCreated) VALUES ('"
            + Cname + "', '" + ts + "')");

Later I deleted the duplicate values using a delete query:

    String comprows = "delete from dbo.Company where Id not in "
            + "(select min(Id) from dbo.Company "
            + "group by CName having count(*)>=1)";
    statement3.executeUpdate(comprows);

where Id is an auto-incremented integer. But inserting and then deleting is not a good approach. How can I tell whether a value already exists? And if it does, how do I skip it during insertion?

1
  • What RDBMS? A staging table + MERGE is one option if your RDBMS supports it. Commented Feb 15, 2012 at 4:35

3 Answers

2

You can simply run a SELECT for the CName first. If a record is found, update it; otherwise insert a new record. Edited to add a code snippet:

    // java.sql.Statement has no query() method; executeQuery() returns the ResultSet.
    ResultSet rs = statement.executeQuery("SELECT Id FROM dbo.Company WHERE CName = '" + Cname + "'");

    if (rs.next()) {
        // retrieve the ID from rs
        // fire an update for this ID
    } else {
        // insert a new record
    }

Alternatively, if you think there are already duplicates in your table and you want to remove them as well:

    ResultSet rs = statement.executeQuery("SELECT Id FROM dbo.Company WHERE CName = '" + Cname + "'");

    // Collect the IDs of all existing rows with this name.
    List<String> idList = new ArrayList<>();
    while (rs.next()) {
        idList.add(String.valueOf(rs.getInt("Id")));
    }
    if (!idList.isEmpty()) {
        // Join the IDs into a comma-separated string for the IN clause.
        String idsStr = String.join(",", idList);
        statement.executeUpdate("DELETE FROM dbo.Company WHERE Id IN (" + idsStr + ")");
    }
    // Insert the new record.
    statement.executeUpdate("INSERT INTO dbo.Company(CName, DateTimeCreated) VALUES ('"
            + Cname + "', '" + ts + "')");

Of course, good practice is to use a PreparedStatement, as it can improve performance and avoids the quoting and SQL injection problems that come with concatenating values into the query. PS: Excuse any syntax errors.
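For illustration, here is a minimal sketch of the select-then-insert idea using PreparedStatement. The class name CompanyInserter and the method insertIfAbsent are hypothetical names chosen for this example; the table and column names come from the question. It assumes an open java.sql.Connection is supplied by the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;

// Hypothetical helper class; not part of the original answer.
class CompanyInserter {
    // Placeholders (?) replace the string concatenation used in the snippets above.
    static final String EXISTS_SQL =
        "SELECT 1 FROM dbo.Company WHERE CName = ?";
    static final String INSERT_SQL =
        "INSERT INTO dbo.Company (CName, DateTimeCreated) VALUES (?, ?)";

    /** Inserts the row only if no row with this CName exists; returns true if a row was inserted. */
    static boolean insertIfAbsent(Connection conn, String cName, Timestamp ts)
            throws SQLException {
        try (PreparedStatement check = conn.prepareStatement(EXISTS_SQL)) {
            check.setString(1, cName);
            try (ResultSet rs = check.executeQuery()) {
                if (rs.next()) {
                    return false; // a row with this name is already present, so skip it
                }
            }
        }
        try (PreparedStatement insert = conn.prepareStatement(INSERT_SQL)) {
            insert.setString(1, cName);
            insert.setTimestamp(2, ts);
            return insert.executeUpdate() == 1;
        }
    }
}
```

Calling insertIfAbsent once per Excel row also skips names repeated within the sheet, because the second occurrence finds the row inserted by the first. Note that the check and the insert are two separate statements, so two concurrent writers could still both insert the same name unless the database enforces uniqueness.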

1 Comment

Hmm, can you please provide a code snippet? I am not getting a clear picture.
0

One option would be to create a temp table and dump your Excel data there. Then you can write an insert that joins the temp table with the dbo.Company table and only inserts the records that aren't already there.
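A rough sketch of that idea in JDBC follows. It assumes a staging table named dbo.CompanyStaging with a single CName column already exists; that table name, the class name StagingLoader, and the method load are all made up for this example. A true session-scoped temp table would work similarly.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Timestamp;
import java.util.List;

// Hypothetical helper; dbo.CompanyStaging is an assumed staging table with one CName column.
class StagingLoader {
    static final String CLEAR_SQL = "DELETE FROM dbo.CompanyStaging";
    static final String STAGE_SQL = "INSERT INTO dbo.CompanyStaging (CName) VALUES (?)";
    // DISTINCT removes duplicates within the Excel data;
    // NOT EXISTS removes names that are already in dbo.Company.
    static final String COPY_NEW_SQL =
        "INSERT INTO dbo.Company (CName, DateTimeCreated) "
      + "SELECT DISTINCT s.CName, ? FROM dbo.CompanyStaging s "
      + "WHERE NOT EXISTS (SELECT 1 FROM dbo.Company c WHERE c.CName = s.CName)";

    /** Stages all names, then copies only the new ones; returns the number of rows copied. */
    static int load(Connection conn, List<String> names, Timestamp ts) throws SQLException {
        try (Statement clear = conn.createStatement()) {
            clear.executeUpdate(CLEAR_SQL); // start from an empty staging table
        }
        try (PreparedStatement stage = conn.prepareStatement(STAGE_SQL)) {
            for (String name : names) {
                stage.setString(1, name);
                stage.addBatch(); // queue the row instead of sending it immediately
            }
            stage.executeBatch();
        }
        try (PreparedStatement copy = conn.prepareStatement(COPY_NEW_SQL)) {
            copy.setTimestamp(1, ts);
            return copy.executeUpdate();
        }
    }
}
```

The duplicate check happens once, inside the database, for the whole batch rather than once per Excel row.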

You could do a lookup on each record you want to insert, but if you are dealing with large volumes that's not a very efficient way to do it, since you will have to do a select and an insert for each record in your Excel spreadsheet.

Merge statements are pretty effective in these types of situations as well. I don't think all databases support them (I know Oracle does for sure). A merge statement is basically a combined insert and update: you look up each row in the final table, insert it if it is not found, and update it if it is. The nice thing about this is that you get the efficiency of doing all of this as a set rather than one record at a time.
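As a hedged illustration, the sketch below issues a MERGE written in SQL Server-style syntax, which is only a guess based on the dbo schema in the question; Oracle and other databases use somewhat different syntax. It assumes the Excel rows have already been loaded into a hypothetical staging table dbo.CompanyStaging, and the class and method names are invented for this example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

// Hypothetical helper; dbo.CompanyStaging is an assumed, already-populated staging table.
class CompanyMerge {
    // Insert-only merge: names already in dbo.Company are left untouched.
    // A WHEN MATCHED THEN UPDATE clause could be added if existing rows should be refreshed.
    static final String MERGE_SQL =
        "MERGE INTO dbo.Company AS target "
      + "USING (SELECT DISTINCT CName FROM dbo.CompanyStaging) AS source "
      + "ON target.CName = source.CName "
      + "WHEN NOT MATCHED THEN "
      + "INSERT (CName, DateTimeCreated) VALUES (source.CName, ?);";

    /** Runs the merge as a single set-based statement; returns the number of affected rows. */
    static int mergeNew(Connection conn, Timestamp ts) throws SQLException {
        try (PreparedStatement merge = conn.prepareStatement(MERGE_SQL)) {
            merge.setTimestamp(1, ts);
            return merge.executeUpdate();
        }
    }
}
```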

0

If you can control the DB schema, you might consider putting a unique constraint on whatever column(s) must not be duplicated. When you do your inserts, the database will throw an error when you try to add duplicate data. Catch that exception in your code so one duplicate does not abort the whole import.

It's usually good to enforce constraints like this on the DB itself; that means no one querying the database has to worry about invalid duplicates. Also, optimistically trying the insert first (without doing a separate select first) might be faster.
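A minimal sketch of the optimistic approach follows, assuming a unique constraint has been added on CName (for example with something like ALTER TABLE dbo.Company ADD CONSTRAINT UQ_Company_CName UNIQUE (CName), where the constraint name is made up, and which can only succeed once existing duplicates are removed). The class and method names are hypothetical. The check relies on SQLState class "23", the standard class for integrity constraint violations; it also covers other violations such as NOT NULL or foreign keys, so a real implementation may want a driver-specific error code check as well.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;
import java.sql.Timestamp;

// Hypothetical helper; assumes a unique constraint exists on dbo.Company(CName).
class DuplicateAwareInsert {
    static final String INSERT_SQL =
        "INSERT INTO dbo.Company (CName, DateTimeCreated) VALUES (?, ?)";

    /** True if the exception reports an integrity constraint violation (SQLState class 23). */
    static boolean isConstraintViolation(SQLException e) {
        String state = e.getSQLState();
        return e instanceof SQLIntegrityConstraintViolationException
            || (state != null && state.startsWith("23"));
    }

    /** Tries the insert directly; returns false if the database rejected it as a duplicate. */
    static boolean tryInsert(Connection conn, String cName, Timestamp ts) throws SQLException {
        try (PreparedStatement insert = conn.prepareStatement(INSERT_SQL)) {
            insert.setString(1, cName);
            insert.setTimestamp(2, ts);
            insert.executeUpdate();
            return true;
        } catch (SQLException e) {
            if (isConstraintViolation(e)) {
                return false; // duplicate rejected by the constraint, move on to the next row
            }
            throw e; // anything else is a real failure
        }
    }
}
```

If the inserts run inside a transaction, keep in mind that some databases mark the transaction as failed after a constraint error, so this pattern is simplest with auto-commit enabled or with savepoints.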
