
I am trying to move some data from one database to another. I currently have over a million entries in my database, and I expected this to take a long time, but 50 minutes have already passed with no result :). Here is my query:

UPDATE xxx.product AS p 
LEFT JOIN xx.tof_art_lookup AS l ON p.model_view = l.ARL_SEARCH_NUMBER 
SET p.model = l.ARL_DISPLAY_NR 
WHERE p.model_view = l.ARL_SEARCH_NUMBER;

Any help on how to improve this query will be welcome. Thanks in advance!

  • Using a JOIN in an UPDATE is very expensive! Commented Nov 12, 2013 at 12:20
  • OUTER JOINs in updates are vanishingly rare (except with IS NULL checks). Are you sure that's what you want? If not, switch to an INNER JOIN. Commented Nov 12, 2013 at 12:21
  • Would you recommend fetching the data with PHP, saving it to an array, and then doing the UPDATE? Commented Nov 12, 2013 at 12:21
  • 50 minutes? That's 300 updates per second if you have a million rows. Hang in there. A job like this could take overnight. Also, you need to let us know stuff like whether you're using InnoDB. Commented Nov 12, 2013 at 12:22
  • Do NOT bring the data to PHP. Commented Nov 12, 2013 at 12:24

2 Answers

2

Add indexes on p.model_view and l.ARL_SEARCH_NUMBER if you are not going to get rid of the JOINs.
Beyond that, depending on the actual data volumes and values (e.g. the presence of NULLs), the query might be optimized by:
1. Checking the query execution plan and, if it is not good, adding query hints for the optimizer or replacing the JOINs with subqueries so that the optimizer picks a different join type (merge, nested loops, hash, etc.)
2. Writing a stored procedure with more complicated but faster logic
3. Doing the updates in small batches
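
A rough sketch of the indexing and batching points, not something taken from the answer itself. The index names and the `product_id` column are assumptions, since the thread does not show the schema. MySQL does not accept LIMIT in a multi-table UPDATE, so the batches here are key ranges instead.

```sql
-- Check first whether suitable indexes already exist.
SHOW INDEX FROM xxx.product;
SHOW INDEX FROM xx.tof_art_lookup;

-- Index names below are made up for this sketch.
CREATE INDEX idx_product_model_view
    ON xxx.product (model_view);
CREATE INDEX idx_lookup_search_number
    ON xx.tof_art_lookup (ARL_SEARCH_NUMBER);

-- One batch of the update. product_id is a hypothetical numeric key;
-- repeat with the next range (50001 to 100000, and so on).
UPDATE xxx.product AS p
INNER JOIN xx.tof_art_lookup AS l
        ON p.model_view = l.ARL_SEARCH_NUMBER
SET p.model = l.ARL_DISPLAY_NR
WHERE p.product_id BETWEEN 1 AND 50000;
```

The INNER JOIN updates the same rows as the original query, because the WHERE clause there already discards products without a lookup match.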


1 Comment

Thanks for your help. I will try to see what is taking so long and will be back with updates, or hopefully to close this thread :D
1

Identify what makes it slow.

Check that the JOIN is optimized

Run the SELECT on its own:

SELECT COUNT(*)
FROM xxx.product p LEFT JOIN xx.tof_art_lookup l 
  ON p.model_view = l.ARL_SEARCH_NUMBER;

How long does it take? Also run EXPLAIN SELECT ... and check that a proper index is used for the JOIN.
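
For reference, the EXPLAIN check could look like the sketch below. In the output, `type: ALL` together with `key: NULL` on the tof_art_lookup row would suggest a full scan of the lookup table for every product row, i.e. the join is not using an index.

```sql
-- Shows the execution plan instead of running the query.
EXPLAIN
SELECT COUNT(*)
FROM xxx.product AS p
LEFT JOIN xx.tof_art_lookup AS l
       ON p.model_view = l.ARL_SEARCH_NUMBER;
```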

If the JOIN is fine, then it is the row updates themselves that are slow. That situation is hard to speed up.

UPDATE = DELETE and INSERT

I haven't tried this, but sometimes this strategy is faster. An UPDATE amounts to deleting the old row and inserting a new row with the new value, so you can build a new table with the final values instead.

-- Create the new table and fill it from the join
CREATE TABLE xxx.new_product
SELECT p.model_model, l.ARL_DISPLAY_NR, ...
FROM xxx.product p LEFT JOIN xx.tof_art_lookup l
  ON p.model_view = l.ARL_SEARCH_NUMBER;

-- drop xxx.product
-- rename xxx.new_product to xxx.product
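
The drop and rename steps could be done as a single swap, sketched below. `xxx.old_product` is a hypothetical name; keeping the old table around until the result is verified is a cautious variant of the drop described above. Note that CREATE TABLE ... SELECT does not copy indexes or keys from the source table, so they would have to be recreated on the new table.

```sql
-- Swap the tables in one statement; MySQL performs the renames atomically.
RENAME TABLE xxx.product     TO xxx.old_product,
             xxx.new_product TO xxx.product;

-- Once the new table has been checked:
DROP TABLE xxx.old_product;
```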

Divide the table into small chunks and run them concurrently

I think your job is CPU-bound, and a single UPDATE query uses just one CPU, so it cannot benefit from multiple cores. The update has no condition restricting xxx.product, so all 1M rows are updated sequentially.

My suggestion is as follows.

Add a condition on xxx.product that divides it into 20 groups. (I don't know which column would work best for you, as I have no information about xxx.product.)

Then run the 20 queries concurrently.

For example:

-- Note: BETWEEN includes both endpoints, so a row whose value equals a
-- boundary (e.g. val2) falls into two neighbouring chunks.

-- for the 1st chunk
UPDATE xxx.product AS p 
...
WHERE p.model_view = l.ARL_SEARCH_NUMBER
  AND p.column BETWEEN val1 AND val2; -- this condition splits xxx.product

-- for the 2nd chunk
UPDATE xxx.product AS p 
...
WHERE p.model_view = l.ARL_SEARCH_NUMBER
  AND p.column BETWEEN val2 AND val3;

...
...

-- for the 20th chunk
UPDATE xxx.product AS p 
...
WHERE p.model_view = l.ARL_SEARCH_NUMBER
  AND p.column BETWEEN val19 AND val20;

It is important to choose BETWEEN values that split the table evenly. A histogram may help you; see "Getting data for histogram plot".
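
A minimal sketch of such a histogram, assuming xxx.product has a numeric column to split on. The column name `id` and the bucket width of 50000 are assumptions for illustration only.

```sql
-- Count rows per fixed-width bucket of the (hypothetical) id column.
-- Buckets with similar counts make good chunk boundaries; very uneven
-- counts mean the ranges should be adjusted.
SELECT FLOOR(p.id / 50000) AS bucket,
       MIN(p.id)           AS first_id,
       MAX(p.id)           AS last_id,
       COUNT(*)            AS n_rows
FROM xxx.product AS p
GROUP BY bucket
ORDER BY bucket;
```

The first_id and last_id values of each bucket can then be used as the val1, val2, ... boundaries in the chunked updates.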

2 Comments

Thanks for the help. I will give it a try and check to see what's taking so long.
@Ionut Laurentiu I'm looking forward to your reply, even if my suggestion does not improve your query.
