
Folks,

There are 2500 records in the master table city, which is referenced in approximately 22 other tables by city ID or city name.

Now I have to delete 2000 records from the city master table, cross-check all 22 other tables for references to those city IDs, and modify the ID wherever references exist.

I made a simple PHP script for this, but it took 15-16 hours, and I cannot do the same on the production server. The 22 tables each have on average 700,000 to 800,000 rows.

Please suggest the best possible solution.

Thanks

  • 2000 records, 15 hours. You're doing something wrong. Commented Aug 26, 2011 at 2:35
  • Ravi, you're only going to get real help if you post the script for us to see. Commented Aug 26, 2011 at 2:43
  • For people wondering about the 15-hour time quoted: as I understand it (translation issues aside), there are 2000 records in the "master" table that need to be deleted, but for each of the other 22 tables, there are hundreds of thousands of rows that reference the deleted records and themselves need to be deleted or edited. So we're talking about working with millions of rows in total, not 2000. Commented Aug 26, 2011 at 2:43
  • @RaviRaj: there is a gap in explanation, not in understanding. There is a big difference. Commented Aug 26, 2011 at 2:52
  • Are you the DBA yourself? Are your tables indexed on their PK, FK and/or UK columns? Do your queries filter on an indexed column in the WHERE clause? Not doing so will make the queries take ages. Commented Aug 26, 2011 at 3:17

2 Answers


Okay, this is a shot in the dark, but here goes:

Remove the cities one at a time. For each city, update all 22 of the massive referencing tables, and once they are updated, remove that row from the master city table. This will be slower overall, but it executes in small chunks (1/2500th of 16 hours ≈ 20-25 seconds per city). Rinse and repeat ~2500 times.
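A minimal sketch of one iteration of that loop, assuming a hypothetical child table named orders with a city_id column (your real table and column names will differ), and that city 17 is being merged into city 1042:

    -- One iteration: repoint every reference to the doomed city,
    -- then remove its master row. Each iteration gets its own
    -- transaction so locks are held for only a few seconds.
    START TRANSACTION;

    UPDATE orders SET city_id = 1042 WHERE city_id = 17;
    -- ...repeat the same UPDATE for the other 21 child tables...

    DELETE FROM city WHERE id = 17;

    COMMIT;

If the tables are MyISAM rather than InnoDB, the START TRANSACTION/COMMIT pair is a no-op and the statements simply run back to back; the per-city chunking still applies either way.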


1 Comment

Since 1/2500th of an hour is about 1.5 seconds, that's about 20-25 seconds per city, isn't it? I agree with the method you suggest, though; I was debating whether to suggest it. One advantage of this is that it can be scheduled over many days, if need be - a hundred cities here, a hundred cities there - without taking everything out of commission while it is happening.

I made a simple PHP script for this, but it took 15-16 hours, and I cannot do the same on the production server

If it's simple, then why not post it here? Without knowing your algorithm and data schema, we can't guess what the best approach would be.

Certainly you should already have ensured that you're using appropriate indexes for your queries and that the MySQL instance is tuned (you should probably configure sort_buffer, repair_threads, thread_cache and max_sort_file_size above their defaults).
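For instance, here is a quick way to check whether the per-city lookups can use an index at all (again with a hypothetical child table orders and column city_id; substitute your own names):

    -- If EXPLAIN shows type = ALL, every lookup is a full scan
    -- of ~700k rows, and thousands of those per table take ages.
    EXPLAIN SELECT * FROM orders WHERE city_id = 17;

    -- Index the referencing column so the updates and deletes
    -- become cheap index-range scans instead of full scans.
    ALTER TABLE orders ADD INDEX idx_orders_city_id (city_id);

Building 22 such indexes once is far cheaper than scanning 22 large tables a couple of thousand times each.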

In general, bulk operations will be much faster done in SQL rather than in PHP. And it's probably a lot simpler to change the foreign key references before you delete the master record (not to mention that you won't break your data integrity during the process).
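A sketch of what that could look like done set-wise in SQL, assuming you first load your old-ID-to-new-ID decisions into a helper table (city_remap and orders are illustrative names, not your schema):

    -- Hypothetical helper table holding the 2000 remapping pairs
    CREATE TABLE city_remap (
        old_id INT NOT NULL PRIMARY KEY,
        new_id INT NOT NULL
    );
    -- INSERT INTO city_remap (old_id, new_id) VALUES ... ;

    -- One multi-table UPDATE per child table repoints every
    -- reference in a single statement.
    UPDATE orders o
    JOIN city_remap r ON o.city_id = r.old_id
    SET o.city_id = r.new_id;

    -- After all 22 child tables are done, drop the old masters.
    DELETE city FROM city
    JOIN city_remap r ON city.id = r.old_id;

That is about two dozen statements in total instead of thousands of per-row round trips from PHP, and because every reference is repointed before the DELETE runs, integrity holds throughout.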

