
I am running a php script that:

  1. queries a local database to retrieve an amount
  2. executes a curl statement to update an external database with the above amount + x
  3. queries the local database again to insert a new row reflecting that the curl statement has been executed.

One of the problems I am having is that the curl statement takes 2-4 seconds to execute, so if two different users from the same company run the same script at the same time, the execution time of the curl command can cause a mismatch in what should be updated in the external database. This is because the first user's curl statement has not yet returned, so the second user is working off incorrect figures.

I am not sure of the best options here, but basically I need to prevent two or more curl statements being run at the same time.

I thought of storing a flag in the database to indicate that a curl statement is currently being executed, and preventing any other curl statements from running until it has completed. Once the first curl statement has finished, the flag is cleared and the next one can run. If the flag is 'locked', I could loop, sleeping for 5 seconds between checks to see whether the flag has been reset. After 3 loops, I would reset the flag automatically (I've never seen the curl take longer than 5 seconds) and continue processing.
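The flag idea above could be sketched roughly like this (table and function names are hypothetical, and an in-memory SQLite database stands in for the real one). The important detail is to set the flag with a single atomic UPDATE rather than a SELECT followed by an UPDATE, otherwise two users can both see the flag as unlocked and proceed:

```php
<?php
// Sketch of the database-flag approach described above. All names are
// illustrative; sqlite::memory: stands in for the real local database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE curl_lock (company_id INTEGER PRIMARY KEY,
                                   locked INTEGER NOT NULL DEFAULT 0,
                                   locked_at INTEGER)");
$db->exec("INSERT INTO curl_lock (company_id, locked) VALUES (1, 0)");

function acquire_lock(PDO $db, int $companyId, int $staleAfter = 15): bool {
    // Flip the flag in one atomic UPDATE; also steal locks older than
    // $staleAfter seconds (the "reset the flag automatically" case).
    $stmt = $db->prepare(
        "UPDATE curl_lock SET locked = 1, locked_at = :now
         WHERE company_id = :id
           AND (locked = 0 OR locked_at < :stale)");
    $stmt->execute([':now' => time(), ':id' => $companyId,
                    ':stale' => time() - $staleAfter]);
    return $stmt->rowCount() === 1;   // 1 row changed => we own the lock
}

function release_lock(PDO $db, int $companyId): void {
    $db->prepare("UPDATE curl_lock SET locked = 0 WHERE company_id = :id")
       ->execute([':id' => $companyId]);
}

// Retry loop: sleep and re-check, as described above.
$got = false;
for ($i = 0; $i < 3 && !($got = acquire_lock($db, 1)); $i++) {
    sleep(5);
}
if ($got) {
    // ... run the curl call here, then:
    release_lock($db, 1);
}
```

A second caller invoking acquire_lock() while the flag is held gets false back and can sleep and retry, which matches the loop described above.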

Are there any other (more elegant) ways of approaching this?

  • As far as I can tell, you never update the local database with the current amount (Step 3 says "insert a new row"), so the current amount in the local database will always be the same, regardless of how many people are running curl concurrently. More details are needed here, but this sounds like a basic locking problem. Google "pessimistic lock" and "optimistic lock". It would also help to remove curl from the equation and replace it with a bona fide web service. Commented Jun 2, 2010 at 11:07
  • I didn't explain that properly - but yes, I do update the local database with the current amount in Step 3. In fact, the amount calculated in Step 1 is based on summing all the records previously inserted in Step 3. Commented Jun 2, 2010 at 23:19

3 Answers


You can use flock with an arbitrary file. This way, the second script will block until it can acquire the lock.

$lockfile = 'foo.bar';
$fd = fopen($lockfile, "w");
if (flock($fd, LOCK_EX)) {
    do_your_stuff();
    flock($fd, LOCK_UN); // release the lock when done
} else {
    die("error"); // should not happen; flock blocks until the lock is acquired
}

fclose($fd);

EDIT:

PHP is not Java EE, there is no simple way to implement distributed transactions.


4 Comments

Unfortunately that will not work, since the lock has to happen on a per-company basis - it's fine to run concurrent curl operations, as long as there is only one per company at any one time.
@JonoB: Use a unique file for each company, like md5($companyName).
@Tom OK, makes sense, but then I still have to create a new file each time a new company is loaded onto the system. Why would this file-based method be preferred over creating a boolean value in the database?
@JonoB The database procedure you describe only works if you use a SERIALIZABLE isolation level.
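The per-company variant suggested in the comments could look like this (a sketch; the directory and function name are hypothetical). Note that fopen() with mode "w" creates the file if it does not exist, so no setup step is needed when a new company is added:

```php
<?php
// Per-company flock, as suggested in the comments above.
// One lock file per company; fopen("w") creates it on first use.
function with_company_lock(string $companyName, callable $fn) {
    $path = sys_get_temp_dir() . '/curl_lock_' . md5($companyName);
    $fd = fopen($path, 'w');   // created automatically if missing
    flock($fd, LOCK_EX);       // blocks until this company's lock is free
    $result = $fn();           // e.g. the curl call
    flock($fd, LOCK_UN);
    fclose($fd);
    return $result;
}
```

Two requests for the same company serialize on the same file, while requests for different companies hash to different files and run concurrently.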

What you need is called Concurrency control. A lot of people have been putting a lot of time and effort into researching this issue.

This issue is not as simple as you hope it is. Some forum engines (cough Stackoverflow cough) implement something similar, so they can show posts in the order in which they are created (not posted). This is done by creating a random token which the end-user must use, from time to time, to notify the server that it is still active and still processing the record currently being edited or added.

The most common issues with this approach are connection time-outs and user time-outs. Connection time-outs are handled by having the client send a heartbeat to the server at regular intervals (on the web this is usually done by issuing an HTTP request just to signal that the connection is still alive - open); if the client stops sending heartbeats for a long period, the server considers it timed out. The client, in turn, should know whether its heartbeat reached the server and decide what to do if the connection has timed out.

There is also the case of user time-out, when the user simply locks a record and leaves the computer for a long period of time. In this case, both the client and the server should be aware that the record has been locked but not edited for a long time, and both should take action.

The problem may seem simple, it can be formulated in a single sentence, but the answer is very complicated and depends on many factors.


curl supports parallel requests to N resources via curl_multi_exec(). If you want your calls (among which are multiple curl_* calls) to run sequentially, and the steps above to behave as one atomic operation, do not use curl_multi, in case you are currently using it.

If the database records that you access for updates cannot (or should not) be accessed by more than one user at the same time, then you should consider locking/transactions, if available from your database server.

The pseudo-transaction mechanism you describe, with a column marking a record as 'locked', might help, but I cannot be certain (there is also a method for pseudo-transactions using timestamps, which you can google for more information).
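One common form of the timestamp/version idea is optimistic locking: read the row together with a version counter, do the work, then write back only if the version is unchanged. A minimal sketch, with hypothetical table and column names and an in-memory SQLite database standing in for the real one:

```php
<?php
// Optimistic-locking sketch: the UPDATE succeeds only if nobody else has
// bumped the version since we read the row. All names are illustrative.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE amounts (id INTEGER PRIMARY KEY,
                                 amount INTEGER NOT NULL,
                                 version INTEGER NOT NULL DEFAULT 0)");
$db->exec("INSERT INTO amounts (id, amount) VALUES (1, 100)");

function update_if_unchanged(PDO $db, int $id, int $newAmount, int $seenVersion): bool {
    $stmt = $db->prepare(
        "UPDATE amounts SET amount = :amt, version = version + 1
         WHERE id = :id AND version = :ver");
    $stmt->execute([':amt' => $newAmount, ':id' => $id, ':ver' => $seenVersion]);
    // 0 rows changed => someone else won the race; re-read and retry
    return $stmt->rowCount() === 1;
}

// Read the current amount and version, then attempt the write.
$row = $db->query("SELECT amount, version FROM amounts WHERE id = 1")
          ->fetch(PDO::FETCH_ASSOC);
$ok = update_if_unchanged($db, 1, (int)$row['amount'] + 10, (int)$row['version']);
```

A caller that loses the race gets false back, re-reads the row (and its new version), and retries - no lock is ever held across the slow curl call.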

