I'm running Celery with Django. I import a stream of objects into my database using tasks; each task imports one object, and worker concurrency is 2. Within the stream, objects can be duplicated, but they must not end up duplicated in my database.
The code I'm running:
# inside a custom model manager; `qs` is a queryset already filtered
# on the place, `defaults` holds the field values for a new venue
if qs.exists() and qs.count() == 1:
    return qs.get()
elif qs.exists():
    logger.exception('Multiple venues for same place')
    raise ValueError('Multiple venues for same place')
else:
    obj = self.create(**defaults)
    return obj
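For context, each object in the stream is handed to a task that looks roughly like this (a simplified sketch; import_venue, Venue and get_or_create_for_place are placeholder names for my actual task, model and manager method):

# simplified sketch of the per-object import task (names are placeholders);
# the worker runs with --concurrency=2, so two of these can execute at once
from celery import shared_task

from .models import Venue

@shared_task
def import_venue(place_id, payload):
    # calls the manager method shown above
    venue = Venue.objects.get_or_create_for_place(place_id, defaults=payload)
    return venue.pk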
The problem: if duplicate objects appear in the stream very close to each other, the app still imports the same object twice.
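My guess at the interleaving when two duplicates land on the two workers at almost the same time:

worker A: qs.exists() -> False            (no venue for this place yet)
worker B: qs.exists() -> False            (worker A has not created/committed yet)
worker A: self.create(**defaults)         (first row)
worker B: self.create(**defaults)         (duplicate row)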
I assume that the DB checks are not working properly with this concurrency setup. What architecture do you recommend to resolve this issue?