Earlier I found that a jpg, which was a scan of an important document was corrupted.
I have no clue how it got corrupted. Since this file was managed by my app and never accounted that files might get corrupted one day, I want to know what is the standard practice to ensure the integrity of the file? There are many files and I want to spot the files that get corrupted and look at the backups to recover them proactively and do a reactive recovery.
Based on what I gathered so far, md5 can be used to spot if a file is corrupted. So once a file is introduced to the system, I get a md5 hash. Then periodically, I run a process to check if all the file's md5 is consistent.
This is part of an app that is written in Rails. But I would like a language agnostic solution(if feasible) as I have an old system in asp.net and I am building a new one in Django also. Or is there a specialised app that just does this?