I have about a thousand binary files in a compressed format, and each file needs to be decoded separately in a single pass. The maximum file size is 500 MB. Currently I am able to decode the files one by one in Python (using the struct module), but because the files are so numerous and so large, decoding them sequentially is not feasible.
I am thinking of processing this data in Spark, but I don't have much experience with it. Can you please suggest whether this task can be done in Spark? Many thanks in advance.
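For context, here is a minimal sketch of the kind of struct-based decoder I have, and (in comments) roughly how I imagine it might plug into Spark via `binaryFiles`, which yields each file as one `(path, bytes)` pair. The record layout here (`int32` id followed by `float64` value) is purely hypothetical, just to illustrate the shape of the problem:

```python
import struct

def decode_blob(blob):
    """Decode a blob of fixed-size records.

    Hypothetical format: consecutive (int32 id, float64 value) pairs,
    little-endian, 12 bytes each. The real format would differ.
    """
    record = struct.Struct("<id")  # 4-byte int + 8-byte double = 12 bytes
    return [record.unpack_from(blob, off)
            for off in range(0, len(blob), record.size)]

# With Spark, I assume the whole-file decode could be parallelized like:
#
#   rdd = sc.binaryFiles("hdfs:///data/*.bin")   # RDD of (path, bytes)
#   decoded = rdd.mapValues(decode_blob)         # one file per task
#
# since binaryFiles hands each file to a task whole, which matches the
# "each file must be decoded in a single pass" constraint.

if __name__ == "__main__":
    # Local sanity check with two synthetic records.
    blob = struct.pack("<id", 1, 2.5) + struct.pack("<id", 2, 3.0)
    print(decode_blob(blob))
```

Is this roughly the right approach, or is there a better way to handle 500 MB files in Spark?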