A computational scientist where I work wrote a program that scores inputs using a machine learning model built with scikit-learn. My task is to make this ML scorer available as a microservice.
So I wrote a few lines of code using Flask to accomplish this. Mission achieved!
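For context, the service is essentially just this (a sketch; the route name and feature format are made up, and a stand-in function replaces the actual scikit-learn model so the snippet is self-contained):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In the real service this is a scikit-learn model, loaded once at startup,
# e.g. model = joblib.load("scorer.joblib"). A stand-in keeps the sketch runnable.
def predict(features):
    return sum(features)  # placeholder for model.predict([features])

@app.route("/score", methods=["POST"])
def score():
    # Expects JSON like {"features": [1.0, 2.0, 3.0]}
    features = request.get_json()["features"]
    return jsonify({"score": predict(features)})

# Run directly during development with: app.run(port=5000)
```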
Well, not quite. Because this service is going to get hit pretty heavily at times, it needs to be able to crunch on several requests in parallel (i.e., on multiple cores; we have about 20 on our server). A solution that I could achieve with about ten minutes of effort would be to just spin up ten or twenty of these little REST servers on different ports and round-robin across them using nginx as a reverse proxy.
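Concretely, the brute-force setup I have in mind is an nginx upstream block along these lines (ports and names are made up):

```nginx
upstream scorers {
    # one entry per Flask instance, each listening on its own port
    server 127.0.0.1:5000;
    server 127.0.0.1:5001;
    # and so on, one line per worker
}

server {
    listen 80;
    location / {
        # nginx round-robins across the upstream servers by default
        proxy_pass http://scorers;
    }
}
```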
Although I am sure this would work fine, I think it would be more elegant to have a single Python server handling all the requests rather than twenty separate Python servers. So I started reading up on WSGI, uWSGI, and a bunch of other things, but all this reading and web surfing has accomplished nothing except leaving me thoroughly confused.
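The kind of thing I keep running into is a WSGI server such as gunicorn fronting the app with a pool of worker processes, something like this (the `app:app` module path is a guess at my layout, not something I have working):

```shell
# One gunicorn master forks 20 worker processes, all serving the same port
gunicorn --workers 20 --bind 127.0.0.1:8000 app:app
```

But I don't really understand how this relates to Flask's own server, or whether it is actually the right tool here.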
So I'll ask here instead of trying to unravel this on my own: Should I just stick with the brute force approach I described above? Or is there something better that I might be doing?
But if doing something "better" is going to require days of effort wading through incomprehensible documentation, doing frustrating experimentation, and pulling out all of my hair, then I'd rather just stick with the dumb brute force approach that I already understand and that I know for sure will work.
Thanks.