

[D] Inputs on a scalable, cost-effective pipeline

Hi, all.

I have multiple deep learning / machine learning / naive Bayes-based tasks that I want to serve online through an API. I have been trying to figure out the best way to do this for some time, but I am overwhelmed by the number of different frameworks and packages available on AWS and GCP.

Multiple tools on both platforms seem to have overlapping responsibilities with unclear limitations, making it hard to choose.

I want a scalable pipeline that is as cheap as possible to run (using spot instances, for example) and that is easy to extend with new components.

My idea was to use Celery and create a task for each data processing method I have. The API would simply push an entry onto Celery's queue, and the workers would take care of the rest.
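For concreteness, here is a minimal sketch of that layout, assuming Celery with a Redis broker. The module name (`pipeline`), task names, and the dummy model logic are hypothetical placeholders, not a tested setup:

```python
# tasks.py -- a minimal sketch, assuming Celery with a Redis broker.
# Module name, task names, and model logic are hypothetical placeholders.
from celery import Celery

app = Celery(
    "pipeline",
    broker="redis://localhost:6379/0",   # where the API enqueues work
    backend="redis://localhost:6379/1",  # where workers store results
)

@app.task
def classify_text(text):
    # One Celery task per data processing method; a worker process
    # would load its real model once at startup and reuse it here.
    return {"label": "positive" if "good" in text else "negative"}

@app.task
def detect_objects(image_url):
    # Placeholder for another method, e.g. an object detection model.
    return {"image": image_url, "boxes": []}
```

The API endpoint then does nothing but enqueue work and hand back a task id that the client can poll, for example:

```python
# Inside the API handler (framework-agnostic): enqueue and return the id.
result = classify_text.delay("this service is good")
task_id = result.id  # client polls this id for the result later
```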

Scaling the pipeline up or down would then just be a matter of adding or removing Celery workers.
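If it helps, that scaling can happen at two levels: Celery can grow or shrink the process pool inside one worker node, and the cloud autoscaler can add or remove nodes. The process-level part is built into the Celery CLI (`pipeline` is the hypothetical module from the sketch above):

```sh
# Start a worker with a fixed pool of 4 processes:
celery -A pipeline worker --concurrency=4

# Or let Celery autoscale between 2 and 10 processes based on load:
celery -A pipeline worker --autoscale=10,2
```

Node-level scaling (adding or removing spot instances that each run this worker command) would then fall to the provider, e.g. an AWS Auto Scaling group or a GCP managed instance group driven by a queue-depth metric.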

How would you approach the problem?

Do you know of any resources worth reading to build an architecture like this?

Is there any particular service on AWS or GCP that would let me handle this easily?

/r/MachineLearning
https://redd.it/v34uhu