Identifying the right tools for high-performance production machine learning can be overwhelming as the ecosystem continues to grow at breakneck speed. In this industry collaboration we aim to provide a hands-on guide showing how practitioners can productionize optimized machine learning models in cloud-native ecosystems using production-ready open source frameworks. We will dive into a practical use case: deploying the well-known GPT-2 NLP model on Kubernetes with Seldon Core's Triton Inference Server integration, using the ONNX Runtime backend. The result is a scalable production NLP microservice serving the ML model, which can power intelligent text-generation applications. We will also present some of the key challenges currently faced in the MLOps space, and show how each tool in the stack interoperates throughout the production machine learning lifecycle.
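As a rough illustration of the kind of deployment discussed in the talk, a Seldon Core `SeldonDeployment` manifest pointing Triton at an ONNX-exported GPT-2 model might look like the sketch below. Field values such as the model URI and resource names are illustrative assumptions, not part of the original abstract:

```yaml
# Minimal sketch of a SeldonDeployment serving GPT-2 via Triton + ONNX Runtime.
# The modelUri is a hypothetical bucket holding a Triton model repository
# whose model directory contains the exported GPT-2 ONNX file.
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: gpt2
spec:
  protocol: v2          # V2 (Open) Inference Protocol spoken by Triton
  predictors:
  - name: default
    replicas: 1
    graph:
      name: gpt2        # must match the model name in the Triton repository
      implementation: TRITON_SERVER
      modelUri: gs://example-bucket/gpt2-onnx   # hypothetical path
```

Once applied, Seldon Core exposes the model behind a Kubernetes service that clients can call over the V2 inference protocol, which is how the microservice described above becomes consumable by text-generation applications.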