
ONNX and deployment libraries

Libraries like AllenNLP are great for model training and prototyping: they contain functions and helpers for almost any practical or theoretical task.
Some of them even ship model-serving utilities, but they can still be a poor choice for serving models in production.

The very same functionality that makes them convenient for development makes them hard to support in a production environment.
A Docker image with nothing but AllenNLP installed takes up a whole 1.9 GB compressed! It could hardly be called a micro-service.

In TensorFlow, this problem is solved by saving the computational graph in a dedicated serialization format, independent of the training and preprocessing libraries.
The serialized graph can later be served by TensorFlow Serving.
A good solution, but not a universal one - there are plenty of frameworks, like PyTorch, that do not follow Google's standard.

This is where ONNX comes in - an open standard for representing neural networks.
It defines a common set of operators - the building blocks of machine learning and deep learning models.
Not every valid Python/PyTorch model can be converted into an ONNX representation: only a subset of operations is also valid in ONNX.

Unfortunately, the default implementation of most AllenNLP models does not fit this subset:

- An AllenNLP model handles a vast variety of corner cases with conditions that are essentially arbitrary Python functions.
ONNX does not support arbitrary code execution; an ONNX model must consist of a computation graph only.
- AllenNLP models take care of text preprocessing: they operate on dictionaries and tokenization. ONNX does not support these operations.

Luckily, in most cases an AllenNLP model can serve as just a wrapper around the actual model implementation.
For this, you need an AllenNLP model that computes the loss, performs preprocessing, and interacts with the model trainer,
plus an internal class for the "pure" model, which implements the standard nn.Module interface.
It should take tensors as input and produce tensors as output,
and internally it should build a fixed computational graph.
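A minimal sketch of this split might look as follows; the class names (InnerClassifier, WrappedClassifier), the GRU-based architecture, and the token-field key names are illustrative assumptions, not taken from any particular AllenNLP model:

```python
from typing import Dict, Optional

import torch
from torch import nn
from allennlp.data import Vocabulary
from allennlp.models import Model


class InnerClassifier(nn.Module):
    """The "pure" model: tensors in, tensors out, one fixed computation graph."""

    def __init__(self, vocab_size: int, embedding_dim: int = 64, num_classes: int = 2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.encoder = nn.GRU(embedding_dim, 32, batch_first=True)
        self.projection = nn.Linear(32, num_classes)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        embedded = self.embedding(token_ids)       # (batch, seq, dim)
        _, hidden = self.encoder(embedded)         # final hidden state: (1, batch, 32)
        return self.projection(hidden.squeeze(0))  # (batch, num_classes)


@Model.register("wrapped_classifier")  # hypothetical registration name
class WrappedClassifier(Model):
    """AllenNLP wrapper: owns the vocabulary and the loss, delegates to the inner model."""

    def __init__(self, vocab: Vocabulary):
        super().__init__(vocab)
        self.inner = InnerClassifier(vocab.get_vocab_size("tokens"))
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(
        self,
        tokens: Dict[str, Dict[str, torch.Tensor]],
        label: Optional[torch.Tensor] = None,
    ) -> Dict[str, torch.Tensor]:
        # Key names depend on the token indexer configuration; "tokens"/"tokens" is a common default.
        token_ids = tokens["tokens"]["tokens"]
        logits = self.inner(token_ids)
        output = {"logits": logits}
        if label is not None:
            output["loss"] = self.loss_fn(logits, label)
        return output
```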

This internal model can now be converted into an ONNX model and saved independently.
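As a sketch, assuming the InnerClassifier from the previous snippet, the conversion boils down to a single torch.onnx.export call with a dummy input that traces the graph; the file name and axis labels are arbitrary:

```python
import torch

inner = InnerClassifier(vocab_size=10_000)
inner.eval()

# A dummy (batch, seq) tensor of token ids, used only to trace the graph.
dummy_input = torch.randint(0, 10_000, (1, 16))

torch.onnx.export(
    inner,
    dummy_input,
    "classifier.onnx",
    input_names=["token_ids"],
    output_names=["logits"],
    dynamic_axes={"token_ids": {0: "batch", 1: "sequence"}, "logits": {0: "batch"}},
    opset_version=13,
)
```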

Once you have the ONNX file, you can use whatever tool you like to serve or explore your model.
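For example, the exported graph can be run with ONNX Runtime alone, without PyTorch or AllenNLP in the image; the file and input/output names follow the export sketch above:

```python
import numpy as np
import onnxruntime as ort

# Load the serialized graph; this is the only heavy dependency the service needs.
session = ort.InferenceSession("classifier.onnx")

# Token ids would normally come from your own lightweight tokenization step.
token_ids = np.random.randint(0, 10_000, size=(1, 16), dtype=np.int64)

(logits,) = session.run(["logits"], {"token_ids": token_ids})
print(logits.shape)  # (1, num_classes)
```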