A Batch Job ML Model Deployment

Deploying an ML Model in an ETL Job


Introduction

Bonobo for ETL Jobs

ETL Application

- data ( folder for test data )
- model_etl ( folder for application code )
    - __init__.py
    - etl_job.py
    - graph.py
    - model_node.py
    - s3_etl_job.py
- tests ( folder for unit tests )
- .gitignore
- LICENSE
- Makefile
- README.md
- requirements.txt
- setup.py
- test_requirements.txt

MLModelTransformer Class

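import importlib

# Assumption: MLModel and its exception live in the ml_model_abc module
# of the ml-model-abc-improvements package installed below.
from ml_model_abc import MLModel, MLModelSchemaValidationException


class MLModelTransformer(object):
    """Bonobo node that loads an MLModel class by name and predicts on each row."""

    def __init__(self, module_name, class_name):
        # import the model class dynamically and instantiate it
        model_module = importlib.import_module(module_name)
        model_class = getattr(model_module, class_name)
        model_object = model_class()
        if not isinstance(model_object, MLModel):
            raise ValueError("The MLModelNode can only hold references to objects of type MLModel.")
        self.model = model_object

    def __call__(self, data):
        # yield the prediction so that it is passed on to the next node
        try:
            yield self.model.predict(data=data)
        except MLModelSchemaValidationException as e:
            raise e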
pip install git+https://github.com/schmidtbri/ml-model-abc-improvements
>>> from model_etl.model_node import MLModelTransformer
>>> model_transformer = MLModelTransformer(module_name="iris_model.iris_predict", class_name="IrisModel")
>>> generator = model_transformer(data={"sepal_length": 4.4, "sepal_width": 2.9, "petal_length": 1.4, "petal_width": 0.2})
>>> result = list(generator)
>>> result
[{'species': 'setosa'}]
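The transformer combines two ideas: loading a class by name at runtime, and exposing the loaded object as a callable that yields its result. The sketch below illustrates both using only the standard library. `load_class` and `CallableNode` are hypothetical names, and `collections.Counter` stands in for a model class, so this is an illustration under those assumptions rather than the project's code.

```python
import importlib


def load_class(module_name, class_name):
    """Import a module by its dotted name and return the named class from it."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


class CallableNode:
    """Hypothetical stand-in for MLModelTransformer: wraps an object, yields results."""

    def __init__(self, module_name, class_name, expected_type):
        obj = load_class(module_name, class_name)()
        # Reject objects that do not implement the expected interface.
        if not isinstance(obj, expected_type):
            raise ValueError("expected an instance of {}".format(expected_type.__name__))
        self.wrapped = obj

    def __call__(self, data):
        # Yield (rather than return) so the node can be consumed as a generator.
        self.wrapped.update(data)
        yield dict(self.wrapped)


# collections.Counter is a dict subclass, so the type check passes.
node = CallableNode("collections", "Counter", expected_type=dict)
result = list(node(["a", "b", "a"]))
```

Passing module and class names as strings keeps the node independent of any particular model package; the type check is what guards against loading an arbitrary class by mistake.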

Creating a Graph

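import bonobo
from bonobo import LdjsonReader, LdjsonWriter
from model_etl.model_node import MLModelTransformer


def get_graph(**options):
    """Build the ETL graph: read LDJSON rows, predict with the model, write LDJSON rows."""
    graph = bonobo.Graph()
    graph.add_chain(
        LdjsonReader(options["input_file"], mode="r"),
        MLModelTransformer(module_name="iris_model.iris_predict", class_name="IrisModel"),
        LdjsonWriter(options["output_file"], mode="w"))
    return graph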
>>> from model_etl.etl_job import get_graph
>>> graph = get_graph(input_file="data/input.json", output_file="data/output.json")
>>> graph
<bonobo.structs.graphs.Graph object at 0x10a52ffd0>
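To make the data flow through the chain concrete, here is a minimal sketch of the same reader, transformer, writer pipeline written with plain Python generators. It does not use Bonobo, and Bonobo schedules nodes itself, so this sequential version is not how the library executes a graph. All function names are hypothetical, and `fake_model_node` uses a made-up threshold in place of a real model.

```python
import json


def read_ldjson(lines):
    """Parse each non-empty line as one JSON document."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


def transform(rows, node):
    """Apply a generator-style node to each row and forward everything it yields."""
    for row in rows:
        yield from node(row)


def write_ldjson(rows):
    """Serialize each row back to one JSON document per line."""
    return [json.dumps(row) for row in rows]


def fake_model_node(row):
    # Hypothetical stand-in for the model: labels a row by a made-up threshold.
    yield {"species": "setosa" if row["petal_length"] < 2.5 else "other"}


input_lines = [
    '{"petal_length": 1.4}',
    '{"petal_length": 5.1}',
]
output_lines = write_ldjson(transform(read_ldjson(input_lines), fake_model_node))
```

Each stage consumes the rows produced by the previous one, which is the role each node plays in the chain passed to `add_chain`.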

Running the ETL Process Locally

def get_argument_parser(parser=None):
    parser = bonobo.get_argument_parser(parser=parser)

    parser.add_argument("--input_file", "-i", type=str, default=None, help="Path of the input file.")

    parser.add_argument("--output_file", "-o", type=str, default=None, help="Path of the output file.")
    return parser


if __name__ == '__main__':
    parser = get_argument_parser()
    with bonobo.parse_args(parser) as options:
        bonobo.run(
            get_graph(**options),
            services={}
        )
export PYTHONPATH="${PYTHONPATH}:./"
python model_etl/etl_job.py --input_file=data/input.json --output_file=data/output.json
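The job reads its input with `LdjsonReader`, so the input file needs one JSON document per line. The following is a minimal sketch of producing such a file. The field names are taken from the interactive example earlier in the post; the second record's values and the temporary file location are made up for illustration.

```python
import json
import os
import tempfile

records = [
    {"sepal_length": 4.4, "sepal_width": 2.9, "petal_length": 1.4, "petal_width": 0.2},
    # The second record is illustrative only.
    {"sepal_length": 6.3, "sepal_width": 3.3, "petal_length": 6.0, "petal_width": 2.5},
]

# Write one JSON document per line (line-delimited JSON).
path = os.path.join(tempfile.mkdtemp(), "input.json")
with open(path, "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# Read the file back line by line to confirm that it round-trips.
with open(path) as f:
    round_trip = [json.loads(line) for line in f]
```

Unlike a single JSON array, this layout lets a reader hand rows to the next node one at a time without loading the whole file.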

Accessing Data from a Service

pip install fs-s3fs
from fs_s3fs import S3FS


def get_services(**options):
    # the 'fs' service replaces the local filesystem with an S3 bucket
    return {
        'fs': S3FS(options["bucket"],
                   aws_access_key_id=options["key"],
                   aws_secret_access_key=options["secret_key"],
                   endpoint_url=options["endpoint_url"],)
    }
def get_argument_parser(parser=None):
    parser = bonobo.get_argument_parser(parser=parser)

    parser.add_argument("--input_file", "-i", type=str, default=None, help="Path of the input file.")
    parser.add_argument("--output_file", "-o", type=str, default=None, help="Path of the output file.")
    # these parameters are added for accessing different S3 services
    parser.add_argument("--bucket", "-b", type=str, default=None, help="Bucket name in S3 service.")
    parser.add_argument("--key", "-k", type=str, default=None, help="Key to access S3 service.")
    parser.add_argument("--secret_key", "-sk", type=str, default=None, help="Secret key to access the S3 service.")
    parser.add_argument("--endpoint_url", "-ep", type=str, default=None, help="Endpoint URL for S3 service.")

    return parser


if __name__ == '__main__':
    parser = get_argument_parser()
    with bonobo.parse_args(parser) as options:
        bonobo.run(
            get_graph(**options),
            services=get_services(**options)
        )
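The S3 settings travel from the command line into `get_services` as a plain dictionary of options. Here is a minimal sketch of that mapping using only the standard library. It assumes a plain `argparse.ArgumentParser` in place of `bonobo.get_argument_parser`, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Declare the same S3-related options as the job's argument parser."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--bucket", "-b", type=str, default=None)
    parser.add_argument("--key", "-k", type=str, default=None)
    parser.add_argument("--secret_key", "-sk", type=str, default=None)
    parser.add_argument("--endpoint_url", "-ep", type=str, default=None)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is repeatable.
argv = ["--bucket=data", "--key=TEST", "-sk", "ASDFGHJKL",
        "--endpoint_url=http://127.0.0.1:9000/"]
options = vars(build_parser().parse_args(argv))
```

The dictionary keys match the option names, which is why `get_services` can look up `options["bucket"]`, `options["key"]`, and the rest directly.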
docker run -p 9000:9000 --name minio -e "MINIO_ACCESS_KEY=TEST" -e "MINIO_SECRET_KEY=ASDFGHJKL" -v /Users/brian/Code/etl-job-ml-model-deployment:/data minio/minio server /data
The minio web UI.
export PYTHONPATH="${PYTHONPATH}:./"
python model_etl/s3_etl_job.py --input_file=input.json --output_file=output.json --bucket=data --key=TEST --secret_key=ASDFGHJKL --endpoint_url=http://127.0.0.1:9000/

Closing

Coder and machine learning enthusiast
