
1.4. Modularity

Modularity means building analyses, code, and systems as a series of building blocks rather than one monolithic piece.

Rather than writing long scripts, the work can be broken down into smaller functions, each performing a specific task, which can then be tested and reused across other processes.

Helper functions

To query a SQL database, there's a series of steps: getting credentials, authorising, and connecting.

Rather than repeating these each time data is read, this logic could be put into a function, with a parameter for whether it's the test, dev or production database.

This function can then be tested and used across multiple processes, and as a bonus, logic can be added to detect the environment automatically, so the code connects to the right database depending on where it's being run.
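As a rough sketch of what this might look like in Python (the connection strings, the APP_ENV variable, and the use of SQLAlchemy are all illustrative assumptions, not a prescribed setup):

```python
import os

import sqlalchemy

# Illustrative connection strings; in practice these would come from a
# secrets manager or configuration, not be hard-coded.
CONNECTION_STRINGS = {
    "test": "postgresql://test-host/warehouse",
    "dev": "postgresql://dev-host/warehouse",
    "prod": "postgresql://prod-host/warehouse",
}


def detect_environment() -> str:
    """Detect the environment from a (hypothetical) APP_ENV variable."""
    return os.getenv("APP_ENV", "test")


def get_connection(environment: str | None = None):
    """Connect to the right database for the given (or detected) environment."""
    environment = environment or detect_environment()
    if environment not in CONNECTION_STRINGS:
        raise ValueError(f"Unknown environment: {environment!r}")
    engine = sqlalchemy.create_engine(CONNECTION_STRINGS[environment])
    return engine.connect()
```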

Modelling

A lot of the steps when building models can be wrapped up into modules for preprocessing, training, validation, deployment, and monitoring.

A configuration file can be created for each type of model to define any parameters, such as the name of the response variable, the loss function, the features to use, etc. This way each model uses the shared components with its own settings.
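A minimal sketch of what a per-model configuration and a shared training component might look like (the field names and the conversion model's settings are illustrative assumptions):

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    """Per-model settings read by the shared components."""
    name: str
    response: str           # name of the response variable
    features: list[str]     # features to use
    loss: str = "log_loss"  # loss function to optimise


# Each model is just a different configuration of the same shared pipeline.
conversion_config = ModelConfig(
    name="conversion",
    response="converted",
    features=["quote_price", "channel", "tenure"],
)


def train(config: ModelConfig, data):
    """Shared training component: all model-specific choices come from the config."""
    X, y = data[config.features], data[config.response]
    # Fit whichever estimator the team has standardised on, using the
    # configured loss; omitted here as it depends on the library in use.
    ...
```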

Not only can the code be modular, the model itself can be a modular system.

For example, a competitor model may retrain continually, producing validation packs that are sent out; if certain tests pass, the new model is automatically pushed to the production endpoint.

Any downstream process that requires scores from the model just needs to interact with the endpoint, rather than with the rest of the process.
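A sketch of how a downstream process might consume the scores (the endpoint URL and payload shape are made up for illustration):

```python
import requests

# Hypothetical scoring endpoint for the competitor model.
ENDPOINT = "https://models.internal.example/competitor/score"


def get_competitor_scores(quotes: list[dict]) -> list[float]:
    """Fetch scores from the endpoint; the caller never needs to know about
    the retraining, validation, or deployment machinery behind it."""
    response = requests.post(ENDPOINT, json={"instances": quotes}, timeout=10)
    response.raise_for_status()
    return response.json()["scores"]
```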

Impact Analysis

Proper impact analysis is a complex process, and simple assumptions such as a flat elasticity are unlikely to give an accurate assessment.

Building a module that properly estimates impacts, with calls to the conversion and renewal models, and that can be reused for each pricing change saves time (impacts don't need to be built from scratch), improves the quality of impact analysis, and keeps methodologies consistent across projects.
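As a sketch, such a module might look like the following (the model interface and column names are assumptions; a conversion model would slot in the same way for new business):

```python
import pandas as pd


def estimate_impact(book: pd.DataFrame, price_change: float, renewal_model) -> dict:
    """Expected premium under current vs proposed prices for a renewal book.

    `renewal_model.predict` is assumed to return a retention probability per
    policy, so the demand response comes from the model rather than from a
    flat elasticity assumption.
    """
    proposed = book.copy()
    proposed["premium"] *= 1 + price_change

    current_premium = (book["premium"] * renewal_model.predict(book)).sum()
    proposed_premium = (proposed["premium"] * renewal_model.predict(proposed)).sum()
    return {"current": current_premium, "proposed": proposed_premium}
```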