Strange Loop

Sept 22 - Sept 24, 2022


Union Station


St. Louis, MO

Mitigating Bias in ML Models with Constraints

We train machine learning models so that they learn relationships in the data. The expectation is that these learned latent relationships are true and, by extension, fair. But what if the data is biased, or the data is accurate but the truth it reflects is biased? Wouldn't we want to correct the bias?

There is a collection of lesser-known machine learning techniques called monotonic, interaction, and shape constraints. We can use them for bias mitigation and for injecting domain knowledge into the model, placing guardrails so that it reflects the truth we want it to reflect.

The first part of the session outlines the many reasons we would want to use constraints. During the second part, we will dive into a criminal recidivism prediction example. Authorities want to predict which defendants are at the highest risk of recidivism, and we know defendants with the most priors are the most likely to recidivate. Employing monotonic constraints in XGBoost and TensorFlow Lattice will place guardrails so that defendants with the fewest priors aren't unfairly classified as high risk. Additionally, we will examine interaction constraints, which let us restrict which feature interactions the model may learn, based on domain knowledge, to improve fairness.
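As a rough illustration of how such constraints might be declared, the sketch below builds the two XGBoost parameters involved, `monotone_constraints` and `interaction_constraints`, from feature names. The feature names, their order, and the helper `build_constraint_params` are hypothetical and chosen only for this example; the session's actual columns and code are not given here. It runs without XGBoost installed, since it only assembles the parameter dictionary.

```python
# Hypothetical feature order; the session description does not list its columns.
FEATURES = ["priors_count", "age", "charge_is_felony"]

# +1: the prediction may only rise as the feature rises.
# -1: the prediction may only fall. 0 (default): unconstrained.
MONOTONE = {"priors_count": 1}

# Features in the same group may interact with each other;
# features in different groups may not.
INTERACTION_GROUPS = [["priors_count", "charge_is_felony"], ["age"]]


def build_constraint_params(features, monotone, groups):
    """Translate name-based constraints into index-based XGBoost parameters.

    This helper is an assumption made for illustration, not part of XGBoost.
    """
    index = {name: i for i, name in enumerate(features)}
    # One sign per feature, in column order, e.g. "(1,0,0)".
    signs = ",".join(str(monotone.get(name, 0)) for name in features)
    return {
        "monotone_constraints": f"({signs})",
        "interaction_constraints": [[index[n] for n in g] for g in groups],
    }


params = build_constraint_params(FEATURES, MONOTONE, INTERACTION_GROUPS)
print(params)

# With XGBoost installed, these could then be passed to a model, for example:
#   xgboost.XGBClassifier(**params).fit(X[FEATURES], y)
# where X and y are the training data (not defined in this sketch).
```

Here the risk score is forced to be non-decreasing in the number of priors, and the second feature group keeps `age` from interacting with the other two features. Accepted formats for these parameters can vary between XGBoost versions, so it is worth checking the documentation for the installed release.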

Attendees are expected to have a fundamental knowledge of machine learning and the Python programming language. Attendees will use Pandas, XGBoost, and TensorFlow in this workshop.

Serg Masís



Serg Masís has been at the confluence of the internet, application development, and analytics for the last two decades. Currently, he's a Climate and Agronomic Data Scientist at Syngenta, a leading agribusiness company with a mission to improve global food security. Before that role, he co-founded a startup, incubated by Harvard Innovation Labs, that combined the power of cloud computing and machine learning with principles in decision-making science to expose users to new places and events. Whether it pertains to leisure activities, plant diseases, or customer lifetime value, Serg is passionate about providing the often-missing link between data and decision-making, and machine learning interpretation helps bridge this gap more robustly. His book, "Interpretable Machine Learning with Python", was released in March 2021 by UK-based publisher Packt.