Risk Scores for Worker Safety



Everyone wants to reduce workplace accidents: workers, companies, and the government.

Because accidents are rare, accident data would normally show a severe class imbalance, with far more non-accident records than accident records. In this project the imbalance ran the other way. Accident reporting is federally mandated, but companies are not required to share routine non-accident data, so we had all the accident reports and no non-accident data.


First, because no non-accident data was available, we helped the client build a process for generating synthetic datasets to use as machine learning training input until real datasets become available.
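As a rough illustration of the idea (not the process built for the client), a synthetic dataset can be produced by sampling worker records and drawing an accident label from a known probability model. The feature names, weights, and intercept below are hypothetical values chosen only for this sketch.

```python
import math
import random

# Hypothetical generating parameters, chosen only for illustration.
TRUE_WEIGHTS = {
    "hours_per_week": 0.08,     # longer weeks raise risk
    "years_experience": -0.15,  # experience lowers risk
    "safety_trained": -1.0,     # training lowers risk
}
TRUE_INTERCEPT = -3.0


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def make_worker(rng):
    """Sample one synthetic worker record (all features are assumed)."""
    return {
        "hours_per_week": rng.uniform(20, 60),
        "years_experience": rng.uniform(0, 30),
        "safety_trained": float(rng.random() < 0.7),
    }


def accident_probability(worker):
    """Accident probability under the known generating model."""
    z = TRUE_INTERCEPT + sum(w * worker[name] for name, w in TRUE_WEIGHTS.items())
    return sigmoid(z)


def generate_dataset(n, seed=0):
    """Return n labeled records; 'accident' is 1 or 0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        worker = make_worker(rng)
        worker["accident"] = int(rng.random() < accident_probability(worker))
        rows.append(worker)
    return rows


if __name__ == "__main__":
    data = generate_dataset(1000, seed=1)
    rate = sum(row["accident"] for row in data) / len(data)
    print(f"synthetic accident rate: {rate:.3f}")
```

Because the generating parameters are known, a model trained on such data can later be checked against them.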

Then we developed machine learning models to compute accident risk scores for each worker in the work environment.
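One simple way to turn a trained model into a per-worker risk score is to scale its predicted accident probability to a 0 to 100 range. The sketch below assumes a logistic regression fitted by plain gradient descent on a tiny hand-made dataset; the two features (fatigue and inexperience, each scaled to 0 to 1) and the model choice are assumptions for illustration, not the models used in the project.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def risk_score(features, w, b):
    """Predicted accident probability scaled to 0..100."""
    return 100.0 * sigmoid(b + sum(wj * xj for wj, xj in zip(w, features)))


# Toy training data: [fatigue, inexperience]; label 1 means an accident occurred.
X_TRAIN = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
Y_TRAIN = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    w, b = fit_logistic(X_TRAIN, Y_TRAIN)
    for name, feats in [("tired novice", [0.9, 0.9]), ("rested veteran", [0.1, 0.1])]:
        print(f"{name}: risk score {risk_score(feats, w, b):.1f}")
```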

We also developed explanatory models (XAI models) to help companies understand the generated risk scores. We developed both population (global) explanations and individual worker (instance) explanation models.
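To make the distinction between the two kinds of explanation concrete, here is a minimal sketch that assumes a linear risk model. An instance explanation attributes one worker's score to each feature as the weight times that worker's deviation from the population average; a global explanation averages the size of those attributions over all workers. The feature names and weights are hypothetical, and this additive attribution is a stand-in for whatever XAI technique the project actually used.

```python
def population_baseline(population):
    """Average value of each feature across the population."""
    names = population[0].keys()
    return {n: sum(p[n] for p in population) / len(population) for n in names}


def instance_explanation(worker, weights, baseline):
    """Per-feature contribution to one worker's score, relative to the average worker."""
    return {n: weights[n] * (worker[n] - baseline[n]) for n in weights}


def global_explanation(population, weights):
    """Mean absolute contribution of each feature across the population."""
    baseline = population_baseline(population)
    totals = {n: 0.0 for n in weights}
    for worker in population:
        for n, c in instance_explanation(worker, weights, baseline).items():
            totals[n] += abs(c)
    return {n: t / len(population) for n, t in totals.items()}


if __name__ == "__main__":
    # Hypothetical linear model weights and a two-worker population.
    weights = {"fatigue": 2.0, "inexperience": 1.0}
    workers = [
        {"fatigue": 0.2, "inexperience": 0.4},
        {"fatigue": 0.8, "inexperience": 0.6},
    ]
    base = population_baseline(workers)
    print("instance:", instance_explanation(workers[1], weights, base))
    print("global:  ", global_explanation(workers, weights))
```

The instance view answers "why is this worker's score high?", while the global view answers "which factors drive risk scores overall?".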

The goal was to build a development pipeline that trains machine learning models to make correct predictions on the synthetic datasets. Once real datasets are available, the pipeline will be rerun to build the final models.

All of our machine learning models were integrated into a user interface application developed by the client.


The machine learning models are identifying the same parameters that were used to generate the synthetic test datasets. Development is ongoing.