Transparency.dev Summit 2024
ML Model Signing: Cryptographically paving the way to transparency in machine learning models
🗓️ Friday 11 Oct 2024, 10:00am
📍 Ziggy Stardust

How do I know where my machine learning model came from, and how can I prove it? This question has remained largely unanswered even as adoption of machine learning and artificial intelligence has skyrocketed, with over 600,000 ML models freely available on repositories such as Hugging Face. Current cryptographic signing mechanisms were not designed with ML models in mind, and they are unfit for purpose largely due to one simple fact: a model is not a single file. It is a directory of disparate files (often totalling several hundred gigabytes or more), spanning many bespoke formats seen only in the machine learning context.

We present an open-source specification and implementation for cryptographically signing the arbitrary collection of files that comprise an ML model, giving end users a mechanism to verify a model's integrity and establishing trust between model producer and consumer. Model signing paves the way for model transparency, which strengthens the AI supply chain: one could see who trained the model, which training framework was used, which datasets went into it, and other useful provenance information.
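The core idea is to treat the whole directory as the unit of signing: hash every file into a deterministic manifest, then sign the manifest once, so that any added, removed, or modified file invalidates the signature. Below is a minimal sketch of that idea in Python, assuming Ed25519 keys from the `cryptography` library and a hypothetical `my-model` directory; it illustrates the technique only, not the specification or implementation presented in the talk (which must also handle multi-hundred-gigabyte models and bespoke formats).

# Minimal sketch: sign a model directory by hashing every file into a
# manifest and signing the manifest once, so the signature covers the
# whole collection rather than any single file.
# Requires: pip install cryptography
import hashlib
import json
from pathlib import Path

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric import ed25519


def build_manifest(model_dir: Path) -> bytes:
    """Hash every file under model_dir into a deterministic manifest."""
    digests = {}
    for path in sorted(model_dir.rglob("*")):
        if path.is_file():
            # Stream in chunks; model shards can be many gigabytes each.
            h = hashlib.sha256()
            with path.open("rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[str(path.relative_to(model_dir))] = h.hexdigest()
    # Canonical JSON so producer and verifier serialize identically.
    return json.dumps(digests, sort_keys=True).encode()


def sign_model(model_dir: Path, key: ed25519.Ed25519PrivateKey) -> bytes:
    """Sign the manifest, binding the signature to every file at once."""
    return key.sign(build_manifest(model_dir))


def verify_model(model_dir: Path, signature: bytes,
                 public_key: ed25519.Ed25519PublicKey) -> bool:
    """Recompute the manifest and check it against the signature."""
    try:
        public_key.verify(signature, build_manifest(model_dir))
        return True
    except InvalidSignature:
        return False


if __name__ == "__main__":
    key = ed25519.Ed25519PrivateKey.generate()
    model_dir = Path("my-model")  # hypothetical model directory
    sig = sign_model(model_dir, key)
    print("verified:", verify_model(model_dir, sig, key.public_key()))

Signing one manifest rather than each file keeps verification to a single signature check, no matter how many weight shards, tokenizer files, or configuration files the model ships with.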


Speaker

Mihai Maruseac is a member of the Google Open Source Security Team (GOSST), working on supply chain security for ML. Before joining GOSST, Mihai created the TensorFlow Security team after joining Google. Prior to that, he worked at a startup incorporating Differential Privacy (DP) into Machine Learning (ML) algorithms; the startup is now part of Snowflake. Mihai has a PhD in Differential Privacy from UMass Boston.