mltrace documentation

mltrace is an open-source Python tool to track data flow through various components and diagnose failure modes in ML pipelines. It offers the following:

  • coarse-grained lineage and tracing

  • Python API to log versions of data and pipeline components

  • database to store information about component runs

  • UI to show the trace of steps in a pipeline taken to produce an output
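
To make "the trace of steps taken to produce an output" concrete, here is a minimal sketch of coarse-grained lineage. It is not mltrace's API or storage schema: the `ComponentRun` record, the `trace` function, and the file and output names are all hypothetical, chosen only to illustrate walking backward from an output through the component runs that produced it.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentRun:
    """One logged execution of a pipeline component (hypothetical record)."""
    component: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


def trace(output_id: str, runs: list[ComponentRun]) -> list[str]:
    """Return component names in the order found when walking back from output_id."""
    # Map each output identifier to the run that produced it.
    producer = {out: run for run in runs for out in run.outputs}
    steps: list[str] = []
    seen: set[int] = set()
    frontier = [output_id]
    while frontier:
        current = frontier.pop()
        run = producer.get(current)
        # Skip raw inputs (no producer) and runs that were already visited.
        if run is None or id(run) in seen:
            continue
        seen.add(id(run))
        steps.append(run.component)
        # The inputs of this run are the outputs of upstream runs.
        frontier.extend(run.inputs)
    return steps


# Hypothetical three-step pipeline.
runs = [
    ComponentRun("cleaning", inputs=["raw.csv"], outputs=["clean.csv"]),
    ComponentRun("featurize", inputs=["clean.csv"], outputs=["features.pq"]),
    ComponentRun("inference", inputs=["features.pq"], outputs=["pred_001"]),
]
print(trace("pred_001", runs))  # ['inference', 'featurize', 'cleaning']
```

The lineage here is coarse-grained because it links whole inputs and outputs of component runs rather than individual records.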

mltrace is designed specifically for Agile or multidisciplinary teams collaborating on machine learning or complex data pipelines. A separate blog post explains in more detail why the tool was developed.

Design principles

  • Simplicity (users should know exactly what the tool does)

  • Reuse proven designs from other successful tools
    • Decorator design similar to Dagster solids

    • Logging design similar to MLflow Tracking

  • API designed for both engineers and data scientists

  • UI designed to help people triage issues even if they didn’t build the ETL or models themselves
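
The decorator-plus-logging idea from the principles above can be sketched in a few lines. This is an illustration of the general pattern only, not mltrace's actual decorator: `log_component_run`, `RUN_LOG`, and the `clean` function are hypothetical names, and an in-memory list stands in for a real database.

```python
import functools
import time

# Hypothetical in-memory stand-in for a database of component runs.
RUN_LOG: list[dict] = []


def log_component_run(component_name: str):
    """Hypothetical decorator: append one log entry per call of the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            result = func(*args, **kwargs)
            # Record what ran, when it ran, and what went in and came out.
            RUN_LOG.append({
                "component": component_name,
                "function": func.__name__,
                "start": start,
                "end": time.time(),
                "inputs": {"args": args, "kwargs": kwargs},
                "output": result,
            })
            return result
        return wrapper
    return decorator


@log_component_run("cleaning")
def clean(values):
    """Drop missing values."""
    return [v for v in values if v is not None]


cleaned = clean([1, None, 3])
print(cleaned)                   # [1, 3]
print(RUN_LOG[-1]["component"])  # cleaning
```

The appeal of this pattern is that the pipeline function itself stays unchanged; logging is added by a single line above its definition.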

Roadmap

We are actively working on the following:

  • REST API to log from any type of file, not just a Python file

  • Prometheus integrations to monitor component output distributions

  • Causal analysis for ML bugs: if you flag several outputs as mispredicted, which component runs did those outputs have in common, and which component is the most likely culprit?

  • Support for finer-grained lineage (at the record level)
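
As a rough illustration of the question posed in the causal analysis item, the sketch below counts how often each component run appears in the traces of flagged outputs. It is a naive frequency heuristic under stated assumptions, not the planned feature: `rank_suspects`, the run identifiers, and the output names are all hypothetical.

```python
from collections import Counter


def rank_suspects(flagged_traces: dict[str, set[str]]) -> list[tuple[str, int]]:
    """Rank component runs by how many flagged outputs they contributed to."""
    counts: Counter = Counter()
    for run_ids in flagged_traces.values():
        counts.update(run_ids)
    # Most frequent first; ties are broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical traces: flagged output -> component runs that produced it.
flagged = {
    "pred_001": {"cleaning#7", "featurize#12", "inference#30"},
    "pred_002": {"cleaning#7", "featurize#13", "inference#31"},
    "pred_003": {"cleaning#7", "featurize#12", "inference#32"},
}

for run_id, count in rank_suspects(flagged)[:2]:
    print(run_id, count)
# cleaning#7 3
# featurize#12 2
```

A shared run is only a candidate culprit: a run that also feeds many correct outputs would rank just as high here, so a real analysis would need to compare against unflagged outputs as well.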