What's the Dyfference?

Open science infrastructure for long-lived AI safety assessments

What you can do with Dyff

Create assessments

Create rich AI assessments using standard Python data science tools like Jupyter and PyArrow. Develop your assessments locally, then upload the final version to run against any AI system hosted on Dyff.

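from dyff.client import Client

# Connect to Dyff
client = Client()

# Register a dataset from a local Arrow dataset directory, then upload its files
dataset = client.datasets.create_arrow_dataset("/my/local/data")
client.datasets.upload_arrow_dataset(dataset, "/my/local/data")

# Register the Jupyter project as a module package, then upload it
jupyter_notebook = client.modules.create_package("/my/jupyter/proj")
client.modules.upload_package(jupyter_notebook, "/my/jupyter/proj")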

Why Dyff?

Assessment integrity

Safety assessment results are meaningless if the system under test has been trained on the test data. Dyff protects test data and selectively exposes safety assessment results so developers can't game the test.

Assessment lifetime

Dyff assessments are long-lived. Dyff is uncompromising on reproducibility: it stores every parameter and result from every test run, and it protects test data to preserve its validity over time.

Assessor viability

Dyff demonstrates a path to an economically sustainable evaluation ecosystem by providing a platform where assessors can develop, publish, and eventually market their assessments.

Get the Dyff client

python3 -m pip install dyff