Project: /responsible_ai/_project.yaml
Book: /responsible_ai/_book.yaml


# Integrating MinDiff with MinDiffModel

<div class="devsite-table-wrapper"><table class="tfo-notebook-buttons" align="left">
  <td><a target="_blank" href="https://www.tensorflow.org/responsible_ai/model_remediation/min_diff/guide/integrating_min_diff_with_min_diff_model">
  <img src="https://www.tensorflow.org/images/tf_logo_32px.png" />View on TensorFlow.org</a>
</td>
<td>
  <a target="_blank" href="https://colab.research.google.com/github/tensorflow/model-remediation/blob/master/docs/min_diff/guide/integrating_min_diff_with_min_diff_model.ipynb">
  <img src="https://www.tensorflow.org/images/colab_logo_32px.png">Run in Google Colab</a>
</td>
<td>
  <a target="_blank" href="https://github.com/tensorflow/model-remediation/blob/master/docs/min_diff/guide/integrating_min_diff_with_min_diff_model.ipynb">
  <img width=32px src="https://www.tensorflow.org/images/GitHub-Mark-32px.png">View source on GitHub</a>
</td>
<td>
  <a target="_blank" href="https://storage.googleapis.com/tensorflow_docs/model-remediation/docs/min_diff/guide/integrating_min_diff_with_min_diff_model.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" />Download notebook</a>
</td>
</table></div>

## Introduction

There are two steps to integrating MinDiff into your model:

1.   Prepare the data (covered in the [input preparation guide](./min_diff_data_preparation)).

2.   Alter or create a model that will integrate MinDiff during training.

This guide will cover the simplest way to complete the second step: using `MinDiffModel`.

## Setup

In [None]:
!pip install --upgrade tensorflow-model-remediation

In [None]:
import tensorflow as tf
tf.get_logger().setLevel('ERROR')  # Avoid TF warnings.
from tensorflow_model_remediation import min_diff
from tensorflow_model_remediation.tools.tutorials_utils import uci as tutorials_utils

First, download the data. For succinctness, the input preparation logic has been factored out into helper functions as described in the [input preparation guide](./min_diff_data_preparation#utility_functions_for_other_guides). You can read the full guide for details on this process.

In [None]:
# Original DataFrame for training, sampled at 0.3 for reduced runtimes.
train_df = tutorials_utils.get_uci_data(split='train', sample=0.3)

# Dataset needed to train with MinDiff.
train_with_min_diff_ds = (
    tutorials_utils.get_uci_with_min_diff_dataset(split='train', sample=0.3))

## Original Model

This guide uses a basic, untuned <a href="https://www.tensorflow.org/api_docs/python/tf/keras/Model"><code>keras.Model</code></a> built with the [Functional API](https://www.tensorflow.org/guide/keras/functional) to highlight using MinDiff. In a real-world application, you would carefully choose the model architecture and tune it to improve model quality before attempting to address any fairness issues.

Since `MinDiffModel` is designed to work with most Keras `Model` classes, we have factored out the logic of building the model into a helper function: `get_uci_model`.

### Training with a Pandas DataFrame

This guide trains for a single epoch to keep runtimes short. You could likely improve the model's performance by increasing the number of epochs.

In [None]:
model = tutorials_utils.get_uci_model()

model.compile(optimizer='adam', loss='binary_crossentropy')

df_without_target = train_df.drop(['target'], axis=1)  # Drop 'target' for x.
_ = model.fit(
    x=dict(df_without_target),  # The model expects a dictionary of features.
    y=train_df['target'],
    batch_size=128,
    epochs=1)

### Training with a <a href="https://www.tensorflow.org/api_docs/python/tf/data/Dataset"><code>tf.data.Dataset</code></a>

The equivalent training with a <a href="https://www.tensorflow.org/api_docs/python/tf/data/Dataset"><code>tf.data.Dataset</code></a> would look very similar (although initialization and input randomness may yield slightly different results).

In [None]:
model = tutorials_utils.get_uci_model()

model.compile(optimizer='adam', loss='binary_crossentropy')

_ = model.fit(
    tutorials_utils.df_to_dataset(train_df, batch_size=128),  # Converted to Dataset.
    epochs=1)

## Integrating MinDiff for training

Once the data has been prepared, apply MinDiff to your model with the following steps:

1.   Create the original model as you would without MinDiff.

In [None]:
original_model = tutorials_utils.get_uci_model()


2.   Wrap it in a `MinDiffModel`.

In [None]:
min_diff_model = min_diff.keras.MinDiffModel(
    original_model=original_model,
    loss=min_diff.losses.MMDLoss(),
    loss_weight=1)

3.   Compile it as you would without MinDiff.

In [None]:
min_diff_model.compile(optimizer='adam', loss='binary_crossentropy')

4.   Train it with the MinDiff dataset (`train_with_min_diff_ds` in this case).

In [None]:
_ = min_diff_model.fit(train_with_min_diff_ds, epochs=1)

## Evaluation and Prediction with `MinDiffModel`

Both evaluating and predicting with a `MinDiffModel` are similar to doing so with the original model.

When calling `evaluate`, you can pass in either the original dataset or the one containing MinDiff data. If you include MinDiff data in the call to `evaluate`, two things will differ:

*   An additional metric called `min_diff_loss` will be present in the output.
*   The value of the `loss` metric will be the sum of the original `loss` metric (not shown in the output) and the `min_diff_loss`.

In [None]:
_ = min_diff_model.evaluate(
    tutorials_utils.df_to_dataset(train_df, batch_size=128))
# Calling with MinDiff data will include min_diff_loss in metrics.
_ = min_diff_model.evaluate(train_with_min_diff_ds)
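
The relationship between the two metrics described above can be made concrete with a little arithmetic. The following is a minimal sketch in plain Python using made-up numbers. `recover_original_loss` is a hypothetical helper (not part of the library), and it assumes the metrics are available as a dictionary, for example from calling `evaluate` with `return_dict=True`.

```python
def recover_original_loss(metrics):
    """Returns the original loss from a dictionary of evaluation metrics.

    Hypothetical helper: when `min_diff_loss` is present, the reported `loss`
    is assumed to be the sum of the original loss and `min_diff_loss`.
    """
    if "min_diff_loss" not in metrics:
        # Evaluated without MinDiff data: `loss` is already the original loss.
        return metrics["loss"]
    return metrics["loss"] - metrics["min_diff_loss"]


# Made-up values for illustration only, not real results.
with_min_diff = {"loss": 0.5, "min_diff_loss": 0.125}
without_min_diff = {"loss": 0.375}

print(recover_original_loss(with_min_diff))
print(recover_original_loss(without_min_diff))
```

Subtracting `min_diff_loss` in this way is useful if you want to compare the original loss across the two kinds of evaluation calls.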

When calling `predict`, you can technically also pass in the dataset with the MinDiff data, but the MinDiff data will be ignored and will not affect the output.

In [None]:
_ = min_diff_model.predict(
    tutorials_utils.df_to_dataset(train_df, batch_size=128))
_ = min_diff_model.predict(train_with_min_diff_ds)  # Identical to results above.

## Limitations of using `MinDiffModel` directly

When using `MinDiffModel` as described above, most methods will use the default implementations of <a href="https://www.tensorflow.org/api_docs/python/tf/keras/Model"><code>tf.keras.Model</code></a> (exceptions listed in the [API documentation](https://www.tensorflow.org/responsible_ai/model_remediation/api_docs/python/model_remediation/min_diff/keras/MinDiffModel#methods)).

In [None]:
print('MinDiffModel.fit == keras.Model.fit')
print(min_diff.keras.MinDiffModel.fit == tf.keras.Model.fit)
print('MinDiffModel.train_step == keras.Model.train_step')
print(min_diff.keras.MinDiffModel.train_step == tf.keras.Model.train_step)

For <a href="https://www.tensorflow.org/api_docs/python/tf/keras/Sequential"><code>keras.Sequential</code></a> or <a href="https://www.tensorflow.org/api_docs/python/tf/keras/Model"><code>keras.Model</code></a>, this is perfectly fine since they use the same functions.

In [None]:
print('Sequential.fit == keras.Model.fit')
print(tf.keras.Sequential.fit == tf.keras.Model.fit)
print('Sequential.train_step == keras.Model.train_step')
print(tf.keras.Sequential.train_step == tf.keras.Model.train_step)

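However, if your model is a [subclass of `keras.Model`](https://www.tensorflow.org/guide/keras/custom_layers_and_models#the_model_class) that overrides methods such as `train_step`, wrapping it with `MinDiffModel` will effectively lose the customization, because the wrapper uses the default `keras.Model` implementations rather than your overrides.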

In [None]:
class CustomModel(tf.keras.Model):

  def train_step(self, data):
    pass  # Custom implementation.

print('CustomModel.train_step == keras.Model.train_step')
print(CustomModel.train_step == tf.keras.Model.train_step)

If this is your use case, you should not use `MinDiffModel` directly. Instead, you will need to subclass it as described in the [customization guide](./customizing_min_diff_model).
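
To see why wrapping loses an overridden method, here is a minimal sketch in plain Python. The classes `BaseModel`, `CustomModel`, and `WrapperModel` and the helper `uses_default` are hypothetical stand-ins invented for illustration. They are not the Keras or MinDiff classes, and they only mimic the inheritance pattern discussed above.

```python
class BaseModel:
    """Stand-in for a base model class (NOT the real keras.Model)."""

    def train_step(self, data):
        return "default train_step"


class CustomModel(BaseModel):
    """Stand-in for a user subclass that customizes train_step."""

    def train_step(self, data):
        return "custom train_step"


class WrapperModel(BaseModel):
    """Stand-in for a wrapper: stores the original model, inherits train_step."""

    def __init__(self, original_model):
        self.original_model = original_model


def uses_default(cls, base, method_name):
    """Returns True if `cls` does not override `method_name` from `base`."""
    return getattr(cls, method_name) is getattr(base, method_name)


wrapped = WrapperModel(CustomModel())

print(uses_default(WrapperModel, BaseModel, "train_step"))
print(uses_default(CustomModel, BaseModel, "train_step"))
# The wrapper runs the base implementation, not the custom one it holds.
print(wrapped.train_step(data=None))
print(wrapped.original_model.train_step(data=None))
```

In this toy setup the wrapper never calls the wrapped object's `train_step`, which mirrors the reason subclassing is recommended when your model relies on custom behavior.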

## Additional Resources

*   For an in-depth discussion on fairness evaluation, see the [Fairness Indicators guidance](https://www.tensorflow.org/responsible_ai/fairness_indicators/guide/guidance).
*   For general information on Remediation and MinDiff, see the [remediation overview](https://www.tensorflow.org/responsible_ai/model_remediation).
*   For details on requirements surrounding MinDiff, see [this guide](https://www.tensorflow.org/responsible_ai/model_remediation/min_diff/guide/requirements).
*   To see an end-to-end tutorial on using MinDiff in Keras, see [this tutorial](https://www.tensorflow.org/responsible_ai/model_remediation/min_diff/tutorials/min_diff_keras).