tf.quantization.experimental.quantize_saved_model

Quantizes the SavedModel with the given quantization options.

Example usage:

# Quantizing a model trained with QAT.
quantization_options = tf.quantization.experimental.QuantizationOptions(
    signature_keys=['your_signature_key'],
)
tf.quantization.experimental.quantize_saved_model(
    '/tmp/input_model',
    '/tmp/output_model',
    quantization_options=quantization_options,
)

# When quantizing a model trained without QAT (Post-Training Quantization),
# a representative dataset is required.
representative_dataset = [{"input": tf.random.uniform(shape=(3, 3))}
                          for _ in range(256)]
tf.quantization.experimental.quantize_saved_model(
    '/tmp/input_model',
    '/tmp/output_model',
    quantization_options=quantization_options,
    representative_dataset={'your_signature_key': representative_dataset},
)
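The representative dataset can also be supplied as a generator that yields one {input_key: input_value} dictionary per calibration sample, which avoids materializing all samples in memory at once. A minimal sketch (plain Python lists stand in for the tensors; the key 'input', the sample count, and the helper name are placeholder choices, not part of the API):

```python
def representative_dataset_gen(num_samples=256):
    """Yield one calibration sample at a time instead of building a list."""
    for i in range(num_samples):
        # In real use this would be a tf.Tensor or ndarray matching the
        # signature's input spec; a 3x3 nested list stands in here.
        yield {"input": [[float(i)] * 3 for _ in range(3)]}

# The generator itself can be passed wherever an iterator of
# {input_key: input_value} dictionaries is accepted.
samples = list(representative_dataset_gen(4))
```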

# In addition to preset quantization methods, fine-grained control of
# quantization for each component is also supported.
_QuantizationComponentSpec = (
    tf.quantization.experimental.QuantizationComponentSpec
)
quantization_options = tf.quantization.experimental.QuantizationOptions(
    signature_keys=['your_signature_key'],
    quantization_method=tf.quantization.experimental.QuantizationMethod(
        quantization_component_specs=[
            _QuantizationComponentSpec(
                quantization_component=(
                    _QuantizationComponentSpec.COMPONENT_ACTIVATION
                ),
                tensor_type=_QuantizationComponentSpec.TENSORTYPE_INT_8,
            )
        ]
    )
)
tf.quantization.experimental.quantize_saved_model(
    '/tmp/input_model',
    '/tmp/output_model',
    quantization_options=quantization_options,
)

Args:
saved_model_path: Path to the SavedModel. When representative_dataset is not provided, this should be a model trained with QAT.
output_directory: Path to save the output SavedModel. Set overwrite_output_directory to True to overwrite any existing contents in the directory if it is not empty.
quantization_options: A set of options for quantization. If None, post-training static range quantization with the XLA opset is used by default.
representative_dataset: An iterator that returns a dictionary of {input_key: input_value}, or a map from signature key to such a dictionary, that feeds calibration data for quantizing the model. A representative dataset is required when the model is quantized with PTQ. It can be provided either via this parameter or via the representative_datasets field in QuantizationOptions, but not both.
overwrite_output_directory: If set to True, overwrites the output directory if it is not empty. Defaults to False.

Returns:
A SavedModel object with TF quantization applied, or None if no quantization is performed.

Raises:
ValueError: When 1) representative_dataset is not provided for a non-QAT model, so static range quantization cannot be calibrated, 2) an invalid value is provided as a quantization method, or 3) a representative dataset is provided via both the argument and QuantizationOptions.
ValueError: When the specified quantization method is not yet supported.