Quantizes the SavedModel with the given quantization options.
tf.quantization.experimental.quantize_saved_model(
    saved_model_path: str,
    output_directory: Optional[str] = None,
    quantization_options: Optional[
        tf.quantization.experimental.QuantizationOptions
    ] = None,
    representative_dataset: Optional[
        repr_dataset.RepresentativeDatasetOrMapping
    ] = None,
    *,
    overwrite_output_directory: bool = False
) -> autotrackable.AutoTrackable
Example usage:
import tensorflow as tf

# Quantizing a model trained with QAT.
quantization_options = tf.quantization.experimental.QuantizationOptions(
    signature_keys=['your_signature_key'],
)
tf.quantization.experimental.quantize_saved_model(
    '/tmp/input_model',
    '/tmp/output_model',
    quantization_options=quantization_options,
)
# When quantizing a model trained without QAT (Post-Training Quantization),
# a representative dataset is required.
representative_dataset = [
    {'input': tf.random.uniform(shape=(3, 3))} for _ in range(256)
]
tf.quantization.experimental.quantize_saved_model(
    '/tmp/input_model',
    '/tmp/output_model',
    quantization_options=quantization_options,
    representative_dataset={'your_signature_key': representative_dataset},
)
# In addition to preset quantization methods, fine-grained control of
# quantization for each component is also supported.
_QuantizationComponentSpec = (
    tf.quantization.experimental.QuantizationComponentSpec
)
quantization_options = tf.quantization.experimental.QuantizationOptions(
    signature_keys=['your_signature_key'],
    quantization_method=tf.quantization.experimental.QuantizationMethod(
        quantization_component_specs=[
            _QuantizationComponentSpec(
                quantization_component=(
                    _QuantizationComponentSpec.COMPONENT_ACTIVATION
                ),
                tensor_type=_QuantizationComponentSpec.TENSORTYPE_INT_8,
            )
        ]
    )
)
tf.quantization.experimental.quantize_saved_model(
    '/tmp/input_model',
    '/tmp/output_model',
    quantization_options=quantization_options,
)
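The calls above all write to '/tmp/output_model', and the output_directory argument states that existing contents are only overwritten when overwrite_output_directory is True. A minimal sketch of a pre-check follows. It assumes a hypothetical helper named needs_overwrite (not part of TensorFlow) and demonstrates it on temporary directories rather than a real model path.

```python
import os
import tempfile


def needs_overwrite(output_directory: str) -> bool:
    """Hypothetical helper: True if the directory exists and is not empty."""
    return os.path.isdir(output_directory) and bool(os.listdir(output_directory))


# Demonstrate on throwaway directories instead of a real model path.
with tempfile.TemporaryDirectory() as empty_dir:
    empty_result = needs_overwrite(empty_dir)

with tempfile.TemporaryDirectory() as used_dir:
    # Simulate the leftovers of an earlier quantization run.
    with open(os.path.join(used_dir, "saved_model.pb"), "w") as f:
        f.write("placeholder")
    used_result = needs_overwrite(used_dir)

print(empty_result, used_result)
```

A caller might use the result to decide between passing overwrite_output_directory=True (a keyword-only argument per the signature) and choosing a fresh output directory.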
Args |
saved_model_path | Path to the saved model. When representative_dataset is not provided, this should be a model trained with QAT. |
output_directory | Path where the output SavedModel is saved. Set overwrite_output_directory to True to overwrite any existing contents if the directory is not empty. |
quantization_options | A set of options for quantization. If None, post-training static range quantization with the XLA opset is used by default. |
representative_dataset | An iterable that yields dictionaries of {input_key: input_value}, or a map from signature key to such an iterable, which feeds calibration data for quantizing the model. The representative dataset should be provided when the model is a PTQ model. It can be provided either via this parameter or via the representative_datasets field in QuantizationOptions, but not both. |
overwrite_output_directory | If True, overwrites the output directory if it is not empty. Defaults to False. |
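The two accepted forms of representative_dataset can be sketched with plain Python data structures. This is only an illustration under stated assumptions: the helper make_samples is hypothetical, the keys 'input' and 'your_signature_key' are the placeholders from the example usage, and nested Python lists stand in for the tf.random.uniform tensors used there so that the sketch runs without TensorFlow.

```python
import random
from typing import Dict, Iterator, List

Sample = Dict[str, List[List[float]]]


def make_samples(num_samples: int, rows: int = 3, cols: int = 3,
                 input_key: str = "input", seed: int = 0) -> Iterator[Sample]:
    """Hypothetical helper: yields one {input_key: input_value} per sample."""
    rng = random.Random(seed)
    for _ in range(num_samples):
        # Stand-in for a (rows, cols) tensor of uniform random values.
        value = [[rng.random() for _ in range(cols)] for _ in range(rows)]
        yield {input_key: value}


# Form 1: a single iterable of sample dictionaries.
dataset = list(make_samples(num_samples=256))

# Form 2: a map from signature key to such an iterable.
dataset_map = {"your_signature_key": dataset}

print(len(dataset), sorted(dataset[0].keys()), len(dataset[0]["input"]))
```

In a real call, the values would match the input shapes and dtypes of the model's signature, and either form would be passed as representative_dataset.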
Returns |
A SavedModel object with TF quantization applied, or None if no quantization is performed. |
Raises |
ValueError | When 1) representative_dataset is not provided for a non-QAT model when enabling static range quantization, 2) an invalid value is provided as the quantization method, or 3) a representative dataset is provided via both the argument and QuantizationOptions. |
ValueError | When the specified quantization method is not yet supported. |
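Taken together, the Returns and Raises entries suggest that a caller may want to handle both a ValueError and a None result. A minimal sketch, assuming a hypothetical wrapper named quantize_or_report that receives the quantization function as a parameter, so it can be exercised here with stand-in functions instead of TensorFlow:

```python
from typing import Any, Callable, Optional, Tuple


def quantize_or_report(quantize_fn: Callable[..., Any],
                       *args: Any, **kwargs: Any) -> Tuple[Optional[Any], str]:
    """Hypothetical wrapper: returns (model, status) instead of raising."""
    try:
        model = quantize_fn(*args, **kwargs)
    except ValueError as err:
        # Covers the documented cases, such as a missing representative
        # dataset or an unsupported quantization method.
        return None, f"invalid configuration: {err}"
    if model is None:
        # Documented return value when no quantization is performed.
        return None, "no quantization performed"
    return model, "ok"


# Stand-in functions used only to exercise the wrapper.
def fake_missing_dataset(*args: Any, **kwargs: Any) -> Any:
    raise ValueError("representative_dataset is required")


def fake_noop(*args: Any, **kwargs: Any) -> Any:
    return None


missing_result = quantize_or_report(fake_missing_dataset, "/tmp/input_model")
noop_result = quantize_or_report(fake_noop, "/tmp/input_model")
print(missing_result)
print(noop_result)
```

With TensorFlow installed, the first argument would be tf.quantization.experimental.quantize_saved_model, followed by the same arguments shown in the example usage.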