Returns a matrix to warp linear scale spectrograms to the mel scale.
tf.signal.linear_to_mel_weight_matrix(
num_mel_bins=20,
num_spectrogram_bins=129,
sample_rate=8000,
lower_edge_hertz=125.0,
upper_edge_hertz=3800.0,
dtype=tf.dtypes.float32,
name=None
)
Returns a weight matrix that can be used to re-weight a Tensor containing num_spectrogram_bins linearly sampled frequency bins spanning [0, sample_rate / 2] into num_mel_bins frequency bands spanning [lower_edge_hertz, upper_edge_hertz] on the mel scale.
This function follows the Hidden Markov Model Toolkit (HTK) convention, defining the mel scale in terms of a frequency in hertz according to the following formula:
$$\textrm{mel}(f) = 2595 * \textrm{log}_{10}(1 + \frac{f}{700})$$
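To make the formula concrete, here is a minimal pure-Python sketch of the conversion. The helper names hz_to_mel and mel_to_hz are hypothetical and are not part of the TensorFlow API; mel_to_hz is simply the algebraic inverse of the formula above, included for illustration.

```python
import math

# Hypothetical helpers for illustration only; not part of tf.signal.

def hz_to_mel(f_hz: float) -> float:
    """HTK-style mel value for a frequency in hertz (formula above)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel: float) -> float:
    """Algebraic inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

# The default band edges from the signature, expressed on the mel scale.
lower_mel = hz_to_mel(125.0)
upper_mel = hz_to_mel(3800.0)
print(lower_mel, upper_mel)
```

Under this formula, 1000 Hz maps to roughly 1000 mel, which is a quick sanity check for any reimplementation.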
In the returned matrix, all the triangles (filterbanks) have a peak value of 1.0.
For example, the returned matrix A can be used to right-multiply a spectrogram S of shape [frames, num_spectrogram_bins] of linear scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram" M of shape [frames, num_mel_bins].
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
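A fuller sketch of the same multiplication, starting from a raw signal, might look like the following. The random input signal and the STFT settings (frame_length=256, frame_step=128, fft_length=256) are assumptions chosen so that the spectrogram has 129 bins, matching the default num_spectrogram_bins; they are not prescribed by this function.

```python
import tensorflow as tf

sample_rate = 8000
# Placeholder input: one second of noise standing in for real audio.
signal = tf.random.normal([sample_rate])

# With fft_length=256 the STFT yields 256 / 2 + 1 = 129 frequency bins.
stft = tf.signal.stft(signal, frame_length=256, frame_step=128, fft_length=256)
S = tf.abs(stft)  # magnitudes, shape [frames, 129]

A = tf.signal.linear_to_mel_weight_matrix(
    num_mel_bins=20,
    num_spectrogram_bins=S.shape[-1],
    sample_rate=sample_rate,
    lower_edge_hertz=125.0,
    upper_edge_hertz=3800.0,
)

# Right-multiply the spectrogram by the weight matrix.
M = tf.matmul(S, A)  # shape [frames, 20]
print(S.shape, A.shape, M.shape)
```

Taking num_spectrogram_bins from S.shape[-1] keeps the matrix consistent with whatever fft_length is used upstream.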
The matrix can be used with tf.tensordot to convert an arbitrary rank Tensor of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
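As an illustration of the arbitrary-rank case, the sketch below applies the matrix to a batched input. The batch of 2 examples with 5 frames each and the random values are placeholders assumed for the example, not real spectrogram data.

```python
import tensorflow as tf

A = tf.signal.linear_to_mel_weight_matrix(
    num_mel_bins=20,
    num_spectrogram_bins=129,
    sample_rate=8000,
    lower_edge_hertz=125.0,
    upper_edge_hertz=3800.0,
)

# Placeholder batch: [batch, frames, num_spectrogram_bins].
S = tf.random.uniform([2, 5, 129])

# Contract the last axis of S with the first axis of A.
M = tf.tensordot(S, A, 1)  # shape [batch, frames, num_mel_bins]
print(M.shape)
```

Only the last axis of S is contracted, so any number of leading dimensions passes through unchanged.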
Returns | |
---|---|
A Tensor of shape [num_spectrogram_bins, num_mel_bins]. | |