tf.signal.inverse_mdct


Computes the inverse modified DCT of mdcts.

To reconstruct an original waveform, the same window function should be used with mdct and inverse_mdct.

Example usage:

import numpy as np
import tensorflow as tf

@tf.function
def compare_round_trip():
  samples = 1000
  frame_length = 400
  halflen = frame_length // 2
  waveform = tf.random.normal(dtype=tf.float32, shape=[samples])
  waveform_pad = tf.pad(waveform, [[halflen, 0],])
  mdct = tf.signal.mdct(waveform_pad, frame_length, pad_end=True,
                        window_fn=tf.signal.vorbis_window)
  inverse_mdct = tf.signal.inverse_mdct(mdct,
                                        window_fn=tf.signal.vorbis_window)
  inverse_mdct = inverse_mdct[halflen: halflen + samples]
  return waveform, inverse_mdct
waveform, inverse_mdct = compare_round_trip()
# Evaluates to True: the round trip reproduces the waveform within tolerance.
np.allclose(waveform.numpy(), inverse_mdct.numpy(), rtol=1e-3, atol=1e-4)

Implemented with TPU/GPU-compatible ops and supports gradients.

Args:
mdcts: A float32/float64 [..., frames, frame_length // 2] Tensor of MDCT bins representing a batch of frame_length // 2-point MDCTs.
window_fn: A callable that takes a frame_length and a dtype keyword argument and returns a [frame_length] Tensor of samples in the provided datatype. If set to None, a rectangular window with a scale of 1/sqrt(2) is used. For perfect reconstruction of a signal from mdct followed by inverse_mdct, use tf.signal.vorbis_window, tf.signal.kaiser_bessel_derived_window or None. If using another window function, make sure that w[n]^2 + w[n + frame_length // 2]^2 = 1 and w[n] = w[frame_length - n - 1] for n = 0,...,frame_length // 2 - 1 to achieve perfect reconstruction.
norm: If "ortho", an orthonormal inverse DCT4 is performed. If None, a regular DCT4 followed by a scaling of 1/frame_length is performed.
name: An optional name for the operation.
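
The two window conditions given for window_fn can be checked numerically before a custom window is used. The following is a minimal sketch in plain Python, not TensorFlow code: the helper names vorbis_window and allows_perfect_reconstruction are hypothetical, the Vorbis window is written out from its standard formula rather than taken from tf.signal.vorbis_window, and the tolerance is an arbitrary choice.

```python
import math


def vorbis_window(frame_length):
    """Vorbis window written out in plain Python (illustrative, not the TF op)."""
    return [
        math.sin(math.pi / 2.0
                 * math.sin(math.pi * (n + 0.5) / frame_length) ** 2)
        for n in range(frame_length)
    ]


def allows_perfect_reconstruction(w, tol=1e-9):
    """Check w[n]^2 + w[n + N//2]^2 = 1 and w[n] = w[N - n - 1] for n < N//2."""
    frame_length = len(w)
    half = frame_length // 2
    for n in range(half):
        power_ok = abs(w[n] ** 2 + w[n + half] ** 2 - 1.0) <= tol
        symmetric = abs(w[n] - w[frame_length - n - 1]) <= tol
        if not (power_ok and symmetric):
            return False
    return True


frame_length = 400
vorbis = vorbis_window(frame_length)
# Rectangular window scaled by 1/sqrt(2), as described for window_fn=None.
rectangular = [1.0 / math.sqrt(2.0)] * frame_length
# A periodic Hann window, included only as a contrasting case.
hann = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_length)
        for n in range(frame_length)]

print(allows_perfect_reconstruction(vorbis))
print(allows_perfect_reconstruction(rectangular))
print(allows_perfect_reconstruction(hann))
```

Under these assumptions the Vorbis and scaled rectangular windows pass both checks, while the plain Hann window does not, so it would not be expected to give perfect reconstruction if passed as window_fn.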

Returns:
A [..., samples] Tensor of float32/float64 signals representing the inverse MDCT for each input MDCT in mdcts, where samples is (frames - 1) * (frame_length // 2) + frame_length.
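
The output length formula can be worked through with plain integer arithmetic. This is a small sketch, not part of the TensorFlow API: both helper names are hypothetical, and the second helper assumes frame_length is even so that it can be recovered as twice the last dimension of mdcts.

```python
def inverse_mdct_num_samples(frames, frame_length):
    """Output length: frames overlap by half, so each extra frame adds frame_length // 2."""
    return (frames - 1) * (frame_length // 2) + frame_length


def mdcts_shape_to_samples(shape):
    """Hypothetical helper: samples for an mdcts shape [..., frames, frame_length // 2]."""
    if len(shape) < 2:
        raise ValueError("mdcts must be at least rank 2")
    frames, half = shape[-2], shape[-1]
    # Assumption: frame_length is even, so it equals twice the number of bins.
    return inverse_mdct_num_samples(frames, 2 * half)


# 5 frames of length 400: 4 * 200 + 400 = 1200 samples.
print(inverse_mdct_num_samples(5, 400))
# A batch of 3 signals with 5 frames of 200 bins each has the same length.
print(mdcts_shape_to_samples((3, 5, 200)))
```

Any leading batch dimensions are carried through unchanged; only the last two dimensions determine samples.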

Raises:
ValueError: If mdcts is not at least rank 2.