Transforms a spectrogram into a form that's useful for speech recognition.
Mel-frequency cepstral coefficients (MFCCs) represent audio data in a form that has proven effective as an input feature for machine learning. They are computed by taking the spectrum of a spectrogram (a 'cepstrum') and discarding some of the higher frequencies that are less significant to the human ear. They have a long history in speech recognition, and https://en.wikipedia.org/wiki/Mel-frequency_cepstrum is a good resource to learn more.
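The "spectrum of a spectrogram" step can be sketched in plain Java. This is a minimal illustration, not the TensorFlow kernel: it assumes the mel filterbank energies for one time slice are already available, takes their logarithm, applies an unnormalized DCT-II, and keeps only the first few coefficients. The class name `CepstrumSketch`, the method name, and the `1e-12` floor are hypothetical choices made for this example.

```java
// Illustrative sketch only; names and constants are assumptions, not TensorFlow API.
final class CepstrumSketch {

    /**
     * Takes the log of each mel filterbank energy, applies an unnormalized
     * DCT-II, and returns only the first {@code keep} coefficients.
     */
    static double[] cepstralCoefficients(double[] melEnergies, int keep) {
        int n = melEnergies.length;
        if (keep < 0 || keep > n) {
            throw new IllegalArgumentException("keep must be in [0, " + n + "]");
        }
        double[] logEnergies = new double[n];
        for (int i = 0; i < n; i++) {
            // Floor tiny values so log() never sees zero (assumed floor value).
            logEnergies[i] = Math.log(Math.max(melEnergies[i], 1e-12));
        }
        double[] coefficients = new double[keep];
        for (int k = 0; k < keep; k++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                sum += logEnergies[i] * Math.cos(Math.PI * k * (i + 0.5) / n);
            }
            coefficients[k] = sum;
        }
        return coefficients;
    }

    public static void main(String[] args) {
        // Made-up energies for eight mel channels of one time slice.
        double[] energies = {4.0, 2.5, 1.2, 0.6, 0.3, 0.2, 0.1, 0.05};
        double[] mfcc = cepstralCoefficients(energies, 4);
        System.out.println(java.util.Arrays.toString(mfcc));
    }
}
```

Dropping the higher-order coefficients is what discards the fine spectral detail; the number kept plays the role that `dctCoefficientCount` plays for this op.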
Nested Classes
class | Mfcc.Options | Optional attributes for Mfcc |
Constants
String | OP_NAME | The name of this op, as known by the TensorFlow core engine |
Public Methods
Output < TFloat32 > | asOutput () | Returns the symbolic handle of the tensor. |
static Mfcc | create ( Scope scope, Operand < TFloat32 > spectrogram, Operand < TInt32 > sampleRate, Options... options) | Factory method to create a class wrapping a new Mfcc operation. |
static Mfcc.Options | dctCoefficientCount (Long dctCoefficientCount) | |
static Mfcc.Options | filterbankChannelCount (Long filterbankChannelCount) | |
static Mfcc.Options | lowerFrequencyLimit (Float lowerFrequencyLimit) | |
Output < TFloat32 > | output () | |
static Mfcc.Options | upperFrequencyLimit (Float upperFrequencyLimit) | |
Constants
public static final String OP_NAME
The name of this op, as known by the TensorFlow core engine
Public Methods
public Output < TFloat32 > asOutput ()
Returns the symbolic handle of the tensor.
Inputs to TensorFlow operations are outputs of another TensorFlow operation. This method is used to obtain a symbolic handle that represents the computation of the input.
public static Mfcc create ( Scope scope, Operand < TFloat32 > spectrogram, Operand < TInt32 > sampleRate, Options... options)
Factory method to create a class wrapping a new Mfcc operation.
Parameters
scope | current scope |
---|---|
spectrogram | Typically produced by the Spectrogram op, with magnitude_squared set to true. |
sampleRate | How many samples per second the source audio used. |
options | carries optional attribute values |
Returns
- a new instance of Mfcc
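The `spectrogram` input is expected to hold squared magnitudes. As a rough illustration of what one time slice of such a spectrogram contains, the sketch below computes squared DFT magnitudes for a single frame in plain Java. It is not the Spectrogram op: the class `PowerSpectrumSketch`, the naive O(n²) DFT, and the absence of windowing are all simplifying assumptions made for this example.

```java
// Illustrative sketch only; a naive DFT stands in for a real spectrogram op.
final class PowerSpectrumSketch {

    /** Returns squared magnitudes for DFT bins 0..frame.length/2 of one frame. */
    static double[] powerSpectrum(double[] frame) {
        int n = frame.length;
        double[] power = new double[n / 2 + 1];
        for (int k = 0; k < power.length; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im -= frame[t] * Math.sin(angle);
            }
            // Squared magnitude: no square root is taken.
            power[k] = re * re + im * im;
        }
        return power;
    }

    public static void main(String[] args) {
        // A 16-sample frame holding exactly three cycles of a cosine.
        double[] frame = new double[16];
        for (int t = 0; t < frame.length; t++) {
            frame[t] = Math.cos(2.0 * Math.PI * 3 * t / frame.length);
        }
        System.out.println(java.util.Arrays.toString(powerSpectrum(frame)));
    }
}
```

In this frame the energy concentrates in bin 3, which is the kind of per-slice data the Mfcc op then maps onto the mel scale.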
public static Mfcc.Options dctCoefficientCount (Long dctCoefficientCount)
Parameters
dctCoefficientCount | How many output channels to produce per time slice. |
---|
public static Mfcc.Options filterbankChannelCount (Long filterbankChannelCount)
Parameters
filterbankChannelCount | Resolution of the Mel bank used internally. |
---|
public static Mfcc.Options lowerFrequencyLimit (Float lowerFrequencyLimit)
Parameters
lowerFrequencyLimit | The lowest frequency to use when calculating the cepstrum. |
---|
public static Mfcc.Options upperFrequencyLimit (Float upperFrequencyLimit)
Parameters
upperFrequencyLimit | The highest frequency to use when calculating the cepstrum. |
---|
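To see how the frequency limits and the filterbank channel count interact, the sketch below spaces a number of channel centers evenly on the mel scale between a lower and an upper limit. It is a plain-Java illustration, not the op's implementation: the class `MelScaleSketch` is hypothetical, the conversion `mel = 1127 * ln(1 + hz / 700)` is one commonly used mel formula assumed here, and the values 20 Hz, 4000 Hz, and 40 channels are example inputs only.

```java
// Illustrative sketch only; the mel formula and all values are assumptions.
final class MelScaleSketch {

    /** Converts a frequency in Hz to mels using a common log-based formula. */
    static double hzToMel(double hz) {
        return 1127.0 * Math.log1p(hz / 700.0);
    }

    /** Inverse of {@link #hzToMel(double)}. */
    static double melToHz(double mel) {
        return 700.0 * Math.expm1(mel / 1127.0);
    }

    /**
     * Returns the center frequencies (in Hz) of {@code channelCount} bands
     * spaced evenly on the mel scale, strictly between the two limits.
     */
    static double[] centerFrequencies(double lowerHz, double upperHz, int channelCount) {
        if (lowerHz < 0.0 || lowerHz >= upperHz || channelCount < 1) {
            throw new IllegalArgumentException("invalid limits or channel count");
        }
        double melLow = hzToMel(lowerHz);
        double melHigh = hzToMel(upperHz);
        double step = (melHigh - melLow) / (channelCount + 1);
        double[] centers = new double[channelCount];
        for (int i = 0; i < channelCount; i++) {
            centers[i] = melToHz(melLow + step * (i + 1));
        }
        return centers;
    }

    public static void main(String[] args) {
        double[] centers = centerFrequencies(20.0, 4000.0, 40);
        System.out.println(java.util.Arrays.toString(centers));
    }
}
```

Because the spacing is uniform in mels rather than in Hz, the centers crowd together at low frequencies and spread out toward the upper limit, which mirrors how the ear resolves pitch.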