The binary contents of the audio file to decode, as a
scalar string tensor.
file_format
A string or scalar string tensor specifying the format
of the contents. This must be one of 'mp3', 'mp4', 'ogg',
or 'wav'.
samples_per_second
The sample rate of the decoded output, in samples per
second, as an int or scalar int32 tensor. If the source audio
uses a different rate, it may be resampled to match.
channel_count
The number of channels that should be created from the
audio contents, as an int or scalar int32 tensor. If the
contents have more channels than this, some channels will
be merged or dropped. If the contents have fewer, additional
channels will be created from the existing ones.
stream
A string specifying which stream of the content file
to decode; for example, '0' selects the 0-th stream.
The default value is '', which leaves the decision to ffmpeg.
Returns
A rank-2 tensor with time along dimension 0 and channels along
dimension 1. Dimension 0 has size samples_per_second *
length_in_seconds, and dimension 1 has size channel_count.
If ffmpeg fails to decode the audio, an empty tensor is
returned.
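
The shape arithmetic above can be sketched in plain Python. This is only an illustration, not part of the decoding op: the helper names expected_shape and is_empty are hypothetical, and rounding the sample count to the nearest integer is an assumption, since the text does not say how fractional sample counts are handled.

```python
def expected_shape(samples_per_second, length_in_seconds, channel_count):
    """Return the (time, channels) shape the Returns text describes.

    Hypothetical helper for illustration. Rounding to the nearest
    integer is an assumption; the documentation does not specify it.
    """
    num_samples = int(round(samples_per_second * length_in_seconds))
    return (num_samples, channel_count)


def is_empty(shape):
    """True if a tensor of this shape has no elements (decode failure)."""
    return 0 in shape


# 2.5 seconds of stereo audio at 44.1 kHz.
shape = expected_shape(44100, 2.5, 2)
print(shape)            # (110250, 2)
print(is_empty(shape))  # False
```

Checking the returned shape for a zero-sized dimension is one way a caller might detect the failure case, since a decode error yields an empty tensor rather than an exception.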