
Basics

Samplerate

The sample rate is the rate at which samples are read from your sound card when you record. It directly determines the achievable audio bandwidth: a sound file with a sample rate of 8 kHz cannot contain frequencies above 4 kHz, i.e. half the sample rate. You should therefore always use the highest sample rate that your sound card supports when you record a signal.
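The relation between sample rate and bandwidth can be sketched in a few lines of Python. This is only an illustration of the half-the-sample-rate rule stated above; the function name is made up for this example and is not part of the encoder.

```python
def max_audio_frequency_hz(sample_rate_hz):
    """Return the highest frequency a signal sampled at sample_rate_hz can contain.

    Hypothetical helper for illustration: the limit is half the sample rate.
    """
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return sample_rate_hz / 2


# An 8 kHz file cannot contain frequencies above 4 kHz.
print(max_audio_frequency_hz(8000))
# A 44.1 kHz file reaches up to 22.05 kHz.
print(max_audio_frequency_hz(44100))
```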

The encoder changes the sample rate of your audio data to match it to the audio quality of the bitstream produced by the encoder. This process is called  downsampling.

Bitrate

The main parameter controlling the sound quality is the  bitrate that the encoder runs at. In a nutshell, the higher the bitrate, the better the quality.

The bitrate of the encoder is linked to the samplerate that the encoded file will have. Usually, the encoder chooses the samplerate that is best suited for encoding at the selected bitrate. You can override this samplerate using the -esr switch (see section 2.2.1).

 The bitrate of the bitstream output is selected via the -br  switch. The bitrate is specified in bits/second. The bitrate is the total bitrate for all encoded channels, i.e. if you select -br  112000 and encode in stereo, both channels will be stuffed into one bitstream of 112000 bits/second.
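Because the value given to -br is the total for all channels, the size of the resulting bitstream depends only on the bitrate and the duration, not on the number of channels. A rough sketch of that arithmetic follows; the helper name is hypothetical, and the estimate ignores any container overhead.

```python
def estimated_stream_bytes(bitrate_bits_per_second, duration_seconds):
    """Estimate the size of an encoded bitstream in bytes.

    Hypothetical helper: bitrate is the total for all channels, as with -br.
    """
    if bitrate_bits_per_second <= 0 or duration_seconds < 0:
        raise ValueError("bitrate must be positive, duration non-negative")
    return bitrate_bits_per_second * duration_seconds // 8


# One minute at -br 112000 is about 840000 bytes, mono or stereo alike.
print(estimated_stream_bytes(112000, 60))
```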

The encoder supports bitrates of 8, 16, 18, 20, 24, 32, 40, 48, 56, 64, 96, 112, 128, 160, 192 and 256 kBit/s. While all of these can be used with mono signals, stereo works from 20 kBit/s on upwards.
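A script that drives the encoder might check a requested bitrate against this list before starting a job. The following is a minimal sketch under two assumptions: 1 kBit/s is taken as 1000 bits/second (as the values in Table 2.1 suggest), and the function and constant names are invented for this example.

```python
# Bitrates listed in this section, in kBit/s.
SUPPORTED_KBIT = (8, 16, 18, 20, 24, 32, 40, 48, 56, 64,
                  96, 112, 128, 160, 192, 256)
# Stereo is described as working from 20 kBit/s upwards.
MIN_STEREO_KBIT = 20


def is_supported_bitrate(bits_per_second, channels):
    """Check a -br value (bits/second) for a mono (1) or stereo (2) signal."""
    if channels not in (1, 2):
        raise ValueError("channels must be 1 or 2")
    if bits_per_second % 1000 != 0:
        return False
    kbit = bits_per_second // 1000
    if kbit not in SUPPORTED_KBIT:
        return False
    return channels == 1 or kbit >= MIN_STEREO_KBIT


print(is_supported_bitrate(18000, 1))   # mono at 18 kBit/s
print(is_supported_bitrate(18000, 2))   # stereo at 18 kBit/s
```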

Stereo mode

When encoding stereo, the bitrate of the encoder is linked to a stereo mode. MPEG Layer-3 defines four modes for stereo encoding.

 dual channel
(also known as dual mono) In this mode, the encoder treats the two input channels as separate entities, assuming there is no similarity between the channels. This is appropriate if, for example, you have a bilingual signal where one channel contains a German speaker and the other an English speaker.
 stereo
In this mode, as in dual channel above, the encoder makes no use of potentially existing correlations between the two input channels. It can, however, negotiate the bit demand between both channels, i.e. give one channel more bits if the other contains silence.

  MS stereo
In this mode, the encoder will make use of a correlation between both channels. The signal will be matrixed into a sum (»mid«) and difference (»side«) signal. For quasi-mono signals, this will give a significant gain in encoding quality.

This mode does not destroy phase information like IS stereo (see below) and thus can be used to encode Dolby ProLogic(TM) surround signals.

 MS/IS stereo
In this mode, high-frequency parts of the signal will be downmixed to mono and transmitted with direction information (which is basically a pan position). This mode (called »intensity stereo«) will lose phase information and should not be used for high-quality encoding.
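The sum/difference matrixing described for MS stereo can be illustrated with a small Python sketch. It works on plain lists of samples and uses a simple half-sum and half-difference; the normalization and the domain in which the real encoder applies the matrix are not described here, so both are assumptions of this example, as are the function names.

```python
def ms_encode(left, right):
    """Matrix left/right samples into mid (sum) and side (difference).

    Assumed normalization: half-sum and half-difference.
    """
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode and recover the left/right samples."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# For a quasi-mono signal (left equals right) the side signal is all zeros,
# which is why this mode needs few bits for such material.
mid, side = ms_encode([100, -50], [100, -50])
print(side)
```

Since the matrix is invertible, decoding returns the original channels, which is consistent with the statement that this mode keeps phase information.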

Table 2.1 gives an overview of which mode is used at which bitrate.


 

 
Table 2.1: Stereo modes by bitrate

Bitrate (bits/s)      Stereo mode
8000 - 18000          mono only
18000 - 96000         MS/IS stereo
96000 - 192000        MS stereo
192000 - 256000       stereo


Encoding speed

 Several factors influence the speed of the encoder. They include:

Version V3.0 of the encoder reaches realtime speed on a Pentium 166 when encoding at 64 kBit/s, 22.05 kHz, stereo. On a SUN Sparc Ultra-1 (143 MHz) the performance is similar.

Input file specification

The encoder can read AIFF, AIFF-C, WAV/RIFF and raw PCM data files. While the first three only work from a file, plain PCM data can be fed into the encoder via a pipe. This is useful for live encoding (also known as  streaming).

Input from file: filename

-if  filename
tells the encoder the name of the file it reads its input from. If the file is a RIFF/WAVE file or an AIFF/AIFC file, the encoder will automatically adapt to the sound file format. For other formats or plain PCM data, see below.

Piping data into the encoder

-sti 
tells the encoder to get its input from stdin rather than from a file. This only works when the input is plain PCM data (see below).

Plain PCM data input

If the encoder gets its input as plain PCM data (or if it does not recognize the sound format by itself), you need to tell it all about the structure of the PCM stream, i.e. the number of bits per sample, the number of channels and the samplerate.

-iff  fileformat
This is a string containing name=value pairs, separated by blanks. Table 2.2 lists the names and values that are possible here. For stereo files, the encoder assumes that the PCM data is interleaved and that the sample for the right channel follows that for the left channel.


   
Table 2.2: Input file format specification

Name            Value(s)        Explanation
sr              any             The rate the PCM signal is sampled at [Hz]
nc              1, 2            The number of channels in the signal
bps             8, 16, 24, 32   The number of bits per sample
little-endian                   The signal is little-endian (Intel format)
big-endian                      The signal is big-endian (Motorola format)

As an example, -iff "nc=2 sr=44100 bps=16" would be used to read a 44.1 kHz stereo file with 16 bits per sample while -iff "nc=1 sr=8000 bps=8" would tell the encoder that the data is mono, sampled at 8 kHz with 8 bits per sample.

Remember that this feature is only needed for input from files other than RIFF/WAV, AIFF and AIFC.
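To make the expected layout concrete, here is a minimal Python sketch that produces interleaved 16-bit PCM (left sample first, then right) and the matching -iff string. The function names are hypothetical, and the little-endian byte order chosen here is an assumption of the example; whatever order you write must agree with what you declare to the encoder.

```python
import struct


def iff_string(sample_rate, channels, bits_per_sample):
    """Build the name=value string passed to -iff (pairs separated by blanks)."""
    return f"nc={channels} sr={sample_rate} bps={bits_per_sample}"


def interleave_16bit(left, right):
    """Pack two channels as interleaved signed 16-bit little-endian samples.

    Each frame holds the left sample followed by the right sample.
    """
    frames = bytearray()
    for l, r in zip(left, right):
        frames += struct.pack("<hh", l, r)
    return bytes(frames)


print(iff_string(44100, 2, 16))
print(interleave_16bit([1, 2], [-1, -2]).hex())
```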

Output file specification

On output, the encoder can be instructed to write a plain Layer-3 bitstream or a wave file containing the Layer-3 stream. These wave files can be played by the media control on a machine running Microsoft Windows that has the MPEG Layer-3 ACM codec installed (you can get one by installing Microsoft Netshow(TM), http://www.microsoft.com/netshow/ ). If the output is a plain Layer-3 stream, it can be piped into other applications. This is useful for live streaming.

-of  filename
tells the encoder the name of the file to write the bitstream to. If the file does not exist, it is created; if it does exist, it will be overwritten.
-l3wav 
tells the encoder to wrap the MPEG Layer-3 file into a Microsoft RIFF/WAVE file.

Streaming data out of the encoder  

-sto 
tells the encoder to write its output to stdout rather than to a file. This only works when the output is a raw Layer-3 bitstream (i.e. it does not work in conjunction with -l3wav ).
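The rules for the input and output switches can be collected in a small helper that assembles an argument list, for instance for use with Python's subprocess module. This is only a sketch: the executable name "mp3enc", the order of the options and the function itself are assumptions made for illustration; only the switches and their restrictions are taken from this section.

```python
def build_command(bitrate, input_file=None, pcm_format=None,
                  output_file=None, l3wav=False, executable="mp3enc"):
    """Assemble an encoder command line as a list of arguments.

    The executable name and option order are assumptions of this sketch.
    """
    args = [executable, "-br", str(bitrate)]
    if input_file is None:
        # Input from stdin only works for plain PCM, so a format is required.
        if pcm_format is None:
            raise ValueError("-sti needs a PCM format description for -iff")
        args.append("-sti")
    else:
        args += ["-if", input_file]
    if pcm_format is not None:
        args += ["-iff", pcm_format]
    if output_file is None:
        # Output to stdout only works for a raw Layer-3 bitstream.
        if l3wav:
            raise ValueError("-sto does not work in conjunction with -l3wav")
        args.append("-sto")
    else:
        args += ["-of", output_file]
        if l3wav:
            args.append("-l3wav")
    return args


# Pipe in, pipe out: plain PCM from stdin, raw Layer-3 stream to stdout.
print(build_command(112000, pcm_format="nc=2 sr=44100 bps=16"))
```

Returning a list rather than a single string avoids quoting problems with the blank-separated -iff value.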


layer3@iis.fhg.de, 03/98