PCM, short for Pulse Code Modulation, is often mistakenly referred to as an audio format, but it is in fact something different. It is a method for representing analogue signals digitally and for converting signals between the analogue and digital domains. Most digital audio devices (except DSD equipment) use PCM encoding since it is simple to use with digital signal processors. PCM signals are made by measuring the amplitude of a signal at regular intervals. Every measurement is called a sample, and every sample is stored as a binary number that represents an amplitude value (quantization level). The more bits there are per sample, the higher the dynamic range and the finer the resolution.

For example, an audio file with a sample rate of 44.1 kHz and a bit depth of 16 bits has 44,100 amplitude measurements per second, each with a 16-bit binary resolution. For audio purposes a specific type of PCM is used, namely LPCM, which stands for Linear Pulse Code Modulation. In LPCM the quantization levels are spaced linearly, that is, evenly across the amplitude range. Usually when PCM is mentioned for audio purposes, it’s assumed to be LPCM.
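The 44.1 kHz / 16-bit example can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function names (`quantize_lpcm`, `sample_sine`, `dynamic_range_db`) are hypothetical, the input is assumed to lie between -1.0 and 1.0, and the dynamic range figure is the simple theoretical ratio between the full scale and one quantization step.

```python
import math

SAMPLE_RATE = 44_100   # samples per second (the value used in the text)
BIT_DEPTH = 16         # bits per sample

def quantize_lpcm(value, bit_depth=BIT_DEPTH):
    """Map an amplitude in [-1.0, 1.0] to a signed integer on an evenly spaced scale."""
    max_code = 2 ** (bit_depth - 1) - 1      # 32767 for 16 bits
    min_code = -(2 ** (bit_depth - 1))       # -32768 for 16 bits
    code = round(value * max_code)
    return max(min_code, min(max_code, code))

def sample_sine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Measure a full-scale sine at regular intervals and quantize every sample."""
    return [quantize_lpcm(math.sin(2 * math.pi * freq_hz * n / sample_rate))
            for n in range(n_samples)]

def dynamic_range_db(bit_depth):
    """Theoretical ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bit_depth)

samples = sample_sine(1_000, SAMPLE_RATE)    # one second of a 1 kHz tone
print(len(samples))                          # 44100
print(round(dynamic_range_db(16), 1))        # 96.3
print(round(dynamic_range_db(24), 1))        # 144.5
```

Each extra bit doubles the number of levels, which adds roughly 6 dB to the theoretical dynamic range.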


The first PCM signals were used in telegraphy. A telegraph could only send two values, on and off, just like the ‘0’ and ‘1’ in the digital domain. Technically, Morse code (the first way to communicate via an electrical telegraph) is a form of digital communication. Although telephone calls soon became possible, telephony was still analogue, in contrast to Morse code. In 1920, Harry G. Bartholomew and Maynard D. McFarlane developed the ‘Bartlane cable picture transmission system,’ a system that was used to send pictures. The system punched characters in paper tape to encode images with 5 levels of quantization.

In 1937, Alec Reeves was the first to use PCM for voice communication. However, the first real-world application of this technique would not come until 1943, when researchers from Bell Labs designed SIGSALY, an encryption system for secure high-level communication between the Allied countries during World War II. SIGSALY would pave the way for digital PCM audio.

PCM in audio

Sound is actually vibrating air. In the audio domain these vibrations are generated electronically. A visual representation of sound usually displays a sine wave going up and down, passing the graph’s X-axis and going back up. This is one single vibration. A frequency of 20 kHz corresponds to 20,000 of these vibrations per second. In order to convert this correctly to the digital domain, a sample rate of more than 40 kHz is needed, because both the positive and the negative half of each vibration have to be sampled. This follows from the Nyquist-Shannon sampling theorem, which states that the sample rate has to be at least twice the highest frequency in the signal that needs to be converted. This is why the Red Book audio standard is set to a sample rate of 44.1 kHz. With this sample rate it is possible to correctly convert audio frequencies of up to 22.05 kHz (22.05 kHz * 2 = 44.1 kHz), which lies just above the roughly 20 kHz upper limit of human hearing.
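What happens above half the sample rate can be shown with a short sketch. The helper names (`nyquist_frequency`, `alias_frequency`, `sample_tone`) are hypothetical and the 30 kHz test tone is chosen only for illustration. The point is that a tone above the limit produces the same sample values (apart from an inverted sign) as a lower tone, so the two cannot be told apart after sampling.

```python
import math

def nyquist_frequency(sample_rate):
    """Highest frequency that a given sample rate can represent."""
    return sample_rate / 2

def alias_frequency(freq_hz, sample_rate):
    """Frequency at which a sampled tone shows up after sampling."""
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)

def sample_tone(freq_hz, sample_rate, n_samples):
    """Sample a unit-amplitude sine wave at regular intervals."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

RATE = 44_100
print(nyquist_frequency(RATE))          # 22050.0
print(alias_frequency(20_000, RATE))    # 20000: below the limit, unchanged
print(alias_frequency(30_000, RATE))    # 14100: above the limit, folded down

# The 30 kHz tone yields the same samples as an inverted 14.1 kHz tone.
above = sample_tone(30_000, RATE, 64)
below = sample_tone(14_100, RATE, 64)
print(all(abs(a + b) < 1e-9 for a, b in zip(above, below)))   # True
```

This is why converters filter out everything above half the sample rate before sampling.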


When a digital signal is translated back to the analogue domain, a stepped signal results, because each digital sample represents the value measured at the moment the sample was taken rather than a smooth curve (see figure below). If this stepped signal is used as is, it will result in a large amount of audible distortion. The difference between the original vibration and the converted vibration is called quantization error, and it is heard as quantization noise. Quantization noise is an unwanted by-product that can cause soft, cyclical tones, especially on quiet signals. Although every DAC is equipped with a reconstruction filter that tries to restore the original vibration, quantization noise can still be heard.
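The size of this error can be estimated with a small sketch. It assumes evenly spaced levels over an amplitude range of -1.0 to 1.0 and uses hypothetical helper names (`quantize`, `max_quantization_error`). With rounding to the nearest level, the error never exceeds half a quantization step, so it shrinks quickly as the bit depth grows.

```python
import math

def quantize(value, bit_depth):
    """Round an amplitude in [-1.0, 1.0] to the nearest evenly spaced level."""
    step = 2.0 / (2 ** bit_depth)       # spacing between neighbouring levels
    return round(value / step) * step

def max_quantization_error(bit_depth, n_samples=1000):
    """Largest gap between one sine cycle and its quantized version."""
    worst = 0.0
    for n in range(n_samples):
        x = 0.9 * math.sin(2 * math.pi * n / n_samples)
        worst = max(worst, abs(x - quantize(x, bit_depth)))
    return worst

for bits in (4, 8, 16):
    half_step = (2.0 / 2 ** bits) / 2
    print(bits, max_quantization_error(bits) <= half_step + 1e-12)   # True each time
```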

To prevent the quantization noise from becoming too intrusive, dither is added to the signal before it is quantized, for example before a 24-bit recording is reduced to 16 bits. Dither is deliberately generated low-level noise that masks the quantization noise by turning the signal-dependent distortion into a steady, less objectionable hiss. There are quite a few different dither types, each with a different kind of noise to mask the quantization noise as well as possible. Every time audio is processed, the digital signal changes and has to be quantized again, which causes new quantization noise. To ensure the best quality, dither should be added every time the bit depth is reduced. Increasing the bit depth lowers the quantization noise itself, while increasing the sample rate spreads the noise over a wider frequency band, so less of it ends up in the audible range. At higher resolutions the need for dither is therefore less urgent than at low resolutions. At a 24-bit depth, dither can also be applied at a far lower level than in a 16-bit file.
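Dither is commonly applied just before a signal is rounded to its final bit depth. The following is a minimal sketch, not a production implementation: it assumes triangular (TPDF) dither made from two uniform random values, a fixed random seed for repeatability, and a hypothetical `requantize` helper. It shows a level smaller than one 16-bit step that disappears completely without dither but survives on average with dither.

```python
import random

def requantize(value, bit_depth, dither=False, rng=random):
    """Reduce an amplitude in [-1.0, 1.0] to an integer code, optionally dithered."""
    max_code = 2 ** (bit_depth - 1) - 1
    scaled = value * max_code
    if dither:
        # Triangular (TPDF) dither: the sum of two uniform values, spanning +/- 1 step.
        scaled += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    code = round(scaled)
    return max(-max_code - 1, min(max_code, code))

rng = random.Random(0)      # fixed seed so that the run is repeatable
quiet = 0.3 / 32767         # a constant level of 0.3 steps at 16 bits

plain = [requantize(quiet, 16) for _ in range(20_000)]
dithered = [requantize(quiet, 16, dither=True, rng=rng) for _ in range(20_000)]

print(sum(plain) / len(plain))          # 0.0: the quiet level is lost entirely
print(sum(dithered) / len(dithered))    # close to 0.3: preserved on average
```

The price is a small amount of added noise, which is generally considered less objectionable than the distortion it replaces.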

High resolution audio

Using higher sample rates allows higher frequencies to be captured. However, human hearing only goes up to about 20 kHz, making it virtually impossible to hear anything above that. There are some theories stating that these frequencies make you more aware of the recorded ambience. The main reason for using higher sample rates is to make it possible to encode and decode high frequencies as naturally as possible. A tone of 20 kHz consists of only about two samples per cycle when using a 44.1 kHz sample rate. When this is converted back to the analogue domain, it can come out as a triangle-shaped waveform if the reconstruction filter is imperfect. These triangular waveforms sound very harsh, especially in the treble range. If a sample rate of 96 kHz is used, the same tone consists of almost 5 samples per cycle, making it easier to recreate after conversion. Higher sample rates can therefore sound more natural and less digital.
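The sample counts quoted above are simply the sample rate divided by the tone frequency. As a quick check, using a hypothetical `samples_per_cycle` helper and with the 192 kHz rate added only for comparison:

```python
def samples_per_cycle(sample_rate_hz, tone_hz):
    """Number of samples that fall within one cycle of a tone."""
    return sample_rate_hz / tone_hz

for rate in (44_100, 96_000, 192_000):
    print(rate, round(samples_per_cycle(rate, 20_000), 3))
# 44100 2.205
# 96000 4.8
# 192000 9.6
```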
