In physics, sound is a vibration that takes the form of an audible pressure wave and can be transmitted through a medium such as a gas, liquid or solid. Sound travels as longitudinal waves, meaning that the particles in the medium are displaced back and forth along the direction in which the wave travels. This produces alternating regions of increased pressure, known as compression, and decreased pressure, known as rarefaction (see figure A).
The dynamic range of music normally perceived in a concert hall will rarely exceed 80 dB, and human speech is usually perceived over a range of about 40 dB.
In digital audio, a 20-bit system is theoretically capable of a dynamic range of about 120 dB, and a 24-bit system of about 144 dB (roughly 6 dB per bit). This suggests that a 20-bit or 24-bit system can span a dynamic range comparable to that of human hearing. Most digital audio workstations use 32-bit floating-point processing, which allows an even higher dynamic range, so loss of dynamic range is no longer a concern in digital audio processing.
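The figures above follow from the usual rule for an ideal fixed-point system, where the ratio between the largest and smallest representable amplitude is 2 raised to the number of bits. The short Python sketch below reproduces them. The function name `dynamic_range_db` is only an illustrative choice, and the calculation ignores real-world factors such as dither and converter noise.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range, in dB, of an ideal fixed-point system."""
    # The amplitude ratio between full scale and one quantisation step
    # is 2**bits; 20*log10 converts an amplitude ratio to decibels.
    return 20 * math.log10(2 ** bits)


for bits in (16, 20, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
# 16-bit: 96.3 dB
# 20-bit: 120.4 dB
# 24-bit: 144.5 dB
```

Each extra bit adds about 6.02 dB, which is why 20 and 24 bits are commonly quoted as 120 dB and 144 dB.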
The point about loss of dynamic range is an important one, because when analogue signals (music) were first recorded to digital devices (PCs, DAWs), low-bit-depth systems often could not represent the full dynamic range. As a result, the final digital version of a recording would have a smaller range between its quietest and loudest sounds, which could affect the listening quality for the final audience.
There are two ways to reduce the size of a digital audio file, known as lossless and lossy compression.
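Lossless compression (for example FLAC) allows all of the original data to be recovered when the file is decompressed, whilst still significantly reducing the file size. A WAV file, by contrast, normally stores uncompressed PCM audio, so it loses nothing but is not compressed either.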
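The defining property of lossless compression, that the original data comes back bit for bit, can be checked with a short round trip. The Python sketch below uses the general-purpose `zlib` compressor from the standard library as a stand-in. It is not an audio codec such as FLAC, and the 440 Hz test tone, amplitude and sample rate are arbitrary assumptions chosen for illustration.

```python
import math
import struct
import zlib

SAMPLE_RATE = 48_000  # Hz, illustrative value

# One second of a 440 Hz sine tone as 16-bit signed PCM samples.
samples = [
    round(20_000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
    for n in range(SAMPLE_RATE)
]
pcm = struct.pack(f"<{len(samples)}h", *samples)  # little-endian int16

compressed = zlib.compress(pcm, 9)     # 9 = highest compression level
restored = zlib.decompress(compressed)

print(f"original:   {len(pcm)} bytes")
print(f"compressed: {len(compressed)} bytes")
print("bit-exact round trip:", restored == pcm)
```

A pure tone is far more repetitive than real music, so the size reduction seen here is not representative of what a lossless audio codec achieves on typical recordings.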
Lossy compression (MP3, AAC) works differently: it discards information in the original file that is deemed “unnecessary” in order to make the compressed file even smaller. It does so by applying techniques based on how our brain interprets sound, removing harmonics and other frequency content that would not be perceived. Lossy techniques can reduce a file to around a tenth of its original size whilst sounding almost identical to the original.
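The essential difference from lossless coding is that discarded information cannot be recovered. The Python sketch below illustrates only that property, using a deliberately crude stand-in: it throws away the lower 8 bits of each 16-bit sample. This is not how MP3 or AAC work (they use psychoacoustic models, which are not attempted here), and the test tone is an arbitrary assumption.

```python
import math

SAMPLE_RATE = 48_000  # Hz, illustrative value

# 10 ms of a 440 Hz sine tone as 16-bit signed integer samples.
samples_16 = [
    round(20_000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
    for n in range(480)
]

# "Encode": keep only the top 8 bits of each sample, halving the data.
encoded_8 = [s >> 8 for s in samples_16]

# "Decode": scale back up; the discarded low bits stay lost.
decoded_16 = [e << 8 for e in encoded_8]

max_error = max(abs(a - b) for a, b in zip(samples_16, decoded_16))
print("bit-exact round trip:", decoded_16 == samples_16)  # False
print("largest sample error:", max_error)                 # below 256
```

A real lossy codec chooses what to discard so that the error is as inaudible as possible, rather than spreading it evenly as this sketch does.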
aptX-HD (also known as aptX Lossless) has a bit rate of 576 kbps. This allows for high-definition audio at sampling rates up to 48 kHz and word lengths (bit depths) up to 24 bits, providing sufficient detail to represent the analogue signal in digital form. Even though the name suggests that the codec is lossless, it is technically still lossy, because it uses ‘near lossless’ coding for parts of the audio where fully lossless coding is not possible. This happens because the capacity available to transmit the digital signal, known as bandwidth, is finite, and applying fully lossless coding all the time would require more bandwidth than is available. ‘Near lossless’ coding maintains high-definition qualities such as a dynamic range of at least 120 dB and still represents audio frequencies up to 20 kHz.
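To put the 576 kbps figure in context, it can be compared with the bit rate of the same audio as uncompressed PCM. The Python sketch below does this arithmetic. Two channels (stereo) is an assumption, since the text does not state the channel count, and `pcm_bitrate_kbps` is an illustrative helper name.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


APTX_HD_KBPS = 576  # bit rate quoted in the text

# 48 kHz, 24-bit audio; stereo (2 channels) is an assumption.
uncompressed = pcm_bitrate_kbps(48_000, 24, 2)

print(f"uncompressed PCM: {uncompressed:.0f} kbps")               # 2304 kbps
print(f"compression ratio: {uncompressed / APTX_HD_KBPS:.0f}:1")  # 4:1
```

Under that assumption the codec has to fit the audio into a quarter of the uncompressed bit rate, which is why the finite bandwidth rules out fully lossless coding at all times.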
aptX-HD performs very well compared with lossless compression formats, as long as the coding latency is kept to 5 ms or less. This makes it particularly useful for delay-sensitive interactive audio applications such as wireless headphones, where the audio is streamed from the user’s device (phone, tablet, PC) to the headphones. A low latency means that the user should experience no noticeable delay between pressing play and hearing audio at the headphones, regardless of whether the source is a small MP3 (lossy) file or a larger WAV (uncompressed) file.
Additional information such as metadata and synchronisation data can be embedded into the aptX-HD stream at variable rates. Because the rate at which this data is embedded is variable, the decoder can resynchronise if data is corrupted or lost, maintaining a high QoS (Quality of Service). Depending on the settings applied in the decoding process, resynchronisation can occur within 1-2 ms.