Introduction
All television systems engineers understand the importance of proper video signal timing. They know that in order to ensure glitch-free switching, all video sources must be driven by a common reference. In addition, to ensure proper operation of video production switchers, master control switchers, and other devices that combine two or more video signals, differential delays between sources must be kept to a minimum.
Unfortunately, not all video engineers are aware of the importance of proper digital audio signal timing. The concept of “timing” as applied to audio signals is foreign to them. However, many of the same rules that apply to video signal timing apply to digital audio signal timing. If those rules aren’t followed, the result will be pops, clicks, and other undesirable audio artifacts.
This article will discuss basic digital audio system timing concepts and provide practical system design examples.
Video System Timing Fundamentals
In all television broadcast, production, or post-production facilities, a master sync generator is used to provide a timing reference for all equipment in the facility. The most commonly used reference signal is analog color black, because nearly all professional video equipment, both analog and digital, is capable of locking to color black. Typically, color black is fanned out using analog video distribution amplifiers (DAs) and distributed throughout the plant.
In larger facilities, delay lines and/or slave sync generators may be used to provide independently adjustable timing references that are all locked to the facility’s master reference. This is often done to allow all sources, including the outputs of studios, edit suites, and other equipment “islands,” to be zero-timed at the input of the house router.
A properly designed timing chain will ensure that all video sources are locked to a common reference and synchronized with respect to one another.
Anatomy of a Digital Audio Signal
Digital audio is produced by sampling an analog audio signal and formatting the samples into a serial bit stream. By convention, in a video environment, digital audio is sampled at 48 kHz. In order to maintain a stable, predictable relationship between video and audio sample clocks, the 48 kHz digital audio sample clock must be derived from the same master oscillator as the facility’s video reference.
The digital audio bit stream is formatted per AES3, a standard published by the Audio Engineering Society. In addition to the audio samples contained in the bit stream, the AES3 signal also includes synchronizing information, status bits, user bits, and CRC (Cyclic Redundancy Check) data. Like a video signal, data is organized into frames, each of which contains 192 two-channel samples. (The AES3 standard itself calls this 192-sample group a block; this article refers to it as an AES3 frame.)
It is worth noting that, in 50 field per second television systems, there are exactly 5 AES3 frames per field of video. In 59.94 field per second television systems, there are 4.1708333 AES3 frames per field of video. Therefore, in 50 field/sec systems, AES3 and video boundaries can, at best, coincide at each field boundary, which is only once every 5 AES3 frames. In 59.94 field/sec systems, the relationship changes from one field to the next, so AES3 and video frame alignment is effectively arbitrary.
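The frames-per-field figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of any standard tool: the function names are illustrative, and it assumes that "59.94" stands for the exact rate 60000/1001 fields per second.

```python
from fractions import Fraction

SAMPLE_RATE = 48_000   # Hz, the video-facility convention described above
FRAME_LEN = 192        # two-channel samples per AES3 frame, as described above

PAL_FIELD_RATE = Fraction(50)
NTSC_FIELD_RATE = Fraction(60_000, 1001)   # assumption: exact value behind "59.94"

def aes_frames_per_field(field_rate):
    """Exact number of AES3 frames that fit in one video field."""
    return Fraction(SAMPLE_RATE) / field_rate / FRAME_LEN

def fields_between_alignments(field_rate):
    """Smallest number of fields that spans a whole number of AES3 frames."""
    return aes_frames_per_field(field_rate).denominator

print(aes_frames_per_field(PAL_FIELD_RATE))           # 5
print(float(aes_frames_per_field(NTSC_FIELD_RATE)))   # approximately 4.1708333
print(fields_between_alignments(NTSC_FIELD_RATE))     # 240
```

Under these assumptions, the 59.94 field/sec case returns to the same alignment only once every 240 fields (roughly 4 seconds), which is why the relationship looks arbitrary in practice.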
System Timing Concepts for Digital Audio
The first step in designing a well-behaved digital audio system is to ensure that all sources are locked to a common reference. The reference may be color black, silent AES3, or Word Clock. In some instances, it may be necessary to use a mix of all three. However, in order to ensure satisfactory results, all of these reference signals must be derived from the same master reference.
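To see why a common reference matters, consider two 48 kHz sources running from independent oscillators. The sketch below estimates how long it takes for one source to gain or lose a whole sample relative to the other. The function name and the 10 ppm offset are illustrative assumptions, not figures taken from any particular piece of equipment.

```python
NOMINAL_RATE_HZ = 48_000  # sample rate used in video facilities

def seconds_per_sample_slip(offset_ppm, nominal_rate_hz=NOMINAL_RATE_HZ):
    """Time for two free-running clocks to drift apart by one sample.

    offset_ppm is the frequency difference between the two clocks in parts
    per million. Returns float('inf') when the offset is zero, i.e. when
    both clocks are locked to the same reference.
    """
    if offset_ppm == 0:
        return float("inf")
    # Difference between the two clocks, in samples per second.
    slip_rate = nominal_rate_hz * abs(offset_ppm) / 1_000_000
    return 1.0 / slip_rate

print(seconds_per_sample_slip(10))  # hypothetical 10 ppm offset: about 2 seconds
print(seconds_per_sample_slip(0))   # locked sources never slip: inf
```

With even a small offset, a receiving device would have to drop or repeat a sample every couple of seconds, which is one plausible source of the pops and clicks described in the introduction.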
Digital audio sources that might be used in a video facility fall into three categories:
- Those that are video sources as well: DVTRs, servers, non-linear video editing systems, etc. These typically derive their digital audio clocks from the video signal input or video reference input.
- Audio-only sources that have an AES3 or Word Clock reference input: A/D converters, tone/test signal generators, digital audio clip players, audio consoles, etc. These require an AES3 or Word Clock reference that is locked to the video reference.
- Audio-only sources that do not have any reference input whatsoever: CD players, DAT recorders, MP3 players, etc. These are not capable of being externally referenced, and they may not even support a 48 kHz sample rate.
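When auditing a facility, the three categories above can be treated as a simple decision based on which reference inputs a device offers. The following sketch is purely illustrative: the enum, the function, and the example devices are hypothetical names chosen for this article, not part of any product or standard.

```python
from enum import Enum

class Category(Enum):
    VIDEO_REFERENCED = 1   # audio clocks derived from video input or video reference
    AUDIO_REFERENCED = 2   # needs AES3 or Word Clock locked to the video reference
    UNREFERENCED = 3       # cannot be externally referenced

def categorize(has_video_reference, has_audio_reference):
    """Classify a source by the reference inputs it provides.

    has_audio_reference means an AES3 or Word Clock reference input.
    A video reference takes priority, matching the first category above.
    """
    if has_video_reference:
        return Category.VIDEO_REFERENCED
    if has_audio_reference:
        return Category.AUDIO_REFERENCED
    return Category.UNREFERENCED

# Hypothetical inventory: (name, has video reference, has AES3/Word Clock reference)
inventory = [
    ("DVTR", True, False),
    ("A/D converter", False, True),
    ("CD player", False, False),
]
for name, video_ref, audio_ref in inventory:
    print(f"{name}: {categorize(video_ref, audio_ref).name}")
```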
Integrating the first category of sources into a video facility is a straightforward exercise. Connect the video reference, adjust the video timing, and you’re done. Video and audio from the source will be synchronous with the plant reference and properly timed.