From HwB

S/PDIF = Sony/Philips Digital Interface Format (a.k.a. SPDIF)

An interface for digital audio.



Since the early 1980s, the introduction of the Compact Disc player has moved audio toward the digital domain. At first, the digital signals stayed inside the unit and were converted to analog signals before leaving the cabinet. The newer trend is to keep signals in the digital domain for as long as possible, because this is the only way to preserve signal quality. To make this possible, different devices must be able to communicate with one another within the digital domain.

S/PDIF is the consumer version of the AES/EBU professional digital audio interface.

S/PDIF is described in IEC 60958-3:2006. In Japan the equivalent standard is JEITA CPR-1205.


IEC 60958-3 specifies up to 768 kHz sampling frequency.

Signal bitrate:

Bitrate         Fs         Usage
2.048 Mbit/s    32 kHz     DSR (Digital Satellite Receiver)
2.822 Mbit/s    44.1 kHz   CD
3.072 Mbit/s    48 kHz     DAT

Resolution: up to 24 bits PCM.

Audio can be sent encoded as DTS/AC-3/ATRAC/AAC/MPEG-2 instead of PCM.

The interface

There are three different types of S/PDIF connection:

  • Phono connector (electrical)
  • TOSLINK connector (optical)
  • TTL

S/PDIF (IEC 60958-3)

Connector            Phono           TOSLINK                       Non-standard
Interface            Unbalanced      Optical                       Unbalanced
Cabling              75 Ω coaxial    1.0 mm plastic fiber optics   ?
Max length           10 m            1 m                           ? m
Output level         0.5 Vpp         ?                             5 Vpp (TTL)
Max output           0.6 Vpp         ?                             5 Vpp (TTL)
Max current          8 mA            ?                             X mA
Min input            0.2 Vpp         ?                             2 Vpp (TTL)
Modulation           biphase-mark-code
Subcode information  SCMS copy protection info
Max. resolution      20 bits (24 bit optional)


Commonly used XLR-3 microphone cables have various impedance ratings (30 Ω to 90 Ω typical) and perform poorly for digital transmission. The result is signal dropout and reduced usable cable length, caused by the severe impedance mismatch (VSWR) with 110 Ω AES/EBU equipment. With a good cable, AES/EBU signal transmission works over a few tens of meters.



There is also an optical version of the S/PDIF interface, usually called TOSLINK because it uses TOSLINK optical components. The transmission medium is 1 mm plastic fiber, and the signals are transmitted using visible light (a red transmitting LED). The optical signals have exactly the same format as the electrical S/PDIF signals; they are simply converted to light signals (light on/off). Because of the high signal attenuation in TOSLINK fiber optic cable, the transmission distance available with this technique is less than 10 meters (with some equipment only a few meters).

Connector is called JIS F05 (JIS C5974-1993 F05)

TTL level

The 2-pin header connection found internally on CD-ROM units and sound cards is usually TTL level. The data format is the same as on the other connections. You need an adapter to connect the TTL-level signal to an amplifier with Phono or TOSLINK connectors.

Comparison with AES/EBU

The two formats are largely compatible with each other, differing only in the subcode information and the connector. The professional format's subcode contains ASCII strings for source and destination identification, whereas the consumer format carries SCMS.

Both S/PDIF and AES/EBU can, and do, transfer 24-bit words. In AES/EBU, the last 4 bits have a defined usage, so if anyone puts audio in there, it has to go to equipment that does not expect what the standard specifies. In S/PDIF, nothing says what the bits must be used for, so filling them all with audio is acceptable. Typical S/PDIF equipment only uses 16- or 20-bit resolution. While much equipment uses more than 16 bits in internal processing, it is not unusual for the output to be limited to 16 bits.

Multi channel audio and S/PDIF

IEC 958 was renamed IEC 60958 in 1998. IEC 60958 (S/PDIF) can carry normal audio and IEC 61937 data streams. IEC 61937 data streams can contain multichannel sound such as MPEG-2, AC-3 or DTS. When IEC 61937 data streams are transferred, the bits which normally carry audio samples are replaced with the data bits from the data stream and the headers of the S/PDIF signal. The channel-status information contains one bit (bit 1) which tells whether the data in the S/PDIF frame is digital audio or some other data (DTS, AC-3, MPEG audio etc.). This bit tells normal digital audio equipment not to try to play back this data as if it were audio samples (which would sound really horrible if it happened).

Equipment that can handle both normal audio and IEC 61937 just looks at those header bits to determine what to do with the received data.


There are two things which can cause differences between the sound of digital interfaces:

Jitter (clock phase noise)

This really only affects the sound of a signal going directly to a DAC. If you're running into a computer, the computer is effectively going to reclock everything. The same applies to CD recorders, DAT tape decks and similar devices. Even modern DACs typically have a small buffer and reclocking circuitry, so jitter is not as big a problem nowadays as it used to be.


Data errors

This usually causes very significant changes in the sound, often loud popping noises but occasionally less offensive effects. Any data loss or errors in either are a sign of a very broken link which is probably intermittently dropping out altogether.

Bit rate

The signal on the digital output of a CD-player looks like an almost perfect sine wave, with an amplitude of 500 mVpp and a frequency of almost 3 MHz.

For each sample period, two 32-bit words (one per channel) are transmitted, which results in a bit-rate of:

2.8224 Mbit/s   44.1 kHz sampling rate, CD, DAT
3.0720 Mbit/s   48 kHz sampling rate, DAT
2.0480 Mbit/s   32 kHz sampling rate, for satellite purposes
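
The figures above follow directly from the frame layout: two 32-bit words per sample period. A minimal sketch of the arithmetic (the function and constant names are illustrative, not taken from any standard or library):

```python
SUBFRAME_BITS = 32        # one 32-bit word per channel sample
SUBFRAMES_PER_FRAME = 2   # channel A and channel B


def spdif_bit_rate(sample_rate_hz: int) -> int:
    """Return the data bit rate in bit/s for a two-channel stream."""
    return sample_rate_hz * SUBFRAMES_PER_FRAME * SUBFRAME_BITS


for fs in (44_100, 48_000, 32_000):
    print(f"{fs / 1000:g} kHz -> {spdif_bit_rate(fs) / 1e6:.4f} Mbit/s")
```

Because biphase-mark coding uses two cells per data bit, the cell rate on the wire is twice these values.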

The Coding Format

The digital signal is coded using the 'biphase-mark-code' (BMC), which is a kind of phase-modulation. In this system, two zero-crossings of the signal mean a logical 1 and one zero-crossing means a logical 0.


The frequency of the clock is twice the bitrate. Every bit of the original data is represented as two logical states which, together, form a cell. The length of a cell ('time-slot') is equal to the length of a data bit. The logical level at the start of a bit is always inverted relative to the level at the end of the previous bit. The level at the end of a bit is equal to (a 0 transmitted) or inverted from (a 1 transmitted) the level at the start of that bit.
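
The coding rule can be sketched in a few lines. This is only an illustration of the rule as described here, with each data bit expanded into two logical states; `bmc_encode` and `bmc_decode` are hypothetical helper names, and real receivers also have to recover the clock, which is not modelled.

```python
def bmc_encode(bits, last_level=0):
    """Encode data bits as biphase-mark states (two states per bit).

    last_level is the line level at the end of the previous bit.
    """
    states = []
    level = last_level
    for bit in bits:
        level ^= 1              # the level always inverts at the start of a bit
        states.append(level)
        if bit:
            level ^= 1          # a 1 adds a second inversion in mid-bit
        states.append(level)
    return states


def bmc_decode(states):
    """Recover data bits: a bit is 1 when its two states differ."""
    return [states[i] ^ states[i + 1] for i in range(0, len(states), 2)]


data = [1, 0, 1, 1, 0]
encoded = bmc_encode(data)
print(encoded)
print(bmc_decode(encoded) == data)
```

Note that the decoder only compares the two halves of each bit, so the result does not depend on the absolute polarity of the signal.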

The first 4 bits of a 32-bit word (bits 0 through 3) form a preamble which takes care of synchronisation. This sync-pattern doesn't actually carry any data; it is merely four data bits long. It also doesn't use the BMC, so bit patterns which include more than two 0's or 1's in a row can occur (in fact, they always do).

There are 3 different sync-patterns, but they can appear in different forms, depending on the last cell of the previous 32-bit word (parity):

Preamble   cell-order        cell-order
           (last cell "0")   (last cell "1")
"B"        11101000          00010111
"M"        11100010          00011101
"W"        11100100          00011011

Preamble B  Marks a word containing data for channel A (left) at the start of the data-block.
Preamble M  Marks a word with data for channel A that isn't at the start of the data-block.
Preamble W  Marks a word containing data for channel B (right, for stereo).
            When using more than 2 channels, this could also be any other channel (except for A).
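
The second column of the table is simply the first column inverted, so a lookup only needs one pattern per preamble. A small sketch (the names `PREAMBLES`, `preamble_cells` and `violates_bmc` are made up for illustration):

```python
# Cell patterns used when the last cell of the previous word was "0".
PREAMBLES = {
    "B": "11101000",
    "M": "11100010",
    "W": "11100100",
}


def preamble_cells(name, last_cell):
    """Return the 8-cell preamble pattern, inverted if the last cell was 1."""
    pattern = PREAMBLES[name]
    if last_cell == 1:
        pattern = "".join("1" if c == "0" else "0" for c in pattern)
    return pattern


def violates_bmc(pattern):
    """True if the pattern has three equal cells in a row, which BMC data never produces."""
    return "000" in pattern or "111" in pattern


for name in PREAMBLES:
    print(name, preamble_cells(name, 0), preamble_cells(name, 1))
```

The run of three equal cells is what lets a receiver tell a preamble apart from ordinary coded data.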

Word and Block Formats

Every sample is transmitted as a 32-bit word (subframe). These bits are used as follows:

bits   meaning
0-3    Preamble (see above; special structure)
4-7    Auxiliary audio data bits
8-27   Sample
       (A 24-bit sample can be used, occupying bits 4-27.
       A CD-player uses only 16 bits, so only bits 12 (LSB) to 27 (MSB) are used. Bits 4-11 are set to 0.)
28     Validity
       (When this bit is set, the sample should not be used by the receiver. A CD-player uses the 'error-flag' to set this bit.)
29     Subcode-data
30     Channel-status-information
31     Parity (bits 0-3 are not included)
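
As a rough sketch of how such a word could be assembled, the function below packs a sample into bits 4-31 of an integer, where integer bit n stands for bit n of the table. Two assumptions are made that the table does not state: shorter samples are aligned so that their MSB sits in bit 27, and the parity bit is chosen so that bits 4-31 contain an even number of ones. `pack_subframe` is a hypothetical name, and the preamble (bits 0-3) is left out because it is not ordinary data.

```python
def pack_subframe(sample, width=16, validity=0, user=0, status=0):
    """Pack bits 4-31 of a subframe into an int (bits 0-3 stay zero)."""
    if not 1 <= width <= 24:
        raise ValueError("width must be between 1 and 24 bits")
    sample &= (1 << width) - 1            # two's-complement wrap for negatives
    word = sample << (28 - width)         # MSB of the sample lands in bit 27
    word |= (validity & 1) << 28
    word |= (user & 1) << 29              # subcode-data bit
    word |= (status & 1) << 30            # channel-status bit
    ones = bin(word >> 4).count("1")      # bits 0-3 are excluded from parity
    word |= (ones & 1) << 31              # assumed even parity over bits 4-31
    return word


print(f"{pack_subframe(0x8001):#010x}")
print(f"{pack_subframe(0x8001, validity=1):#010x}")
```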

The number of subframes that are used depends on the number of channels that are transmitted. A CD-player uses channels A and B (left/right), so each frame contains two subframes. A block contains 192 frames and starts with a preamble "B":

| "M" Ch.1 | "W" Ch.2 | "B" Ch.1 | "W" Ch.2 | "M" Ch.1 | "W" Ch.2 | "M" ...
| subframe | subframe |
|      Frame 191      |       Frame 0       |       Frame 1       |
                      ^ Block start
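
The preamble order for a two-channel stream can be generated mechanically. The sketch below is only an illustration of the sequence described here; `preamble_sequence` is a hypothetical name.

```python
FRAMES_PER_BLOCK = 192


def preamble_sequence(num_frames, start_frame=0):
    """Yield (frame_index, channel, preamble) for each subframe in turn."""
    for n in range(start_frame, start_frame + num_frames):
        frame = n % FRAMES_PER_BLOCK
        # Channel A gets "B" only in the first frame of a block, else "M".
        yield frame, "A", ("B" if frame == 0 else "M")
        # Channel B always gets "W".
        yield frame, "B", "W"


for frame, channel, preamble in preamble_sequence(3, start_frame=191):
    print(f"frame {frame:3d}  channel {channel}  preamble {preamble}")
```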

Channel status and subcode information

In each block, 384 bits of channel status and subcode info are transmitted. The Channel-status bits are equal for both subframes, so actually only 192 useful bits are transmitted:

bit      meaning
0-1      channel status bits:
           bit 0   bit 1
           0       0       IEC 60958-3 (Consumer)
           1       0       IEC 60958-4 (Professional)
           0       1       IEC 61937 (MPEG/AC-3/DTS/AAC/ATRAC), IEC 62105 and others
           1       1       SMPTE 337M and others
2        copy-protection. Copying is allowed when this bit is set.
3        is set when pre-emphasis is used.
4-7      0 (reserved)
9-15     category-code:
           • 0 = common 2-channel format
           • 1 = 2-channel CD-format (set by a CD-player when a subcode is transmitted)
           • 2 = 2-channel PCM-encoder-decoder format
           • others are not used
19-191   0 (reserved)
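
A receiver would typically collect the 192 channel-status bits of a block and then interpret the first few. The following sketch decodes only bits 0-3 as listed in the table; the input is assumed to be a list of 0/1 values indexed by bit number, and `decode_channel_status` is a hypothetical name.

```python
FORMATS = {
    (0, 0): "IEC 60958-3 (Consumer)",
    (1, 0): "IEC 60958-4 (Professional)",
    (0, 1): "IEC 61937 and other non-PCM data",
    (1, 1): "SMPTE 337M and others",
}


def decode_channel_status(bits):
    """Interpret channel-status bits 0-3 of a block."""
    return {
        "format": FORMATS[(bits[0], bits[1])],
        "copy_permitted": bool(bits[2]),   # copying is allowed when set
        "pre_emphasis": bool(bits[3]),
    }


block = [0, 0, 1, 0] + [0] * 188           # consumer PCM, copying allowed
print(decode_channel_status(block))
```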

The subcode-bits can be used by the manufacturer at will. They are used in blocks of 1176 bits, each preceded by a sync-word of 16 "0"-bits.

Electrical Interface

S/PDIF signals can be carried over either 75 Ω coaxial cable or optical fiber (usually called TOSLINK). Consumer models usually use the coaxial interface, while semiprofessional and professional equipment uses the optical interface. The electrical signal on the coaxial cable is about 500 mVpp.

Converting between AES/EBU and S/PDIF interfaces

There are differences in the electrical characteristics of AES/EBU and S/PDIF interfaces:

  • AES/EBU uses a balanced differential line based on XLR connectors and the signal levels are 5 volts
  • S/PDIF uses a coaxial unbalanced line with RCA connectors and the signal levels are around 0.5 volts

You can convert one electrical interface to another with a small amount of off-the-shelf hardware and a little time as you can see in the circuit below.

However, the protocols used in AES/EBU and S/PDIF are not exactly the same, and that can sometimes cause problems. The basic data formats of AES/EBU and S/PDIF are identical. There is a bit in the channel status frame that tells which is which. Depending on the setting of that bit, some bits have different meanings. For example, the bits used to describe de-emphasis in the AES/EBU protocol overlap the bits used to implement the SCMS protocol in S/PDIF land.

Some equipment is inflexible and will reject data that differs from what it expects. So S/PDIF equipment may reject AES/EBU data.

Jitter specifications of AES/EBU interface

The AES/EBU standard for serial digital audio typically uses a 163 ns clock period and allows up to ±20 ns of jitter in the signal. This peak-to-peak value of 40 ns is around 1/4 of the unit interval. D/A conversion clock jitter requirements are considerably tighter. A draft AES/EBU standard specifies 1 ns of jitter for the D/A converter clock; however, a theoretical value for 16-bit audio could be as small as 0.1 ns. Low-jitter D/A conversion is implemented by using separate PLL clocks for data recovery and for the DAC, with buffering between data recovery and the DAC.
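
The numbers in this paragraph can be checked with a short calculation. The first part uses only the figures quoted here, taking 20 ns as the deviation either side of nominal. The second part is an assumption, not something stated in the text: it uses a common rule of thumb in which the timing error must not move a full-scale sine wave by more than half an LSB at its steepest point, with a 20 kHz signal as the worst case.

```python
import math

UNIT_INTERVAL_NS = 163.0    # figure quoted in the text
JITTER_PEAK_NS = 20.0       # allowed deviation either side of nominal

peak_to_peak_ns = 2 * JITTER_PEAK_NS
fraction_of_ui = peak_to_peak_ns / UNIT_INTERVAL_NS
print(f"{peak_to_peak_ns:.0f} ns p-p = {fraction_of_ui:.3f} of the unit interval")


def max_dac_jitter_s(bits, signal_hz):
    """Half-LSB rule of thumb (assumed): t = 1 / (2 * pi * f * 2**bits)."""
    return 1.0 / (2 * math.pi * signal_hz * 2 ** bits)


print(f"16-bit, 20 kHz: {max_dac_jitter_s(16, 20_000) * 1e9:.2f} ns")
```

Under those assumptions the result is on the order of 0.1 ns, which is in line with the theoretical value mentioned above.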

Conversion circuits

Remember that although the audio data is the same in both AES/EBU and S/PDIF interfaces, they are indeed different formats, at least in their subcode. AES converted to coax is NOT S/PDIF, and S/PDIF converted to XLR balanced is NOT AES. They are still their native format, just the transmission medium has changed. Whether they will work in your application depends on the equipment chosen.

Some DATs have a switch that selects one format or the other regardless of the physical interface, some just ignore what they don't understand (usually resulting in the generally positive benefit of ignoring SCMS encoding), and some indeed gag on the "other" format. But simply fixing the physical interface works far more often than it doesn't.

Adapters between different formats:

Input/output circuits

See also



  • IEC 60958-3:2006 (old name: IEC 958-3:1989)
  • JEITA CPR-1205 (former EIAJ CP-340 1987-9 & EIAJ CP-1201)
  • (CP-1201 Japanese equivalent of IEC 60958)