ISO/IEC 23003-2:2010

Information technology -- MPEG audio technologies -- Part 2: Spatial Audio Object Coding (SAOC)

Abstract

ISO/IEC 23003-2:2010 specifies the reference model of MPEG Spatial Audio Object Coding (SAOC): an efficient parametric coding technology designed to encode, transmit, and interactively render multiple audio objects for playback with various kinds of channel configurations (mono, stereo, 5.1, headphones/binaural). Rather than performing a discrete coding of the individual audio input signals, MPEG SAOC captures the perceptually relevant properties of audio signals into a compact set of parameters that are used to synthesize a flexibly rendered audio scene from a transmitted downmix signal.
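The parametric idea described above can be illustrated with a deliberately simplified sketch. It is not the algorithm defined in ISO/IEC 23003-2: the function names `encode` and `render`, the time-domain framing, and the use of per-frame energy shares as the only side information are all assumptions made for illustration. The sketch shows a mono downmix carrying the audio, a small set of per-object parameters describing it, and a renderer that applies listener-chosen gains to the downmix.

```python
import math

def encode(objects, frame_size):
    """Toy encoder (hypothetical): mono downmix plus per-frame energy shares."""
    n = len(objects[0])
    # The downmix is the plain sum of all object signals.
    downmix = [sum(obj[i] for obj in objects) for i in range(n)]
    params = []
    for start in range(0, n, frame_size):
        # Energy of each object within this frame.
        energies = [sum(x * x for x in obj[start:start + frame_size])
                    for obj in objects]
        total = sum(energies) or 1.0  # avoid division by zero in silent frames
        params.append([e / total for e in energies])
    return downmix, params

def render(downmix, params, frame_size, render_gains):
    """Toy renderer (hypothetical): render_gains[channel][object] is set by the listener."""
    out = [[0.0] * len(downmix) for _ in render_gains]
    for f, shares in enumerate(params):
        start = f * frame_size
        stop = min(start + frame_size, len(downmix))
        for ch, gains in enumerate(render_gains):
            # Energy-based gain applied to the downmix for this channel and frame,
            # assuming the objects are mutually uncorrelated.
            g = math.sqrt(sum(gain * gain * share
                              for gain, share in zip(gains, shares)))
            for i in range(start, stop):
                out[ch][i] = g * downmix[i]
    return out

# Two objects that are active in different frames.
objects = [[1.0, -1.0, 0.0, 0.0],
           [0.0, 0.0, 0.5, 0.5]]
downmix, params = encode(objects, frame_size=2)

# Listener places object 0 on the left channel and object 1 on the right.
stereo = render(downmix, params, 2, [[1.0, 0.0], [0.0, 1.0]])
# The same downmix and parameters rendered to mono with unit gains.
mono = render(downmix, params, 2, [[1.0, 1.0]])
```

In this sketch the same `downmix` and `params` serve both the stereo and the mono rendering; only the `render_gains` matrix changes. A real system would work on time-frequency tiles and carry more parameters than energy shares.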

MPEG SAOC extends MPEG Surround with additional functionality for users. It lets the user on the decoding side interactively control the multi-channel rendering of each individual audio object on different kinds of sound reproduction setups. In addition, MPEG SAOC inherits many advantages of MPEG Surround, such as backward-compatible transmission of complex multi-object audio content at bitrates not much higher than those required for its mono or stereo downmix. MPEG SAOC processing reuses the multi-channel rendering functionality of MPEG Surround in a computationally efficient manner. MPEG SAOC can therefore be used directly to extend MPEG Surround and to upgrade existing distribution infrastructures for stereo or mono audio content (teleconferencing systems, music downloads, Internet streaming, etc.) towards the delivery of multi-object audio content while retaining full compatibility with existing receivers. Rendering can be interactively controlled by the end user and is independent of the playback system setup.

Key features of MPEG SAOC are:

  • interactive rendering of audio objects on the decoder/receiver side;
  • the transmitted SAOC bit stream is independent of the loudspeaker (or headphone) configuration;
  • low-power processing mode (e.g. for applications on portable devices);
  • low-delay processing mode (e.g. for communication applications);
  • flexibly selectable bitrate overhead, allowing scalability from low bitrate applications such as Internet streaming to high-quality applications such as custom remix of music;
  • it can be applied to audio coded with any coding scheme;
  • backward compatibility: the default downmix is always available for legacy playback devices.

 
