Strictly Confidential

Technical Documentation

Open AMR Initiative

Open AMR Initiative AMR Codec

Technical Documentation Version 2.0, Revision A July 2007

2.0, A, 07/2007


This material and information (“Information”) constitutes a trade secret of VoiceAge Corporation and is strictly confidential. You agree to keep this Information confidential and to take all necessary measures to maintain its secrecy. Without limiting the foregoing, VoiceAge Corporation considers its confidential Information, including, but not limited to, any source code and technical information, to be an unpublished proprietary trade secret. If an authorized publication occurs, the following notice shall be affixed to it: Copyright © 1996-2007 VoiceAge Corporation. All Rights Reserved.

No part of this material may be reproduced, including, but not limited to, photocopying, electronic or mechanical recording, nor stored in a retrieval system, or otherwise transmitted, in any form or by any means, without the prior written permission of VoiceAge Corporation.

VoiceAge Corporation assumes no responsibility for any errors or omissions. This Information is subject to continuous updates and improvements. All warranties implied or expressed, including but not limited to implied warranties of merchantability, fitness for purpose, condition of title, and non-infringement, are specifically excluded. In no event shall VoiceAge Corporation and its suppliers be liable for any special, indirect or consequential damages or any damages whatsoever arising out of or in connection with the use of this information. The foregoing disclaimer shall apply to the maximum extent permitted by applicable law, even if a particular remedy fails its essential purpose.

ACELP and VoiceAge are registered trademarks of VoiceAge Corporation in Canada and/or other countries. Any unauthorized use is strictly prohibited.

© Copyright 2007 VoiceAge Corporation

VoiceAge Corporation
750 Lucerne Road, Suite 250
Montreal, QC H3R 2H6
CANADA
Telephone: (514) 737-4940
Fax: (514) 908-2037
[email protected]
www.voiceage.com


Contents

Revision history
References
The AMR codec
Package contents
Data input/output format
Discontinuous Transmission (DTX)
About the Encoder/Decoder Sample Programs
    Usage of the encoder
    Usage of the decoder
    Building the sample programs
AMR API functions
    E_IF_init
    E_IF_encode
    E_IF_exit
    D_IF_init
    D_IF_decode
    D_IF_exit


Revision history

July 2007    Updated descriptive text, frame bitmap table, references and document template.
July 2004    Second release.
July 2002    First release of this document.

References

[1] 3GPP 1999 TS 26.071, “AMR speech Codec; General description.” http://www.3gpp.org/ftp/Specs/html-info/26071.htm

[2] 3GPP TS 26.104, “ANSI-C code for the floating-point Adaptive Multi-Rate (AMR) speech codec.” http://www.3gpp.org/ftp/Specs/html-info/26104.htm

[3] IETF RFC 3267, “RTP payload format and file storage format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) audio codecs,” June 2002. http://www.ietf.org/rfc/rfc3267.txt

[4] 3GPP 1999 TS 26.101, “AMR Narrowband Speech Codec; Frame Structure.” http://www.3gpp.org/ftp/Specs/html-info/26101.htm


The AMR codec

VoiceAge’s AMR is an adaptive multi-rate narrowband speech codec with eight bit rate modes ranging from 4.75 kbps to 12.2 kbps and an additional low-bit-rate background noise mode. The codec includes a voice activity detector, a comfort noise generator and an error concealment mechanism, all of which improve speech quality over lossy transmission media. For a general description, see [1].

The implementation provided in this package is the AMR floating-point speech encoder and fast fixed-point speech decoder. The encoder produces output that is compatible with the AMR-NB IF2 format. The decoder is bit-exact with 3GPP TS 26.104 [2].

The RTP payload format defined in [3] enables the use of AMR in RTP packet-switched networks in applications such as streaming, and provides interoperability with existing codec transport formats on non-IP networks.


Package contents

These files are included in the Open AMR Initiative package.

AMR-NB.pdf       This document.
AMR-NB.lib       Win32 statically linkable library of the AMR-NB floating-point encoder/fixed-point decoder for Pentium and compatible processors.
encoder.c        Source code for the encoder test program.
decoder.c        Source code for the decoder test program.
interf_enc.h     Header files needed to compile the encoder and decoder test programs.
interf_dec.h
typedef.h
encoder.exe      Encoder test program executable.
decoder.exe      Decoder test program executable.


Data input/output format

Input to the encoder is 16-bit pulse code modulation (PCM) speech data sampled at 8 kHz. The decoder outputs the reconstructed speech data in the same format. Each input speech frame of 20 ms consists of 160 16-bit PCM words containing 14-bit left-aligned uniform samples.

The encoder outputs compressed speech data in octet-aligned (by using bit stuffing) AMR-NB Interface Format 2 (IF2), as defined in 3GPP TS 26.101 [4].

Frame structure for AMR-NB IF2

Frame Type (4 bits) | AMR-NB Core Speech Frame (size depends on bit rate mode) | Bit Stuffing (n bits)

An AMR-NB IF2 frame contains a header with a “Frame Type” field. The 4-bit “Frame Type” field identifies the current frame as either an AMR-NB codec mode, comfort noise or an empty frame. The AMR-NB core frame is the compressed speech data or comfort noise data within a 20-ms frame. The size of this data depends on the current AMR-NB codec mode. The last field contains stuffing bits, which pad the AMR-NB IF2 frame to the next multiple of eight bits. The following table shows the bit allocation for AMR-NB IF2 frames. In every row, frame type bits + core bits + padding bits = 8 x total bytes.

Table 1. Total bits used for an AMR-NB IF2 frame

Frame Type   Bit rate (kbps)    Frame type   AMR-NB      Padding   Total bytes per
Index                           bits         core bits   bits      AMR-NB IF2 frame
0            4.75               4            95          5         13
1            5.15               4            103         5         14
2            5.90               4            118         6         16
3            6.70               4            134         6         18
4            7.40               4            148         0         19
5            7.95               4            159         5         21
6            10.2               4            204         0         26
7            12.2               4            244         0         31
8            AMR SID*           4            39          5         6
9            GSM-EFR SID        4            43          1         6
10           TDMA-EFR SID       4            38          6         6
11           PDC-EFR SID        4            37          7         6
12-14        (for future use)   -            -           -         -
15           No data            4            0           4         1

*Bit rate of comfort noise (FT index 8) is 1.75 kbps when assuming continuous transmission.
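The padding and byte counts for an IF2 frame follow directly from the core-bit count: add the 4 frame-type bits and round up to whole octets. The following is a minimal sketch of that arithmetic, not code from the package. The names `amr_if2_core_bits`, `amr_if2_frame_bytes` and `amr_if2_padding_bits` are hypothetical, and the core-bit counts are the IF2 values from 3GPP TS 26.101 [4].

```c
#include <assert.h>

/* AMR-NB core bits per IF2 frame type index (values per TS 26.101).
   -1 marks indices 12-14, which are reserved for future use.
   All names in this sketch are illustrative, not part of the AMR-NB API. */
static const int amr_if2_core_bits[16] = {
    95, 103, 118, 134, 148, 159, 204, 244,  /* speech modes 0-7      */
    39, 43, 38, 37,                          /* SID frame types 8-11  */
    -1, -1, -1,                              /* 12-14: future use     */
    0                                        /* 15: no data           */
};

/* Total IF2 frame size in bytes, or -1 for a reserved index. */
int amr_if2_frame_bytes(unsigned frame_type)
{
    assert(frame_type < 16);
    int core = amr_if2_core_bits[frame_type];
    if (core < 0)
        return -1;
    /* 4 frame-type bits plus core bits, rounded up to whole octets. */
    return (4 + core + 7) / 8;
}

/* Number of stuffing bits in the frame, or -1 for a reserved index. */
int amr_if2_padding_bits(unsigned frame_type)
{
    int bytes = amr_if2_frame_bytes(frame_type);
    if (bytes < 0)
        return -1;
    return bytes * 8 - 4 - amr_if2_core_bits[frame_type];
}
```

For the 6.70-kbps mode (index 3), this gives 4 + 134 = 138 bits, rounded up to 144 bits, that is 18 bytes with 6 stuffing bits.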


Table 2, based on table A.1a in [4], shows an example of how the AMR 6.7-kbps mode is mapped into AMR IF2. The four least significant bits (LSB) of the first octet (octet 1) consist of the Frame Type (= 3) for the AMR 6.7-kbps mode (see table 1a in [4]). This data field is followed by the 134 AMR core frame speech bits (d(0)..d(133)), which consist of 58 Class A bits and 76 Class B bits as described in table 2 in [4]. This results in a total of 138 bits, and 6 bits are needed for bit stuffing to arrive at the closest multiple of 8, which is 144 bits.

Table 2. Example mapping of the AMR 6.7-kbps speech coding mode into AMR IF2
(The bits used for bit stuffing are denoted as UB, for "unused bit".)

Octet   Bit 8    Bit 7    Bit 6    Bit 5    Bit 4    Bit 3    Bit 2    Bit 1
        (MSB)                                                          (LSB)
1       d(3)     d(2)     d(1)     d(0)     0        0        1        1
                                            Frame Type (= 3), MSB in bit 4, LSB in bit 1
2       d(11)    d(10)    d(9)     d(8)     d(7)     d(6)     d(5)     d(4)
3       ...      ...      ...      ...      ...      ...      ...      d(12)
...
18      UB       UB       UB       UB       UB       UB       d(133)   d(132)
        Stuffing bits (bits 8 to 3)
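The octet layout of the 6.7-kbps example can be read back with two small helpers. This is a sketch under the assumption that the mapping is exactly as in the example: the Frame Type sits in the four least significant bits of octet 1, and core bit d(i) sits at stream position i + 4, with each octet filled from bit 1 (LSB) to bit 8 (MSB). The function names are hypothetical and not part of the package.

```c
#include <assert.h>
#include <stdint.h>

/* Frame Type: the four least significant bits of the first octet.
   Illustrative helper; not part of the AMR-NB API. */
unsigned amr_if2_frame_type(const uint8_t *frame)
{
    assert(frame != 0);
    return frame[0] & 0x0Fu;
}

/* Core bit d(i). The 4 frame-type bits come first, so d(i) is at
   stream position p = i + 4; octets are filled from LSB to MSB.
   The caller must keep i below the core-bit count of the frame. */
unsigned amr_if2_core_bit(const uint8_t *frame, unsigned i)
{
    assert(frame != 0);
    unsigned p = i + 4;
    return (frame[p / 8] >> (p % 8)) & 1u;
}
```

With this indexing, d(0) is bit 5 of octet 1, d(4) is bit 1 of octet 2, and d(133) is bit 2 of octet 18, which matches the positions shown in the example mapping.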


Discontinuous Transmission (DTX)

In a typical telephone conversation, voice transmission alternates frequently between the speaking parties, leaving long pauses of silence. These pauses can be efficiently represented as background noise and transmitted at a much lower bit rate than speech. The discontinuous transmission mode is used to encode frames that contain only background noise.

When AMR operates in DTX mode, a voice activity detector (VAD) on the transmission (TX) side evaluates whether a frame contains any voice data. In the absence of speech, a silence information descriptor (SID) frame, which contains characteristics describing the background noise, is transmitted. On the reception (RX) side, a comfort noise generator (CNG) is used to synthesize background noise based on the SID frame parameters. After sending a SID frame, the encoder on the TX side generates “no data” frames until it detects a change in the input signal (in either the background noise or the speech).
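In the IF2 stream, the three kinds of DTX output can be told apart by the Frame Type index alone: indices 0 to 7 carry speech, 8 to 11 carry SID frames, and 15 is a “no data” frame. The sketch below only shows that classification. The enum and function names are hypothetical, and what a receiver does with each class is left to the codec.

```c
#include <assert.h>

/* Coarse classes of IF2 frames (illustrative names, not library API). */
typedef enum {
    AMR_FRAME_SPEECH,    /* frame types 0-7: coded speech             */
    AMR_FRAME_SID,       /* frame types 8-11: comfort noise parameters */
    AMR_FRAME_NO_DATA,   /* frame type 15: nothing transmitted         */
    AMR_FRAME_RESERVED   /* frame types 12-14: for future use          */
} amr_frame_class;

/* Maps a 4-bit IF2 Frame Type index to its class. */
amr_frame_class amr_classify_frame_type(unsigned frame_type)
{
    assert(frame_type <= 15);
    if (frame_type <= 7)
        return AMR_FRAME_SPEECH;
    if (frame_type <= 11)
        return AMR_FRAME_SID;
    if (frame_type == 15)
        return AMR_FRAME_NO_DATA;
    return AMR_FRAME_RESERVED;
}
```

A tool that inspects an encoded file could use such a helper to count how many 20-ms frames were sent as speech, SID or “no data” and so estimate how much DTX reduced the average bit rate.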


About the Encoder/Decoder Sample Programs

The sample programs encoder.c and decoder.c demonstrate how to initialize and call the encoding and decoding processes. Input to the encoder and output from the decoder are in the form of 16-bit PCM words containing 14-bit left-aligned uniform speech samples.

Usage of the encoder

    encoder (-dtx) mode speech_file bitstream_file

-dtx                  Enables discontinuous transmission mode.
mode                  Specifies encoding at one of the 8 AMR-NB bit rates.
-modefile filename    Can be used instead of the mode argument to specify the encoding mode for each frame from a mode control file. This text file should contain one mode number (0-7) per line.

This table shows the AMR encoding modes and their bit rates.

Mode               0      1      2      3      4      5      6       7
Bit rate (kbps)    4.75   5.15   5.90   6.70   7.40   7.95   10.20   12.20
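A mode control file holds one mode number (0-7) per line, and each mode's bit rate fixes the number of core speech bits in a 20-ms frame (bit rate x 0.020 s). The sketch below shows one way to validate a line and derive the core-bit count. It is an illustration only: the function names are hypothetical, and how encoder.c itself parses the file is not documented here.

```c
#include <assert.h>
#include <stdlib.h>

/* Bit rates in bit/s for modes 0-7 (4.75 ... 12.20 kbps). */
static const int amr_mode_bps[8] = {
    4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200
};

/* Parses one line of a mode control file.
   Returns the mode (0-7), or -1 if the line is not a valid mode number.
   Hypothetical helper; not taken from encoder.c. */
int amr_parse_mode_line(const char *line)
{
    assert(line != NULL);
    char *end;
    long value = strtol(line, &end, 10);
    if (end == line)
        return -1;                      /* no digits found */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;                          /* allow trailing whitespace */
    if (*end != '\0')
        return -1;                      /* unexpected trailing text */
    if (value < 0 || value > 7)
        return -1;                      /* outside the 8 AMR-NB modes */
    return (int)value;
}

/* Core speech bits per 20-ms frame: 50 frames per second. */
int amr_mode_core_bits(int mode)
{
    assert(mode >= 0 && mode <= 7);
    return amr_mode_bps[mode] / 50;
}
```

For mode 3 (6.70 kbps) this yields 6700 / 50 = 134 core bits per frame, the same count used in the IF2 mapping example for that mode.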

Usage of the decoder

    decoder bitstream_file synth_file

Building the sample programs

To build the speech encoder or decoder sample programs, compile the file encoder.c (or decoder.c). Link the resulting object file with the static AMR-NB codec library.


AMR API functions

E_IF_init

Description       Allocates and initializes the encoder state memory.

Syntax            #include "interf_enc.h"
                  void *E_IF_init(dtx);

Arguments         dtx       dtx = 1 to enable discontinuous transmission

Returned value    void *    Pointer to the state memory used by the encoder

E_IF_encode

Description       Encodes one frame of speech data into a byte-aligned, IF2-compatible packed data stream.

Syntax            #include "interf_enc.h"
                  int E_IF_encode(Word16 mode, Word16 *speech, Uword8 *serial);

Arguments         mode      Encoding mode at one of 8 AMR-NB bit rates (0-7)
                  speech    Input buffer containing one frame of speech samples
                  serial    Output buffer containing compressed data

Returned value    Number of bytes written to output buffer
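The frame-by-frame calling pattern can be sketched without the library by passing the encode routine in as a function pointer with the same shape as the documented E_IF_encode. Everything below is an assumption made for illustration: `amr_encode_buffer` is a hypothetical helper, `int16_t` and `uint8_t` stand in for the Word16 and Uword8 types from typedef.h, and `amr_stub_encode` is a placeholder that writes only the frame type and zero bits so that the loop can be exercised on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AMR_FRAME_SAMPLES 160   /* 20 ms at 8 kHz */
#define AMR_MAX_IF2_BYTES 31    /* largest IF2 frame (12.2-kbps mode) */

/* Same argument list as the documented E_IF_encode; the fixed-width
   types are stand-ins for Word16 and Uword8 (an assumption). */
typedef int (*amr_encode_fn)(int16_t mode, int16_t *speech, uint8_t *serial);

/* Encodes every complete 160-sample frame in pcm and appends the packed
   frames to out. Stops early when fewer than AMR_MAX_IF2_BYTES remain,
   so a frame can never overflow out. Returns the bytes written. */
size_t amr_encode_buffer(amr_encode_fn encode, int16_t mode,
                         int16_t *pcm, size_t n_samples,
                         uint8_t *out, size_t out_cap)
{
    assert(encode != NULL && pcm != NULL && out != NULL);
    size_t written = 0;
    for (size_t off = 0; off + AMR_FRAME_SAMPLES <= n_samples;
         off += AMR_FRAME_SAMPLES) {
        if (out_cap - written < AMR_MAX_IF2_BYTES)
            break;
        int n = encode(mode, pcm + off, out + written);
        if (n <= 0)
            break;
        written += (size_t)n;
    }
    return written;
}

/* Placeholder encoder, NOT the real codec: emits an all-zero frame of
   the IF2 size for the given mode with the frame type in octet 1. */
int amr_stub_encode(int16_t mode, int16_t *speech, uint8_t *serial)
{
    static const int bytes_per_mode[8] = {13, 14, 16, 18, 19, 21, 26, 31};
    (void)speech;
    int n = bytes_per_mode[mode & 7];
    serial[0] = (uint8_t)(mode & 0x0F);
    for (int i = 1; i < n; i++)
        serial[i] = 0;
    return n;
}
```

With the library linked, the placeholder would be replaced by E_IF_encode (through a thin wrapper if the typedefs differ), with E_IF_init called once before the loop and E_IF_exit once after it.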

E_IF_exit

Description       Frees the encoder state memory.

Syntax            #include "interf_enc.h"
                  void E_IF_exit();

Arguments         None

Returned value    None


D_IF_init

Description       Allocates and initializes the decoder state memory.

Syntax            #include "interf_dec.h"
                  void *D_IF_init(void);

Arguments         None

Returned value    void *    Pointer to state memory used by the decoder

D_IF_decode

Description       Decodes one compressed speech frame.

Syntax            #include "interf_dec.h"
                  void D_IF_decode(Uword8 *bits, Word16 *synth);

Arguments         bits      Input buffer containing compressed data from the encoder
                  synth     Output buffer containing one frame of decoded speech samples

Returned value    None
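Because the decoder is handed one compressed frame at a time, a caller reading an IF2 bitstream has to find the frame boundaries first. One way to do that, sketched here under the assumption that frames are simply concatenated, is to read the Frame Type from the first octet of each frame and skip ahead by the IF2 frame size for that type. The table and function name are hypothetical and are not taken from decoder.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Total IF2 bytes per Frame Type index; 0 marks the reserved
   indices 12-14. Illustrative table, not part of the AMR-NB API. */
static const uint8_t amr_if2_bytes[16] = {
    13, 14, 16, 18, 19, 21, 26, 31,   /* speech modes 0-7 */
    6, 6, 6, 6,                       /* SID frame types 8-11 */
    0, 0, 0,                          /* 12-14: for future use */
    1                                 /* 15: no data */
};

/* Counts the complete IF2 frames in buf. Returns -1 if a reserved
   frame type is found or the last frame is truncated. */
long amr_if2_count_frames(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    long frames = 0;
    size_t pos = 0;
    while (pos < len) {
        size_t n = amr_if2_bytes[buf[pos] & 0x0F];
        if (n == 0 || n > len - pos)
            return -1;
        pos += n;       /* next frame starts right after this one */
        frames++;
    }
    return frames;
}
```

Each counted frame corresponds to 20 ms of audio, so the result also gives the duration of the stream; a real decode loop would pass `buf + pos` to the decode routine at each step instead of only counting.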

D_IF_exit

Description       Frees the decoder state memory.

Syntax            #include "interf_dec.h"
                  void D_IF_exit();

Arguments         None

Returned value    None
