LOSSLESS AND LOSSY AUDIO DATA COMPRESSION

Kuncoro Triandono Mukti1 and Anggunmeka Luhur Prasasti2
Computer Engineering, Telkom University, Bandung, Indonesia
1 [email protected], 2 [email protected]

Abstract - Audio is one of the fastest-growing kinds of multimedia data, especially in the music industry, which produces many large audio formats. Like image and video data, audio data raises storage issues and must support real-time access over computer networks. A smaller audio file reduces delay, so data transmission is faster. Traditional data compression is applied on computers because every symbol on a computer is represented by its own bit pattern, and compression reduces the number of bits produced for each symbol. The result is expected to take less storage space. As with image compression, there are two kinds of audio compression techniques, lossy and lossless. For everyday use, lossy compression is more common because it yields a large compression ratio, that is, a very small audio file. This paper discusses the basic principles of audio compression, a comparison of algorithms, and audio file formats.

Keywords - Audio Compression, Lossless, Lossy.

I. INTRODUCTION

In general, data compression converts symbols into codes. Compression is achieved when the resulting code is smaller than the original symbol. The basic symbols are mapped to special codes according to a model: simply put, a model is a set of data and rules that determines which code is output for each symbol [3]. Audio is one of the fastest-growing kinds of multimedia data, especially in the music industry, which produces many large audio formats. Like image and video data, audio data raises storage issues and must support real-time access over computer networks. A smaller audio file reduces delay, so data transmission is faster [1].

Like most compression techniques, audio data compression, both lossy and lossless, exploits information redundancy through encoding, pattern recognition, and linear prediction, much as video compression does. In lossless compression, the result can be restored to the original data without any change, so the compression ratio cannot be very large if all data must be recoverable. Lossy compression discards some of the information in the original data, so the decompressed output is not exactly the same as the original. Lossy compression in particular exploits the limits of the human senses, which perceive the environment only within a certain range: sound is divided into four groups by frequency, of which the range audible to the human ear is 20 Hz to 20,000 Hz.

The two most popular audio formats are FLAC and MP3. In terms of quality, FLAC is definitely better than MP3. In practice, however, especially in Indonesia, more people choose to listen to the MP3 format (MPEG-1 Audio Layer 3) because of its small size. Compared with MP3, FLAC requires considerably more space. If CD-quality audio uses a 44.1 kHz sampling rate, 16 bits per sample, and 2 channels (stereo), then one second of audio takes approximately 176,400 bytes, so 60 seconds (1 minute) takes 10.584 MB. If an average song lasts about 4 minutes, it takes about 42.336 MB of space to store one song, and one CD can hold only 16 songs [2, 4].

Currently there are many compression algorithms, including Dynamic Markov Compression (DMC), Run Length Encoding (RLE), Lempel-Ziv Welch (LZW), Arithmetic coding, Huffman Code, Rice Code, Golomb Code, Burrows-Wheeler (BW) Transform, and others. Previous research suggests that the Huffman algorithm is faster and better at audio compression. According to [11], the Huffman algorithm is better, faster, and produces a higher PSNR than arithmetic coding. According to [12], Huffman compresses better than LZW and DMC for binary files, multimedia files, image files, and already-compressed files.
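The CD-quality storage figures quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `pcm_bytes` is hypothetical, megabytes are taken as 10^6 bytes, and the 700 MB disc capacity is an assumption that is not stated in this paper.

```python
# Storage needed for uncompressed CD-quality PCM audio.
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
BYTES_PER_SAMPLE = 2      # 16 bits per sample
CHANNELS = 2              # stereo
CD_CAPACITY_MB = 700      # assumption: a typical 700 MB disc (not from the paper)


def pcm_bytes(seconds: int) -> int:
    """Bytes of raw PCM audio for the given duration (hypothetical helper)."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * seconds


bytes_per_second = pcm_bytes(1)
mb_per_minute = pcm_bytes(60) / 1_000_000
mb_per_song = pcm_bytes(4 * 60) / 1_000_000      # a 4-minute song
songs_per_cd = int(CD_CAPACITY_MB // mb_per_song)

print(bytes_per_second, mb_per_minute, mb_per_song, songs_per_cd)
```

Under these assumptions the numbers agree with the text: 176,400 bytes per second, about 10.584 MB per minute, about 42.336 MB per song, and 16 whole songs per disc.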

In this paper, we first explain the basic theory of lossless and lossy compression, then the algorithms that can be used for compression, and finally compare the results of previous research.

II. LOSSLESS & LOSSY COMPRESSION

Data compression is the process of encoding information using fewer bits (or other information-bearing units) than an unencoded representation would use, by means of a particular encoding system. There are two compression techniques, lossless and lossy. The following sections explain lossless compression, lossy compression, and the audio data compression algorithms. The classification is shown in figure 1.

figure 1. Classification of data compression

2.1. LOSSLESS

Lossless compression of audio data means that the compressed data can be decompressed to produce exactly the same data as the original, without any loss of quality at all. Lossless compression of audio data is somewhat similar to generic lossless compression, with a compression ratio of about 50% to 60%, although it can reach 35% for orchestral music or choral recordings with little noise. A lossless compression program usually uses two different types of algorithm: the first builds a statistical model of the input data, and the second maps the input data to bit sequences using this model in such a way that "probable" data produces shorter output than "improbable" data. The main encoding algorithms used to generate the bit sequences are Huffman coding and arithmetic coding [6].

figure 2. Lossless compression

A. Usability of Lossless Compression

Lossless compression is used primarily for archiving and editing. For archiving, the desired quality is of course the best quality, and the same holds for editing: editing lossy compressed data degrades the sound quality each time the data is saved. For this reason lossless compression is always used in sound engineering. Beyond these two uses, lossless compression is also common among audiophiles, music fans who enjoy listening to high-quality music on high-quality hardware. Lossless compressed audio data is also used as the source for generating lossy audio data for distribution. Nowadays, with the falling cost of digital storage media and bandwidth, lossless compression is becoming increasingly popular among consumers.

B. Basic Principle of Lossless Compression

There are two main stages in lossless compression of audio data: prediction and coding. Prediction uses the previous samples to predict the next sample. The difference between the predicted sample and the actual sample is then coded. Formats usually differ only in their prediction and/or coding techniques. An early audio format that supports lossless compression is Shorten, while those commonly used today are Free Lossless Audio Codec (FLAC), Apple Lossless, MPEG-4 ALS, Monkey's Audio, WavPack, and True Audio. Each format has different steps or stages in its compression. As an example, this paper explains the compression process in the FLAC format.
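The two stages described above (predict the next sample from previous ones, then keep only the difference) can be sketched as follows. This is a minimal illustration, not the predictor of any particular format: the function names and the three fixed polynomial predictors in `COEFFS` are assumptions chosen for brevity.

```python
# Fixed polynomial predictors, indexed by order (illustrative assumption):
#   order 0: predict 0
#   order 1: predict x[n-1]
#   order 2: predict 2*x[n-1] - x[n-2]
COEFFS = {0: [], 1: [1], 2: [2, -1]}


def _predict(history, coeffs):
    """Predict the next sample from the most recent samples in `history`."""
    return sum(c * history[-1 - k] for k, c in enumerate(coeffs))


def predict_encode(samples, order):
    """Return (warm-up samples stored verbatim, residuals for the rest)."""
    coeffs = COEFFS[order]
    warmup = list(samples[:order])
    residuals = [
        samples[n] - _predict(samples[:n], coeffs)
        for n in range(order, len(samples))
    ]
    return warmup, residuals


def predict_decode(warmup, residuals, order):
    """Rebuild the exact samples from warm-up values and residuals."""
    coeffs = COEFFS[order]
    out = list(warmup)
    for e in residuals:
        out.append(_predict(out, coeffs) + e)
    return out


samples = [100, 102, 104, 107, 109, 110]
for order in (1, 2):
    warmup, residuals = predict_encode(samples, order)
    assert predict_decode(warmup, residuals, order) == samples
    print(order, warmup, residuals)
```

For this smooth toy signal the residuals are much smaller in magnitude than the samples themselves, which is what makes the subsequent coding stage effective; the decoder recovers the input exactly, so the scheme stays lossless.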

Example Audio Compression on FLAC

The FLAC format, published by the Xiph.Org Foundation, exploits the high correlation between samples in audio data. FLAC uses linear prediction to convert samples into a series of numbers called residues, which are then stored with Golomb-Rice coding. The resulting compression ratio is 40% to 50%. The compression process consists of several stages:

• Blocking. A block in FLAC is a run of samples spanning multiple channels. The block size may vary depending on several factors, including the sample rate, and it affects the compression ratio directly. If the block size is too small, many frames are needed, so many bits are wasted on frame headers. If it is too large, the characteristics of the audio signal vary too much within a block, making it difficult to find an optimal predictor. FLAC limits the block size to between 16 and 65535 samples per block.

• Interchannel decorrelation. For stereo data, the left and right channels are often strongly correlated. There are therefore several ways of storing the channels in a block:
  o Independent: both channels are encoded separately.
  o Mid-side: the average of both channels is stored as the mid channel, and the difference between the left and right channels as the side channel.
  o Left-side: the left channel and the side channel are stored.
  o Right-side: the right channel and the side channel are stored.
  In certain cases, left-side or right-side is the most efficient method [15].

• Prediction. The encoder looks for an approximate mathematical description of the signal in each block. This description is generally much smaller than the signal itself. The prediction method is known to both the encoder and the decoder, so the compressed result only needs to include the prediction parameters. FLAC uses four prediction methods:
  o Verbatim: the predicted signal is zero, so the residue is the same as the actual signal (no compression).
  o Constant: used when a channel in a block contains digital silence or a constant value. The encoding used is run-length.
  o Fixed linear prediction.
  o FIR linear prediction.

• Residual coding using Golomb-Rice coding. When the predictor cannot describe the signal exactly, the difference between the original signal and the predicted signal must be kept. This difference is called the residue. The effectiveness of the prediction can be judged from the size of the residue. The residue is stored in one of two ways with Rice coding:
  o Using one parameter for the entire residue. This parameter is based on the variance of the residual values.
  o Dividing the residue into several parts of equal length, each with its own parameter determined from the average value of the residue.
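Two of the stages above, mid-side decorrelation and Rice coding of the residue, can be sketched in a few lines. This is a simplified illustration written for this overview, not the FLAC bitstream: the function names are hypothetical, the unary part is written as zeros terminated by a one, signed values are folded to non-negative integers with a zigzag mapping, and the Rice parameter is chosen by brute force rather than from the variance or average mentioned in the text. All of these are assumptions.

```python
def mid_side_encode(left, right):
    """Store the (floored) average as mid and the difference as side."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def mid_side_decode(mid, side):
    """Exact inverse: l + r and l - r share parity, so side restores the lost bit."""
    left, right = [], []
    for m, s in zip(mid, side):
        total = (m << 1) | (s & 1)          # recover l + r
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right


def zigzag(e):
    """Map signed residue to non-negative: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return 2 * e if e >= 0 else -2 * e - 1


def unzigzag(u):
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(values, k):
    """Rice code with parameter k, returned as a string of '0'/'1' characters."""
    out = []
    for e in values:
        u = zigzag(e)
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        low_bits = format(remainder, f"0{k}b") if k else ""
        out.append("0" * quotient + "1" + low_bits)
    return "".join(out)


def rice_decode(bits, k, count):
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "0":             # unary part
            quotient += 1
            pos += 1
        pos += 1                            # skip the terminating '1'
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append(unzigzag((quotient << k) | remainder))
    return values


def best_k(values, max_k=14):
    """Brute-force choice of the parameter giving the shortest output (assumption)."""
    return min(range(max_k + 1), key=lambda k: len(rice_encode(values, k)))


left, right = [10, 12, 13, 11, -3], [9, 12, 14, 10, -4]
mid, side = mid_side_encode(left, right)
assert mid_side_decode(mid, side) == (left, right)

k = best_k(side)
bits = rice_encode(side, k)
assert rice_decode(bits, k, len(side)) == side
print(mid, side, k, bits)
```

Small residues cost only a few bits each under this code, which is why the quality of the predictor, and of the channel decorrelation before it, determines the final compression ratio.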

C. Algorithm for Lossless Compression

• Huffman Code
• Golomb Code
• Rice Code
• Tunstall Code
• Arithmetic Code
• Dictionary Code
• Run-Length Code
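As an illustration of the first entry in the list, and of the idea that "probable" symbols should receive shorter codes than "improbable" ones, the following is a minimal Huffman coding sketch. It works on a short text string rather than audio samples, and the function names are hypothetical; it is not the implementation used in any of the studies compared later.

```python
import heapq
from collections import Counter


def huffman_code(data):
    """Build a prefix code {symbol: bit string} from symbol frequencies."""
    freq = Counter(data)
    if len(freq) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap items: (weight, tie-breaker, partial code table for that subtree).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)      # two least frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(data, code):
    return "".join(code[s] for s in data)


def huffman_decode(bits, code):
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:               # prefix-free: first match is the symbol
            out.append(inverse[current])
            current = ""
    return "".join(out)


data = "aaaaaabbbc"
code = huffman_code(data)
bits = huffman_encode(data, code)
assert huffman_decode(bits, code) == data
print(code, len(bits), "bits instead of", 8 * len(data))
```

The most frequent symbol ends up with the shortest code word, so the encoded length depends on how skewed the symbol statistics are. This is consistent with the observation later in the paper that already-compressed formats gain very little from a further Huffman pass.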

2.2. LOSSY

Lossy compression is a technique in which the decompressed data is not the same as the data before compression but is "good enough" to be used. Examples: MP3, streaming media, JPEG, MPEG, and WMA. The advantage of this method is its smaller size compared with lossless compression. In lossy compression of audio data, quality decreases once the compressed result is decompressed. This quality degradation is called compression "artifacts".

Usually this technique removes parts of the data that are not very useful, not perceived, or not visible, so that people still consider the data usable after it has been compressed, as with MP3. Examples of such methods are transform coding, wavelet coding, and others. Lossy compression is also called irreversible compression because the original data cannot be restored. The advantage of this technique is a high compression ratio compared with lossless methods [6].

Audio formats that support lossy compression include the very popular MP3, which is the part of MPEG that handles audio (MPEG-1 Audio Layer III), AAC, which is its further development, and OGG. For speech data there are several formats, such as A-law / μ-law used in telephony, AMR on GSM, AMR-WB for CDMA, and so on.

figure 3. Lossy compression

A. Usability of Lossy Compression

Lossy compression of audio data is very widely used, either directly (e.g. on MP3 players) or indirectly (in DVD video, digital television, video streaming, etc.). Music lovers use it because the compression ratio can be very high, anywhere between 0% and 100%, while the sound quality remains "good enough".

B. Basic Principle of Lossy Compression

Lossy compression of audio data relies primarily on psychoacoustics, which concerns how the human ear perceives sound. The human ear can only capture sounds with frequencies between 20 Hz and 20,000 Hz. There are also several other basic techniques in lossy compression of audio data, namely:

• Voc file compression. This technique is very simple: it removes silent samples (no sound), such as pauses between paragraphs in a speech or moments of silence in parts of a song.

• Linear Predictive Coding (LPC) and Code Excited Linear Prediction (CELP). CELP is a further development of LPC, with a more complex analytical model that produces greater compression ratios and better sound quality. Somewhat as in lossless compression, in CELP the difference between the original sound and the analytical model is also stored, in compressed form.

• ADPCM. Simply put, the first sample is kept intact, while for each following sample only the difference from the previous sample is stored, which is generally not very large.

• MPEG. This technique uses psychoacoustic theory. If part of the sound cannot be heard by the human ear, that part does not need to be encoded. A related psychoacoustic technique is noise shaping: high-frequency signals can only be heard by humans at large volumes, so noise is "hidden" in these high-frequency areas at small volumes.
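The difference-coding idea behind the ADPCM item above can be sketched as follows. This is a deliberately simplified DPCM with a fixed uniform quantizer, not a real ADPCM codec: actual ADPCM adapts its step size over time, and the function names and the fixed `step` parameter are assumptions made for illustration.

```python
def dpcm_encode(samples, step):
    """Keep the first sample; store later samples as quantized differences.

    Each difference is taken against the decoder's reconstruction, not the
    true previous sample, so quantization errors do not accumulate.
    """
    first = samples[0]
    reconstructed = first
    codes = []
    for x in samples[1:]:
        q = round((x - reconstructed) / step)    # uniform scalar quantizer
        codes.append(q)
        reconstructed += q * step
    return first, codes


def dpcm_decode(first, codes, step):
    out = [first]
    for q in codes:
        out.append(out[-1] + q * step)
    return out


samples = [0, 10, 25, 27, 20, 5]

# A coarse step gives small codes but an inexact (lossy) reconstruction.
first, codes = dpcm_encode(samples, step=4)
decoded = dpcm_decode(first, codes, step=4)
print(codes, decoded)

# With integer samples and step 1, nothing is discarded.
assert dpcm_decode(*dpcm_encode(samples, 1), 1) == samples
```

A larger step produces smaller code values, which compress better, at the price of a larger reconstruction error that is bounded by half the step per sample in this sketch. That trade-off between size and fidelity is the essence of lossy coding.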

C. Algorithm for Lossy Compression

• Scalar Quantization
• Vector Quantization

III. AUDIO FILE FORMAT

Audio file formats are grouped into three classes: free and open file formats, open file formats, and proprietary formats. Details are shown in table 1.

Table 1. Audio File Format [16]

Free and Open File Formats

Format | Description
*.wav | Standard audio file container format used mainly on Windows PCs. Commonly used for storing uncompressed (PCM), CD-quality sound files, which means that they can be large in size, around 10 MB per minute. Wave files can also contain data encoded with a variety of codecs to reduce the file size (for example the GSM or MP3 codecs). Wav files use a RIFF structure.
*.ogg | A free, open source container format supporting a variety of codecs, the most popular of which is the audio codec Vorbis. Vorbis offers compression similar to MP3 but is less popular.
*.mpc | Musepack or MPC (formerly known as MPEGplus, MPEG+ or MP+) is an open source lossy audio codec, specifically optimized for transparent compression of stereo audio at bitrates of 160-180 kbit/s. Musepack and Ogg Vorbis are rated as the two best available codecs for high-quality lossy audio compression in many double-blind listening tests. Nevertheless, Musepack is even less popular than Ogg Vorbis and nowadays is used mainly by audiophiles.
*.flac | A lossless compression codec. This format is lossless compression like zip, but for audio. If you compress a PCM file to FLAC and then restore it, the result is a perfect copy of the original. (All the other codecs discussed here are lossy, which means a small part of the quality is lost.) The cost of this losslessness is that the compression ratio is not good. FLAC is recommended for archiving PCM files where quality is important (e.g. broadcast or music use).
*.aiff | The standard audio file format used by Apple. It is like a wav file for the Mac.
*.raw | A raw file can contain audio in any codec but is usually used with PCM audio data. It is rarely used except for technical tests.
*.au | The standard audio file format used by Sun, Unix and Java. The audio in au files can be PCM or compressed with the μ-law, A-law or G.729 codecs.
*.mid | An industry-standard protocol that enables electronic musical instruments, computers, and other equipment to communicate, control, and synchronize with each other.

Open File Formats

Format | Description
*.gsm | Designed for telephony use in Europe, gsm is a very practical format for telephone-quality voice. It makes a good compromise between file size and quality. Note that wav files can also be encoded with the gsm codec.
*.dct | A variable codec format designed for dictation. It has dictation header information and can be encrypted (often required by medical confidentiality laws).
*.vox | The vox format most commonly uses the Dialogic ADPCM (Adaptive Differential Pulse Code Modulation) codec. Like other ADPCM formats, it compresses to 4 bits. Vox files are similar to wave files except that they contain no information about the file itself, so the codec, sample rate, and number of channels must be specified before a vox file can be played.
*.aac | The Advanced Audio Coding format is based on the MPEG-2 and MPEG-4 standards. aac files are usually ADTS or ADIF containers.
*.mp4/m4a | MPEG-4 audio, most often AAC but sometimes MP2/MP3.
*.mmf | A Samsung audio format used for ringtone music.

Proprietary Formats

Format | Description
*.mp3 | The MPEG Layer-3 format is the most popular format for downloading and storing music. By eliminating portions of the audio file that are essentially inaudible, mp3 files are compressed to roughly one-tenth the size of an equivalent PCM file while maintaining good audio quality.
*.wma | The popular Windows Media Audio format owned by Microsoft. Designed with Digital Rights Management (DRM) abilities for copy protection.
*.wav | The older style Sony ATRAC format. It always has a .wav file extension. To open these files, simply install the ATRAC3 drivers.
*.ra | A Real Audio format designed for streaming audio over the Internet. The .ra format allows files to be stored in a self-contained fashion on a computer, with all of the audio data contained inside the file itself.
*.ram | A text file that contains a link to the Internet address where the Real Audio file is stored. The .ram file contains no audio data itself.
*.dss | Digital Speech Standard files are an Olympus proprietary format. It is a fairly old and poor codec. Prefer gsm or mp3 where the recorder allows. It allows additional data to be held in the file header.
*.msv | A Sony proprietary format for Memory Stick compressed voice files.
*.dvf | A Sony proprietary format for compressed voice files; commonly used by Sony dictation recorders.
*.ivs | A proprietary version with Digital Rights Management developed by 3D Solar UK Ltd for use in music downloaded from their Tronme Music Store and interactive music and video player.
*.mp4 | A proprietary version of AAC in MP4 with Digital Rights Management developed by Apple for use in music downloaded from their iTunes Music Store.
*.iklax | An iKlax Media proprietary format. The iKlax format is a multi-track digital audio format allowing various actions on musical data, for instance mixing and volume arrangements.
*.mxp4 | A Musinaut proprietary format allowing play of different versions (or skins) of the same song.

An example of compression with different formats is shown in figure 4.

figure 4. Compression with different file formats

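IV. RESULTS AND DISCUSSION

Table 2. Comparison of Lossy and Lossless Algorithms

Reference | Algorithm | Type | Source format | Result format | Compression result (%)
[1] | Huffman | Lossless | *.aac | *.aac | 3%
[1] | Huffman | Lossless | *.flac | *.flac | 1%
[1] | Huffman | Lossless | *.midi | *.midi | 1% - 35%
[1] | Huffman | Lossless | *.mp3 | *.mp3 | 1% - 3%
[1] | Huffman | Lossless | *.wav | *.wav | 10% - 27%
[2] | Huffman Shift Coding | Lossy | *.flac | *.mp3 | 90% - 95%
[3] | Huffman | Lossless | *.mp3 | *.mp3 | 1%
[3] | Huffman | Lossless | *.wav | *.wav | 17%
[3] | Huffman | Lossless | *.wma | *.wma | 2%
[4] | Huffman | Lossless | *.wav | *.wav | 20% - 40%
[5], [6] | Arithmetic | Lossy | *.mp3 | *.mp3 | 2.4%
[5], [6] | Arithmetic | Lossy | *.wav | *.wav | 11%
[5], [6] | Arithmetic | Lossy | *.aif | *.aic | 18.443%
[5], [6] | Arithmetic | Lossy | *.au | *.aic | 27.589%
[5], [6] | Arithmetic | Lossy | *.mid | *.aic | 31.819%
[5], [6] | Arithmetic | Lossy | *.wav | *.aic | 36.741%
[7] | Huffman | Lossless | - | - | 40.56%
[7] | Lempel-Ziv Welch (LZW) | Lossless | - | - | 51.47%
[8] | Run Length Encoding (RLE) | Lossless | *.mp3 | *.mp3 | 0.46%
[8] | Run Length Encoding (RLE) | Lossless | *.wav | *.wav | 13.83%
[9] | Lempel-Ziv Welch (LZW) | Lossless | *.mp3 | *.wav | 3.73% - 29.56%
[10] | Huffman Shift Coding | Lossless | *.wav | *.wav | 14.87%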

In study [1], compression and decompression were tested on 30 sample files to measure compression ratio and time. The test concluded that the formats, sorted from highest to lowest compression ratio, are *.midi, *.wav, *.mp3, *.aac, *.flac. In contrast to studies [1], [3], [4], and [7], which used the Huffman algorithm, study [2] used the Huffman Shift Coding algorithm to perform lossy compression and obtained a very satisfactory compression ratio, reaching 90%. Studies [5] and [6], which used arithmetic coding for lossy compression, reached a highest compression ratio of only 31.819% in *.wav format. In studies [7] and [9] the algorithm used was Lempel-Ziv Welch (LZW) with lossless compression; the highest compression ratio was 51.47%. Of all the algorithms compared, Run Length Encoding (RLE) in study [8] achieved the lowest result, a compression ratio of only 0.46%.

V. CONCLUSION

From previous research it can be concluded that Huffman coding is currently the most popular algorithm for lossless compression. With today's advances in data storage technology, a variety of lossless compression schemes are being developed, because storage has become relatively cheap and listeners want a comfortable experience when listening to music. Across the formats used as test objects, *.wav is the format with the best compression results. Grouped by type and algorithm, lossless compression is best served by Lempel-Ziv Welch (LZW), as shown by study [7], and lossy compression by the Huffman Shift Coding algorithm, which achieves a compression ratio of over 90% [2].

All of this is based on previous research. At present, applications that compress audio, such as Format Factory, Xilisoft, Instagram, Facebook, Line, and WhatsApp, use their own compression, which produces different results even though the resulting data sizes do not differ much. The compression percentage is therefore not the most important factor, because it can be adjusted; what matters now is a good compression result. For example, MSE (mean squared error) measures the difference between the original file and the compression result, and a smaller value indicates a better result. If the compression reaches 100% but the MSE is very large, the compression result is not good.
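The MSE criterion mentioned above, together with the PSNR figure referred to in the introduction, can be computed as in the following sketch. The function names are hypothetical, the two signals are assumed to be equal-length lists of integer samples, and the peak value of 32767 (16-bit audio) is an assumption.

```python
import math


def mse(original, decoded):
    """Mean squared error between two equal-length sample sequences."""
    if len(original) != len(decoded):
        raise ValueError("signals must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)


def psnr(original, decoded, peak=32767):
    """Peak signal-to-noise ratio in dB; peak defaults to 16-bit audio (assumption)."""
    error = mse(original, decoded)
    return math.inf if error == 0 else 10 * math.log10(peak ** 2 / error)


original = [0, 10, 25, 27, 20, 5]
lossy = [0, 8, 24, 28, 20, 4]        # e.g. the output of a lossy codec

print(mse(original, lossy), psnr(original, lossy))
print(mse(original, original), psnr(original, original))
```

A lossless codec always gives an MSE of zero, so the measure is only informative when comparing lossy results: of two lossy outputs with the same size, the one with the lower MSE (higher PSNR) is closer to the original.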

VI. REFERENCES

[1] Rendra Warsita, Rahmat Agus Setiawan, Yoannita, Rancang Bangun Aplikasi Kompresi Audio Berbasis Android Menggunakan Algoritma Huffman. 2015.

[2] Luthfi Firmansyah, Data Audio Compression Lossless FLAC Format to Lossy Audio MP3 Format with Huffman Shift Coding Algorithm. Fourth International Conference on Information and Communication Technologies (ICoICT). 2016.

[3] Ari Wibowo, Kompresi Data menggunakan Algoritma Huffman. Batam. 2007.

[4] Hari Purwanto, Penerapan Algoritma Huffman pada Kompresi file WAV. JURNAL VOL.2, No.2-40-59 Universitas Suryadarma. 2015.

[5] Uswatun Hasanah, Implementasi Algoritma Arithmetic Coding pada Kompresi File Audio via FTP (File Transfer Protocol). Skripsi. 2017.

[6] Yahya Fathoni Amri, Analisis Kinerja Kompresi File Audio Menggunakan Algoritma Arithmetic Coding dengan Bilangan Integer. 2012.

[7] Rhen Anjerome Bedruz, Ana Riza F. Quiros, Comparison of Huffman Algorithm and Lempel-Ziv Algorithm for Audio, Image and Text Compression. 8th IEEE International Conference Humanoid, Nanotechnology, Information Technology Communication and Control, Environment and Management (HNICEM). 2015.

[8] Aditya Rahandi, Dian Rachmawati, Sajadin Sembiring, Analisis dan Implementasi Kompresi File Audio Dengan Menggunakan Algoritma Run Length Encoding (RLE). Jurnal Online Program Studi S1 Ilmu Komputer, Vol 1, No 1. 2012.

[9] Erwin Dwika Putra, Dedy Abdullah, Analisis Perbandingan Kompresi Gambar (*.bmp) dan Audio (*.wav) Menggunakan Algoritma Lempel Ziv Welch (LZW). Bengkulu. Amplifier Vol. 6 No. 2, Mei 2016.

[10] Galang Bagus Prasetyo, Kompresi File Audio Wave Menggunakan Algoritma Huffman Shift Coding. Sarjana thesis, Universitas Brawijaya. 2013.

[11] Venkatasekhar D., Aruna P., A Fast Fractal Image Compression Using Huffman Coding. Dept of Computer Science & Engg., Annamalai University, Annamalai Nagar, India. 2012.

[12] Sutoyo T., Teori Pengolahan Citra Digital. Andi, Yogyakarta. 2009.

[13] Daryanto T., Sistem Multimedia dan Aplikasinya. Yogyakarta: Graha Ilmu. 2005.

[14] Benjamin A., Music Compression Algorithms and Why You Should Care. Alexander Benjamin. 2010.

[15] Satrio Adi Rukmono, Kompresi Data Audio. Bandung. 2009.

[16] Audio File Format, accessed 2008.
