ATS: a Lisp Environment for Spectral Modeling

Juan Pampin
Center for Computer Research in Music and Acoustics (CCRMA), Stanford University
juan@ccrma.stanford.edu, http://www-ccrma.stanford.edu/juan

Abstract

ATS is a library of Lisp functions for spectral Analysis, Transformation, and Synthesis of sound. It provides a variety of tools for spectral modeling, including different analysis front-ends, complex transformation algorithms, and several synthesis techniques. This paper presents a snapshot of ATS' current state, as well as an overview of recent research work related to this on-going project.

1 Introduction

ATS is a library of Lisp functions for spectral Analysis, Transformation, and Synthesis of sound. The Analysis section of ATS implements two complementary partial tracking algorithms. This allows the user to decide which strategy is best suited for a particular sound to be analyzed. Analysis data is stored as a Lisp abstraction called sound. A sound in ATS is a symbolic object representing a spectral model that can be sculpted using a wide variety of transformation functions. ATS sounds can be synthesized using different target algorithms, including additive, subtractive, granular, and hybrid synthesis techniques.

2 Symbolic Processing

ATS is written in Common Lisp [1]. Using Lisp's listener the user can interact with the system in many ways, performing analysis of sounds, visualizing and transforming spectral data, and running synthesis in real time. The use of a high level language like Common Lisp presents the advantage of a symbolic representation of spectral qualities [2]. For instance, high level traits of a sound, such as global spectral envelopes, frequency centroids, formants, vibrato patterns, etc., can be treated as symbolic objects and used to create abstract sound structures or spectral classes. In a higher layer of abstraction, the concept of spectral class is used to implement predicates and procedures, conforming spectral logic operators. For instance, in terms of this logic, sound morphing becomes a union (a dynamic exchange of features) of spectral classes that generates a particular hybrid sound instance.

Spectral information is stored in a data abstraction called sound. Sounds can be treated symbolically like any other Lisp object. This means that the user can map functions or closures through lists of sounds or include sound objects in algorithmic composition procedures in a transparent way, spectral and musical data being interchangeable.

3 Analysis

Sound analysis in ATS is based on the sinusoidal model [3]. In its present stage, ATS has two analysis front-ends, one using a pitch-synchronous algorithm (sieve), and a more complex one performing partial tracking and extraction (tracker). Both functions have their foundation on the Short-Time Fourier Transform (STFT) and perform sinusoidal parameter detection (i.e. frequency, amplitude, and phase). The output of these analysis functions is an ATS sound structure.
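As a rough illustration of what it means to treat sounds like any other Lisp object, the sketch below maps a function through a list of sounds. The `toy-sound` structure, its slots, and `spectral-centroid` are hypothetical stand-ins introduced only for this example; they are not the actual ATS sound structure or API.

```lisp
;; Hypothetical stand-in for an ATS sound (NOT the real ATS structure).
(defstruct toy-sound
  name          ; symbol naming the sound
  frequencies   ; list of mean partial frequencies in Hz
  amplitudes)   ; list of mean partial amplitudes (linear scale)

(defun spectral-centroid (snd)
  "Amplitude-weighted mean frequency of SND, in Hz."
  (let ((freqs (toy-sound-frequencies snd))
        (amps (toy-sound-amplitudes snd)))
    (/ (reduce #'+ (mapcar #'* freqs amps))
       (reduce #'+ amps))))

(defparameter *sounds*
  (list (make-toy-sound :name 'low
                        :frequencies '(100.0 200.0)
                        :amplitudes '(1.0 1.0))
        (make-toy-sound :name 'high
                        :frequencies '(1000.0 3000.0)
                        :amplitudes '(3.0 1.0))))

;; Map a function through a list of sounds, as with any other Lisp object.
(defparameter *centroids* (mapcar #'spectral-centroid *sounds*))
```

Because the sounds are ordinary Lisp data, the same list could just as well be filtered, sorted by centroid, or passed to a composition procedure.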
3.1 Sieve

This algorithm performs harmonic tracking of partials. The parameters for the STFT are computed as a function of the fundamental of the analyzed sound. This function is useful for stable, isolated harmonic tones: it places detected peaks into a frequency sieve that is used to create sinusoidal trajectories. The algorithm is based on a pitch-synchronous strategy in which each track of the sieve has a controllable frequency bandwidth dependent on the fundamental.

    (let* ((partials (ats-sound-partials my-sound))
           (transp-env (loop for i from 0 below partials
                             with even-env = '(0 1.0 1 2.0)
                             with odd-env = '(0 1.0 1 0.5)
                             collect (if (oddp i) odd-env even-env))))
      (trans-sound 'my-sound transp-env
                   :formants T :name 'my-new-sound :simp T))

Figure 1: Example Lisp code using trans-sound.

3.2 Tracker

Tracker implements a more robust analysis algorithm than sieve, one that is suitable for the analysis of non-harmonic tones. After performing peak detection and interpolation, peaks are continued across frames, and sinusoidal trajectories are extracted [4]. Tracker also uses psychoacoustic information to determine the salience of each detected trajectory. This information is based on the masking effects produced within critical bands and accounts for the audibility of the sinusoidal trajectories. Salience information can be used to reduce the number of partials using a psychoacoustic metric. In a post-processing stage after analysis, the user can decide to keep just a certain number of trajectories, filtering out the less salient ones. Salience can also be used to decide which partials to synthesize in a real-time scenario where synthesis resources are limited [5].
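The post-processing step described for tracker, keeping only a given number of the most salient trajectories, can be sketched as follows. This is a minimal sketch under stated assumptions: each trajectory is reduced to a hypothetical `(id . salience)` pair with salience expressed as a single number in dB, and `keep-most-salient` is an illustrative name, not an ATS function.

```lisp
;; Assumption: a trajectory is summarized as (id . salience-in-dB).
;; This is an illustrative representation, not the ATS data layout.
(defun keep-most-salient (trajectories n)
  "Return the N trajectories with the highest salience, most salient first."
  (let ((sorted (sort (copy-list trajectories) #'> :key #'cdr)))
    ;; Guard against N exceeding the number of trajectories.
    (subseq sorted 0 (min n (length sorted)))))

(defparameter *trajectories*
  '((0 . 32.5) (1 . 4.0) (2 . 18.2) (3 . -3.1)))

;; Keep the two most salient trajectories (ids 0 and 2).
(defparameter *kept* (keep-most-salient *trajectories* 2))
```

The same ranking could serve the real-time case mentioned above, where only as many partials as the synthesis budget allows are rendered.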
4 Transformations

A sound in ATS is an intermediate representation of the spectral evolution of an analyzed signal [6]. The user can manipulate the parameters of a sound to perform spectral transformations on it. Transformations can be destructive (i.e. the original sound structure is changed) or generative (i.e. the transformation generates a new instance of the sound, keeping the original sound untouched). An ATS sound can accumulate several transformations before being resynthesized.

Parameters passed to the transformation functions can be of any of the following types:

1. A number: the transformation is parallel and synchronous; this numeric value is used to transform all partials over all the time frames of the sound.
2. A list of numbers: each partial is transformed with a different value.
3. A list containing a CLM-style [7] envelope: the transformation is dynamic.
4. A list of envelopes: each partial is transformed with a different dynamic value.
5. A list of numbers and envelopes: some partials are transformed with constant values and others with dynamic ones.

Transformation functions operate on frequency, amplitude, and time parameters. ATS has at present about 20 built-in transformation functions that can be combined using macros to design more complex algorithms. In the following sections three of the transformation functions are presented as examples.

4.1 Transposition

trans-sound sound transposition &key formants name simp

The function trans-sound performs frequency transposition. This function takes the following arguments:

sound: sound instance to transform.

transposition: transposition factor (being of any of the formats described above).

formants: if T (true), formants of the sound are kept after transposition. The amplitudes of the partials are scaled according to the spectral envelope of the original sound.

name: optional name for the newly generated sound (if NIL the function is destructive).

simp: if T (true), partials with a mean frequency over half the sampling rate or below zero Hertz after transformation are eliminated from the sound structure.

In the example of Figure 1, even and odd partials are transposed using different envelopes. The loop macro creates a list of envelopes, transp-env, that is used in the call to the function.
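The parameter formats accepted by the transformation functions (number, list of numbers, list containing one envelope, list of envelopes, mixed list) can be told apart at run time. The sketch below shows one possible way to resolve such a parameter into a value for a given partial and time; it is not how ATS does it internally. Both function names are hypothetical, envelopes are assumed to be flat breakpoint lists `(x0 y0 x1 y1 ...)` as in Figure 1, time is assumed to be normalized to the envelope's x axis, and a one-element list is assumed to apply to every partial.

```lisp
(defun envelope-value (env x)
  "Linearly interpolate the flat breakpoint list ENV (x0 y0 x1 y1 ...) at X.
Values outside the envelope's x range are clamped to its end points."
  (if (<= x (first env))
      (second env)
      (loop for (x0 y0 x1 y1) on env by #'cddr
            while x1                       ; stop at the last breakpoint
            when (<= x x1)
              return (+ y0 (* (- y1 y0) (/ (- x x0) (- x1 x0))))
            finally (return (car (last env))))))

(defun partial-parameter (param i time)
  "Value of transformation parameter PARAM for partial I at TIME."
  (flet ((value-of (item)
           ;; An envelope is a list; a constant is a number.
           (if (listp item) (envelope-value item time) item)))
    (cond ((numberp param) param)                         ; format 1
          ((null (rest param)) (value-of (first param)))  ; format 3 (assumed global)
          (t (value-of (nth i param))))))                 ; formats 2, 4, 5
```

For instance, `(partial-parameter '((0 1.0 1 2.0) (0 1.0 1 0.5)) 1 1.0)` selects the second envelope for partial 1 and evaluates it at its end point.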
4.2 Time Stretching

stretch-sound sound stretch &key name

This function performs time stretching over the partials of a sound:

stretch: time stretching factor (being of any of the formats described above). In ATS each partial can be stretched by a different constant or dynamic factor. This can produce spectral structures where vertical relationships between partials are completely altered (diachronic transformation). During synthesis, parameters are interpolated between windows according to this (altered) time information.

4.3 Amplitude Gate

gate-sound sound &key limit scaler predicate name

This function performs a selective filtering of the partials of a sound. The user can define an amplitude threshold in dB and a selection predicate. This predicate compares the mean signal-to-mask ratio (SMR) of the partials with the specified threshold. For example, if the predicate is > only partials with mean SMR greater than the threshold are transformed. Relevant parameters are:

limit: SMR threshold in dB.

scaler: amplitude scaler. The amplitudes of the selected partials are multiplied by this value. This parameter can be a number (constant scaling) or an envelope (dynamic scaling).

predicate: numeric Lisp predicate used for the partial selection. The predicate takes two parameters: a threshold in dB and the average SMR of the tested partial.

5 Spectral User Interface (SUI)

Besides the described transformation functions, ATS provides a Spectral User Interface (SUI) for real-time spectral transformations. Using real-time CLM capabilities [8], the SUI provides the user with a set of graphic sliders that control transformation parameters during resynthesis in real time.

Figure 2: Spectral User Interface real-time controls

In its present version the SUI has the following controls:

Amplitude: amplitude scaling of the spectral components.

Transposition: frequency transposition. All partials of the sound are transposed in frequency as in a variable-speed tape recorder, without changing the length of the source.

Shift: frequency shifting. This procedure adds a fixed amount of frequency to all the partials of the sound, creating a spectral translation or compression. If we consider a harmonic spectrum generated by the formula y = a x, where y is the frequency of the partial, x its rank, and a the frequency value of the fundamental, the spectral shift can be expressed as y = a x + b, where b is the shift factor. The user controls the amount of shift in terms of a percentage of the fundamental frequency of the sound (the default range goes from 0% to 100%).

Distortion: this transformation considers that the source has a harmonic structure (linear spectrum) and lets the user exponentially distort it. Spectral distortion can be expressed as y = a x^b, where y is the frequency of the transformed partial, x its rank, a the frequency value of the fundamental, and b the distortion factor. If the value of b is 1.0 we obtain a harmonic structure; if we increase its value we get a nonlinear frequency structure that is perceived as inharmonic.

Proportional Time: this slider acts as a time-frame scrubber. The user can move across the frames of the spectral structure during synthesis or even freeze the synthesis at a given frame. Using the play toggle button the SUI can be set into scrubbing mode or into a loop synthesis mode.
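The Shift and Distortion formulas can be checked numerically with a few lines of Lisp. The function names below are illustrative only and are not part of the SUI; the sketch simply evaluates y = a x + b and y = a x^b for a given fundamental a and partial rank x, with the shift amount derived from a percentage of the fundamental as described for the Shift control.

```lisp
(defun shift-from-percent (a percent)
  "Shift factor b in Hz, given fundamental A (Hz) and a percentage of it."
  (* a (/ percent 100)))

(defun shifted-frequency (a x b)
  "Spectral shift: y = a x + b."
  (+ (* a x) b))

(defun distorted-frequency (a x b)
  "Spectral distortion: y = a x^b."
  (* a (expt x b)))

;; First four partials of a 100 Hz harmonic spectrum, shifted by 50%
;; of the fundamental: the partials are no longer integer multiples
;; of the lowest one.
(defparameter *shifted*
  (loop for rank from 1 to 4
        collect (shifted-frequency 100.0 rank (shift-from-percent 100.0 50))))

;; The same partials with distortion factor b = 2 (b = 1 would leave
;; the spectrum harmonic).
(defparameter *distorted*
  (loop for rank from 1 to 4
        collect (distorted-frequency 100.0 rank 2)))
```

With a 100 Hz fundamental, the shifted partials fall at 150, 250, 350, and 450 Hz, while a distortion factor of 2 places them at 100, 400, 900, and 1600 Hz.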
6 Synthesis

The synthesis engine of ATS is implemented using the CLM (Common Lisp Music) synthesis and sound processing language [7][8]. ATS has many target synthesis algorithms; the most important ones are:

1. Additive Synthesis: real-time additive synthesis can be performed with the Spectral User Interface (SUI) or the ifft-synth function, both using an IFFT overlap-add algorithm. ATS also has an oscillator-bank additive synthesizer (osc-synth) that allows the use of phase information during resynthesis.

2. Subtractive Synthesis: two subtractive algorithms are available, using spectral information to control a filter bank. The function t-synth implements the filter bank using an IFFT overlap-add approach, and fmt-synth uses an array of robust band-pass filters (resonators). Both functions can filter white noise or a sound file specified by the user.

3. Granular Synthesis: spectral information can be used as a control grid for granular synthesis. In these terms a sound becomes a time-frequency grid that specifies the evolution of sound grains over time. The function grn-synth lets the user design spectral grids to be used in grain generation. The density, shape and duration of the grains can also be controlled dynamically.

4. Hybrid: as for transformations, synthesis functions can be combined using macros to create hybrid synthesis methods. For instance, granular synthesis can be combined with subtractive synthesis to perform some kind of granular filtering on a sound file, using spectral data to control the filters.

7 Conclusions

ATS is an on-going research and development project; this paper discussed its current state. Written in a high level language like Common Lisp, ATS allows for the symbolic processing of spectral traits. Analysis data is stored as a Lisp abstraction called sound, a symbolic Lisp object representing a spectral model that can be sculpted using a wide variety of transformation functions. ATS sounds can be synthesized using different target algorithms written in CLM. A Spectral User Interface offers graphic-oriented spectral transformation capabilities during real-time resynthesis. ATS provides an environment for sound design and composition that allows the user to explore the possibilities of spectral modeling in a very flexible way.

References

[1] Steele, G. 1990. Common LISP. Digital Press.

[2] Graham, P. 1994. On Lisp: Advanced techniques for Common Lisp. Prentice Hall.

[3] McAulay, R. and T. Quatieri. 1986. Speech Analysis/Synthesis based on a Sinusoidal Representation, IEEE Transactions on Acoustics, Speech and Signal Processing, 34(4):744-754.

[4] Serra, X. 1989. A system for Sound Analysis/Transformation/Synthesis based on a Deterministic plus Stochastic decomposition. Ph.D. Dissertation, Stanford University.

[5] Garcia, G. and J. Pampin. 1999. Data Compression of Sinusoidal Modeling Parameters Based on Psychoacoustic Masking, Proceedings of ICMC 99.

[6] Pampin, J. 1995. Systhème d'Analyse-Traitement-Synthèse. Master Thesis, CNSM Lyon.

[7] Schottstaedt, W. The CLM Manual. http://www-ccrma.stanford.edu/CCRMA/Software/clm/clm-manual/clm.html

[8] F. Lopez-Lezcano, and J. Pampin. 1999. Common Lisp Music Update Report, Proceedings of ICMC 99.