This document was uploaded by user and they confirmed that they have the permission to share
it. If you are author or own the copyright of this book, please report to us by using this DMCA
report form. Report DMCA
Overview
Download & View 2009_book_therubatocomposermusicsoftware.pdf as PDF for free.
Series Editors Guerino B. Mazzola Moreno Andreatta
Gérard Milmeister
The Rubato Composer Music Software Component-Based Implementation of a Functorial Concept Architecture
Dr. Gérard Milmeister Langackerstrasse 49 8057 Zürich Switzerland [email protected] Contributors: Prof. Dr. Guerino B. Mazzola 164 Ferguson Hall Minneapolis MN 55455 USA [email protected]
DOI 10.1007/978-3-642-00148-2 Library of Congress Control Number: 2009921145 c 2009 Springer-Verlag Berlin Heidelberg This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Printed on acid-free paper 9 8 7 6 5 4 3 2 1 springer.com
Foreword
Gérard Milmeister’s thesis, which is now published in Springer’s innovative series Computational Music Science, is a key text for accessing the present version of the R UBATO software. It is also the beautiful trace of a conceptual and technological completion of a development that was initiated in 1992, when Oliver Zahorka and I conceived, designed and implemented the R UBATO sofware for musical performance. This first implementation on NeXT computers, written in Objective C, was driven by the idea to implement Theodor W. Adorno’s theory of an analytical performance, i.e., a performance that is defined upon musical analysis of the given score. The original architecture of R UBATO was therefore modular and split into modules for analysis (rhythmical, melody, and harmonic) and modules for performance. These modules, coined rubettes, were only conceived as organic parts of an overall anatomy. However, the successive developments and also research driven investigations, such as Anja Fleischer’s work1 on rhythmical analysis or Chantal Buteau’s work2 on motivic analysis showed that there is also a legitimate interest in rubettes without being necessarily parts of a given fixed anatomy. Successive work by Stefan Göller3 on geometric concept spaces associated with R UBATO data formats, and Stefan Müller4 on performance and gesture rubettes proved that there is a different approach to R UBATO, which by far transcends the original “hardcoded” anatomy. 1 Fleischer, Anja. Die analytische Interpretation: Schritte zur Erschließung eines Forschungsfeldes am Beispiel der Metrik. Dissertation, Berlin 2003 2 Buteau, Chantal. A Topological Model of Motivic Structure and Analysis of Music: Theory and Operationalization. Dissertation, Zürich 2003 3 Göller, Stefan. Object Oriented Rendering of Complex Abstract Data. Dissertation, Zürich 2004 4 Müller, Stefan. Pianist’s Hands—Synthesis of Musical Gestures. Dissertation, Zürich 2004
v
vi
Foreword
The new requirements had to face different conceptual, soft-, and hardware conditions and challenges. To begin with, the NeXT computer had finished to exist, and the platform-dependent strategies, such as the original Objective C implementation had become obsolete by the now standard Java virtual platform environment. The other point of change was that the rubettes had to become a modular construct that would be of any size and would also be of open usage without much predefined larger anatomical preconditions. It turned out that the decisive requirements were defined by component-driven programming. However, this generic setup also entailed a radical redesign of the data format of denotators, which was invented in the early collaboration with Zahorka. The redesign was however not only affected by the component-driven data architecture, but by a meanwhile dramatic urge to step from naive (zero-addressed) denotators to functorial denotators, i.e., to quit the old-fashioned concept of a point and to introduce functorial points, i.e., morphisms defined on variable address modules and with values in not necessarily representable space functors. All these delicate requirements set up an agenda that could not be realized except by a computer scientist with excellent programming and really solid mathematical competence. Gérard Milmeister was the ideal researcher to bring such a difficult enterprise to its completion. His award-winning doctoral dissertation, which is now in your hands, is the written counterpart of his remarkable programming work, available as a GPL software on http://www.rubato.org. The thesis is not only a clear and concise introduction to the conceptual and mathematical architecture of the R UBATO enterprise, but offers also precise and concrete tutorials for the programming of rubettes and their networks. The success of Milmeister’s work is, last, but not least, documented by contributions from Florian Thalmann and myself, which prove that the R UBATO software may now reach out to compositional tasks that were postponed since the early developments of a geometric composition software presto in the late 80s. Thalmann’s BigBang rubette is the long-awaited extension of R UBATO to gestural strategies in composition, and it is the proof that Milmeister’s work is not only the completion of a long march through the hard- and software meanders and conceptual revolutions, but also is setting a pointer to creative future music software perspectives. Minnesota, October 2008
Prof. Dr. Guerino Mazzola
Preface to the Springer Edition
For this edition published by Springer, I am happy to be able to include as chapter 17 and chapter 18 two contributions by Guerino Mazzola and Florian Thalmann. The first is the description of a sophisticated rubette that provides an extensive gestural interface to manipulate musical structures. The second contribution is the first major application of R UBATO C OM POSER in music theory and computational composition. It resulted in a remarkable piece of music starting from the idea of “analyse créatrice” forwarded by Pierre Boulez. The whole process involves many of the features presented in this book, and, thus, is something of a “proof by construction” of the usefulness of these concepts. I therefore thank both for their energy and ingenuity in putting the R UBATO C OMPOSER system to test and exercising its capabilities. Zurich, December 2008
Gérard Milmeister
vii
Preface
Trained as a computer scientist, I have always been interested in music and musicology. Therefore I took the opportunity offered by PD Dr. Guerino Mazzola to work with his MusicMedia group, then a part of the MultiMedia Laboratory at the Department of Informatics of the University of Zurich, directed by Prof. Dr. Peter Stucki. Thus, first and foremost, thanks go to Guerino Mazzola, who introduced me to mathematical musical theory, most of which I had never heard of before. He gave me the conviction of working at the forefront of music research and taught me the use of modern mathematical methods, such as category theory. He supervised my work with all the enthusiasm and competence one could wish for. I would also like to thank the reviewers Prof. Dr. Bernd Enders of the University of Osnabrück and Prof. Dr. Renato Pajarola of the University of Zurich, who suggested improvements to this thesis. The thesis could not have been accomplished without the backing by Prof. Dr. Pfeifer, whom I like to thank for his willingness to support it as the responsible member of the Faculty of Mathematics and Natural Sciences. Finally, I have to thank the staff of the Department of Informatics for helping me with the tedious administrative tasks that a doctoral student and assistant has to manage. Zurich, November 2006
Gérard Milmeister
ix
Introduction
It is significant that the art and theory of music production and performance have always been connected to the newest developments of the natural and engineering sciences of the time. Indeed, a sophisticated theory of sound and music has been an important part of the earliest mathematics in ancient Greece. The theory of musical intervals set forth by Pythagoras is still a matter of discussion among psychologists of music and theorists alike. On the other hand, since the appearance of digital computers, and the development of computer science as a mathematical and engineering discipline in the late 1940s and early 1950s, music has been among the first applications to appear besides numerical computations on the newly invented machines. At lot has happened since, and there have always been researchers in mathematics who have been trying to apply the newest trends in their disciplines to the explanation of the principles of music, with various degrees of success. Whatever the outcome of each of these developments, the outlook of music as a whole has been changed forever to the open-minded observer. Unfortunately, it is still the case that the intersection of the set of mathematical music researchers and the set of musicologists or music theorists is vanishingly small compared to the combined number of people active in the field. To further the penetration of mathematical music theory into the realm of music production, it is vital to offer a computer-based tool to contemporary composers and music theorists that provides the most advanced ideas from mathematics applied to music. Category theory is the field of mathematics that has crept into almost every mathematical domain and reformulated most basic tenets in a modern language. Computer science is another discipline that has benefited enormously from the exposure to category theory, as has mathematical music theory. It is certainly not exaggerated to assert that the colossal volume The Topos of Music by Guerino Mazzola has brought music theory to a new
xi
xii
Introduction
level by infusing it with advanced and recent ideas from category and topos theory, whence the name. The following work takes the fundamental ideas expounded in that book and describes the design and implementation of a software system, called R UBATO C OMPOSER, that provides the tools to work creatively and analytically with those principles. The software system is both an application development framework, targeted at programmers proficient in mathematics or music theory, or both, and a graphical user interface intended for the composer or the theorist who wants to make use of components created by developers to build an application that embodies his or her musical ideas. The first part presents the concepts and theory. There is neither place nor need for a complete exposition of the theory developed in The Topos of Music. Therefore only those parts that have found their way into the implementation are discussed. The second part is a thorough account of the implementation of the R U BATO C OMPOSER software system. Here the high-level organization as well as some details are covered. This chapter is also of importance to the developer who wants to extend and build on the R UBATO framework. The third part is about the practical aspects of using R UBATO C OMPOSER. A tutorial describes a typical use through a step-by-step tour illustrated with screenshots of the running program. Several uses of the framework by external projects are introduced that exemplify the application developer aspect. The fourth and last part acts as an appendix. The greatest part is taken up by the R UBATO C OMPOSER user’s manual. This manual is intended as a stand-alone reference and also features many details that are not essential to understanding and therefore are not included in the treatment in the main text.
The title of this chapter states Music Theories in the plural and not the singular Music Theory or Theory of Music. Probably no single theory will ever cover the enormous richness of music in the world, although there have been promising endeavors to extract certain principles common to most. This is a boon and a bane at the same time: a bane, because all attempts at reducing music to a single set of rules have failed so far to satisfy adherents of universally valid systems; a boon, since it relieves music lovers from the potential danger of the dreaded reduction of music to a mere system of rules, devoid of all mysticism cherished by many. Even putting aside all aspects of the subjective, this state of affairs promises new vistas for future musical activity, since the reservoir of new and modified theories for generating new music seems inexhaustible. The role of theory in the analysis as well as the production of music has however not been the same at all times in the history of music. To better understand how theories and what theories fit into contemporary music, it is necessary to shed some light on the last few hundred years during which music theories have been perceived as such by musicians. It is necessary to restrict the discussion to theories about Western music. Of course theories of music from other cultural environments such as India exist and its literature is very broad indeed, but including it would by far break the tight limits set by this text. Certainly the most famous theory linking music and natural science known from Antiquity is the description of music intervals using of chord ratios by Pythagoras. However from the post-Antiquity treatise by Boethius through the Middle Ages until 19th century, music theories were rather enshrouded in the philosophical (or theological) frameworks of the day. Those parts relevant to the analysis and composition of music did not much more than describe the state of the art as it was a decade before the publication of the treatises.
3
4
1 Overview of Music Theories
Thus the Gradus ad Parnassum by Johann Joseph Fux published in 1725 [25] laid out the rules of counterpoint, even if those rules did not accurately describe the practice of most advanced contemporary composers, such as Johann Sebastian Bach. This work, well used even into the present, also illustrates features common to similar theoretical treatises: they delivered a set of recipes justified by empirical or psychological phenomena, thus providing weak theories, even in comparison to the studies pursued by Greek mathematicians. The schema behind the history of these theories can be summarized in the phrase: Theory follows Composition. There is hardly any theoretical proposal that would provide the composer with new ways and guidelines to drive his work. Therefore, it is mostly the case that “avant-garde” composers, such as Beethoven in his late string quartets, produce the material that musicologists try to put into theory after the event. The beginning of the 20th century saw new developments of music theorists trying to give the various aspects of music, such as harmonic and formal structure, a more exact and scientific character. This is certainly connected to the rise of abstract mathematics in the 19th century, which relies on new rigorous types of notation, as well as the use of diagrammatic methods. Such theories include the Funktionstheorie by Hugo Riemann and the structural theories by Heinrich Schenker. During the second half of the 20th century, finally, modern mathematics found its way into theoretical considerations of music [5]. Combinatorics is seen as playing a major role throughout the history of musical composition. Dodecaphonic composition and its generalization to serial techniques provide the telling example of this trend. Driven by the popularity of structuralist views, generative theories as they were first proposed by Noam Chomsky for the formalization of language have been applied to musical form as well [38]. This has been a breakthrough in that, for the first time, theories have been devised not least with a view to provide new means for composition. Composers start complementing their musical works by theoretical studies explicating the methods that led to their construction, for example Musiques formelles (1962) by the Greek composer Iannis Xenakis [74]. Several other eminent composers, such as Arnold Schoenberg, Ferruccio Busoni or Paul Hindemith, have published theoretical works that give insight into their works. It has eventually become possible to state that: Composition follows Theory. In the last decades, the role of mathematics in music has been on a steady increase. The invention of digital computers in the late 1940s was soon followed by the creation of music with the help of computers (for example the famous Illiac Suite from 1957 by Hiller and Isaacson [31]) and along with it the development of mathematical theories quite specific to musical applications. It was Milton Babbitt, Hillers teacher of composition at
1 Overview of Music Theories
5
Princeton University, who promoted the use of mathematics, its terminology and, thus, scientific precision in studying theories of music. A music theory should be objective and be stated as a body of definitions, axioms, and theorems, just as in the case of other mathematical investigations. An analogy can be made to the relation of physics to engineering: mathematical music theories play the role of physics, and musical composition the role of engineering. It is obvious that without the theoretical work provided by physics, engineering would be impossible, or, even worse, engineering with a disregard of physics would result in bad work. Cautiously applying the analogy, we would say that contemporary composition without regard to mathematics results in bad music. . . . Two types of composition by electronic means have been predominant. One type is exemplified by the computer programs Project 1 (1964) and Project 2 (1966) developed by the composer Gottfried Michael Koenig. They implement composition algorithms and operate on the level of musical structure. On the mathematical side, aleatoric and combinatorial methods play an important role [36]. Another line of development involving the use of computers for musical composition concerns the production of sound by means of digital processing. In this case music is controlled on a very low level, effectively down to the sound wave. Many parameters, which before were constrained by acoustic instruments, become available. These parameters allow qualities of sound never heard before and thus fruitful for new principles of composition. The pioneering work has flown into the MUSIC-N family of synthesis languages, the prototype program MUSIC having been written by Max Mathews in 1957 at Bell Labs [42]. The most recent descendant in the family, widely used today on many computer platforms, is Csound [10]. The import from mathematics to musical theories comes from all domains of abstract algebra, such as group theory. More recently, category theory has provided comprehensive tools for both analysis and composition. Mathematical studies have been conducted for fields such as counterpoint or the theory of the string quartet. A main event in this direction has been the publication of The Topos of Music by Guerino Mazzola [47], which uses extensive knowledge from category theory to provide theoretical treatments of a broad selection of branches in musical theory. The mathematical approach to music analysis and composition also allows new means for joining the musical past to the future. In accordance with the idea of “analyse créatrice (creative analysis)” put forward by Pierre Boulez [49], Guerino Mazzola undertook the ambitious project of a new composition based on the analysis of the first movement of Beethoven’s sonata op. 106 “Hammerklavier” [43]. The process involved the discovery of mathematical principles buried in the piano sonata and the extraction of significant parameters (“analysis”). The result is an analytical model. After changing the analytical coordinates of this model using mathematical
6
1 Overview of Music Theories
transformations, the model was remapped to produce a new musical work that fundamentally based on, but different from, Beethoven’s composition (“creative”). The process itself is not entirely new, as similar principles have been employed in a vague manner throughout the history of music. But the mathematical approach, the exactness of the procedure, and the awareness of its happening are aspects of a modern phenomenon and open new vistas for future experimentation. The result of the project is the piano sonata “L’Essence du Bleu” which has also been recorded [48]. The sonata has been constructed completely by hand, using the traditional tools available to the composer and mathematician, such as pencil and graph paper. To manage this kind of composition, computer software implementations of the methods used would provide an ideal tool. The development of such software is of paramount importance for future experimentation in this direction. The R UBATO C OMPOSER presented in this thesis is the result of this insight. It is the attempt to bring the mathematical and computational tools to a level that a composer is comfortable with. The mathematics it is based on is category theory, specifically the category of modules. Module theory is very important in mathematical music theory, since it allows the representation of the common objects in music, such as notes and chords, and the description of the usual transformations in the theory of counterpoint and composition, such as transposition, inversion and retrograde, among others. The theory of modules contains the complete theory of linear algebra, and thus Euclidean spaces, which ensures the availability of general geometric manipulations at the hands of the composer. Categorical considerations led to the development of a very general data model, called denotators, which is used throughout R UBATO C OM POSER for the transport of concepts and information.
Chapter 2
The Representation of Music
Theories, whether in mathematics or in music, presuppose a world of objects that we can be theorize upon. In mathematics the notion of objects is quite clear, from obvious number domains to high dimensional manifolds. Musical objects, if we may call them thus, are much more elusive. Identifying the musical objects would be nothing less than determining what music ultimately is. So far, no one really knows what happens in the brain when listening to music, what symbolic structures are generated and and how they are processed. The psychology of music does make some inroads, but without effective results useful to the theorizing we have in mind. We are left with the only possibility of grasping musical objects, namely proposing more or less formal schemes for the representation of music, preferably in a mathematical context. Such a representation should allow the manipulation required by the various methods of musical composition as well as analysis. Certainly, no one format will be able to provide all of this, and, in effect, the formats developed during the centuries of written music targeted overlapping, but different, uses.
2.1 Types of Representation To bring a little order into the various types of representation, it is helpful to borrow from the ontology and semiotics of music. The cube shown in figure 2.1 is an abstract model of locating music within our knowledge system. It considers multiple aspects, which may or may not enter in a specific type of representation. The levels of reality discriminate between the layers where music takes place. The physical and mental levels will be of primary concern in this context. The communication coordinate of the cube refines these layers and relates the musical work to its creator and the recipient.
7
8
2 The Representation of Music
This coordinate is usually not made explicit in the representations that we discuss below. The third coordinate semiosis deals with the semiotics of music. It provides the theory of how expressions (significants) are related to the meaning (significate) of a musical work. The act of relating significants to significates is called signification. The process of signification is the attempt of extracting the content from a representation. An in-depth discussion of musical ontology can be found in [47].
Se io
te ca ifi n tio ign S ca i f ni g nt i S ca ifi gn sis Si
Fig. 2.1: Topographic cube of musical ontology, after [47].
The semiotic status of the various types of music representation is very difficult to assess, since we have to deal with two types of information. On the one hand, there is explicit information (or extension). It is the kind of information that can be controlled by rules, many of which have even been cast into formal clothes, for example counterpoint rules according to Fux [25] or harmonic theories such as Schoenberg’s [63]. On the other hand, there is implicit knowledge (or intension). This knowledge cannot be formalized. It depends on environment, personal preferences, historical background, etc.
2.2 Symbolic Representation of Music
9
[2]. The consequence of this is that, whatever representation we use, the act of signification, and therefore the relation of expressions to meanings, varies from individual to individual, thus making impossible the “ultimate”, completely unambiguous representation. It has already been hinted that the levels of physical and mental realities play an important role in the representation of music. A physical representation aims to reproduce music in terms of sound, often as low-level as air pressure changes as in the physical theory of sound. The theories of sound production as in the acoustics of instruments also belongs to this class. Common formats of representation include analog tape recordings or digital audio file formats such as MP3. It is the quality of sound rather than of music that is transported in this kind of representation. It has been the means of old to preserve musical performances and to archive oral, nonwritten music common in ethnomusicologist studies. More recently, audio has been supplemented by video, thus capturing an important part of performance, namely the gestures involved in making music. However important the interest in the other types of representation is, we are going to concentrate on symbolic representations, which try to capture the mental aspects.
2.2 Symbolic Representation of Music The printed score is the most famous of all representations. Developed from a gestural language, the neumatic notation used in manuscripts of early church music (see figure 2.2 for an example) [19], it was extended and refined over ten centuries to the point of integrating a wealth of information for the musical performer. In its original form it reached its peak during the first decade of the 20th century, when composers such as Gustav Mahler determined every aspect of the performance of their works using extremely fine-grained instructions affecting even single notes. Some structure is indicated and beginning in the Romantic era, content is hinted at by often poetic directions (for example, très doux et pur in figure 2.3 showing the beginning of Scriabin’s 10th Sonata). Other printed representations targeted at specific instruments and types of music have been devised, such as the various tablatures for lute and keyboard music, very common from the 15th to the 18th centuries [23]. Along with common notation, many composers from the second half of 20th century devised notations of their own to communicate the intent of their works or even specific to a single work [17]. In many cases these graphic “scores” have been considered as works of art in themselves.
10
2 The Representation of Music
Fig. 2.2: Neumatic signs describing the melodic motion above the lines of text (St. Gall, 10th century).
Moderato
169 p 9 16 très doux et pur
Fig. 2.3: The first four bars of Alexander Scriabin’s Sonata No. 10, Op. 70, in traditional notation.
2.2.1 Electronic Scores Early in the development of computers, handling of music has already been a hot topic. Thus it comes as no surprise that many efforts have been made to bring musical scores into electronic form. One aim has been, of course, to follow the way of digital book production and make the production of musical scores entirely digital. A whole industry is devoted to the development of so-called music notation software, and with it, the invention of electronic score representations. Two examples of quite different approaches shall illustrate this idea. The popularity of XML has had its effect on music representation too, and we now have a large number of XML based markup languages for music, such as SMDL (Standard Music Description Language) [65], NIFF XML (Notation Interchange File Format), and many more. One particular scheme is MusicXML [59], which has become the lingua franca for the interchange of electronic scores between many notation pro-
2.2 Symbolic Representation of Music
11
grams, e.g., Finale and Sibelius. As is often the case with XML, even a small musical fragment generates a rather large instance of MusicXML. Therefore only part of the music from figure 2.3 as coded in MusicXML is shown in figure 2.4.
Fig. 2.4: MusicXML representation of the fragment figure 2.3.
12
2 The Representation of Music
These XML based formats are mainly intended for interchange and are not meant to allow authoring on the source itself. In contrast, there is a school of typography that favors textual representation of typesetting instructions that are compiled into high quality renderings. The main representative is Donald Knuth’s TEX, which allows for score typesetting using the add-on package MusixTEX. Modern descendants are ABC [70] and the powerful score typesetting software Lilypond [28]. The score fragment in figure 2.3 is effectively the result of compiling the code in figure 2.5. Observe how Lilypond tries to find out a good layout without the author providing precise instructions. The implementation of such layout software requires advanced research and techniques in computer science [27] and the construction of a fully functional and complete software application is a major undertaking. \score { \new PianoStaff << \time 9/16 \new Staff { \clef treble { r8^\markup{\bigger\bold{Moderato}}\p\< d’’16^\markup{\italic{tres doux et pur}}\( bes’4.\! ~ bes’8[ beses’16] ges’4.\) << { r8. r8 c’’’16-.( 8\arpeggio)-.--\laissezVibrer r16 } \\ { bes’16\>\([ c’’ beses’] as’4.\! ~ as’4. ~ as’8.\) } >> } } \new Staff { \clef bass { ges’4.( ges’8.) es’4. ~ es’8. ~ es’4. ~ es’8. ~ es’4. ~ es’8. } } >> } Fig. 2.5: The Lilypond source code used to generate figure 2.3.
2.2 Symbolic Representation of Music
13
All of these types of representation, different as they may be, serve the main purpose of the visual layout of traditional printed scores. Machine-based manipulation and feature extraction for analysis would be hard, if not impossible. This is to be expected, since they adopt all of the idiosyncrasies of human music notation accrued over the centuries. Some attempt to address the strictly musical aspect and allow the export of a MIDI version of their data (Lilypond for example), but for a more deeply musical understanding, other representations are needed.
2.2.2 MIDI The Musical Instrument Digital Interface (MIDI) was first proposed in 1981 as a means for driving electronic instruments, such as synthesizers. The standard [55] comprises three parts: 1. a specification for a serial cable and plug that establishes the physical connection of electronic devices; 2. a protocol that defines the possible events that are sent over the cable in real-time and are to be interpreted by the receiving instrument (where each instrument is assigned a channel number); 3. a file format that integrates such events, provides them with the times of their occurrence, and is enriched with so-called meta events for changing tempi and selecting programs (a means for choosing among several sound types provided by a device). The most important type of event is Note on/off. The Note on event signifies the pressing of a key. This event requires as parameters the channel (which is the address of the receiving device), the pitch (where the number 60 is defined to be c4 ), and the velocity, which essentially corresponds to the loudness. The Note off event analogously signifies the release of a key. Figure 2.6 lists the MIDI events generated from a simple rendering of the fragment in figure 2.3. MIDI does show its age though, in particular in its hardware limitations. Its affinity to keyboard instruments makes it cumbersome to control musical events that are not based on semitone scales, the workaround being to make use of meta events such as pitch bending. Nevertheless, MIDI can be put to good use when it comes to create fragments of musical performance, since the file format is probably the most common of all among music software and therefore is a sure value for interchanging data. We will come back later to this use when discussing the R UBATO C OMPOSER software.
Tempo Program Note on Note on Note off Note on Note off Note on Note off Note on Note off Note on Note off Note on Note off Note on Note off Note on Note off Note on Note on Note off Note on Note on Note on Note off Note off Note off Note off Note off
Fig. 2.6: A sequence of MIDI-events. In this example the unit of time is defined to be 1/480 of a MIDI quarter note. The Tempo event specifies the duration of a MIDI quarter note as MSPQ, which means milliseconds per quarter note.
2.2.3 Musical Representation Languages Musical notation and instrument control languages are ultimately surface representation and not conducive to analytical research. They require a musical mind to make sense of the analytical signification looming behind the music they describe. On the other end of the spectrum are the languages explicitly designed for the purpose of unfolding analytical structure. These languages are invented in the context of the development of software for musical composition and analysis. In the world of computer-based manipulation of symbolic data, the programming language LISP assumes a prominent place. Predominantly used
2.2 Symbolic Representation of Music
15
in the research of artificial intelligence, LISP and, in particular, CLOS (Common Lisp Object System) is also the basis of some of the best-known computer-based composition systems. Two particular implementations are Common Music and OpenMusic.
Common Music Common Music or CM [67] is an object-oriented LISP environment for music composition. It represents music as high-level objects, and through programs and algorithms designed by the user transforms them into events that rendered through MIDI. CM is very powerful, but it requires a good knowledge of programming, in particular of LISP, to even get started. It belongs therefore to the realm of programmer-musicians. Figure 2.7 shows an example of a seq object, which is a list of subobjects, in this case a time ordered sequence of eighty MIDI notes with pitches randomly chosen from the specified collection. (new seq :name ’pulse :subobjects (loop with d = .1 for i below 80 collect (new midi :time (* i d) :keynum (pickl ’(c4 d4 ef4 f4 g4 af4 bf4 c5)) :duration d :amplitude (interp (mod i 8) 0 .25 7 .75)))) Fig. 2.7: An example for the definition of a seq object in Common Music.
OpenMusic The more recent OpenMusic software has been developed at the IRCAM in Paris by Gérard Assayag and Carlos Agon [3]. It is also built on top of LISP, but in addition features a graphical user interface (figure 2.8) that separates the composer from the low-level entrails to a certain extent. The interface follows the familiar concept of visual programming by connecting components in networks (patches). OpenMusic has already met with considerable success, as testified by the contributors to [4]. OpenMusic exports the concepts of object-orientation to the user oriented interface. Classes can be defined and instances of classes can be created in a way that is inherited from the underlying CLOS. There are classes
16
2 The Representation of Music
Fig. 2.8: A “patch” in OpenMusic.
for notes, chords and sequences, which are combined and manipulated using the methods provided by the system or programmed by the composer. Thus, while offering appealing visual interaction, OpenMusic requires programming skills do unlock its considerable power.
Humdrum David Huron’s Humdrum Toolkit [32] takes a quite different path. Whereas OpenMusic is primarily a visual environment for composing, Humdrum targets musical analysis and takes advantage of all the features that a UNIX environment offers. This means, first of all, that Humdrum is commandline-based. In fact, the toolkit consists of a number of shell scripts that operate on text-based representations of music. Therefore the standard UNIX text utilities, such as grep, sed, awk, or sort, can be put to good use.
2.2 Symbolic Representation of Music
17
Humdrum does not rely on one unique format of representation, but allows the use of several different formats, each tailored to one specific use. One such format is the so-called **kern representation. This correspond roughly to common practice music notation, with pitches encoded as equally-tempered values at concert pitch (figure 2.9). **kern *staff3 *clefF4 *k[] *M9/16 =14.g-\ . . 8.g-\ =2 [4.e-\ . . 8.e-\ =3 4.e-\ . . . . 8.e-\ . =4 4.e-\ 8.e-]\ *-
Fig. 2.9: The first four bars of Sonata No. 10 in **kern notation. The score is subdivided into bars, for example =3 starts the third bar. A single note is specified using its duration and its pitch, for example 4.b- denotes a dotted quarter b flat.
Other representations include **pc for pitch classes instead of absolute pitches, or **harm which encodes Western functional harmony. Many commands have the sole purpose of converting between different representations, others extract information such as the most probable key of the musical fragment or possible patterns that reoccur periodically. The command key applied to our example gives the following result: Estimated key: E-flat minor
(r=0.8370)
confidence:
68.3%
All in all, Humdrum provides some unique features, but expects the user to be familiar with the UNIX way of doing things in general and text processing in particular.
18
2 The Representation of Music
2.2.4 Language of General Concepts The foregoing sections presented a variety of representations, from the very specific, such as notation and MIDI format, to the more general, such as OpenMusic’s classes and objects. It is certainly preferable to design a representation format that allows for the most general variations. The idea is to use a language of general concepts, based on hierarchical abstract structures. This is a generic approach, but it is strongly influenced by structuralist thought. In music, Schenkerian analysis is an example of such an approach. More recently, the principles of Chomsky’s notion of human language grammar have been transferred to music under the title of generative theory of tonal music (GTTM) by the composer Fred Lerdahl and the linguist Ray Jackendoff [38]. This theory has found great favor with a lot of music theorists. Unfortunately, this popularity has masked grave limitations of GTTM. Its rather dubious assumptions, its reliance on unequivocal hierarchies, and its normative statements do not adequately reflect the realities of music [46]. It is at least necessary to consider multiple hierarchies in order to respect the inherent ambiguities present in most types of music. These ambiguities must not be resolved, otherwise the essential richness of music responsible for its intellectual and emotional power is irretrievably lost. In addition to strict hierarchies, it is important to allow circular structures. This permits arbitrarily deep hierarchies, high complexity on every level, and even fractal-like self-similarity relations. In the next chapters, an architecture of concepts is presented and embedded in a complete mathematical theory.
Chapter 3
Architecture of Concepts I: Principles
An architecture of concepts deals with question: “How is a concept built?” To answer the question, the architecture provides the means of construction to build a concept from ground up. We are going to present two version of such an architecture, the first one is pure and works without primitive (or undefined) concepts. The second one supplements the pure architecture with atomic concepts. The pure architecture is self-sufficient with minimal means and is of such generality as to provide the foundations for virtually all of mathematics. As a matter of fact, it is required to even understand what the most basic of all mathematics, the set, is. This approach has been chosen in [52] to build a comprehensive environment of mathematics suitable for computer scientists.
3.1 Pure Architecture The pure architecture of concepts builds on the following principles: 1. A concept has a name. This is essentially different from traditional mathematics, where the building blocks are generally nameless. A mathematical space, such as the Euclidean plane R2 is defined by its construction, but there is no way to distinguish two spaces of exactly the same form, but with different intent. This is in fact a manifestation of the dichotomy of conceptual extension and intension. The mathematical space solely considers extension and disregards intension. There is a difference between a Euclidean space describing time versus position and one describing frequency versus amplitude, although the extension in both cases is R2 . In the language of concepts the first space might be given the name “Route” and the second space the name “Spectrum”. There can be no automatic conversion between these spaces. After all, what would be the
19
20
3 Architecture of Concepts I: Principles
interpretation of a point (x, y) of “Spectrum” as a point in “Route”? Any transfer from the one to the other must be explicitly defined. Thus, the name fixes a concept to a particular intent and discriminates it from other concepts, whatever explicit formal equality there may otherwise be. 2. A concept is built from other concepts, called its components. This is the basic principle of construction, without which no structural relationship could exist. 3. There are three methods of how to build concepts from other concepts: • Conceptual Selection: A selection concept has one component. • Conceptual Conjunction: A conjunctive concept has one or more components. • Conceptual Disjunction: A disjunction concept has two or more components. 4. A concept has instances with the following properties: • An instance has a value. • An instance has a name. The role of the name of an instance is analogous to its role for concepts.
3.1.1 Selection Selection provides the method for building collections. As an example consider the concept “SoftIce” which has the single component “Flavor”. The “Flavor” concept has several instances, say “Strawberry”, “Banana”, “Chocolate, “Vanilla”, etc. How exactly “Flavor” is built itself is of no interest here. How is an instance, say, “mySoftIce”, of “SoftIce” construed? Imagine we enter a soft ice cream parlor and order “mySoftIce”. We may point to the boxes containing the various flavors and say “this one” and/or “that one”. This is exactly the way an instance is created: “mySoftIce” is a selection of a number of instances of “Flavor”. An intuitive example from music is the concept of a score. In a naive view, an instance of the concept “Score” is a selection of some instances of the concept “Note”. A special notation for denoting concepts makes it easier to write about them. The notation for a conceptual selection is the following (here we use the musical example): Score:.Power(Note)
3.1 Pure Architecture
21
The term Power for selection introduced here will be explained in chapter 4 which deals with the mathematical theory. The terminology further uses Limit for conjunction and Colimit for disjunction. Parallel to the notation for concepts a notation for instances is introduced. An instance of “Score”, say “SimpleMelody”, made from instances “FirstNote”, “SecondNote”, “LastNote” of the concept “Note” is denoted as follows: SimpleMelody:@Score({FirstNote, SecondNote, LastNote})
or, leaving out the braces: SimpleMelody:@Score(FirstNote, SecondNote, LastNote)
It is important to observe that there is no order implied in the succession of the arguments.
3.1.2 Conjunction A conceptual conjunction determines a concept based on a fixed number of components. Paradigmatic is the concept of Note which is constructed from the concepts of Onset, Pitch, Duration, and Loudness. Here again we assume that the component concepts have already been suitably defined. This example is written as follows: Note:.Limit(Onset, Pitch, Duration, Loudness)
An instance of Note, say myNote, is determined whenever we know its onset and pitch and duration and loudness: myNote:@Note(myOnset, myPitch, myDuration, myLoudness)
3.1.3 Disjunction For a conceptual disjunction a sequence of at least two components needs to be given. An instance of a disjunction is defined whenever exactly one of these components is defined. In music, a note can be generalized to include rests, which are only determined by their onsets and durations: Rest:.Limit(Onset, Duration)
22
3 Architecture of Concepts I: Principles
A general note GeneralNote, then, is the disjunction of Rest and Note, and an instance of GeneralNote, say myGeneralNote, is determined whenever we know either the note or the rest. In this example, myGeneralNote is a rest: myGeneralNote:@GeneralNote(myRest)
These tools are sufficient to build set theory as in [52]. To get things going, the empty set is defined as the instance of the concept Set:.Power(Set) with an empty collection, i.e., emptySet:@Set(). However, the entire construction is long-winded, and it takes a significant effort to even construct the integers. Therefore we assume that we have a collection of primitive concepts available, such as various classes of numbers, strings, etc. This non-pure architecture is the basis of the further development in this work.
3.2 Architecture with Primitives In the previous section some concepts, such as Onset, have been left undefined. Onset is assumed to be the space of continuous time, modeled by the real numbers R. Such a primitive concept is atomic, i.e., it is no longer further analysed in terms of the architecture of concepts. Their properties come from various mathematical theories, such as module or field theory, assumed to be available to us. The notation uses the type Simple to describe such primitive concepts: Onset:.Simple(R)
An instance of Onset is a real number, for example 5.4, where the unit has been arbitrarily set to be the duration of a quarter note: myOnset:@Onset(5.4)
With the addition of simple concepts, we also have the practical tools available to assemble the various representations useful for handling different aspects of music. Beside the textual notation that has been introduced above, a tree-like twodimensional style for visualizing concept hierarchies is helpful for grasping conceptual structures. For each of the four principles, there is a diagrammatic schema:
3.2 Architecture with Primitives
23
Simple A concept of type Simple, such as Name:.Simple(R2 ), is drawn as: Name R2
Limit A concept of type Limit, such as Name:.Limit(Name1 , Name2 , . . . Namen ), is drawn as: Name Q Name1
Name2
···
Namen
The upper case Greek letter π is a reminder for “product” which is the mathematical name for this construction.
Colimit A concept of type Colimit, such as Name:.Colimit(Name1 , Name2 ,. . . Name3 ), is drawn as: Name ` Name1
Name2
···
Namen
The inverted upper case Greek letter π is a reminder for “coproduct” which is the mathematical name for this construction.
24
3 Architecture of Concepts I: Principles
Power A concept of type Power, such as Name:.Power(Name1 ), is drawn as: Name {} Name1 The symbol {} serves as a reminder for notation used for sets in mathematics. With these rules, the simple score concept can be graphically represented as in figure 3.1. Here an alternative, but equivalent, style of drawing the links has been used. Score {} Note Q Onset
Pitch
R
Q
Loudness Duration Z
R
Voice Z
Fig. 3.1: The conceptual representation of a score.
3.3 Examples In the rest of the chapter, some examples are presented to illustrate how the intuitive principles outlined above can be used to organize common concepts from the realm of music.
3.3 Examples
25
3.3.1 Macro Notes The simple score concept shown in figure 3.1 considers a piece of music only as a bag of notes, without any internal structure. Schenkerian analysis, however, imposes hierarchical structures, from the “Urlinie” to phrases and motifs, down to ornaments, such as trills [38]. A simplified realization of this approach considers macro notes, which consist of a traditional note (the anchor note) and an attached bag of satellite macro notes. In this way arbitrarily deep hierarchies can be built. The graphical representation of a macro score consisting of macro notes (called nodes here) is shown in figure 3.2.
MacroScore {} Node Q Note Q Onset
Pitch
R
Q
Loudness Duration Z
R
Voice Z
Fig. 3.2: The conceptual representation of a macro score.
One important feature to notice in this figure is the presence of circularity. This circularity is in fact the guarantee for hierarchies of arbitrary depth. A particular instance of the MacroScore concept is, however, non-circular. The recursion is ended by specifying a node with a note and an empty MacroScore set at the lowest level. A simple example of such an instance is figure 3.3, where one Node has as anchor the first note d5 and and as satellites (MacroScore again) the notes of the trill (which are again nodes, with empty sets of satellite notes). The second Node has as anchor the dotted half e5 and an empty set of satellite notes.
26
3 Architecture of Concepts I: Principles
Fig. 3.3: Example of a macro score.
The subject of macro objects will be taken up again in section 16.1 as a powerful application of R UBATO C OMPOSER.
3.3.2 Frequency Modulation Another example, still from music, but this time related to the generation of sound, is frequency modulation (FM). Developed my John Chowning in 1973 [18], FM is a method for synthesizing sound by chaining sine generators. A single sine generator is of the form: A sin(2π · F · t + ϕ) where A is the amplitude, F the frequency, ϕ the phase shift and t the time parameter. Fourier synthesis consists of taking some such simple generators: X Ai sin(2π · Fi · t + ϕi ). i
The term 2πFt+ϕ is called the support of a particular generator. A first order FM object is created by adding another simple generator to the support: A sin(2π · F · t + ϕ + B sin(2π · G · t)). By iterating this procedure, higher-order FM objects can be built. This way of recursively defining objects suggests a circular concept similar to the one for macro notes from subsection 3.3.1. In fact, it is a further example of a macro object and already shows a common principle. The FMObject concept is illustrated in figure 3.4. Observe the exact correspondence to the MacroScore concept in figure 3.2.
3.3 Examples
27
FMObject {} Node Q Support Q A
F
Ph
R
R
R
Fig. 3.4: The conceptual representation of frequency modulation. Here the concepts A, F and Ph represent the amplitude, frequency and phase, respectively. The concept FMObject is a defined as a set, and represents the Fourier sum of nodes (no order needs to be implied, since addition is commutative and associative.
Figure 3.5 shows an example for an FM object that follows this definition. Expanded into mathematical notation, it represents the function 1 1 4 1 2 sin(2π·440·t+ π+ sin(2π·620·t))+ sin(2π·860·t− π+ sin(2π·120·t)). 3 2 5 2 3 The emergence of similar principles in two different kinds of objects hints at the possibility of factorizing common operations and manipulations. It opens the way for questions such as: What happens, if a certain, meaningful operation for a MacroScore object is applied to an FMObject? New insights into structure may turn up by the simple recognition of analogies between disparate concepts.
3.3.3 Full Score The simple realization of a score concept as shown in figure 3.1 is ideal for discussing the principles exposed in the following chapters. Its simplicity
Fig. 3.5: A instance of the concept from figure 3.4. The tree-like drawing of instances is analogous to the drawing of a concept structure. The main difference concerns sets, where the number of elements is indicated by dots, and the elements themselves branch off from the set object.
is also the reason that current applications in R UBATO C OMPOSER use it as a basis. However, it only considers simple notes and does not deal with the many other parts relevant to music from the real world, such as key signatures, tempo changes, bar lines, and other instructions for articulation. A much more fully featured schema for musical scores has been devised based on a design in [56]. It is shown in figure 3.6. Its complexity, however, prohibits its use for the exemplification of fundamental principles. On the other hand, whenever the full power of musical representation is required, the need for such a full featured score schema is obvious. ❦ The next chapter deals with the embedding of the architecture of concepts in a mathematical framework. With the theory in place, the discussion of concepts continues in chapter 5 by introducing forms and denotators, which are the mathematical realizations of the notions presented in this chapter.
Fig. 3.6: The conceptual representation of a fully featured score. In order not to overload the picture, some links are not shown, but are instead indicated by a small arrow. Occurrences of concepts with the same name always denote the same concept.
R
Voice Onset Duration RelativeTempoSymbol
RelativeTempo
`
GeneralNotes
`
Z-String
R
{}
R
R
GeneralNote
Z-String
Z-String
{}
Tempi
R
Q
Q BarLine Duration
Repetition
BarLine
Onset
{}
{}
Voice
Repetitions
BarLines
Tempo
Z-String
Voice Onset ClefSymbol
Q
R
Z2
Q
Clef
{}
Clefs
AbsoluteTempo
Z-String
Z12 3
Voice Onset KeySymbol Voice Onset TimeSymbol
TimeSignatures
Q
Signatures
KeySignatures
Voice Onset AbsoluteTempoSymbol
Z-String
Date Instruments Edition
Q
Q
Z-String Z-String Z-String Z-String Z-String
Composer
Lines
BibInfo
Q
Score
3.3 Examples 29
Chapter 4
The Category of Modules
Mathematics provides the tools for describing the physical world. A travel deep down to the most elementary physics, the domain of subatomic phenomena, leaves us with more or less pure mathematics. The application of mathematics to the representation of thought and modeling of thought processes seems to lend itself to a similar treatment. Many areas of mathematics qualify for this task. However, two specific domains will form the basis of the precise formulation of an architecture, as proposed in the last chapter: The theory of modules and category theory. Whereas the former is contingent and could be exchanged for any other theory of mathematical spaces, the latter is necessary to hold the whole construction together.
4.1 From Monoids to Modules Abstract algebra deals with algebraic structures, i.e., structures consisting of a base set and one or more operations that satisfy a number of properties. The study begins with very simple structures and proceeds by enriching them with more and more properties. One may say that each additional property is intended solve a certain problem, quite similar to the way that the path from integers to complex numbers goes through the requirement of the solution of certain equations. In this section a selection of some important types of algebraic structures is presented. Observe that many more types exists, in-between and beyond, but these are the ones that play a major role in the mathematics used in R UBATO C OMPOSER. The following discussion is not meant as an overview of algebraic theory, but rather as a refresher. Since the algebra provides the fundamental computational structure of R UBATO C OMPOSER, hints at the
31
32
4 The Category of Modules
implementation discussed in part II are interspersed and preceded by the ➟ symbol. A basic knowledge of set theory will be assumed.
4.1.1 Monoids Definition: A monoid is a pair (M, ∗) where M is a set and ∗ : M × M → M is a binary operation with the following properties: 1. Associativity: For k, l, m ∈ M: (k ∗ l) ∗ m = k ∗ (l ∗ m). 2. Neutral element: There is e ∈ M, such that e ∗ m = m ∗ e = m for all m ∈ M. The monoid is commutative if, in addition, m ∗ n = n ∗ m for all m, n ∈ M. Since the requirements are modest, most of the familiar structures such as the various number domains (using either addition or multiplication as the operation ∗, and 0 or 1 as the neutral element, respectively) are monoids. An important monoid is the word monoid (W, ∗). This monoid is defined as follows: The set W consists of all finite sequences of letters from a given alphabet A. Its elements are called words or strings and include the empty word ǫ. The operation ∗ is the concatenation of strings and can thus be notated by simple juxtaposition. It is clear that concatenation is associative, but generally not commutative. A word monoid over an alphabet A is only commutative if A is a singleton set or the empty set. The neutral element is the empty word ǫ. The second important concept besides algebraic structures themselves is that of a homomorphism. Given two structures M and N, a homomorphism from M to N is a map that preserves structure. What structure preservation exactly means is to be defined as the case arises. For monoids, we have the following definition: Definition: Given two monoids (M, ∗M ) and (N, ∗N ), a monoid homomorphism f : (M, ∗M ) → (N, ∗N ) is a map of sets f : M → N such that 1. f(m ∗M n) = f(m) ∗N f(n) for all m, n ∈ M, 2. f(eM ) = eN for the neutral elements eM ∈ M and eN ∈ N. The following terminology is generally useful:
4.1 From Monoids to Modules
33
Definition: An isomorphism is a homomorphism that is bijective. An endomorphism is a homomorphism where the domain and codomain coincide. An automorphism is an endomorphism that is an isomorphism.
4.1.2 Groups The next step enriches the monoid structure with an additional property, that of invertibility. This property follows from the desire to solve the equations x ∗ m = e and m ∗ x = e for x, where m is any element of the structure. Definition: A monoid (G, ∗) is a group if every g ∈ G is invertible, i.e., there is h ∈ G such that g ∗ h = h ∗ g = e. Obviously (Z, ·) is not a group, but (Z, +) certainly is, as is (R\{0}, ·). The study of group theory is far-reaching and extends into such diverse scientific fields as botany, computer science and, not least, music theory. Permutations, and combinatorics in general, form a significant part of group theory. The definition of monoid homomorphism extends to that of a group homomorphism, i.e., a group homomorphism is a monoid homomorphism between groups. A group is commutative, or abelian, if it is so as a monoid.
4.1.3 Rings So far, only structures featuring a single operation have been presented. The ring structure adds a second operation and those properties that define how both operations interact. Definition: A ring is a triple (R, +, ∗), such that (R, +) is a commutative (additive) group with neutral element 0R and (R, ∗) is a (multiplicative) monoid with neutral element 1R . The operations of these two structures are linked through the distributivity property: for all x, y, z ∈ R, x ∗ (y + z) = x ∗ y + x ∗ z and (x + y) ∗ z = x ∗ z + y ∗ z. A ring (R, +, ∗) is commutative if the monoid (R, ∗) is so.
34
4 The Category of Modules
A commutative ring where every non-zero element is invertible is called a field. The definition of ring homomorphisms follows from the definitions of homomorphisms for groups and monoids. Definition: A map of sets f : R → S between rings R is S is a ring homomorphism if f is a group homomorphism of the additive groups of R and S and if f is monoid homomorphism of the multiplicative monoids of R and S, i.e., for all a, b ∈ R f(1R ) = 1S f(a + b) = f(a) + f(b) f(a ∗ b) = f(a) ∗ f(b) It follows that f(0R ) = 0S and f(−a) = −f(a), where −a is the additive inverse of a ∈ R. Rings are fundamental in the design of the foundation of R UBATO C OM POSER. The complete inclusion chain of number rings is provided, i.e., Z, Q, R, and C (➟ page 91). The last three are also fields. Remark: The rings Zn of integers modulo n are usually constructed as quotient rings. However for reasons of simplicity they are treated as independent rings in R UBATO C OMPOSER, since otherwise a more comprehensive algebraic environment including the theory of ideals would have to provided. This would be against the goal of compactness of the implementation. Such functionality is provided by general-purpose computer algebra systems such as M APLE and M ATHEMATICA.
Polynomials Polynomials with one or more indeterminates can be formed into a ring structure by building on the word monoid introduced above. However, it is also possible to short-cut and simply define the polynomial ring RhXi with coefficients in the ring R and one indeterminate X as follows: An element p in RhXi is of the form: k
p = ck · X + ck−1 · X
k−1
+ ck−2 · X
k−2
2
+ · · · c2 · X + c1 · X + c0 =
k X i=0
ci · X i
4.1 From Monoids to Modules
35
where ci ∈ R and k ≥ 0. The integer k is called the degree of the polynomial p, if k is the greatest number such that ck 6= 0. The ring operations are: n X
ci · X i +
i=0
n Y
i=0
n X
di · X i =
i=0
ci · X i ·
m Y
n X
(ci + di ) · X i
i=0
di · X i =
i=0
m+n X i=0
X
k+l=i
ck dl · X i
For the definition of the sum, it is assumed that both polynomials have the same degree. If this is not the case, the polynomial with the lower degree can be augmented with terms of the form 0 · X i to increase the “degree”. The additive and multiplicative neutral elements are 0R and 1R , respectively, regarded as constant polynomials (i.e., k = 0). R UBATO C OMPOSER currently provides polynomials with one freely specified indeterminate over the number rings (➟ page 91).
Strings A structure such as a polynomial ring with coefficients in the ring R is called an R-algebra. Another important R-algebra is the class of rings of strings with coefficients in a ring R. String rings are a generalization of polynomial rings and build on the word monoid. However, as in the case of polynomials, a direct definition is straightforward. The ring of R-strings RhUNICODEi over the alphabet of UNICODE characters is defined as follows: An element s in RhUNICODEi is of the form: s=
k X
ci · Wi
i=0
where ci ∈ R and the Wi are pairwise different words in the word monoid over the UNICODE alphabet. String rings are implemented in R UBATO C OMPOSER for all the number rings (➟ page 91). From the computer science perspective an element from a string ring may be viewed as a dictionary (implemented as a hash table) mapping UNICODE (native) strings to elements from the ring R, with the assumption that strings not present in the table implicitly map to 0R . The ring operations are:
36
4 The Category of Modules n X
ci i=0 n Y
· Wi +
ci · Wi ·
i=0
n X i=0 m Y
di · Wi = di · Wi =
i=0
n X
(ci + di ) · Wi i=0 n X m X
ci dj · Wi Wj
i=0 j=0
For the addition, both string elements must be “filled up” with word terms having zero coefficient to match them term by term. The multiplication Wi Wj is the concatenation of the words Wi and Wj . If the result of the multiplication contains terms a · Wi,1 Wj,1 and b · Wi,2 Wj,2 such that Wi,1 Wj,1 = Wi,2 Wj,2 = V, then both are replaced by the single new term (a + b) · V. Observe that, in general, the ring is not commutative. The additive neutral element is the string where all coefficients are 0R and the multiplicative unit is the string 1R · ǫ where ǫ is the empty word (i.e., the multiplicative unit of the word monoid). A few examples shall illustrate the particular ring of strings ZhUNICODEi. Since addition is commutative, we may just as well order the terms lexicographically along the words. a = 9 · ABU − 3 · BE + 7 · POL b = 3 · POL + 2 · REGA a + b = 9 · ABU − 3 · BE + 10 · POL + 2 · REGA a · b = 27 · ABUPOL + 18 · ABUREGA − 9 · BEPOL − 6 · BEREGA + 21 · POLPOL + 14 · POLREGA b · a = 27 · POLABU − 9 · POLBE + 21 · POLPOL + 18 · REGAABU − 6 · REGABE + 14 · REGAPOL c = 3 · AA + 5 · A c2 = 9 · AAAA + 15 · AAA + 15 · AAA + 25 · AA = 9 · AAAA + 30 · AAA + 25 · AA d=2·ǫ c · d = 6 · AA + 10 · A =d·c The embedding of strings in the structure of a ring is important, since, this accomplished, most, if not all, of the essential basic data types used in computer programming can now be regarded from the common perspective of ring theory.
4.1 From Monoids to Modules
37
Product Rings Given two rings (R, +R , ∗R ) and (S, +S , ∗S ), the product ring (R×S, +R×S , ∗R×S ) is defined on the Cartesian product of sets R × S, i.e., the set of pairs (x, y) with x ∈ R and y ∈ S. The rings R and S are the factors of the product R × S. The following properties describe the construction of the product ring R × S. For every x, u ∈ R and y, v ∈ S: (x, y) +R×S (u, v) = (x +R u, y +S v) (x, y) ∗R×S (u, v) = (x ∗R u, y ∗S v) 0R×S = (0R , 0S ) 1R×S = (1R , 1S ) The construction of the product of two rings can be iterated to three and more factors: in fact, (R × S) × T and R × (S × T) are isomorphic, thus both may be identified and simply written as R × S × T. A type of ring homomorphism specific to products is the class of projections. Given a product R1 ×· · · Rn with n factors, the projection pi to the i-th factor (i ∈ [1, n]) is defined as pi : R1 × · · · Rn → Ri : (x1 , . . . xn ) 7→ xi . Product rings provide an easy way to “bundle” several ring structures as one. They are, however, not to be confused with the higher-dimensional module structures, discussed in subsection 4.1.4. Product rings with any number of factor rings (but at least two) and their projections are available in R UBATO C OMPOSER (➟ page 91 and page 99).
4.1.4 Modules In computer programming, one way of constructing types larger than the basic types, which we have dealt with using the language of rings, is the introduction of (fixed-length) arrays with elements from a basic type. This is the starting point of the implementation of linear algebra including the best studied and most efficient algorithms in computer science based on matrix computation. The origin in mathematics is found in the study of vector spaces and, more generally, modules. These include linear and affine spaces, with their rich set of geometrical transformations, for example the two- and three-dimensional Euclidean spaces familiar from everyday perception. The application of such transformations is visible throughout science and art, and music in particular. Many manipulations common in musical compo-
38
4 The Category of Modules
sition can be expressed in the language of modules and module morphisms. The availability of an implementation of modules in R UBATO C OMPOSER therefore provides a necessary and powerful foundation for the development of compositional tools. Definition: Given a ring (R, +R , ∗R ), a (left) R-module is a triple (R, M, · : R × M → M), where (M, +) is an abelian group of vectors, and · is the scalar multiplication, with the following properties: 1. For all m ∈ M, 1R · m = m. 2. For all r, s ∈ R and m, n ∈ M, (r +R s) · m = r · m + s · m, r · (m + n) = r · m + r · n, r · (s · m) = (r ∗R s) · m. The subscripts on + and ∗ are usually omitted and ∗ is also written · if any confusion is unlikely. If R is a field, then M is called an R-vector space. There are several types of homomorphisms that one can define on modules. Let us begin with the most common and restricted one, the class of R-linear module homomorphisms. Definition: Given two R-modules (R, M, ·M ) and (R, N, ·N ), an R-linear homomorphism f : M → N is a group homomorphism such that for all r ∈ R and m ∈ M f(r ·M m) = r ·N f(m). All in all, the characteristic property of an R-linear homomorphism can be expressed as follows, where all possible notational simplifications have been made: For all r, s ∈ R and m, n ∈ M f(rm + sn) = rf(m) + sf(n). The set of n-dimensional column vectors over a ring R is made into an Rmodule using the element-wise addition of matrixes as the group operation and the external scalar multiplication. Such a module is then denoted by Rn and n is called its dimension. Any R-module isomorphic to Rn is called a free R-module of dimension n. Due to this isomorphism n-dimensional free R-modules can be identified and simply denoted Rn without any distinction. Most familiar are the Euclidean plane R2 and the Euclidean 3D-space R3 . Both are, of course, vector spaces. Each ring R is a one-dimensional free Rmodule itself. On the other hand, any abelian group G can be regarded as a Z-module, where the module addition is clear and the scalar multiplication
4.1 From Monoids to Modules
39
is defined by n · g = (sgn n)(g + g + · · · g) for n ∈ Z and g ∈ G. | {z } |n| times
So we see that modules encompass a large part of the common algebraic structures and thus serve as a useful container to handle them within one consistent framework. The homomorphisms considered so far are only defined between modules over the same ring R. An extension allows the definition of linear homomorphisms between two different rings R and S. Such a dilinear homomorphism consists of two parts: an S-linear homomorphism and a scalar restriction ϕ. Definition: If ϕ : R → S is a ring homomorphism and N is an S-module, then N[ϕ] (or N[R] if ϕ is clear) is the R-module defined by the scalar restriction ϕ via s · m = ϕ(s) · m, for s ∈ R and m in N. A dilinear homomorphism from an R-module M to an S-module N is a pair (ϕ : R → S, f : M → N[φ] ) consisting of a scalar restriction ϕ and an R-linear homomorphism f. As an example of a dilinear homomorphism, consider a scaling by 12 from Q2 to R2 . The scalar restriction ϕ : Q → R is the embedding of the rationals in the reals, which is a ring homomorphism. We now have to define the Q-linear homomorphism f : Q2 → R2[Q] . All we have to do, is to express the module elements in terms of the basis vectors e0 = (1, 0) and e1 = (0, 1) of R2 . Then for each (x, y) ∈ Q2 : f(x, y) = ϕ( 21 x) · e0 + ϕ( 12 y) · e1 . In practice and in the implementation of R UBATO C OMPOSER, we have a domain of the form Rm , a codomain of the form Sn , an R-linear homomorphism f : Rm → Rn and a ring changing homomorphism ϕ : R → S. The first step applies f to a vector x in Rm to produce a vector y in Rn . Then ϕ is applied component-wise to y and the result is a vector z in Sn . Analogously to rings, the construction of products of modules leads to new modules based on given ones: Definition: For any family (M Qi )i∈I of R-modules where I is a set of indexes, we have the product module i∈I Mi . This is the Cartesian product of the underlying groups where the addition as well as the scalar multiplication Q are defined coordinate-wise. The subset of all vectors m = (mi )i ∈ i∈I Mi where only a finite number of coordinate mi 6= 0 is also a module which is L then called the direct sum module i∈I Mi . In particular, a product where the set of indexes is finite is a always a direct sum.
40
4 The Category of Modules
The definition of direct sums entails two specific kinds of linear homomorphisms: L the class of projections and the class of injections. Given a direct sum I Mi we have for each index j ∈ I the projection πj :
M
Mi → Mj : (mi )i 7→ mj
I
and the injection ιj : Mj →
M
Mi : mj 7→ (0, . . . 0, mj , 0, . . .).
I
In addition to linear spaces one often considers affine spaces by specifying two domains, the space of points and the spaces of vectors. An object in an affine space is then an anchored vector. This distinction leads to an unnecessarily elaborate theory. It is also possible to locate the points and vectors in the same module space and extend the definition of dilinear homomorphisms to diaffine homomorphisms which are composed of a dilinear part and a translation part. For an R-module M and an element m ∈ M, the translation by m is the map Tm : M → M : x 7→ x + m. The notation Tm is suggestive of the way translations are composed, i.e., Tm Tn = Tm+n . Definition: Given two rings R and S, an R-module M and an S-module N, a diaffine homomorphism f from M to N is a map of the form Tn ◦ f0 , where Tn is the translation on N by n ∈ N and f0 : M → N is a dilinear homomorphism. If R = S and the underlying scalar restriction is the identity, then f is called an R-affine homomorphism. In R UBATO C OMPOSER, free modules over all the implemented rings are available (➟ page 87). Product modules are also available, which are always direct sums, since the index set is always finite. There is also a large collection of diaffine homomorphisms (➟ page 95) which can be arbitrarily composed and combined in the ways allowed and guaranteed by the theory.
4.2 Categories
41
4.2 Categories Two concepts occur again and again throughout mathematics: the notion of object, which may be numbers, points in space, etc., and the notion of mapping, which is known under various names, such as function, transformation, morphism, etc. If we abstract these two notions, we get a theory with an extremely simple basis, but with enormous consequences, the theory of categories. Virtually every field of mathematics can be recast into category theory, foremost set theory, but also formal logic. Because of the terminology used it is probably not unexpected that this theory plays an important role in computer science (see [61] and [71] for just two such examples, complete sub-fields of computer science, such as functional programming [8] have been built on category theory). It has also been applied to much broader contexts [37].
4.2.1 Definition The following treatment is based on the discussions in [53] and [47]. The two concepts objects and morphisms can in fact be reduced to the sole concept of morphism. Indeed, if we consider any object x, we can always construct an identity morphism Idx : x → x on this object, and use x and Idx interchangeably. Thus we have the following definition: Definition: A category C is a collection of entities f, g, h, . . . which are called morphisms, together with a partial composition, i.e., for some morphisms f and g, a new morphism f ◦ g of C is defined. An identity or object in C is by definition a morphism e such that, whenever defined, we have e ◦ f = f and g ◦ e = g. In addition, the following axioms are given: 1. If f ◦ g and g ◦ h are both defined, (f ◦ g) ◦ h is defined. 2. Whenever one of the two compositions (f ◦ g) ◦ h or f ◦ (g ◦ h) is defined, both are defined and they are equal; we denote the resulting morphism by f ◦ g ◦ h. 3. For every morphism f there are two identities, a “left” identity eL and a “right” identity eR , such that eL ◦ f and f ◦ eR are defined (and necessarily equal to f). The (unique) right identity of a morphism f is called the domain of f, written dom(f), and the (unique) left identity is called the codomain of f, written codom(f). Therefore one also uses the familiar notation f : a → b where a = dom(f) and b = codom(f). The collection of all morphisms f : a → b
42
4 The Category of Modules
is denoted by HomC (a, b) or C(a, b), or, if the category C is clear from the context, Hom(a, b) or a@b. The category Sets of sets is the most obvious example. Here the objects (or identity morphisms, depending on the choice of language) are the sets, and the morphisms are the set maps, where composition of morphisms is the usual composition of set maps. The terminology domain and codomain coincides with the familiar terminology known from the definition of functions on sets. The codomain of a function must not be confused with the range of a function. The range of a function is a subset of the codomain of the function and is defined as the set of all values produced by that function. The range is also called the image of the function. In many categories, for example the category of groups or the category of vector spaces, the morphisms are mappings, in the first case group homomorphisms and in the second case linear mappings. This need not be the case, however. The main characteristic of morphisms is their composability, which is a natural property of functions, but can also defined for other concepts that are not directly related to functions. As an example, consider the category of matrixes (over the real numbers, for example). Here the morphisms are the (m × n)-matrixes with m rows and n columns (where m and n are non-negative integers). The composition of morphisms is the multiplication of matrixes, i.e., the composition AB of two matrixes A and B is defined whenever A is an (m × n)-matrix and B is an (n × l)-matrix. The result AB is an (m × l)-matrix. The identity morphisms are the identity (n × n)-matrixes, which have 1 on the diagonal, and 0 otherwise. According to the identification introduced above, they also provide the objects of the category. Another example is the path category of a directed graph Γ . The morphisms of the category are the paths starting a vertex v and ending at a vertex w, for all vertexes v, w ∈ Γ . The identity morphisms are the lazy paths, which are the paths from v to v, for all v ∈ Γ , and containing no arrow at all. Of course, the identity path on v identifies with the vertex v. This example makes it obvious why morphisms are called “arrows” by many treatments of category theory. Path categories will play an important role later in this discussion (subsection 4.2.2). In this context we shall concentrate on the category of modules Mod. The objects are modules over arbitrary rings and the morphisms are the diaffine homomorphisms. Composition is obviously the composition of homomorphisms. A first construction of a category based on a given category is the following: Definition: For every category C we have the opposite category Copp . Its morphisms are the same as those of C, but composition is defined as f ◦opp g = g ◦ f. More simply stated, a morphism f : a → b in C thus becomes f : b → a in Copp , i.e., all directions are reversed.
4.2 Categories
43
The properties commonly used to characterize functions can be carried over in a more general way to the abstract setting of the morphisms of a category. Definition: A morphism f : a → b is a monomorphism, iff for any two morphisms g : c → a and g′ : c → a, f ◦ g = f ◦ g′ implies g = g′ . A morphism f : a → b is an epimorphism, iff for any two morphisms g : b → c and g′ : b → c, g ◦ f = g′ ◦ f implies g = g′ . A morphism f : a → b is a section, iff there exists a left inverse g : b → a, i.e., g ◦ f = Ida . A morphism f : a → b is a retraction, iff there exists a right inverse g : b → a, i.e., f ◦ g = Idb . A morphism f : a → a is an endomorphism. A morphism f : a → b that is a section and a retraction is an isomorphism. An isomorphism that is also an endomorphism is called an automorphism.
4.2.2 Functors Going one level higher up, morphisms between categories, also known as functors, are introduced: Definition: If C and D are categories, a functor F : C → D is a function which assigns to every morphism c in C a morphism F(c) in D such that 1. F(c) is an identity if c is so, this implies that F maps objects to objects, 2. if c◦c′ is defined in C, then F(c)◦F(c′ ) is defined and F(c◦c′ ) = F(c)◦F(c′ ). The composition of two functors F : C → D and G : D → E is a functor F ◦ G : C → E. In contrast to the usual covariant functors, a contravariant functor is a functor F : Copp → D. When the combined term contravariant functor F is made explicit, then the notation F : C → D is also used instead of F : Copp → D. Two categories C and D are called isomorphic is there exists a functor isomorphism, i.e., there exist two functors F : C → D and F−1 : D → C such that F ◦ F−1 = IdD and F−1 ◦ F = IdC , where IdC and IdD are the identity functors on C and D, respectively. An important class of categories is based on directed graphs. In the following, the definition of directed graphs from [52] is used. A directed graph Γ
44
4 The Category of Modules
is specified as Γ : A → V 2 , where V is the set of vertexes and A is the set of arrows, and provided with the usual definitions for the head and the tail of an arrow, i.e., head = pr1 ◦ Γ and tail = pr2 ◦ Γ . A path p in Γ is a sequence of arrows (ai )i=1,...n such that for every i < n, head(ai ) = tail(ai+1 ), where n = l(p) is the length of the path p. A path of length 0, which consists only of a vertex v and has no arrows, is called the lazy path at v and is simply denoted by v. Given two paths p and q, such that the vertex at which p ends is equal to the vertex at which q begins, the composition qp is the path resulting from joining p and q at their common vertex. We have l(qp) = l(q) + l(p). It is clear that a lazy path v provides the identity, if composition is defined, i.e., vp = p or pv = p. Now, given a graph Γ : A → V 2 , the path category Path(Γ ) on Γ is defined using the above definitions as follows: Its elements are the paths on Γ and the composition is the composition of paths. The identities are the lazy paths v for each v ∈ V. Therefore the objects of Path(Γ ) can be identified with the set V of vertexes of Γ . The domain of a path is its starting vertex, the codomain is its ending vertex. If we are given a binary relation ∼ among some paths with equal domain and codomain, we can consider the smallest equivalence relation ∼′ which contains ∼ such that f ∼′ g with dom(f) = dom(g) = d and codom(f) = codom(g) = c and for every h and k with dom(h) = c and codom(k) = c, we have c ◦ f ∼′ c ◦ g and f ◦ d ∼′ g ◦ d. Such a relation gives rise to the quotient category Path(Γ )/∼. Its morphisms are the equivalence classes of paths and the composition is the composition of representatives of these classes. The path categories allow us to introduce a graphical notation, called diagram, to represent relationships in categories. Definition: A diagram in a category C is a functor ∆ : Paths(Γ ) → C, where Γ is a digraph (also called diagram scheme in this context). The diagram ∆ is said to commute with respect to a relation ∼ among the paths of Γ , whenever ∆ factorizes through Paths(Γ )/∼. If the relation is maximal (i.e., it identifies all paths having common start and end vertexes), then the diagram is said to be commutative without specification of ∼. A simple example shall illustrate the above definition of a diagram. Given objects A, B, C, D, and set maps f : A → B, g : A → C, h : B → D, k : C → D, in Sets we have a diagram ∆ : Paths(Γ ) → Sets on the following graph Γ : A → V 2 with V = {A, B, C, D} and A = {f, g, h, k}:
4.2 Categories
45
A
f
h
g ? C
- B
k
? - D
The diagram ∆ maps the vertexes (or lazy paths) A, B, C, and D to the homonymous sets (or identity maps) in Sets. The arrows f, g, h, and k are mapped to the homonymous set maps. In particular, this diagram commutes whenever h ◦ f = k ◦ g. A commutative diagram is effectively a graphical way of representing an equation or a set of equations. Definition: A functor F : C → D is full whenever F(C(x, y)) = D(F(x), F(y)) for all pairs of objects x and y. It is called faithful whenever F : C(x, y) → D(F(x), F(y)) is injective for all pairs of objects x and y. It is fully faithful if F : C(x, y) → D(F(x), F(y)) is a bijection. Definition: If C is a category, a subcategory is a subcollection C′ of C such that for each morphism f in C′ its domain and codomain are also in C′ , and such that for any two morphisms f and g in C′ , whenever f ◦ g is defined in C, it is also defined in C′ . A subcategory C′ of C is a full subcategory, if for every pair of objects x and y in C′ every morphism x → y in C is also in C′ . The identity on the morphisms in C′ induces an embedding functor C′ → C.
4.2.3 Natural Transformations If we consider functors as objects, then natural transformations are morphisms between these objects. Definition: If F, G : C → D are two functors, a natural transformation t : F → G is a system of morphisms t(c) : F(c) → G(c) in D, for each object c in C, such that for every morphism f : x → y in C, we have G(f) ◦ t(x) = t(x) ◦ F(f). This property can also be written as the following commutative diagram in D:
46
4 The Category of Modules
F(x)
F(f) ? F(y)
t(x)G(x) G(f) ? - G(y) t(y)
It is easy to see that we have a category Func(C, D) where the objects are the functors F : C → D and the morphisms are all the natural transformations Nat(F, G) between functors F, G : C → D. If the collections of morphisms x@y = Hom(x, y) in a category C are sets, i.e., objects in the Sets category, then two related functors can be constructed: For every object x in C, we have the functor x@ : C → Sets with y 7→ x@y. This functor sends every object y in C to the set of morphisms from x to y and every morphism f : y → z to the collection of set functions x@f : x@y → x@z with u 7→ f ◦ u, where u : x → y (see figure 4.1 for an illustration of this relationship).
C
Sets x@y
x@
x
u:x→y y x@f
f z
f◦u:x→z x@
x@z
Fig. 4.1: The covariant functor x@ : C → Sets for each object x ∈ C. The picture also illustrates the characteristic property of a covariant functor F = x@, i.e., F(f : y → z) : F(y) → F(z).
Analogously, we have for every object y in C the contravariant functor @y : Copp → Sets with x 7→ x@y. This functor sends every object x in C to the set of morphisms from x to y and every morphism f : x → z to the collection
4.2 Categories
47
of set functions f@y : z@y → x@y with u 7→ u ◦ f, where u : z → y (see figure 4.2 for an illustration of this relationship).
C
Sets
x@y
@y
y
u◦f :x→y x f@y
f z
u:z→y @y
z@y
Fig. 4.2: The contravariant functor @y : C → Sets for each object y ∈ C. The picture also illustrates the characteristic property of a contravariant functor F = @y, i.e., F(f : x → z) : F(z) → F(x).
Of course, the functors just introduced can be embedded in categories. In particular, the category Func(Copp , Sets) of contravariant set-valued functors on C is denoted by C@ . Its objects are called (set-valued) presheaves over C. Thus, amongst others, for each y ∈ C, @y is a presheaf over C. Ap@ plied to the category of modules, we have the category Mod of presheaves over the modules. For instance, @M, where M is a module, is an object of @ Mod . Thus, for a given module M, the presheaf @M over Mod is a function which maps any module N to the set of all diaffine homomorphisms N → M. Definition: Given two categories C and D and an object a of D, we have the constant functor [a] : C → D with [a](x) = a for all objects x ∈ C and [a](f) = Ida for all morphisms f in C. Given a digraph Γ and a fixed object c in a category C, the constant diagram ∆c = [c] associates every vertex of Γ with c and every arrow with Idc . For a diagram ∆ in C, a natural transformation [c] → ∆ is called a cone on ∆. Similarly, a natural transformation ∆ → [c] is called a cocone on ∆. The preceding definition is a precise statement of the situation exemplified in figure 4.3. There we have a diagram scheme with the vertexes u, v, w, and y, and the arrows auy , auv , avw , and ayw . A particular natural transformation t : [c] → ∆ is the cone represented in the center of the
48
4 The Category of Modules
figure. The stated fact is that the central figure commutes, when regarded as a diagram. An example is the commutative diagram on the right for the triangle {c, ∆(u), ∆(v)}, which in the form of an equation states that ∆(auv ) ◦ t(u) = t(v) ◦ Idc , or, more simply, ∆(auv ) ◦ t(u) = t(v). c u auv
auy
avw
t(u)
∆(u)
∆c (auv ) = Idc
∆(auv )
∆(y)
∆(u)
v
t(w)
t(v)
ayw
Γ
∆c (u) = c
t(y)
t(u)
y
∆(auy )
w ∆(auv )
∆(v)
∆(w)
∆(ayw )
∆c (v) = c
∆(avw )
(b)
(a)
t(v)
∆(v)
(c)
Fig. 4.3: The digraph Γ for the diagrams ∆ and ∆c (a). A cone on ∆, the reason for the name should be obvious (b). The particular commutative diagram for u and v (c).
Figure 4.4 depicts the analogous example for a cocone instead of a cone. ∆(y)
∆(u)
u auv
v
auy
Γ
∆(auv )
y
∆(v) ayw
∆(auy )
∆(ayw )
∆(u)
t(u)
∆(w)
∆c (u) = c
∆(avw )
t(u)
t(w)
t(y)
∆(auv )
∆c (auv ) = Idc
t(v) avw
w
∆(v) c (b)
(a)
t(v)
∆c (v) = c
(c)
Fig. 4.4: The digraph Γ for the diagrams ∆ and ∆c (a). A cocone on ∆ (b). The particular commutative diagram for u and v (c).
4.2.4 Yoneda’s Lemma All the tools are now available to come to a central point in the theory. Definition: Given a category C where the collections of morphisms x@y are sets, the Yoneda embedding Y is the functor Y : C → C@ : x 7→ Y(x) = @x
4.2 Categories
49
where the natural transformations Y(f : x → y) : Y(x) → Y(y) are defined by u 7→ f ◦ u for u : z → x ∈ Y(x)(z) = z@x. If C = Mod, we write Y(M) = @M for a module M. A functor F in C@ is called representable whenever there is an object c ∈ C such that F is isomorphic to Y(c). Obviously, for every module M the functor @M is representable. Lemma: For every functor F in C@ and every object c ∈ C, we have the collection of natural transformations Nat(Y(c), F) from the functor Y(c) to F, and the map ǫ : Nat(Y(c), F) → F(c) : h 7→ h(c)(Idc ) is a bijection. The lemma states that the full subcategory of representable functors in C@ is equivalent to C (i.e., there is a fully faithful functor between C@ and C that is essentially surjective on objects), and that such an equivalence is given by the Yoneda embedding. When we put F = Y(d), we get a bijection ∼ ǫ : Nat(Y(c), Y(d)) → Hom(c, d). Again, if C = Mod, we write F(A) = A@F, even in the case where F is not representable, and we have then the bijection ∼ Nat(@A, F) → A@F. In the case where F is the representable functor @M, ∼ we have Nat(@A, @M) → @M(A) = A@M. In this context, we call A the address, since we look at F (or @M) under all morphisms from the viewpoint of A.
4.2.5 Limits and Colimits Constructions through the characterization of limit and colimit provides us with a class of objects which will be of importance later on, when we take up again the discussion of concepts. We need a few preliminary definitions: Definition: An object e in a category C such that there is a unique morphism ! : c → e for every object c in C, is called a final object, it is usually denoted by 1. A final object i in the opposite category Copp is called an initial object in C, it is usually denoted by 0. This means that for every object c in C, there is a unique morphism ! : 0 → c.
50
4 The Category of Modules
Definition: Given a “basis” object b of a category C, the comma category C/b has all the morphisms f : x → b for x ∈ C as objects, and for two objects f : x → b and g : y → b we have HomC/b (f, g) = {u | g ◦ u = f}, the set of “commutative triangles above b” with the evident composition. The cocomma category C/opp b is the comma category (Copp /b)opp , i.e., for two objects f : b → x and f : b → y we have HomC/opp b (f, g) = {u | u ◦ f = g}. See also figure 4.5. uxy y
x
uyx fy
fx uxz
b fz z
Fig. 4.5: The objects b, x, y and z are objects of a category C. The morphisms fx , fy and fz of C are objects of the comma category C/b. The morphisms from fx and fy in C/b are all the morphisms uxy ∈ C such that the triangle formed by fx , fy and uxy commutes. The cocomma category is similar, except that the arrows are reversed.
Given a category C and a digraph Γ , we have the category Func(Path(Γ ), C) of diagrams in C. Given such a diagram ∆, we have the comma category Func(Path(Γ ), C)/∆ and, in this category, the full subcategory Cone(∆) of cones [c] → ∆. Similarly, the subcategory Cocone(∆) of cocones ∆ → [c] is defined in the cocomma category Func(Path(Γ ), C)/opp ∆. Definition: For a diagram ∆ : Path(Γ ) → C, a limit of ∆ is a final object in the category of cones Cone(∆). This object is denoted by lim(∆). Thus, the limit is not an object of C, but a cone, including the constant “top object” of the cone. But often, if the rest is clear, one also writes lim(∆) to denote the top object itself.
4.2 Categories
51
Dually, a colimit of ∆ is an initial object in the category of cocones Cocone(∆). This object is denoted by colim(∆). Again, the colimit is not an object of C, but a cocone, including the constant “bottom object” of the cocone. If the rest is clear, one also writes colim(∆) to denote the bottom object itself. A category C for which all limits (of finite diagrams) exist is called (finitely) complete. A category C for which all colimits (of finite diagrams) exist is called (finitely) cocomplete. The familiar construction of a product, for example the Cartesian product in set theory, is just a special case of a limit, and serves as an example for elucidating the above definition. Figure 4.3 helps to understand the following commutative diagram which characterizes the product as a limit: c A A u A ? A pra ◦ u a × b A prb ◦ u A A @ @ A @ A pra prb @ A R @ UA a b The cone of the product is the natural transformation pr : [a × b] → ∆. The diagram ∆ is discrete (i.e., it there are no arrows) and represented by the vertexes labeled a and b. The natural transformation pr is defined through the projections pra and prb to the factors a and b, respectively, of the product. The top object of the cone is what is called the product, and therefore denoted by a × b. By the common abbreviation introduced in the definition of the limit, we may write a × b = lim(∆). The condition that lim(∆) is a final object translates to the fact that for every object c there is a unique morphism u : c → a × b such that the diagram commutes. The dual of a product is a coproduct. It is a special case of a colimit. The following commutative diagram characterizes the coproduct:
52
4 The Category of Modules
a
b
A@ A @ ina inb A @ A @R @ A a⊔b A u ◦ ina A u ◦ inb A u A A UA ? c The cocone of the coproduct is the natural transformation in : ∆ → [a ⊔ b]. The diagram ∆ is discrete (i.e., there are no arrows) and represented by the vertexes labeled a and b. The natural transformation in is defined through the injections ina and inb from the cofactors a and b, respectively, of the coproduct. The bottom object of the cone is what is commonly called the coproduct, and therefore denoted by a ⊔ b. By the common abbreviation introduced in the definition of the colimit, we may write a ⊔ b = colim(∆). The condition that colim(∆) is an initial object translates to the fact that for every object c there is a unique morphism u : a ⊔ b → c such that the diagram commutes. An important result (without proof) is this: the category of sets Sets has arbitrary limits and colimits, thus it is both complete and cocomplete. The category C@ of presheaves over a category C is complete and cocomplete.
4.2.6 Topoi If we are working with a category such as Mod we have to deal with a dilemma. On the one hand we have the category of sets Sets that provides us with a wealth of principles to construct objects, such as limits, colimits, and powersets. But its morphisms are not characterized in any specific way, they are just arbitrary set maps. On the other hand, we have the category of modules, where the morphisms are the highly characterizable diaffine homomorphisms. In this category, we do not have, however, all the possibilities for the construction of useful new objects as in Sets, such as powersets. The goal is to find a way to provide such a category as Mod with the desired properties and this leads to special categories, called topoi that imitate such constructions, but also retains the particular structure of the morphisms. We shall not go into the details of the study of topoi, which is a farreaching theory. However, the most important result is that the category @ of presheaves Mod is a topos. This means that it allows all limits and col-
4.2 Categories
53
imits, as well as subobject classifiers Ω. The last notion can be intuitively understood when related to the category of sets. There Ω = 2 = {0, 1}, @ and Set(S, 2) is the set of subsets of S, i.e., the powerset of S. In Mod , the subobject classifier Ω plays the role of 2, i.e., S@Ω characterizes the set of of subobjects X ⊂ S. @
Thus, by introducing Mod , we have provided the category of modules with the properties (that Mod did not have) that we need to cast our architecture of concepts in a mathematical mold. The importance of topoi for the mathematics of music may be motivated by the following example: We already stated that the theory of modules, more precisely the category of modules, provides us with a powerful tool to pursue mathematical theories of music. Pitch classes modeled using the ring Z12 of integers modulo 12 is a case in point. It is inevitable that the need for arbitrary sets of pitch classes, and sets of sets of pitch classes (sets of chords), arises. These sets cannot, however, be embedded in Mod. We would have to switch to the Sets category and thereby lose module algebra. Both conflicting requirements, that of powersets and that of module algebra, @ are met by working in the category Mod of presheaves over the modules.
Chapter 5
Architecture of Concepts II: Forms and Denotators
Armed with the necessary tools from module and category theory introduced in the previous chapter, we may now embark on the restatement of the principles from section 3.2 in terms of this theory. We shall see that the concepts of Simple, Limit, Colimit and Power reappear in a natural way.
5.1 Forms The theory of functors allows us to speak of generalized points in space, i.e., addressed points instead of traditional points known from standard mathematical approaches. @
opp
In the following we work with the category Mod of functors Fu : Mod → @ Sets. An argument module M to a functor Fu ∈ Mod is called an address of Fu and a module morphism f : M → N is called an address change. A form is a mathematical space and is formally defined as follows based on the presentation in [47]: Definition: A form F is a quadruple F = (NF, TF, CF, IF) where 1. NF is the name N(F) of F. We shall later see what a name is exactly made of. 2. TF is one of the following symbols: Simple, Limit, Colimit, or Power. It is called the type T(F) of F. 3. CT is one of the following base on the type of F: a. Case Simple: CF is a module M. b. Case Power: CF is a form. c. Case Limit or Colimit: CF is a diagram D of forms.
55
56
5 Architecture of Concepts II: Forms and Denotators
It is called the coordinator C(F) of F. The diagram D is a diagram of functors Fun(Fi ) for a family of (Fi )i of forms (see point 4 below). @
4. IF is a monomorphism of functors IF : Fu → X in Mod with: a. Case Simple: X = @M. b. Case Power: X = Ω Fun(CF) . c. Case Limit: X = lim(D). d. Case Colimit: X = colim(D). It is called the identifier I(F) of F. The domain Fu is called the space (functor) of F and denoted by Fun(F). ➟ In the current implementation Fu is always taken to be X, and the monomorphism is the identity on X. This means that Fun(F) = X in that special case. @
A remark about diagrams: a “diagram of forms” is a diagram in Mod since @ Fun(F) returns a functor which is an object in Mod . For the notation of forms, we resort to the notation introduced in section 3.1, i.e.: Name:.Type(Coordinator). A form of type Simple with module R and name Reals is thus denoted by Reals:.Simple(R). A form of type Limit with the name L and the coordinate forms Li , with i ∈ {1, 2, 3}, is written L:.Limit(L1 , L2 , L3 ). Observe that the diagram, although essential, is not explicitly mentioned. It has to be provided separately if necessary. The graphical notation, also introduced in section 3.1, can easily be transferred to the representation of forms. This representation has the advantage of providing a means for capturing the diagram, too. If the diagram of the above Limit form L consists of the morphisms f1 : L1 → L2 and f2 : L3 → L1 , the graphical representation of L looks as follows: L Q L1
f1
L2 f2
L3
5.2 Denotators
57
Observe also that we have not yet discussed the structure of names, which may therefore taken to be simple strings, for the time being.
5.2 Denotators A denotator is a (generalized) point in a form. Its formal definition is as follows: Definition: Let M be an address. An M-addressed denotator is a triple D = (ND, FD, CD) where 1. ND is the name N(D) of D. 2. FD is the form F(D) of D. 3. CD is an element of M@Fun(F(D)) called the coordinates C(D) of D. Again, the notation of denotators is based on the notation from section 3.1. However this time, we have an additional address to deal with: Name:M@Form(Coordinates). We can now also say what a name is, both for forms and denotators: it is just another denotator, of whatever form. For the sake simplicity, however, the names will henceforth just be simple strings. How does a Simple denotator D with address Z of form Reals defined above look like? The name is arbitrary, for example N(D) = r. The form is of course F(D) = Reals. Finally C(D) must be an element of M@Fun(F(D)). In the case at hand Fun(Reals) = @R2 and M = Z, thus C(D) ∈ Z@R. In short, it must be a diaffine module homomorphism f : Z → R. Thus: r:Z@Reals(f) This most simple example illustrates the functorial approach of generalized points in mathematical spaces. Here, the value of a point is no longer a real number, but a module homomorphism with domain Z and codomain R. Of course, we can get back to our system of traditional points by limiting ourselves to constant morphisms, in particular to 0-addressed denotators. A denotator of form Reals that just represents the number 2.3, for example, will be defined using a constant morphism f : R0 → R with f(x) = 2.3. In this case, we may also notate more simply r:R0 @Reals(2.3),
58
5 Architecture of Concepts II: Forms and Denotators
or even r:0@Reals(2.3), where the symbol 0 makes clear that the null module corresponding to the codomain module is meant as the address. We may omit that too and are left with r:@Reals(2.3). These comments naturally apply to denotators of types other than Simple.
5.3 Computational Category Theory Forms and denotators provide us with the means for implementing an important part of what we may call computational category theory. This section elucidates this idea.
5.3.1 Data Types in Programming Languages Forms are a generalization of the concept of data types that have been present in programming languages almost since the beginning of modern computer science. It is therefore useful to explain the relationship between the form types and the different types of data structures as they occur in Pascal [34] and Haskell [9], to take two programming languages, one ancient and one recent. Observe that the term “type” is used in two quite different meanings: In denotator theory, the type is one a few fundamental ways that a form (respectively, a denotator) is constructed, while in the theory of programming languages a type corresponds to a form in denotator theory.
Simple Forms of type Simple correspond to the primitive data types in programming languages, such as integer or real in Pascal, or Int or Char in Haskell. Most commonly, the primitive data types betray a particular closeness to the machine representation, which natively provides for integers, characters and floating-point numbers. Only in higher-level languages, such as Haskell or Lisp, do primitive types also include strings or similar types. This is in stark contrast to the Simple forms, which are based on module theory. Modules encompass all the primitive data types usually found in programming languages. Additionally, strings, vectors, and products belong
5.3 Computational Category Theory
59
to this class. In short, rings and free modules over these rings are available. Haskell does provide a product data type with tuples as objects. However module and ring theory allows consistent algebraic operations throughout, which is not possible in programming languages, where operations are provided ad hoc for each data type separately. Thus, tuples in Haskell only allow construction and projection, and not pair-wise multiplication or addition, for example.
Limit The special case of type Limit, product, is provided by most programming languages. As an example, consider the Note form (the other forms are assumed to have already be defined, this applies also to subsequent data types): Note:.Limit(Onset,Pitch,Duration,Loudness,Voice)
In Pascal, the equivalent to this form would be implemented as a record data type: type Note = record o: Onset; p: Pitch; d: Duration; l: Loudness; v: Voice end
Haskell uses the data construction: data Note = Note { o p d l v }
:: :: :: :: ::
Onset, Pitch, Duration, Loudness, Voice
There are several possible ways of defining such a product in Haskell. In this case labeled fields have been used. This is essentially the same as the following: data Note = Note Onset Pitch Duration Loudness Voice
The second Note is in fact the constructor for objects of this type. To create a Note object, it is invoked as follows: Note 0.0 67 1.0 50 0
60
5 Architecture of Concepts II: Forms and Denotators
An interesting effect is that the Note constructor behaves like a curried function, i.e., partial evaluations, such as Note 0.0 are possible.
Colimit Data types corresponding to the special case of Colimit, the coproduct, are implemented in various ways. Object oriented languages, for example, favor subclassing to provide its functionality. In Pascal, the corresponding data type is called variant record. To illustrate the correspondences, consider the form Event:.Colimit(Note,Rest)
by which an event consisting of either a note or a rest is meant. The corresponding variant record in Pascal uses an additional field kind to discriminate between the two alternatives: type EventType = (NOTE,REST); type Event = record case kind: EventType of NOTE: (note: Note); REST: (rest: Rest) end end
In Haskell, this translates again into a data declaration, this time with the version using alternatives: data Event = NoteEvent Note | RestEvent Rest
Here, NoteEvent and RestEvent are the constructors of the two alternatives of the Event data type. These constructors are also used for pattern matching in the definition of functions which have Event as formal argument type.
Power General purpose languages do not provide data type constructors for setvalued data types. A notable exception is SETL [64], which is built around the handling of sets. However, many higher level languages, in particular all functional programming languages, directly support lists. Lists are preferred to sets, since they map more comfortably to machine representations,
5.3 Computational Category Theory
61
which are naturally ordered. Sets are then implemented using lists by eliminating duplicates and enforcing a canonical order. Therefore sets are less efficient than lists during runtime. Pascal has no direct provision for general lists or sets, except for a very special data type that implements bitsets. It allows for sets containing numbers from 0 to 31, and maps directly to a 32-bit machine word. Haskell has natural support for lists. To handle sets, library functions are provided, which regard lists as sets and take care of the correct management. The well-known Score form Score:.Power(Note)
translates to the Haskell list data type data Score = Score [Note]
A main characteristic common to all these types is the absence of addresses in programming language data types. In fact, in every case, an implicit null address can be assumed. Programming languages do not provide the power of the functorial approach, this is unique to forms and denotators. It is to be noticed that the types Limit and Colimit have correspondences only in the special case of discrete diagrams. The next section discusses the meaning of Limit and Colimit forms that have more general diagrams.
5.3.2 The Role of Diagrams Discrete diagrams in forms of type Limit and Colimit do not raise any questions. This is the case of products and coproducts which are handled the same way as in programming languages. However, whenever arrows are present, we must explain what this means for the concrete realization of denotators of such a form.
Limit Consider again the Limit form introduced above and repeated in figure 5.1 (a). The diagram corresponding to the form L is shown in figure 5.1 (b). This diagram is characterized by the two arrows (or morphisms) f1 and f2 . The universal property of limits requires that the diagram commutes, in particular the bottom part involving L1 , L2 , L3 , f1 and f2 . Consider now a denotator d with this form. It is constructed from coordinate denotators d1 with form L1 , d2 with form L2 and d3 with form L3 . The morphisms of the diagram
62
5 Architecture of Concepts II: Forms and Denotators
c L
u
Q L1
f1
L2 f2
L L3
prL1 prL2 L1
prL3 L2
f1
L3
f2 (a)
(b)
Fig. 5.1: A form of type Limit (a). The corresponding diagram (b).
now specify constraints on the coordinates, i.e., we must have f1 (d1 ) = d2 and f2 (d3 ) = d1 . If these conditions are not fulfilled, the denotator d does not exist, i.e., its construction fails. Constraints can be exploited in two ways: Either one provides all coordinates, and the constraints ensure the consistency of the result. Or one provides only some coordinates, in this case, for example, d3 . The other coordinates d1 and d2 are then inferred from the morphisms f1 and f2 . Diagrams, therefore, effectively provide a means for a certain type of constraint programming.
Colimit Although the Colimit type is dual to the Limit type, the concrete meaning of diagrams is completely different. Consider the Colimit form in figure 5.2 (a). The corresponding diagram is shown in figure 5.2 (b). The graphical representation of a form is not to be confused with its diagram defining its universal property. A denotator d with form L is constructed from either a denotator d1 with form L1 or a denotator d2 with form L2 or a denotator d3 with form L3 . What is the diagram of L used for? Take a denotator D1 with coordinate d1 , a denotator D2 with coordinate d2 , and a denotator D3 with coordinate d3 , for example. The diagram now generates a
5.3 Computational Category Theory
63
f2 L ` L1
f1
L2
f1
L1
L2
inL1 inL2
L3
f2
L3
inL3
L u
c
(a)
(b)
Fig. 5.2: A form of type Colimit (a). The corresponding diagram (b).
partition on the collection of denotators with form L where the equivalence relation ∼ is given by the diagram. In our example, this amounts to the equivalences D1 ∼ D2 iff f1 (d1 ) = d2 and D3 ∼ D1 iff f2 (d3 ) = d1 and, finally, D3 ∼ D2 iff f1 (f2 (d3 )) = d2 . To determine whether two denotators with a given Colimit form L are equivalent involves checking a system of equivalences generated from all relevant paths in the diagram of L. Similarly, to check the consistency of a denotator with a given Limit form L means to check a system of equivalences generated from all paths in the diagram of L. If the diagram contains loops or cycles, this may require the computation of fixed points. It is thus clear that diagrams add considerable complexity to the implementation of forms and denotators.
Chapter 6
Software Components for Computational Theories
We have now described the theory with hardly any allusion to issues of implementation. An important aim of this work is to make available the theory in a computational form as far as it is possible at all. A complete discussion of the actual implementation forms the whole of part II. In this chapter, a few issues about the relation between theories, mathematical or other, and machine based interfaces are commented on. In fact bringing theories to computers involves two related but subtly different aspect: On the one hand, the computational realization of the concerned theoretical objects through data structures and algorithms. In our case, this includes mathematical modules as well as forms and denotators and is the subject of chapter 10 and chapter 11. On the other hand, having the necessary data structures and algorithms is fair enough, at least to the competent programmer. It is, however, not sufficient for the user more interested in getting some interesting work done, based on his knowledge of the theory and not on the knowledge of its implementation. Indeed, skills in programming should rarely be required for handling theoretical knowledge with computers. Thus, an adequate interface must be developed that allows the mathematician, or composer, or musicologist, to communicate with the tool that embodies the theories he or she uses. The design of a user interface is of crucial importance. There are many ways to address this problem, from the basic premises such as the choice between a graphical or textual presentation, to the more intricate details within the chosen interface paradigm itself. The fundamental insight is this: the form of the human-computer interface effectively determines how theories are perceived and shapes the options of handling and manipulating theories, whether it restricts or expands them. This does matter even more at the meeting point of art and abstract theory. Here theories themselves and the way of perceiving them are in constant flux. Interfaces must therefore assist in the experimentation within this fluctuating world.
65
66
6 Software Components for Computational Theories
6.1 Types of User Interface There are many types of user interfaces, although a precise taxonomy is not really possible, since most actual implementations employ hybrid approaches. Leaving aside command-line interface such as realized in Humdrum, we consider only graphical user interfaces that are the most common today. One immediate distinction that can be made is between those interfaces that realize some kind of direct manipulation and those that follow the principles of data flow computation.
Direct Manipulation Direct manipulation is the way almost all graphical design software works, such as P HOTO S HOP for bitmap manipulation or B LENDER for 3D animation. The principle basically involves selecting an object, and then performing an operation on it. The ubiquitous cut-and-paste operation scheme also falls into this category. In the field of music software, this principle also dominates. Audio data editors work by allowing the user to select an extract of the sound wave data shown on-screen, which can then be cut, copied and pasted, as well as transformed using typical operations such as amplification or digital filtering (figure 6.1). In the field of symbolic music manipulation, software MIDI sequencers take a middle position in degree of abstractness. A main use of a sequencer is to allow the placement of MIDI events through various types of visualization, for example mimicking common music notation or the simpler form resembling a piano-roll. Apart from cut-and-paste operations, simple musically meaningful manipulations can be effectuated, such as transpositions, inversions or retrograde transformations (figure 6.2). This kind of approach is commonly regarded as the most intuitive, where the meaning of “intuitive” is left rather unclear. It is true, however, that it emulates the pattern of grasping an object, doing something with it, and putting it back where it comes from or to another place. On the other hand, it becomes difficult, although not quite impossible, to keep a history of all operations and to make changes at a particular step in the process. Thus large-scale experimentation is not the domain of direct manipulation interfaces. Another disadvantage is the near impossibility to automate processes. To remedy this, most software employing the direct manipulation strategy offer the possibility to create scripts or plug-ins using a data flow oriented language.
6.1 Types of User Interface
67
Fig. 6.1: The S WEEP audio data editor.
Data Flow The data flow approach is quite different from direct manipulation. In direct manipulation, the user performs the actions required to accomplish the task one after the other, building on previous results, thereby ordering the steps in real time. In contrast, the principle of data flow takes a bird’s eye view. It requires the tasks to be laid out in advance by devising subtasks and specifying how the result of a subtask fits in as input to another subtask, thus producing a network of nodes and links. Each link is a channel wherein data “flows” from a producing subtask (source node) to a consuming subtask (sink node). The user never or seldom intervenes in the actual computation, in contrast to direct manipulation, where he initiates each task not until the previous one has completed its execution (figure 6.3). The advantage of the data flow approach is that the designer of the process has a clear overview at all times. The configuration of different nodes can be changed at will without touching any other node. This is of importance to composers who may want to change parameters very early in the process to see their effect on the resulting composition. This is difficult if not impossible using direct manipulation, since usually the steps are not closely linked and history management does not allow for arbitrary adjustments. One of the best known data flow languages is L AB V IEW, a software system for data acquisition. Real instruments are symbolically represented and their outputs can be connected to various processing nodes for the analysis
68
6 Software Components for Computational Theories
Fig. 6.2: The R OSEGARDEN MIDI sequencer with a common music notation view (bottom) and a piano-roll (called matrix view in R OSEGARDEN) representation (right).
Fig. 6.3: Direct manipulation requires the user to intervene at each stage in a linear time-directed sequence (left). In the data flow approach the user operations and the dependencies between them using a static network. The actual performance happens without intervention by the user (right).
of the acquired data. This is an example of a real-time system, which is however only feasible if the processing algorithm are fast enough to handle the data as it comes in, or if such algorithms are supported by hardware acceleration, e.g., DSP implementation of FFT. In music applications, many software synthesizers use data-flow networks to link unit generators, filters and other processing elements. In this case it is the audio data in the form of samples that flow through the network. We already mentioned O PEN M USIC in subsection 2.2.3 where networks of interconnected components are called patches. Another application that uses data flow is Pure Data (PD) [62]. In figure 6.4 an example of a PD net-
6.2 Rubato Composer: Computational Theories
69
work is presented. This example also shows a particular application of the data flow principle, commonly called Visual Programming. This term is a misnomer, since the idea is simply to place the usual structural programming elements, such as alternatives, loops and expressions onto a canvas and using links instead of (or rather, in addition to) variable assignment assignments. The usefulness of this “programming” style over traditional textual programming code is, however, doubtful. We prefer the nodes to provide high-level functionality instead of sticking together many primitive operations in a complex network to obtain functionality of medium complexity.
Fig. 6.4: A conditional statement in Pure Data. One of the print statements select1, select-2, or select-3 is executed, depending on the input value (currently 0) being 1, 2, or other.
6.2 Rubato Composer: Computational Theories R UBATO C OMPOSER follows essentially the data flow paradigm. However it takes a particular view on what constitutes a node and also integrates direct manipulation to some extent. R UBATO C OMPOSER is about bringing the previously discussed mathematical theories to the composer. Thus the term COMPOSER in the name of the software refers simultaneously to two ideas: (1) it is a tool with the primary goal of musical composition; (2) the approach used to achieve this goal is by composing theories. What does it mean to compose theories? First, a particular type of theory is meant, namely computational theories. This is not to be confused with theories of computation, an entirely different concept. A computational theory is a theory that contains a significant computational part, or, simply stated, a theory that can be implemented. In this view the data flow approach translates in R UBATO C OMPOSER to mathematical objects, represented through denotators, flowing through theories, represented by computational units,
70
6 Software Components for Computational Theories
thereby being constantly transformed. As a very simple example consider the field of number theory. Here the objects are integers. One computational unit would implement the theory of integer factorization. The input of the unit is an integer and the output the set of its factors. The output of two such factorizations may then possibly be fed into a computational unit implementing the greatest common divisor. Naturally, real computational units would generally not act on such a fine-grained level. In R UBATO C OMPOSER such a theory is implemented as a component called a rubette. Instances of rubettes provide the nodes of the networks in the data flow schema. A rubette is generally not a computational primitive such as the simple addition of numbers, but provides high-level processing, the very implementation of the computation theory. A rubette is also different from the usual realization of nodes in data flow networks, in that they may also provide direct manipulation through two interfaces presented to the user as a view, where data can be rendered in an audio-visual manner, and properties that change the behavior of the rubette. By combining the structural possibilities of the data flow approach and the advantages of intuitive direct manipulation, the user can experiment with theories, discover new relationships, and produce interesting results that would remain hidden in traditional approaches. The R UBATO C OMPOSER system is primarily a platform that attracts two types of users: the developer of theories who implements rubettes, and the designer of networks of theories who combines the theories in a meaningful way. Certainly, the two types do not describe disjoint classes. They collaborate to produce as the final result a software application based on the R UBATO C OMPOSER platform.
Chapter 7
Historical Overview
The development of the R UBATO C OMPOSER software is not an isolated event, but the latest entry in a line of musical composition and analysis systems, starting in the early 1980’s. This chapter presents an outline of the evolution of these systems and summarizes the various aims and the results that have been achieved.
7.1 presto At the Salzburger Musikgespräch 1984, Guerino Mazzola presented an early version of a computer system for musical composition called M(2, Z)\Z2 o-scope, based on principles of mathematical transformations. Figure 7.1 shows the demonstration that he gave to Herbert von Karajan. Encouraged by the impression that this system left on the people present at the demonstration, a software system called presto was developed on the Atari ST computer, at the time one of the leading platforms used for computer music composition and research, not least because it supported MIDI natively. The first version appeared in 1989, the second release was available a few years later. During the evolution of this software, Guerino Mazzola published the theoretical underpinnings in the book Geometrie der Töne [44]. The presto application features a direct manipulation style interface, where the user creates and manipulates the score directly in a main view (figure 7.2). Several modules are included that make use of geometrical operations to generate and transform musical motifs. Figure 7.3 shows the so-called OrnaMagic module. It can be used to create an ornament from a given score fragment by applying translations. The exact pattern of how these transformations are applied is indicated through
71
72
7 Historical Overview
Fig. 7.1: Guerino Mazzola demonstrating the M(2, Z)Z2 -o-scope system to Herbert von Karajan in 1984 (left). The Hardware/Software compound making up this “Urpresto” system (right).
Fig. 7.2: The presto software running on Atari ST.
a grid. A generalized version of this module has been implemented in R U BATO C OMPOSER as the Wallpaper rubette, which will be presented in detail in section 16.2.
7.2 “Classic” R UBATO
73
Fig. 7.3: The transformation grid of presto’s OrnaMagic module.
7.2 “Classic” R UBATO From 1992 onwards, Guerino Mazzola and coworkers (Oliver Zahorka, Daniel Muzzulini, and Marcel Waldvogel) developed a new software that, in contrast to presto, focused on the analysis and performance aspect of music. It allows, for example, the extraction of metrical information from a musical work, such as the piano piece “Kuriose Geschichte” from Robert Schumann’s Kinderszenen. This information is then adjusted according to given rules and the original music is rendered according to new data, resulting in a “performance” of the original uninterpreted “dead-pan” version. Joachim Stange-Elbe has applied this principles extensively to Bach’s Kunst der Fuge [66]. This “classic” R UBATO was implemented in Objective-C on the NEXTSTEP platform. The software architecture featured modules called rubettes, that served as subsystems handling each a specific task. The MetroRubette shown in figure 7.4 generates the metrical-rhythmical analysis alluded to above. There is also the HarmoRubette for harmonic analysis and the PerformanceRubette for generating performances. The classic R UBATO internally works with a fixed set of data structures, mainly corresponding to nested lists of notes, which realized a first instance of the (not yet functorial) language of denotators and forms [75]. Each ru-
74
7 Historical Overview
Fig. 7.4: The R UBATO software running on NEXTSTEP currently showing the MetroRubette.
bette can be configured using predicates to work on a certain part of the global pool of data. From 2001 to 2004, the classic R UBATO has been ported to the then new MacOS X platform, capitalizing on the fact that the application development framework on MacOS X is very similar to the NEXTSTEP framework. This work was done by the group led by Thomas Noll with Jörg Garbers as the main developer at the Technical University of Berlin. Apart from the port itself, the software has been extended in various ways. In particular, a project attempted to make R UBATO interact with other commonly used software in computer music research, such as OpenMusic or Humdrum [26]. This proved however difficult since there is no common language that would simplify communication, and therefore more or less ad hoc solutions have been devised. The new R UBATO C OMPOSER system as a successor to classic R UBATO is intended to remedy this by being completely based on the form and denotator language. Moreover, while the port to a modern platform (MacOS X) has certainly infused new life to the classic software, which would otherwise have died with the disappearing NEXTSTEP platform, a similar situation exists today, since MacOS X is the only supported platform and ports to other platforms are not planned. By using the Java Development Kit for
7.3 Experiments in Java
75
R UBATO C OMPOSER, platform independence has been achieved for all practical purposes.
7.3 Experiments in Java First steps in the implementation of a denotator based software have been made Stefan Müller, Stefan Göller and Gérard Milmeister at the Institute of Computer Science of the University of Zurich under the supervision of Guerino Mazzola. The software produced include the EspressoRubette by Stefan Müller [58] and the PrimaVista browser by Stefan Göller [30]. The EspressoRubette computes performance fields for given performances, realizing a topic of inverse performance theory [45]. The PrimaVista browser is a three-dimensional environment for audiovisual rendering and manipulating denotators (figure 7.5). It has been implemented using Java3D and allowed the use of a SpaceMouse for the navigation. A characteristic of this environment is that audio-visual objects are denotators themselves.
Fig. 7.5: PrimaVista browser showing a specific rendering of a set of denotators.
Chantal Buteau used parts of the Java framework to create the MeloTopRUBETTE [15], an improved version of the MeloRUBETTE contained in the
76
7 Historical Overview
classic R UBATO. By following the principles originating in classic R UBATO, the result is a large, monolithic piece of software, which shows an additional limitation of the original approach. The need to move to a component-based implementation led to the next step in and the current state of the development of R UBATO.
7.4 R UBATO C OMPOSER The developments in Java led to the implementation of an important subset of the theory of The Topos of Music, as far as computationally feasible. While the PrimaVista browser showed promise for certain tasks of visualizing denotators, it has been recognized that a data-flow implementation of the graphical frontend presented to the application user would have the largest benefit. The present result is the R UBATO C OMPOSER software system. In fact, its predecessors presto and classic R UBATO did not have an explicit representation of the history of manipulation. Therefore, especially in presto, it turned out to be difficult to retrace one’s steps and apply changes at an earlier stage without redoing the work. The data-flow orientation of R UBATO C OMPOSER allows fiddling with settings at any stage, since the complete diagram of work-flow is always present. The reinterpretation of the rubette principle as components allows the building of complex applications from simpler and more primitive parts, instead of the closed and rather inflexible approach used in the case of the MeloTopRUBETTE. Topos theory has been applied to various aspects of music, for example by Thomas Noll [60], who is also a member of the group maintaining the MacOS X port of classic R UBATO. Nevertheless the original R UBATO implementation does not natively support category-theoretic approaches at all, and thus does not provide an adequate environment for these theories. The fact that R UBATO C OMPOSER is based on categorical denotator theory ensures that modern mathematical principles gain a natural platform.
Part II
The Implementation
Chapter 8
Overview
This part describes the implementation of R UBATO C OMPOSER. A few words are in order about some design choices. The most important is obviously the programming environment. The predecessors presto and classic R UBATO have been targeted at very specific platforms. In particular presto was implemented in C using the framework provided by the Atari ST. Classic R UBATO was implemented in Objective C and required the framework of NEXTSTEP and, later, of MacOS X. The consequence of this commitment is that the software did not outlive the platform, i.e., Atari ST in the case of presto. In the case of classic R UBATO, the software was limited to run on the platforms that supported the OpenSTEP framework, and no attempt to port to a different platform has been made. In both cases, this fixation proved to be a hindrance to a more widespread distribution and use of the software. It is to be noted that OpenMusic, which has been implemented in a Lisp system particular to the Macintosh platform, suffers from a similar drawback, although some effort to port OpenMusic to Linux has been made. To avoid this fate, we decided to base the implementation on Java [6] running on the Java Virtual Machine [33]. Now, a lot has been said on the advantages and shortcomings of using Java, and exaggerations on both sides have been the rule rather than the exception. Most of the development of R UBATO C OMPOSER has been done on Linux using the version 1.5 of JDK (Java Development Kit) from Sun and experience has shown that running the complete system on MacOS X succeeded at first go, albeit with a few expected quirks here and there. Portability is thus mostly a non-issue. Performance is not far behind native implementation either. The remaining point is integration with the operating environment. This is probably the weakest aspect, since the Java GUI is intended to be the much the same on every platform, and thus results in some discrepancies in user interface. In the end, the graphical user interface looks a little better on MacOS X than on Linux, without however affecting the experience of the user in any way.
79
80
8 Overview
As has already been hinted at, the user interface of R UBATO C OMPOSER has been created using Java’s own toolkit, called Swing, which is extremely flexible and allows the creation of interface elements of high complexity without overburdening the developer. For the external representation of data, for example files, XML is used throughout. XML is not the panacea that hype would have had it. Ultimately it is just a standardized way of representing data in a hierarchical way, which is, to put it bluntly, rather too unimaginative to be revolutionary. The emphasis is on standardized, which means that, in particular, the JDK provides extensive support for XML, most importantly XML parsing tools. A last important part of the implementation concerns the rendering of musical data. There are many conceptually different ways to handle this. The lowest-level approach, that of generating audio data, has been postponed for the time being, since R UBATO C OMPOSER is mainly oriented to work with more abstract data such as notes and events, at least for the present. The rendering of audio is not excluded, however, and can be realized through denotators as well, as the concept of frequency modulation introduced in subsection 3.3.2 shows. A system for the rendering of audio data would be a complete subsystem in itself and could be added later on. The main concern at this time is a way to simply produce sound from the notes and scores that are generated and it has been deemed good enough to employ the MIDI framework provided by the JDK. Naturally, MIDI has severe shortcomings, such as the difficulty to control arbitrary frequencies and the limited number of channels. The following chapters offer a detailed description of the implementation of R UBATO C OMPOSER, starting with the overall architecture in chapter 9, the implementation of modules in chapter 10, forms and denotators in chapter 11, and some further tools in chapter 12. These explanations are not just meant to give an abstract view of the architecture of the software, but are an important reference, along with the automatically generated Javadoc, for the developer who wants to use R UBATO C OMPOSER as an application framework and intends to extend it. The last chapter 13 in this part deals with the graphical frontend and is treated in a more cursory way. The software is released under the General Public License (GPL). The source and binary code, as well has documentation, can be accessed at http://www.rubato.org.
Chapter 9
Architecture
9.1 Overall Structure The software architecture of R UBATO C OMPOSER ramifies into a vast network of parts and subparts of different size and complexity. In order to facilitate navigation, this chapter describes a rough division into four important parts. The layers I, II, III and IV of this division are shown in figure 9.1. I. The Java 1.5 environment forms the foundation, including subsystems such as XML, Swing and MIDI frameworks. Some extensions such as helpful user interface components, XML Reader and Writer classes and other utilities also belong to this level. The implementation of rational and complex numbers, as well as other more general utilities are also counted to this fundamental part. Some details on the code found therein are discussed in chapter 12. This basis guarantees that the architecture built on top is for the most part platform-independent. II. The second part is the lowest level of implementation that belongs properly to R UBATO C OMPOSER. In short, this concerns the mathematics on which the whole system is built on, i.e., mathematical modules, their elements and morphisms of modules. The symmetric shapes of the subparts suggest that the types (i.e., classes) of modules and of their elements are in a one-to-one correspondence, whereas module morphisms relate to both structures. The most important fact to retain here is that modules, as well as their elements and morphisms, are proper Java objects or instances. A detailed discussion follows in chapter 10. III. The core of R UBATO C OMPOSER resides in this part, namely the implementation of forms and denotators. Similarly to modules and elements, forms and denotators are instances of classes in a parallel hierarchy, consisting of implementations for each of the types Simple, Limit, Colimit,
81
82
9 Architecture
Fig. 9.1: The overall architecture of R UBATO C OMPOSER.
Power and an additional type List. Forms and, more specifically, denotators support a number of generic operations. These are provided either as methods or as separate classes. Chapter 11 covers this part in detail. IV. While the previous layers are mainly intended for programmers familiar with Java, the top layer is the application layer. It provides the graphical user interface that allows the manipulation of the objects in the R UBATO C OMPOSER universe without extensive knowledge of programming. Here the rubettes, components that package functionality from the layers below, are introduced. These components, in their graphical appearance, are placed into networks and linked together via input and output connectors. Some aspects of the implementation are discussed in chapter 13. The use of the interface itself is presented in the manual in chapter 20.
9.2 The R UBATO C OMPOSER Universe
83
9.2 The R UBATO C OMPOSER Universe To help understand what types of objects exist in the universe of R UBATO C OMPOSER, an example state is examined on the basis of figure 9.2. The first fact to remark is the presence of the five most important object types, namely forms (not named in a meaningful way in this context, but numbered from 1 to 5), denotators, modules, module elements, and module morphisms. This picture must not be confused with a class diagram. In fact, all of the little boxes represent objects (i.e., instance of classes), and the arrows denote the relationship of object reference (i.e., a “has-a” relationship in object oriented terminology). This figure requires a few explanations. To begin with, we concentrate on the upper right subdiagram. In general an arrow from a box A to a box B means, that the object A contains a reference to the object B. Endowed with the theoretical knowledge presented in earlier chapters of this text, it follows that Form 1 references Form 2 and Form 3, which in this case might mean that Form 1 is of type Limit, and Form 2 and Form 3 its coordinate forms. Similarly Form 2 and Form 3 contain Module 2 and Module 3, respectively, which indicates that both forms are of type Simple with the corresponding modules as the codomains. The denotator Denotator 1 is obviously of form Form 1, and references two coordinate denotators that have forms Form 2 and Form 3, respectively, as required by Form 1. Both denotators at the leaves contain a module element, each referencing their module. This modules are, of course, enforced by the respective forms. The lower subdiagram shows a similar situation. Here we find a form Form 4 of type Simple with module Module 1. Two denotators Denotator 4 and Denotator 5 have this form, the second one containing an element belonging to the required module. Denotator 4 shows a new feature, namely that of being non-null addressed. The address is Module 4, indicated by the arrow from Denotator 4. The denotator no longer contains just a module element, but a full-fledged morphism with domain Module 4 and codomain Module 1. Both are of course determined by the address on the one hand, and the module in Form 4 on the other hand. In class oriented programming languages, types and classes are defined and fixed at runtime. This is clearly not the case in R UBATO C OMPOSER, since forms (which are akin to classes and types) are objects just as denotators (playing the roles of instances). This means that forms can be dynamically created, and potentially deleted, or modified, respectively duplicated. Also the problem of identity must be taken care of. The dynamical creation of forms is a feature of R UBATO C OMPOSER, however the cases of deletion and modification raise serious questions of consistency. What happens when a form is modified, but is referenced by some denotators? This would lead to a complete mess and is therefore strictly forbidden. Problems of deletions
84
9 Architecture
Fig. 9.2: The universe of R UBATO C OMPOSER.
do not exist, since there is not explicit way of destroying objects. Duplication and identity, however, demands more attention. When are two forms identical? One way of defining equality of forms is structural equality: Two forms are equal if they are of the same type and their components are equal in turn. The first objection to this scheme is the computational cost, the second problem comes from the possibility of recursive forms, which requires additional complexity to handle equality correctly.
9.3 Java Packages
85
To avoid inconsistencies and to keep complexity low, the following scheme has been adopted: Each form must be given a name. This is in contrast to denotators which can be anonymous. No two different forms can have the same name. Two forms are equal if they are the same object (pointer equality in terms of implementation). To help observe these rules a system repository has been introduced. This is a unique, global object which acts as dictionary that maps names to objects, and, as a special case, to forms. Each form must be registered with the repository, no two forms can be registered under the same name, and existing forms are retrieved by looking up in the repository. In figure 9.2 the repository object is shown on the left side. Every form is referenced by the repository. In addition, named denotators may be also registered, as well as modules, elements and morphisms. For these types of objects the restrictions specific to forms do not apply.
9.3 Java Packages The following chapters are going to be concerned with details of the implementation in Java. Java promotes the division of functional parts of a software system into packages, each package containing a set of classes that are intimately related. The package structure of R UBATO C OMPOSER reflects key points of the overall architecture, and it should therefore be easy to relate the packages to the parts discussed above. The most important packages are presented here in advance to give a synopsis of things to come. The first set of packages implements the content of part I: org.rubato.audio.midi Transforming MIDI to and from denotators and playing, MIDI file I/O org.rubato.scheme Embedded Scheme interpreter org.rubato.util Text processing and other utilities org.rubato.xml XML parsing and writing org.rubato.math.arith Complex and rational arithmetic org.rubato.math.matrix Matrixes over common rings
This set of packages deals with part II: org.rubato.math.module Modules and elements thereof org.rubato.math.module.morphism Module morphisms
The third part and some of the fourth part are covered by these packages: org.rubato.math.yoneda Forms and denotators org.rubato.logeo General operations on forms and denotators org.rubato.base Repository class and interface for rubettes
86
9 Architecture
The hierarchies starting at the following packages contain everything relating to part IV: org.rubato.composer The graphical user interface of R UBATO C OMPOSER org.rubato.rubettes A collection of built-in rubettes
Chapter 10
Modules and Morphisms
We now come to the foundations of R UBATO C OMPOSER: Modules are to the architecture of R UBATO, as primitive types are to programming languages.
10.1 Modules and their Elements It has already been said that modules are instances just like their elements. Therefore each module, be it the module of integer numbers, or a free module over the ring of real numbers, is an object in the universe of R UBATO C OMPOSER. In the case of the number rings, there is one unique instance for each ring, except for the modular integers Zn , where there is a unique instance for each n. The types of modules are arranged in a hierarchy. Since each module contains its corresponding elements, there is an exactly parallel hierarchy of module elements. As example, consider the ring of integers: This module is the unique instance of the class ZRing, and the elements are instances of the class ZElement.
10.1.1 The Module Interface Before embarking on the explanation of Module and ModuleElement Java interfaces, it is useful to discuss the question: What makes a module and what makes a module element? The first question gives rise to several aspects: First, what types of modules are there? Second, what can be done with modules?
87
88
10 Modules and Morphisms
The determination of module types directly results in the hierarchy already mentioned: On top, there is the abstract description of a module, then there are the rings and the free modules over rings. These are the main and the best known types, but there are other more specialized ones. The Module hierarchy determines the hierarchy of ModuleElement. The second aspect is about the operations on modules and module elements. These are essentially the properties of mathematical modules, e.g., the determination of the zero element, the construction of the identity morphism, the retrieval of the factor ring, and, of course, the element-of relation between a module and a module element. This entails the interface which is presented in figure 10.1. public interface Module { public ModuleElement getZero(); public boolean hasElement(ModuleElement element); public ModuleElement cast(ModuleElement element); public ModuleElement createElement(List<ModuleElement> els); public ModuleElement parseString(String string); public public public public public public
Ring getRing(); int getDimension(); Module getComponentModule(int i); Module getNullModule(); boolean isNullModule(); boolean isRing();
public ModuleMorphism getIdentityMorphism(); public ModuleMorphism getTranslation(ModuleElement element); public boolean equals(Object object); public int compareTo(Module object); } Fig. 10.1: The Java interface of Module. Only the essential parts are shown.
A few comments about specific methods are in order. Note that the abstract types ModuleElement and ModuleMorphism are used, as well as Ring which is a subtype of Module. The methods getZero, hasElement and getRing behave as expected. getIdentityMorphism, getTranslation: These two methods create module morphisms that are available for every module. The argument of getTranslation must of course be an element of the module in question.
10.1 Modules and their Elements
89
getDimension, getComponent: These methods return the dimension of the module, and the specified component module of the module, respectively. These have different meanings for free modules and for direct sums. The dimension of the direct sum is the number of components. isRing, isNullModule, getNullModule: The methods isRing and isNullModule test whether the module is a ring and whether it is of dimension zero, respectively. The method getNullModule returns a module with the same coefficient ring but with dimension zero. cast: It is often necessary to convert a module element to an element in another module. This method tries to do just that. This is not always possible, however, and in such a case, it simply returns null. createElement: Often an element of a module can be constructed from several elements in a module or a ring, e.g., the coordinates of an element in a free module are ring elements. This method can be used for such constructions. equals, compareTo: The equality of modules is given by mathematical equality, whereas the order implied by compareTo is somewhat arbitrary. The following can be relied on. The number rings are ordered as Z2 < Z3 < · · · Z < Q < R < C. Free modules are first ordered by their rings, and within free modules over the same ring, the order is on the dimension. The FreeModule interface inherits directly from the Module interface and has two extra methods: isVectorspace is true whenever the factor ring is a field and getUnitElement(int i) returns the unit vector with 1 at coordinate i. The abstract class Ring adds three more methods that are specific to rings. It is shown in figure 10.2.
public abstract class Ring implements FreeModule { public abstract RingElement getOne(); public abstract boolean isField(); public abstract FreeModule getFreeModule(int dimension); }
Fig. 10.2: The Java abstract class Ring.
The getOne method returns the unit of the ring. This is specific to a ring, since the concept of a unit element does not exists for modules. The property isField is true whenever every non-zero element is invertible. Each
90
10 Modules and Morphisms
ring gives rise to a free module over this ring and getFreeModule creates such a module with the given dimension. These are the main components in the module hierarchy. The figure 10.3 presents the full hierarchy of all implemented types of modules. Since each ring gives rise to free modules of arbitrary dimensions over this ring, the corresponding free modules are omitted lest the picture is overcrowded. Note, that every ring is a free module of dimension 1. To distinguish between a ring and free modules over the same ring of dimension different from 1, there is an intermediate class ProperFreeModule from which all proper free modules inherits. Thus the free module over R of dimension 1 is the unique instance of RRing, whereas all other free modules over R are instances of RProperFreeModule. In order to avoid the existence of an instance of RProperFreeModule of dimension 1, instances can only be created using a static make method, which, provided with the dimension, returns the right instance. Thus RProperFreeModule.make(1) returns the unique instance of RRing, whereas RProperFreeModule.make(3) creates an instance of RProperFreeModule. The same principle applies to all other subclasses of ProperFreeModule.
Fig. 10.3: The module hierarchy. An arrow from A to B indicates that A is a subclass of (or an implementation of the interface) B. The lighter gray boxes with dashed borders represent interfaces in the Java sense.
10.1 Modules and their Elements
91
Types of Modules The following list summarizes all implemented types of modules. For each ring there are of course the corresponding free modules. ZnRing, ZRing, QRing, RRing, CRing: The rings of modular integers, integers, rational, real and complex numbers. ZnStringRing, ZStringRing, QStringRing, RStringRing, CStringRing: The ring of strings with UNICODE words and coefficient rings Zn , Z, Q, R and C, respectively. ProductRing: Product rings with at least two factor rings. The factors are arbitrary rings, including product rings in turn. PolynomialRing: Polynomials in one indeterminate with coefficients in an arbitrary ring. A polynomial ring is constructed using the static method PolynomialRing make(Ring coefficientRing, String indet). Multivariate polynomial rings can be built by using a polynomial ring as coefficientRing and a different variable as indeterminate. However, no further support for dealing with multivariate polynomials is currently provided. DirectSumModule: This class implements direct sums of modules over the same ring. As an example, Z3 ⊕ Z5 is created with DirectSumModule.make(Z3,Z5) where the variable Z3 contains a module created by ZProperFreeModule.make(3) and the variable Z5 contains a module created by ZProperFreeModule.make(5). RestrictedModule: A module over a ring R with ring change S → R. This class effectively wraps an R-module in an S-module. Thus the ring of reals R can be put into a RestrictedModule using the embedding Z → R and is henceforth regarded as a Z-module.
10.1.2 The ModuleElement Interface The same questions as for modules can be asked of module elements: What types of module elements are there? This is trivially answered since the types of module elements mirror the types of modules, and these we have already treated. The hierarchy rooted at ModuleElement is exactly analogous to the module hierarchy, therefore it is not repeated here.
92
10 Modules and Morphisms
The second aspect is more interesting: What operations do elements of modules admit? Certainly, the operations that belong to the theory of modules are included here: scaling by a ring element and addition. For module elements that are effectively elements of a ring, the specific ring operations are adjoined.
public interface ModuleElement { public boolean isZero(); public public public public public public public public
int getLength(); ModuleElement getComponent(int i); Module getModule(); ModuleElement cast(Module module);
public boolean equals(Object object); public int compareTo(ModuleElement object); } Fig. 10.4: The Java interface of ModuleElement. Only the essential parts are shown.
The isZero method is clear. The module operations come in pairs. Scaling by a ring element is performed using scaled and scale. The first returns a new module element scaled by the ring element, the second scales the module element in-place. Therefore the second operation is destructive and potentially dangerous because of aliasing. In general the destructive operation should only be used for performance reasons and if one is certain about the lifetime of the objects. The same comment applies to addition (sum, resp. add), subtraction (difference, resp. subtract) and negation (negated, resp. negate). The ring element argument of the scaling methods must of course belong to the coefficient ring of the module element. Here and in all other similar cases a RubatoException (package org.rubato.base) instance or an instance of a subclass thereof is thrown if the expected arguments and the actual arguments to a method are incompatible.
10.1 Modules and their Elements
93
The getLength method returns the number of components. In the case of a vector, it is simply the length of the vector. The i-th component can be retrieved using getComponent(int i). The result of getModule is the module that this element belongs to. The cast method is analogous to the homonymous method of Module with roles interchanged, i.e., it tries to convert the module element to the module provided as argument. Equality again follows from mathematical equality. The order induced by compareTo is defined as follows: If elements from different modules are compared, their order is the same as the order of the modules they are elements of. Otherwise the order depends on the particular module. In an ordered ring, for example, it is simply the natural order between ring elements. In other cases, for example complex numbers and elements of free modules, a lexicographical ordering is used. Just as in the case of the class Ring, the corresponding element class RingElement provides additional methods, shown in figure 10.5. public abstract class RingElement implements FreeElement { public abstract boolean isOne(); public abstract RingElement product(RingElement element); public abstract void multiply(RingElement element) public abstract boolean isInvertible(); public abstract RingElement inverse(); public abstract void invert(); public RingElement power(int n); }
Fig. 10.5: The abstract class RingElement.
The operations specific to ring elements are multiplication and inverse (whose result is only if an inverse actually exists). Similarly to the module operations, both are available as non-destructive (product and inverse) and destructive (multiply and invert) versions. The power method returns rn for the ring element, where n is a non-negative integer. The property isOne checks whether the ring element is the unit and isInvertible is true iff the element in question is invertible. Since ring elements can be regarded as elements from the one-dimensional free module over the ring, the abstract class RingElement inherits from the interface FreeElement. This interface is also implemented by the various proper free element class such as ZProperFreeElement. It includes methods for accessing the component elements (using getRingElement(int i)
94
10 Modules and Morphisms
or an iterator, so that it can be put in a for-each loop) and methods for the component-wise multiplication (productCW is the non-destructive version and multiplyCW is the destructive version).
Example: Z-Modules To make the relationship between the different classes clearer, consider the example of the ring of integers. Figure 10.6 shows the complete hierarchies containing ZRing and ZElement. The correspondence between the two hierarchies is evident. The (abstract) class NumberRing is just a marker to distinguish the rings Z, Q, R, and C, which are related by embedding homomorphisms, from the other rings.
Fig. 10.6: The complete hierarchy of Module and ModuleElement restricted to the part concerning the ring of integers Z.
One could ask why there is no Field class or interface that marks the property of being a field in the hierarchy, similarly to NumberRing. Remember that the classes in the hierarchy describe types of rings (resp. modules), and are not the rings themselves. As an example, take ZnRing, whose instances are modular integer rings, and only fields when n is a prime. Hence, a Field superclass could not discriminate between field instances and non-field instances of ZnRing. Therefore a property isField has been introduced instead. In the case of fields such as R (the unique instance of RRing) or C (the unique instance of CRing) this method returns the constant true, but in the case of Zn (instances of ZnRing) it checks whether n is prime.
10.2 Module Morphisms
95
10.2 Module Morphisms Now that we have discussed the implementation of the objects of the category of modules, we proceed to the morphisms of the category. The implementation of morphisms determines which particular category it will be. Of important theoretical interest are the diaffine (called simply affine, from now on) morphisms, and these are going to receive the most attention. However, the category of modules may be regarded as a subcategory of the category of sets, in which case the morphisms will be set mappings. These are necessary since many important functions are not affine. In R U BATO C OMPOSER the morphisms are set mappings in the first place, and an attribute indicates whether the morphism is affine or not.
10.2.1 The ModuleMorphism Interface Morphisms are instances of classes that are embedded in a hierarchy. The interface (which is actually an abstract class) of the root of hierarchy, ModuleMorphism, is shown in figure 10.7. The classes in the hierarchy stand for specific types of morphisms, for example, affine morphisms on free R-modules, which are specified by a real matrix and a real translation vector, etc. Instances of these classes can be composed, added, and scaled to create new morphisms, provided their domains and codomains are compatible. Since the component morphisms are well known, it is possible to infer the properties of the resulting morphisms, i.e., whether they are linear, affine, etc. Using the provided types, a large number of mappings can be created, and further types can be added to the hierarchy in later developments of R UBATO C OMPOSER. All module morphism must inherit from ModuleMorphism and suitably override its methods. Some methods have default implementations, such as sum, which creates a generic SumMorphism instance based on the components. ModuleMorphism(Module domain, Module codomain): The constructor must be called in the constructor of every subclass in order to set the domain and codomain of the morphism. This implements the actual formula of the morphism. It does of course work only if its argument is an element from the domain, otherwise an exception is thrown. compose, sum, difference, scaled: These are the types of function composition that are available on all module morphisms. The domains and codomains must be compatimap:
96
10 Modules and Morphisms
public abstract class ModuleMorphism { public ModuleMorphism(Module domain, Module codomain); public ModuleElement map(ModuleElement x); public public public public public
public int compareTo(Object object); public boolean equals(Object object); public static ModuleMorphism getIdentityMorphism(Module module); public static ModuleMorphism getConstantMorphism(Module module, ModuleElement value); } Fig. 10.7: The abstract class ModuleMorphism. Only the essential parts are shown.
ble, and, in the case of scaled, the argument must be in the ring of the codomain. power: The power morphism only gives a result when invoked on a morphism f where the domain is equal to its codomain. For an integer n ≥ 0 it returns f n . isModuleHomoMorphism: This property is true whenever the morphism is affine. It characterizes the category of modules and affine morphisms.
10.2 Module Morphisms
97
isRingMorphism, isRingHomoMorphism: The property isRingMorphism is true iff the domain and codomain are rings, whereas isRingHomoMorphism additionally checks if the morphism is a ring homomorphism and effectively characterizes the category of rings and ring homomorphisms. isIdentity, isConstant, isLinear: These properties determine whether the morphism is the identity, or a constant, or a linear morphism, respectively. getDomain, getCodomain: These methods return the domain and the codomain, respectively, of the morphism. getRingMorphism: Since a diaffine (or any other) module morphism possibly features a change of ring, for example for a morphism Z3 → R2 the change of ring from Z to R may be the simple embedding of Z in R, this method returns the ring change morphism. Often, it is just an identity morphism. inDomain: Before applying map to a ModuleElement object this method can be used to check if the given element is in the domain of the morphism. getIdentityMorphism, getConstantMorphism: These static methods create identity and constant morphisms with the specified module as domain. The codomain of an identity is of course the same as its domain. In the case of the constant morphism the supplied constant also determines the codomain.
Types of Module Morphisms The hierarchy rooted at ModuleMorphism is much simpler than the hierarchies rooted at Module and ModuleElement. It is essentially flat, with a collection of classes inheriting directly from ModuleMorphism and representing different types of morphisms. A presentation of some of the more interesting types of morphism now follows. IdentityMorphism: The identity exists for every domain and codomain M, such that Id : M → M, and Id(x) = x. Of course, identities can arise from types of morphisms different from IdentityMorphism. For example an affine morphism with unit matrix and zero vector is an identity morphism. To check for an identity, the method isIdentity should be used. ConjugationMorphism: Specific to complex numbers, conjugation maps x + iy to x − iy. This morphism extends conjugation to complex vectors. Conjugation is not
98
10 Modules and Morphisms
a module homomorphism. It is a ring homomorphism if the domain (and codomain) is the ring of complex numbers. ModuloMorphism: The modulo operation is the map Z → Zn : x 7→ x mod n. As implemented here, it extends to Zm → Zm n. EmbeddingMorphism: An embedding map is an important way of moving from one module to another. Embeddings require that no information gets lost during the translation. Therefore we have the standard embeddings of rings Z ⊂ Q ⊂ R ⊂ C, which extend to the string rings R and the polynomial rings R[X]. The embedding Sn into Rm is also possible provided that n < m and S can be embedded in R. When m is strictly larger than n, the last m − n components are filled with zeros from the codomain ring. A product ring R1 ×· · · Rm can be embedded in a product ring S1 ×· · · Sn whenever m = n and Ri can be embedded in Si . Whether the resulting embedding is a homomorphism or not depends on the mapping: For example, we allow the embedding of the rings Zn in the other number rings. However, these embeddings will no longer be a homomorphisms. CanonicalMorphism: A canonical morphism from M to N is the simplest morphism that takes an element from M to N. If M = N, this is of course the identity. If M can be embedded in N, it is an embedding as described above. Other types include ModuloMorphism, and the trivial morphisms R0 → M and M → R0 . In general, canonical morphisms may be devoid of any interesting properties, such as being a ring or module homomorphism. The direction from a “larger” ring such as as R to a “smaller” ring such as Z is also supported. This is akin to casting in programming languages and similarly discards important information. As long as a canonical morphism f from a ring R to a ring S (for example an embedding or even a cast) can be produced, a canonical morphism from Rm to Sn can also be created. If m = n, the components are transferred using f. If m < n the remaining n − m components of the element in the codomain Sn are set to the zero of Sn , and if m > n, the m − n supernumerary components of the element in the domain Rm are discarded. A canonical morphism can be used as a last resort to move from one space to another with a minimum loss of information. GenericBasisMorphism: A morphism f from a free module Rm to a module M can be defined in the following way: We consider the canonical basis vectors of Rm
10.2 Module Morphisms
99
e0 = (0, 0, 0, . . . 0) e1 = (0, 1, 0, . . . 0) e2 = (0, 0, 1, . . . 0) ··· em = (0, 1, 0, . . . 1) where the zero vector e0 is included, because we deal with affine mappings, and therefore the value of f at e0 is generally not 0M . For each ei , the value f(ei ) = fi in M is given. The fi , where 0 ≤ i ≤ m, are the parameters that determine an instance of GenericBasisMorphism. The value f(x) of an arbitrary element x = (x1 , . . . xm ) in Rm is now calculated as: m X f(x) = f0 + (fi − f0 )xi . i=1
PolynomialMorphism: The evaluation of a polynomial at an element from the coefficient ring of the polynomial is generally not a homomorphism, but provides a very important class of functions. The polynomial to evaluate is an instance of PolynomialElement introduced in the previous section. ProjectionMorphism: The projection f from a product ring R1 × · · · Rn to its i-th component Ri is the extraction of the i-th component of an n-tuple: f((x1 , x2 , . . . , xn )) = xi . ReorderMorphism: Given a product ring R1 × · · · Rn and a list of indexes i1 , . . . im , a reordering morphism f takes an element (n-tuple) from the domain R1 × R2 × · · · Rn and reorders its components according to the indexes ij to result in an m-tuple in the codomain Ri1 × Ri2 × · · · Rim : f((x1 , x2 , . . . , xn )) = (xi1 , xi2 , . . . xim ). A special case of an application of this type of morphism is the expansion of a ring R to the product ring Rn . SplitMorphism: A free module Rn can be regarded P as the direct sum of several free modules Ri1 ⊕ Ri2 ⊕ · · · Rim where m j=1 ij = n.
Now, a morphism from Rn → Rn can be defined as a “direct sum” of morphisms fi1 ⊕ fi2 ⊕ · · · fim : Ri1 ⊕ Ri2 ⊕ · · · Rim → Ri1 ⊕ Ri2 ⊕ · · · Rim ,
100
10 Modules and Morphisms
where we provide all fij : Rij → Rij to create an instance of SplitMorphism. ConstantMorphism: A constant morphism f : M → N maps every element from the domain to a fixed value in the codomain. For a given constant c ∈ N, f(x) = c. To check for a constant morphism, the method isConstant should be used. TranslationMorphism: A translation morphism f : M → M adds a constant to an element from the domain. For a given constant c ∈ M, f(x) = x + c. ProductMorphism: Given two morphisms f : M → R and g : M → R where the codomain R is a ring, a product morphism h : M → R can be created such that h(x) = g(x) · h(X).
Deriving Module Morphisms from Existing Ones A new morphism can be derived from one or more existing ones by calling one of the methods sum, difference, scaled, product, and compose. These methods try to make simplifications and reductions. As an example, consider the sum of two affine morphisms f and g of the form R2 → R2 . Since these morphisms are instances of RFreeAffineMorphism and each is characterized by a matrix and a vector, the call f.sum(g) will also return an instance of RFreeAffineMorphism defined by the sum of matrixes and the sum of vectors. Such a simplification is not always possible, however, and in that case an instance of SumMorphism is returned by the sum method. The same principle applies to the other methods for combining morphisms by creating instances of the general types DifferenceMorphism, PowerMorphism, ProductMorphism, ScaledMorphism, and CompositionMorphism.
Example: Rn → Rm morphisms Common types of morphisms are those of the form Rn → Rn , where R is a number ring, especially affine morphisms. In this case, the hierarchy is a little deeper and provides abstract classes of the form RAbstractMorphism and RFreeAbstractMorphism, where R ∈ {Zn,Z,Q,R,C}. These abstract classes already implement everything that is needed, the only abstract method is mapValue, which must provide the code and must be implemented by subclasses. Additionally, properties can be set and other methods can be overridden for reasons of efficiency. As an example we take R to be the real number ring R and look at RFreeAffineMorphism and RAffineMorphism, which extend RFreeAbstract-
10.2 Module Morphisms
101
Morphism and RAbstractMorphism, respectively. The class RAffineMorphism is shown in figure 10.8 and figure 10.9.
public final class RAffineMorphism extends RAbstractMorphism { public RAffineMorphism(double a, double b) { super(); this.a = a; this.b = b; } public double mapValue(double x) { return a*x+b; } public boolean isModuleHomomorphism() { return true; } public boolean isRingHomomorphism() { return (b == 0) && (a == 1 || a == 0); } public boolean isLinear() { return b == 0; } public boolean isIdentity() { return (a == 1) && (b == 0); } public boolean isConstant() { return a == 0; } . . . continued in figure 10.9 . . . } Fig. 10.8: The class RAffineMorphism. Only the more interesting parts are shown. Continued in figure 10.9 on page 102.
An instance of RAffineMorphism is specified by two real numbers a and b, the parameters determining the affine map x 7→ ax + b, which is implemented in mapValue. The properties in figure 10.8 return a result depending on the values of a and b. For example, the morphism is linear whenever b (the translation part) is 0. The compose method in figure 10.9 is specialized in that it first checks if the second component of the composition is also of type RAffineMorphism. If so, then a new instance of this type can be created by combining the param-
102
10 Modules and Morphisms
public final class RAffineMorphism extends RAbstractMorphism { . . . continued from figure 10.8 . . . public ModuleMorphism compose(ModuleMorphism morphism) throws CompositionException { if (morphism instanceof RAffineMorphism) { RAffineMorphism rmorphism = (RAffineMorphism)morphism; return new RAffineMorphism(a*rmorphism.a,a*rmorphism.b+b); } else { return super.compose(morphism); } } public ModuleMorphism sum(ModuleMorphism morphism) throws CompositionException { ··· } public ModuleMorphism difference(ModuleMorphism morphism) throws CompositionException { ··· } public boolean equals(Object object) { if (object instanceof RAffineMorphism) { RAffineMorphism morphism = (RAffineMorphism)object; return a == morphism.a && b == morphism.b; } else { return false; } } public double getA() { return a; } public double getB() { return b; } private double a; private double b; } Fig. 10.9: The class RAffineMorphism. Only the more interesting parts are shown. Continued from figure 10.8.
eters a and b of both affine morphisms. Otherwise, the superclass version is called which takes care of the generic composition. The sum and difference implementations are analogous. Equality testing consists in checking
10.2 Module Morphisms
103
for equality of the parameters. Note that we cannot test for mathematical equality of functions, i.e., two ModuleMorphism instances may compute the same function, but are of different types, hence not equal. The case of RFreeAffineMorphism is analogous. Instead of the real numbers a and b, a real matrix of type RMatrix and a vector of type double[] are required, from which the domain and codomain dimensions are inferred. One subtle point here is the use of a virtual constructor (or factory method): The constructor of RFreeAffineMorphism is not directly invoked, but through a static method public static ModuleMorphism make(RMatrix A, double[] b); In the case where A is a (1×1)-matrix and b is a 1-vector, an instance of RAffineMorphism is created instead of the more general RFreeAffineMorphism. This principle is often used throughout the system. It allows the selection of the most adequate implementation based on the arguments. This flexibility would not possible with ordinary constructors. The foregoing discussion applies with the corresponding changes to the case where R is Zn, Z, Q, or C, instead of R.
Chapter 11
Forms and Denotators
The implementation of forms and denotators constitute the center of R U BATO C OMPOSER. Together they provide the data model of the system. In general, all data that is passed on in R UBATO exist as denotators. Within a subsystem, pieces of data represented by denotators may certainly be transformed to any one of the many forms of representation devised by computer scientists and particularly suited for the task at hand. But whenever transformed or generated data leaves a subsystem, it is as a denotator. In this way, a common basis for accessing and distributing information is guaranteed. This has also been one of the objectives of XML. However, the superior mathematical foundation of forms and denotators as functorial concepts qualifies them much more for the tasks occurring mathematical music science, which deals with high-dimensional spaces. For the following discussion it is useful to recall and quickly summarize the idea behind forms and denotators, which has been extensively discussed in chapter 5: Forms are highly structured mathematical spaces, whereas denotators are objects in forms, i.e., points in such mathematical spaces. An important qualification concerns the notion of point, which is replaced by the notion of general addressed point, a concept which follows from so-called functoriality.
11.1 Requirements A glance at the discussion in chapter 5 suggests a few elements that will be needed for the implementation of forms and denotators. The several types Simple, Limit, Colimit, Power, and an additional type List, call for a five-fold branching of classes from a common superclass. The same applies to denotators. The additional type List is a concession to the convenience of list data types in computer programming. In terms of com105
106
11 Forms and Denotators
puter engineering, lists are more basic than the sets of type Power. Lists do not add anything fundamentally new: one way of expressing a list is as a set of pairs, with first coordinate an order number and second coordinate the list element, with the provision that there are no duplicate order numbers. Nevertheless, it would be an unseemly omission not to treat lists in a concrete as opposed to an abstract environment. Forms of the types Limit and Colimit require diagrams to describe how their factor and cofactor forms, respectively, are connected through morphisms. These diagrams are implemented through the class FormDiagram. However, the handling of the constraints implied by Limit form diagrams and the equivalence relation implied by Colimit form diagrams is not yet implemented. This is the subject of future work and requires extensive research in order to develop the exact semantics for a correct realization. Therefore Limit and Colimit types are currently used in their simpler guise as products and coproducts. Addresses, on the other hand, are fully implemented. Addresses arise from the module morphisms embedded in the denotators of type Simple. The domain of the morphism determines the address of the Simple denotator. The common address of the coordinate denotators determines the address of denotators of type Limit, Colimit, Power, and List. Notwithstanding the circularity involved through the name of a form being a denotator itself, the discussion starts with the implementation of forms. It is therefore necessary to mention a few details that are only fully discussed later on. As we shall see, names are required to have a determined form NameForm of type List, which is handled specially. We have opted for names to be restricted to this form instead of allowing arbitrary forms as intended by the theory. The reason for this choice is to avoid any issue with circularity. In the future, the use of general name forms may be made possible. It is a detail of implementation and should not affect the developer following the API. If not stated to the contrary, all the classes mentioned in this chapter reside in the package org.rubato.math.yoneda. In the following the word factor is often used interchangeably with the word coordinate for historical reasons.
11.2 Forms The current implementation of forms is not yet fully realized with respect to the complete theory as set forth in [47]. However the supporting infrastructure is present and provides the hooks and interfaces necessary to fit in the missing pieces. In order not to confuse the understanding, the discussion will proceed assuming a simpler version of the actual implementation.
11.2 Forms
107
Not all existing methods are mentioned, only a subset large enough to give a self-sufficient overview. As always, the ultimate reference is the Javadoc generated from the sources, available in electronic form.
11.2.1 Form Class The five types of forms are implemented as five subclasses SimpleForm, LimitForm, ColimitForm, ListForm, and PowerForm, of the abstract class Form. Additionally the class NameForm derives from ListForm to provide the special form for name denotators. It has itself the name Name and can be treated as a form of type List. The common interface for the form classes is shown in figure 11.1 and a simplified UML diagram in figure 11.2.