Release documentation for E.B.B.A.

Contents
1. What EBBA is
   Facial expressions
   Speech synthesizer
   IQ-test solver
2. Background
3. How it's done
   Technical perspective
   Why is HTTP used with the AIML parser but TCP/IP with the facial expression system?
   The user's perspective
4. Installation notes and system requirements
   Requirements
5. Version history
   0.1 (2004-01-26) Initial release
   0.2 (2004-03-17)
   0.3 (2004-04-05)
1. What EBBA is

EBBA is a project aiming to develop an advanced chatbot by combining AIML, facial expressions, a speech synthesizer and commonsense databases. Artificial Intelligence Markup Language (AIML), created by Richard S. Wallace, is an XML-based (Extensible Markup Language) language created solely for the purpose of making chatbots. The development of AIML is based on the fact that the distribution of sentences in conversations tends to follow Zipf's law. While the possibilities of what can be said are infinite, the range of what is actually said in conversation is surprisingly small. In fact, 6000 patterns cover 95% of inputs. AIML is designed to easily match these core patterns and specify responses to them.
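For illustration, an AIML category pairs an input pattern with a template response. The example below is a generic one, not taken from EBBA's actual AIML set:

```xml
<category>
  <pattern>HELLO EBBA</pattern>
  <template>Hello! Nice to meet you.</template>
</category>
```

Covering the most common conversational patterns with categories like this is what the Zipf's-law observation above makes practical.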
Facial expressions

The facial expression system is based on a project called Expression, which uses a muscle-based facial animation system. It has not been developed for the past one and a half years and the documentation is quite sparse; despite that, I found it to be by far the most advanced open-source facial animation system.
Speech synthesizer

SAPI5 (Speech Application Programming Interface) is used to create EBBA's voice. SAPI5 is a standard set by Microsoft that enables you to create a voice-enabled program without requiring a specific speech synthesizer. Another advantage of SAPI5 is that it not only creates phonemes from the text, it also calculates which viseme each phoneme represents. A viseme is a term defined by Disney which describes how the lips should move while saying a specific phoneme. The current viseme is then forwarded to the facial animation system and converted into the appropriate lip movements.
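A minimal sketch of that last step: SAPI5 reports one of 22 viseme IDs (SP_VISEME_0 through SP_VISEME_21) alongside each phoneme, and the animation side only needs to map the ID to a mouth pose. The pose names below are my own illustrations, not Expression's actual data; only viseme 0 (silence) is fixed by the SAPI5 specification.

```python
# Map SAPI5 viseme IDs to mouth poses for the facial animation system.
# Pose names here are illustrative placeholders, not Expression's real data.
MOUTH_POSES = {
    0: "closed/neutral",   # SP_VISEME_0 is silence in SAPI5
    7: "rounded",          # illustrative pose name
    21: "lips pressed",    # illustrative pose name
}

def pose_for_viseme(viseme_id):
    """Return the mouth pose for a SAPI5 viseme ID, defaulting to neutral."""
    if not 0 <= viseme_id <= 21:
        raise ValueError("SAPI5 defines visemes 0-21")
    return MOUTH_POSES.get(viseme_id, "closed/neutral")

print(pose_for_viseme(0))  # closed/neutral
```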
IQ-test solver

EBBA can solve numeric IQ-tests of the form "If a1, a2, ..., ak are the first numbers, what number comes next?". A common and relatively simple question of this type is 2, 4, 8, ?. Most humans would answer 16, and you can prove the correctness of this answer with the formula f(x) = 2^x. The formula is applicable to the given numbers and the answer (f(1)=2, f(2)=4, f(3)=8, f(4)=16), so the answer 16 is a logical solution. More generally, if f(1)=a1, f(2)=a2, f(3)=a3, ..., f(k)=ak, then a correct answer is f(k+1). EBBA is able to create such an f(x) for any integer number series. She creates the formula according to the following algorithm (I am using the numbers 31, 117, 3, 98 as example numbers, but the principle is the same for all numbers):
1. First she pads each number with leading zeros so that all numbers have the same number of digits. In the example this becomes 031, 117, 003, 098.
2. Then she concatenates the numbers. The example would then become 031117003098.
3. Then she sets the variables used in the formula:
   a = the digit length of each number (3 in the example)
   b = the total digit length of the concatenated number (12 in the example)
   k = the concatenated number (031117003098 in the example)
The formula f(x) then looks like this:
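The formula itself did not survive in this copy of the document; the following is a reconstruction from the definitions of a, b and k above, consistent with the example values f(1)=31, f(2)=117, f(3)=3, f(4)=98 given below:

```latex
f(x) = \left\lfloor \frac{k}{10^{\,b - a x}} \right\rfloor \bmod 10^{a}
```

In words, f(x) reads out the x-th block of a digits from the concatenated number k.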
NOTE: The formula above does not handle negative and floating-point values; support for those was implemented in release 0.3. I will update the docs with the new formula when I get the time.
The example formula returns these values for the first numbers: f(1)=31, f(2)=117, f(3)=3, f(4)=98, and the answer is f(5). With this algorithm EBBA is able to find a logical answer for all integer number series.
This algorithm is actually also applicable to spatial and verbal IQ-tests. If you convert the images in a spatial test to a bitmap, and the words in a verbal test from ASCII to their binary equivalents, then you can apply the same formula to those questions. Since the conversion process is completely logical, the answer is always logical too; maybe not the same as what a human would answer, but it is still logical.
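The padding-and-concatenation algorithm can be sketched in a few lines. The project itself is written in C++ and C#; this Python sketch and its function names are my own, not EBBA's actual code:

```python
def build_parameters(numbers):
    """Pad the numbers with leading zeros, concatenate them, and return
    (a, b, k) as defined in the text above."""
    a = max(len(str(n)) for n in numbers)       # digits per padded number
    concatenated = "".join(str(n).zfill(a) for n in numbers)
    b = len(concatenated)                       # total digit count
    k = int(concatenated)                       # the concatenated number
    return a, b, k

def f(x, a, b, k):
    """Read out the x-th block of a digits from k (the formula above)."""
    e = b - a * x
    block = k // 10 ** e if e >= 0 else k * 10 ** (-e)
    return block % 10 ** a

a, b, k = build_parameters([31, 117, 3, 98])
# a = 3, b = 12, k encodes the digit string "031117003098"
print([f(x, a, b, k) for x in range(1, 5)])  # [31, 117, 3, 98]
print(f(5, a, b, k))                         # 0
```

Note that for any x beyond the given numbers the extracted block is zero, so this particular f always "continues" the series with 0: logical, as the text says, but rarely what a human would answer.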
2. Background

I have always been interested in artificial intelligence (AI), and during the summer of 2003 my interest got a real boost when I found out about the A.L.I.C.E. project. By then I had decided that my school project was going to be in the field of AI. However, AI is an extremely large subject, and it took a while before I had narrowed down exactly what the project was going to involve. I wanted to know how far away we are from creating a bot that can emulate the human mind so well that we cannot distinguish it from a human being, or, in AI terms, a bot that passes a Turing test.

At first I focused more on the mathematical perspective of AI and not on what I call the human perspective. What is the difference between the two? The mathematical perspective is based on the belief that to create the "perfect" AI, all you need is the perfect mathematical formula. Markov modelling and neural networks are two common concepts in this field. The human perspective focuses more on creating human-like software that creates the illusion of a human being. Alan Turing is the most famous founding figure in this field. His text "Computing Machinery and Intelligence" (1950) revolutionized the way of thinking about AI. He believed that you can never create a perfect definition of intelligence and must therefore focus on what we humans believe is intelligent. In other words, if you cannot distinguish between a human being and some kind of AI, then the AI must be just as intelligent as the human.

At first I focused more on the mathematical models, but soon I realized what Mr Turing did over 50 years ago: we will never be able to create the perfect definition of intelligence, and can therefore never find the mathematical "perfect formula", simply because we would not know when we had found it. However, that does not mean that things such as Markov modelling and neural networks have no use; they are very much useful.
To be truthful, I realize that when the day comes that we have AIs just as intelligent as human beings, they will not have been created with tools such as AIML (which EBBA uses). However, on the way to the perfect AI, this is as human-like as you can get at the moment. The programming languages I have chosen to use (C++ and C#) were chosen for a quite simple reason: I have very little experience with them, but I want to learn more. So the code in these initial releases might not be at the quality level I usually write at, but hopefully this will be a great learning experience instead.
3. How it's done

I have created two flow-charts: one from the user's perspective (what he/she does and what happens then) and one from the technical perspective (the data flow within the processes).
Technical perspective:
Why is HTTP used with the AIML parser but TCP/IP with the facial expression system?

It's quite simple really: the AIML parser I am using (J-Alice) already implements a simple web server, so I thought it was unnecessary to reinvent the wheel by adding another protocol to the software. The reason I didn't stick with HTTP for the facial animation system is that low latency is more important when you are working with visemes.
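To illustrate the latency argument: a viseme update is a few bytes, so a raw TCP message avoids per-request HTTP overhead. The 2-byte length-prefixed framing below is my own illustration, not EBBA's actual wire format:

```python
import struct

def frame_viseme(viseme_id):
    """Frame a SAPI5 viseme ID (0-21) as a tiny length-prefixed TCP packet."""
    payload = struct.pack("!B", viseme_id)            # 1-byte payload
    return struct.pack("!H", len(payload)) + payload  # 2-byte length prefix

def http_equivalent(viseme_id):
    """The same information wrapped in a minimal HTTP request, for comparison."""
    return ("GET /viseme?id=%d HTTP/1.1\r\nHost: localhost\r\n\r\n"
            % viseme_id).encode()

print(len(frame_viseme(7)), len(http_equivalent(7)))  # prints: 3 46
```

Sending and parsing 3 bytes per update instead of a full HTTP request/response round trip is what keeps the lip animation in step with the speech.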
The user’s perspective:
4. Installation notes and system requirements

Installation: Just run setup.exe and follow the instructions.
Note: When you close the software you need to close several windows, as the components run as separate processes.
Requirements

Internet Explorer 5 or higher
The .NET 1.1 Framework
A SAPI5-compliant (preferably English-speaking) TTS engine

All of the above software is available for free from Microsoft's website.
5. Version history

0.1 (2004-01-26) Initial release

Just a first "proof-of-concept" release. The face, voice and brain were connected, but it was quite buggy and still had a male face.
0.2 (2004-03-17)

New features
Female face: Ebba now has a female face so that she can live up to the name ;-). (The model and the muscles were created by Gedalia Pasternak; the skin texture was made by me.)
Updated version of the facial animation system: The facial animation system (Expression) has been updated with a version Gedalia sent me. (Thanks to Gedalia Pasternak.)
Numeric IQ-test solver: Ebba can now find a formula for practically any integer number series, which are often used in IQ-tests. (Created by me; read above for more info about it.)
Some enhancements in the AIML code: Just some improvements to give Ebba a more personal touch. (Me.)
Installation file: The binary distribution now has an installation package created with the wonderful installation system NSIS. It includes the license text, creation of shortcuts, an uninstaller, etc. (Me.)
Fixes
1. Fix so that the texture manager in the animation system can handle absolute paths to the texture in the mesh. Without this, the program had to be placed in C:\src\expression because the mesh pointed to that path. The code is called only when the file is not found, so it does not affect any other paths that were already working.
2. Addition of an extra argument to the process so that you can specify a different path to the face-data folder. That is useful if you wish to keep the folder somewhere else.
3. Other minor fixes and improvements.
0.3 (2004-04-05)

New features
Speech recognition: The interface now has a speech-recognition feature that lets you ask Ebba questions without needing a keyboard. I've tested it, and considering that English is not my native language, it worked fairly well.
Improvements
1. The IQ-test solver now handles negative and floating-point input values.
2. The TTS engine is now called from the facial-animation process: this somewhat improves performance and decreases the latency between the speech and the lip animations. It also fixes the bug where several error messages came up if you closed Ebba while she was speaking.
3. Ebba now adjusts her speech rate to how much she has to say: when she has more to say, she speaks faster.
4. Ebba's skin texture is somewhat improved.
5. Added a new icon to the executables.
Fixes
1. The windows are now resized at startup to fit the screen better.
2. The camera is automatically adjusted at startup to fit Ebba's face.
3. Various fixes and code cleanups (also somewhat improved error handling).