Appendix

  • Uploaded by: Eyal Geron
  • June 2020
Appendix A

INNOVATIVE METHOD FOR MOTOR LEARNING AND NEUROLOGICAL REHABILITATION

By Prof. Avi Ohry, August 2008

Over the last five years, the “Bionics” company has developed a method for motor learning, applied by means of a unique robotic technology, that halves the time patients with neurological injuries need for motor learning, compared with task exercises performed freely or aided by existing treatment means. Existing methods applied to aid motor learning in general, and to facilitate neurological rehabilitation in particular, are based on verbal or physical guidance, which requires the patient to have the cognitive capacity to understand mistakes or “corrective” actions.

The method behind the innovative technology introduced by “Bionics” is based on a primary and basic learning phenomenon inherent in animals and humans that takes place during adjustment to a new environment of forces. In this method, while the patient performs motor task exercises, forces proportional to the patient’s mistakes are applied to the patient, enhancing each mistake. These forces stimulate a primal instinct to adjust and oppose the “mistake enhancing” forces while applying “corrective” forces, which are updated in the motor memory just like any exercise experience. The primary advantage of this method is in stimulating automatic response with no preliminary cognitive processing, which is necessary in traditional motor learning methods and specifically in neurological rehabilitation, and which is not necessarily intact in patients with neurological damage.

This innovative technology, which combines computerized robotics with virtual reality for optimal motor challenge, will lead the developing trend in rehabilitative robotics, which has been characterized, so far, by automation of existing methods used in rehabilitation, while allowing for independence and multiple practice cycles.
The innovative method and technology adapts itself, automatically or by design of the rehabilitative system, to the perception and optimal gradual progress of patients with various types of impairments. Clinical trials were carried out at “Reut” Medical Center on a group of patients with neurological damage, mostly following a stroke. The research findings demonstrated improvement in motor learning and reduction of learning time by more than half the time compared with the standard rehabilitative process, convenient and efficient interfacing to the method and the system, and quick and easy integration in the comprehensive rehabilitation process. Developing the method and the system, combined with findings of the clinical research, give rise to a scientific breakthrough with ramifications of almost unprecedented scale, constituting a revolutionary step in neurological rehabilitation in particular, and in motor learning in general, for many varied fields combining motor learning and motor skills, such as motor development in children, various sports fields, as well as simulators for fields such as aviation, security and art.

Appendix B

Motivating Rehabilitation by Distorting Reality*

James Patton, Yejun Wei, & Chris Scharver; Robert Scheidt; Robert V. Kenyon

Sensory Motor Performance Program (SMPP), Rehabilitation Institute of Chicago, Chicago, Illinois USA 60611
Electronic Visualization Laboratory (EVL), University of Illinois at Chicago, Chicago, Illinois, USA
Dept. of Biomedical Engineering, Marquette University, Milwaukee, Wisconsin, USA
[email protected]

Abstract – We have found, through a series of recent experiments, encouraging evidence that the neuromotor system is motivated to change motor patterns when exposed to visuo-motor tasks. We have also shown that the learning of these tasks can be heightened with forces and/or visual distortions that appropriately manipulate the error. This process does not require intense concentration and it is often considered a game. We describe the next generation of large-workspace, three-dimensional robotic haptics/graphics systems for rehabilitation.

Index Terms – learning, adaptation, rehabilitation, human, stroke.

I. INTRODUCTION

The emergence of new robotic devices designed to interface with humans has led to great strides in both fundamental and clinical research on the sensory motor system. Research has recently answered questions relevant to rehabilitation, haptics (the study of artificially rendering touch), motor control, and human-machine interactions. Most importantly, these devices have shown how humans adapt under altered environmental conditions [1-8]. Here we focus on experiments and technology to harness the adaptive process for rehabilitation.

The recovering nervous system, such as in an individual who has suffered a stroke, is an excellent candidate for such adaptive training. The surviving stroke population in the US is over 3 million and growing [9], and roughly one-third of all individuals who experience a stroke will have some residual impairment of the upper extremity [10]. Labor costs for rehabilitation comprise roughly 60 to 70% of what the U.S. spends, about $30 billion per year [11]. If new technology could remove just 5% of the labor costs on 10% of only the largest population (stroke survivors, about 30%), the savings would be $300 million. Although Medicare's 2001 incentives are for shortening the length of stays and of therapy in hospitals, recent studies support intensive therapy or “massed practice” for stroke survivors [12, 13] and the constraint of the less-affected limb [14, 15]. It would appear that the tireless, precise, and swift capabilities of a robot certainly allow for massed practice while simultaneously logging progress.

Moreover, the human brain and spinal cord remain modifiable, even in the adult, and even following many brain injuries [16-20]. This neuroplasticity indicates that the structure and function of the brain can be altered continuously in response to sensory stimulation and changing physical environments. Plasticity is a pivotal element of neuroscience and rehabilitation, since it is likely to be the primary mechanism that underlies recovery from chronic neurological illness. Devices that encourage and facilitate plasticity can also be used with drugs that further enhance the effects. Thus, it makes good sense to study new and more efficient treatment involving technology, robots, and virtual reality.

* This work is partially supported by American Heart Association 0330411Z, NIH R24 HD39627, NIH 5 RO1 NS 35673, NIH F32HD08658, NSF BES0238442 and the Falk Trust.

Fig. 1. Planar manipulandum robot. Forces are monitored with a load cell at the handle (ATI F/T Gamma30/100) and encoders record position (Teledyne Gurley 25/045-NB17-TA-PPA-QAR1S). Motors (PMI JR24M4CH) render forces at the subject’s hand.

New technology has made possible many new and imaginative possibilities for promoting adaptation. Robotic systems can be programmed to go far beyond the initial idea of limb guidance or making the physical system easier to manage (though these programs are important as well). Recent research suggests that making conditions more difficult can trigger functional recovery [21-26] and can “trick” the nervous system into certain behaviors by giving altered sensory feedback [27-33]. Interestingly, this adaptive process appears to bypass conventional learning mechanisms that require intense concentration -- results are the same if there is conversation or background music, and it is often considered a game. The sections that follow present two examples that have shown promise for engaging and motivating recovery of function in individuals that have suffered a neurological injury.


II. ERROR AUGMENTATION

Several recent robot experiments on both healthy subjects and stroke survivors have revealed the encouraging result that improvements occurred when the training forces tended to magnify errors, but not when the training forces reduced the errors or when the forces were not present at all [17]. This led us to investigate further by custom designing a force that was proportional to the error the subjects initially made [23, 34]. During training, the force amplified their initial error, but resulted in a beneficial outcome (Fig. 2). A few (3 of the 13) subjects did not preserve their beneficial after-effects to the end of the experiment (i.e., they de-adapted much like healthy people do in such experiments), but the remaining majority of subjects preserved their benefit for 75 more movements, much longer than healthy people typically take to de-adapt (Fig. 3). We are currently working on a follow-up study that involves repeated visits to determine retention and incremental gains.

Another experiment in healthy subjects focused specifically on the type of error augmentation strategy [35]. This revealed new insights for robotic teaching. Four groups of subjects each trained on the planar robot (Fig. 1) with a different type of error augmentation. Trajectory errors from the ideal straight-line movement were amplified on the visual display with a gain of *2 or *3.1, or by an “offset”, a shift in their trajectory that did not depend on the current error. We found that error augmentation improved the rate and extent of motor learning of the visuomotor rotation (Fig. 3). Furthermore, our results suggest that both error amplification and offset augmentation may facilitate neuro-rehabilitation strategies that restore function in brain injuries such as stroke. Interestingly, increasing the amount of error augmentation so that it is too large appears to diminish the benefits (*3.1 in Fig. 3).

There appear to be several ways in which error augmentation succeeds in speeding up learning. More experiments are needed to identify optimal conditions that capitalize on this phenomenon.
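The error-augmentation conditions described in this section can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the function names, the one-dimensional treatment of lateral error, the sign convention, and the `stiffness` parameter are all assumptions. Only the gains (*2, *3.1), the error-independent offset, and the idea of a force proportional to error come from the text.

```python
def gain_augmented_display(ideal, actual, gain):
    """Displayed lateral position when the deviation from the ideal
    straight-line path is multiplied by `gain` (gain = 1 means no distortion)."""
    error = actual - ideal
    return ideal + gain * error


def offset_augmented_display(actual, offset):
    """Displayed lateral position shifted by a constant `offset`
    that does not depend on the current error."""
    return actual + offset


def error_enhancing_force(error, stiffness):
    """Hypothetical linear form of an error-enhancing force: it has the same
    sign as the error, so it pushes the hand further from the ideal path."""
    return stiffness * error


# Example: a 2-unit lateral deviation shown under each gain condition.
for gain in (1.0, 2.0, 3.1):
    print(gain, gain_augmented_display(ideal=0.0, actual=2.0, gain=gain))
```

The control condition corresponds to `gain = 1.0`; the offset condition differs from the gain conditions in that the added distortion stays constant as the subject's error shrinks.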

Fig. 2. Motions to one of the targets of a stroke patient in successive critical phases of an experiment (panels: Baseline, Early training, After-effects, Final). The thick lines represent average motion; thin lines represent individual reaching motion paths. Shaded areas are 95% confidence intervals; and dotted lines indicate ideal trajectories. Training forces (green arrows) were specially designed to cause a beneficial after-effect. Although these forces were turned off for 120 movements after training (about 15 minutes), 8 of the 11 stroke subjects retained the benefits (in the Final phase) much longer than a healthy subject would have retained any adaptation effect. [Adapted from [34]].

In summary, distortions that reshape the visual (via a display) and mechanical (via a robot) experience can be designed to amplify error, and they result in desired changes in the motor learning process. These results led us to a new family of technology that takes these “testbed” experiments on the simple haptic/graphic display system to a more functionally relevant, large-workspace, three-dimensional haptic/graphic system that can train individuals on everyday tasks.

Fig. 3. Time constants of error reduction in healthy subjects (mean of subjects) for different types of error augmentation: Offset, *2, *3.1, and Control (*1). The axis shows the time constant of error reduction in movements (0 to 100). The *2 & Offset groups learned in half the time. Error bars indicate the 95% confidence interval for all subjects in the group.
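Fig. 3 reports time constants of error reduction, but the text does not say how they were estimated. One common approach, assumed here purely for illustration, is to model the error on movement n as e(n) = A exp(-n / tau) and fit a straight line to log(error) against movement number. The function name and the synthetic data below are hypothetical.

```python
import math


def fit_time_constant(errors):
    """Estimate tau (in movements) from a least-squares line through
    log(error) versus movement index. Assumes every error is positive
    and that the errors actually change over the series."""
    indices = list(range(len(errors)))
    logs = [math.log(e) for e in errors]
    n_mean = sum(indices) / len(indices)
    log_mean = sum(logs) / len(logs)
    sxy = sum((n - n_mean) * (y - log_mean) for n, y in zip(indices, logs))
    sxx = sum((n - n_mean) ** 2 for n in indices)
    slope = sxy / sxx          # slope of the log-error line equals -1 / tau
    return -1.0 / slope


# Synthetic, noise-free example: error decays with a 20-movement time constant.
synthetic_errors = [5.0 * math.exp(-n / 20.0) for n in range(60)]
tau = fit_time_constant(synthetic_errors)
print(round(tau, 3))
```

Under this model, a group that "learned in half the time" would show a fitted tau about half that of the control group. Real reaching data are noisy and may not reach zero error, so a fit with an added constant term could be more appropriate.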

III. AUGMENTED REALITY TECHNOLOGY THAT ENGAGES THE PATIENT

Much of this research has been constrained by the limitations of available technologies. Most systems are small with one or two degrees of freedom and hence do not allow the complex behavior seen in everyday tasks. They involve a visual display that often does not realistically overlay the actual motion. Recent research also supports “task-specific activity for rehabilitation,” in which motions relevant to activities of daily living should be part of recovery [16, 36]. In order to achieve significant advances in these diverse fields, the next generation of human-interface robots must be strong, large, three dimensional, safe, backdrivable (i.e., allow the user to easily push back) and have an accompanying three-dimensional visual interface. Our current development work focuses on such a system [37].

The Virtual Reality and Robotic Optical Operations Machine (VRROOM)

VRROOM is an integrated system combining a virtual reality graphics environment, haptic robotic force feedback, and tracking of limb segments using a magnetic tracking system (Fig. 4). The system’s primary component is the visual display system, the Personal Augmented Reality Immersive System (PARIS), developed in the Electronic Visualization Lab at the University of Illinois at Chicago. PARIS is currently the highest quality see-through augmented display system available. Most virtual reality displays are computationally burdened by rendering an environment with objects that in the end often do not look that real. Consequently a display that is slow with long latencies can hamper performance even in the healthy [38-43] and cause motion sickness [44]. Furthermore, when one also is controlling a haptic robotic device, delays can lead to catastrophic instabilities [45, 46]. Our focus along with others [47-49] is on reducing the amount of processing.
PARIS projects stereographic images onto a half-silvered mirror, allowing users to view virtual objects superimposed onto the real world. Through adjusting the relative lighting levels under the mirror, subjects are able to view their own limb and the actual environment, with only the artificial virtual elements that are needed [50]. Special design attention is given to brightness (luminosity), field-of-view, and resolution. A cinema-quality digital projector (Christie Mirage 3000 DLP) displays a five-foot-wide 1280x1024 pixel image, resulting in a 110° viewing angle. Infra-red emitters synchronize the display of separate left and right eye images through LCD shutter glasses.

The VRROOM system also integrates an Ascension Flock of Birds™ magnetic tracking system that tracks head position so that the visual display is rendered with the appropriate viewer-centered perspective. The magnetic tracking system currently uses two sensors to track other body segments with continuous position and orientation information. We propose to purchase two more sensors so that head, back/trunk, shoulder, upper arm, and lower arm segments can all be tracked. It is important to note that our tests have shown that neither the aluminum parts of the PARIS system nor the electromagnetic radiation from the motors of the PHANToM distort the readings of the magnetic tracking system.

The VRROOM system also integrates several robotic arms that suit different needs for generating end-effector forces or motions on varying scales. Two PHANToM robots (the Omni or the larger 3.0) provide a workspace measuring up to 900 x 900 x 300 mm with a maximum continuous force of 3 Newtons (N) with transient peaks of 22 N. The hardware-resident controller runs asynchronously with the computer, assuring stable, uninterrupted control. The WAM (Barrett Technologies) can be used for strong impedance control applications that require precisely controlled forces and torques. Finally, the Haptic Master (FCS technologies) can be used for strong admittance applications that require precisely controlled motions.

IV. SUMMARY, DISCUSSION AND CONCLUSIONS

This paper discusses adaptive training to teach movements that does not require explicit instruction or a large amount of attention, and can provide motivation simply by heightening the error and providing an immersive and engaging experience. Our experimental results all point to a single unifying theory: the judicious manipulation of error (through forces and/or visual distortions) can lead to lasting desired changes by inducing adaptation. Interestingly, this process appears to bypass conventional learning mechanisms that require intense concentration -- results are the same if there is conversation or background music, and it is often considered a game. Based on our own and others’ inspirational studies, these systems inevitably should lead the way to new clinical practices and commercialization.

There are several possible causes for why not all patients appear to retain the benefits of adaptive training. First, it is possible that there are secondary, chronic contractures in the peripheral passive tissues, common in chronic stroke survivors. Such causes cannot be attributed to faulty motor programming and hence cannot be manipulated using adaptation. Second, shifts in movement patterns may have been “buried in the noise” of motor variability because higher variability is common for stroke survivors [17, 51-54].

IV. DISCUSSION

The reasons why adaptive training shows promise are not yet clear. One possible reason is that stroke survivors have fewer remaining motor pathways, and their new descending motor command signals are only a subset of their pre-injury signals, and are therefore inappropriate. Adaptive training may bring about a “motor epiphany,” much like how a coach gets an athlete to try the proper strategy. It is also possible that spastic activity can be reduced by such repeated training. Another possible reason may be that such learning is implicit, bypassing the areas of the brain that are affected by the injury. Implicit learning involves more primitive neural pathways [55-57], with more automatic recall [58]. Finally, the impaired nervous system may require larger errors to begin to change, and adaptive training may “wake up” the learning process. Intensifying error also leads to larger signal-to-noise ratios for sensory feedback and self-evaluation. In many artificial neural networks, the error signal drives the learning process [19, 59-62]. Hence, going beyond virtual reality to distorted reality such as error augmentation is currently of great interest to our group [37].

Fig. 4. Design concept of the VRROOM system, and the actual system being used. The subject should be able to either stand or sit in front of a large-workspace, 3-D robotic device and an accompanying 3-D display that allows the user to also see their own limb.

ACKNOWLEDGMENT

Thanks to Sandro Mussa-Ivaldi for his insights and participation in some of the studies mentioned. The authors thank Xun Lou for his assistance in some of the computer science and William Townsend of Barrett Technologies for some of his input on robotic theory.

REFERENCES

[1] M. A. Conditt, F. Gandolfo, and F. A. Mussa-Ivaldi, "The motor system does not learn the dynamics of the arm by rote memorization of past experience," Journal of Neurophysiology, vol. 78, pp. 554-560, 1997.
[2] R. Shadmehr and F. A. Mussa-Ivaldi, "Adaptive representation of dynamics during learning of a motor task," Journal of Neuroscience, vol. 14, pp. 3208-3224, 1994.
[3] R. Shadmehr and Z. M. Moussavi, "Spatial generalization from learning dynamics of reaching movements," Journal of Neuroscience, vol. 20, pp. 7807-15, 2000.
[4] D. M. Wolpert, Z. Ghahramani, and M. I. Jordan, "Are arm trajectories planned in kinematic or dynamic coordinates? An adaptation study," Experimental Brain Research, vol. 103, pp. 460-70, 1995.
[5] R. Osu, E. Burdet, D. W. Franklin, T. E. Milner, and M. Kawato, "Different mechanisms involved in adaptation to stable and unstable dynamics," Journal of Neurophysiology, vol. 90, pp. 3255-69, 2003.
[6] K. E. Novak, L. E. Miller, and J. C. Houk, "Features of motor performance that drive adaptation in rapid hand movements," Experimental Brain Research, vol. 148, pp. 388-400, 2003.
[7] T. E. Milner, "Adaptation to destabilizing dynamics by means of muscle cocontraction," Experimental Brain Research, vol. 143, pp. 406-16, 2002.
[8] D. W. Franklin and T. E. Milner, "Adaptive control of stiffness to stabilize hand position with large loads," Experimental Brain Research, vol. 152, pp. 211-20, 2003.
[9] J. Broderick, S. Phillips, J. Whisnant, W. O'Fallon, and E. Bergstralh, "Incidence of rates of stroke in the eighties: the end of the decline in stroke," Stroke, vol. 20, pp. 577-582, 1989.
[10] C. Gray, J. French, D. Bates, N. Cartilidge, O. James, and G. Venables, "Motor recovery following acute stroke," Age Aging, vol. 19, pp. 179-184, 1990.
[11] D. B. Matchar, P. W. Duncan, G. P. Samsa, J. P. Whisnant, G. H. DeFriese, D. J. Ballard, J. E. Paul, D. M. Witter, Jr., and J. P. Mitchell, "The Stroke Prevention Patient Outcomes Research Team. Goals and methods," Stroke, vol. 24, pp. 2135-42, 1993.
[12] J. Sivenius, K. Pyorala, O. Heinonen, J. Salonen, and R. P, "The significance of intensity of rehabilitation of stroke - a controlled trial," Stroke, vol. 16, pp. 928-31, 1985.
[13] E. Taub, G. Uswatte, and R. Pidikiti, "Constraint-Induced Movement Therapy: a new family of techniques with broad application to physical rehabilitation--a clinical review. [see comments]," Journal of Rehabilitation Research & Development, vol. 36, pp. 237-51, 1999.
[14] E. Taub, N. Miller, T. Novack, E. d. Cook, W. Fleming, C. Nepomuceno, J. Connell, and J. Crago, "Technique to improve chronic motor deficit after stroke," Archives of Physical Medicine and Rehabilitation, vol. 74, pp. 347-354, 1993.
[15] R. J. Nudo, "Recovery after damage to motor cortical areas," Current Opinion in Neurobiology, vol. 9, pp. 740-7, 1999.
[16] R. J. Nudo and K. M. Friel, "Cortical plasticity after stroke: implications for rehabilitation," Revue Neurologique, vol. 155, pp. 713-7, 1999.
[17] J. L. Patton, M. E. Phillips-Stoykov, M. Stojakovich, and F. A. Mussa-Ivaldi, "Evaluation of robotic training forces that either enhance or reduce error in chronic hemiparetic stroke survivors," Experimental Brain Research, vol. In press, 2005.

Appendix C

Evaluation of robotic training forces that either enhance or reduce error in chronic hemiparetic stroke survivors

Exp Brain Res (2006) 168: 368–383 DOI 10.1007/s00221-005-0097-8

RESEARCH ARTICLE

James L. Patton · Mary Ellen Stoykov · Mark Kovic · Ferdinando A. Mussa-Ivaldi

Received: 9 March 2004 / Accepted: 21 June 2005 / Published online: 26 October 2005 Springer-Verlag 2005

J. L. Patton (✉)
Sensory Motor Performance Program, Rehabilitation Institute of Chicago, Physical Medicine & Rehabilitation, Mechanical and Biomedical Engineering, Northwestern University, 345 East Superior St., Room 1406, Chicago, IL 60611, USA
E-mail: [email protected] Tel.: +1-312-2381232 Fax: +1-312-2381232

M. E. Stoykov · M. Kovic
Sensory Motor Performance Program, Rehabilitation Institute of Chicago, Occupational Therapy, University of Illinois at Chicago, Chicago, IL, USA
E-mail: [email protected] E-mail: [email protected]

F. A. Mussa-Ivaldi
Sensory Motor Performance Program, Rehabilitation Institute of Chicago, Physiology, Physical Medicine & Rehabilitation, Institute for Neuroscience, Mechanical and Biomedical Engineering, Northwestern University, Chicago, IL, USA
E-mail: [email protected]

Abstract This investigation is one in a series of studies that address the possibility of stroke rehabilitation using robotic devices to facilitate “adaptive training.” Healthy subjects, after training in the presence of systematically applied forces, typically exhibit a predictable “after-effect.” A critical question is whether this adaptive characteristic is preserved following stroke so that it might be exploited for restoring function. Another important question is whether subjects benefit more from training forces that enhance their errors than from forces that reduce their errors. We exposed hemiparetic stroke survivors and healthy age-matched controls to a pattern of disturbing forces that have been found by previous studies to induce a dramatic adaptation in healthy individuals. Eighteen stroke survivors made 834 movements in the presence of a robot-generated force field that pushed the hand in proportion to its speed and perpendicular to its direction of motion, either clockwise or counterclockwise. We found that subjects could adapt, as evidenced by significant after-effects. After-effects were not correlated with the clinical scores that we used for measuring motor impairment. Further examination revealed that significant improvements occurred only when the training forces magnified the original errors, and not when the training forces reduced the errors or were zero. Within this constrained experimental task we found error-enhancing therapy (as opposed to guiding the limb closer to the correct path) to be more effective than therapy that assisted the subject.

Keywords Human · Motor learning · Adaptation · Control · Force fields · Robots · Haptics · Human–machine interface · Teaching · Rehabilitation · Stroke · Hemiparesis · Impairment · Lesion · Cortex
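The force field described in the abstract, with magnitude proportional to hand speed and direction perpendicular to hand velocity, can be sketched as a 90-degree rotation of the velocity vector scaled by a gain. This is a minimal sketch under stated assumptions: the function name, the gain value and its units (N·s/m), and the convention for which rotation counts as "clockwise" are not given in the text and are chosen here for illustration only.

```python
import math


def curl_field_force(vx, vy, gain, clockwise=True):
    """Force perpendicular to the hand velocity (vx, vy), with magnitude
    gain * speed. 'Clockwise' here means the velocity vector is rotated
    90 degrees clockwise to obtain the force direction (assumed convention)."""
    if clockwise:
        return (gain * vy, -gain * vx)
    return (-gain * vy, gain * vx)


# Example with a hypothetical gain of 20 N·s/m and a hand speed of 0.5 m/s.
vx, vy = 0.3, 0.4
fx, fy = curl_field_force(vx, vy, gain=20.0)
print(fx, fy)                      # force components in newtons
print(math.hypot(fx, fy))          # magnitude, equal to gain * speed
print(fx * vx + fy * vy)           # dot product with velocity, approximately zero
```

Because the force is always perpendicular to the velocity, it deflects the hand sideways without directly speeding it up or slowing it down, and it vanishes when the hand is at rest.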

Introduction

It is appealing to consider using machines for the rehabilitation of brain-injured patients. Machine-assisted training can be highly accurate, can be sustained for very long periods of time, can measure progress automatically, and can produce a wide range of forces or motions. Repetitive practice of the impaired limb has been shown to be beneficial in improving functional ability (Wolf et al. 1989; Taub et al. 1993, 1999; Taub 2000). Beyond the recommended therapy that strengthens and stretches (Delisa and Gans 1993) lies the possibility of neurofacilitation, or neuromuscular re-education through techniques that incorporate our knowledge of the neural circuitry. An important question is which method, among the repertoire of possibilities, is best for motor recovery. Encouraging research (Patton et al. 2001a, b; Dancausea et al. 2002; Patton and Mussa-Ivaldi 2003; Takahashi and Reinkensmeyer 2003) suggests one method may be adaptive training, in which the natural adaptive tendencies of the nervous system are used to facilitate motor recovery. This paper investigates a critical question of the feasibility of adaptive training: Do stroke survivors preserve their ability to adapt? We also address the question of which type of training forces are best and whether the effects of adaptation last after the forces are removed.

There have been a few promising preliminary studies on neurorehabilitation using mechatronic and robotic devices. A two-degree-of-freedom (DOF) robot manipulator (similar to the one used in this study) was used to train stroke survivors in shoulder and elbow movement by moving the hand and forearm of the patient in the horizontal plane (Krebs et al. 1998b, 2000; Volpe et al. 2000, 2001). This is an assistive form of therapy that guides the arm along the desired path and is different from the strategy presented in this paper. Clinical testing of assistive training has been underway for several years, and results have shown improved patient performance (Krebs et al. 1998b) with benefits lasting more than 3 years (Krebs et al. 1999b). This training has led to increased clinical scores and greater gains in proximal arm strength and greater recovery of functional independence (Volpe et al. 2000). An industrial robot (Puma 560) was also used to apply forces to the paretic limbs of stroke survivors through a customized forearm attachment (Burgar et al. 2000; Lum et al. 2002). The robot could move the limb to a target, applying spring-like forces toward the target, or mirror the contralateral limb movements. These results provide convincing evidence that supplemental robotic therapy can improve recovery. They do not, however, indicate which type of robotic treatment offers the greatest advantage. Previous approaches have attempted to mimic the actions of the therapist by using a robot to apply error-decreasing, assisting forces. In fact, Kahn and colleagues (Kahn et al. 2001) suggested that reducing errors during reaching training with a robotic device does not provide any added benefit compared to repetitive reaching training where errors were allowed. This paper tests an alternative approach that enhances error and can only be implemented by a computer-controlled device.

Interestingly, several theories have been proposed for clinical treatment. Some sources suggest that providing manual guidance during reaching may facilitate rehabilitation (Bobath 1978). Other theories advocate using a component of resistance in a direction opposite to movement during diagonal reaching patterns (Voss et al. 1985). Although these approaches are in some ways mutually exclusive, their efficacy has not been tested objectively, and the most effective rehabilitation algorithm(s) have yet to be determined. New techniques are currently being explored. For example, one possible technique is to provide assistance by guiding (pulling) the hand toward the desired trajectory (Volpe et al. 1999; Lum et al. 2002). Another possible technique is to provide resistance by either opposing the hand as it moves (Stein et al. 2004), or by imposing forces that amplify the error. The latter method is justified by the observation that movement error is likely to be a driving signal for adaptation and learning (Rumelhart et al. 1986; Lisberger 1988; Kawato 1990; Dancausea et al. 2002). In a walking study, subjects significantly reduced the time required to predict the applied force field by approximately 26% when the field was transiently amplified (Emken and Reinkensmeyer 2005). Others have also emphasized augmented or amplified error in the therapeutic process (Winstein et al. 1999; Brewer et al. 2004; Emken and Reinkensmeyer 2005). However, for such an error-enhancing, adaptive technique to work, the patient’s ability to adapt must be preserved following the injury.

Recent studies of motor adaptation in healthy individuals have demonstrated the excellent potential of the natural adaptive process in altering motor patterns. When people are repeatedly exposed to a robot-generated force field applied to the hand (forces as a function of hand position and/or hand velocity) that systematically disturbs limb motion, they are able to recover their original kinematic patterns over a short period of practice (Shadmehr and Mussa-Ivaldi 1994). Subjects do this by cancelling the disturbance with an appropriate preplanned pattern of forces. This is a form of feedforward control that is revealed by characteristic after-effects: when the disturbing force field is unexpectedly removed, subjects make erroneous movements in directions opposite to the perturbing forces. Adaptation and its related after-effects have been demonstrated for different types of force fields, ranging from simple position-, velocity-, and acceleration-dependent force fields (Bock 1990; Flash and Gurevitch 1992; Shadmehr and Mussa-Ivaldi 1994; Gandolfo et al. 1996; Conditt et al. 1997) to Coriolis forces caused by moving in a rotating room (Lackner and DiZio 1994) to skew-symmetric “curl” fields that produce forces in a direction perpendicular to the velocity of the hand (Gandolfo et al. 1996).
Similar results have also been observed after manipulations of visual perception that altered the visual feedback of movement (Held and Freedman 1963; Miall et al. 1993; Pine et al. 1996; Krakauer et al. 1999). More recent studies support the view that subjects adapt by learning the appropriate internal model of the perturbing force field rather than learning an appropriate temporal sequence of muscle activations (Gandolfo et al. 1996; Conditt et al. 1997). Using this internal model, subjects are able to predict the effects of the external field along a desired movement and use a feedforward control strategy (also called anticipatory control) (Hemami and Stokes 1982; Ghez 1991). Modeling techniques have been successful in predicting both how the arm is disturbed by a force field and the aftereffects of training (Shadmehr and Mussa-Ivaldi 1994; Kawato and Wolpert 1998; Bhushan and Shadmehr 1999). If this is true, one possible method for rehabilitation may be to use models to design the appropriate force field that will result in beneficial after-effects. Again, this would only work if stroke survivors can

Table 1 Subject characteristics. Vertical bars on the left side indicate the subjects that made repeat visits to the lab and participated in zero-force experiments

adapt. Furthermore, after-effects would also have to be permanently retained for this approach to have rehabilitative significance. There are several recent studies providing encouraging evidence that the ability to adapt and exhibit after-effects is preserved following cortical stroke. Reaching in individuals with stroke is characterized by errors that reflect their poor ability to manage the interaction torques (Beer et al. 2000). Aspects of reaching in cortical stroke survivors resemble those of cerebellar patients

(Bastian et al. 1996). Adaptation and after-effects in stroke survivors can be observed in the oculomotor (Weiner et al. 1983) and limb-motor systems (Raasch et al. 1997; Dancausea et al. 2002). In fact, prism adaptation has been shown to trigger the recovery from hemispatial neglect following stroke (Rossetti et al. 1998). Stroke-related damage in the sensorimotor areas appears to affect the processes underlying the control and execution of motor skills but not the learning of those skills (Winstein et al. 1999). Recently, stroke survivors showed the ability to adapt during small elbow motions that were disturbed by a spring-like force (Dancausea et al. 2002). However, severely affected individuals used atypical correction strategies. Another recent study evaluated the ability to adapt to a sideways force during forward motions and found that the after-effects were less evident in stroke survivors with more severe impairment (Takahashi and Reinkensmeyer 2003). Furthermore, preliminary studies in our laboratory on stroke survivors have revealed that after-effects may persist longer when they resemble healthy unperturbed movements (Raasch et al. 1997; Patton et al. 2001a, b). While conventional skill learning, such as learning to play a musical instrument, requires more conscious attention in order to achieve a goal, neuromotor adaptation has been argued to be closely related to procedural learning and thus to be a form of implicit learning (Krebs et al. 2001). Hence, these learning mechanisms may offer an effective alternative to conventional methods of rehabilitation. Implicit learning takes place without awareness of what has been taught (Squire 1986), and often does not require complete conscious attention. One example is procedural motor learning of a motor sequence that is embedded in a seemingly random set of movements (Fitts 1964; Squire 1986; Gomez Beldarrain et al. 1999; Seidler et al. 2002).
Another example is sensory-motor adaptation observed in force field paradigms (Krebs et al. 2001). Following unilateral stroke, recent evidence suggests that learning is facilitated by providing explicit information about the task that can enhance implicit motor learning (Boyd and Winstein 2001, 2003). If one could demonstrate that stroke subjects readily adapt to force training, and that beneficial after-effects persist, a new family of rehabilitation strategies would emerge. To begin exploring the possibility of exploiting implicit learning mechanisms for poststroke rehabilitation, this paper explores the features of motor adaptation in chronic stroke survivors during the execution of planar multijoint movements that are disturbed by a force field. We studied six movement directions using a two-joint planar robotic device, exposing hemiparetic stroke survivors and healthy age-matched controls to a force field that is commonly known to induce unequivocal adaptation in healthy individuals. We found that stroke survivors do adapt, albeit at a diminished level compared to healthy controls, and this capacity to adapt was not related to clinical scores of motor impairment.

Furthermore, we found that final improvements were most evident when the training forces magnified rather than reduced the original error. This study provides encouraging evidence that adaptive training could provide an effective supplement to conventional therapy. This research was presented in preliminary form at the Society for Neuroscience meeting, 2001 (Patton et al. 2001b).

Methods

Experiments

Research was approved by the Northwestern University Internal Review Board to conform to ethical standards laid down in the 1964 Declaration of Helsinki and federal mandates that protect research subjects. Before beginning, each subject signed a consent form that conformed to these Northwestern University guidelines. Twenty-seven stroke survivors, aged 30–72 years (mean age 51), and four healthy controls, aged 32–61 years (mean age 47), volunteered to participate (Table 1). All stroke participants were in the chronic stage, having suffered a stroke 16–173 months prior to the experiment. Our exclusion criteria were: 1) bilateral impairment, 2) severe sensory deficits in the limb, 3) aphasia, cognitive impairment or affective dysfunction that would influence the ability to comprehend or to perform the experiment, 4) inability to provide informed consent, and 5) other current severe medical problems. Eight additional subjects were recruited whose data did not reach final analysis (not shown in Table 1). Of these, one chose to abort the experiment due to his own frustration (SA15); one healthy subject chose to quit because of time constraints (SA23); two healthy controls and three stroke patients had lost data due to technical problems with the robotic device or data collection (SA26, SA30, SA31, SA39, and SA44); one stroke subject had such poor elbow movement that she was not capable of completing the experiment (SA41). Subjects held the free extremity (here referred to as the “endpoint”) of a two-degree-of-freedom (DOF) robot (Fig. 1) described elsewhere (Conditt et al. 1997; Scheidt et al. 2000). Endpoint forces and torques were monitored with a six-degree-of-freedom load cell fixed to the handle of the robot (Assurance Technologies Inc., model F/T Gamma 30/100).
The robot was equipped with position encoders that were used to record the angular position of the two robotic joints with a resolution exceeding 20 arc-seconds of rotation (Teledyne Gurley, model 25/045-NB17-TA-PPA-QAR1S). The position, velocity and acceleration of the handle were derived from these two signals. Two torque motors were used to apply programmed forces to the hands of the subjects (PMI Motor Technologies, model JR24M4CH). Subjects were seated so that the center of the range of targets, lying approximately at the center of their reachable workspace, was aligned with the shoulder,

in the proximal-distal direction (y-axis) (Fig. 1, right). The experiment involved only the hemiparetic limbs of the stroke subjects, and this corresponded to nondominant limbs in 17 of the 27 subjects (see Table 1). Subjects were asked to reach visual targets so that they made a series of random 10 cm movements to the vertices of a triangle. If subjects had difficulty in reaching the vertices of the triangle, we adjusted their chair position. To avoid fatigue, their elbow and forearm rested on a lightweight frictionless linkage (Fig. 1, left), and they could choose to rest between movements (subjects rarely rested longer than a few seconds every hundred movements). We controlled for a peak speed of 0.288 m/s by giving subjects feedback at the end of each movement using colored dots and auditory tones to let subjects know if they were going too fast, too slow, or within a range of ±0.05 m/s. Consequently, subjects’ speeds remained roughly constant throughout the entire experiment. All subjects performed a total of 834 movements (trials), broken down into the following experimental phases:
– Unperturbed familiarization: 60 movements (approximately 5 min) to become familiar with the system and with the task of moving the manipulandum.
– Unperturbed baseline: 30 movements (approximately 2 min) to establish a baseline pattern of reaching movements.
– Learning: 372 movements (approximately 25 min) with constant exposure to the “curl” force field, governed by:

$$F = \begin{bmatrix} 0 & A \\ -A & 0 \end{bmatrix} \dot{x}$$

Fig. 1 Subject positioning and the experimental apparatus. Two brushed DC torque motors (PMI model JR24M4CH, Kollmorgen Motion Technologies, NY, USA) control forces at a handle via a 4-bar linkage. Rotational digital encoders (model 25/045-NB17-TA-PPA-QAR1S, Teledyne-Gurley, Troy, NY, USA) report absolute angular position, and a 6-axis force/torque sensor reports the interface kinetics (Assurance Technologies, Inc., TI F/T Gamma 30/10, NC, USA). A PC acquires the signals and controls torque

where F is the vector of forces, $\dot{x}$ is the vector of instantaneous velocity, and A is the gain. With this type of field, the forces are always orthogonal to the velocity of the hand and form either a clockwise or counterclockwise circulating pattern in the space of hand velocities. This phase of 372 trials was subdivided into a block of 30 trials (five in each direction) at the beginning and end of training for statistical analyses. After a second series of 240 trials, there was a rest period of approximately 1 min while data collection equipment was reset, followed by a block of 72 trials and then the final 30 trials (see Figs. 3 and 4). During the learning phase, half of the subjects experienced A = 20 N·s/m (corresponding to clockwise forces), and half experienced A = −20 N·s/m (corresponding to counterclockwise forces) (cf. Table 1). Some of these subjects also served as their own controls as subjects in the zero-force group (Table 1). Prior to experiencing any force field, some visited the lab and performed the same experiment without any forces applied. These subjects were used to determine if practicing the motion alone (without any forces) led to a beneficial outcome.
– After-effects: 240 movements (approximately 16 min), with random, intermittent removal of the force field for one in eight of the trials (catch trials) to determine the after-effects.
– Training refresher: 12 movements (approximately 1 min), identical to the learning phase.
– Washout: 120 movements (approximately 8 min), all without forces.
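The curl field described above can be illustrated with a minimal Python sketch. It assumes the skew-symmetric form F = [[0, A], [−A, 0]]·v with x pointing to the subject's right and y pointing away from the body, so that a positive gain pushes the hand clockwise relative to its velocity; this sign convention and the function names are assumptions made for illustration, not the authors' controller code.

```python
import math

def curl_force(velocity, gain):
    """Force (N) produced by the curl field for a hand velocity (m/s).

    Assumed form: F = [[0, A], [-A, 0]] * v, so a positive gain rotates
    the velocity direction by -90 degrees (clockwise).
    """
    vx, vy = velocity
    return (gain * vy, -gain * vx)

def dot(a, b):
    """Dot product of two planar vectors."""
    return a[0] * b[0] + a[1] * b[1]

# Forward reach (away from the body) at the target peak speed of 0.288 m/s.
v = (0.0, 0.288)
f_cw = curl_force(v, 20.0)    # clockwise group: force points to the right (+x)
f_ccw = curl_force(v, -20.0)  # counterclockwise group: force points to the left (-x)

# The force is perpendicular to the velocity for either sign of the gain.
print(dot(v, f_cw), dot(v, f_ccw))
print(math.hypot(*f_cw))  # force magnitude = |A| * speed
```

Because the matrix is skew-symmetric, the field does no work on the hand along its direction of travel; it only deflects the path sideways, which is why the initial direction of movement is a natural place to look for its effect.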

[Figure 2 panels: a. Unperturbed baseline; b. Early training; c. Final training; d. After-effects; e. Final washout (stroke subject SA38)]

Fig. 2 Movement paths for a mildly impaired stroke subject in successive phases of the experiment. For clarity, the starting points of the triangle pattern in Fig. 1 were shifted to display all starting points at the center. The ideal trajectories are the bold dotted lines, the average trajectories are represented as bold solid lines, and individual trajectories are thin lines. The subject first performs without force disturbances (a), and then experiences a prolonged training (b and c). The training forces are then turned off intermittently in catch trials to test for after-effects (d). Finally, the subject moves for 120 movements without forces, and the after-effects “wash out.” Results from the final 15 movements of the washout phase are shown in (e). Arrows indicate the movement direction that showed the largest error in the baseline trials. This movement error was amplified by the force field during training, but resulted in a reduction of error following training (d) that was sustained until the end of the experiment (e)

Subjects were also required to take breaks (approximately 1–2 min) after movements 307 and 650 so that our data collection equipment could be reset. The movements in each direction were divided equally in each phase. At all times during the experiment, an additional set of ‘‘background’’ torques was generated to remove the inertial effects of the robot arm linkage, resulting in the feeling of movement on a slippery surface when the force field was not present. Motion and force data were collected at 100 Hz.
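The trial counts given for the protocol can be checked with a short Python sketch that builds a per-trial schedule. The phase sizes come from the text; the rule of placing exactly one catch trial at a random position within each block of eight, the seed, and all names are hypothetical choices for illustration, since the text states only that the field was removed at random for one in eight trials.

```python
import random

# (phase name, number of movements, force mode)
PHASES = [
    ("familiarization", 60, "none"),
    ("baseline", 30, "none"),
    ("learning", 372, "field"),
    ("after-effects", 240, "catch"),  # field on except for catch trials
    ("refresher", 12, "field"),
    ("washout", 120, "none"),
]

def build_schedule(phases, catch_every=8, seed=0):
    """Return one (phase, force_on) tuple per trial."""
    rng = random.Random(seed)
    schedule = []
    for name, n_trials, mode in phases:
        for start in range(0, n_trials, catch_every):
            block = min(catch_every, n_trials - start)
            # Hypothetical rule: one catch trial per block of eight.
            catch_at = rng.randrange(block) if mode == "catch" else None
            for i in range(block):
                force_on = mode != "none" and i != catch_at
                schedule.append((name, force_on))
    return schedule

schedule = build_schedule(PHASES)
n_catch = sum(1 for phase, on in schedule if phase == "after-effects" and not on)
n_learning_on = sum(1 for phase, on in schedule if phase == "learning" and on)
print(len(schedule), n_catch, n_learning_on)
```

Under these assumptions the schedule contains 834 trials in total, and the after-effects phase yields 30 catch trials, which matches the 30-trial blocks (five in each of six directions) used for the statistical comparisons.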

Analysis

We restricted our focus in this study to the earliest parts of movements for two reasons. First, stroke survivors often make excessively large corrections later in their movements that may depend on earlier errors (Krebs et al. 1999a; Beer et al. 2000). Second, we were primarily interested in the early phase of the movement that best reflects the operation of a feedforward controller based on an internal model of the arm/environment dynamics. Our measure, the initial direction error, reflected this early phase of movement by forming a vector from the start point to 25% of the distance to the target (2.5 cm). This corresponded to approximately the first 200–300 ms of movement that, if there were no error corrections at the end of the movement, would last about 1.1 s. Positive error corresponded to a counterclockwise rotation from the actual trajectory to the desired trajectory, and zero corresponded to a straight line to the target. Initial direction error was used for testing our

Fig. 3 Group results for stroke survivors showing error between the actual trajectory and a straight-line movement. Each block of data along the horizontal axis represents a successive phase of the experiment. The training phases in which the force field was applied to the subject are indicated by the light shading in the background. Baseline is compared to after-effects to test our hypothesis that the method shifts trajectories toward the desired, resulting in a significant reduction in error. Shaded areas and dashed lines represent 95% confidence intervals and means, respectively, for the group. Small symbols and vertical thin lines show the individual subject means and 95% confidence intervals. All phases of the experiment that underwent statistical analyses contained 30 trials (5 in each direction). A rest period (approximately 1–2 min) occurred during the learning phase after a second series of 240 trials while data collection equipment was reset

hypotheses on the feedforward controller and also was found to be highly correlated with the perpendicular distance measure used in other adaptation studies (Conditt et al. 1997; Scheidt and Rymer 2000; Thoroughman and Shadmehr 2000). To quantify adaptation, we established the adaptation capacity, defined as the average shift in initial direction from unperturbed baseline trials to the after-effects catch trials. All hypotheses were tested using an alpha level of 0.05. We tested for a shift in initial direction from baseline to after-effects and for the tendency of the after-effects to disappear in the washout phase.
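The two measures defined in this section can be sketched in Python as follows. The sketch assumes that a hand path is a list of (x, y) samples in metres, that the 25% mark is the first sample whose straight-line distance from the start reaches 25% of the start-to-target distance, and that adaptation capacity is the difference of the two phase means; these details and the function names are assumptions for illustration rather than the authors' analysis code.

```python
import math

def initial_direction_error(path, target, fraction=0.25):
    """Signed angle (degrees) from the actual initial direction to the
    desired direction; positive means the desired direction lies
    counterclockwise of the actual one."""
    start = path[0]
    desired = (target[0] - start[0], target[1] - start[1])
    threshold = fraction * math.hypot(*desired)
    for x, y in path[1:]:
        actual = (x - start[0], y - start[1])
        if math.hypot(*actual) >= threshold:
            cross = actual[0] * desired[1] - actual[1] * desired[0]
            inner = actual[0] * desired[0] + actual[1] * desired[1]
            return math.degrees(math.atan2(cross, inner))
    raise ValueError("path never reaches the requested fraction of the distance")

def adaptation_capacity(baseline_errors, catch_errors):
    """Mean initial direction error in catch trials minus the baseline mean."""
    mean = lambda values: sum(values) / len(values)
    return mean(catch_errors) - mean(baseline_errors)

# A 10 cm forward reach whose early path drifts to the right of the target line.
target = (0.0, 0.10)
drifting = [(0.0, 0.0), (0.005, 0.010), (0.010, 0.020), (0.015, 0.030)]
straight = [(0.0, 0.0), (0.0, 0.02), (0.0, 0.04)]
print(initial_direction_error(drifting, target))  # positive: desired is CCW of actual
print(initial_direction_error(straight, target))  # zero for a straight reach
print(adaptation_capacity([2.0, -2.0, 0.0], [20.0, 24.0, 22.0]))
```

When subjects trained with opposite field directions are pooled, the sign of the shift would need to be handled consistently across groups; the sketch leaves the sign as computed.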

Results

We found that both the healthy and the stroke subjects demonstrated a clear ability to adapt when these subjects moved their hands in the force field. The force field

Fig. 4 Group results for the same experimental conditions as in Fig. 3 but performed on a group of healthy subjects

significantly disturbed hand movement (Fig. 2b). Initial direction error did not significantly decrease after 330 movements of practice (Fig. 2; compare b to c). Aftereffects were evident when the forces were removed (Fig. 2d). These results were evident as a group as well (Fig. 3), with a significant shift in initial direction error from baseline to after-effects. As anticipated, we detected no significant change in any subjects’ movement speeds across the phases of the experiment, although peak speeds were slightly but significantly below target for stroke subjects (average of 0.218 m/s, P<0.05) but not for healthy subjects (average of 0.279 m/s). Speeds for some movements were as high as twice the target, resulting in peak forces in these extreme cases as high as 10 N. The maximum speeds 5th, 50th, 95th, and 99th percentiles for stroke patients were: 0.101, 0.201, 0.361, and 0.469 m/s, respectively, and for healthy they were 0.091, 0.255, 0.437, and 0.469 m/s, respectively.
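The relation between the reported speeds and the forces experienced can be made explicit with a brief Python sketch. It assumes only that the field force magnitude equals the gain magnitude (20 N·s/m, from the Methods) multiplied by the hand speed; the dictionary labels are illustrative.

```python
GAIN = 20.0  # N*s/m, magnitude of the field gain given in the Methods

def field_force_magnitude(speed, gain=GAIN):
    """Force magnitude (N) of a velocity-proportional field at a hand speed (m/s)."""
    return abs(gain) * speed

PEAK_SPEEDS = {
    "target peak speed": 0.288,
    "stroke mean peak speed": 0.218,
    "healthy mean peak speed": 0.279,
    "99th percentile (both groups)": 0.469,
}

forces = {label: field_force_magnitude(s) for label, s in PEAK_SPEEDS.items()}
for label, force in forces.items():
    print(f"{label}: {force:.2f} N")
```

At the target peak speed the field produces just under 6 N, and at the 99th-percentile speed of 0.469 m/s it produces about 9.4 N, in line with the statement that peak forces in extreme cases approached 10 N.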

After training, the disturbance was unexpectedly removed and the initial direction error shifted significantly, exhibiting a clear after-effect of adaptation (Fig. 2d). Both stroke and healthy subjects showed a marked shift in their initial direction that was opposite to the direction seen when they were initially exposed to the force field (compare Figs. 3 and 4; compare phases 2–6). On average, the stroke survivors’ limb movements were initially perturbed by the same amount (Fig. 3, Phase 2) as the healthy controls (Fig. 4, Phase 2). In contrast, the after-effect of the stroke subjects was significant (Fig. 3, Phase 6) but it was also significantly smaller than the healthy subjects by about 26% (Fig. 4, Phase 6). All of the healthy subjects’ after-effects were significantly shifted from baseline, and while stroke subjects’ shifts were all in the direction one would expect as an after-effect, only 10 of the 18 had shifts that were statistically significant (P<0.05 in individual t tests).

Stroke subjects as a group, however, showed a significant shift (P<0.05 in a paired t test of subject averages). Note that stroke survivors’ errors during baseline appear similar to those of the healthy subjects (compare Figs. 3 and 4, Phase 1) because these figures display the average overall movement directions. Stroke survivors’ larger errors are revealed only by the larger error bars. Stroke subjects were also more variable within movement directions. Trial-to-trial standard deviations of the initial direction error during the baseline phases were nearly three times higher for the stroke subjects (the average of the stroke subjects’ standard deviations was 16.2° compared to 5.8° for healthy; P<0.001). It is important to stress that the goal of this study was to merely test for adaptation and not to correct the trajectories with a specially designed force field. Therefore, some after-effects were in the wrong direction (i.e., the errors were amplified), and some were shifted beyond a straight path to the target. The observed shift in direction from baseline (unperturbed) to after-effects (also unperturbed) is an excellent indication of motor adaptation. Absence of adaptation would result in a zero shift. In contrast, adaptation capacity, as quantified by the amount of this shift, was well above and significantly greater than zero for stroke survivors (in Fig. 5, the wings indicate a 95% confidence interval). Nevertheless, adaptation capacity was slightly but significantly larger for healthy subjects (Fig. 5, wings). We detected no difference in adaptation capacity between the subjects that received clockwise and the subjects that received counterclockwise forces. Stroke survivors’ changes in performance were not as large as those of the healthy subjects for the same amount of practice (compare Figs. 3 and 4), suggesting that longer training might also result in more persistent adaptation in patients.
To further explore the possibility that individuals with stroke might have a decreased learning rate, we fit each subject’s learning phase data to a simple exponential model, $A + Be^{-t/C}$, where A is an offset, B the amount of learning (the change of the trajectory errors), C the time constant for the error to decrease (inversely related to rate of learning), and t is the movement number. However, we found no significant differences in either the time constant or the amount of learning between the healthy subjects and individuals with stroke. Adaptation capacity was not found to correlate well with the clinical measures of Elbow Modified Ashworth (r² = 0.0013876, P>0.8), Chedoke (r² = 2.7943 × 10⁻⁶, P>0.9), and the upper extremity portion of the Fugl-Meyer (r² = 0.0040592, P>0.8) (Fig. 5b). Hence, it may be difficult to predict a person’s ability to adapt based on these clinical scores. They also fail to show any evidence that the severely impaired individuals lack the ability to adapt.
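One way to fit the exponential learning model is sketched below in Python. The text does not say which fitting procedure was used, so this sketch substitutes a simple approach: a grid search over the time constant C, with A and B obtained by closed-form linear least squares at each candidate. The grid, the synthetic data, the convention that t starts at 0, and the function name are all assumptions.

```python
import math

def fit_exponential(errors, c_grid):
    """Fit errors[t] ~ A + B * exp(-t / C) for t = 0, 1, 2, ...

    Returns (A, B, C, sse) for the grid value of C with the smallest
    sum of squared residuals.
    """
    n = len(errors)
    best = None
    for c in c_grid:
        g = [math.exp(-t / c) for t in range(n)]
        # Ordinary least squares for the line errors = A + B * g.
        sg = sum(g)
        sgg = sum(v * v for v in g)
        sy = sum(errors)
        sgy = sum(gi * yi for gi, yi in zip(g, errors))
        det = n * sgg - sg * sg
        if abs(det) < 1e-12:
            continue  # regressor is numerically constant; skip this C
        b = (n * sgy - sg * sy) / det
        a = (sy - b * sg) / n
        sse = sum((yi - (a + b * gi)) ** 2 for gi, yi in zip(g, errors))
        if best is None or sse < best[3]:
            best = (a, b, c, sse)
    return best

# Synthetic, noise-free learning curve over the 372 learning-phase movements.
true_a, true_b, true_c = 5.0, 25.0, 40.0
data = [true_a + true_b * math.exp(-t / true_c) for t in range(372)]
a, b, c, sse = fit_exponential(data, [float(v) for v in range(5, 201, 5)])
print(a, b, c, sse)
```

A larger fitted C means slower learning, so comparing C and B between groups corresponds to the comparison of learning rate and amount of learning described above.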

While Figs. 3 and 5 indicate a clear and positive answer to the question about whether stroke survivors can adapt, these data do not reveal how the training forces might restore function. In an initial attempt to shed some light on this question, our analysis program identified each subject’s directions of largest error — the directions that could be improved upon the most. During training, these movement errors were either magnified or reduced by the forces (depending on how the force field happened to be pushing for each subject and each direction). Movement directions of largest errors (e.g., movements indicated by the arrows in Fig. 2) were only selected if they had significant error to begin with. For each subject, our software selected and analyzed up to two movement directions. The first was the direction that showed the largest initial error that was reduced by the forces, while the second was the direction that showed the largest initial error that was amplified by the forces. If no significant error was present to begin with, the movement was not considered. To determine the amount of error amplification/reduction, we calculated the dot product between the average training force direction and the average movement error direction (horizontal axis on Fig. 6). Positive values of this dot product indicated that the training forces tended to magnify the error, while negative values indicated that the training forces tended to reduce the error. By the end of the experiment (five final movements in each direction), after the forces had been turned off for 120 movements (final washout phase), we found that the relationship between error magnification and the improvement was significant (vertical axis on Fig. 6). Significant improvements occurred only when the training forces magnified the original errors, and not when the training forces reduced the errors or they were zero [F(1,13)=4.29, P<0.001]. 
Because three of the points were further than two standard deviations from the mean and appeared as outliers (Fig. 6, indicated by small horizontal arrows), we reran the analysis without them, but obtained the same results — error-amplification training resulted in significant improvement and error-reduction training resulted in significant detriment by the end of the experiment. In summary, by restricting our attention only to trials that had significant error to begin with, and then by separating movement directions into error-reducing or error-enhancing training, the evidence suggests that the error-enhancing forces may be more effective than the error-reducing forces for correcting the initial movement direction of the hand.
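The error-magnification index described above can be sketched in Python as the dot product of two unit vectors. The use of planar (x, y) tuples, the example values, and the function names are assumptions for illustration.

```python
import math

def unit(vector):
    """Scale a planar vector to unit length."""
    norm = math.hypot(vector[0], vector[1])
    if norm == 0.0:
        raise ValueError("direction of a zero vector is undefined")
    return (vector[0] / norm, vector[1] / norm)

def error_magnification(mean_force, mean_error):
    """Dot product of the average training force direction and the average
    movement error direction; ranges from -1 to 1."""
    f = unit(mean_force)
    e = unit(mean_error)
    return f[0] * e[0] + f[1] * e[1]

def label(index):
    """Positive index: forces magnify the error; negative: forces reduce it."""
    if index > 0:
        return "magnified"
    if index < 0:
        return "reduced"
    return "neutral"

# A rightward baseline error paired with rightward or leftward training forces.
rightward_error = (0.010, 0.0)
same_side = error_magnification((5.76, 0.0), rightward_error)
opposite_side = error_magnification((-5.76, 0.0), rightward_error)
print(same_side, label(same_side))
print(opposite_side, label(opposite_side))
```

Normalizing both vectors keeps the index independent of how large the force or the error was, so it expresses only how well the two directions align.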

Discussion

This study provides new evidence that stroke survivors retain their ability to adapt their arm movements when they are exposed to an altered mechanical environment (a force field), although at a somewhat diminished level. This is evidenced by the adaptation capacity measure

Fig. 5 a Adaptation capacity, in degrees, for the healthy and the stroke survivor groups, determined by calculating each subject’s average change in initial direction error from the baseline phase to the after-effects phase. Wings represent 95% confidence intervals of the group. Stroke survivors show a large capacity to adapt, but not as strong as the healthy group. b Correlation of the adaptation capacity (horizontal axes) with three clinical scores that were measured in this study. Each dot represents one stroke survivor

(Fig. 5). Moreover, for movement directions that begin with significant errors, significant improvement occurred only when the training forces magnified the original errors (Fig. 6). With regard to the question of whether stroke survivors can adapt, this study confirms earlier studies conducted by others. Raasch and colleagues (Raasch et al. 1997) performed a preliminary study that provided encouraging evidence for adaptation in stroke survivors. In that study, a force was imposed during hand motion that pushed subjects’ hands away from the desired path with a magnitude proportional to hand speed, a protocol similar to that used in this study. A single-joint study by Dancausea and colleagues (Dancausea et al. 2002) demonstrated the ability of stroke survivors to adapt to a spring-like force. Their results were consistent with

ours in that brain-injured individuals needed more trials to diminish errors and their movements were more variable. Takahashi and Reinkensmeyer have also recently published new results on adaptation after stroke (Takahashi and Reinkensmeyer 2003). Although their study found that the ability to adapt was somewhat diminished in stroke survivors, they found that the ability to adapt was loosely correlated to clinical scores such as the Chedoke-McMaster score. We found no such correlation in our group of subjects. Several important differences between our study and theirs may account for the discrepancies in our results. (1) Our study investigated mostly shorter movements. Therefore, a larger proportion of the movement we observed was likely controlled by early feedforward (and hence

[Figure 6 plot: “Performance Improvement vs Enhancement of error” (SA subjects). Vertical axis: improvement in initial direction (degrees). Horizontal axis: direction of average error dotted with direction of average training force. Groups: Training Forces Reduced Errors, Zero Forces, Training Forces Magnified Errors]

Fig. 6 Cross plot of the performance improvement versus error magnification caused by the force field for different movement directions on the stroke survivors. Performance improvement was calculated by measuring the reduction in initial direction error from the baseline phase to the final phase of the experiment. Positive represents an improvement in performance. Error magnification was determined by calculating the dot product between the average training force direction and the average movement error direction. Positive values of this dot product indicated that the training forces tended to magnify the error, while negative values indicated that the training forces tended to reduce the error. Boxes with horizontal centerlines represent the mean and 95% confidence intervals of the three distinct groups: (1) the group in which the error was magnified during training (right), (2) the control group in which error was unchanged because no forces were applied (center), and (3) the group in which error was reduced during training (left). Vertical whiskers extending from the box plots indicate 2 standard deviations from the mean. The diagonal line represents the linear least-squares regression fit of the data shown

internal-model driven) components of control. Takahashi and Reinkensmeyer allowed the movement lengths to vary from subject to subject. Furthermore, in many cases the movements were close to the edge of the subjects’ achievable workspace. These longer and more distal movements may have been affected to a larger degree by factors such as contractures and spasticity that are unrelated to the presence or absence of internal models. We attempted to minimize these factors in our study by targeting the midregion of the arm’s workspace. (2) While Takahashi and Reinkensmeyer compared movement adaptation between the paretic and the

nonparetic limb, we compare adaptation in paretic subjects to adaptation in healthy and age-matched controls. Indeed, individuals who have suffered a stroke do not perform as well with their unimpaired arm as age-matched healthy subjects, but they tend to show the same abilities to learn (Pohl and Winstein 1999). (3) Takahashi and Reinkensmeyer also did not support the arm against gravity. As those authors suggested, the multiaxis force generation constraints that come into play when generating large forces (perhaps even if those large forces are required in only one direction — against gravity), or the inability to produce force at a high enough rate, may limit adaptation in the most weakened subjects. Strength has been shown to be dramatically influenced by elevating the arm against gravity (Beer et al. 1999). The subjects in the present study did not have to overcome this large gravitational force requirement, and thus were operating in a wholly different region in force space. (4)Takahashi and Reinkensmeyer used a maximum force of 3.5 N, while our robotic forces were typically much stronger, reaching amplitudes as high as 15 N. (5) While our study used perturbing forces that depended upon hand velocity, Takahashi and Reinkensmeyer used a time-dependent force with a constant direction (to the side). Nevertheless, it is possible that their forces may have led to a similar adaptation as ours because when subjects adapt to a time-dependent force, they tend to build an internal representation that is not dependent on time (Conditt and Mussa-Ivaldi 1999). (6) Finally, the most provocative difference is that our protocol enabled us to look at how improvement in performance is influenced by the direction in which the training forces are acting. Although this study did not specifically intend to reduce or magnify error, our preliminary evidence suggests that error-magnifying forces may be most effective. Our study also has some important limitations. 
First, the apparatus did not allow motion out of the horizontal plane, and the effects of gravity were minimized at all times by an arm support (Fig. 1). A key motor deficit seen in stroke is the inability of the nervous system to counteract gravity while still making targeted movements (Dewald and Beer 2001). An important extension of our approach would be to test adaptation in the context of 3D activities that require supporting the arm against gravity. A second limitation is that the size and simplicity of the movements may not allow us to extend our conclusions to unconstrained and functionally relevant motions. Larger 3D motions, consistent with the activities of daily living would likely be more relevant to the recovery of the most important motor functions. We expect that these shortcomings will be addressed by using stronger full-dimensional robotic systems combined with more advanced visual display. The present study also uses some subjects as their own controls. While we did not detect any difference in these subjects, their increased exposure to the device might have biased the data. Yet another potential problem was that it was

impossible to keep the three groups of subjects fully balanced in all ways. For example, it is possible that the clockwise force group had subjects with slightly higher spasticity (see Table 1). Another limitation is that we did not control for the resting time. In order to prevent fatigue, intimidation, and attentional loss due to boredom, subjects were free to take rests. It remains to be seen whether rests may play a critical part in the adaptive process. An additional limitation to this study may be that the focus was limited to straightness of the hand path. Making straighter and smoother movements need not be the only goal or the principal goal of a therapy. Optimal functional recovery for these individuals may be something other than healthy-looking movements. Related to this issue is the breakdown of predictive feedforward control and corrective feedback control. While our measure, initial direction error, does a good job of characterizing the path errors early in each movement (first 25% of the distance to target), it does not directly address the time interval typically associated with human feedforward control and the launching of a movement. The time at which the subjects crossed the 25% mark was 0.17±0.04 s (average ± standard deviation), which was significantly different for the healthy subjects at 0.14±0.03 s (two-tailed t test, P<0.05). While these values are still within the range of feedforward control that occurs prior to supraspinal-feedback corrections (0.12–0.18 s) (Schmidt 1988), they do not rule out elements of spinal reflex feedback that can be as fast as 0.03 s (Dewhurst 1967). Therefore, initial direction error should not be fully interpreted as a direct measure of feedforward control error.
The most important limitation of this study is that only a proper prolonged clinical research study with separate groups of stroke survivors (one group receiving prolonged error-enhancing forces, one group receiving error-reducing forces and one group receiving no forces) can appropriately evaluate error enhancement. What is presented here is only compelling preliminary evidence supporting such a strategy. An issue that was not addressed by this study is the likely relation between the neural structures damaged by the stroke and the adaptive performance. Lesion site information was not available for all subjects (Table 1), and in this initial study the data are still too sparse to draw conclusions on the location, extent, and severity of the strokes and how they may have influenced each individual’s results. Adaptability may also be influenced by other factors such as lesion hemisphere, the time since stroke, and the type and dosage of rehabilitation that the subjects received. More data are constantly being acquired, but it may take large numbers of subjects before these factors show any effect. Nevertheless, adaptation was statistically significant in 10 of the 18 individuals with stroke that we tested. While forces that amplified error appeared to benefit, forces that reduced error led to the opposite: after-effects that increased the initial direction error (Fig. 6).

Hence, errors can be decreased or increased with the right transient perturbation. It could be that the nervous system is simply less concerned with movement straightness following stroke. Indeed, Krebs and colleagues (Krebs et al. 1999a) have shown that stroke subjects exhibit jerky and multisegmented movements that tend to coalesce into smoother, straighter movements as recovery progresses. It is also important to speculate on the mechanisms that might allow a stroke subject to decrease directional errors with this paradigm when those errors do not decrease with simple practice alone. Sensory feedback systems may need to detect a stimulus with a magnitude that is large enough to trigger the recovery process. Such distorting interventions that trigger recovery have been shown to be promising in individuals with stroke who suffer from hemispatial neglect (Rossetti et al. 1998). Smaller errors may be imperceptible or considered less important than other aspects of the movement such as getting to the target, conserving energy, or minimizing discomfort. Another possibility is that the nervous system is trying to use motor pathways that are no longer intact, and the learning is a way to trick the nervous system into trying a new and nonintuitive pathway that it would otherwise never consider. Such questions will require further study using imaging, transcranial magnetic stimulation, or implants. A final question not answered in this study is whether beneficial after-effects persist beyond the final 120 movements in which the forces were absent. Preliminary studies in our laboratory on stroke survivors have revealed that after-effects persist when these resemble normal, unperturbed movements (Raasch et al. 1997). In fact, after-effects may become permanent if they are perceived by the subject to be an improvement with respect to the initial behavior. However, not all patients may benefit from this type of procedure.
Subjects who show poor ability to adapt, such as cerebellar stroke survivors, may have great difficulty dealing with resistive techniques (Weiner et al. 1983; Sanes et al. 1990; Bastian et al. 1996). Moreover, chronic stroke survivors suffering from long-term changes in their muscle systems (i.e., atrophy and tissue shortening) also may not benefit from such techniques. While our force fields shifted the central tendencies of the subjects, they did nothing to change motor variability, which is known to be larger in stroke survivors (Fisk and Goodale 1988). Our study revealed that stroke survivors were more variable both within trials, from movement to movement, and across subjects. It is logical that the nervous system would naturally reduce its learning rate when sensory or motor inconsistencies and uncertainties caused by stroke make it difficult to form an exact internal model. Such is the case in artificial systems that learn (Rumelhart and McClelland 1986). Nevertheless, several studies suggest that increasing trial-to-trial variability with externally applied forces does not impair motor learning rate in healthy subjects (Scheidt et al. 2001; Takahashi et al. 2001; Reinkensmeyer et al. 2003). It remains to be seen whether stroke subjects tend to learn more slowly because they are more variable or for some other reason.

Nevertheless, several other studies agree with our results that error augmentation leads to enhanced learning. Learning how to counteract a force disturbance in a walking study increased by approximately 26% when the field was transiently amplified (Emken and Reinkensmeyer 2005). Artificially giving smaller feedback on force production has caused subjects to apply larger forces to compensate (Brewer et al. 2005). Several studies have shown how the nervous system can be "tricked" by giving altered sensory feedback (Flanagan and Rao 1995; Srinivasan and LaMotte 1995; Robles-De-La-Torre and Hayward 2001; Ernst and Banks 2002; Sainburg et al. 2003; Brewer et al. 2004; Kording and Wolpert 2004a). However, augmented feedback on practice conditions has not always proven therapeutically beneficial in stroke (Winstein et al. 1999). It may be that there are limits to the amount of error augmentation that is useful (Kording and Wolpert 2004b; Wei et al. 2005). The implications of successful implicit learning are that one can learn at a nearly subconscious level with minimal attention and with less motivation than more explicit types of practice, like pattern tracing. Training typically requires a balance of repetitive practice, strengthening, expert guidance, and appropriate feedback. We believe that the type of implicit learning demonstrated in this study, one that augments errors, may provide an excellent rehabilitation tool to enhance performance. In fact, explicit information has been shown to disrupt the acquisition of motor skills in participants with stroke but not in healthy controls (Boyd and Winstein 2003; Boyd and Winstein 2004). One might suggest that the changes seen in stroke patients are due to a reduction in tone (spasticity).
Evidence has suggested that repeated muscle stretches may temporarily reduce spastic hypertonia (Schmit et al. 2000) (although we are not aware of this being shown during voluntary movement). Even though a reduction in tone cannot directly explain a benefit from error-augmenting forces (they would reduce the amount of stretch to any muscle, not increase it), one possibility exists where the result of Schmit and colleagues might apply. Since spasticity is velocity-dependent and higher velocities occur well after a movement has been launched, anticipating a spastic muscle action might cause a person to learn to precompensate with a shift early in the movement so that the spastic response carries the limb to the target. For example, a person might launch their outward movements more laterally in anticipation of a later biceps spasm. In this case, exposure to error-augmenting forces would further stretch and increase the excitation of the muscle. Repeating such activity might reduce the spastic hypertonia, as was seen in isometric elbow flexors (Schmit et al. 2000), once the forces stop, making it easier for the subject to aim more directly at the target. However, this explanation only works for some of the outcomes of our study. Subjects that deviate in the opposite direction to this example should have had increased initial direction error as an after-effect, which was not the case. Furthermore, errors during planar movements after stroke are believed to be consistent with a lack of feedforward compensation for interaction torques, not an expression of spastic stretch reflexes (Beer et al. 2000). Hence a "stretching explanation" appears to be unlikely for our results, although it remains to be seen whether spastic reflexes can be reduced for muscle stretches that take place during repeated voluntary movements.

The results of this study have possible implications in rehabilitation because a properly induced adaptive process might be exploited to assist in the restoration of function. Robotic devices, combined with sophisticated and precise computer programs, provide new possibilities for improving and accelerating recovery. One can envision custom-designed, subject-specific force fields generated by a model of the patient's biomechanics and motor impairments (Mussa-Ivaldi and Patton 2000; Patton and Mussa-Ivaldi 2001; Patton and Mussa-Ivaldi 2003). Preliminary testing of this approach has proved successful in some stroke survivors (Patton et al. 2001a). Other objectives such as extending the range of motion would require other approaches. What is clear is that opportunities for recovery after stroke are possible by extending intensive therapy beyond present inpatient rehabilitation stays, and robotic therapy may be one way to economically accomplish this (Fasoli et al. 2004). While error-enhancing forces are useful for inducing adaptive responses, they are likely to be most effective if combined with other rehabilitation strategies. Studies using robotics for rehabilitation, assessment, and training have had some success (Krebs et al. 1998a; Lum et al. 1999). The many paradigms associated with adaptive training may add to the repertoire of possible strategies for rehabilitation. In the search for the optimal method of training among the many possibilities, this preliminary encouraging evidence on error-augmentation points to future studies that exploit the natural adaptive tendencies in the nervous system for restoring function.

Acknowledgements

Supported by AHA 0330411Z, NIH 1 R24 HD39627-01, NINDS R01 NS35673.

Appendix D

A Real-Time Haptic/Graphic Demonstration of how Error Augmentation can Enhance Learning

Yejun Wei and James Patton
Sensory Motor Performance Program, Rehabilitation Institute of Chicago, Northwestern University, Chicago, IL 60611

Preeti Bajaj
Johns Hopkins University, Biomedical Engineering, Baltimore, MD 21210

Abstract – We developed a real-time controller for a 2 degree-of-freedom robotic system using xPC Target. This system was used to investigate how different methods of performance error feedback can lead to faster and more complete motor learning in individuals asked to compensate for a novel visuo-motor transformation (a 30 degree rotation). Four groups of human subjects were asked to reach with their unseen arm to visual targets surrounding a central starting location. A cursor tracking hand motion was provided during each reach. For one group of subjects, deviations from the "ideal" compensatory hand movement (i.e. trajectory errors) were amplified with a gain of 2, whereas another group was provided visual feedback with a gain of 3.1. Yet another group was provided cursor feedback wherein the cursor was rotated by an additional (constant) offset angle. We compared the rates at which the hand paths converged to the steady-state trajectories. Our results demonstrate that error-augmentation can improve the rate and extent of motor learning of visuomotor rotations in healthy subjects. Furthermore, our results suggest that both error amplification and offset-augmentation may facilitate neuro-rehabilitation strategies that restore function in brain injuries such as stroke.

Index Terms – neuro-robotics, error augmentation, xPC Target, motor learning, and visual distortion

Robert Scheidt
Dept. Biomedical Engineering, Marquette University, Milwaukee, WI

I. INTRODUCTION

In recent years, experiments that alter the sensory and motor environment of an individual have explored new and exciting possibilities for tele-assistive teaching and robotically-enhanced rehabilitation techniques. For example, robotic devices can be programmed to provide precise forces that restore a brain injured individual's movement patterns to a healthier pattern [1-4]. However, the use of robotics to promote physical rehabilitation is still in a formative stage, and initial attempts to exploit the intrinsic adaptive capacity of the human sensory-motor system for rehabilitative purposes are ongoing. In a promising study using specially-designed training forces, stroke survivors could make movements they previously could not [5]. This paper presents an initial exploration into the possibilities of a complementary technique, error-augmentation, for facilitating sensory-motor learning.

Several lines of reasoning suggest that augmenting error may enhance motor learning. First, many models and artificial learning systems such as neural networks suggest that error drives learning, so that one can learn more quickly if error is larger [6]. Such error-driven learning processes are believed to be central to adaptation and the acquisition of skill in human movement [7, 8]. Secondly, larger errors are likely to heighten motivation to learn by making the consequence of even small errors seem large. Larger errors are also more noticeable to the senses and hence may trigger responses that would otherwise be lacking. Error augmentation may therefore lead to larger changes in performance. Finally, intensifying error can also lead to larger signal-to-noise ratios for sensory feedback and self-evaluation. One issue is clear from adaptive control and learning models, however: learning may become unstable if gains are too high. Motor variability, sensor inaccuracies and other uncertainties can cause endless overcorrections that do not converge to satisfactory performance. We hypothesized in this study that there was some optimal amount of error augmentation.

Recently we have shown that enhancing error by pushing the arm farther from its intended target can facilitate re-learning of motor commands required to make smooth and straight reaching movements [5]. In that study, stroke survivors experienced training forces that either amplified or reduced their hand path errors. Significant trajectory improvements occurred only when the training forces magnified the original errors, and not when the training forces reduced the errors or were absent. Hence error-enhancing training may be an effective way to promote functional motor recovery for brain injured individuals. Sensory-motor adaptation has been observed when there is a distortion in the mechanical realm [9-11], but is also observed when there is a distortion in the visuomotor realm [12-15]. In fact, visuomotor adaptation can even trigger recovery of sensory disorders such as hemispatial neglect secondary to stroke [16]. Both mechanical and visuomotor adaptation appear to involve similar neural mechanisms [17]. Hence, we restrict our focus in this initial study to the more easily-implemented visuomotor distortions and healthy adult subjects.
While our preliminary results using error-amplification are encouraging, there are a variety of ways to augment or intensify error. Among these the most obvious are linear affine distortions of gain and offset. The first, gain, is the most obvious way to augment error. If subjects are instructed to move in a straight line to a target, a gain of 2 augmentation would mean that any deviation from the straight line would be displayed 2 times that distance from the line. However, recent work on motor learning suggests that there may be a practical limit to gain augmentation.

Supported by American Heart Association 0330411Z , NIH R24 HD39627, NIH 5 RO1 NS 35673, NIH F32HD08658, Whitaker RG010157, NSF BES0238442 and the Falk Trust

Scheidt and colleagues [18] have found that when force is used to disturb motions, subjects incrementally updated their behavior from one movement to the next based on the error they experienced on the most recent attempts. This update was best represented by a transfer function that corresponded to a lead-lag compensator, in which the average value for the pole was 0.322. Inverting this transfer function suggests that a gain of 3.1 is the approximate limit to which gain could be amplified in order to obtain rapid learning without leading to instability. Since there is recent evidence to show that vision and force distortions are linked [17, 19], we tested the limiting gain of 3.1 as well as a more moderate gain of 2 in this experiment even though we focused on visual rather than force distortions. Specifically, we explored the relationship between learning rate and error gain augmentation using these two candidates.

An alternative approach, error-offset augmentation, accentuates error by adding a constant "expected error" to the visual feedback of the hand path. Hence, if a subject's error is 2 cm to the right, they might train with a visual feedback that has a 2 cm bias to the right. Offset-augmentation may prove to be superior to gain for several reasons. Augmentation using offset is more stable than augmentation using gain because the added displacement does not grow with the error. Moreover, in contrast to gain-augmentation, the offset is independent of the size of errors made later in training when the subject is closer to the desired goal. Therefore, offset-augmentation continues to present large errors that continue to motivate learning. One potential problem is that offset error augmentation does not know when to stop: one can over-learn beyond the desired goal. While these theoretical assertions could be made about these candidates for error-augmentation, only experimental tests will truly support their validity.
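To make the stability argument concrete, the following is a minimal sketch of a toy first-order update, not the lead-lag model fitted in [18]. It assumes, purely for illustration, that on each trial the learner removes a fixed fraction of the displayed error, that the display multiplies the true error by the gain, and that this fraction equals 0.322 so that its reciprocal is about 3.1. The function name and parameters are hypothetical.

```python
def simulate_errors(gain, correction_fraction=0.322, initial_error=1.0, n_trials=10):
    """Trial-by-trial error under a toy first-order update (an assumption).

    On each trial the learner removes `correction_fraction` of the
    *displayed* error, and the display shows `gain` times the true error.
    """
    errors = [initial_error]
    for _ in range(n_trials):
        displayed = gain * errors[-1]
        errors.append(errors[-1] - correction_fraction * displayed)
    return errors


if __name__ == "__main__":
    # Compare the unaugmented case, the two tested gains, and a very high gain.
    for gain in (1.0, 2.0, 3.1, 7.0):
        trace = simulate_errors(gain, n_trials=5)
        print(gain, [round(e, 3) for e in trace])
```

In this toy model the error shrinks faster as the gain rises toward 1/0.322, overshoots for larger gains, and grows without bound once the product of gain and correction fraction exceeds 2. The sketch only illustrates why some ceiling on gain is expected; its numerical thresholds should not be read as those derived in [18].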
This paper evaluates several of these candidate strategies for error-augmentation on healthy subjects. The error augmentation conditions studied in this project were a gain of 1 (normal conditions), a gain of 2, an offset, and a gain of 3.1. The goal of the experiment was to determine which error-augmentation condition best facilitated the learning of a visuo-motor distortion. We hypothesized that: 1) Subjects in all groups could adapt to the visual distortion; 2) Error enhancement would be most evident in the case of offset error augmentation; 3) The groups differ in how they are able to generalize what they learned to unpracticed directions of movement. Our results showed encouraging evidence for the use of error augmentation in haptic/graphic systems for robotic teaching, telemanipulation, and rehabilitation.

II. METHODS

A. Experimental Apparatus

The experiment was carried out on a planar manipulandum robot (Fig. 1) driven by two brushed DC torque motors (PMI model JR24M4CH, Kollmorgen Motion Technologies, NY, USA). The motors are capable of delivering forces at the handle via a four-bar linkage. Rotational digital encoders (model 25/045-NB17TA-PPA-QAR1S, Teledyne-Gurley, Troy, NY, USA) reported absolute angular position, and a 6-axis force/torque sensor (Assurance Technologies, Inc., TI F/T Gamma 30/10, Apex, NC, USA) reported the interface kinetics. In this experiment, we only used the motors to remove the modeled inertial effects of the robot, rendering a nearly impedance-free movement of the handle. While seated in front of the robot and holding the robot handle, subjects were instructed to make reaching movements by following cues presented from an LCD projector on a horizontal projection plane (Fig. 1). Vision of the subject's arm was obscured by the projection plane, and hence a wide variety of visual distortions were possible, including the rotational distortion used in the experiments described below.

In previous work, this robot was controlled by a PC running DOS, which acquired the signals and commanded the torques. For this experiment, a real-time control system was developed using MathWorks xPC Target. The control schematic, shown in Fig. 2, illustrates the two PCs: a 'Host' running MATLAB-Simulink and the MS C++ compiler, and an 'xPC-Target' running a real-time kernel. Each separate element (referred to as a model) was developed using MATLAB-Simulink. A low-level Target model (Fig. 2, bottom) was first compiled on the Host PC and then passed to the Target. It consisted of position, I/O, and torque blocks and was dedicated to real-time control at 200 Hz. It also received commands from the Host and broadcast motion and force data to it. The Host model (Fig. 2, top) issued executive commands to the Target, managed the experiment, displayed visual feedback to the user via a calibrated overhead projector, and collected and stored data (100 Hz).
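As a rough illustration of the rotational visual distortion mentioned above, the sketch below rotates the hand position about the starting location before it is displayed as a cursor. The function name, the counter-clockwise sign convention, and the coordinate frame are assumptions made for this example; the source does not specify them.

```python
import math


def rotate_about(point, center, angle_deg):
    """Rotate `point` about `center` by `angle_deg` (counter-clockwise positive)."""
    angle = math.radians(angle_deg)
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return (
        center[0] + dx * math.cos(angle) - dy * math.sin(angle),
        center[1] + dx * math.sin(angle) + dy * math.cos(angle),
    )


if __name__ == "__main__":
    start = (0.0, 0.0)
    hand = (0.0, 0.1)  # hand has moved 0.1 m straight ahead
    # Cursor position shown to the subject under a 30 degree rotation.
    print(rotate_about(hand, start, 30.0))
```

Under such a rotation the cursor stays at the same distance from the start as the hand but moves along a direction 30 degrees away, so a subject must aim 30 degrees the other way to bring the cursor straight to the target.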

Figure 1. Robotic manipulandum and display apparatus used. Subjects’ view of their own arm was blocked by a platform where artificial visual feedback was projected.

The communication between the Target PC and the Host PC was achieved through UDP (User Datagram Protocol), which is a transport protocol layered on top of the Internet Protocol (IP). UDP is lightweight because it uses a 'send-and-forget' strategy, which allows reliable real-time control of the robot to continue even when information fails to arrive within a single sampling period.

B. Subjects

Sixteen neurologically normal adults (22-30 years old) volunteered. The subjects were divided evenly and randomly into four groups. All subjects gave informed written consent in accordance with the ethics committee (Institutional Review Board at Northwestern University). Each subject participated in only one protocol to prevent crossover effects.

C. Experimental Protocol

Subjects were requested to make successive outward reach-and-stop movements to visually displayed targets. Targets were spread evenly along a circle with radius 0.1 m. Return movements to the center point were not analyzed. We controlled for a speed of 0.45 m/s by giving subjects feedback at the end of each movement, using colored dots and auditory tones to let subjects know if they were going too fast, too slow, or within a range of ±0.05 m/s. Consequently, subjects' speeds remained roughly constant across the entire experiment.
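The target layout and speed feedback described above can be sketched as follows. This is a minimal illustration under stated assumptions: three targets (the protocol refers to three directions of movement), an arbitrary first target angle of 90 degrees, and a single speed value per movement. The source does not say which speed measure was used, and the function names are hypothetical.

```python
import math


def target_positions(n_targets=3, radius=0.1, first_angle_deg=90.0, center=(0.0, 0.0)):
    """Place targets evenly around a circle of the given radius (metres)."""
    step = 360.0 / n_targets
    positions = []
    for k in range(n_targets):
        angle = math.radians(first_angle_deg + k * step)
        positions.append((center[0] + radius * math.cos(angle),
                          center[1] + radius * math.sin(angle)))
    return positions


def speed_feedback(speed, desired=0.45, tolerance=0.05):
    """Classify a movement speed (m/s) against the desired speed band."""
    if speed > desired + tolerance:
        return "too fast"
    if speed < desired - tolerance:
        return "too slow"
    return "within range"


if __name__ == "__main__":
    print(target_positions())
    print(speed_feedback(0.47), speed_feedback(0.55), speed_feedback(0.30))
```

Passing a different `first_angle_deg` (for example 120.0 instead of 90.0) yields a target set rotated by thirty degrees, as used in the rotated baseline phase.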

Fig. 2. Control schematics for control using xPC. Two computers operate the robot and display. One manages the experiment, renders a feedback display, and stores data (Host PC); the other is dedicated to reliable control of robotic forces (Target PC).

All four types of error augmentation in this study were derived from a simple affine transformation. That is, the cursor location was moved either by a multiple of the current error vector (a gain, Fig. 3A), or shifted by a constant e0 (an offset, Fig. 3B). A perpendicular vector from the ideal straight-line movement was used to characterize the current error, and that vector was used to alter the position of the cursor for error magnification. The constant e0 was the average initial error, determined for each subject in each of the three possible directions of movement at the beginning of the experiment (in Phase 4, described below). To determine e0, we intermittently exposed each subject to the visual rotation early in the experiment. Note that the two examples in Fig. 3 appear the same. The final location of the cursor appears in the same spot if the subject performs a movement along the path of the average initial error from Phase 4. However, the gain (*2) and offset strategies differ dramatically at other locations. An extreme example is when the subject performs the ideal trajectory. Then the error is zero, so the subject experiencing the gain (*2 or *3.1) will see their trajectory match the desired one. However, the subject experiencing the offset will still perceive an error. Hence, an offset error augmentation does not decrease with learning as the gain error augmentation does.

Fig. 3. Illustration of the error-augmentation strategies (panel A: gain, cursor displayed at 2*e; panel B: offset, cursor displayed at e + e0). The ideal trajectory, appropriate for the rotated environment, is indicated as a dotted line. The trajectory that the subject actually moves along is represented by the thin line. At each instant, the cursor (large dot) is displayed by calculating the current error and either multiplying that error (A) or adding a constant e0 to that error (B), resulting in the trajectory that the subject sees (thick lines).
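The gain and offset transformations described above can be sketched in a few lines. This is an illustration under assumptions, not the authors' Simulink implementation: the error is taken as the signed perpendicular distance of the unaugmented cursor from the straight line between start and target, positive to the left of the movement direction, and the offset is a signed scalar along the same normal. All names are hypothetical.

```python
import math


def augmented_cursor(cursor, start, target, gain=1.0, offset=0.0):
    """Return the displayed cursor after gain and/or offset error augmentation.

    The perpendicular error of `cursor` from the start-to-target line is
    multiplied by `gain` and shifted by `offset`; progress along the line
    is left unchanged.
    """
    dx, dy = target[0] - start[0], target[1] - start[1]
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length  # unit vector along the ideal path
    nx, ny = -uy, ux                   # unit normal, pointing left of the path
    rx, ry = cursor[0] - start[0], cursor[1] - start[1]
    along = rx * ux + ry * uy          # progress toward the target
    error = rx * nx + ry * ny          # signed perpendicular error
    shown = gain * error + offset      # affine transformation of the error
    return (start[0] + along * ux + shown * nx,
            start[1] + along * uy + shown * ny)


if __name__ == "__main__":
    start, target = (0.0, 0.0), (0.0, 0.1)
    cursor = (0.01, 0.05)  # 1 cm to the right of the ideal line
    print(augmented_cursor(cursor, start, target, gain=2.0))
    print(augmented_cursor(cursor, start, target, offset=-0.01))
    # On the ideal line, gain shows no error while offset still shows one.
    print(augmented_cursor((0.0, 0.05), start, target, gain=3.1))
    print(augmented_cursor((0.0, 0.05), start, target, offset=-0.01))
```

The first two calls display the cursor at the same point, which mirrors the observation that gain 2 and offset coincide when the current error equals e0. The last two calls show the difference on the ideal path, where only the offset condition still presents an error.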

For all subjects, the goal was to learn to perform movements to the targets within the allowed range of speeds. All subjects had to do this in the presence of a visual distortion, and three of the four groups were subjected to error augmentation. Implicitly they all made movements that were as straight as possible to the target. The first group of subjects was a control that experienced the visual rotation only, with no error augmentation (essentially a gain of 1). The second group (*2 Group) experienced a gain of 2 as shown in Fig. 3A. The third group (Offset Group) experienced an offset as shown in Fig. 3B. The fourth group (*3.1 Group) experienced a gain of 3.1. Each protocol entailed 12 phases of experimentation that varied only in the values of the gain and offset factors, described below:

1. Familiarization: 15 movements, 5 to each target, to become familiar with the system.
2. Baseline: 15 movements, 5 to each target, with no visual rotation or error enhancement. This established a baseline pattern.
3. Rotated baseline: 15 movements, 5 to each of the targets that were thirty degrees away from those in Phase 2.
4. Initial Exposure: 120 movements. Here, one movement in eight (15 in total, 5 to each target) was made with a 30º rotation of the visual field. There was no error augmentation. The average of these 15 'initial exposure' movements was recorded as e0 as a function of distance from the starting point.
5-7. Early, Intermediate, and Late Learning: In these trials (390 movements in all) the four groups experienced the same visual rotation of thirty degrees, but movements also included error augmentation, dependent upon the group descriptions above. Also during these phases, all four groups experienced 'catch trials' that were randomly presented once every eight movements. During these catch trials their respective error augmentation was removed. Hence for all subjects, these catch trials were the same (30º rotation of the visual field with no error augmentation, occurring at the same movement number). These catch trials were used to monitor and compare learning across all groups.
8. Evaluation: In all 15 of these trials, all subjects experienced the same visual rotation of thirty degrees with no error augmentation. This consisted of 5 movements to each target.
9-12. Early, rotated, middle, and late 'washout': Here, all visual rotations and error enhancements were removed to study how the nervous system de-adapts back to normal behavior. Phase 9 was composed of 10 movements to each target. Phase 10 consisted of 5 movements to each target, but the targets corresponded to the same target locations as in Phase 3. Phases 11 and 12 consisted of 40 movements to each target.

Quite interestingly, our error augmentation approaches were found to successfully enhance learning in several respects. First, the Offset Group learned significantly more than all other groups (Fig. 6, top) (p<0.002). Moreover, error augmentation sped up learning for two of the three augmented groups: learning for the Gain *2 and Offset Groups was significantly faster than for the other groups (Fig. 5, middle two tracings, and Fig. 6, bottom) (p<0.006). Overall in this experiment, the Offset Group learned best in terms of magnitude and speed of learning.
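The catch-trial schedule used in the learning phases (one randomly placed catch trial in every eight movements, at the same movement numbers for all groups) could be generated along the following lines. This is a sketch under assumptions: the source does not say how the random positions were drawn, so one uniformly chosen position per consecutive block of eight is used here, and a shared seed stands in for "the same movement number" across groups. The function name is hypothetical.

```python
import random


def catch_trial_indices(n_movements, block_size=8, seed=None):
    """Pick one random catch-trial index within each consecutive block."""
    rng = random.Random(seed)
    indices = []
    for start in range(0, n_movements, block_size):
        stop = min(start + block_size, n_movements)  # last block may be short
        indices.append(rng.randrange(start, stop))
    return indices


if __name__ == "__main__":
    # 390 learning movements (Phases 5-7); reuse the seed for every group
    # so that all groups receive catch trials at the same movement numbers.
    schedule = catch_trial_indices(390, seed=1)
    print(len(schedule), schedule[:5])
```

Using a fixed seed is only one way to share a schedule; storing the generated list once and reusing it for every subject would serve the same purpose.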

D. Data Analysis

The measure of interest in this study was the change of the trajectory error compared with an ideal, straight-line movement to the target. This ideal closely represented the movements of subjects under normal conditions when there is no distortion or error-augmentation (Fig 1, figures on left side), as observed in previous studies [20, 21]. The trajectory error was defined as the maximum distance (also often called the infinity norm or Chebychev norm) between the actual trajectory and the desired trajectory described above. Other error measures yield similar results. We made four key comparisons among the four groups of subjects: the amount and rate of adaptation, and the amount and rate of washout. We also compared the change after the first catch trial and the extent of generalization among the groups. To assess the learning rate of the adaptation to the rotated visual field, the trajectory errors were grouped into blocks of 5 trials and fit to an exponential curve, $A + Be^{-t/C}$, where A is an offset, B is the amount of learning (the change of the trajectory errors), and C is the rate of learning (time constant for the error to decrease). For the learning fit, data were restricted to the catch trials during learning Phases 5-7; a separate fit was made to the data from the washout phases (9, 11 and 12). Significance for all statistics was assessed using ANOVAs and Tukey post-hoc comparisons at α=0.05.

III. RESULTS

Subjects in all four groups showed evidence of learning. Subjects made curved trajectories when first exposed to the visual distortion (Fig 4, second column of plots), but recovered their ability to produce a straight line by the end of training (Fig 4, third column of plots). When they were returned to normal (un-distorted) conditions, they displayed the characteristic after-effects that curved opposite to those of the initial exposure phase, and these after-effects gradually washed out (Fig 4, rightmost column of plots). Subjects in all four groups presented large after-effects of adaptation, which was strong evidence that adaptation had taken place.
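The block averaging and exponential fit described in the Data Analysis section can be sketched as below. The source does not state which fitting routine was used; this example swaps in a simple grid search over the time constant C combined with closed-form linear least squares for A and B, using only the standard library. Function names and the grid are assumptions.

```python
import math


def block_means(values, block_size=5):
    """Average consecutive blocks of trials (the last block may be shorter)."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    return [sum(block) / len(block) for block in blocks]


def fit_exponential(t, y, c_grid):
    """Fit y = A + B*exp(-t/C) by grid search over C.

    For each candidate C the model is linear in A and B, so those two
    parameters are solved by ordinary least squares in closed form.
    Returns the (A, B, C) with the smallest sum of squared errors.
    """
    n = len(t)
    best = None
    for c in c_grid:
        x = [math.exp(-ti / c) for ti in t]
        mean_x, mean_y = sum(x) / n, sum(y) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0.0:
            continue  # regressor is constant; B cannot be identified
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        b = sxy / sxx
        a = mean_y - b * mean_x
        sse = sum((a + b * xi - yi) ** 2 for xi, yi in zip(x, y))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Synthetic, noise-free errors with A=1, B=4, C=10 (arbitrary units).
    t = list(range(40))
    y = [1.0 + 4.0 * math.exp(-ti / 10.0) for ti in t]
    grid = [0.5 * k for k in range(2, 201)]  # candidate C from 1.0 to 100.0
    print(fit_exponential(t, y, grid))
```

A grid search is coarse but avoids any dependence on starting values; a nonlinear least-squares routine could refine C further if more precision were needed.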

Fig. 4. Representative trajectories of the hand. Each row of plots displays data from a typical subject from each group. Each column represents a critical phase of the experiment. Only the catch trials are shown for the Learning Phase. Red lines indicate the path the subjects should have reached to successfully complete the task.

All subjects de-adapted in about 70 movements after training when the visual distortion was removed. However, we found no significant differences in the magnitude or in rate of de-adaptation (Figure 6, teal bars). We also tested a gain of 3.1 to see if this large amount of gain might lead to more complete learning after a single trial of exposure. We looked at the trials immediately following the first exposure to 3.1, which was designed to be a catch trial, but we found no significant differences among the groups on the improvements following that single initial catch trial. Finally, all groups were ab le to generalize their learning skills well to unpracticed targets, but there was no indication from our data that any one of the groups differed from the others. IV. D ISCUSSION This paper evaluated several candidate strategies regarding error augmentation to investigate how healthy subjects learn. The goal of the experiment was to determine which error-augmentation condition is optimal for learning a visuo- motor distortion. The smaller time constants for gain

Control

*2

Offset

*3.1

Fig. 5. Learning curves for representative subjects in each group. Small dots represent the trajectory error for a movement, and the bold dots represent the mean trajectory error for 5 successive movements in combination. Learning (5-7) and washout (9- 12) phases were each fit to exponential curves (lines). For the learning phases (5-7), only the catch trials are shown because these trials were used for the regression lin es that characterize the rate of change and amount of error reduction. For these catch trials, the conditions were the same for all groups (30º rotation of the visual field with no error augmentation, occurring at the same movement number). Conditions were the normal for all groups during the washout phases (9-12), and hence the regression used all trials (as shown).

of 2 and offset demonstrate that error augmentation can increase the rate of learning. Moreover, the Offset Group in learned significantly more the other groups. The offset condition, while a less intuitive, appears to allow subjects to adapt to the visually rotated environment more efficiently and accurately than the other methods of augmentation tested. As stated previously, the difference between the offset and gain error augmentation condition is that there is a constant error adding to movements in the Offset Group that does not decrease with improvement. Offset offers one other advantage – it is more stable than the gain-based approach, which can have an unwieldy display of errors when the subject makes a large mistake. There are several undesired effects from offset condition as well. The Offset Group’s larger amount of learning may be due to the fact that they were required, in effect, to learn a rotation of as large as 60 degrees. The offset condition delivers visual feedback that always deviates from proprioceptive feedback the same amount -30 degrees in this experiment. This means that offset can

lead to learning beyond the goal, which occur in some trials in this experiment. However, the advantage of Offset is that it may overcome the problem of diminishing returns due to small errors tha t are often seen at the end of the learning process. Therefore a more intelligent implementation may be a ‘scheduled’ mixture of offset and gain, in which the offset factor is extinguished when the subject learns beyond the goal, may be optimal. Our results also demonstrate limits on the effectiveness of a gain augmentation strategy. The gain 3.1 in the experiment did no better than the control (gain 1) and worse than gain 2, possibly because the larger gain may have decreased the relative stability of the adaptation process beyond that which subjects were comfortable, thus causing them to down- regulate their internal feedback gain so that the overall gain approached “normal”. Had they not done so, noise and sensorimotor uncertainty could possibly lead to learning. unstable consequently and overcorrections Consequently, there is likely an optimal gain between 1 and 3.1. Because there is some recent evidence to show that

vision and force distortions are linked [17, 19], our results may not be true for all tasks and contexts. Moreover, error augmentation using sensory feedback such as proprioceptive or haptic forces may be more effective at different gains. Offset, which was not scaled in this experiment, could also have an optimal value. Moreover, some combination of gain and offset strategies may lead to the best possible learning pattern.
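The gain and offset strategies discussed above can be illustrated with a minimal sketch. It assumes that the displayed error is simply the true error multiplied by a gain plus a constant offset, and that a positive error means the subject has not yet reached the goal. The function names, the sign convention, and the rule used for the ‘scheduled’ offset are assumptions for illustration only, not the implementation used in the experiment.

```python
def augmented_error(error_deg, gain=1.0, offset_deg=0.0):
    """Error shown to the subject (degrees), assuming a linear model:
    displayed = gain * true error + constant offset."""
    return gain * error_deg + offset_deg


def scheduled_offset(error_deg, offset_deg):
    """Hypothetical 'scheduled' rule: keep the offset while the subject is
    still short of the goal (error > 0) and extinguish it once the subject
    reaches or learns beyond the goal (error <= 0)."""
    return offset_deg if error_deg > 0.0 else 0.0


if __name__ == "__main__":
    # Gain-based augmentation shrinks as the true error shrinks ...
    print(augmented_error(10.0, gain=2.0), augmented_error(0.0, gain=2.0))
    # ... whereas a constant offset still shows an error at zero true error.
    print(augmented_error(0.0, offset_deg=30.0))
    # Scheduled mixture: the offset is removed once learning passes the goal.
    print(augmented_error(-2.0, gain=2.0, offset_deg=scheduled_offset(-2.0, 30.0)))
```

Under this model the gain term vanishes with the true error, which is the diminishing-returns problem noted above, while the offset term does not, which is why it can drive learning beyond the goal.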

Fig. 6. The amount (top) and time constant (bottom) of error decay during learning (green) and washout (teal) for the four groups. Horizontal lines indicate significant difference between the two groups beneath their tips.

Finally, while our results were obtained on healthy subjects, they have important implications for robotic neurorehabilitation. The results of the experiment bring new possibilities to rehabilitation methods that employ robotics. Increased rates of learning via error augmentation would be quite valuable to therapists. The results found in this experiment support and expand upon those of a recent study [5], which reported that when brain-injured individuals were subjected to error-augmenting vs. error-reducing forces, error augmentation better promoted healthier movement patterns. The experiment affirms that increasing the error perceived by the subject increases the rate of learning. Error augmentation may ‘wake up’ an inattentive nervous system and trigger the recovery process by supplying heightened, magnified sensory feedback about a person’s motor deficit.

ACKNOWLEDGMENTS

REFERENCES

[1] H. I. Krebs, N. Hogan, M. L. Aisen, and B. T. Volpe, "Robot-aided neurorehabilitation," IEEE Transactions on Rehabilitation Engineering, vol. 6, pp. 75-87, 1998.
[2] P. Lum, C. Burgar, P. Shor, M. Majmundar, and M. Van der Loos, "Robot-assisted movement training compared with conventional therapy techniques for the rehabilitation of upper limb motor function following stroke," Archives of Physical Medicine and Rehabilitation, vol. 83, pp. 952-959, 2002.
[3] D. J. Reinkensmeyer, L. E. Kahn, M. Averbuch, A. McKenna-Cole, B. D. Schmit, and W. Z. Rymer, "Understanding and treating arm movement impairment after chronic brain injury: progress with the ARM guide," Journal of Rehabilitation Research & Development, vol. 37, pp. 653-62, 2000.
[4] J. Patton, F. Mussa-Ivaldi, and W. Rymer, "Altering Movement Patterns in Healthy and Brain-Injured Subjects Via Custom Designed Robotic Forces," presented at EMBC2001, the 23rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), Istanbul, Turkey, 2001.
[5] J. L. Patton, M. E. Phillips-Stoykov, M. Stojakovich, and F. A. Mussa-Ivaldi, "Evaluation of robotic training forces that either enhance or reduce error in chronic hemiparetic stroke survivors," Experimental Brain Research, accepted pending revisions, 2004.
[6] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature (London), vol. 323, pp. 533-536, 1986.
[7] M. Kawato, "Feedback-error-learning neural network for supervised learning," in Advanced neural computers, R. Eckmiller, Ed. Amsterdam: North-Holland, 1990, pp. 365-372.
[8] D. M. Wolpert, Z. Ghahramani, and M. I. Jordan, "An internal model for sensorimotor integration," Science, vol. 269, pp. 1880-2, 1995.
[9] O. Bock, "Load compensation in human goal-directed arm movements," Behavioral Brain Research, vol. 41, pp. 167-177, 1990.
[10] J. R. Lackner and P. DiZio, "Rapid adaptation to Coriolis force perturbations of arm trajectories," Journal of Neurophysiology, vol. 72, pp. 299-313, 1994.
[11] R. Shadmehr and F. A. Mussa-Ivaldi, "Adaptive representation of dynamics during learning of a motor task," Journal of Neuroscience, vol. 14, pp. 3208-3224, 1994.
[12] F. A. Miles and B. B. Eighmy, "Long-term adaptive changes in primate vestibuloocular reflex I: behavioral observations," Journal of Neurophysiology, vol. 43, pp. 1406-1425, 1980.
[13] J. R. Flanagan and A. K. Rao, "Trajectory adaptation to a nonlinear visuomotor transformation: evidence of motion planning in visually perceived space," Journal of Neurophysiology, vol. 74, pp. 2174-8, 1995.
[14] Z. M. Pine, J. W. Krakauer, J. Gordon, and C. Ghez, "Learning of scaling factors and reference axes for reaching movements," Neuroreport, vol. 7, pp. 2357-61, 1996.
[15] H. Imamizu, T. Kuroda, S. Miyauchi, T. Yoshioka, and M. Kawato, "Modular organization of internal models of tools in the human cerebellum," Proceedings of the National Academy of Science, vol. 100, pp. 5461-5466, 2003.
[16] Y. Rossetti, G. Rode, L. Pisella, A. Farne, L. Li, D. Boisson, and M. T. Perenin, "Prism adaptation to a rightward optical deviation rehabilitates left hemispatial neglect," Nature, vol. 395, pp. 166-9, 1998.
[17] C. Tong, D. M. Wolpert, and J. R. Flanagan, "Kinematics and dynamics are not represented independently in motor working memory: evidence from an interference study," Journal of Neuroscience, vol. 22, pp. 1108-13, 2002.
[18] R. A. Scheidt, J. B. Dingwell, and F. A. Mussa-Ivaldi, "Learning to move amid uncertainty," J Neurophysiol, vol. 86, pp. 971-85, 2001.
[19] Y. Wei and J. L. Patton, "Forces that Supplement Visuomotor Learning: A 'Sensory Crossover' Experiment," Experimental Brain Research, conditionally accepted pending revisions, 2004.
[20] P. Morasso, "Spatial control of arm movements," Experimental Brain Research, vol. 42, pp. 223-227, 1981.
[21] T. Flash and N. Hogan, "The coordination of arm movements: An experimentally confirmed mathematical model," Journal of Neuroscience, vol. 5, pp. 1688-1703, 1985.


Appendix E

Haptic Identification of Surfaces as Fields of Force

J Neurophysiol 95: 1068-1077, 2006. First published October 5, 2005; doi:10.1152/jn.00610.2005.

Vikram S. Chib,1,2,3 James L. Patton,1,2 Kevin M. Lynch,3 and Ferdinando A. Mussa-Ivaldi1,2,4

1Sensory Motor Performance Program, Rehabilitation Institute of Chicago; and 2Department of Biomedical Engineering, 3Laboratory for Intelligent Mechanical Systems, Department of Mechanical Engineering, and 4Department of Physiology, Northwestern University, Chicago, Illinois

Submitted 13 June 2005; accepted in final form 29 September 2005

Chib, Vikram S., James L. Patton, Kevin M. Lynch, and Ferdinando A. Mussa-Ivaldi. Haptic identification of surfaces as fields of force. J Neurophysiol 95: 1068-1077, 2006. First published October 5, 2005; doi:10.1152/jn.00610.2005. The ability to discriminate an object’s shape and mechanical properties from touch is one of the most fundamental somatosensory functions. When exploring physical properties of an object, such as stiffness and curvature, humans probe the object’s surface and obtain information from the many sensory receptors in their upper limbs. This sensory information is critical for the guidance of actions. We studied how humans acquire an internal representation of the shape and mechanical properties of surfaces and how this information affects the execution of trajectories over the surface. Experiments involved subjects executing trajectories while holding a planar manipulandum that renders planar virtual objects with variable shape and mechanical properties. Subjects were instructed to make reaching movements with the hand between points on the boundary of a curved virtual disk of varying stiffness and curvature. The results suggest two classifications of adaptive responses: force perturbations and object boundaries. In the first case, a rectilinear hand movement is enforced by opposing the interaction forces. In the second case, the trajectory conforms to the object boundary so as to reduce interaction forces. While this dichotomy is evident for very rigid and very soft objects, the likelihood of an object boundary classification depended, in a smooth and monotonic way, on the average force experienced during the initial movements. Furthermore, the observed response across a variety of stiffness values led to a constant average interaction force after adaptation. This suggests that the nervous system may select from the two responses through a mechanism that attempts to establish a constant interaction force.

Address for reprint requests and other correspondence: V. S. Chib, 345 East Superior St., Suite 1406, Chicago, IL 60611 (E-mail: [email protected]).

INTRODUCTION

Studies have been performed to understand how humans perceive shape through active touch. Kappers et al. (1994) found that we are capable of learning and distinguishing slight differences in the curvature of various surfaces. Further study of actively touched curved surfaces has shown that adaptation and after effects are present after haptic exploration (Vogels et al. 1996). These after effects are manifested as flat surfaces being judged as convex after touching of a concave surface and flat surfaces being judged as concave after touching of a convex surface. Haptic after effects increase with the time of contact with the curved surface and decrease with the time elapsed between the touching of two different surfaces (Vogels et al. 2001). The stiffness, or degree of rigidity, of an object is critically important for manipulation. Psychophysical studies have been performed to determine thresholds for stiffness discrimination (Jones and Hunter 1990). These studies used a contralateral

limb matching procedure, in which subjects adjusted the stiffness of a motor connected to one arm until it was perceived to be the same as that connected to the other arm. These studies concluded that the sensitivity of stiffness discrimination was much worse than would be expected by combining the sensitivities for force and displacement discrimination. Rigid objects, such as walls and table tops, are characterized by high-impedance boundaries. When the hand comes in contact with these objects, there is minimal or negligible penetration inside the boundary, regardless of the force applied to the object boundary. Other objects, such as pillows and computer keyboards, respond to applied forces with larger displacements. In mechanical terms, these different behaviors are captured by describing objects as fields of position-dependent forces. Stiffness is the factor that characterizes how an object responds to a given displacement at its surface. In this study, we investigate interactions between the hand and objects as interactions of the hand with external force fields. Unperturbed free reaching movements have been studied extensively (Flash and Hogan 1985; Hogan 1984; Morasso 1981). More recently, there has been a growing body of investigation concerning the effect of deterministic force fields on free reaching movements of the arm (Flash and Gurevich 1992; Gandolfo et al. 1996; Matsuoka 1998; Shadmehr and Mussa-Ivaldi 1994; Thoroughman and Shadmehr 2000). However, the idea of using similar experimental paradigms for studying the interaction with object boundaries has not yet been explored. Studies of free reaching movements in a field of forces have highlighted the existence of adaptive mechanisms that tend to restore straight-line movements and the kinematics of the unperturbed hand motion. This makes sense if one indeed considers the external forces as a perturbation to be rejected. 
However, the same mechanism would not be appropriate when the hand encounters an unexpected hard surface in a dark room. In this case, “fighting the field” would be pointless and painful. One would rather comply and modify the trajectory to avoid the wall or to move smoothly over its surface. While these considerations are self-evident, a question arises as to what mechanical conditions would lead the motor system to react to a force field as a disturbance to overcome or as a boundary to comply with. Here, we address this question together with some corollary issues. Is the dichotomy between the representation of fields as disturbances or boundaries the outcome of an adaptive process? Is the dichotomy itself characterized by a sharp, “two state” transition, like the Necker cube illusion and other perceptual dichotomies? Is there a

critical value in the local variation of the force field (i.e., the “stiffness”) that the motor system uses to discriminate boundaries from disturbances? We have addressed these questions by observing the effect on hand movements of force fields that emulated a circular planar object of variable stiffness and curvature. Our findings show the existence of both compensatory and compliant responses occurring at different values of surface stiffness. The lowest values of stiffness lead to compensatory responses in which a free space trajectory is recovered, whereas the highest values lead to compliant responses. However, the experiments also revealed a continuum of motor responses, rather than a sharp transition. A smooth transition from compensation to compliance appeared across a continuum of stiffness levels. Objects with different stiffness induced, as expected, different levels of interaction forces. However, a striking effect of practice was a strong tendency toward a fixed average interaction force that did not depend on the object’s stiffness. Thus we observed either compensatory responses or compliant adaptive responses depending on the relation of the initial interaction force to this final fixed level of interaction force. Taken together, these studies reveal a mechanism of adaptation that may subserve the implicit learning of object shapes through repeated mechanical interactions.

METHODS

Experimental apparatus

Experiments were performed using a two-degrees-of-freedom planar manipulandum as seen in Fig. 1. Subjects made goal-directed movements in the plane of the manipulandum while grasping its handle. The manipulandum was similar to those previously described (Mussa-Ivaldi and Bizzi 2000; Shadmehr and Mussa-Ivaldi 1994). It was instrumented with joint encoders that report the joint angles at a frequency of 100 Hz. Position and velocity of the manipulandum handle were computed from these encoder signals. The manipulandum was also equipped with two torque motors that generated the force fields corresponding to the virtual object. Endpoint forces were acquired using a six-degrees-of-freedom load cell fixed to the handle of the robot. Visual feedback was given to subjects through a projection system. This system displayed a cursor registered to the movement of the manipulandum handle, as well as start and goal positions, to prompt subjects’ movements. The cursor and visual cues were presented through an LCD projector projecting on a horizontal plane. Vision of the subjects’ arm was obscured by the projection plane.

Force fields

The force fields experienced by subjects were computed in real time using the formula

$$f = \begin{cases} K(R - r) + b\dot{r}, & r < R \\ 0, & r \ge R \end{cases} \qquad (1)$$

This expression defines a circular, viscoelastic force field in polar coordinates, where f is the magnitude of interface force produced by the robot, R is the radius of the virtual disk, r is the distance of the handle from the center of the disk, K is a spring constant, and b is a damping constant. The interface force is always directed radially away from the center of the disk. Damping is added to alleviate instabilities encountered at higher stiffnesses.

Subjects

Twenty-two naive, healthy volunteers (age range, 18-35 yr) participated in this study after giving informed consent in accordance with the standards of the Institutional Review Board of Northwestern University. All subjects were right-handed and had normal vision or vision that was corrected to normal. Subjects were divided into two groups. The first experimental group (low-curvature group), consisting of nine subjects, was prompted to make reaching movements in the presence of a virtual surface of curvature 15 m⁻¹ (R = 6.5 cm). The second group of nine subjects (high-curvature group) was prompted to make movements in the presence of a virtual surface of curvature 20 m⁻¹ (R = 5.0 cm). To evaluate memory effects related to order of stiffness presentation, a group of four subjects was used (memory effect group). These subjects were presented surfaces in order of descending stiffness, as opposed to ascending order of stiffness presentation in the case of the low- and high-curvature groups.
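For readers who wish to experiment with the viscoelastic field of Eq. 1, the following minimal sketch computes the force magnitude and its Cartesian components for a handle position and velocity. The function names, the damping value used in the example, and the handling of the exact disk center (where the radial direction is undefined) are assumptions; the sign of the damping term follows Eq. 1 as printed, and the sketch is not the robot controller used in the study.

```python
import math


def interface_force(r, r_dot, K, R, b):
    """Force magnitude of Eq. 1 (N): spring-damper inside the disk, zero outside.
    r, R in m; r_dot in m/s; K in N/m; b in N*s/m."""
    if r < R:
        return K * (R - r) + b * r_dot
    return 0.0


def force_vector(x, y, vx, vy, K, R, b, center=(0.0, 0.0)):
    """Cartesian force (fx, fy), directed radially away from the disk center."""
    dx, dy = x - center[0], y - center[1]
    r = math.hypot(dx, dy)
    if r == 0.0:
        # Radial direction is undefined at the center; returning zero is an assumption.
        return 0.0, 0.0
    ux, uy = dx / r, dy / r          # unit vector pointing away from the center
    r_dot = vx * ux + vy * uy        # radial component of the handle velocity
    f = interface_force(r, r_dot, K, R, b)
    return f * ux, f * uy


if __name__ == "__main__":
    # Handle 1 cm inside a disk of R = 6.5 cm, K = 800 N/m, hypothetical b = 5 N*s/m.
    print(force_vector(0.055, 0.0, 0.0, 0.0, K=800.0, R=0.065, b=5.0))
```

With the handle at rest 1 cm inside the boundary, the elastic term alone gives roughly 8 N for K = 800 N/m, which conveys the scale of forces associated with the stiffness levels used in the protocol.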

Experimental protocol

FIG. 1. Schematic of manipulandum and dimensions of the position-dependent force fields.

Subjects made goal-directed reaching movements from a start target to a goal target. Their arms were supported against gravity and constrained to the plane of movement of the manipulandum by a low-inertia arm support. During a trial, a target was projected onto the subject’s workspace, and the subject was asked to make one continuous movement to place a cursor registered to the manipulandum handle within the target, while achieving a desired maximum velocity. The next target appeared after the subject held the cursor at the prior target for 1 s. Subjects were given feedback as to whether they moved faster or slower than the desired maximum velocity. The optimal speed of movement was specified before each experiment. When subjects achieved a maximum velocity >5% faster than the desired velocity, the target turned green. If the target was reached with a maximum velocity >5% slower than the desired velocity, the target turned blue. When the target was reached within the desired maximum velocity range, the target was animated to “explode,” and a distinctive sound was presented to reinforce the perception of a successful

movement. These feedback cues allowed subjects to achieve a consistent maximum speed of movement. Before the introduction of force fields, subjects practiced making point-to-point movements under the required velocity constraints, in the absence of a virtual object, for 60 movements. To assess the typical performance of a subject, undisturbed in free space, objects were not introduced during this baseline unperturbed phase. This phase of the experiment allowed subjects to familiarize themselves with the passive dynamics of the manipulandum. After the baseline unperturbed phase, virtual objects were presented to the subject. Subjects were only provided with a haptic rendering of the virtual object; visual information regarding the geometry of the object was not presented. The dimensions of the virtual objects are shown schematically in Fig. 1. A testing phase consisted of the subject moving between targets located on the boundary of the virtual object. Subjects made 100 reaching movements between the presented start and goal positions. The first 50 movements of a testing phase served as a practice period for the subject to acquire information about the virtual surface. During the final 50 movements of the testing phase, catch trials (movements during which, unexpectedly, no force field was present) were introduced pseudorandomly for 12.5% of the movements. These catch trials were introduced to reveal any adaptations of the feedforward motor command that may have occurred after training with the virtual object (Shadmehr and Mussa-Ivaldi 1994). After completion of the phase consisting of 100 movements with the virtual object, a “wash-out” phase, consisting of 50 movements in a null field, was introduced. This phase allowed for de-adaptation and unlearning of the field encountered during the previous phase. Six different stiffness levels were tested (K = 200, 400, 800, 1,200, 1,600, 2,000 N/m). The stiffness levels were presented in order of increasing magnitude.
One group of subjects (low-curvature group) was exposed to the various stiffness levels with a virtual disk of curvature 15 m⁻¹ (R = 6.5 cm), whereas a second group (high-curvature group) was exposed to the same stiffness levels with a virtual disk of curvature 20 m⁻¹ (R = 5 cm). The experimental protocol and instructions were exactly the same for both groups; the only difference between the experiments was the curvature of the fields presented to subjects. Field stiffness was presented in ascending order after preliminary studies showed a marked memory effect when surfaces of high stiffness were presented before those of lower stiffness. When subjects were presented with a high stiffness field that clearly revealed the shape of the boundary, they showed a marked tendency to identify the boundaries of subsequent low stiffness fields. That is, the identification of a rigid boundary tended to persist in lower stiffness fields. This finding is presented at the end of RESULTS. Because the main interest of this study was to find the minimum stiffness that would lead to identification of a boundary and to determine whether such a value has the property of a threshold leading to an abrupt change in behavior, we gradually increased the presented stiffness levels in search of such a threshold.
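The speed-feedback rule described in the protocol above (green when more than 5% too fast, blue when more than 5% too slow, an "explode" animation otherwise) can be sketched as follows. The function name, the returned labels, and the desired velocity in the example are hypothetical; the source does not state the target speed.

```python
def speed_feedback(peak_velocity, desired_velocity, tolerance=0.05):
    """Classify a reach by its maximum velocity relative to the desired value.
    Returns 'green' (too fast), 'blue' (too slow), or 'explode' (within range)."""
    if peak_velocity > desired_velocity * (1.0 + tolerance):
        return "green"
    if peak_velocity < desired_velocity * (1.0 - tolerance):
        return "blue"
    return "explode"


if __name__ == "__main__":
    desired = 0.40  # m/s, hypothetical target speed for illustration only
    for peak in (0.30, 0.40, 0.50):
        print(peak, speed_feedback(peak, desired))
```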

Trajectory analysis

Two different measures were used to quantify subjects’ response to virtual objects and their subsequent learning.

AREA REACHING DEVIATION. The measure of area reaching deviation (ARD) was used to evaluate a subject’s deviation from a straight-line path. This measure was defined as the signed area between the trial path and a reference straight-line path between the start and goal positions. Paths to the right of the reference straight-line path yielded positive ARD, whereas those to the left yielded negative ARD. If the trajectory is monotonic in y, the signed area reaching deviation can be computed by

$$A = \int_{y_i}^{y_f} x \, dy \qquad (2)$$
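As an illustration of Eq. 2, the signed area can be approximated from sampled handle positions with the trapezoidal rule. The sketch assumes the samples are expressed in a frame where the reference straight line runs along the y axis at x = 0, with the movement in the +y direction and positive x to the right; the function name and this choice of frame are assumptions.

```python
def area_reaching_deviation(xs, ys):
    """Trapezoidal approximation of A = integral of x dy (Eq. 2), in m^2.
    xs: lateral deviation from the reference line (m); ys: position along it (m)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    area = 0.0
    for k in range(1, len(xs)):
        area += 0.5 * (xs[k] + xs[k - 1]) * (ys[k] - ys[k - 1])
    return area


if __name__ == "__main__":
    # Triangular path bulging 2 cm to the right over a 10 cm reach.
    print(area_reaching_deviation([0.0, 0.02, 0.0], [0.0, 0.05, 0.10]))
    # The mirror-image path to the left gives the same area with opposite sign.
    print(area_reaching_deviation([0.0, -0.02, 0.0], [0.0, 0.05, 0.10]))
```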

AVERAGE INTERFACE FORCE. The measure of interface force was used to evaluate the forces imposed by the virtual object during subjects’ movements. Force measures were calculated using subjects’ position and velocity signals and Eq. 1. The calculated force values were integrated over the duration of the movement to acquire a resulting force cost (Eq. 3) for an entire reaching movement. This measure expresses the forces imposed by the environment and not the forces produced by the subject

$$\bar{F} = \frac{\int_{t_i}^{t_f} \lVert f \rVert \, dt}{t_f - t_i} \qquad (3)$$
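Eq. 3 can be approximated from sampled data in the same way: integrate the force magnitude over the movement and divide by its duration. This is a sketch under the assumption that force magnitudes have already been computed from Eq. 1 at each sample time; the function name is hypothetical.

```python
def average_interface_force(ts, forces):
    """Time-averaged force magnitude (Eq. 3), in N, by the trapezoidal rule.
    ts: sample times (s), strictly increasing; forces: force values (N)."""
    if len(ts) != len(forces) or len(ts) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    integral = 0.0
    for k in range(1, len(ts)):
        integral += 0.5 * (abs(forces[k]) + abs(forces[k - 1])) * (ts[k] - ts[k - 1])
    return integral / (ts[-1] - ts[0])


if __name__ == "__main__":
    # A force that ramps up to 2 N and back down over a 1 s movement.
    print(average_interface_force([0.0, 0.5, 1.0], [0.0, 2.0, 0.0]))
```

At the 100 Hz sampling rate reported for the joint encoders, successive samples would be 10 ms apart, so the trapezoidal rule should track the integral closely for smooth reaching movements.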

Psychometric function

A common means of quantifying a subject’s performance of a psychophysical task is the fitting of a psychometric function (Wichmann and Hill 2001a). The psychometric function relates an observer’s performance of a psychophysical task to some physical aspect of the stimulus. For these experiments, the performance metric used was the sign of the ARD. A two-alternative paradigm was implemented for the catch trials performed at each stiffness level. Catch trials having a negative ARD were classified as perception of a field, whereas those having a positive ARD were classified as perception of an object boundary or surface. Subjects’ results were compiled into a single group measure for each stiffness level. This measure was expressed as the proportion of positive surface responses at each stiffness level. The general form of the psychometric function is

$$\psi(x; \alpha, \beta, \gamma, \lambda) = \gamma + (1 - \gamma - \lambda)\, F(x; \alpha, \beta) \qquad (4)$$
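Eq. 4 can be evaluated directly once F is chosen. The sketch below uses a cumulative Gaussian for F, the choice reported for these experiments, and assumes that α is its mean and β its standard deviation; that parameterization, the function names, and the example parameter values are assumptions rather than the fitted values from the study.

```python
import math


def cumulative_gaussian(x, alpha, beta):
    """F(x; alpha, beta): normal CDF with mean alpha and standard deviation beta."""
    return 0.5 * (1.0 + math.erf((x - alpha) / (beta * math.sqrt(2.0))))


def psychometric(x, alpha, beta, gamma=0.0, lam=0.0):
    """Eq. 4: psi = gamma + (1 - gamma - lam) * F(x; alpha, beta)."""
    return gamma + (1.0 - gamma - lam) * cumulative_gaussian(x, alpha, beta)


if __name__ == "__main__":
    # Hypothetical parameters: 50% point at 1,000 N/m, spread of 300 N/m.
    for stiffness in (200, 400, 800, 1200, 1600, 2000):
        print(stiffness, round(psychometric(stiffness, alpha=1000.0, beta=300.0), 3))
```

With γ = λ = 0 the curve runs from 0 to 1 and crosses 0.5 at x = α, so α can be read as the stiffness at which a surface classification becomes as likely as a field classification.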

The shape of the curve is determined by the parameters (α, β, λ) and the choice of a two-parameter function F, which is typically a sigmoid function. For these experiments, a cumulative Gaussian was used for F. From the defined range, it follows that the parameter γ gives the lower bound of the curve, which can be interpreted as the base rate of performance in the absence of a signal. The upper bound of the curve, the performance level for an arbitrarily large stimulus level, is given as 1 − λ. For these experiments, λ = 0, because it was found that as stiffness increased subjects had a greater propensity to discriminate a surface. Between the two bounds, the shape of the curve is determined by α and β. Bootstrapping was used to generate 95% CIs for the resulting psychometric functions. These CIs represent the variability in subjects’ probability of perceiving a surface given the independent variable of surface stiffness. The bootstrap method is a Monte Carlo resampling technique relying on a large number of simulated repetitions of the original experiment. Simulated repetitions of the experiment are obtained by repetitively resampling subsamples of the data. Bootstrap methods are especially well suited for analysis of psychophysical data because their accuracy does not rely on a large number of trials, as do methods from conventional statistical asymptotic theory (Wichmann and Hill 2001b). To construct 95% CIs for psychometric functions obtained during this study, random combinations of subjects’ psychometric functions were sampled with replacement 1,000 times.

RESULTS

Adaptation to virtual objects of varying stiffness

A typical set of movement trajectories for a subject in the low-curvature group, with different field stiffness and at different stages of learning (early exposure, mid exposure, and late exposure), is shown in Fig. 2. At all levels of stiffness, during the early exposure phase, the effect of the virtual object


FIG. 2. Low-curvature group trajectories. The 1st 6 trajectories from various stages of adaptation for a representative subject. Green squares represent the start position; red circles represent the goal position.

on the hand trajectory was quite significant. The time-course of movements during the early exposure phase can be divided into two parts. During the first part, the hand was driven off course by the field and forced away from a straight-line trajectory. During the second part of the movement, after the force field of the virtual object had caused the hand to veer off course from the target, subjects made a second movement back toward the target. At low stiffness values (200 and 400 N/m), after adaptation, subjects produced straight-line movements through the field (Fig. 2). Repeated practice of movements with high-stiffness virtual disks (K = 800, 1,200, 1,600, and 2,000 N/m) resulted in a markedly different adaptation. As with low stiffness disks, initial exposure to the field resulted in a two-segment movement, with the first portion corresponding to the hand being perturbed by the force field and forced away from the boundary of the virtual object and the second portion corresponding to a recovery to the goal position (Fig. 2). This qualitative pattern of initial responses was remarkably similar at all field strengths. However, unlike adaptation to low stiffness fields, adaptation to high stiffness fields did not result in subjects recovering straight-line movements. During mid and late exposure, subjects produced movements that followed the virtual surface. Data from the washout periods after force field learning indicate that the effects of force field adaptation were completely suppressed by the end of the 50 trial washout periods (Fig. 3). Washout blocks were presented between the presentations of fields of different stiffness. During washout blocks, subjects’ level of ARD converged to zero, which is consistent with the production of straight line trajectories after movements in the null field. This finding does not exclude the possibility that subjects maintained some memory trace of earlier exposure and that this memory effect could lead to a slowly changing perception of stiffness. Furthermore, even a more prolonged period of rest would not be adequate to extinguish that effect, because some memory of experienced force fields has been documented to persist for weeks (Shadmehr and Brashers-Krug 1997). However, if present, this memory did not have an impact on the baseline movements before each block of trials. These qualitative observations of subjects’ adaptations to virtual objects of varying stiffness are quantified and summarized for all subjects in Fig. 4A. As previously described, with low stiffness values (200 and 400 N/m), after adaptation, subjects produced straight-line movements through the field (Fig. 2). This result was captured by the ARD before and at the end of practice. ARD measures the area spanned by a subject’s deviation from a straight path (see METHODS). For virtual surfaces of 200 and 400 N/m, ARD before and at the end of practice was significantly different (P < 0.05; Fig. 4A). At these low-stiffness values, ARD was reduced to nearly zero, indicating that subjects attempted to produce straight-line movements and succeeded. At the high stiffness levels, ARD was not significantly reduced after learning. The data in Fig. 4A show that the measure of ARD increases as the stiffness level increases.
Thus, as stiffness was increased, subjects began to produce trajectories that conformed to the boundary of the virtual object, as opposed to actively counteracting the forces generated by the virtual object. The shapes of the adapted trajectories varied with the levels of stiffness (Fig. 2, late field exposure). This was also reflected by the pattern of ARD (Fig. 4A). However, a unifying feature among the adapted movements is the average interface force. Through the continuum of stiffness levels, subjects tended to produce similar average interface forces after adaptation (Fig. 4B). A two-factor ANOVA did not find a significant difference among average interface force for the six stiffness levels for subjects in the low-curvature group after adaptation had occurred (F6,5 = 0.187; P = 0.966). After effects from adaptation to the low-stiffness field were observed during catch trials at the end of training. During catch trials, the virtual disk was unexpectedly removed from the path, and subjects made movements in an unperturbed environment. These after effects are mirror images of the

FIG. 3. Learning curve for the low-curvature group. White blocks represent null field presentation, dark gray blocks represent force field presentation, and light gray blocks represent a phase of pseudorandom catch trials.

FIG. 4. Learning data for the low-curvature group. Measures shown are an average of the 1st and last 5 trials averaged across all subjects. A: area reaching deviation (ARD). B: interface force.

responses to the initial field exposure (Fig. 2). This suggests that, in the presence of low-stiffness position-dependent force fields, subjects adapted by developing an internal representation of the field. The internal representation predicted and canceled the forces of the virtual disk. At higher stiffness levels, this was not the case. Once a stiffness threshold was exceeded, subjects’ after effects appeared in the direction of the applied forces and following the profile of the virtual surface. The amount by which the after effects appeared in the direction of the applied forces of the virtual surface and away from a straight line trajectory increased as the stiffness of the surface increased. Group results of ARD for catch trials showed a gradual transition between negative after effects at low stiffness, which indicates that subjects produced a compensatory response opposing the forces associated with the virtual disk, and positive after effects at high stiffness, which indicates that subjects produced a response conforming to the boundary of the virtual surface (Fig. 5A). We used the measure of ARD to derive a binary classification of force fields as either objects or disturbances. Specifically, we used catch trials leading to a positive ARD to classify subjects’ perception of a virtual disk of a given stiffness as being a surface. This allowed us to fit a psychometric function to the response (Fig. 5B). The psychometric function, consistent with the underlying set of ARDs, showed a smooth pattern of classification across stiffness levels. The probability of perceiving the field as an object boundary, or surface, progressively increased as the level of stiffness increased, reaching chance (0.5) in the stiffness range of 1,000-1,300 N/m.

FIG. 5. Low-curvature group catch trial data and psychometric function. A: ARD for catch trials of subjects interacting with a surface of curvature 15 m⁻¹. [Plot axes: A, Area Reaching Deviation (m²) vs. Stiffness (N/m); B, Probability of Perceiving a Surface vs. Stiffness (N/m), showing the psychometric function and bootstrap 95% CI.]

Influence of surface curvature

To assess the effect surface curvature has on the adaptive interaction with an object, an additional subject group engaged in the previously described experimental protocol. This group, the high-curvature group, interacted with a virtual surface of increased curvature (20 m⁻¹). For the high-curvature group, we did not observe force field adaptation with recovery of straight-line movements at low stiffness values. Instead, after training with the low stiffness surface, subjects exhibited movements that complied with and conformed to the circular disk boundary (Fig. 6). This result was manifested in subjects’ catch trial movements. During these movements, subjects increased ARD at lower stiffness levels after learning (Fig. 7A). Furthermore, after effects from this higher curvature level remained positive and gradually increased across all stiffness levels, indicating that subjects produced after effects in the direction of the applied forces and approximately conforming to the profile of the virtual boundary. The high-curvature group showed a monotonic increase of ARD for catch trials (Fig. 8A) and probability of perceiving a surface as a boundary (Fig. 8B) as the stiffness level of the virtual surface increased. Again, the learning curves from this protocol show after-effect washout after 50 movements in a null field (Fig. 9). As in the case of the surface experienced by the low-curvature group, the learned behavior followed a gradual trend across stiffness levels. Another common feature between the adaptations to fields of different curvatures was the average interface force experienced after learning. Results showed that, in the case of the higher curvature surface, subjects again achieved an invariant level of interface force (Fig. 9B). While the level of force was larger than that found for the low-curvature group, it remained relatively constant across all stiffness levels. A two-factor ANOVA without replication did not find a significant difference among the six stiffness levels (F6,5 = 1.272; P = 0.307) after learning. For both groups, high curvature and low curvature, the relative excursion of average interface force was smaller than the excursion in the kinematic measure of ARD. This result suggests that an intended interface force is being regulated by the control system during haptic exploration.

FIG. 6. High-curvature group trajectories. The 1st 6 trajectories from various stages of adaptation for a representative subject. Green squares represent the start position; red circles represent the goal position.

Regulation of average interface force

Our findings suggest that the unifying theme across these stiffness and curvature levels was a subject’s tendency to generate a constant level of interface force regardless of object mechanics. A regression of interface force before learning versus ARD after learning (Fig. 10A) showed very similar trends in both curvature groups. It is important to note that the forces presented in these regressions are those produced solely by the environment and do not include forces generated by the subject. Considering the similarity in these regressions, we are led to consider a single model to describe the experimental data. Indeed, using a single model to account for the data is consistent with the more general Occam’s razor principle of minimizing the number of parameters (i.e., the complexity of the model or the “size” of the hypothesis space) needed to account for the data. This single model suggests the existence of a common adaptation strategy for the low-curvature and high-curvature groups, whose outcome depends on the amount of interface force experienced in the early interactions with the field/surface. Accordingly, the level of initial interface force experienced may be the key factor in subjects’ perception of a field as a surface. A gradual trend was seen between subjects’ level of interface force before learning and their probability of perceiving a surface (Fig. 10B). The threshold of preadaptation interface force at which subjects began to perceive surfaces at a rate greater than chance occurred at 1.0 N. The fact that interface force, rather than stiffness or curvature, is the best predictor of the final classification suggests (see DISCUSSION) that the feedforward command plays a central role in haptic perception.

Responses to decreasing stiffness: a memory effect

We found a strong memory effect when surfaces of high stiffness were presented before those of lower stiffness.
Four subjects were presented with field stiffness in descending order and were asked to execute the reaching task in the presence of a field with the same geometry as presented to the low-curvature group. We observed that, as the stiffness of the field was reduced, starting from a level that clearly revealed the shape of the boundary, subjects had a marked tendency to move along the boundaries of subsequent low-stiffness surfaces. Thus the identification of a rigid boundary persisted in lower stiffness fields. The data in Fig. 11 show ARD for catch trials from these subjects. It is apparent that the after effects were in the direction of the boundary (i.e., positive ARD) even at the lowest stiffness values, suggesting that subjects were biased by their initial perception of surface boundaries during initial high-stiffness interactions. This is clearly in contrast with the trend shown in Fig. 5A. Because of this memory effect, we decided to present the subsequent boundaries in ascending order, which would have allowed us to detect a transition in behavior if a critical value of stiffness delimited the classification of a force field as either a disturbance or an object.

FIG. 7. High-curvature group catch trial data and psychometric function. A: ARD for catch trials of subjects interacting with a surface of curvature 20 m⁻¹. Colored lines are data from a single subject; bold black line is a group average. B: a psychometric function computed for all subjects.

FIG. 8. Learning curve for the high-curvature group. White blocks represent null field presentation, dark gray blocks represent force field presentation, and light gray blocks represent a phase of pseudorandom catch trials.

DISCUSSION

We found that the adaptation of movements to a virtual surface is dependent on both surface stiffness and surface curvature. For a given surface curvature, when the surface exceeded a threshold stiffness level, subjects learned to produce a smooth trajectory on the boundary of the surface. In contrast, at lower stiffness, they adapted by recovering the unperturbed kinematics of hand movements in free space. While these distinctly different adaptation strategies were clearly evident at the extremes of tested stiffness, we did not observe a distinct dichotomy but a smooth transition through a continuum of stiffness levels. Furthermore, across these stiffness levels subjects produced an invariant level of average interface force, suggesting a common underlying strategy of interaction.

At the lowest stiffness (200 N/m) and low curvature, subjects produced a higher average interface force after adaptation compared with interface forces observed before adaptation (Fig. 4B). Furthermore, at low stiffness levels (200 and 400 N/m), we found a significant reduction in lateral deviation after practice. The same subjects generated after effects with deviation opposite to the direction of the force. This is consistent with the hypothesis that, at low stiffness and curvature, subjects developed an internal representation of the forces generated by the virtual object. This result is in keeping with observations from previous force field adaptation experiments, where the ability to recover a straight-line movement after adaptation was considered to result from an internal model of the perturbing field (Shadmehr and Mussa-Ivaldi 1994; Thoroughman and Shadmehr 2000).

At the higher stiffness and curvature levels, subjects abandoned the goal of reducing straight-line deviations. Instead, they reduced the interface force by complying with the shape of the object boundary. This trend is shown by increases in lateral deviation and reductions in interface force after learning. This suggests that, at high stiffness, subjects formed an internal representation of the virtual surface based on the reduction of the interface force to a given level. After effects from catch trials at higher stiffness levels further show the difference in adaptation between low-stiffness and high-stiffness fields (Fig. 5A). At lower stiffness levels, the area reaching deviation in catch trials was negative, indicating an adaptive mechanism that compensated for the field dynamics and resulted in straight-line movements in the presence of the field. At higher stiffness and curvature levels (Fig. 8A), area reaching deviation for after effects became positive, consistent with after effects in the direction of the applied forces and approximately following the profile of the virtual boundary. The magnitude of this positive after effect increased with increasing stiffness.
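The catch-trial classification described in RESULTS (a positive ARD is counted as a perceived surface, and a psychometric function is then fitted across stiffness levels) can be illustrated with a minimal sketch. It is not the analysis code used in the study: the logistic form of the psychometric function, the grid-search fit, and every ARD value in `TRIALS` are assumptions made only for illustration.

```python
import math

# Hypothetical catch-trial data: (stiffness in N/m, ARD in 1e-3 m^2).
# These numbers are invented for illustration; they are not the study's data.
TRIALS = [
    (200, -1.8), (200, -1.5), (200, -1.1), (200, -0.9),
    (400, -1.2), (400, -0.8), (400, -0.6), (400, -0.3),
    (800, -0.7), (800, -0.4), (800, -0.2), (800, 0.3),
    (1200, -0.3), (1200, 0.2), (1200, 0.4), (1200, -0.1),
    (1600, 0.5), (1600, 0.8), (1600, 1.1), (1600, -0.2),
    (2000, 0.9), (2000, 1.2), (2000, 1.6), (2000, 1.9),
]


def classify_surface(ard):
    """Binary label: 1 ("surface") for a positive ARD, else 0 ("disturbance")."""
    return 1 if ard > 0 else 0


def logistic(stiffness, threshold, slope):
    """Assumed psychometric form: P(surface) as a function of stiffness."""
    return 1.0 / (1.0 + math.exp(-slope * (stiffness - threshold)))


def fit_psychometric(trials):
    """Maximum-likelihood grid search over threshold (N/m) and slope (m/N)."""
    best = (-math.inf, None, None)
    for threshold in range(0, 2001, 25):
        for step in range(1, 51):
            slope = 0.001 * step
            log_lik = 0.0
            for stiffness, ard in trials:
                p = logistic(stiffness, threshold, slope)
                p = min(max(p, 1e-9), 1.0 - 1e-9)  # keep log() finite
                log_lik += math.log(p if classify_surface(ard) else 1.0 - p)
            if log_lik > best[0]:
                best = (log_lik, threshold, slope)
    return best[1], best[2]


THRESHOLD, SLOPE = fit_psychometric(TRIALS)
print(f"P(surface) reaches 0.5 near {THRESHOLD} N/m")
```

The fitted threshold is the stiffness at which the modeled probability of perceiving a surface crosses chance (0.5). A bootstrap confidence interval like the one drawn in the figures could be obtained by refitting on resampled trials, which is omitted here for brevity.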
Below a threshold value (K = 1,200 N/m; κ = 15 m⁻¹), the magnitude of the negative after effect increased with decreasing stiffness. In the case of high-curvature interactions, subjects’ positive after effects persisted throughout the continuum of stiffness levels. While after effects were always positive during these interactions, the magnitude of the after effects increased as surface stiffness increased.

FIG. 9. Learning data for the high-curvature group. Measures shown are an average of the 1st and last 5 trials averaged across all subjects. A: ARD. B: interface force.

FIG. 10. Unified curvature results. A: ARD after learning vs. interface force before learning for both the low-curvature and high-curvature groups. B: unified psychometric function of perception of surfaces for an average interface force before learning.

Taken together, these data suggest that subjects may have learned to respond to the perturbation in two ways: 1) by enforcing a nominal trajectory (the straight line from start to end target) at low stiffness and 2) by modifying the nominal trajectory, so as to comply with the perceived object boundary. The two responses are evident at the extremes of the tested stiffness values: the first response appears at low stiffness, whereas the second appears at high stiffness. Interestingly, however, we did not observe a sharp transition between these responses, but a smooth change from 1 to 2.

We wish to stress that, in the context of force field adaptation, this is a rather paradoxical finding. Based on earlier studies of adaptation, one would expect to observe increasingly larger after effects as the intensity of the perturbing field increases. In contrast, a smooth transition of responses, such as the one observed in this study (Fig. 5A), corresponds to a progressive reduction of the after effects as the force field stiffness increases. The tendency to larger after effect with increasing stiffness would have been consistent with a discrete switching of responses when the field stiffness reaches a perceptual threshold. However, our results unequivocally dismiss such a possibility. Instead, we find a gradual transition of after effects, from compensatory to compliant responses. This modification suggests that subjects alter their desired trajectory of movement in the face of environmental conditions. In the case of low stiffness, the desired trajectory is a straight line, whereas at higher stiffness, the desired trajectory is one that traces the object boundary. As surface stiffness increases, and interaction forces begin to increase, subjects may modify their desired trajectory of movement to accommodate the increasing forces of interaction. Rather than fighting the interaction forces through the continuum of surface stiffness, subjects develop a desired trajectory that maintains a constant level of interaction force and conforms to the object boundary.

The gradual transition of responses suggests that subjects may execute hand movements by combining multiple modules of control. This is consistent with the view that movements are generated by a combination of time-varying force fields (Bizzi et al. 1991, 1995; d’Avella and Bizzi 1998). The control module that dominates during interaction with low-stiffness surfaces is one that strives to resist the perturbation of the field and recover a straight motion of the hand across the external field. At higher stiffness levels, the prevailing control module is one that modifies the nominal trajectory to comply with the object boundary. Through a continuum of force fields, the weighting of these two modules would result in the observed smooth transition of learned behavior.

FIG. 11. ARD for catch trials of subjects interacting with a surface of curvature 15 m⁻¹. Surfaces were presented in order of descending stiffness. Colored lines are data from single subjects; bold black line is the group average.

While the concept of two control modules being combined to form field/object representations is plausible, this is not the only plausible explanation of subjects’ behavior. Rather than a linear interpolation between two control schemes, subjects’ adaptation may also be described by a simple continuous representation that estimates the external state as a function of experienced dynamics. This idea of a single continuous representation is supported by the linear relationship between preadaptation interface force and postadaptation after effects (Fig. 10A).

The invariance of average interface force after learning, together with the consistent relation between initial interface force and final adapted response, is in keeping with other findings that suggest that there may be very simple mechanisms for scaling muscle activations and coordination patterns to produce consistently low amounts of net fingertip force (Valero-Cuevas 2000). Subjects may choose control policies that result in the nominal interface force they are willing to sustain. This level of interface force remains constant across stiffness levels within experiments with the same curvature. The idea of constancy of interaction force has also been reported in another recent haptic contact experiment, in which subjects explored a virtual environment of varying stiffness using an instrumented stylus (Walker and Hong 2004).
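The two-module account discussed above (a compensatory module and a compliant module whose weighting shifts with stiffness) can be made concrete with a toy calculation. Everything numerical here is an assumption introduced for illustration and does not come from the study: the 0.08 m reach, the logistic weighting with its midpoint and width, the mirror-image shape used for the compensatory after effect, and the reading of ARD as the signed area between the hand path and the straight line.

```python
import math


def arc_height(x, reach, curvature):
    """Height (m) of a circular arc of the given curvature (1/m) above its chord."""
    radius = 1.0 / curvature
    half = reach / 2.0
    return math.sqrt(radius**2 - (x - half)**2) - math.sqrt(radius**2 - half**2)


def compliant_weight(stiffness, midpoint=1200.0, width=300.0):
    """Assumed smooth weight (0..1) of the boundary-following module."""
    return 1.0 / (1.0 + math.exp(-(stiffness - midpoint) / width))


def catch_trial_ard(stiffness, reach=0.08, curvature=15.0, samples=201):
    """Signed area (m^2) between a blended catch-trial path and the straight line."""
    w = compliant_weight(stiffness)
    xs = [reach * i / (samples - 1) for i in range(samples)]
    # Compensatory module: mirror image of the boundary (negative after effect).
    # Compliant module: the boundary itself (positive after effect).
    ys = [((1.0 - w) * -1.0 + w) * arc_height(x, reach, curvature) for x in xs]
    # Trapezoidal rule for the signed area under the lateral deviation.
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(samples - 1))


for k in (200, 400, 800, 1200, 1600, 2000):
    print(k, f"{catch_trial_ard(k):+.2e}")
```

With a smooth weighting, the modeled ARD passes gradually from negative to positive as stiffness increases instead of switching abruptly, which is the qualitative pattern the weighting hypothesis is meant to capture. A single continuous mapping from experienced force to after effect, the alternative raised in the text, would produce a similar curve, so a sketch of this kind cannot discriminate between the two accounts.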
The idea that the nervous system uses an invariant average interface force as a cue for the mediation of control policies is further suggested by the similarity in behaviors seen for surfaces of different curvature. It is likely that with a surface of higher curvature, subjects experience larger interface forces caused by an initial increased level of surface penetration. This may explain why they tend to comply with the surface curvature starting from the lowest stiffness levels.

One should observe that the interface force is not, per se, a good classifier of the rigidity of a boundary. Different interface forces against a given object can be generated by hand movements driven by different motor commands. One can deliberately push against a constraint or generate a light touch. However, given an invariant motor command, boundaries with different degrees of rigidity will induce different levels of interface force. The variable pattern of contact forces in the initial phase of each experiment (Figs. 4B and 7B) indicates that subjects produced a relatively stereotyped motor command, based on the assumption that the hand was to move in free space. Under this condition, a given level of contact force can be taken as an indicator of the rigidity of the encountered boundary and can therefore be used for classification (Fig. 10B).

A physiological account for the detection and regulation of interface force may result from the manner in which tactile afferents respond to touch. During grip tasks, FA I mechanoreceptors in glabrous skin tend to initiate a response and fire at low contact force and to be suppressed and turn off once a threshold of force has been reached (Johansson and Westling 1991). Collectively, a network of receptors distributed on the surface of the hand allows for a narrow range of detection of force (pressure) and its geometric distribution on the hand. During a high level of contact force, a large number of receptors in the network may be overly excited and become inactive, resulting in a deficit in sensory information. Conversely, if too low a level of force is applied, a smaller number of receptors may be activated, again resulting in a deficit in sensory information. Therefore there may be an optimal contact force that allows the task to be completed while maximizing activation of the receptor network and thus maximizing sensory and haptic feedback about the surface in contact.

While the two adaptive behaviors are present (i.e., restoration of an unperturbed motion and a compliant response), there seems to be a rather gradual transition between them. This finding is in contrast with our own initial prediction and with some known perceptual transitions (e.g., the Necker cube illusion). Furthermore, we found that the observed response across a variety of stiffness values leads to a constant average interaction force after adaptation. This gives support to the hypothesis that the adaptation response may be mediated by a mechanism that attempts to enforce a constant interaction force (rather than a nominal trajectory). However, further research is needed to understand the control mechanisms the CNS uses when developing internal representations of the objects with which we interact.

GRANTS

This work was supported by National Institute of Neurological Disorders and Stroke Grants R01-NS-35673 and F31-NS-49795.

REFERENCES

Bizzi E, Giszter S, Loeb E, Mussa-Ivaldi F, and Saltiel P. Modular organization of motor behavior in the frog’s spinal cord. Trends Neurosci 18: 442–446, 1995.
Bizzi E, Giszter S, and Mussa-Ivaldi F. Computations underlying the execution of movement: a novel biological perspective. Science 253: 287–291, 1991.
Colgate J and Brown J. Factors affecting the z-width of a haptic display. In: IEEE International Conference on Robotics and Automation. San Diego, CA, 1994, p. 3205–3210.
d’Avella A and Bizzi E. Low dimensionality of supraspinally induced force fields. Proc Natl Acad Sci USA 95: 7711–7714, 1998.
Fasse E, Hogan N, Kay B, and Mussa-Ivaldi F. Haptic interaction with virtual objects. Biol Cybern 82: 69–83, 2000.
Flash T and Gurevich I. Arm stiffness and movement adaptation to external loads. In: Annual Conference on Engineering in Medicine and Biology, Orlando, FL, 1992, p. 885–886. (Medical Biology Conference 13)
Flash T and Hogan N. The coordination of arm movements: an experimentally confirmed model. J Neurosci 5: 1688–1703, 1985.
Gandolfo F, Mussa-Ivaldi F, and Bizzi E. Motor learning by field approximation. Proc Natl Acad Sci USA 93: 3843–3846, 1996.
Hogan N. An organizing principle for a class of voluntary movements. J Neurosci 4: 2745–2754, 1984.
Hogan N, Kay B, Fasse E, and Mussa-Ivaldi F. Haptic illusions: experiments on human manipulation and perception of “virtual objects”. In: Cold Spring Harbor Symposium on Quantitative Biology. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, 1990, p. 925–931.
Johansson R and Westling G. Afferent signals during manipulative tasks in man. In: Information Processing in the Somatosensory System, edited by Frazen O and Westman J. London, UK: Macmillan Press Ltd, 1991, p. 25–47.

Artificial Surprise
Galileo, Monday, July 28, 2008
What surprise is, why it matters, cognitive and emotional models, and how the software world fits into the process

The robot Domo

Surprise is a response to a situation we did not anticipate. If you have no expectation at all about the future, you cannot be surprised, but such an existence is hard to imagine. Anticipating how a situation will develop, planning according to those expectations, checking whether the expectations were met, and correcting the plans accordingly: this description covers a wide range of animal and human behaviors, as well as the operation of mechanical and electronic systems.

Without predicting the future state, we cannot fit our plans to the situations in which we are likely to find ourselves. The need to compare the predicted outcome with the outcome that actually occurred arises because our knowledge is imperfect, and so our predictions cannot be perfect either. In fact, there are theoretical reasons to think that predictions can never be perfect, whether because of the randomness of quantum physics or because of the mathematics of chaotic systems, which also appears in classical physics.

The importance of the surprise factor

Predict-and-correct behavior need not be complicated. Suppose, for example, that a certain step in a manufacturing process requires an industrial robot to move its arm 25 degrees to the right. The software calculates that, to achieve this displacement, the appropriate motor must be driven with a series of 1,500 digital signals, each of which advances the arm by a small amount in the right direction.

The robot's designers will not stop there: the robot also includes at least one sensor that measures the arm's actual position. During the movement, the software compares the position where the arm is expected to be (according to the signals sent to the motor) with the position reported by the sensor. A slight correction may be needed, because no component of the system is perfect. Such a correction mechanism is a classic feedback mechanism.

Suppose that, as a result of the feedback, the motor is indeed driven with 1,498 signals instead of 1,500 before the arm reaches the required position. If such an "error" (only slightly more than a tenth of a percent) is within the original design tolerances, most people would probably agree that this is normal operation that would not surprise the robot's designers, or the robot itself, if it had the capacity to be surprised.

But another situation is also possible: after covering a third of the distance, the arm stops, and additional signals do not make it continue on its way. This is a situation that should not happen. What would we want the robot to do in this case? Should the software keep trying to drive the robot's motors to bring the arm to its destination? It seems better to stop the process, and even to pull the arm back, as happens when elevator doors start to close and hit an obstacle, and as happens to us, far too often, with a computer's operating system. One can also imagine a sophisticated robot turning cameras toward the problem area to try to work out what caused the stop. If a person stopped a routine action when it did not unfold normally, withdrew their hands from where they were, and turned their full attention to that action, we would interpret the behavior as a display of surprise.

Surprise can therefore be seen as serving an important need: to detect that something has happened that differs fundamentally from everything we expected; to respond quickly even before we have worked out the difference and its cause, since the difference may pose a risk to us or to the success of the tasks we are trying to perform; and to mobilize resources to understand the situation and form a new plan of action. These resources may be cognitive (directing attention), sensory (directing sensors), motor (activating suitable movements), or energetic. Clearly, such a diversion of resources affects other activities taking place at the same time, and it may increase energy consumption or create additional risks (for example, when the need for more information forces a hunted animal to stick its head out of its hiding place). Surprise therefore triggers a general arousal response that affects every component of behavior and brings routine activity into the "spotlight" of awareness and planning.

Cognitive models of surprise

At the twentieth IJCAI (International Joint Conference on Artificial Intelligence), held in India in January 2007, a paper titled "Surprise as Shortcut for Anticipation" was presented. The paper, by Michele Piunti, Cristiano Castelfranchi, and Rino Falcone of the Institute of Cognitive Sciences and Technologies in Italy, proposes a behavioral model in which behavior is based not only on statistical learning but also on a representation of "beliefs", "goals", and "expectations".

Statistical learning links items of information so that the link can be used for prediction. If, in 80% of the cases in which a robot passed through area X, it found objects of the kind it was assigned to collect, then when the robot evaluates different actions it can rank the action "move toward X" as high priority, on the estimate that it is highly likely to find the items it is looking for there. The difference between such learning and forming a "belief" is subtle but significant. A statistical link can have any strength, and every additional item of information can decrease or increase that strength. It is therefore hard to define which new information counts as a surprise. A "belief", by contrast, raises the link to the level of a prediction, so that any item of information contradicting the prediction constitutes a surprise. Of course, not all beliefs are equally strong, and the "stronger" the belief, the greater the surprise that contradicting information should cause.

According to the psycho-evolutionary model of Wulf Meyer, which forms part of the theoretical foundation of this research, surprising events trigger the following process. First, information from the senses or sensors is identified as surprising beyond a certain threshold. As a result, other cognitive processes are interrupted or delayed so that resources can be directed to investigating the event. In parallel with the investigation, immediate actions are taken that are aimed at gathering more information (such as the pricking up of the ears in many animals) or at preparing for possible implications of the surprise (for example, creatures whose first response to a surprising noise is to press themselves to the ground and freeze). Later, as a result of examining the initial information and the additional information received, beliefs are updated, and new expectations and new goals are formed.

Emotional models of surprise

The model of integrating emotions into decision making has also been supported by the work of the brain researcher Antonio Damasio and his group. They argue that the brain reads emotional responses (such as pulse and sweating) and then reaches a decision, which is usually correct. If this reading-and-integration system, the prefrontal lobe, is damaged, a person can still draw correct theoretical conclusions but tends to make serious errors in applying those conclusions to himself.

In the past decade, theoretical and experimental foundations have been laid for understanding the role of emotions in decision making and in forming expectations. Studies in neuropsychology and in the new field of neuroeconomics (the meeting of economics with brain science) show that emotions can serve as a mechanism that triggers arousal and helps to switch quickly between behaviors, for example between gathering information, gathering food, and defensive responses. This is the meaning of the word "shortcut" in the paper's title. As economists have come to realize in recent decades, even in the unrealistic case of perfect information, limits on time and cognitive resources do not allow a full analysis that draws every conclusion "hiding" in that information. Emotional states such as surprise serve as a strategy for efficiently dividing time and resources between analysis and action when the information is incomplete and even the information already obtained has not yet been fully analyzed.

The agent and the den

To test the effectiveness of these models, Piunti and his colleagues decided to start by defining a simple environment and simple "agents" operating within it. An "agent", in this context, is an entity, virtual or physical, that can collect information, act on the information it has received, and interact with its environment. The researchers created software that simulates a virtual environment represented by a two-dimensional map, in which walls and obstacles restrict the possible paths of movement. Three kinds of "food" appear at three sites within this environment, each with a different frequency. Each kind of food has a different value for the agent. The agent's goal is to move around the map, find food, and bring it to a defined place (which can be thought of as "the den"). When the food reaches the den, the agent gains energy in an amount that depends on the quantity of food, the quality of the food, and the time elapsed between collection and arrival at the den.

The environment also contains hazards, defined in this case as "fires". Fires first appear as "smoke", which serves as a warning, and then develop into a "flame" that can harm the agent by reducing its energy. Fires are more common in some parts of the map than in others. A fire that has appeared on the map can also move and reach other places.

At any moment, the agent can decide how fast to move and how much energy to devote to its sensors. The more energy the sensors receive, the earlier they can sense hazards and food. Naturally, fast movement also consumes more energy than slow movement. As noted, fast movement may bring the food to the den sooner and so increase the energy gain. We can conclude, then, that although the environment described here is incomparably simpler than "real" environments, it is complex enough to pose hard challenges for planning and behavior.

An agent with emotional states

The paper presents a comparison between two agents. The first agent was named SEU (Subjective Expected Utility), because at every moment it chooses, from among the possible actions (moving, changing speed, changing the energy allocated to the sensors, collecting food, and so on), the action expected to yield the highest utility according to that agent's partial and subjective knowledge.

The other agent was named MS (Mental States) because, in addition to the SEU mechanism, it also includes "mental states". The possible states of this agent are "normal", "bored", "excited", "cautious", and "curious". Each state affects the calculation of the expected utility of the possible actions differently, and so leads to different behavior. What drives the transition between these states is the occurrence of events that constitute positive or negative surprises. A series of positive surprises will move the agent into the "excited" state, in which the estimated utility of activities that lead to fast collection of "food" is higher than the estimate derived from the SEU mechanism, and may therefore outweigh the assessment of the risks of those actions. Similarly, an accumulation of negative surprises leads to the "cautious" emotional state, in which the emphasis is on avoiding risks.

It should not be concluded from this description that the "agent", a computer program operating inside a computerized simulation of a simple living environment, really has emotions comparable to human emotions. If that were so, we might have to hesitate before turning off the computer or "lighting" flames that threaten to burn our agent. Here the term "emotional state" serves only to draw on the analogy to cognitive and psycho-evolutionary models.

Still, the analogy is interesting enough to let us ask, much as we ask about human beings: what is the benefit of letting the emotional state bias judgment? Is a rational, impartial examination of the best information available to us not preferable to an "emotional" decision?

The victory of emotion over reason

It turns out that emotional bias is indeed the better mechanism, at least according to the results of this study. In "safe" environments (with a small number of flames), the MS agent behaved more "curiously", and as a result explored its environment and exploited the food sources more efficiently. In "dangerous" environments, the MS agent spent most of its time in the "cautious" emotional state and so avoided the damage suffered by the SEU agent.
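The contrast between the two agents can be sketched in a few lines. This is a minimal illustration and not the model from the paper by Piunti and colleagues: the `Action` fields, the per-state weights in `WEIGHTS`, the surprise threshold, and the rule that three same-signed surprises in a row change the state are all assumptions, and only three of the five states named in the article are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    gain: float  # expected energy gained
    risk: float  # expected energy lost to hazards


# Hypothetical (gain weight, risk weight) per mental state.
WEIGHTS = {"normal": (1.0, 1.0), "excited": (1.5, 0.7), "cautious": (0.8, 2.0)}


def utility(action, state="normal"):
    """Expected utility of an action, biased by the current mental state."""
    gain_w, risk_w = WEIGHTS[state]
    return gain_w * action.gain - risk_w * action.risk


def seu_choose(actions):
    """SEU-style agent: always uses the unbiased ("normal") utility."""
    return max(actions, key=utility)


class MentalStateAgent:
    """MS-style agent: runs of surprises switch the state that biases utility."""

    def __init__(self, streak_needed=3, surprise_threshold=1.0):
        self.state = "normal"
        self.streak = 0  # > 0: run of positive surprises, < 0: negative
        self.streak_needed = streak_needed
        self.surprise_threshold = surprise_threshold

    def observe(self, expected, actual):
        gap = actual - expected
        if abs(gap) < self.surprise_threshold:
            return  # outcome within expectations: not a surprise
        sign = 1 if gap > 0 else -1
        # Extend a run of same-signed surprises, or start a new run.
        self.streak = self.streak + sign if self.streak * sign > 0 else sign
        if self.streak >= self.streak_needed:
            self.state = "excited"
        elif self.streak <= -self.streak_needed:
            self.state = "cautious"

    def choose(self, actions):
        return max(actions, key=lambda a: utility(a, self.state))
```

With two hypothetical actions, `Action("rush", 10.0, 8.0)` and `Action("detour", 5.0, 1.0)`, the unbiased rule prefers the detour, an agent in the "excited" state prefers the rush, and an agent in the "cautious" state goes back to the detour. The state acts as a cheap summary of recent experience, which is the sense of "shortcut" used in the article.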

One may object here: if the emotional state is expressed in behavior that adapts itself to the characteristics of the environment (such as how common the hazards and the food sources are), would it not be better to improve the agent's software so that it includes calculations about these characteristics, learns them, and uses them to reach the same efficient decisions without resorting to the idea of "emotional states"?

The emotional states as defined in this study are necessarily less precise, because they have no intensity or blending: the agent cannot be 20% curious and 30% excited. The authors of the paper do not address this, but it seems to me that such a model of rationally learning the environment's characteristics is not the right one, both because it is not general and requires specific development for every change in the behavior of the environment, in the sensors, and so on, and because in the real world cognitive improvements carry a high cost in energy consumption and in special adaptations (such as the distinctive human traits related to head size). The "shortcut" of using emotions is an efficient substitute for such improvements.

Software that understands when a person is surprised

The agent described above is software that has no interaction with human beings, but most software is developed to serve human needs and to communicate with users. Such software can communicate more effectively if it includes cognitive models that can predict what will surprise the user.

The influential magazine Technology Review, published by MIT (the Massachusetts Institute of Technology), publishes a special report every year naming ten new technologies expected to affect the world. Among the technologies chosen for 2008 is the idea of modeling surprise. The example the report's authors give is traffic forecasting.

The idea itself is simple, though hard to implement. If we collect a large amount of data describing traffic speed across large parts of the road and street network, several times an hour, for a year or more, we can answer questions such as "How long will it take me to get from home to the city center next Wednesday if I leave at six in the evening?" The answer is not necessarily the same as the one we would get for any weekday at that hour: Wednesday may have a different profile from other days of the week, or the coming Wednesday may be the last day before the end of the month, and so on. This is a typical, if challenging, use of "data mining" technologies, which have in common the scanning of large quantities of data to identify characteristic patterns and draw conclusions. These technologies often use ideas and algorithms from artificial intelligence. Several commercial companies provide such software for traffic forecasting.

At least one company, Inrix, aims to give its users information that will help them more. The developers of the software, which began as an internal Microsoft project, concluded that when a local resident wants to know the state of the traffic, there is a great deal that he already knows. If the question is asked during rush hour, there is no point in providing a long list of roads where traffic is, as expected, heavy and slow. It is far more useful for that driver to hear about roads that, surprisingly, are expected to be easy to drive in the coming hour. This requires a cognitive model of surprise: what is a person likely to know, and how large a gap between expectation and reality is enough to count as a surprise? The answers to both questions differ from person to person, so the software lets each user adjust its behavior according to his own knowledge and personal preferences.

The report's authors agree with Inrix's developers that this approach is general and has the potential to contribute significantly to the way we will work with computers in the future. If we use a search engine to learn about some topic, and if that search engine has already learned enough about us to make an educated guess about what we already know about the topic, we will prefer to receive only the surprising information. It is important to stress that surprising information is not only information we do not know, but also information that runs counter to our expectations. Such a capability is probably still far off, but much nearer applications may lie in the classic fields of data processing and analysis. Of all the charts and numerical tables that reach the desk of an intelligence analyst, a stock market investor, or a marketing manager, which information is the unexpected, surprising part that hints that something new is happening and invites a check for new risks or opportunities? We may soon be able to expect our software to pull out such items and draw our attention to them.
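The traffic example boils down to reporting only the gaps between what a driver expects and what is forecast. The sketch below illustrates that idea under stated assumptions and says nothing about how Inrix's software actually works: the function name, the speed units, the road names, and the single per-user threshold `min_gap_kmh` are all hypothetical.

```python
def surprising_roads(forecast_kmh, expected_kmh, min_gap_kmh):
    """Return (road, gap) pairs whose forecast speed differs from the driver's
    expectation by at least min_gap_kmh, most surprising first."""
    report = []
    for road, forecast in forecast_kmh.items():
        if road not in expected_kmh:
            continue  # no recorded expectation to compare against
        gap = forecast - expected_kmh[road]
        if abs(gap) >= min_gap_kmh:
            report.append((road, gap))
    return sorted(report, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical rush-hour data (km/h); the road names are placeholders.
EXPECTED = {"ring road": 25, "main street": 20, "harbor tunnel": 30}
FORECAST = {"ring road": 27, "main street": 55, "harbor tunnel": 12}

print(surprising_roads(FORECAST, EXPECTED, min_gap_kmh=15))
```

A driver who already expects the ring road to crawl hears nothing about it, while the unexpectedly clear main street and the unexpectedly slow tunnel are both reported. Raising or lowering `min_gap_kmh` plays the role of the per-user adjustment the article describes.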

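The two questions behind a model of surprise (what the driver already expects, and how large a gap counts as a surprise) can be illustrated with a minimal Python sketch. This is not Inrix's method, which the column does not describe. It assumes that the driver's expectation equals the historical average speed for the same road, weekday, and hour, and that the personal threshold is a single number, `min_gap_kmh`. All function names, road names, and figures are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def build_profile(observations):
    """Average historical speed per (road, weekday, hour).

    observations: iterable of (road, weekday, hour, speed_kmh) tuples.
    The average stands in for what a local driver already expects
    (an assumption of this sketch).
    """
    buckets = defaultdict(list)
    for road, weekday, hour, speed in observations:
        buckets[(road, weekday, hour)].append(speed)
    return {key: mean(speeds) for key, speeds in buckets.items()}


def surprising_roads(profile, forecast, weekday, hour, min_gap_kmh):
    """Roads forecast to be faster than expected by at least min_gap_kmh.

    forecast: dict mapping road name to predicted speed in km/h.
    min_gap_kmh: the user's personal threshold for what counts as a surprise.
    Returns (road, gap) pairs, largest gap first.
    """
    hits = []
    for road, predicted in forecast.items():
        expected = profile.get((road, weekday, hour))
        if expected is None:
            continue  # no history, so no expectation to violate
        gap = predicted - expected
        if gap >= min_gap_kmh:
            hits.append((road, gap))
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical history for Wednesdays (weekday 2) at 18:00.
history = [
    ("Main St", 2, 18, 20.0), ("Main St", 2, 18, 24.0),
    ("Ring Rd", 2, 18, 30.0), ("Ring Rd", 2, 18, 34.0),
]
profile = build_profile(history)
forecast = {"Main St": 23.0, "Ring Rd": 55.0}
report = surprising_roads(profile, forecast, weekday=2, hour=18, min_gap_kmh=15.0)
print(report)
```

Under these assumptions only Ring Rd is reported: Main St is slow, but it is slow as expected, so it carries no news for the driver. A driver with more local knowledge could raise `min_gap_kmh` to hear about fewer, larger deviations.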
Software that surprises its developers

Every programmer is surprised from time to time by the behavior of the software they wrote, but usually this is the result of a programming error (a "bug"). When Prof. Michael Littman of Rutgers University in New Jersey was surprised by the behavior of the robot he built, the reason was different, as documented in a video he uploaded to YouTube. The video won first place in the "short video" category of a competition held at the 2007 conference of the American Association for Artificial Intelligence.

Littman's goal was to create a learning robot. The robot was AIBO, Sony's well-known robotic puppy, and the software controlling it tried to find a way for the robot to get out of a small, closed, dark room. The room contained a switch that opened the door when pressed and so allowed the robot to leave, but the switch could not be seen in the dark. A second switch was illuminated, and pressing it turned on the light in the room so that the robot's eyes could find the exit switch.

The robot was placed in the dark room several times, each time at a different location and facing a different direction. After these trials the software did learn a quick way out of the room, but it was not the way Littman had expected: the robot found a method faster than walking to the light switch, pressing it, and then walking to the door switch. Readers may want to pause for a moment and think about what that faster method might be.

Before the answer, here is a hint: what would have happened if every trial had begun with the robot at a fixed location and facing a fixed direction? What could the robot have learned then?

And the answer: the electronic puppy learned to aim its eyes at the illuminated switch, to position its body at the right angle relative to that switch, and to walk backward until its rear end pressed the door-opening switch.

This column has dealt with surprise as an economical mechanism for fast decisions under uncertainty; with surprise as a mechanism for efficient communication; and with surprise as an inevitable outcome of complex, dynamic systems. Perhaps it should not surprise us to find the concept of surprise so deeply tied to artificial intelligence. After all, it is deeply human to be aware that the information we hold is always partial and imprecise, and to live with that awareness by balancing the avoidance of unpleasant surprises against the attraction of the new and the surprising.

Appendix F

Artificial Surprise
