Faculty: Andrew Fagg

www.cs.ou.edu/~fagg/

Research

One of the elusive goals of artificial intelligence is to build machines that can work in symbiotic collaboration with humans. To do so, the machines must learn to perform new skills and to refine old ones from their interaction with humans and with the surrounding environment. My research centers around these symbiotic relationships between humans and machines. Specifically, I study machines as models of how biological systems represent and learn motor and cognitive skills; primates as inspiration for new robot control and learning techniques; and the interaction of humans with machines. Central to all of these problems are the issues of constructing rich representations of the state of the agent, the local environment, the task, and the skills; and of using various forms of available training information to refine these representations. In this study of symbiotic computing, I draw on the disciplines of robotics, artificial intelligence, machine learning, computational neuroscience, and wearable/ubiquitous computing.

Research: Motor Skills Representation and Learning

Skills involving reaching, grasping, and manipulation are a rich focus of inquiry because they enable both humans and monkeys to affect their environments in a flexible manner. By studying these motor skills, I hope to build robots that will be able to perform tasks within unstructured human environments, as well as environments that are inhospitable to humans, including space. I am particularly interested in drawing inspiration for robot control systems from the study of biological control and in the use of robots as a mechanism with which to test biological theories of motor control. To better study robot control, one of my projects has been the design and construction of the UMass Torso robot (Figure 1) in the context of an NSF-sponsored Research Infrastructure project. As with many humanoid-form robots, the UMass Torso consists of many controllable degrees-of-freedom and sensors. Thus, there are often many ways in which a task may be accomplished with the available sensor and actuator set. Although this design increases the complexity of the control and sensing problem, the redundant ways in which a task may be addressed can be exploited to allow the robot to perform a wide range of tasks while optimizing for a variety of task criteria. The research challenge is to manage these complexities and provide layers of abstraction that 1) enable a programmer to work at an intuitive task level, and 2) allow planning and machine learning algorithms to be used in a practical manner to automatically improve motor skills (or more specifically, control policies).

The control basis approach provides a general framework for sensorimotor abstraction (Platt, Jr. et al., 2003). At its most fundamental level, the framework calls for a set of closed-loop controllers that bridge the gap between the continuous sensor/actuator/time domains and the discrete, abstract representations relating to these. One class of closed-loop controllers that I have been developing with a student is aimed at the formation of stable grasps. Rather than starting with a detailed model of the object to be grasped (e.g., as derived from a vision system), the first step in our approach is to haptically explore the object to be grasped. At each contact with the object, the controller estimates the total force and torque applied to the object by the set of contacts. Given a simple model of the local object geometry, the controller computes movements of the fingers and arm that attempt to reduce the total force and torque (Platt, Fagg, and Grupen, 2002). The power of this approach to grasp formation is that the controller can be assigned a variety of different physical resources, including finger tips, palms, multiple hands, and even ``virtual contacts'' such as gravity (Platt, Fagg, and Grupen, 2003).
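
The quantity that such a grasp controller tries to drive down can be illustrated with a small sketch. This is not the published controller: the function names, the use of point contacts expressed in an object-centered frame, and the choice to sum squared force and squared torque into one scalar (ignoring their different units) are all assumptions made here for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def net_wrench(contacts):
    """Sum the force and torque applied to the object by all contacts.

    contacts: list of (position, force) pairs, each a 3-tuple expressed
    in a hypothetical object-centered frame.
    """
    force = [0.0, 0.0, 0.0]
    torque = [0.0, 0.0, 0.0]
    for position, contact_force in contacts:
        contact_torque = cross(position, contact_force)
        for i in range(3):
            force[i] += contact_force[i]
            torque[i] += contact_torque[i]
    return tuple(force), tuple(torque)


def wrench_residual(contacts):
    """Scalar error: squared net force plus squared net torque.

    Assumption: force and torque are weighted equally; a real controller
    would need to scale them consistently.
    """
    force, torque = net_wrench(contacts)
    return sum(x * x for x in force) + sum(x * x for x in torque)


# Two fingers squeezing along the x axis: forces and torques cancel.
aligned = [((1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)),
           ((-1.0, 0.0, 0.0), (1.0, 0.0, 0.0))]

# The same forces applied off-axis leave a net torque about z.
offset = [((1.0, 0.5, 0.0), (-1.0, 0.0, 0.0)),
          ((-1.0, -0.5, 0.0), (1.0, 0.0, 0.0))]
```

Under these assumptions, a controller would move the contacts in whatever direction lowers `wrench_residual`; a "virtual contact" such as gravity could be represented as one more (position, force) pair in the list.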

Although the closed-loop controllers sense and act in the continuous domain, they provide an interface to a higher level of abstraction. From this abstract level, a single action becomes the activation of a particular grasp controller. This action terminates at some later time, with a report of success (e.g., having achieved a stable grasp) or of failure (e.g., lost contact with the object). This type of action and state transition model can be captured by the Semi-Markov Decision Processes (SMDP) formalism, which supports the use of a variety of planning and learning techniques and also provides a convenient way to express task-level programs. For example, we have shown that stable manipulation of an object can be expressed in terms of a sequence of grasp controller activation actions (Platt, Fagg, and Grupen, submitted). In addition, a student and I have applied a reinforcement learning technique to the problem of discovering an appropriate sequence of grasp and place actions (Wheeler, Fagg, and Grupen, 2002). Rather than starting with a model of which grasp was appropriate for a given final object configuration, the robot learned through interaction to select a grip in anticipation of how the grasped object was to be used in future actions. The behavior exhibited through the learning process by the robot demonstrated interesting qualitative similarities to what one sees in grip selection with children in a similar task.

Related to this work, I am a member of a collaborative project led by NASA/Johnson Space Center. Their humanoid robot, Robonaut, will ultimately be deployed on the International Space Station to participate in the assembly and maintenance of the station components. Although there are a number of kinematic and sensing differences with the UMass Torso, we have demonstrated that aspects of our control approach apply well in this robot. In particular, we are bringing critical automated grasping and teleoperator interface components to the Robonaut system.

Research: Biological Motor Control

Biological systems represent the best examples of motor control and learning. I am pursuing biological models as inspiration for robot control approaches and robots as mechanisms with which to evaluate biological control theories. In my research, I seek to understand how to better describe what is represented by different areas of the nervous system, the computations that are implemented by these regions and their interconnections, and what factors drive the development of these representations. I study these questions in the context of motor control - specifically in the area of reaching, grasping and manipulation. One of the critical questions to be addressed when examining the role of the brain in motor control is the relative contribution of peripheral systems, specifically, muscles, the sensors embedded within the muscles and other tissue, and the neural circuitry within the spinal cord. It is common in the modeling community to assume that these peripheral systems impose a linear transformation of the motor signals generated by the brain. Although a simple assumption, it implies that the full complexity of a temporal muscle activation pattern is due to the motor commands generated by the brain itself (and requires a large number of parameters to describe). I have developed a model of muscle/spinal interaction that includes key nonlinearities, particularly within the feedback loop implemented by the spinal circuitry (Houk, Fagg, and Barto, 2002). Although these nonlinearities impose additional complexities to the modeling process, we have shown that they can drastically reduce the complexity of the motor command that is necessary to produce realistic muscle activation patterns (Fagg, Barto, and Houk, 1998a). This observation has important implications for how the brain represents and learns motor skills.

I also study models of biological motor skill learning. It is clear from the psychophysics and neuroscience areas that multiple, distinct mechanisms of learning are involved in the process of acquiring a new skill. In addition, there exist interesting parallels between theories of machine learning and the mechanisms that are implemented by several brain regions. I am interested in developing models of these learning mechanisms and their interaction. For example, an area of the brain called the Cerebellum is thought to be involved in learning coordinated motor skills. Experimental evidence suggests that the motor outflow from this area is trained using a mechanism that relates to supervised learning (or regression) techniques. One unknown is the source of the error information that drives the learning process. When one examines reaching movements in adults, one often sees a gross movement to the target followed by a sequence of smaller movements. A hypothesis that I have been exploring is that this training information (in the form of an error vector) is derived from the submovement that follows the current one (Fagg, Sitkoff, Barto, and Houk, 1997; Fagg, Zelevinsky, Barto, and Houk, 1998b; Barto, Fagg, and Houk, 1999).

This approach is interesting in that the motor system, in some sense, is responsible for teaching itself how to produce smoother, more coordinated movements. However, this model assumes that the motor system is always capable of generating an effective sequence of corrections to take the arm to the target. One possibility is that some aspect of a corrective action is selected as a function of its utility in completing the movement (Fagg, Barto, and Houk, 1998a). Another set of brain regions known as the Basal Ganglia are thought to be involved in the assessment of the utility of actions. I am currently developing an abstract model in which a reinforcement learning (RL) module is responsible for selecting from a small number of available corrective actions, but the meaning of these actions is altered at the same time by a supervised learning mechanism. This model is particularly interesting in that it uses exploratory learning (specifically, RL) when there is little information about how to perform a movement, but then comes to rely on supervisory training information when the teacher becomes competent.

In addition, I study the formation of movement representations and execution strategies. The production of movements typically involves the differential recruitment of many more muscles than skeletal degrees of freedom and many more neurons than muscles. However, there are specific regularities in the way in which neurons and muscles are recruited in the movement generation process (for example, both muscles and cells are often recruited as a function of the cosine of the direction of movement). The question is what factors lead to these regularities despite the redundancies that exist. In our model, we explore the hypothesis that many of these effects can be explained through a process that attempts to optimize both the movement error and the degree of effort used to perform the movement (Fagg, Shah, and Barto, 2002b). The model produces patterns of systematic wrist muscle recruitment that are consistent with both human and monkey data. Furthermore, through this approach we are able to explore issues surrounding the neural representation of movement (Shah, Fagg, and Barto, submitted) and the formation of these representations (Sondhi, Shah, and Fagg, in preparation). I am currently working to extend these techniques to the area of grasp formation (Fagg and Arbib, 1998; Fagg, 1996).
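
A toy calculation can show why trading off movement error against effort tends to produce cosine-shaped recruitment. The sketch below is not the model from the cited papers: it assumes a planar task, idealized "muscles" that each pull along a fixed unit direction, a quadratic effort penalty, and (unrealistically) it allows negative activations. All names and parameters are hypothetical.

```python
import math


def min_effort_activations(pull_angles, target_angle, effort_weight):
    """Activations a that minimize |D a - t|^2 + effort_weight * |a|^2.

    D has one column per muscle (its unit pulling direction) and t is a
    unit vector toward the target. The closed-form minimizer is
    a = D^T (D D^T + effort_weight * I)^(-1) t, a ridge-regression form.
    Assumption: activations may be negative, which real muscles cannot be.
    """
    dirs = [(math.cos(a), math.sin(a)) for a in pull_angles]

    # Build the symmetric 2x2 matrix M = D D^T + effort_weight * I.
    m00 = sum(dx * dx for dx, _ in dirs) + effort_weight
    m01 = sum(dx * dy for dx, dy in dirs)
    m11 = sum(dy * dy for _, dy in dirs) + effort_weight

    # Solve M x = t with the explicit 2x2 inverse.
    tx, ty = math.cos(target_angle), math.sin(target_angle)
    det = m00 * m11 - m01 * m01
    x0 = (m11 * tx - m01 * ty) / det
    x1 = (m00 * ty - m01 * tx) / det

    # Each activation is the projection of x onto that muscle's direction.
    return [dx * x0 + dy * x1 for dx, dy in dirs]


# Eight hypothetical muscles with evenly spaced pulling directions.
angles = [2.0 * math.pi * k / 8 for k in range(8)]
target = math.pi / 3
activations = min_effort_activations(angles, target, effort_weight=1.0)
```

With evenly spaced pulling directions, D D^T is a multiple of the identity, so each activation comes out proportional to the cosine of the angle between that muscle's pulling direction and the target direction. Enforcing non-negative activations or using realistic muscle geometry would change the details.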

Research: Human-Machine Interaction

There are currently many consumer electronic devices that promise to improve our daily lives by performing a wide range of tasks - especially related to communication and memory functions. However, in practice, these devices demand greater amounts of personal attention on the part of the user, which detracts from their benefits. A solution is to develop devices capable of automatically making intelligent guesses as to the information that the user will need over the next few minutes. This information should then be presented in a form that minimizes user distraction. By reducing the user's need to attend to the mechanics of interacting with the devices, we open up a wide range of possibilities for new uses of such ``wearable'' computing systems.

I have developed a distributed service model to address these problems. A set of independent agents is responsible for gathering information that may be useful to the user at any given time (e.g., email, news, and location-dependent ``sticky'' notes). However, these agents do not communicate directly to the user, but instead submit information to a central interaction process. This process is responsible for making context-sensitive decisions about whether the information should be presented to the user and how it should be presented (displayed as text or whispered in the user's ear). I approach this decision problem as one of control in which a representation of the user's activity is translated into an appropriate presentation action. This control perspective of the user interface enables us to engage a variety of machine learning approaches, including both supervised and reinforcement learning techniques.

To date, this perspective has been applied in two experiments. First, I have shown that an effective association can be acquired between a representation of the user's current activity and a document that she will access in that context. This prediction is acquired by ``looking over the user's shoulder'' and observing regular patterns of document access. Predicted documents are presented to the user in menu form and can be selected with a minimal number of keystrokes, increasing the speed at which many documents can be retrieved. Second, a student of mine has examined a context-sensitive power management problem in which a mobile computer must decide at any given time to suspend for a short period of time or continue to be active so as to respond to user requests or critical sensory events. We formulated the problem in terms of an SMDP and employed Q-Learning (a form of reinforcement learning) to optimize the selection of control actions. The learned control policy acquired an implicit representation of the conditions under which the processor could safely suspend while only missing a small number of external events. In the coming semester, we will be applying similar techniques to the problem of when/how to present agent-generated messages.
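
For readers unfamiliar with Q-Learning in the SMDP setting, a generic textbook-style update is sketched below. It is not the implementation used in the experiment: the state and action names, the learning rate, the discount factor, and the convention that `reward` is the total reward accumulated while the action ran are all assumptions for illustration.

```python
from collections import defaultdict


def smdp_q_update(q, state, action, reward, duration, next_state,
                  actions, alpha=0.5, gamma=0.9):
    """One Q-Learning update for an action that lasted `duration` steps.

    The only change from the one-step rule is that the value of the next
    state is discounted by gamma ** duration rather than by gamma.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + (gamma ** duration) * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical power-management actions and states.
ACTIONS = ("suspend", "stay_awake")
q_table = defaultdict(float)

# Suspending for two time steps from "idle" earns a reward of 1.0.
first = smdp_q_update(q_table, "idle", "suspend", 1.0, 2, "idle", ACTIONS)

# A later update in which the next state already has a known value.
q_table[("busy", "stay_awake")] = 2.0
second = smdp_q_update(q_table, "quiet", "suspend", 1.0, 2, "busy", ACTIONS)
```

In a power-management setting of the kind described above, the reward would presumably trade energy saved against external events missed while suspended; that reward design is not specified in the text and is left open here.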

The issues addressed in the wearable computing domain also apply to the area of human-robot interaction. Here, we wish to maximize the efficiency of communication between the human and (potentially) many robots. Several students and I have been developing mixed-reality interfaces (a combination of real and virtual environments) for this purpose (Fagg et al. 2002a; Ou, Karuppiah, Fagg, and Riseman, 2004). Here, a virtual environment is used to summarize the state of the real world as extracted by the set of sensors and to make explicit the physical relationships between the different robots and sensors. This approach allows the user to explore the data space in a spatial manner and then to select individual sensors for access to their live data streams or individual robots for control purposes.

One of the dominant paradigms in robot control for space applications or hazardous environments is for a user to teleoperate a robot. Due to the large cognitive effort required to ensure that the robot acts as intended by the teleoperator, the useful operation time of a user is often less than an hour. I have been exploring the use of mixed autonomy approaches that allow the robot to perform some subtasks autonomously after permission is given by the user. One approach that I have been pursuing is to use our already-existing humanoid control system as a mechanism for the recognition of the intended movement produced by the teleoperator. This technique is being used to preemptively complete movements initiated by the teleoperator (giving the teleoperator short periods of time to rest) and to train the control system to perform sequences of submovements within a single demonstration.

Research: Future Research Directions

In the future, I will continue to pursue the research themes discussed in the previous sections. However, I also plan to take two additional steps. First, a significant next phase in the humanoid robot work is to develop a version of this system that allows motion of the base and of the trunk. Besides the technical problems of mobility, balance, and power, this step will enable the exploration of new research problems, including the collaborative manipulation of large objects; the interaction in planning and execution of reach, grasp, posture, and body placement; and planning for long-duration tasks involving object acquisition, assembly, and delivery. Second, my research interests in computational neuroscience, robotics, and wearable computing converge on the emerging field of Brain-Machine Interfaces (BMI). These interfaces will ultimately involve the chronic implantation of a large number of electrodes into the brain, potentially allowing for the high bandwidth transfer of information between brain and computer. This work has important implications for prosthetic limbs that will behave and ``feel'' much like the biological limbs that they replace, and for the development of computational prosthetics that will augment aging brain regions. However, there are many technical and computational questions that have yet to be addressed. The latter include how to interpret the cellular activity in real time so as to command an artificial limb in a convincing manner, how to be robust to drift in the cellular representation of movement, and how to support the collaborative learning of the human and machine to improve performance of the complete system. I am a member of a multidisciplinary group that includes Northwestern University and University of Chicago that has recently submitted a proposal to the National Institutes of Health in which we plan to take some of the next critical steps in this work.

Teaching

Teaching: Teaching Philosophy

One of the most effective ways I have found to teach students is to make it possible for them to take an active role in their own education. In this context, I see my role as structuring the class experience so that the students have the tools and the interest in the subject matter to do this in a productive manner. This process takes a variety of forms:

First, I explicitly engage the students in discussion during class. One of the devices that I have found effective is to open many of my classes with an opportunity for students to discuss recent news events that relate to the class topic. This gives the students an opportunity to direct the class conversation and to appreciate how the material about which they are learning relates to the external world. I have found that this works well even in classes as large as 70 students.

Second, I structure class assignments such that the students must interact with one another, both inside and outside of class. Something that has been lost in the computer-in-every-dorm-room culture is that students in the same class often do not have contact with other students outside of the classroom, thus losing out on the opportunity to learn from each other. I find that giving collaborative assignments encourages this contact, but that the effects are particularly improved when the assignments are designed such that students must work side-by-side outside of the classroom (even when they are not collaborating directly).

For example, my Embedded Systems class (designed for graduate and advanced undergraduate students) is heavily oriented toward laboratory work. For the most recent semester, the class project was to construct an interactive room that could sense the presence of individuals in the group lab space, infer something about their activities, and customize information delivery and other services. I found that the hands-on work was very engaging to the students. In general, they were responsible for selecting a set of their own mini-projects, which were graded based on the difficulty of the problem and the quality of the work. I provided enough structure to ensure that the students satisfied a minimum set of requirements and that they made regular progress through the course of the semester. I also structured the skeleton of the projects such that the students started the class by building components in pairs, but later projects could involve an arbitrary number of students. This led to class and lab discussions about larger group goals and about standards in the way that individual software and hardware modules are constructed. Having students work together also gave them the opportunity to learn from each other. This proved to be the case even when the students were not directly collaborating on a common project.

Teaching: Training Students in Research

In addition to classwork, I have supervised the research activities of 7 graduate and 13 undergraduate students in a range of research projects in the areas of robotics, computational neuroscience, and wearable computing. The undergraduate research has been performed either in the context of a Research Experiences for Undergraduates (REU) funded program (NSF) or through supervision of honors theses.

At all levels, my first goal is to teach the students some of the fundamentals of working in science: how to transform a problem into a scientific question, how to design an experiment from a question, how to analyze the results, and how to express oneself in spoken and written form. I also focus on making sure that the students develop an appropriate tool set with which to tackle the problems in their research area. With the younger students, I am typically careful about setting the direction of the research and writing processes. But as the students mature, I think it is most valuable to them (and to me) for them to take a more active role in determining the direction of their own research. In this context, research and writing become more of a collaborative process. I also expect senior students to take on leadership and mentoring roles. This gives the students a better sense of investment within the laboratory and prepares them to direct research in their own labs.

I have served and am now serving on a variety of Masters and PhD committees. Although most of these students are from the Computer Science Department, I have served as an outside member of two committees in the UMass Psychology Department. In addition, I have served as a PhD committee member for students from the Universitat Jaume I in Spain and the University of Queensland in Australia.
