MPI Series in Biological Cybernetics, Vol. 20
The theoretical part of the thesis starts with a close look at sensorimotor processing. The cognitivist and the embodied approaches to sensorimotor processing are contrasted, and evidence from psychological and neurophysiological studies is presented in favor of the latter. It is outlined how the use of robots fits into the embodied approach as a research method. Furthermore, internal models are defined formally, and an overview of their role in models of perception and cognition is provided, with special emphasis on anticipation and predictive forward models. Afterwards, a thorough overview of internal models in adaptive motor control (covering both kinematics and dynamics) is given, and a novel learning strategy for kinematic control problems ("learning by averaging") is presented.
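As a point of orientation, internal models are commonly written as a pair of mappings. The notation below is the generic textbook form and is an assumption for illustration; the symbols $s_t$ (sensory state), $m_t$ (motor command), and $s^{*}_{t+1}$ (desired next state) are not taken from the thesis, whose own formal definition may differ.

```latex
% Forward model: predicts the next sensory state from the current state and motor command
\hat{s}_{t+1} = f(s_t, m_t)

% Inverse model: proposes a motor command that should lead from the current state to a desired state
\hat{m}_t = g(s_t, s^{*}_{t+1})
```

In this reading, a forward model is "predictive" because it can be evaluated before the command $m_t$ is executed, while an inverse model acts as a controller.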
The experimental work comprises four different studies. First, a detailed comparative study of various motor learning strategies for kinematic problems is presented. The performance of "feedback error learning" (Kawato et al., 1987), "distal supervised learning" (Jordan and Rumelhart, 1992), and "direct inverse modeling" (e.g., Kuperstein, 1987) is compared directly on several learning tasks from the domain of eye and arm control (using simulated setups). Moreover, an improved version of direct inverse modeling based on abstract recurrent networks and the learning-by-averaging strategy are both included in the comparison.
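To make the idea of direct inverse modeling concrete, the following is a minimal sketch under strong simplifying assumptions, not the setup used in the thesis. A toy planar two-joint arm (hypothetical link lengths, joint ranges restricted so that every reachable hand position has exactly one solution) issues random motor commands, records the resulting hand positions, and then uses the recorded pairs in the reverse direction. A nearest-neighbor lookup stands in for the learned inverse model; a neural network or another function approximator would typically take its place.

```python
import numpy as np

LINK_1, LINK_2 = 1.0, 1.0  # hypothetical link lengths of a toy planar arm


def forward_kinematics(q):
    """Map joint angles of shape (n, 2) to hand positions of shape (n, 2)."""
    q = np.atleast_2d(q)
    x = LINK_1 * np.cos(q[:, 0]) + LINK_2 * np.cos(q[:, 0] + q[:, 1])
    y = LINK_1 * np.sin(q[:, 0]) + LINK_2 * np.sin(q[:, 0] + q[:, 1])
    return np.stack([x, y], axis=1)


def babble(n, rng):
    """Motor babbling: draw random commands and observe their outcomes."""
    # Assumption: the elbow angle stays positive, so the inverse is unique.
    q = np.column_stack([
        rng.uniform(0.0, np.pi / 2, n),
        rng.uniform(0.2, np.pi - 0.2, n),
    ])
    return q, forward_kinematics(q)


def inverse_model(targets, outcomes, commands):
    """Return the stored command whose observed outcome is closest to each target."""
    targets = np.atleast_2d(targets)
    dist = np.linalg.norm(targets[:, None, :] - outcomes[None, :, :], axis=2)
    return commands[np.argmin(dist, axis=1)]


rng = np.random.default_rng(0)
commands, outcomes = babble(5000, rng)   # training pairs (command, outcome)
_, targets = babble(200, rng)            # reachable test targets
q_hat = inverse_model(targets, outcomes, commands)

# Evaluate in task space: where does the arm end up with the proposed commands?
mean_error = np.linalg.norm(forward_kinematics(q_hat) - targets, axis=1).mean()
print(f"mean hand position error: {mean_error:.4f}")
```

The sketch deliberately avoids redundancy. When several commands reach the same outcome, training an inverse model directly on such pairs can blend distinct solutions into an invalid one, which is one reason alternative strategies such as those compared in this study exist.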
The second study is dedicated to the learning of a visual forward model for a robot camera head. This forward model predicts the visual consequences of camera movements for all pixels of the camera image. The presented learning algorithm is able to overcome the two main difficulties of visual prediction: first, the high dimensionality of the input and output space, and second, the need to detect which part of the visual output is non-predictable. To demonstrate the robustness of the presented learning algorithm, the work is not carried out on plain camera images, but on distorted "retinal images" with a decreasing resolution towards the corners.
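The output of such a forward model can be pictured as a predicted image plus a mask marking the pixels that cannot be predicted. The sketch below is an illustration only and is not the learning algorithm of the thesis: the pixel mapping is fixed by hand for a pure image translation on an undistorted image, whereas the thesis learns the mapping and works on distorted retinal images. The function name `predict_after_shift` and the pixel-shift parameters `dx` and `dy` are hypothetical.

```python
import numpy as np


def predict_after_shift(image, dx, dy):
    """Predict a grayscale image after a hypothetical shift of (dx, dy) pixels.

    Returns the predicted image and a boolean mask that is False wherever
    the content would come from outside the current field of view.
    """
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = xs + dx  # source column for every output pixel
    src_y = ys + dy  # source row for every output pixel
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    predicted = np.zeros_like(image)
    predicted[valid] = image[src_y[valid], src_x[valid]]
    return predicted, valid


image = np.arange(12).reshape(3, 4)
predicted, valid = predict_after_shift(image, dx=1, dy=0)
print(predicted)
print(valid)
```

Here the rightmost column of the prediction is marked as unpredictable, because its content would have to come from beyond the edge of the current image.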
In the third experimental chapter, a model for grasping extrafoveal (non-fixated) targets is presented. It is implemented on a robot setup consisting of a camera head and a robot arm. This model is based on the premotor theory of attention (Rizzolatti et al., 1994) and adds one specific hypothesis: attention shifts caused by saccade programming imply a prediction of the retinal foveal images after the saccade. For this purpose, the visual forward model from the preceding study is used. Based on this model, several grasping modes are compared; the obtained results are qualitatively congruent with the performance that can be expected from human subjects.
The fourth study builds on the theory that visual perception of space and shape rests on an internal simulation process which relies on forward models (Moeller, 1999). This theory is tested by synthetic modeling in the task domain of block pushing with a robot arm.