Next: 2.7 References Up: 2 Written Language Input Previous: 2.5 Handwriting as Computer

2.6 Handwriting Analysis

Rejean Plamondon
Ecole Polytechnique de Montréal, Montréal, Québec, Canada

2.6.1 Problem Statement

As in many well-mastered tasks, human subjects generally work at the highest and most efficient level of abstraction possible when reading a handwritten document. When difficulties are encountered in deciphering a part of the message using one level of interpretation, they often switch to a lower level of representation to resolve ambiguities. In this perspective, the lower levels of knowledge, although generally used in the background, constitute a cornerstone on which a large part of the higher and more abstract process levels relies. For example, according to motor theories of perception, it is assumed that motor processes enter into the genesis of percepts and that handwriting generation and perception tasks interact and share sensorimotor information. Cursive script recognition or signature verification tasks therefore require, directly or indirectly, an understanding of the handwriting generation processes.

Consistent with these hypotheses, some design methodologies incorporate this theoretical framework in the development of automatic handwriting processing systems. So far, numerous models have been proposed to study and analyze handwriting [PM89,PSS89,GS93,FLVK94]. Depending on the emphasis placed on the symbolic information or on the connectionist architecture, two complementary approaches have been followed: top-down and bottom-up. The top-down approach has been developed mainly by those researchers interested in the study and application of the various aspects of the high-level motor processes: fundamental unit of movement coding, command sequencing and retrieval, movement control and learning, task complexity, etc. The bottom-up approach has been used by those interested in the analysis and synthesis of the low-level neuromuscular processes. For this latter approach to be of interest in the study of the perceptivomotor strategies involved in the generation and perception of handwriting, two criteria must be met. On the one hand, a model should be realistic enough to reproduce specific pentip trajectories almost perfectly and, on the other, its descriptive power should be such that it provides consistent explanations of the basic properties of single strokes (asymmetric bell-shaped velocity profiles, speed-accuracy trade-offs, etc.). In other words, the most interesting bottom-up models should allow the link to be made between the symbolic and connectionist approaches.

2.6.2 A Model of the Handwriting Generation System

A serious candidate model for a basic theory of human movement generation, in the sense that it addresses some of the key problems related to handwriting generation and perception, is based on two basic assumptions. First, it supposes that fast handwriting, like any other highly skilled motor process, is partially planned in advance [Las87,vdGT65], with no extra control during execution of a continuous trace of handwritten text, hereafter called a string [Pla89b]. Second, it assumes some form of rotation invariance in movement representation and uses differential geometry to describe a handwritten string by its change of line curvature as a function of the curvilinear abscissa [Pla89a].

In this context, a string can be described by a sequence of virtual targets that have to be reached within a certain spatial precision to guarantee the message legibility. Each individual stroke can be seen as a way to map these targets together in a specific two dimensional space. To produce a continuous and fluent movement, it is necessary to superimpose these discrete movement units in time, that is to start a new stroke, described by its own set of parameters, before the end of the previous one. This superimposition process is done vectorially in a 2D space. A complex velocity pattern, representing a word, thus emerges from the vectorial addition of curvilinear strokes.

A general way to look at the impulse response of a specific controller, say the module controller, is to consider the overall sets of neural and muscle networks involved in the production of a single stroke as a synergetic linear system producing a curvilinear velocity profile from an impulse command of amplitude D occurring at $t_0$ [Pla92]. The curvilinear velocity profile thus directly reflects the impulse response of the neuromuscular synergy.

The mathematical description of this impulse response can be specified by considering each controller as composed of two systems that represent the sets of neural and muscular networks involved in the generation of the agonist and antagonist activities resulting in a specific movement [Pla95b]. Although various forms of interaction and coupling between these two systems probably exist throughout the process, we assume that their global effect can be taken into account at the very end of the process by subtracting the two outputs. If so, each of the systems constituting a controller can be considered as a linear time-invariant system and the output of a controller as the difference between the impulse responses of the agonist and antagonist systems, weighed by the respective amplitude of their input commands. The mathematical description of an agonist or antagonist impulse response can be specified if the sequential aspects of the various processing steps occurring within a system are taken into account. Indeed, as soon as an activation command is given, a sequence of processes goes into action. The activation command is propagated and a series of neuromuscular networks react appropriately to it. Due to the internal coupling between each of the subprocesses, one stage is activated before the activation of the previous one is completed. Within one synergy, the coupling between the various subprocesses can thus be taken into account by linking the time delays of each subprocess.

Using a specific coupling function and making an analogy between this function and the predictions of the central-limit theorem, as applied to the convolution of a large number of positive functions, it is predicted that the impulse response of a system under the coupling hypothesis will converge toward a lognormal curve [Pla95b], provided that the individual impulse response of each subsystem meets some very general conditions (real, normalized and non-negative, with a finite third moment and scaled dispersion). So, under these conditions, the output of the module or the direction controller will be described by the weighted difference of two lognormals, hereafter called a delta lognormal equation [Pla95b].
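The convergence argument can be illustrated numerically with a small Monte Carlo sketch. It is only an analogy and not the derivation in [Pla95b]: it assumes, purely for illustration, a proportional coupling in which each stage multiplies the delay accumulated so far by a random positive factor, and it looks at the distribution of the resulting cumulative delay. The function names, the number of stages, and the uniform distribution of the factors are all hypothetical choices. Because the logarithm of a product is a sum, the central-limit theorem suggests that the log of the cumulative delay should be close to normal, that is, the delay itself close to lognormal.

```python
import math
import random
import statistics


def skewness(xs):
    """Sample skewness (third standardized moment) of a list of numbers."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)


def cumulative_delay(n_stages, rng, base_delay=0.001, max_gain=0.5):
    """Total delay after n_stages coupled subprocesses.

    Assumption (not from the source): each stage scales the delay
    accumulated so far by a random factor (1 + eps), eps ~ Uniform(0, max_gain).
    """
    delay = base_delay
    for _ in range(n_stages):
        delay *= 1.0 + rng.uniform(0.0, max_gain)
    return delay


rng = random.Random(0)  # fixed seed so the run is repeatable
delays = [cumulative_delay(30, rng) for _ in range(5000)]
log_delays = [math.log(d) for d in delays]

skew_raw = skewness(delays)      # positive: long right tail, as for a lognormal
skew_log = skewness(log_delays)  # near zero: log of the delay is close to normal
print(f"skewness of delays:      {skew_raw:.2f}")
print(f"skewness of log(delays): {skew_log:.2f}")
```

A skewness near zero for the logarithms, together with a clearly positive skewness for the raw delays, is what a lognormal shape would produce; the sketch says nothing about how quickly a real neuromuscular chain would converge.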

In this context, the control of the velocity module can now be seen as resulting from the simultaneous activation (at $t_0$) of a controller made up of two antagonistic neuromuscular systems, with commands of amplitude $D_1$ and $D_2$ respectively. Both systems react to their specific commands with an impulse response described by a lognormal function, whose parameters characterize the time delay and the response time of each process [Pla95b].
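Written out, the weighted difference of two lognormals described above can be sketched as follows. The notation is assumed here for illustration, since the symbols are not preserved in this text: $v(t)$ is the curvilinear velocity, $t_0$ the time of the command pair, $D_1$ and $D_2$ the agonist and antagonist command amplitudes, and $\mu_i$, $\sigma_i$ the log time delay and log response time of each system. The lognormal $\Lambda$ is taken in its standard density form.

```latex
v(t) = D_1\,\Lambda(t;\, t_0, \mu_1, \sigma_1^2) \;-\; D_2\,\Lambda(t;\, t_0, \mu_2, \sigma_2^2),
\qquad
\Lambda(t;\, t_0, \mu, \sigma^2) =
\frac{1}{\sigma\sqrt{2\pi}\,(t - t_0)}
\exp\!\left( -\frac{\bigl[\ln(t - t_0) - \mu\bigr]^2}{2\sigma^2} \right),
\quad t > t_0 .
```

Since each lognormal integrates to one, the total curvilinear distance covered by a stroke under this form is $D_1 - D_2$.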

One of the most stringent conclusions of this model, apart from its consistency with the major psychophysical phenomena regularly reported in studies dealing with speed/accuracy tradeoffs, is that the angular component of the velocity vector simply emerges from this superimposition process and is not controlled independently by a specific delta lognormal generator [Pla95a]. Each string is thus made up of a combination of curvilinear strokes, that is, curvilinear displacements characterized by delta-lognormal velocity profiles. Strokes can be described in terms of nine different parameters: $t_0$, the time of occurrence of a synchronous pair of input commands; $D_1$ and $D_2$, the amplitudes of the agonist and antagonist commands respectively; $\mu_1$, $\mu_2$ and $\sigma_1$, $\sigma_2$, the logtime delays and the logresponse times of the agonist and the antagonist systems; and $\theta_0$ and $C_0$, the initial postural conditions, that is, stroke orientation and curvature. In this general context, a curvilinear stroke is thus defined as a portion of the pentip trajectory that corresponds to the curvilinear displacement resulting from the production of a delta-lognormal velocity profile, produced by a specific generator in response to a specific pair of impulse commands fed into it. These strokes are assumed to be the fundamental units of human handwriting movement and serve as the coding elements of the motor plan used in trajectory generation.
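To make the nine-parameter description concrete, the following minimal sketch synthesizes a short string from two time-overlapped strokes. It is not the author's implementation. The field names, the parameter values, and the integration step are hypothetical, and the sketch assumes that the direction of each stroke turns in proportion to the distance travelled along it (constant curvature), which is one simple reading of "stroke orientation and curvature". Each stroke's speed is a weighted difference of two lognormals, and the pen-tip velocity is the vector sum of the strokes active at a given time.

```python
import math
from dataclasses import dataclass


@dataclass
class Stroke:
    """Nine stroke parameters; field names and units are illustrative."""
    t0: float      # time of the synchronous pair of input commands (s)
    d1: float      # agonist command amplitude
    d2: float      # antagonist command amplitude
    mu1: float     # agonist log time delay
    mu2: float     # antagonist log time delay
    sigma1: float  # agonist log response time
    sigma2: float  # antagonist log response time
    theta0: float  # initial orientation (rad)
    c0: float      # curvature (rad per unit length)


def lognormal(t, t0, mu, sigma):
    """Lognormal impulse response; zero up to the command time t0."""
    if t <= t0:
        return 0.0
    x = t - t0
    return math.exp(-((math.log(x) - mu) ** 2) / (2 * sigma ** 2)) / (
        sigma * math.sqrt(2 * math.pi) * x)


def speed(s, t):
    """Delta-lognormal curvilinear velocity of stroke s at time t."""
    return (s.d1 * lognormal(t, s.t0, s.mu1, s.sigma1)
            - s.d2 * lognormal(t, s.t0, s.mu2, s.sigma2))


def trajectory(strokes, t_end, dt=0.001):
    """Integrate the vector sum of stroke velocities into pen-tip positions."""
    x = y = 0.0
    travelled = [0.0] * len(strokes)  # curvilinear distance per stroke
    points = [(x, y)]
    for k in range(int(round(t_end / dt))):
        t = (k + 0.5) * dt  # midpoint of the current time step
        vx = vy = 0.0
        for i, s in enumerate(strokes):
            v = speed(s, t)
            # Assumed: direction turns at rate c0 per unit distance travelled.
            angle = s.theta0 + s.c0 * travelled[i]
            vx += v * math.cos(angle)
            vy += v * math.sin(angle)
            travelled[i] += v * dt
        x += vx * dt
        y += vy * dt
        points.append((x, y))
    return points, travelled


# Two strokes; the second starts before the first has finished.
strokes = [
    Stroke(0.00, 6.0, 1.0, -1.6, -1.4, 0.25, 0.25, 0.0, 0.2),
    Stroke(0.15, 5.0, 1.0, -1.6, -1.4, 0.25, 0.25, math.pi / 2, -0.2),
]
points, travelled = trajectory(strokes, t_end=1.5)
print(points[-1], travelled)
```

With these values the distance travelled along each stroke comes out close to the difference of its two command amplitudes, since each lognormal has unit area. Note that no stroke carries an explicit angular velocity command: the direction of the summed velocity changes over time only because the strokes overlap.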

2.6.3 Testing the Model

Several comparative studies have been conducted to test and validate this model [PAYL93,AP94,AP93]. Without entering into the details of each study, let us simply point out that it was concluded that the delta equation was the most powerful in reconstructing curvilinear velocity profiles and that its parameters were consistent with the hierarchical organization of the movement generation system. Computer simulations have also demonstrated that the delta lognormal model predicts the majority of phenomena consistently reported by many research groups studying the velocity profiles of simple movements [Pla95b].

2.6.4 Conclusion

The delta lognormal model thus provides a realistic and meaningful way to analyze and describe handwriting generation, and it yields information that can be used, in a perceptivomotor context, to tackle recognition problems. Its first practical applications have been the development of a model-based segmentation framework for the partitioning of handwriting [Pla92] and its use in the development of an automatic signature verification system [Pla94b]. Based on this model, a multilevel signature verification system was developed [Pla94a], which uses three types of representations based on global parameters and two others based on functions. The overall verification is performed using a stepwise process at three distinct levels, using personalized decision thresholds.

2.6.5 Future Directions

As long as we partially succeed in processing handwriting automatically by computer, we will see on-line tools designed to help children learn to write appearing on the market, as well as intelligent electronic notebooks, signature verification, and recognition systems, not to mention the many automated off-line systems for processing written documents.

In order to see these newest inventions (all of which are dedicated to the popularization of handwriting) take shape, become a reality, and not be relegated to the status of laboratory curios, a great deal of research will be required, and numerous theoretical and technological breakthroughs must occur. Specifically, much more time and money must be spent on careful research and development, but with less of the fervor that currently prevails. False advertising must be avoided at all costs when technological breakthroughs are made, when development is still far from complete and any undue optimism arising from too many premature expectations risks compromising the scientific achievement.

In this perspective, multidisciplinarity will play a key role in future developments. Handwriting is a very complex human task that involves emotional, rational, linguistic and neuromuscular functions. Implementing any pen-based system requires us to take a few of these aspects into account. To do so, we have to understand how we control movements and how we perceive line images. Any breakthrough in the field will come from a better modeling of these underlying processes at different levels with various points of view. The intelligent integration of these models into functional systems will require the cooperation of scientists from numerous complementary disciplines. It is a real challenge for patient theoreticians.


