The sensory and perceptual processes whereby the acoustic waveform of speech is transformed into a sequence of phonetic units are to be studied. Novel combinations of methods of short-term spectral analysis will be evaluated for their effectiveness in deriving a psychoacoustically appropriate spectral envelope and in subsequently eliminating from it confusing perturbations due to spectral tilt, nasalization, and voice quality. The spectral envelopes resulting from such short-term analyses will be subjected to further "higher" processing designed to reveal the phonetic content of the speech. These higher processes will be modelled in terms of an auditory-perceptual theory of phonetic recognition. The theory itself is to be implemented on a digital computer with three-dimensional display capabilities. In this way, precise quantitative evaluation of the theory, as well as rapid modification of its parameters and concepts, becomes possible. Experimental studies designed to estimate the parameters of these models and evaluate their usefulness include the study of samples of naturally produced syllables, sentences, and connected discourse. The human perception of synthetically produced speech will be studied with the identification technique. Emphasis will be placed on the perception of vowels, diphthongs, r-colored vowels, and voiced stops; on a hypothesized sensory-perceptual transformation; and on hypothesized dynamic criteria for the segmentation of the auditory stream into events and "speech sounds". To shed further light on the adequacy of the auditory-perceptual approach, we will program an automatic system designed to recognize limited vocabularies of single words for a single male talker and a single female talker, but with the potential for generalization.
The improved understanding of the speech perception process that might emerge from these studies could have profound implications for rehabilitative procedures and devices designed for those with impaired speech perception.