To tell a story, give directions, or describe the layout of a house, speakers must generate multiple utterances in sequence. Because most psycholinguistic research on speaking is based on paradigms designed to elicit single utterances, little is known about multi-utterance language production in children and adults. Linked to this empirical focus on single utterances is the widespread use of a method in which subjects describe simple visual images containing just a handful of objects shown in straightforward spatial arrangements. In contrast, real-world scenes contain multiple objects in various relationships that can be described in numerous ways, and so the attentional and language systems face a challenging set of decisions concerning where to begin the description, how to cluster objects and capture their relations, and what information to include or omit. This project uses complex, real-world scenes as a tool to examine the linearization of complex thoughts into sequenced utterances, focusing on adults at this investigative stage in order to establish developmental benchmarks. Image and semantic characteristics of complex scenes will be precisely quantified and used to generate predictions about the allocation of attention as a scene is viewed and described. The project examines the conditions under which scene image features exogenously draw the eyes to specific visual areas that the linguistic system then describes, and the conditions under which the cognitive system guides the eyes to meaningful regions of the scene, allowing the language system to prepare a description even before the relevant object or region has been fixated. On this latter view, the language and cognitive systems use scene meaning interactively to formulate a linearization plan for coordinating the production of multiple utterances.
To address these theoretical issues, the project focuses on three Specific Aims: (1) To use computational tools from the field of visual cognition to measure image and meaning properties of complex scenes, which will permit the precise quantification of features controlling attention during speaking tasks as well as the selection and sequencing of linguistic content. (2) To determine the extent to which viewers predict the presence of objects and their locations and use those predictions to get a head start on linguistic encoding even before an object is attended. (3) To extend our approach to the production of utterances describing events by applying the same methods for quantifying scene image and meaning properties that have been developed for nonevent scenes to scenes depicting events with and without animate agents. The project tests an innovative theory of multi-utterance language production which assumes that speakers formulate a linearization plan to guide the allocation of attention and linguistic decisions concerning inclusion and ordering of information. This approach will lead to a deeper understanding of language production, which in turn will support the identification of testable, rigorous hypotheses concerning the emergence of these processes across development.