Visual Attention and Learning: Interconnectedness of Auditory and Visual Distractions
Many learning tasks that children encounter necessitate the ability to direct and sustain attention to key aspects of the environment while simultaneously tuning out irrelevant features. Research suggests that successful learning depends in part on the ability to selectively focus attention. Children have difficulty learning in chaotic environments containing visual or auditory distractors. Research on these issues tends to be siloed: Relatively little cross talk occurs among researchers with expertise in auditory and language development and researchers focusing on visual attention and learning. We argue that each domain has important implications for the other, and considering visual and auditory distractions jointly may lead to new insights and recommendations for best practices for caregivers, educators, developers, and policymakers.
The Significance of Selective Attention
The environment contains many distinct sources of visual and auditory information; however, only a subset of this information may be relevant for a particular learning task. Thus, to learn, children must selectively attend to relevant features of the environment at the expense of others. Imagine a child sitting at the kitchen table listening to a caregiver read a story. This task might require the child to visually attend to the illustrations and carefully listen to the story. Simultaneously, the child needs to ignore the sights and sounds of a busy household that are irrelevant to the task at hand (e.g., an intricate tablecloth, the dishwasher humming, the dog walking past). Attention regulation can be automatic, meaning that attention is captured by salient aspects of the environment, such as loud sounds, bright colors, and motion, or top-down and voluntary, based on an individual’s goals and interests. Early in development, selective sustained attention is largely driven by stimulus properties such as brightness, contrast, and novelty. As brain regions such as the prefrontal cortex mature, children acquire increasing ability to deploy attention voluntarily. This dual model of attention regulation is a common framework in cognitive psychology; however, debate exists over precisely how to define attention, its components, and its functions. Prior work has examined the relationship among attention, task performance, learning, and academic achievement. Although selective sustained attention is hypothesized to be critical for learning in both the auditory and visual domains, a disproportionate amount of research on the development of attention has focused on visual attention. Additionally, research with infants and toddlers tends to focus on attention that is regulated automatically.
Auditory vs. Visual Attention: Key Differences
Many differences exist between the auditory and visual domains. For example, although sounds may be sustained over time, decay and transience are among their fundamental properties. Thus, auditory processing often involves making sense of rapidly changing or disappearing signals. In contrast, visual input is more stable and less likely to disappear suddenly or quickly. Consequently, visual processing may be less fundamentally linked to the temporal principles governing auditory processing. Research suggests that the temporal dynamics favoring learning in the visual domain often differ from those of similar learning tasks presented in the auditory domain. Just as temporal dynamics are critical to auditory processing, spatial factors are critical to visual processing. To attend to a target among distracting objects, a viewer must localize it in space. At a physiological level, visual information is spatially distributed, such that information from different objects is processed by different receptors in the eye. In audition, all sounds are funneled down the ear canal to the tympanic membrane, or eardrum; their collective vibrations are transmitted to the cochlea and the auditory receptors. The brain must then reseparate the target and distracting signals before it can make sense of an attended signal. In some respects, visual distractions may be easier to ignore than auditory distractions, especially if an individual can physically orient away from them. For instance, desk dividers can shield against visual distractions and focus attention on instructional materials. Another important difference relates to phenomena associated with increasing the number of auditory and visual objects. In vision, increasing the number of objects can produce clutter, possibly increasing the difficulty of maintaining attention to a target.
In contrast, increasing the number of auditory signals can fuse them into a single noise that is more intense but also less variable and thus less likely to cause distraction; this is particularly true of voices. Consequently, attentional effects based on the number of objects likely differ by domain. There are also differences in modality dominance: Infants and toddlers rely more on auditory information in contexts in which auditory and visual information compete. Around age 4, this preference evens out, and eventually visual information begins to dominate.
Commonalities Between Auditory and Visual Attention
There are also similarities between the auditory and visual domains. For example, background noise can impair processing of a target through energetic or informational masking. In energetic masking, energy from one signal interferes with another; this can occur when a distractor at the same frequency as the target makes the target inaudible or when the auditory representations of two signals interfere as a result of spread of excitation on the basilar membrane. In vision, spatial occlusion of one object by another can be thought of as analogous to energetic masking. Informational masking refers to cases in which a potential distractor causes confusion, making the listener uncertain of which sounds belong to which signal; here, the target signal typically remains partially or even fully audible but can still be difficult to distinguish from background noise. In the visual domain, an analogous scenario occurs when a target object is fully visible but presented among other objects. In both domains, distractors can be simple and static, such as the relatively constant hum of the air conditioning or plain, unadorned stationary objects. Distractors can also be complex and variable, for example, speech sounds changing in frequency, pitch, or volume, or objects or displays that are bright, moving, or patterned. Regardless of domain, individuals may find it more difficult to habituate to or ignore variable and complex stimuli. Finally, intense auditory and visual information can cause frustration and stress, and in some cases physical damage; for example, very loud sounds and bright lights can damage sensory receptors. Tolerance for extraneous noise and clutter may also vary across individuals. Children with autism or hearing loss may be disproportionately affected by extraneous information in the environment because of heightened sensitivity to noise and susceptibility to visual distractions.
The Effects of Background Noise and Visual Clutter
Background noise can be detrimental to children’s speech comprehension and learning, which is important given that noise levels in day-care centers and schools frequently exceed recommended levels. Hygge, Evans, and Bullinger (2002) reported a variety of negative effects on cognitive performance measures in elementary school students exposed to aircraft noise. Even more pleasant background sounds carry learning costs: instrumental music can impair infants’ learning from television. Similarly, background speech can disrupt the acquisition of new labels: McMillan and Saffran (2016) found that toddlers struggled to learn new labels unless the labels were substantially louder than the background speech. Although the cause of such difficulties in listening and learning when noise is present is uncertain, the early maturation of the auditory system suggests that attentional difficulties, rather than sensory immaturity, may be implicated. Background noise that varies in content or volume over time may automatically capture attention, resulting in divided attention. Findings such as these have led to classroom design recommendations to improve acoustics by adding drop ceilings, acoustical ceiling tiles, carpeting, and noise-absorbing surfaces. Specific recommendations regarding this latter acoustical modification include incorporating noise-absorbing materials, such as cork bulletin boards, and hanging quilts, flags, and student work from classroom walls. However, such recommendations should be tempered by consideration of how these design elements interact with children’s visual attention: A growing literature has found greater inattention and reduced learning outcomes in environments containing visual distractions, such as educational posters and artwork, compared with visually streamlined environments.
Similarly, classroom complexity and color are negatively related to student achievement; however, Barrett, Davies, Zhang, and Barrett (2015) recently found evidence of a curvilinear relationship suggesting that moderate amounts of visual stimulation may be optimal for learning. As discussed above, visual clutter can be detrimental for school-age children, but it can also serve as a distraction and impair learning in early childhood. For example, visual clutter can impede vocabulary acquisition: Pereira et al. (2014) found that toddlers’ acquisition of novel labels was enhanced when the target was centrally positioned in the toddlers’ view, with few or no distractors, compared with cases in which the target was less central or among more distractors (see also Horst, Scott, & Pollard, 2010). However, here again, the relationship may be curvilinear, as label acquisition may be enhanced by the presence of a single distractor compared with conditions under which all visual clutter is omitted. The complexity of visual stimuli, or overloading, also affects preschoolers’ ability to learn new words. Three-year-olds struggled to learn new words from books containing multiple illustrations per page compared with visually streamlined books containing a single illustration per page. Additionally, young children show diminished learning outcomes when learning novel words or content from books containing pop-ups or manipulative features compared with standard picture books. Nonetheless, efforts to increase attention and engagement while reading have resulted in electronic books filled with animations and sound effects.
Integrating Auditory and Visual Considerations in Learning Environments
These examples highlight the importance of integrating across disciplines, and they underscore the significance of considering both auditory and visual properties and their potential for distraction when creating learning environments and instructional materials that support early learning. For example, if quilts are used to muffle distracting sounds, solid, neutral-colored materials may be less visually distracting than colorful or patterned fabric. As discussed above, highly decorated learning environments, even those containing educational content, can increase inattention and decrease learning. Educational practitioners can help mitigate these negative effects by reducing the amount of visual material displayed in the classroom. Instead of decorating the classroom itself, educators can create exhibits showcasing student work in hallways or the cafeteria. With advancements in technology, classrooms can become adaptive places where only materials relevant for the current lesson are projected, reducing attentional competition between the visual environment and the learning activity, a possibility we are currently investigating. One could easily extend these ideas to other formats, including educational applications, games, books, and television programming.
Eye-Tracking and Visual Attention: A Window into Cognitive Development
To provide a more direct assessment of cognitive development, one highly promising method is eye-tracking. Eye-tracking is a noninvasive technology that provides fine-grained temporal and spatial resolution on a child’s direction of gaze and can be largely automated to produce scalable measures of individual differences in visual attention in early development. Humans are a highly visual species who control information input through eye movements, making visual attention a particularly crucial modality for learning in early development. Eye-tracking tasks can be conducted without complex verbal instruction and do not rely on children’s comprehension ability or motor skills. Eye-tracking also limits potential researcher bias: It minimizes the role of the researcher and provides a more direct and objective measure of children’s processing. Additionally, compliance rates are usually high because children are not required to wear any equipment or to interact with anyone unfamiliar, enabling a broader community of children to participate successfully. Many developmental studies have used eye-tracking to investigate attention, but these have typically compared group differences in performance on a single eye-tracking task, or a battery of tasks, in small cohorts. Using a comprehensive battery of visual attention tasks also allows investigators to move beyond metrics extracted from single tasks and to test whether profiles of attention across the battery are consistent with theoretical models of visual attention. In designing eye-tracking tests of attention, many investigators have distinguished between endogenous and exogenous control of attention. Endogenous control involves executive attention systems that select goal-relevant actions by resolving conflict between competing inputs or impulses. Conversely, exogenous control is stimulus-driven and relies primarily on ‘bottom-up’ mechanisms.
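To make the automation described above concrete, the kind of scalable gaze metric such batteries yield can be sketched in a few lines. The sample format, AOI coordinates, and function names below are illustrative assumptions, not any particular eye-tracker’s API.

```python
# Hypothetical sketch: scoring gaze samples against a rectangular area of
# interest (AOI), the kind of automated metric eye-tracking batteries yield.
from dataclasses import dataclass

@dataclass
class GazeSample:
    t_ms: float   # timestamp in milliseconds
    x: float      # horizontal gaze coordinate (screen pixels)
    y: float      # vertical gaze coordinate (screen pixels)

def in_aoi(sample, aoi):
    """aoi = (x_min, y_min, x_max, y_max) in screen pixels."""
    x0, y0, x1, y1 = aoi
    return x0 <= sample.x <= x1 and y0 <= sample.y <= y1

def proportion_looking(samples, aoi):
    """Proportion of gaze samples falling inside the AOI,
    e.g., a face region in a social-attention trial."""
    if not samples:
        return 0.0
    hits = sum(in_aoi(s, aoi) for s in samples)
    return hits / len(samples)

# Example: a 3-sample trial with a face AOI on the left half of the screen
trial = [GazeSample(0, 200, 300), GazeSample(16, 210, 310), GazeSample(33, 900, 300)]
face_aoi = (0, 0, 640, 720)
print(proportion_looking(trial, face_aoi))  # 2 of 3 samples fall in the AOI
```

Because the scoring is purely geometric, it runs identically across hundreds of children, which is what makes such measures scalable and researcher-independent.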
Although most work on the development of attention has employed non-social stimuli, there has been increasing recent interest in the construct of ‘social attention’, the motivation to attend to social stimuli such as people and faces. This is of particular interest given its perturbation in the emergence of neurodevelopmental conditions such as autism. A rich historical literature indicates that infants orient to faces from birth and show preference for face-like stimuli (versus non-face configurations), direct gaze (versus averted gaze or eyes-closed gaze) and biological motion (versus inverted or scrambled) throughout development.
Visual Attention Span and Reading Fluency: A Complex Relationship
Reading fluency refers to the ability to read rapidly and accurately in order to comprehend text. However, some individuals never develop fluent reading skills, which can have severe academic, economic, and psychosocial consequences. Hence, it is necessary to explore the developmental mechanisms of reading fluency to help these struggling readers improve their fluent-reading skills. Fluent reading involves the simultaneous visual processing of several orthographic units, which mainly reflects the capacity of the visual attention span (VAS). The VAS refers to the number of distinct visual elements that can be processed in parallel in a multi-element array. The connectionist multi-trace memory model of polysyllabic word reading proposed by Ans et al. (1998) provides a possible explanation for the relationship between VAS and reading fluency (especially at the word/lexical level of fluent reading). According to this model, there are two main reading procedures (i.e., the global and analytic reading modes). These two procedures differ in the size of the visual attention window from which orthographic information is extracted. In the global reading procedure, the visual attention window extends over the whole string, contributing to the generation of the entire phonological output; by contrast, in the analytic reading procedure, the visual attention window narrows to process single elements of the visual string serially and to encode the relationship between orthographic and phonological sub-lexical segments. Fluent reading thus involves attention allocation across letters, and VAS capacity limits the number of letters that can be processed in parallel, which in turn affects the processing of orthographic and phonological representations of the sequence.
Previous studies have typically adopted whole/partial report tasks and modified paradigms to measure VAS capacity, thus further examining the relationship between VAS and fluent reading.
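As a hypothetical illustration of how such a whole-report task might be scored, the sketch below counts letters reported from a briefly flashed string and summarizes VAS capacity as mean letters correct per trial. Scoring conventions vary across studies; the position-independent credit used here is an assumption, as are the example strings.

```python
# Hypothetical sketch of scoring a whole-report visual attention span (VAS)
# task: a five-letter string is flashed briefly and the child reports all
# letters seen. This version credits a letter regardless of its reported
# position (an assumption; some studies require correct position).
def score_whole_report(presented, reported):
    """Number of presented letters that appear in the report
    (position-independent, each reported letter credited at most once)."""
    remaining = list(reported)
    correct = 0
    for letter in presented:
        if letter in remaining:
            remaining.remove(letter)
            correct += 1
    return correct

def vas_estimate(trials):
    """Mean letters correct per trial across a block, a common
    summary of VAS capacity."""
    scores = [score_whole_report(p, r) for p, r in trials]
    return sum(scores) / len(scores)

# Three example trials: (presented string, child's report)
trials = [("RHSDM", "RHSD"), ("TLNPF", "TLXP"), ("BGKWC", "BGKWC")]
print(vas_estimate(trials))  # (4 + 3 + 5) / 3 = 4.0
```

In partial-report variants, only one cued position must be reported per trial; the same capacity logic applies, with accuracy aggregated across cued positions instead of across whole strings.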
Visual Attention Paradigms and Reading Difficulties
Relations of visual attention to reading have long been hypothesized; however, findings in this literature are quite mixed. These relations have been investigated using several different visual attention paradigms and with variable controls for other, competing reading-related processes. We extended current knowledge by evaluating four of the key visual attention paradigms used in this research (visual attention span, attentional blink, visual search, and visuospatial attention) in a single study. We tested the relation of these paradigms to reading in 90 middle schoolers at high risk for reading difficulties, while considering their effects in the context of known language predictors. Performance on the visuospatial attention, visual search, and attentional blink paradigms showed weak, nonsignificant relations to reading. Visual attention span tasks showed robust relations with reading even when controlling for language, though only when stimuli were alphanumeric. Although further exploration of visual attention in relation to reading may be warranted, the robustness of this relationship appears questionable, particularly once methodological factors associated with the measurement of visual attention are taken into account. Reading skill and reading difficulties are widely considered to be rooted in language, with phonological awareness and rapid access to alphabetic knowledge widely recognized as essential to the reading process. Many researchers also view other cognitive (i.e., non-language) processes as important for reading, with their contributions being “value added” after controlling for primary language factors. An alternative view is that some cognitive skills are as central to reading as language skills, or more so, serving as precursors of or contributors to the relationship of language to reading. One such alternative view involves visual attention.
There is little reason to doubt that attention is involved in the reading process, but to clarify its role, it is important to understand what is meant by attention, how it is measured, and how it affects the reading process. Because reading is manifestly visual (i.e., words are printed on a page), studies of attention and reading have focused on the role of visual attention. Most explanations of why visual attention is relevant to reading hold not only that reading inherently requires visual processing, but also that the focus of selective attention must move over a word in a spatial fashion (e.g., left to right in many written languages) and settle on specific graphemes (letters or syllables). Spatial attention is thought by some to be the mechanism through which a magnocellular deficit causes dyslexia, because of the connection of the magnocellular pathway to the posterior parietal cortex, which is known to be involved in selective spatial attention. Given the range of views and empirical results, the role of visual attention in reading remains an area worthy of investigation, particularly when doing so addresses multiple gaps in this literature. For example, studies that evaluate visual attention have primarily been conducted in transparent orthographies (e.g., Italian), where reading difficulties are characterized more by deficits in reading fluency than in reading accuracy. This contrasts with opaque orthographies (e.g., English), where accuracy deficits are much more prominent. Studies also vary in the extent to which language factors are controlled for when interpreting the role of visual attention.

