Matching utterances with visual scenes: Neurocomputational investigation of the language-vision interface

Posted on 2013-12-03, 01:19. Authored by Victor Barres, Michael Arbib

Poster presented at the Society for the Neurobiology of Language conference, San Diego, November 2013

Abstract

Humans can do a great many things with language. Among these, talking about the world one perceives, describing compactly what has caught one’s attention, and, conversely, matching descriptions with a visual scene are key from both an evolutionary and a developmental perspective. We present a new model of situated comprehension that establishes a bridge between the dynamical incremental processing approaches of neural net models (e.g., CIANet, Mayberry et al. 2009) and the cognitive analyses of the relations between linguistic knowledge and sensory-motor systems (e.g., ECG, Feldman & Narayanan 2004), while adding the architectural constraints necessary to simulate the functional consequences of brain lesions. Specifically, based on a conceptual analysis by Barrès and Lee (2013), we present an implemented schema-level neurocomputational model of language comprehension during sentence-picture matching tasks, in both normal subjects and agrammatic patients, that rests on three key elements.

(1) Construction grammar and visually anchored semantics. The model uses the formalism of Template Construction Grammar, a visually grounded construction grammar that bridges between schema theory and linguistic theory (Arbib & Lee 2008). Semantic structures extracted from or matched against the cognitive representation of visual scenes are represented as SemReps, graph structures that incorporate conceptual elements (nodes), their thematic or spatial relations (edges), their relevance (activity levels), and their links to visuo-spatial regions. Grammatical knowledge is modeled as a network of construction schemas defined as mappings between linguistic forms and SemReps.

(2) Neural architecture. Going beyond Arbib & Lee, the two-route functional architecture of the new model reflects neuropsychological data showing that (a) world knowledge plays a role alongside grammatical knowledge during comprehension and can survive a lesion to a “grammatical route” (Caramazza & Zurif 1976), and (b) world knowledge (“heavy” semantics) should be distinguished from the “light” semantics engaged in characterizing the semantico-syntactic categories of slot fillers in constructions (Kemmerer 2000).

(3) Dynamical distributed system. The model uses cooperative computation to operationalize distributed processes both within and between functional routes in a way that is consistent with the dynamic nature of neural activity. Sentence-picture matching is therefore simulated in state space as a temporal trajectory of schema activity levels. It rests on a dynamic, self-organized search for a semantic interpretation of linguistic inputs (received one word at a time) that satisfies both world-knowledge and grammatical-knowledge constraints. The two parallel routes process these constraints separately and cooperate online to build a SemRep. The emerging sentence-based SemRep is dynamically compared to the SemRep representing the output of the visual system to determine whether or not the picture matches the sentence.

Within the context of sentence-picture matching tasks, we replicate core patterns of agrammatic comprehension performance. We simulate and quantitatively compare conceptual accounts of agrammatic comprehension, such as thematic-role assignment deficits and working memory limitations. We show their limits by outlining the central role that complex temporal interactions between distributed systems play during the comprehension process. Finally, we model comprehension under adverse conditions in normal subjects and analyze its similarity to agrammatic patients’ performance.
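
To make the SemRep idea concrete, the following is a minimal Python sketch, not the authors' implementation. It represents a SemRep as a small graph whose nodes carry a concept, an activity level, and an optional link to a visual region, and whose edges carry a thematic relation. All names here (`Node`, `SemRep`, `event`, `matches`) are hypothetical, and the comparison is a static subset check at the concept level, a deliberate simplification of the dynamic, activity-driven comparison the abstract describes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    concept: str                  # conceptual element, e.g. "GIRL"
    activity: float = 1.0         # relevance, stored as an activity level
    region: Optional[str] = None  # hypothetical label of a visuo-spatial region


@dataclass
class SemRep:
    nodes: dict = field(default_factory=dict)  # node name -> Node
    edges: set = field(default_factory=set)    # (source, relation, target)

    def add_node(self, name, concept, activity=1.0, region=None):
        self.nodes[name] = Node(concept, activity, region)

    def add_edge(self, source, relation, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.add((source, relation, target))

    def relations(self):
        # Rewrite edges over concepts so graphs with different node names compare.
        return {(self.nodes[s].concept, r, self.nodes[t].concept)
                for s, r, t in self.edges}


def event(agent, action, patient):
    """Build the SemRep of a simple transitive event (assumed illustration)."""
    rep = SemRep()
    rep.add_node("a", agent)
    rep.add_node("e", action)
    rep.add_node("p", patient)
    rep.add_edge("e", "AGENT", "a")
    rep.add_edge("e", "PATIENT", "p")
    return rep


def matches(sentence, scene):
    """True if every concept and relation of the sentence SemRep is in the scene SemRep."""
    sentence_concepts = {n.concept for n in sentence.nodes.values()}
    scene_concepts = {n.concept for n in scene.nodes.values()}
    return (sentence_concepts <= scene_concepts
            and sentence.relations() <= scene.relations())


# A picture in which a girl kisses a boy.
scene = event("GIRL", "KISS", "BOY")
print(matches(event("GIRL", "KISS", "BOY"), scene))  # True
print(matches(event("BOY", "KISS", "GIRL"), scene))  # False: thematic roles reversed
```

The second call illustrates why reversible sentences are informative in sentence-picture matching: both SemReps contain the same concepts, so only the thematic-role edges distinguish a matching picture from a non-matching one.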
