Thomas Aquinas sees a sharp metaphysical distinction between artifacts and substances, but does not offer any explicit account of it. We argue that, for Aquinas, the contribution that an artisan makes to the generation of an artifact compromises the causal responsibility of the form of that artifact for what the artifact is; hence it reduces the metaphysical unity of the artifact to that of an accidental unity. By contrast, the metaphysical unity of a substance is achieved by a process of generation in which the substantial form is solely responsible for what each part, and the whole, of the substance are. This, we submit, is where the metaphysical difference between artifacts and substances lies for Aquinas. We offer on behalf of Aquinas a novel account of the causal process of the generation of substances, in terms of descending forms, and we bring out its explanatory merits by contrasting it with other existing accounts in the literature.
It is well known that real-time human language processing is highly incremental and context-driven, and that the strength of a comprehender's expectation for each word encountered is a key determinant of the difficulty of integrating that word into the preceding context. In reading, this differential difficulty is largely manifested in the amount of time taken to read each word. While numerous studies over the past thirty years have shown expectation-based effects on reading times driven by lexical, syntactic, semantic, pragmatic, and other information sources, there has been little progress in establishing the quantitative relationship between expectation (or prediction) and reading times. Here, by combining a state-of-the-art computational language model, two large behavioral datasets, and non-parametric statistical techniques, we establish for the first time the quantitative form of this relationship, finding that it is logarithmic over six orders of magnitude in estimated predictability. This result is problematic for a number of established models of eye movement control in reading, but lends partial support to an optimal perceptual discrimination account of word recognition. We also present a novel model in which language processing is highly incremental well below the level of the individual word, and show that it predicts both the shape and time-course of this effect. At a more general level, this result provides challenges for both anticipatory processing and semantic integration accounts of lexical predictability effects. Finally, this result provides evidence that comprehenders are highly sensitive to relative differences in predictability, even for differences between highly unpredictable words, and thus helps bring theoretical unity to our understanding of the role of prediction at multiple levels of linguistic structure in real-time language comprehension.
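The logarithmic relationship reported above can be illustrated with a minimal sketch. Note that the bigram model, probabilities, and the intercept/slope values below are invented for illustration; the study itself used a large state-of-the-art language model and non-parametric fits, not anything like this toy:

```python
import math

# Hypothetical toy bigram counts (invented for illustration only).
bigram_counts = {
    ("the", "cat"): 8, ("the", "dog"): 2,
    ("cat", "sat"): 5, ("cat", "slept"): 5,
}

def predictability(prev, word):
    """P(word | prev) under the toy bigram model (no smoothing;
    unseen bigrams would need smoothing in practice)."""
    total = sum(c for (p, _), c in bigram_counts.items() if p == prev)
    return bigram_counts.get((prev, word), 0) / total

def surprisal(prev, word):
    """Surprisal in bits: -log2 P(word | prev)."""
    return -math.log2(predictability(prev, word))

def predicted_rt(prev, word, base_ms=250.0, slope_ms_per_bit=15.0):
    """A log-linear linking function: reading time linear in surprisal,
    hence logarithmic in predictability (intercept/slope made up)."""
    return base_ms + slope_ms_per_bit * surprisal(prev, word)
```

On this linking function, halving a word's predictability adds a constant amount of reading time regardless of whether the word was highly predictable or highly unpredictable to begin with, which is what sensitivity "even for differences between highly unpredictable words" amounts to.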
In visual word identification, readers automatically access word-internal information: they recognize orthographically embedded words (e.g., HAT in THAT) and are sensitive to morphological structure (DEAL-ER, BASKET-BALL). The exact mechanisms that govern these processes, however, are not yet well established: how is this information used? What is the role of affixes in this process? To address these questions, we tested the activation of the meaning of embedded word stems in the presence or absence of a morphological structure, using two semantic categorization tasks in Italian. Participants made category decisions on words (e.g., is CARROT a type of food?). Some no-answers (is CORNER a type of food?) contained category-congruent embedded word stems (i.e., CORN-). Moreover, the embedded stems could be accompanied by a pseudo-suffix (-er in CORNER) or a non-morphological ending (-ce in PEACE), which allowed us to gauge the role of pseudo-suffixes in stem activation. The analyses of accuracy and response times revealed that words were harder to reject as members of a category when they contained an embedded word stem that was category-congruent. Critically, this was the case regardless of the presence or absence of a pseudo-suffix. These findings provide evidence that the lexical identification system activates the meaning of embedded word stems when the task requires semantic information. This study brings together research on orthographic neighbors and morphological processing, yielding results that have important implications for models of visual word processing.
Four language production experiments examine how English speakers plan compound words during phonological encoding. The experiments tested production latencies in both delayed and online tasks for English noun-noun compounds (e.g., daytime), adjective-noun phrases (e.g., dark time), and monomorphemic words (e.g., denim). In delayed production, speech onset latencies reflect the total number of prosodic units in the target sentence. In online production, speech latencies reflect the size of the first prosodic unit. Compounds are metrically similar to adjective-noun phrases in that they contain two lexical and two prosodic words. However, in Experiments 1 and 2, native English speakers treated the compounds as single prosodic units, indistinguishable from simple words, with RT data statistically different from that of the adjective-noun phrases. Experiments 3 and 4 demonstrate that compounds are also treated as single prosodic units in utterances containing clitics (e.g., dishcloths are clean), as they incorporate the verb into a single phonological word (i.e., dishcloths-are). Taken together, these results suggest that English compounds are planned as single recursive prosodic units. Our data require an adaptation of the classic model of phonological encoding to incorporate a distinction between lexical and postlexical prosodic processes, such that lexical boundaries have consequences for postlexical phonological encoding.
Published: 13 May 2016 The present study investigated the proactive nature of the human brain in language perception. Specifically, we examined whether early proficient bilinguals can use interlocutor identity as a cue for language prediction, using an event-related potential (ERP) paradigm. Participants were first familiarized, through video segments, with six novel interlocutors who were either monolingual or bilingual. Then, the participants completed an audio-visual lexical decision task in which all the interlocutors uttered words and pseudo-words. Critically, the speech onset started about 350 ms after the beginning of the video. ERP waves between the onset of the visual presentation of the interlocutors and the onset of their speech significantly differed between trials where the language was not predictable (bilingual interlocutors) and trials where the language was predictable (monolingual interlocutors), revealing that visual interlocutor identity can in fact function as a cue for language prediction, even before the onset of the auditory-linguistic signal. This research was funded by the Severo Ochoa program grant SEV-2015-0490, by a grant from the Spanish Ministry of Science and Innovation (PSI2012-31448), by FP7/2007-2013 Cooperation grant agreement 613465-AThEME, and by an ERC grant from the European Research Council (ERC-2011-ADG-295362) to M.C. We thank Antonio Ibañez for his work in stimulus preparation.
There is widespread interest in the relationship between the neurobiological systems supporting human cognition and emerging computational systems capable of emulating these capacities. Human speech comprehension, poorly understood as a neurobiological process, is an important case in point. Automatic Speech Recognition (ASR) systems with near-human levels of performance are now available, which provide a computationally explicit solution for the recognition of words in continuous speech. This research aims to bridge the gap between speech recognition processes in humans and machines, using novel multivariate techniques to compare incremental 'machine states', generated as the ASR analysis progresses over time, to the incremental 'brain states', measured using combined electro- and magneto-encephalography (EMEG), generated as the same inputs are heard by human listeners. This direct comparison of dynamic human and machine internal states, as they respond to the same incrementally delivered sensory input, revealed a significant correspondence between neural response patterns in human superior temporal cortex and the structural properties of ASR-derived phonetic models. Spatially coherent patches in human temporal cortex responded selectively to individual phonetic features defined on the basis of machine-extracted regularities in the speech-to-lexicon mapping process. These results demonstrate the feasibility of relating human and ASR solutions to the problem of speech recognition, and suggest the potential for further studies relating complex neural computations in human speech comprehension to the rapidly evolving ASR systems that address the same problem domain.
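The core of a multivariate state-to-state comparison of the kind described above can be sketched in representational-similarity terms. This is a simplified illustration of the general logic, not the paper's actual pipeline: the function names, the use of correlation distance, and the Pearson second-order comparison are all choices made for this sketch.

```python
import numpy as np

def rdm(states):
    """Representational dissimilarity matrix: 1 - correlation between
    condition patterns (rows = conditions, cols = features)."""
    return 1.0 - np.corrcoef(states)

def rsa_correlation(machine_states, brain_states):
    """Second-order similarity: correlate the upper triangles of the
    machine-state and brain-state RDMs computed over the same conditions."""
    m, b = rdm(machine_states), rdm(brain_states)
    iu = np.triu_indices_from(m, k=1)  # unique off-diagonal pairs
    return float(np.corrcoef(m[iu], b[iu])[0, 1])
```

In an incremental analysis of this general kind, such a comparison would be repeated at each time step of the unfolding input, yielding a time-course of correspondence between machine and brain states.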
This research was supported financially by an Advanced Investigator grant to WMW from the European Research Council (AdG 230570 NEUROLEX), by MRC Cognition and Brain Sciences Unit (CBSU) funding to WMW (U.1055.04.002.00001.01), and by a European Research Council Advanced Investigator grant under the European Community’s Horizon 2020 Research and Innovation Programme (2014-2020 ERC Grant agreement no 669820) to Lorraine K. Tyler. LS was partly supported by the NIHR Biomedical Research Centre and Biomedical Unit in Dementia based at Cambridge University Hospital NHS Foundation Trust.
The use of written symbols is a major achievement of human cultural evolution. However, how abstract letter representations might be learned from vision is still an unsolved problem [1,2]. Here, we present a large-scale computational model of letter recognition based on deep neural networks [3,4], which develops a hierarchy of increasingly complex internal representations in a completely unsupervised way by fitting a probabilistic, generative model to the visual input [5,6]. In line with the hypothesis that learning written symbols partially recycles pre-existing neuronal circuits for object recognition [7], earlier processing levels in the model exploit domain-general visual features learned from natural images, while domain-specific features emerge in upstream neurons following exposure to printed letters. We show that these high-level representations can be easily mapped to letter identities even for noise-degraded images, producing accurate simulations of a broad range of empirical findings on letter perception in human observers. Our model shows that, by reusing natural visual primitives, learning written symbols requires only limited, domain-specific tuning, supporting the hypothesis that their shape has been culturally selected to match the statistical structure of natural environments [8].
Language production models typically assume that retrieving a word for articulation is a sequential process with substantial functional delays between conceptual, lexical, phonological, and motor processing, respectively. Nevertheless, explicit evidence contrasting the spatiotemporal dynamics of different word production components is scarce. Here, using anatomically constrained magnetoencephalography during overt meaningful speech production, we explore the speed with which lexico-semantic versus acoustic-articulatory information of a to-be-uttered word first becomes neurophysiologically manifest in the cerebral cortex. We demonstrate early modulations of brain activity by the lexical frequency of a word in the temporal cortex and the left inferior frontal gyrus, simultaneously with activity in the motor and posterior superior temporal cortex reflecting articulatory-acoustic phonological features (+LABIAL vs. +CORONAL) of the word-initial speech sounds (e.g., Monkey vs. Donkey). The specific nature of the spatiotemporal pattern correlating with a word's frequency and initial phoneme demonstrates that, in the course of speech planning, lexico-semantic and phonological-articulatory processes emerge together rapidly, drawing in parallel on temporal and frontal cortex. This novel finding calls for revisions of current brain language theories of word production. We thank Elin Runnqvist for her useful comments on previous versions of the manuscript, and we are grateful to Max Garagnani, Yuri Shtyrov, and Olaf Hauk for technical help with the MEG analyses. Kristof Strijkers received funding for this research from the People Programme (Marie Curie Actions) of the European Union's Seventh Framework Programme (FP7/2007-2013) under REA grant agreement number 302807, and from the French Ministry of Research (grant number: ANR16-CE28-0007-01).
Natural language is compositional; the meaning of a sentence is a function of the meanings of its parts. This property allows humans to create and interpret novel sentences, generalizing robustly outside their prior experience. Neural networks have been shown to struggle with this kind of generalization, in particular performing poorly on tasks designed to assess compositional generalization (i.e., where training and testing distributions differ in ways that would be trivial for a compositional strategy to resolve). Their poor performance on these tasks may in part be due to the nature of supervised learning, which assumes that training and testing data are drawn from the same distribution. We implement a meta-learning augmented version of supervised learning whose objective directly optimizes for out-of-distribution generalization. We construct pairs of tasks for meta-learning by sub-sampling existing training data. Each pair of tasks is constructed to contain relevant examples, as determined by a similarity metric, in an effort to inhibit models from memorizing their input. Experimental results on the COGS and SCAN datasets show that our similarity-driven meta-learning can improve generalization performance.
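The task-construction step described above (sub-sampling related examples under a similarity metric) can be sketched as follows. This is one plausible reading of that step, with invented names throughout: the actual method's similarity metric, task sizes, and meta-learning objective on COGS/SCAN are not reproduced here, and string overlap merely stands in for whatever metric the method uses.

```python
import random
from difflib import SequenceMatcher

def similarity(a, b):
    """String-overlap similarity as a stand-in for the paper's metric."""
    return SequenceMatcher(None, a, b).ratio()

def make_meta_task(train_examples, support_size=4, query_size=2, rng=None):
    """Sub-sample a (support, query) task pair: draw query examples at
    random, then fill the support set with the remaining training examples
    most similar to the query, so that the pair contains related examples
    rather than an arbitrary split."""
    rng = rng or random.Random(0)
    query = rng.sample(train_examples, query_size)
    rest = [ex for ex in train_examples if ex not in query]
    rest.sort(key=lambda ex: max(similarity(ex, q) for q in query),
              reverse=True)
    support = rest[:support_size]
    return support, query
```

A meta-learning objective would then treat each (support, query) pair as one episode: adapt on the support set, evaluate on the query set, and optimize for post-adaptation query performance rather than in-distribution accuracy.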