Referential Choice: Predictability and Its Limits
We report a study of referential choice in discourse production, understood as the choice between various types of referential devices, such as pronouns and full noun phrases. Our goal is to predict referential choice, and to explore to what extent such prediction is possible. Our approach to referential choice includes a cognitively informed theoretical component, corpus analysis, machine learning methods and experimentation with human participants. Machine learning algorithms make use of 25 factors, including referent’s properties (such as animacy and protagonism), the distance between a referential expression and its antecedent, the antecedent’s syntactic role, and so on. Having found the predictions of our algorithm to coincide with the original almost 90% of the time, we hypothesized that fully accurate prediction is not possible because, in many situations, more than one referential option is available.
This hypothesis was supported by an experimental study, in which participants answered questions about either the original text in the corpus, or about a text modified in accordance with the algorithm’s prediction. Proportions of correct answers to these questions, as well as participants’ rating of the questions’ difficulty, suggested that divergences between the algorithm’s prediction and the original referential device in the corpus occur overwhelmingly in situations where the referential choice is not categorical.…
Author details: | Andrej A. Kibrik, Mariya V. Khudyakova, Grigory B. Dobrov, Anastasia Linnik, Dmitrij A. Zalmanov |
---|---|
DOI: | https://doi.org/10.3389/fpsyg.2016.01429 |
ISSN: | 1664-1078 |
Pubmed ID: | https://pubmed.ncbi.nlm.nih.gov/27721800 |
Title of parent work (English): | Frontiers in Psychology |
Publisher: | Frontiers Research Foundation |
Place of publishing: | Lausanne |
Publication type: | Article |
Language: | English |
Year of first publication: | 2016 |
Publication year: | 2016 |
Release date: | 2020/03/22 |
Tag: | cross-methodological approach; discourse production; machine learning; non-categoricity; referential choice |
Volume: | 7 |
Number of pages: | 21 |
First page: | 9939 |
Last Page: | 9947 |
Funding institution: | Russian Foundation for Basic Research [14-06-00211]; Basic Research Program of the National Research University Higher School of Economics (HSE); Russian Academic Excellence Project [5-100] |
Peer review: | Refereed |
Institution name at the time of the publication: | Humanwissenschaftliche Fakultät / Exzellenzbereich Kognitionswissenschaften |