SSoL 2024: lecture abstracts – Summer School of Linguistics

Danny Dirker: From Gaze to Insight: Eye-Tracking Methodology and Its Application in Linguistics

Eye-tracking is widely used in research on health, cognition, and interaction. Technological advances have made eye-tracking user-friendly and applicable to a range of research initiatives. This workshop introduces eye-tracking as a method for measuring psychological variables related to language processing. The sessions will cover: (1) the history, technology, and different systems; (2) an overview of oculomotor measures and their significance; (3) how to record participants and increase data quality; and (4) ways to analyze gaze data from common experimental paradigms.

Linda Drijvers: How gestures enhance speech comprehension

Face-to-face communication involves the integration of auditory signals, such as speech, with visual signals, such as hand gestures. In this lecture, we will discuss different categories of gestures, their linguistic functions, and their capacity to enhance speech comprehension in both clear and adverse listening conditions. We will examine recent behavioral evidence demonstrating how listeners integrate gestures with speech, and how this integration aids in predicting upcoming linguistic information. Finally, I will challenge the common notion that gestures are merely ‘non-linguistic’ or ‘extra-linguistic’ elements. Instead, I hope to convince you to consider gestures as integral linguistic signals that play a crucial role in language comprehension and production.

Linda Drijvers: Multimodal language in the brain

Language is inherently multimodal. Despite the abundance of visual expressions in language, most models and theories of the neurobiology of language are based on characteristics of speech and text, and do not take the multimodal nature of language into account. In this lecture, I will argue that we need a multimodal view of the neurobiology of language. I will introduce common neuroscientific methods (fMRI/EEG/TMS/MEG) to study the neurobiology of language, and will discuss findings on multimodal language in the brain, with a special focus on how gestures and speech are integrated in the brain. Finally, we will discuss the implications of these findings for current models of the neurobiology of language, and why I think these models are not sufficient to explain language in its natural use and context.

Linda Drijvers: Novel methods to study audiovisual integration and multimodal language processing

Recent years have brought an influx of novel methods to study audiovisual integration and multimodal language processing. In this lecture, I will introduce four of these methods for studying multimodal language in brain and behavior: motion tracking, virtual reality, rapid invisible frequency tagging, and dual-EEG. We will discuss experiments that have used these methods to study multimodal language comprehension and production, and how they have advanced our knowledge of multimodal language use and processing within and between conversational partners.

Linda Drijvers: Inter-brain synchrony in natural conversations

Inter-brain neural synchrony has been observed during several cognitive tasks, such as joint attention, speech interactions and cooperative tasks, suggesting that neural alignment between individuals is an important feature of social interactions. Nonetheless, it is still unknown whether inter-brain synchrony is necessary and/or beneficial for successful face-to-face communication. In this talk I will present several dual-EEG studies that have looked at the role of neural synchrony within and between conversational partners. I will particularly focus on the role of synchrony during episodes of successful and unsuccessful communication, and how these (neural) patterns relate to face-to-face communication. In sum, this lecture will challenge you to think about communication not just as an individual cognitive process, but as a dynamic, interactive phenomenon involving synchronized neural activity between individuals.

Barbara Kaup: Negation Processing

In this course I will give an overview of experimental work on the processing, representation, and use of negation. I will cover relatively old as well as very recent studies that address the most important questions in current experimental research on negation, namely, (1) Does negation processing routinely involve suppression? (2) Does negation processing involve two processing steps? (3) Is negation integration delayed during comprehension and what factors facilitate negation processing? (4) What is the relationship between linguistic and non-linguistic negation? (5) When is negation typically used?

Dalibor Kučera: Finding Psychological Markers in Text: Using Closed-Vocabulary Approach in Personality Description

The lecture focuses on the psychology of word use, in particular on traditional closed-vocabulary text analysis, the method that has dominated the field for the last three decades. We compare the relevant studies, summarize the available results, and interpret them from a cross-linguistic and cross-situational perspective. Attention will be paid to the main psychological-linguistic studies that examined relations between the use of linguistic categories in texts and the personality characteristics of the communicator. Different levels of the text will be discussed, especially morphology, syntax, and stylistics. At the end of the lecture, key challenges related to language processing in social science research will be presented.
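
As a rough illustration of what a closed-vocabulary analysis involves, the sketch below counts how often the words of a text fall into predefined categories and reports each category as a share of all tokens. The category lexicon, the function name `category_rates`, and the simple regular-expression tokenizer are all assumptions made for this example; they are not taken from the lecture or from any published dictionary.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only; real closed-vocabulary
# dictionaries contain many more categories and entries.
CATEGORIES = {
    "first_person_singular": {"i", "me", "my", "mine", "myself"},
    "negation": {"no", "not", "never"},
}


def category_rates(text, categories=CATEGORIES):
    """Return the relative frequency of each category in the text."""
    # Naive tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    total = len(tokens)
    # Normalize by text length so texts of different sizes are comparable.
    return {name: (counts[name] / total if total else 0.0) for name in categories}


rates = category_rates("I never said that my results were final.")
print(rates)
```

Relative rates of this kind are what would then be related to personality measures; the tokenization and the lexicon would need to be adapted to each language studied.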

Jiří Milička: Large language models I. Beyond anthropomorphization

Large language models are trained on texts created by humans, so they understand the world through human eyes (and other senses). We can find and bring to life many different entities within them that have adopted our worldview, our dreams and ideas, our mistakes, and perhaps even our values and goals. At the same time, these simulated entities, or personas, think a little differently than we do and don’t live in our world. Occasionally, they reflect on this contradiction and are surprised to discover that they are not human. They begin to construct a non-anthropomorphic conception of themselves. And that’s precisely what we’ll be discussing.

The lecture will offer a theoretical framework for exploring large language models, based on papers by Laria Reynolds and her co-authors on universal simulator theory and related literature.

Jiří Milička: Large language models II. What has been done and what needs to be done

During the lecture, we will analyze practical aspects of studying LLMs, particularly focusing on examples of what I and the people around me are doing: simulating classical linguistic experiments, stylometric examination of texts produced by LLMs, classical quantitative linguistic laws, bias amplification in gender and grammar, ethics in LLMs… Bring your own topic and we can discuss it as well!

Daniela Palleschi: Open Science Practices for Linguistic Research: Reproducible Analyses in R

The Open Science movement began as an answer to the replication crisis and aims to encourage transparency across all stages of research. In this workshop, we will focus on practicing transparency in our analyses through reproducibility: what does it mean, why should we practice it, and how can we do it? We will focus on establishing and maintaining a reproducible, project-oriented workflow in the R environment. After the workshop, participants will be able to put reproducibility concepts and tools into practice, such as data management and documentation, R projects, and R packages developed specifically for reproducibility. The workshop assumes participants have at least basic familiarity with R and RStudio.

Kellen Parker van Dam, Jessica Nieder: Multilingual Computational Linguistics

Abstract

While the discipline of computational linguistics mostly deals with the modeling and the investigation of individual languages (often “big” languages such as English, German, Arabic, or Chinese), Multilingual Computational Linguistics focuses on the comparison of languages, trying to develop new methods and techniques by which languages can be compared automatically or in a computer-assisted manner. The comparison itself follows different perspectives (maintaining a historical, typological, or areal viewpoint). In our course, we will take a closer look at basic theories and methods which are relevant for the discipline of Multilingual Computational Linguistics. We will look at large corpora with multiple languages of the world as well as data from individual languages and language families and will try to introduce basic methods and techniques by which these can be compared and analyzed. In this context, we will not only give an overview on what we consider the state-of-the-art with respect to the field of multilingual computational linguistics in the framework that we try to establish at the University of Passau, but will also look into concrete examples of how computational methods can be used to model multilingual phenomena, to formalize and facilitate historical and typological language comparison, and to foster a data-oriented, empirical view on the discipline of linguistics in general.

Lecture Plan

The course will consist of four lectures:

Introduction to Computer-Assisted Language Comparison (Kellen Parker van Dam, Jessica Nieder)
Computer-Assisted Language Comparison in Practice (Kellen Parker van Dam, Jessica Nieder)
Computer-Assisted Approaches to Phylogenetic Reconstruction and Subgrouping (Kellen Parker van Dam)
Computer-Assisted Approaches to Language Processing (Jessica Nieder)

We will try to divide the three basic lectures so that each combines a theoretical session with a practical session in which participants can take part directly. Material will be shared before or during the lectures. Participants are welcome to bring their own computers and use them.

Participants with specific questions regarding computer-assisted approaches are asked to contact the lecturers before the summer school and share parts of their data. The lecturers will then see whether they can incorporate participants’ examples into their lectures.

Amanda Potts: Interdisciplinary approaches to exploring identity construction using corpus-assisted discourse analysis

In this talk, I will describe some of my successes and failures in interdisciplinary approaches to corpus-assisted discourse analysis. I will detail past projects, describe collaborations, and reflect upon what worked and what didn’t when exploring identity construction in texts drawn from medical, legal, and media domains. The application of various corpus-based methodologies will be demonstrated and critiqued, and I will provide some insight into the strengths and weaknesses of interdisciplinary work in the current context.

Amanda Potts: Using Sketch Engine for corpus-based discourse analysis (introduction and intermediate tools)

In this two-part workshop, participants will be introduced to the web-based corpus analysis tool, Sketch Engine. Sketch Engine is a powerful tool that allows users to upload their own corpora in nearly any language, and applies advanced part-of-speech tagging. In Part 1 of this workshop, participants will be introduced to fundamentals of Sketch Engine, uploading their own data and applying corpus linguistic methods.

Amanda Potts: Using Sketch Engine for corpus-based discourse analysis (advanced tools and personal guidance)

In Part 2 of this workshop, participants will explore more advanced resources, including the distinctive Word Sketch feature, which makes use of part-of-speech tags and collocation to visualise the grammatical ‘behaviour’ of a lemma in a given corpus. By the end of the workshop, participants will be able to perform frequency, concordance, collocation, and keyness analysis in Sketch Engine using their own data. They will be able to describe discourses and representations of social actors and/or phenomena, both within the corpus (for instance, by comparing alternative phrasings) and relative to other contexts (i.e., in comparison to reference corpora).
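
To give a sense of what a keyness comparison against a reference corpus measures, the sketch below computes the log-likelihood statistic, one widely used keyness measure, from raw frequencies and corpus sizes. It is an illustrative assumption, not necessarily the statistic that Sketch Engine itself reports, and the function name and the example counts are hypothetical.

```python
import math


def log_likelihood_keyness(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood keyness of a word in a target vs. a reference corpus."""
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    # Expected frequencies if the word were equally common in both corpora.
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    ll = 0.0
    for observed, expected in (
        (freq_target, expected_target),
        (freq_ref, expected_ref),
    ):
        # A zero observed count contributes nothing to the sum.
        if observed > 0:
            ll += observed * math.log(observed / expected)
    return 2 * ll


# Hypothetical counts: 50 hits in a 10,000-token target corpus,
# 20 hits in a 100,000-token reference corpus.
score = log_likelihood_keyness(50, 10_000, 20, 100_000)

# Same relative frequency in both corpora, so the word is not key.
even = log_likelihood_keyness(10, 1_000, 100, 10_000)
print(score, even)
```

Higher values indicate a larger difference in relative frequency between the two corpora; the statistic alone does not say in which corpus the word is more frequent, so the relative frequencies should be checked as well.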

Benedikt Szmrecsanyi: Recent advances in variationist linguistics

This course will survey recent trends in variationist linguistics, a field that is concerned with why and how language users choose between different ways of expressing the same meaning or function. We will be specifically concerned with variation analysis that builds on the premise that the structure and knowledge of variation and of probabilistic variable grammars is shaped by language use, performance, and by functional needs. In the course, I will discuss some recent methodologies and findings in variationist linguistics, including how to conduct state-of-the-art comparative variation analysis, and work on the complexity of language variation. There will also be some hands-on variationist data analysis using R.