SSoL 2015: lecture abstracts – Summer School of Linguistics

Štefan Beňuš: Entrainment in spoken interactions

Entrainment is a ubiquitous tendency observed in human-human dialogues in which interlocutors adapt their communicative behavior to the behavior of their conversational partners in several modalities. In speech, entrainment has been observed in many linguistic and paralinguistic domains (phonetic realization of words, conversational fillers, dialectal features, intensity, speech rate, lexical and syntactic choices, function word use, turn-taking behavior, and others), and linked to several social domains. The talk will review recent advances in this area from my own work and other labs, and will also discuss cross-cultural aspects of entrainment and its potential for applications in human-machine communication.

Jakub Dotlačil: Workshop: How to study and analyze reading – two experimental methods

In the workshop, I will summarize eye-tracking and self-paced reading as two methods employed in the analysis of reading or, more exactly, in the analysis of the time course of reading. I will show how to design an experiment using the self-paced reading method, and I will also talk about some pitfalls of this method and about the analysis and (linguistic) interpretation of the collected data. If we have time, we will look at some examples of self-paced reading experiments.

Michael Dunn: Evolutionary analysis of linguistic change

In this course I will give a brief introduction to using phylogenetic comparative methods for the exploration of grammatical and semantic change. In Lecture 1 we will look at the problem of genealogical inference, and examine what we can know about the structure of language families. In lectures 2 and 3 we will examine how these language genealogies can be used in models of the evolution of other aspects of linguistic structure. We will work through some novel case studies in the evolution of grammatical and semantic structure, and discuss the prospects and potential of a world phylogeny of languages for phylogenetic comparative analysis.

Lecture 1: Genealogical inference

In this lecture we will look at the problem of genealogical inference in traditional historical linguistics and beyond. We will examine what more can be extracted from the historical signal preserved within languages using a synthesis of traditional historical linguistic techniques and computational phylogenetic methods originating from evolutionary biology. The logic of model-based tree inference will be introduced, with a focus on how we can incorporate what we already know, and our confidence in it, into an analysis, and how that enables us to make probabilistic inferences. Importantly, model-based tree inference lets us infer rates of change, as well as real calendar dates.

Lecture 2: Phylogenetic comparative methods

The phylogenetic comparative methods are a family of techniques for investigating phylogenetically structured data. The key requirement these methods have is a quantified model of family relatedness, of the type discussed in Lecture 1. Two methods useful for linguistics are:

(i) Ancestral state reconstruction. A model of language genealogy allows us to analyse linguistic (and in some cases, non-linguistic) features and make statistically sound probabilistic statements about their values at the root of the tree or on any other ancestral node.

(ii) Model testing. Use a language genealogy to infer the evolutionary dynamics of change of a linguistic feature. Or conversely, given a hypothesis about an evolutionary process (e.g. a proposed grammaticalization chain) test how consistent that hypothesis is with the observed data.
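As a rough illustration of the kind of probabilistic statement meant in (i), the sketch below computes the posterior probability of a binary feature at the root of a tiny tree. Everything in it is an assumption made for illustration: the three-language tree, the branch lengths, the rate of change, the flat prior, and the choice of a symmetric two-state Markov model evaluated with the standard pruning recursion. It is not taken from the course materials.

```python
import math

def transition_prob(same, rate, t):
    """Symmetric two-state Markov model: probability that the state at the
    end of a branch of length t is the same as (or different from) the start."""
    decay = math.exp(-2.0 * rate * t)
    return 0.5 + 0.5 * decay if same else 0.5 - 0.5 * decay

def partial_likelihoods(node, rate):
    """Return [L(0), L(1)]: probability of the tip data below `node`,
    given that `node` itself is in state 0 or 1 (pruning recursion)."""
    if "state" in node:  # a tip (attested language) with an observed value
        return [1.0 if node["state"] == s else 0.0 for s in (0, 1)]
    result = [1.0, 1.0]
    for child, branch_length in node["children"]:
        child_lik = partial_likelihoods(child, rate)
        for s in (0, 1):
            result[s] *= sum(
                transition_prob(s == c, rate, branch_length) * child_lik[c]
                for c in (0, 1)
            )
    return result

def root_posterior(tree, rate, prior=(0.5, 0.5)):
    """Posterior probability of each state at the root of the tree."""
    lik = partial_likelihoods(tree, rate)
    joint = [prior[s] * lik[s] for s in (0, 1)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical tree ((A:1, B:1):1, C:2); A and B have the feature, C lacks it.
tree = {"children": [
    ({"children": [({"state": 1}, 1.0), ({"state": 1}, 1.0)]}, 1.0),
    ({"state": 0}, 2.0),
]}
posterior = root_posterior(tree, rate=0.3)  # rate is an arbitrary assumption
print("P(root = 0), P(root = 1):", posterior)
```

In a real analysis the tree, branch lengths and rates would themselves be estimated, typically with uncertainty, rather than fixed by hand as here.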

Lecture 3: Phylogenetic comparative methods continued, prospects

In the final lecture we will discuss one further kind of phylogenetic comparative method:

(iii) Dependency test. If, for instance, we are interested in correlations between linguistic features, then genealogy is a crucial concern: are two linguistic features correlated because there is a functional relationship between them, or is this correlation an artefact of a historical coincidence in a mutual ancestor?

Finally, we will work through some novel case studies in the evolution of grammatical and semantic structure, and discuss the prospects and potential of a world phylogeny of languages for phylogenetic comparative analysis.

Jan Havlíček: Evolution of communication

Although communication is most extensively studied in our own species, there is also enormous variability in communication systems across various nonhuman taxa. Most commonly, animal communication is investigated from the standpoint of evolutionary theory. As communication systems always involve at least two parties (i.e. producer and perceiver), they result from co-evolutionary processes (i.e. change in one party may create selective pressure on the other party). The evolutionary analysis primarily focuses on functional and phylogenetic dimensions. The functional approach analyses the costs and benefits of communication for both the producer and the perceiver. From this perspective, only structures or behaviours that evolved for communication purposes are considered signals (in contrast to cues). Ritualization is currently thought to be the main evolutionary process in the evolution of signals. In contrast, phylogenetic analysis primarily focuses on the historical dimension of the communication system. Such analysis can provide insights into the origins of the individual signalling systems and their temporal changes.

The individual evolutionary processes will be illustrated with examples from various species. Finally, I will discuss the consequences of applying the evolutionary framework to human communication systems, including the evolution of language.

Jan Havlíček: Human chemosensory communication

Olfactory communication shows several specific features as compared to other modalities. These include independence from ambient light and from the presence of the communicating individual, as well as its temporal dynamics. Olfactory communication in humans has been shown to play a significant role in several areas, such as communication between mother and infant, the communication of emotions, and romantic relationships.

In the context of the mother-infant relationship, it appears that infants not only recognise their mother based on her smell but also prefer the smell of mother's milk irrespective of previous experience. There is also some evidence, though incomplete, showing that individual recognition between mother and infant facilitates attachment formation.

There is also a growing body of research showing that exposure to odours collected under emotionally valenced situations changes the perceivers’ cognitive functioning and behaviour, even though they may not be aware of what the odour refers to or may not even perceive it on the supraliminal level. The evidence regarding proximate mechanisms behind the production of such odours is currently missing, which makes it impossible to conclude whether these could be termed signals.

In the context of romantic relationships, most attention has so far been devoted to the role of Major Histocompatibility Complex (MHC) genes in olfactory preferences. The products of the MHC genes play a central role in immune system functioning, namely in self/non-self recognition. The results of most studies show preferences for MHC-dissimilar individuals, which may result in more efficient immune system functioning in the offspring. The MHC-associated preferences appear to be modulated by the use of hormonal contraception and may also affect relationship dynamics. The most intensively studied group of compounds is undoubtedly the 16-androstenes, which can be found in the axilla. Previous studies have shown that the odour of these compounds tends to modulate affective states, mate choice, and sexual arousal. These effects frequently depend on the social context and the concentration of the compound. However, the studies have been criticized for unrealistic methodology and simplistic models of human pheromones. Finally, I aim to discuss the main definitions of human pheromones and how they affect theories on olfaction and sexuality.

Maciej Karpiński: Exploring the boundaries of language

There is much more to a spoken utterance than its lexical content, syntactic structure or linguistic prosody. Some of its features and components can hardly fit into the frames of traditional linguistic analysis but they still significantly contribute to the meaning. In the present course, a selection of paralinguistic phenomena will be presented and analysed. Some practical aspects of their identification in speech and in body motion as well as annotation in multimodal corpora will be discussed.

Emmanuel Keuleers: From the lab to the crowd: How megastudies and crowdsourcing are advancing psycholinguistics

In recent years the amount of data available to researchers interested in language processing and production has grown immensely. In this talk, I will address some of these developments. First, I will discuss megastudies – behavioural experiments in which large amounts of chronometric behavioural data are collected using methods such as lexical decision, naming, progressive demasking, or eye-tracking – and how their analysis informs psycholinguistic theory. Then, I will talk about crowdsourcing and specifically about how the megastudy approach can be scaled from its controlled laboratory settings to online data collection in the browser. In particular, I will report on the design, the execution, and the results of a number of recent crowdsourcing studies in which hundreds of thousands of participants do a short vocabulary test based on the classical lexical decision task, and on how this has led to new insights into the relationship between age, multilingualism, education, and vocabulary size.

Emmanuel Keuleers: When, where, and by whom words happen: A unified view on measures of word frequency, contextual diversity, and word prevalence

Word frequency and related measures such as contextual diversity are among the most important variables in the study of language processing, both theoretically and methodologically. Recently, ‘word prevalence’, the proportion of a population who know a word, has also been shown to be highly predictive of word processing speed. Very often, however, the theoretical characteristics of these measures are not well understood. I will address these issues by first examining the mathematical relationship between the measures and then proposing an information-theoretic interpretation which is closely related to current discriminative-learning-based accounts of language processing.
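To make the three measures concrete, here is a minimal sketch that computes them on toy data. The mini-corpus, the participant responses, and the function names are all hypothetical. The operationalisations are common simple ones (relative token frequency, proportion of documents containing the word, proportion of participants who know the word) and are not necessarily the exact definitions used in the talk.

```python
# Hypothetical mini-corpus: document id -> list of tokens.
documents = {
    "doc1": "the cat sat on the mat".split(),
    "doc2": "the dog chased the cat".split(),
    "doc3": "a quokka is a small marsupial".split(),
}

# Hypothetical vocabulary-test results: participant -> words they know.
known_words = {
    "p1": {"the", "cat", "dog", "quokka"},
    "p2": {"the", "cat", "dog"},
    "p3": {"the", "cat"},
    "p4": {"the", "cat", "dog"},
}

def word_frequency(word):
    """Relative frequency: occurrences of the word / all tokens in the corpus."""
    total_tokens = sum(len(tokens) for tokens in documents.values())
    occurrences = sum(tokens.count(word) for tokens in documents.values())
    return occurrences / total_tokens

def contextual_diversity(word):
    """Proportion of documents (contexts) in which the word occurs at least once."""
    return sum(word in tokens for tokens in documents.values()) / len(documents)

def word_prevalence(word):
    """Proportion of participants who know the word."""
    return sum(word in known for known in known_words.values()) / len(known_words)

for w in ("the", "cat", "dog", "quokka"):
    print(w, word_frequency(w), contextual_diversity(w), word_prevalence(w))
```

In this toy data "dog" and "quokka" have the same frequency and contextual diversity but differ in prevalence, which shows why prevalence can carry information the corpus-based measures do not.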

Anetta Kopecka: Motion across and within languages

While motion represents a universally shared human experience, languages vary considerably in the types of linguistic resources they use to describe motion events and in the ways they assign semantic information to lexical and grammatical devices (Talmy 1985, 1991; Slobin 2004). This course explores research on language variation in this domain of expression focusing on both cross-linguistic and intra-linguistic variation. The first unit of the course provides a typological perspective on cross-linguistic similarities and differences in the expression of semantic components of motion such as Path and Manner by surveying data from typologically varied languages. The second unit explores variation found within individual languages and discusses patterns of innovation and continuity in the expression of motion. Based primarily on Romance languages, it addresses diachronic changes these languages have undergone (e.g. loss of productivity of verb prefixes and verb particles, lexical fusion of prefixes with verbs of motion) and shows the effects of these changes on the description of motion in discourse and the types of information speakers attend to when they depict motion events.

Ágnes Lukács: Natural language acquisition and statistical learning

The acquisition of complex motor, cognitive and social skills like language (or playing a musical instrument or mastering sports) is generally associated with implicit skill learning, which often involves statistical learning (SL). In the two talks, we will focus on SL as a model of language acquisition in child language development, but we will also touch upon sequence learning and categorization. We will evaluate the role of SL in learning different aspects of language (phonology, lexicon and grammar) and discuss the debate over language-specific versus domain-general mechanisms in language acquisition. Summarizing results both from previous studies and from our lab, we will look at how the development of SL mechanisms relates to the proposal of a critical period of language learning, and see how this mechanism operates in developmental and acquired disorders of language.

Ján Mačutek, Michaela Koščová: Cluster analysis and its application in the classification of languages/texts

Introduction to cluster analysis in R. Applications in two areas:

1) A classification of Slavic languages based on grapheme frequencies.

2) Sequential properties of texts (length motifs and distances between words of equal lengths), their mathematical models and a classification of texts based on parameters of the models.
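The workshop itself uses R. Purely as an illustration of the first application, the following sketch shows the general idea in Python with only the standard library: each text is turned into a vector of relative grapheme frequencies, and the texts are then grouped by average-linkage agglomerative clustering on Euclidean distances. The sample strings, the distance measure and the linkage rule are assumptions chosen for brevity, not the presenters' data or method.

```python
from collections import Counter
from itertools import combinations
import math

def grapheme_profile(text):
    """Relative frequency of each alphabetic character in the text."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    return {g: n / len(letters) for g, n in counts.items()}

def distance(p, q):
    """Euclidean distance between two grapheme-frequency profiles."""
    return math.sqrt(sum((p.get(g, 0.0) - q.get(g, 0.0)) ** 2
                         for g in set(p) | set(q)))

def average_linkage(profiles):
    """Agglomerative clustering; returns the merges as
    (cluster_a, cluster_b, average_distance) in the order performed."""
    def linkage(pair):
        a, b = pair
        total = sum(distance(profiles[x], profiles[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Artificial placeholder "texts"; real input would be large language samples.
samples = {
    "text_a": "aaaa bbb cc d",
    "text_b": "aaa bbbb cc d",
    "text_c": "xxxx yyy zz d",
    "text_d": "xxx yyyy zz d",
}
profiles = {name: grapheme_profile(text) for name, text in samples.items()}
merges = average_linkage(profiles)
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 3))
```

The sequence of merges is what a dendrogram would display; with real data, the choice of distance and linkage can change the resulting classification.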

Andrea Pešková: Romance vs. Slavic Pro-Drop

According to the World Atlas of Language Structures (see Dryer 2005), Spanish and Czech belong to the same language type, in which the “normal expression of pronominal subjects [PS] is by means of affixes on the verb” (e.g. sp. canto, cz. zpívám, en. I sing). In Generative Grammar, languages that permit the omission of PS are called pro-drop or null-subject languages (NSL) (e.g. Chomsky 1981; Rizzi 1986). A widespread proposal within this approach is that the subject position of a sentence is occupied by a phonetically empty element (pro). According to the latest typology of NSL proposed by Biberauer et al. (2010), Spanish and Czech are consistent pro-drop languages with typically “rich” verbal morphology. At first sight, both languages share the same properties (omission of subjects, no overt expletives, postverbal subjects etc.). However, it will be shown that there are many differences between them. Whereas Czech pro-drop has not been described in detail so far, there is an extensive bibliography on Spanish (and Romance) pro-drop, within the Generative framework (see above) as well as within empirical variationist research (see e.g. Otheguy et al. 2007). The aims of the talk are threefold: first, I will present an overview of the extensive research on Spanish pro-drop; second, I will compare the two “typologically” similar languages with respect to the expression and omission of PS; and third, I will discuss some methodological issues related to the phenomenon. The paper should also give input to further theoretical and especially corpus-based investigation of the pro-drop property in both languages.

References:

Biberauer, T. – Holmberg, A. – Roberts, I. – Sheehan, M. (2010): Parametric Variation: Null Subjects in Minimalist Theory. Cambridge: Cambridge University Press.

Chomsky, N. (1981): Lectures on Government and Binding. Dordrecht: Foris.

Dryer, M. S. (2005): “Expression of Pronominal Subjects.” In: Dryer, M. S. – Haspelmath, M. (2013): The World Atlas of Language Structures Online. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at http://wals.info, Accessed on 2015-04-20).

Otheguy, R. – Zentella, A. C. – Livert, D. (2007): “Language and Dialect Contact in Spanish in New York: Toward the Formation of a Speech Community,” Language 83(4), 770-802.

Rizzi, L. (1986): “Null Objects in Italian and the Theory of pro,” Linguistic Inquiry 17, 501-557.

Radel Skarnitzl: How Is the Speaker’s Identity Reflected in the Speech Signal?

Identifying familiar speakers is something we do on an everyday basis. A substantially more challenging task is the identification of unfamiliar speakers, which we do in order to aid police investigations. Although the uniqueness of every person’s voice is a very appealing concept, our speech production mechanism is so plastic that one cannot talk about a unique “voiceprint” (by analogy with a fingerprint). Nevertheless, there are ways of capturing the specificity of a given speaker’s voice and speech production in general. The lecture will introduce forensic phonetics as a developing applied discipline and will be mostly dedicated to the most frequent way of identifying speakers, the combination of auditory and acoustic analyses. Practical examples and exercises will be included.

Elisa Sneed German: Online processing in L1 and L2: Similarities and differences revealed by ERPs

Comparisons of L1 and L2 language processing have revealed a mixed bag of results: similarities are found in some linguistic domains and distinct differences are found in others. These variable results may depend on the methodology used, the proficiency of the learners, and the particular linguistic feature(s) studied, among other factors.

In this seminar, we will focus on the results of two studies comparing online language processing by native speakers of French and two groups of late learners (low- and high-intermediate L1 English speakers who have learned French in school). The first study, a syntactic manipulation, investigated the acquisition of French direct object clitics (a category that does not exist in English). The results suggest that increased proficiency leads to increased online sensitivity to this grammatical constraint; however, they also show a mismatch between behavioural results and online results in the low-intermediate learner group.

The second study, a semantic manipulation, investigated online sensitivity to cloze probability (i.e., the likelihood that a given word would occur in a given context) according to established French norms. Although both learner groups were sensitive to anomalous sentence continuations, neither was sensitive to the cloze manipulation. Unlike the L1 French controls, the ERP trace for the learner groups did not vary with differences in probability of the two felicitous sentence continuations.
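For readers unfamiliar with cloze norms, the following sketch shows how a cloze probability is typically derived: a group of norming participants completes the same sentence frame, and a word's cloze probability is the proportion of participants who produced it. The English sentence frame and the responses are invented for illustration; the study described here used established French norms.

```python
from collections import Counter

def cloze_probabilities(completions):
    """Map each produced word to the proportion of participants who gave it."""
    counts = Counter(word.strip().lower() for word in completions)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

# Hypothetical completions of the frame "She stirred her ..." by ten participants.
responses = ["coffee", "coffee", "tea", "Coffee", "water",
             "coffee", "tea", "coffee", "coffee", "milk"]
cloze = cloze_probabilities(responses)

# Two felicitous continuations that differ in cloze probability,
# and one continuation nobody produced (cloze probability 0).
for word in ("coffee", "tea", "sock"):
    print(word, cloze.get(word, 0.0))
```

A cloze manipulation of the kind described above contrasts continuations like the first two: both are acceptable, but one is much more expected than the other.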

Experience with ERPs is not required. A general overview of ERP methodology will be provided at the beginning of the seminar.

James Sneed German: Hybrid Models in Phonology

Recent studies have shown that phonological knowledge depends on both detailed phonetic representations (exemplars) and more traditionally recognized abstract categories. The challenge for researchers now is to understand how these two kinds of representations fit together into an integrated system of production and perception. In this seminar, we will begin with a review of the debate surrounding the role of abstraction in phonology, followed by an introduction to exemplar-based approaches and the types of findings that they have been successfully applied to. Drawing largely on data from dialect imitation studies, we will then explore the notion of a hybrid model, which incorporates elements of both approaches, demonstrating that such a model is necessary in view of the total range of observed data. Finally, we will discuss how closely related models may be extended to non-segmental aspects of phonology (e.g., intonation), as well as to non-phonological domains such as morphology and syntax.

James Sneed German: Variability in the Prosodic Realization of Focus

Many languages exhibit an association between information structure (e.g., focus, topic, etc.) and prosodic features, a fact which is often interpreted in terms of marking. Recent studies, however, have begun to reveal that this association is often not robust, in that it does not show a one-to-one mapping along the lines of the so-called Focus-to-Accent approach (Gussenhoven, 1983; Ladd, 1996; i.a.). This is especially apparent in Romance languages, where focus, though clearly influential in the selection of prosodic form, does not constrain it in a deterministic way. In this seminar, we will explore a range of experimental findings from Romance languages (French, Spanish, and Catalan), as well as from English, that highlight some of the ways that the prosodic realization of information structure can be an imperfect marking system. We will next explore how such findings can be captured by probabilistic grammars and will discuss their implications for a model of situated communication.

Jakub Szewczyk: What do ERPs tell us about prediction in language comprehension?

In this class I will speak about how event-related brain potentials let us reverse-engineer how the brain comprehends language. More specifically, I will tell you about my research on semantic predictions – how the brain “preloads” things that might appear as a likely continuation of an unfolding discourse.