
Diving Deep into Event Semantics

Thesis posted on 2025-04-18, 18:31, authored by Zhengzhong Liu

Events, with their complex interconnected structures, are vital to semantic understanding in natural language. Extensive research has been conducted on analyzing them, primarily focusing on frame structures (examining semantic roles such as participants, time, and locations), various forms of anaphora (such as event coreference and verb phrase ellipsis), and tasks such as event sequence prediction and event schema and script induction.

The interconnected nature of events presents both challenges and opportunities. On one hand, predicting and analyzing event structures can be complex: understanding the events in a document may involve multiple structure prediction tasks (e.g., event mention detection, event coreference, argument extraction). On the other hand, these interactions can be leveraged through structural constraints to improve predictions, or can serve as a lens for studying the mechanisms of models. This thesis explores these challenges and opportunities under different data-availability scenarios, developing methods that include direct supervised training, crowdsourcing procedures, and incidental/indirect supervision.
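To make "leveraging structural constraints" concrete, the following minimal sketch (an illustration, not the thesis's actual system) turns pairwise event-coreference scores into clusters with a union-find merge, which enforces transitivity as a structural constraint. The mentions, scores, and threshold here are placeholder assumptions.

    from itertools import combinations

    def cluster_with_transitivity(mentions, pair_score, threshold=0.5):
        """Greedily merge mentions whose pairwise score exceeds the threshold.

        Because merging goes through a union-find structure, the resulting
        clusters are transitively closed: if A~B and B~C, then A, B, and C
        land in the same cluster even if score(A, C) is low.
        """
        parent = {m: m for m in mentions}

        def find(m):
            while parent[m] != m:
                parent[m] = parent[parent[m]]  # path compression
                m = parent[m]
            return m

        for a, b in combinations(mentions, 2):
            if pair_score(a, b) > threshold:
                parent[find(a)] = find(b)  # union the two clusters

        clusters = {}
        for m in mentions:
            clusters.setdefault(find(m), []).append(m)
        return list(clusters.values())

    # Toy example: "bombing" and "explosion" are never directly scored above
    # the threshold, yet transitivity links them via "attack".
    mentions = ["bombing", "attack", "explosion", "trial"]
    scores = {("bombing", "attack"): 0.9, ("attack", "explosion"): 0.8}
    score = lambda a, b: scores.get((a, b), scores.get((b, a), 0.0))
    print(cluster_with_transitivity(mentions, score))
    # -> [['bombing', 'attack', 'explosion'], ['trial']]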

In the first part, we present empirical results on event semantics using expert-annotated, task-specific datasets. We begin by introducing methods for studying individual structures, such as event mention prediction, pairwise event coreference, and event sequencing. We then present approaches to problems involving multiple structures, using multi-step or joint learning methods, for tasks such as joint coreference and sequencing, slot filling, and verb phrase ellipsis.
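As a hedged illustration of the pairwise formulation (the embedding dimension, feature combination, and architecture are assumptions, not the thesis's models), a small PyTorch scorer can map two event-mention embeddings to a coreference probability:

    import torch
    import torch.nn as nn

    class MentionPairScorer(nn.Module):
        """Scores a pair of event-mention embeddings for coreference."""

        def __init__(self, dim=768, hidden=256):
            super().__init__()
            # Concatenate both mentions with their element-wise product,
            # a common pairwise feature combination.
            self.mlp = nn.Sequential(
                nn.Linear(3 * dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, m1, m2):
            feats = torch.cat([m1, m2, m1 * m2], dim=-1)
            return torch.sigmoid(self.mlp(feats)).squeeze(-1)

    scorer = MentionPairScorer()
    m1, m2 = torch.randn(768), torch.randn(768)  # placeholder mention embeddings
    print(float(scorer(m1, m2)))  # probability that the two mentions corefer

In practice such a scorer would be trained on annotated mention pairs, and its outputs fed to a clustering step like the one sketched earlier.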

Recognizing the high cost of scaling expert-annotated datasets, the second part of this thesis explores methods to increase data availability through crowdsourcing and indirect supervision signals. These approaches yield new insights into event semantics, revealing new semantic phenomena and deepening our understanding of how models process semantics. A key contribution in this area is our LLM360 language model project, which tracks and shares model snapshots at various stages of the training process. We demonstrate the project's utility for interpretability analysis, using the complex anaphora task of Winograd schemas as a case study.
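A checkpoint-based interpretability probe of the kind described above might look like the following sketch. It loads an LLM360 release from HuggingFace and compares attention flowing from a Winograd-schema pronoun to its two candidate antecedents. The model identifier is a real LLM360 release, but the layer choice and checkpoint handling are illustrative assumptions, and any causal language model that exposes attentions would work as a stand-in.

    import torch
    from transformers import AutoModel, AutoTokenizer

    # "LLM360/Amber" is one of the LLM360 releases; intermediate training
    # checkpoints are published alongside it (see the LLM360 repositories
    # for exact tags). A smaller model can be substituted for quick tests.
    model_name = "LLM360/Amber"
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_attentions=True)

    text = "The trophy doesn't fit in the suitcase because it is too big."
    enc = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)

    # Map character positions to token indices (requires a fast tokenizer).
    p_it = enc.char_to_token(text.index(" it ") + 1)
    p_trophy = enc.char_to_token(text.index("trophy"))
    p_suitcase = enc.char_to_token(text.index("suitcase"))

    # out.attentions is a tuple of (batch, heads, seq, seq) tensors, one
    # per layer. Per-head attention from the pronoun to each candidate:
    last = out.attentions[-1][0]
    print("it -> trophy:  ", last[:, p_it, p_trophy])
    print("it -> suitcase:", last[:, p_it, p_suitcase])

Repeating this measurement across training checkpoints shows when, and whether, heads that resolve the anaphor emerge during training.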

This thesis presents methods for analyzing and understanding the complex nature of events. We observe that models trained on larger datasets can develop human-interpretable structures, such as attention heads that capture correlations between events and states. These developments may indicate an emerging semantic understanding, particularly in Large Language Models (LLMs). Looking ahead, our analysis with LLM360 opens new avenues for exploring how these models process semantics internally, paving the way for more effective control algorithms and improved model architectures.

History

Date

2024-08-12

Degree Type

  • Dissertation

Department

  • Language Technologies Institute

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Teruko Mitamura