
Diving Deep into Event Semantics

Thesis posted on 2025-04-18, 18:31, authored by Zhengzhong Liu

Events, with their complex interconnected structures, are vital to semantic understanding in natural language. Extensive research has been conducted on analyzing them, primarily focusing on frame structures (examining semantic roles such as participants, time, and locations), various forms of anaphora (such as event coreference and verb phrase ellipsis), and tasks such as event sequence prediction and event schema and script induction.

The interconnected nature of events presents both challenges and opportunities. On one hand, predicting and analyzing event structures can be complex: understanding the events in a document may involve multiple structure prediction tasks (e.g., event mention detection, event coreference, argument extraction). On the other hand, these interactions can be leveraged through structural constraints to improve predictions, or can serve as a lens for studying the mechanisms of models. This thesis explores these challenges and opportunities under different data-availability scenarios, developing methods that include direct supervised training, crowdsourcing procedures, and incidental/indirect supervision.
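To make "leveraging structural constraints" concrete, the following minimal sketch (an illustration, not the thesis's actual system) turns pairwise event-coreference scores into clusters with a union-find merge, which enforces transitivity as a structural constraint. The mentions, scores, and threshold here are placeholder assumptions.

    from itertools import combinations

    def cluster_with_transitivity(mentions, pair_score, threshold=0.5):
        """Greedily merge mentions whose pairwise score exceeds the threshold.

        Because merging goes through a union-find structure, the resulting
        clusters are transitively closed: if A~B and B~C, then A, B, and C
        land in the same cluster even if score(A, C) is low.
        """
        parent = {m: m for m in mentions}

        def find(m):
            while parent[m] != m:
                parent[m] = parent[parent[m]]  # path compression
                m = parent[m]
            return m

        for a, b in combinations(mentions, 2):
            if pair_score(a, b) > threshold:
                parent[find(a)] = find(b)  # union the two clusters

        clusters = {}
        for m in mentions:
            clusters.setdefault(find(m), []).append(m)
        return list(clusters.values())

    # Toy example: "bombing" and "explosion" are never directly scored above
    # the threshold, yet transitivity links them via "attack".
    mentions = ["bombing", "attack", "explosion", "trial"]
    scores = {("bombing", "attack"): 0.9, ("attack", "explosion"): 0.8}
    score = lambda a, b: scores.get((a, b), scores.get((b, a), 0.0))
    print(cluster_with_transitivity(mentions, score))
    # -> [['bombing', 'attack', 'explosion'], ['trial']]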

In the first part, we present empirical results on event semantics using expert-annotated, task-specific datasets. We begin by introducing methods for studying individual structures, such as event mention prediction, pairwise event coreference, and event sequencing. We then present approaches to problems involving multiple structures, using multi-step or joint learning methods, for tasks such as joint coreference and sequencing, slot filling, and verb phrase ellipsis.
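As a hedged illustration of the pairwise formulation (the embedding dimension, feature combination, and architecture are assumptions, not the thesis's models), a small PyTorch scorer can map two event-mention embeddings to a coreference probability:

    import torch
    import torch.nn as nn

    class MentionPairScorer(nn.Module):
        """Scores a pair of event-mention embeddings for coreference."""

        def __init__(self, dim=768, hidden=256):
            super().__init__()
            # Concatenate both mentions with their element-wise product,
            # a common pairwise feature combination.
            self.mlp = nn.Sequential(
                nn.Linear(3 * dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, m1, m2):
            feats = torch.cat([m1, m2, m1 * m2], dim=-1)
            return torch.sigmoid(self.mlp(feats)).squeeze(-1)

    scorer = MentionPairScorer()
    m1, m2 = torch.randn(768), torch.randn(768)  # placeholder mention embeddings
    print(float(scorer(m1, m2)))  # probability that the two mentions corefer

In practice such a scorer would be trained on annotated mention pairs, and its outputs fed to a clustering step like the one sketched earlier.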

Recognizing the high cost of scaling expert-annotated datasets, the second part of this thesis explores methods to increase data availability through crowdsourcing and indirect supervision signals. These approaches yield new insights into event semantics, revealing new semantic phenomena and deepening our understanding of how models process semantics. A key contribution in this area is our LLM360 language model project, which tracks and shares model snapshots at various stages of the training process. We demonstrate the project's utility for interpretability analysis, using the complex anaphora task of Winograd schemas as a case study.
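A checkpoint-based interpretability probe of the kind described above might look like the following sketch. It loads an LLM360 release from HuggingFace and compares attention flowing from a Winograd-schema pronoun to its two candidate antecedents. The model identifier is a real LLM360 release, but the layer choice and checkpoint handling are illustrative assumptions, and any causal language model that exposes attentions would work as a stand-in.

    import torch
    from transformers import AutoModel, AutoTokenizer

    # "LLM360/Amber" is one of the LLM360 releases; intermediate training
    # checkpoints are published alongside it (see the LLM360 repositories
    # for exact tags). A smaller model can be substituted for quick tests.
    model_name = "LLM360/Amber"
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_attentions=True)

    text = "The trophy doesn't fit in the suitcase because it is too big."
    enc = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)

    # Map character positions to token indices (requires a fast tokenizer).
    p_it = enc.char_to_token(text.index(" it ") + 1)
    p_trophy = enc.char_to_token(text.index("trophy"))
    p_suitcase = enc.char_to_token(text.index("suitcase"))

    # out.attentions is a tuple of (batch, heads, seq, seq) tensors, one
    # per layer. Per-head attention from the pronoun to each candidate:
    last = out.attentions[-1][0]
    print("it -> trophy:  ", last[:, p_it, p_trophy])
    print("it -> suitcase:", last[:, p_it, p_suitcase])

Repeating this measurement across training checkpoints shows when, and whether, heads that resolve the anaphor emerge during training.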

This thesis presents methods for analyzing and understanding the complex nature of events. We observe that models trained on larger datasets can develop human-interpretable structures, such as attention heads that capture correlations between events and states. These developments may indicate an emerging semantic understanding, particularly in Large Language Models (LLMs). Looking ahead, our analysis with LLM360 opens new avenues for exploring how these models process semantics internally, paving the way for more effective control algorithms and improved model architectures.

History

Date

2024-08-12

Degree Type

  • Dissertation

Department

  • Language Technologies Institute

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Teruko Mitamura