
Towards Efficient Training and Inference of Large Transformer Models

Thesis posted on 2024-09-28, authored by Haoyu He
Transformers have revolutionized modern applications, but their cost grows rapidly with model size. This thesis targets efficient training and inference of large Transformer models. We first explore allocating trainable parameters to task-specific positions to boost the performance of parameter-efficient fine-tuning (PEFT). We then tailor PEFT, via model stitching, to efficiently generate a range of fine-tuned models that meet diverse hardware constraints. Additionally, we propose a novel pruning approach that reconfigures expensive self-attention layers into efficient convolutional layers, yielding compact hybrid models. Finally, for semantic segmentation, we develop efficient cross-attention layers and a dynamic positional query design, achieving state-of-the-art performance at affordable cost.
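
As context for the PEFT contribution mentioned above, the sketch below shows a generic LoRA-style adapter in PyTorch: a frozen linear layer augmented with a trainable low-rank update. This is a minimal illustration of the general parameter-efficient fine-tuning idea only, not the thesis's specific parameter-allocation method; the class name LoRALinear and the hyperparameters r and alpha are illustrative assumptions.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """A frozen linear layer plus a trainable low-rank update (LoRA-style PEFT).

        Only the small A and B matrices are trained, so the trainable
        parameter count is r * (in_features + out_features) instead of
        in_features * out_features.
        """

        def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False  # freeze the pretrained weights
            # A is small random, B is zero, so the update starts as a no-op.
            self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
            self.scale = alpha / r

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Frozen path plus the scaled low-rank correction.
            return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale

In practice such an adapter would wrap selected linear layers of a pretrained Transformer (e.g., attention projections), and deciding which layers to wrap is exactly the kind of task-specific parameter-allocation question the thesis studies.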

History

Campus location

Australia

Principal supervisor

Bohan Zhuang

Additional supervisor 1

Jianfei Cai

Year of Award

2024

Department, School or Centre

Data Science & Artificial Intelligence

Course

Doctor of Philosophy

Degree Type

DOCTORATE

Faculty

Faculty of Information Technology
