figshare

Text prompts and videos generated using 5 popular Text-to-Video models plus quality metrics including user quality assessments

Version 3 2024-02-21, 21:59
Version 2 2023-10-02, 19:31
Version 1 2023-09-02, 18:31
dataset
posted on 2024-02-21, 21:59 authored by Iya Chivileva, Philip Lynch, Tomas Ward, Alan Smeaton

A collection of 201 prompts used to generate short-form videos with 5 popular text-to-video models, namely Tune-a-Video, VideoFusion, Text-To-Video Synthesis, Text2Video-Zero and Aphantasia. All 1,005 generated videos are included, each with automatically calculated quality metrics: naturalness, text similarity between the original prompt and a generated caption of the video, and inception score. Each video was also rated by 24 different people, and the data includes Mean Opinion Scores (MOS) for the alignment between each generated video and its original prompt, as well as for the perceptual and overall quality of the video.
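As a minimal sketch of how the human-rating side of the dataset can be aggregated: a MOS is simply the arithmetic mean of the individual ratings for a video. The field names and the 1-5 rating scale below are illustrative assumptions, not the dataset's actual schema.

```python
# Hypothetical sketch of MOS aggregation; scores and scale are illustrative,
# not values taken from the dataset itself.
from statistics import mean

def mos(ratings):
    """Mean Opinion Score: the arithmetic mean of individual ratings."""
    return mean(ratings)

# e.g. 24 raters scoring one video's prompt alignment on an assumed 1-5 scale
ratings = [4, 5, 3, 4, 4, 5, 4, 3, 4, 5, 4, 4,
           3, 4, 5, 4, 4, 3, 4, 5, 4, 4, 4, 4]
print(round(mos(ratings), 2))  # → 4.04
```

In the dataset each video would have three such aggregates: one MOS for prompt alignment, one for perceptual quality, and one for overall quality.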

Please cite the accompanying paper if you use this dataset. Code implementing the video naturalness calculation is available on GitHub at https://github.com/Chiviya01/Evaluating-Text-to-Video-Models

Funding

Science Foundation Ireland 12/RC/2289_P2
