The UCR Archive 2018 provides 128 time series datasets from diverse sources. However, two issues make its default splits ill-suited for training a time series generative model. First, the majority of the datasets have a test set that is larger than the training set. Second, for some datasets the patterns in the training set clearly differ from those in the test set.
To make the datasets from this archive more suitable for time series generation, we merged the existing training and test sets of each dataset and resplit them using StratifiedShuffleSplit (from sklearn) into a training set (80%) and a test set (20%).
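The merge-and-resplit step can be sketched as follows. This is a minimal illustration, not the exact script used: the function name `resplit`, its argument names, and the `seed` value are assumptions, and the arrays are assumed to be NumPy arrays with one series per row and one class label per series. Because the split is stratified by class label, each class needs at least two samples in the merged data.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit


def resplit(X_train, y_train, X_test, y_test, test_size=0.2, seed=0):
    """Merge the original splits and draw one stratified 80/20 split.

    `seed` is a hypothetical choice; the source does not state a random state.
    """
    # Merge the original training and test sets into a single pool.
    X = np.concatenate([X_train, X_test], axis=0)
    y = np.concatenate([y_train, y_test], axis=0)

    # One shuffle split that preserves the class proportions of `y`.
    splitter = StratifiedShuffleSplit(
        n_splits=1, test_size=test_size, random_state=seed
    )
    train_idx, test_idx = next(splitter.split(X, y))

    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]
```

Applied to a dataset whose original test set is larger than its training set, the function returns a new training set holding roughly 80% of all samples, with the class proportions of the merged data preserved in both parts.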