ZZZ_C001_FakeHIV.csv (42.6 MB)

The Health Gym Synthetic HIV Dataset

Version 2 2023-07-12, 07:03
Version 1 2022-05-24, 04:41
dataset
posted on 2023-07-12, 07:03, authored by Nicholas Kuo

###===###
IMPORTANT NOTE:
Dear viewers, please consider Version 2.0 of our synthetic ART for HIV dataset.

Kuo, Nicholas (2023). The Health Gym v2.0 Synthetic Antiretroviral Therapy (ART) for HIV Dataset. figshare. Dataset. https://doi.org/10.6084/m9.figshare.22827878.v1

The latest version is much more realistic than this version,
especially regarding class imbalance and utility for training RL algorithms.
Interested viewers may also refer to our preprint for further details.

Kuo, Nicholas I., Louisa Jorm, and Sebastiano Barbieri. "Generating Synthetic Clinical Data that Capture Class Imbalanced Distributions with Generative Adversarial Networks: Example using Antiretroviral Therapy for HIV." arXiv preprint arXiv:2208.08655 (2022). 


###===###
This dataset was generated using the Health Gym generative adversarial network (GAN).

The dataset contains viral loads, CD4 counts, and drug regimen information for 8,916 patients with HIV. The dataset is stored in CSV format.
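Because the file is a plain CSV, it can be sanity-checked after download with a few lines of Python. The following is only a minimal sketch using the standard library: the function name `summarize_csv` and its `id_column` parameter are illustrative, and the actual column names in the file are not listed on this page, so they must be read from the file header or the paper.

```python
import csv


def summarize_csv(path, id_column=None):
    """Return the column names, the row count, and (optionally) the
    number of distinct values found in `id_column`."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        columns = list(reader.fieldnames or [])
        n_rows = 0
        ids = set()
        for row in reader:
            n_rows += 1
            if id_column is not None:
                # id_column is a hypothetical name supplied by the caller;
                # check the CSV header for the real patient identifier.
                ids.add(row[id_column])
    return {
        "columns": columns,
        "n_rows": n_rows,
        "n_ids": len(ids) if id_column is not None else None,
    }
```

For example, calling `summarize_csv("ZZZ_C001_FakeHIV.csv")` lists the header columns and row count. Passing the patient identifier column as `id_column`, once its real name is known, lets the distinct count be compared against the 8,916 patients stated above.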

Please consult our archived paper for more details:
Kuo et al. (2022).
The Health Gym: Synthetic Health-Related Datasets for the Development of Reinforcement Learning Algorithms. arXiv preprint arXiv:2203.06369

###===###
1) Refer to page 3 for the full description of the dataset.
2) Refer to page 13 for the descriptive statistics.
3) Refer to pages 25-28 for the quality assessment.
4) Refer to page 9 for the patient re-identification risk.
5) Refer to pages 4, 5, and 14 for our novel generative adversarial network used to generate this synthetic dataset.
6) Also refer to our website: www.healthgym.ai for an overview of the project.

###===###
Edited: 12th-July-2023 
Date of 2-edits ago: 24th-May-2022
