datas.zip (2.19 MB)
Datas of Disease Patterns
Version 3 2017-06-02, 13:44
Version 2 2017-05-29, 02:48
Version 1 2017-05-29, 02:37
dataset
posted on 2017-06-02, 13:44 authored by Jichang Zhao
1. The "dingxiang_datas.xls" file contains all the original data crawled from the DingXiang forum, together with the word segmentation result for each medical record.
2. The "pmi_new_words.txt" file lists the new medical words found by calculating mutual information.
3. The "association_rules" folder contains the association rules mined from the dataset, with the h-confidence threshold set to 0.3 and the support threshold set to 0.0001.
4. The "network_communities.csv" file describes the complication communities.
P.S. If you encounter a "d", it means the word belongs to the disease description vocabulary; a "z" or "s" indicates the symptom description vocabulary.
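
For readers unfamiliar with the two thresholds mentioned for the association rules, the following is a minimal sketch of how support and h-confidence are commonly computed. It assumes the usual definition of h-confidence (the support of an itemset divided by the largest support of any single item in it); the description above does not spell out the exact formula used. The toy records, the function names `support` and `h_confidence`, and the word choices are all hypothetical and are not taken from the dataset.

```python
# Toy medical records as sets of words (hypothetical, NOT from the dataset).
records = [
    {"diabetes", "thirst", "fatigue"},
    {"diabetes", "thirst"},
    {"hypertension", "headache"},
    {"diabetes", "hypertension", "fatigue"},
]

def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for r in records if itemset <= r) / len(records)

def h_confidence(itemset, records):
    """Assumed definition: supp(itemset) / max single-item support."""
    joint = support(itemset, records)
    max_single = max(support({item}, records) for item in itemset)
    return joint / max_single if max_single > 0 else 0.0

# Thresholds as stated in the dataset description.
MIN_H_CONFIDENCE = 0.3
MIN_SUPPORT = 0.0001

pattern = {"diabetes", "thirst"}
supp = support(pattern, records)
hconf = h_confidence(pattern, records)
keep = supp >= MIN_SUPPORT and hconf >= MIN_H_CONFIDENCE
print(f"support={supp:.4f} h-confidence={hconf:.4f} keep={keep}")
```

In this toy example the pattern appears in 2 of 4 records and the most frequent single item appears in 3 of 4, so the pattern passes both thresholds. A low support threshold such as 0.0001 retains rare patterns, while the h-confidence threshold filters out pairs in which a very common word merely co-occurs with a rare one.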