Paraphrase choice based on user traits

<p>PPDB paraphrase pairs and clusters, each with an associated usage score for three user traits:<br>a. Gender: male or female<br>b. Age: &lt;25 or &gt;30<br>c. Occupational class: low or high</p> <p>Contents:<br>frequencies.tar.gz - raw frequency statistics for all phrases and each trait<br>pairs.tar.gz - files with pairwise usage scores for each trait<br>clusters.tar.gz - files with cluster usage scores for each trait</p> <p>In the pairs and clusters files, negative values mark phrases that are more associated with females, the lower occupational class, and users over 30 years old.</p> <p>If you use this dataset, please cite our work:</p> <p>@inproceedings{paraphrase16aaai,<br>author = {Preo\c{t}iuc-Pietro, Daniel and Xu, Wei and Ungar, Lyle},<br>title = {{Discovering user attribute stylistic differences via paraphrasing}},<br>booktitle = {{Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence}},<br>series = {AAAI},<br>year = {2016}<br>}</p>
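<p>The sign convention above can be sketched in a few lines of Python. This is only an illustration, not part of the dataset: the trait keys (<code>gender</code>, <code>age</code>, <code>occupation</code>) and both function names are hypothetical, and the mapping of positive values to the complementary groups (males, users under 25, higher occupational class) is an assumption inferred from the description. The layout of the files inside the archives is not documented here, so the sketch only lists archive members rather than parsing them.</p>

```python
import tarfile

# Stated in the dataset description: negative scores lean toward these groups.
NEGATIVE_GROUP = {"gender": "female", "age": "over 30", "occupation": "low"}
# Assumption: positive scores lean toward the complementary groups.
POSITIVE_GROUP = {"gender": "male", "age": "under 25", "occupation": "high"}


def associated_group(trait, score):
    """Return the group a usage score leans toward, or None for a zero score."""
    if score < 0:
        return NEGATIVE_GROUP[trait]
    if score > 0:
        return POSITIVE_GROUP[trait]
    return None


def list_archive(path):
    """Return the member names of a gzip-compressed tar archive."""
    with tarfile.open(path, "r:gz") as archive:
        return archive.getnames()
```

<p>For example, <code>list_archive("pairs.tar.gz")</code> would show which files the pairs archive contains before any of them is parsed.</p>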