Graph Sampling Results
These boxplots show the recall distribution of our queries for each combination of rewrite method and network analysis algorithm, for each of the 4 datasets we considered. The box spans the lower and upper quartiles of the recall scores. The triangle marks the mean recall and the horizontal line marks the median recall. Whiskers extend to the most extreme data points that lie within 1.5 times the interquartile range below the lower quartile and above the upper quartile. Any points beyond the whiskers are considered outliers and are drawn as dots.
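As a rough illustration of how the elements of such a boxplot can be derived from a list of recall scores, consider the Python sketch below. It is not the code used to produce our plots: the function name `boxplot_summary`, the choice of quartile interpolation (`method="inclusive"`), and the example scores are assumptions made for illustration only. It follows the common convention that whiskers end at the most extreme data points within 1.5 times the interquartile range of the box.

```python
from statistics import mean, median, quantiles


def boxplot_summary(recalls):
    """Return the values a boxplot of `recalls` would display."""
    values = sorted(recalls)

    # Lower and upper quartiles. The interpolation method is an assumption;
    # plotting libraries differ slightly in how they compute quartiles.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1

    # Fences: data points beyond these are treated as outliers.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr

    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]

    return {
        "q1": q1,                    # lower edge of the box
        "median": median(values),    # horizontal line
        "q3": q3,                    # upper edge of the box
        "mean": mean(values),        # triangle
        "whisker_low": inside[0],    # most extreme point inside the low fence
        "whisker_high": inside[-1],  # most extreme point inside the high fence
        "outliers": outliers,        # dots
    }


# Made-up recall scores, for illustration only.
example = [0.05, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65]
summary = boxplot_summary(example)
print(summary)
```

With these made-up scores, the value 0.05 falls below the lower fence and is reported as an outlier, while the whiskers end at 0.40 and 0.65.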
Whenever a method achieves a statistically significant improvement in recall over one of our baselines, we mark this with a + or * sign. Our significance calculations are also publicly available on GitHub: https://github.com/Data2Semantics/GraphSampling/blob/master/bin/significance