Word Frequency Count from Network Review
These files contain the data used in one of the plots in Figure 10 of my basic overview of Complex Networks (a review for Contemporary Physics, see below). Please cite the source if you use this data. You could also reproduce it yourself from the LaTeX file available via arXiv. The data was produced by using various UNIX tools to strip the LaTeX commands, giving a list of words (one per line), and then counting the number of times each line was repeated. As far as I can see, no case folding, stop-word removal or stemming was applied, e.g. "The" and "the" appear separately, and "vertex" and "vertices" are counted separately.
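For anyone wanting to redo the count from the LaTeX source, the procedure described above can be sketched roughly as follows. This is only an illustration in Python, not the original UNIX pipeline: the function names (strip_latex, count_words, rank_table), the regular expressions and the sample text are all assumptions made for this example. Like the original data, it keeps case and applies no stop-word removal or stemming. Note that the arguments of commands (such as citation keys) are left in place and would be counted as words.

```python
import re
from collections import Counter


def strip_latex(text):
    """Crudely remove LaTeX markup (an approximation, not the original tools)."""
    text = re.sub(r"(?<!\\)%.*", "", text)       # drop comments to end of line
    text = re.sub(r"\$[^$]*\$", " ", text)       # drop inline math such as $v_i$
    text = re.sub(r"\\[A-Za-z]+\*?", " ", text)  # drop control words such as \section
    text = re.sub(r"\\.", " ", text)             # drop control symbols such as \% or \\
    return text


def count_words(text):
    """Count each run of letters; case is kept, so 'The' and 'the' differ."""
    words = re.findall(r"[A-Za-z]+", strip_latex(text))
    return Counter(words)


def rank_table(counts):
    """Return (rank, word, count) rows, most frequent first, ties broken alphabetically."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, n) for rank, (word, n) in enumerate(ordered, start=1)]


# Hypothetical sample text, standing in for the real LaTeX file.
sample = r"""
\section{Networks}
The vertex $v_i$ links to other vertices. % a comment
the network has one vertex and the network grows.
"""

for rank, word, n in rank_table(count_words(sample)):
    print(rank, word, n, sep="\t")
```

To apply it to a real file, read the file contents into a string and pass that to count_words in place of the sample.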
netrevcountrawdata.xls = rank and count for each word, along with plots
netrevcountTabSeparated.txt = rank and count for each word in simple text format
netrevindex.txt = raw data, unsorted (note there are some silly 'words' like "x")
Contemporary Physics 45 (2004) 455-475