Search Engine Manipulation: SEO to Spread Kremlin-Aligned Disinformation. Think Tank Backlink Network and Keyphrase Network dataset.
README
These datasets and the accompanying R script generate the visualizations used in "Search Engine Manipulation: SEO to Spread Kremlin-Aligned Disinformation".
DATA
Data were collected from the Ahrefs (ahrefs.com) dashboard on Feb 21, 2022. Any re-use of this data must cite Ahrefs as the source.
We provide two data files:
1) ora_rd.csv (referring domain network extracted from Ahrefs)
2) ora_kw.csv (keyword network extracted from Ahrefs)
ora_rd.csv contains:
- `source`: think tank domain
- `target`: backlinking domain
- `weight`: total number of backlinks
- `network`: network label (ru_tt_rd = Russia, us_tt_rd = US, ds_rd = Pseudo)
ora_kw.csv contains:
- `keyword`: top key phrases
- `outlet`: the think tank domain (maps to `source` in ora_rd.csv)
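As a rough illustration of how the two files relate, the sketch below builds a few toy rows with the documented columns and joins them on `outlet` = `source`. The domains, key phrases, and weights are invented for the example and are not taken from the datasets; only the column names come from this README.

```r
# Toy rows standing in for ora_rd.csv (all values here are made up).
rd <- read.csv(text = "source,target,weight,network
thinktank-a.example,blog-one.example,12,us_tt_rd
thinktank-a.example,news-two.example,3,us_tt_rd
thinktank-b.example,blog-one.example,7,ru_tt_rd", stringsAsFactors = FALSE)

# Toy rows standing in for ora_kw.csv (all values here are made up).
kw <- read.csv(text = "keyword,outlet
energy policy,thinktank-a.example
sanctions,thinktank-b.example", stringsAsFactors = FALSE)

# Confirm the documented columns are present before doing anything else.
stopifnot(all(c("source", "target", "weight", "network") %in% names(rd)))
stopifnot(all(c("keyword", "outlet") %in% names(kw)))

# `outlet` in the keyword file maps to `source` in the referring-domain file.
joined <- merge(kw, rd, by.x = "outlet", by.y = "source")
print(joined)
```

With the real files, the two `read.csv(text = ...)` calls would be replaced by `read.csv("ora_rd.csv")` and `read.csv("ora_kw.csv")`, assuming both files sit in the working directory.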
R Script
This script generates visuals of the top backlinkers and top backlink recipients in the think tank network. It also constructs and visualizes the minimum overlap network for referring domains presented in the paper.
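The snippet below is a minimal base-R sketch of the kind of aggregation these visuals rest on. It is not the code in seo_visuals.R. The toy edge list is invented, and the overlap rule (keep a pair of think tanks when they share at least `min_overlap` referring domains) is an assumption made for illustration; the paper's exact definition may differ.

```r
# Toy edge list with the ora_rd.csv columns (values are made up).
rd <- data.frame(
  source = c("tt-a.example", "tt-a.example", "tt-b.example",
             "tt-b.example", "tt-c.example"),
  target = c("ref-1.example", "ref-2.example", "ref-1.example",
             "ref-2.example", "ref-1.example"),
  weight = c(12, 3, 7, 2, 1),
  stringsAsFactors = FALSE
)

# Total backlinks per backlinking domain, largest first.
top_backlinkers <- aggregate(weight ~ target, data = rd, FUN = sum)
top_backlinkers <- top_backlinkers[order(-top_backlinkers$weight), ]

# Total backlinks per think tank domain, largest first.
top_recipients <- aggregate(weight ~ source, data = rd, FUN = sum)
top_recipients <- top_recipients[order(-top_recipients$weight), ]

# Assumed overlap rule: count referring domains shared by each pair of
# think tanks and keep pairs that meet a minimum threshold.
min_overlap <- 2  # hypothetical threshold
inc <- (unclass(table(rd$target, rd$source)) > 0) + 0  # target x source, 0/1
shared <- crossprod(inc)                               # source x source counts
idx <- which(upper.tri(shared) & shared >= min_overlap, arr.ind = TRUE)
edges <- data.frame(
  a = rownames(shared)[idx[, 1]],
  b = colnames(shared)[idx[, 2]],
  shared = shared[idx],
  stringsAsFactors = FALSE
)

print(top_backlinkers)
print(top_recipients)
print(edges)
```

Counting shared domains with `crossprod` on a 0/1 incidence matrix ignores the `weight` column; a weighted variant would be a different design choice.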
This code was created using R version 4.1.1 (2021-08-10) -- "Kick Things".
To run this code, first install the libraries loaded at the beginning of the script.
Once that is done, open the script and set `rd_data_el` to the path of the input file (ora_rd.csv) and `output_dir` to the output path.
The script can then be run line by line in RStudio or by typing `Rscript seo_visuals.R` in the terminal.
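Before running, it can help to confirm that both paths are usable. The sketch below assumes `rd_data_el` and `output_dir` are plain character paths and that `output_dir` is a directory; `check_paths` is a hypothetical helper and is not part of seo_visuals.R.

```r
# Hypothetical helper: stop early if the input CSV is missing, and create
# the output directory if it does not exist yet.
check_paths <- function(rd_data_el, output_dir) {
  if (!file.exists(rd_data_el)) {
    stop("Input file not found: ", rd_data_el)
  }
  if (!dir.exists(output_dir)) {
    dir.create(output_dir, recursive = TRUE)
  }
  invisible(TRUE)
}

# Example values only; replace with the locations on your machine.
rd_data_el <- file.path("data", "ora_rd.csv")
output_dir <- file.path("output")

# Uncomment once the paths above point at real locations.
# check_paths(rd_data_el, output_dir)
```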
Citation
How to cite (will be updated upon acceptance):
```
Williams, E. M. & Carley, M. C. (2022). Search Engine Manipulation: SEO to
Spread Kremlin-Aligned Disinformation. Harvard Kennedy School (HKS)
Misinformation Review, Volume #(Issue #).
Received: Month Xth, 2022. Accepted: Month Xth, 20XX. Published: Month Xth, 2022.
```