Search Engine Manipulation: SEO to Spread Kremlin-Aligned Disinformation. Think Tank Backlink Network and Keyphrase Network dataset.

posted on 2022-08-06, 18:14 authored by Evan Williams, Kathleen M. Carley

README

These datasets and the accompanying R script generate the visualizations used in "Search Engine Manipulation: SEO to Spread Kremlin-Aligned Disinformation".

DATA

Data were collected from the Ahrefs (ahrefs.com) dashboard on Feb 21, 2022. Any re-use of this data must cite Ahrefs as the source.

We provide two data files:  
1) ora_rd.csv  (referring domain network extracted from Ahrefs)
2) ora_kw.csv  (keyword network extracted from Ahrefs)

ora_rd.csv contains:

  • `source`: think tank domain  
  • `target`: backlinking domain  
  • `weight`: total number of backlinks  
  • `network`: `ru_tt_rd` = Russia, `us_tt_rd` = US, `ds_rd` = Pseudo  
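
As a quick check of the column layout described above, the following is a minimal sketch in base R. The helper name `read_rd` is hypothetical and not part of the provided script, and the toy rows are made up so the example runs without the real file.

```r
# Hypothetical helper (not part of seo_visuals.R): read the referring-domain
# edge list and confirm that it has the four documented columns.
read_rd <- function(path) {
  rd <- read.csv(path, stringsAsFactors = FALSE)
  expected <- c("source", "target", "weight", "network")
  missing_cols <- setdiff(expected, names(rd))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  rd$weight <- as.numeric(rd$weight)
  rd
}

# Toy file with made-up rows (NOT taken from the dataset).
toy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "source,target,weight,network",
  "tt-a.example,ref-1.example,12,ru_tt_rd",
  "tt-a.example,ref-2.example,3,ru_tt_rd",
  "tt-b.example,ref-1.example,5,us_tt_rd"
), toy_path)

rd <- read_rd(toy_path)

# Number of edges per network label.
table(rd$network)
```

To use it on the real data, pass the path to ora_rd.csv instead of `toy_path`.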

ora_kw.csv contains:

  • `keyword`: top key phrases  
  • `outlet`: the think tank domain; maps to `source` in ora_rd.csv  
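
Because `outlet` maps to `source`, the two files can be joined on that key. The sketch below uses base R `merge` on made-up rows (not from the dataset) to attach a `network` label to each key phrase. It assumes each think tank domain appears under a single `network` value; if a domain appeared under more than one, the join would duplicate its keywords.

```r
# Made-up rows standing in for ora_rd.csv and ora_kw.csv (NOT real data).
rd <- data.frame(
  source  = c("tt-a.example", "tt-a.example", "tt-b.example"),
  target  = c("ref-1.example", "ref-2.example", "ref-1.example"),
  weight  = c(12, 3, 5),
  network = c("ru_tt_rd", "ru_tt_rd", "us_tt_rd"),
  stringsAsFactors = FALSE
)
kw <- data.frame(
  keyword = c("placeholder phrase one", "placeholder phrase two",
              "placeholder phrase three"),
  outlet  = c("tt-a.example", "tt-b.example", "tt-z.example"),
  stringsAsFactors = FALSE
)

# One row per think tank domain with its network label.
outlet_network <- unique(rd[, c("source", "network")])

# Left join: keep every keyword, even if its outlet has no match in rd
# (such rows get NA in the network column).
kw_labeled <- merge(kw, outlet_network,
                    by.x = "outlet", by.y = "source", all.x = TRUE)
print(kw_labeled)
```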

R Script

This script generates visuals of the top backlinkers and top backlink recipients in the think tank network. It also constructs and visualizes the minimum overlap network for referring domains presented in the paper.
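
The aggregation behind these visuals can be sketched in base R as follows. This is not the code in the provided script, and the rows are made up. This README does not define "minimum overlap", so the last step uses one assumed reading: keep only backlinking domains that link to at least `min_overlap` distinct think tank domains. The variable `min_overlap` and its value are assumptions for illustration.

```r
# Made-up edge list (NOT real data): source = think tank, target = backlinker.
rd <- data.frame(
  source = c("tt-a.example", "tt-a.example", "tt-b.example", "tt-c.example"),
  target = c("ref-1.example", "ref-2.example", "ref-1.example", "ref-1.example"),
  weight = c(12, 3, 5, 1),
  stringsAsFactors = FALSE
)

# Total backlinks sent by each backlinking domain, largest first.
by_backlinker <- aggregate(weight ~ target, data = rd, FUN = sum)
top_backlinkers <- by_backlinker[order(-by_backlinker$weight), ]

# Total backlinks received by each think tank domain, largest first.
by_recipient <- aggregate(weight ~ source, data = rd, FUN = sum)
top_recipients <- by_recipient[order(-by_recipient$weight), ]

# ASSUMED reading of "minimum overlap": keep backlinking domains that link
# to at least `min_overlap` distinct think tank domains.
min_overlap <- 2
n_sources <- tapply(rd$source, rd$target, function(s) length(unique(s)))
keep <- names(n_sources)[n_sources >= min_overlap]
overlap_edges <- rd[rd$target %in% keep, ]

print(top_backlinkers)
print(top_recipients)
print(overlap_edges)
```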

This code was created using R version 4.1.1 (2021-08-10) -- "Kick Things".

To run this code, you'll first need to install the libraries at the beginning of the script.

Once that is done, open the script and set `rd_data_el` to the path of the input file (ora_rd.csv) and `output_dir` to the path where the output should be written.

The script can then be run line-by-line in RStudio or by typing `Rscript seo_visuals.R` in the terminal.
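
The path setup might look like the sketch below. Both paths are placeholders, and the sketch assumes that `rd_data_el` and `output_dir` are plain character paths; in the actual script, `rd_data_el` may instead be assigned from a read call that takes the path.

```r
# Placeholder paths (assumptions): replace with the locations on your machine.
rd_data_el <- file.path("data", "ora_rd.csv")
output_dir <- file.path("output", "seo_visuals")

# Create the output directory if it does not exist yet.
if (!dir.exists(output_dir)) {
  dir.create(output_dir, recursive = TRUE)
}

# Report a missing input file early instead of failing midway through a run.
if (!file.exists(rd_data_el)) {
  message("Input file not found: ", rd_data_el)
}
```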

Citation

How to cite (will be updated upon acceptance):

```
Williams, E. M. & Carley, K. M. (2022). Search Engine Manipulation: SEO to
  Spread Kremlin-Aligned Disinformation. Harvard Kennedy School (HKS)
  Misinformation Review, Volume #(Issue #).
  Received: Month Xth, 2022. Accepted: Month Xth, 20XX. Published: Month Xth, 2022.
```


Funding

ONR (Office of Naval Research) Scalable Tools for Social Media Assessment N00014-21-1-2229

ONR (Office of Naval Research) Group Polarization in Social Media N00014-18-1-2106

ONR (Office of Naval Research) MURI: Near Real Time Assessment of Emergent Complex Systems of Confederates, N000141712675

ONR (Office of Naval Research) MURI: Persuasion, Identity, & Morality in Social-Cyber Environments, N000142112749
