redactions.zip (1.9 GB)

Redactions extracted from ASIO surveillance records in National Archives of Australia Series A6119

dataset
Posted on 2016-10-27, 11:49, authored by Tim Sherratt

This is a collection of 239,571 redactions (blacked out words and phrases) extracted from surveillance files created by the Australian Security Intelligence Organisation (ASIO) and held in Series A6119 at the National Archives of Australia.

The digitised records were harvested from the NAA's online database as a collection of individual page images. These images were then processed to find and extract redactions.

The script used to extract the redactions is in the linked GitHub repository. After running the script, I manually removed non-redactions (these are available in a separate fileset for comparison). There was about a 20% error rate.

The redactions are saved as JPEG images, most of them very small. The filenames of the images encode important contextual information. For example:

The filename 1009279-p10-1-74-54.jpg has the following attributes:

File barcode: 1009279

Page: 10

Redaction number: 1

Width: 74px

Height: 54px
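
The filename attributes listed above can be unpacked programmatically. The following is a minimal sketch, assuming every filename follows the same pattern as the example (barcode, "p" plus page number, redaction number, width, height). The names Redaction and parse_redaction_filename are illustrative and are not taken from the linked GitHub repository.

```python
import re
from typing import NamedTuple


class Redaction(NamedTuple):
    """Attributes encoded in a redaction image filename (illustrative type)."""
    barcode: str  # NAA file barcode, kept as a string so it is never reformatted
    page: int     # page number within the digitised file
    number: int   # redaction number on that page
    width: int    # image width in pixels
    height: int   # image height in pixels


# Assumed pattern, based only on the example 1009279-p10-1-74-54.jpg
FILENAME_PATTERN = re.compile(r"^(\d+)-p(\d+)-(\d+)-(\d+)-(\d+)\.jpg$")


def parse_redaction_filename(filename: str) -> Redaction:
    """Split a redaction filename into its attributes."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unexpected filename format: {filename!r}")
    barcode, page, number, width, height = match.groups()
    return Redaction(barcode, int(page), int(number), int(width), int(height))


if __name__ == "__main__":
    print(parse_redaction_filename("1009279-p10-1-74-54.jpg"))
```

Filenames that do not fit the assumed pattern raise a ValueError rather than being silently skipped, which makes it easier to spot any variations in the collection.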

You can search the NAA database to find the original file using the barcode, or construct a link like:

https://owebrowse.herokuapp.com/items/1009279/pages/10/

to view the page from which the redaction was extracted in my own experimental interface.
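
As a rough illustration of constructing such a link, the sketch below builds the page URL from a redaction filename. It assumes the filename pattern and URL layout shown in the examples above; the function names are hypothetical, and the experimental interface may not always be available at this address.

```python
# Base address taken from the example link above
BASE_URL = "https://owebrowse.herokuapp.com/items"


def page_url(barcode: str, page: int) -> str:
    """Build a link to a page using the URL layout from the example."""
    return f"{BASE_URL}/{barcode}/pages/{page}/"


def page_url_from_filename(filename: str) -> str:
    """Derive the page link from a filename like 1009279-p10-1-74-54.jpg."""
    stem = filename.rsplit(".", 1)[0]  # drop the .jpg extension
    parts = stem.split("-")
    # Expect five parts: barcode, p<page>, redaction number, width, height
    if len(parts) != 5 or not parts[1].startswith("p"):
        raise ValueError(f"Unexpected filename format: {filename!r}")
    barcode = parts[0]
    page = int(parts[1][1:])  # strip the leading "p"
    return page_url(barcode, page)


if __name__ == "__main__":
    print(page_url_from_filename("1009279-p10-1-74-54.jpg"))
```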
