pdfmetadata.csv (40.82 kB)
A survey of Academic Publisher PDF metadata
This dataset accompanies the blog post at http://rossmounce.co.uk/2012/12/31/pdf-metadata-why-so-poor/
It's a simple survey of PDF metadata across a variety of academic publishers, sampling mostly from PDFs published in 2011, or from whatever I could gain access to. All are publisher-provided Version of Record PDFs, not self-archived pre-prints or similar. I used the command-line tool pdfinfo to extract this metadata.
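For anyone wanting to run a similar extraction on PDFs they do have access to, here is a minimal Python sketch. It assumes pdfinfo is installed and on the PATH, and that its output is the usual one "Key: value" pair per line. The function names and the file name in the usage note are hypothetical, and this is not necessarily how the dataset itself was produced.

```python
import subprocess


def parse_pdfinfo_output(text):
    """Turn pdfinfo's 'Key: value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, since values such as dates
        # contain colons of their own.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def pdfinfo_metadata(pdf_path):
    """Run pdfinfo on one file and return its parsed metadata.

    Assumes the pdfinfo executable is available on the PATH.
    """
    result = subprocess.run(
        ["pdfinfo", pdf_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pdfinfo fails
    )
    return parse_pdfinfo_output(result.stdout)
```

Calling pdfinfo_metadata("example.pdf") (a hypothetical file name) would return a dict with whichever fields pdfinfo reports for that file. Fields that a publisher left empty will simply be missing or blank.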
Columns A to K are identifying metadata that I supply about each PDF (some fields are incomplete!), whilst columns L to V provide the interesting metadata about each PDF.
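If you would rather work with the CSV in code than in a spreadsheet, the A to K / L to V split can be sketched by column position. This assumes the CSV column order matches the spreadsheet letters (A is the first column, V the 22nd) and that the file is UTF-8 encoded. Neither has been checked here, and the function names are made up for illustration.

```python
import csv

# Spreadsheet letters mapped to zero-based positions (assumed layout).
IDENTIFYING_COLS = slice(0, 11)   # columns A..K
EXTRACTED_COLS = slice(11, 22)    # columns L..V


def split_row(row):
    """Split one CSV row into (identifying fields, PDF metadata fields)."""
    return row[IDENTIFYING_COLS], row[EXTRACTED_COLS]


def read_split_rows(csv_path):
    """Yield (identifying, extracted) pairs for every row in the file.

    Any header row is yielded like any other row; skip it yourself
    if the file has one.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            yield split_row(row)
```

For example, list(read_split_rows("pdfmetadata.csv")) would give one pair of lists per row, which makes it easy to count how many PDFs have an empty value in any given metadata column.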
Many of the PDFs sampled are not Open Access so (sadly) I cannot provide you with copies to replicate these results.