Evidence that identifiers are a source of problems for data integrators.
Datasets usually provide raw data for analysis. This raw data often comes in spreadsheet form, but it can be any collection of data on which analysis can be performed.
Advances in computing power and expansion of the Internet have fueled growing optimism that big data will yield new insights. However, in the life sciences, relevant data is not only "big"; it is also highly decentralized across thousands of online databases. Wringing value from it depends on the discipline of data science and on the humble bricks and mortar that make it possible -- identifiers.