Sunday, 20 September 2015

Data and Unlikely Academic Collaborators

Last week at a social event I met two interesting academics working in apparently diverse fields – chemistry and social science. It became clear in conversation that academic distinctions mean much less than they used to and that, rather than becoming more specialised, many research disciplines are expanding to such an extent that boundaries are blurred and perhaps even irrelevant. Both these academics, for example, use computational methods not just to aid their research, but at its core.

Working in Cambridge exposes one to all sorts of academic research and thought – it is ceaselessly fascinating to meet real experts in their fields from all over the world and to hear first-hand about the early Christian temples of the Middle East, developments in cancer prevention or the history of concentration camps. In today’s cross-disciplinary environment one can risk overload when considering the multitudinous connections between research disciplines – particularly when working, as I do, in a field which facilitates collaboration. For decades, ‘hard sciences’ such as particle physics have generated massive amounts of data, and developed advanced statistical analysis tools to cope. It didn’t take long for financial institutions to grasp the power of these mathematical techniques and computational approaches (and to hoover up PhD graduates), but it has taken longer for other academic disciplines to catch on. In the last ten years or so, there has been a significant change: highly data-centric scientific disciplines have boomed – computational biology being an excellent example – and specialist hubs are no longer found solely at traditional centres of excellence like Cambridge but are truly global. Furthermore, and perhaps less expectedly, big data has been embraced by the social sciences, humanities and even the arts.

The cynic might see some of this activity as a reflection of the poor state of research (and particularly arts) funding, causing academics to jump on the latest ‘data’ bandwagon regardless of its relevance to their work. This is undoubtedly true in some cases, but there is real academic rigour in much of this work; surely it is not so surprising that statistical analysis on a massive scale should return to its home territory of monitoring and analysing human society – just as in the Domesday Book. What is more, new swathes of cross-disciplinary collaboration are opened up: the cross-referencing of data sets and scientific models from disparate sources can lead to radical and sometimes counter-intuitive findings. This trend is taking hold across academic study, encouraged by forward-thinking organisations such as GEO in earth observation, mapping land and sea measurements against atmospheric and satellite data, and Cambridge’s own CRASSH in the social sciences, whose cross-disciplinary seminar series can be amazing to attend.

The scope and form of the research groupings that will emerge from this collapsing of boundaries are difficult to predict. What is already clear, however, is that these new fields have specific requirements which go beyond traditional data-handling tools. Technologies which allow metadata to be cross-referenced across academic boundaries, methods of uniquely identifying research findings, and secure data-sharing technologies will all become key to the next generation of scientist-cum-data-beachcomber. What isn’t clear yet is who will establish the universal standards in this space.