ERC Starting Grant, March 2014 – November 2019

Research team

  • Principal Investigator: Sabina Leonelli
  • Research Fellow: Niccolò Tempini
  • PhD Student: Gregor Halfmann

Project review

The project was successfully completed on 30 November 2019. A summary of project findings and outputs is available for download as a PDF. A review of the project is also available in six languages.

This project was also used as a case study on open access to publications, research data management and data sharing within ERC projects. The report is available for download as a PDF.

Project goals

This project aimed to develop a philosophy of data-intensive science that clarifies how research practices are changing in the digital age, and to examine how this affects current understandings of scientific epistemology within the philosophy of science and beyond. This was accomplished by examining data practices, travels, and uses across a variety of disciplines, including plant science, biomedicine, particle physics, climate science, environmental sciences, archaeology and economics. In particular, we focused on the impact that the increasing reliance on online databases has been having on the travel and re-use of scientific data. While the overarching goal of the project was philosophical, we grounded philosophical analysis in historical and social scientific methods and findings, and conducted research in collaboration with leading scholars in philosophy, history, sociology and anthropology of science.

Context

The scale of scientific data production has massively increased over the last decades, raising urgent questions about how scientists are to transform the resulting masses of data into useful knowledge. A technical solution to this problem is offered by technologies for the storage, dissemination and handling of data over the internet, including online databases that enable scientists to retrieve and analyze vast amounts of data of potential relevance to their research. These technologies are having a profound effect on what counts as scientific knowledge and on how that knowledge is obtained and used. This is a step change in scientific methods, which scientists refer to as ‘data-intensive’ research. The characteristics and philosophical implications of this emerging way of doing science have not yet been extensively and systematically analyzed. This is partly due to the relative scarcity of empirical, qualitative research on how data disseminated online are actually used across scientific fields, and partly to the lack of scholarship bringing results from social and historical studies of data-intensive research to bear on philosophical accounts of scientific methods, practices and knowledge. This project aimed to fill this gap by combining the analytic apparatus developed by philosophers of science with empirical, qualitative methods used by social scientists to investigate cutting-edge scientific practices.

Methods and timeline

In the first phase of the project (2014–2017), the research team investigated how the use of online databases is affecting research practices and outcomes in two areas: plant science and biomedicine. This study was coordinated with scholars conducting similar research on other fields, so as to facilitate comparison across different sciences.

In the second phase (2017–2019), Leonelli used these results to analyze how data-intensive methods challenge existing philosophical understandings of the epistemic role of data, theory, experiments and division of labour in science, thus producing a systematic assessment of the implications of the rise of data-intensive research for how science is organized, conducted and assessed.

Executive summary

The DATA_SCIENCE project conducted 6 empirical case studies, which were analysed in detail and informed a novel framework for the philosophy of data-intensive science.

In terms of publications, the project produced 4 books, 32 research papers in peer-reviewed journals, 13 book chapters, 4 book reviews, 4 encyclopaedia entries, 17 public outreach articles, 5 open data collections and a pilot database for plant data science infrastructure (PlantDBMap). We organised 24 conferences and thematic symposia, including 11 conferences at Exeter, sessions at the American Association for the Advancement of Science, the European Philosophy of Science Association and the Philosophy of Science Association, and a 24-paper themed track at 4S/EASST 2016.

Additionally, the PI and RF delivered 36 keynote lectures, 96 invited lectures and 51 contributed talks to audiences including the general public, philosophers, historians, social scientists, library and information scientists, librarians, biologists, clinicians, engineers, mathematicians, data scientists, environmental scientists, policy makers and university management. These outputs were covered by a wide variety of news media and informed the writing of 16 policy reports and position statements concerning the governance of data and data-intensive research (in which the PI participated as a lead author or co-author).

The PI’s Twitter account @sabinaleonelli acquired 4200+ followers and the project Twitter account @DataScienceFeed acquired 1000+ followers over the duration of the project.

Key events

Key outputs

Books

Other outputs

For further outputs (special issues, book chapters, peer-reviewed articles, reviews, consultations and policy documents, datasets, and non-reviewed contributions) please consult the final summary report.