  • This event has passed.

The Data Dialogue. Time to Share: Navigating Boundaries & Benefits

28th July 2016 @ 9:30 am - 4:30 pm


Time to Share: Navigating Boundaries and Benefits

Researchers face the task of managing research data throughout their careers. Implementing best practice early on not only helps you to access the right data quickly, but also ensures that data is relevant, reusable and accessible to others, contributing further to scientific research.

This unique event aims to provide Early Career Researchers from a range of disciplines with the tools and knowledge to better approach their research data needs, covering the benefits of sharing data, how we share difficult and sensitive data, gaining access to restricted, secure, or open data, and what types of repositories are available for use.

Join us at The University of Cambridge to hear from key speakers from the UK’s leading Research Data centres and join a discussion amongst peers with tangible experience in accessing and managing data, sharing their own discipline-specific examples.


Dr. Nicole Janz

Keynote: “Data Transparency and Replication”

Dr Nicole Janz is a political scientist at the Department of Sociology at the University of Cambridge. Her current research agenda includes the impact of globalisation on human rights; determinants of labour standards; and corruption in Brazil. Methodologically, Nicole focuses on data transparency and replication. She is involved in a number of initiatives such as the Political Science Replication Initiative (PSRI), the Center for Open Science (COS), and the Berkeley Initiative for Transparency in the Social Sciences (BITSS). Nicole regularly writes about reproducibility on her blog.

Louise Corti

Keynote: “Personal, not painful: practical and motivating experiences in data sharing”

The UK Data Service is a comprehensive resource funded by the ESRC to support researchers, teachers and policymakers who depend on high-quality data. It was initially created to support economic and social data, but has now opened its services to researchers from all disciplines working with high-quality data. The UK Data Service provides an online data repository, ReShare, where researchers can archive, publish and share research data as open or safeguarded data. During this talk you will learn how the UK Data Service can help researchers share restricted research data.

Louise Corti is the Associate Director at the UK Data Service, where she leads the new Digital Futures project, the culmination of work to bring together digitally enhanced, high-quality older data sources and high-profile users. She also coordinates the international working group on metadata standards for qualitative data.

Prof. Peter Smith

“Better knowledge, better society: how ADRC-E can support your research and enhance its impact”

Prof. Smith is Director at The Administrative Data Research Centre for England. The ADRC-E is led by the University of Southampton, and run in collaboration with University College London, the London School of Hygiene and Tropical Medicine, the Institute for Fiscal Studies and the Office for National Statistics.

The Centre combines exceptional physical facilities with high-performance computing platforms to help researchers gain access to de-identified administrative data, so they can carry out social and economic research that has the potential to benefit society.

Discover more about the ADRC-E by visiting their website where you can view a short video of what they do, listen to training podcasts and read about latest news and events. You can also follow the centre on Twitter @ADRC_E or contact adrce@soton.ac.uk.

Dr. Kai Ruggeri

“Evidence for economic policy: Experiences from the ESRC Secondary Data Analysis Initiative”

Our work on economic crisis and mental health uses the European Social Survey, an award-winning, trans-continental database consisting of nearly two decades of biennially collected measurements. While we were able to generate meaningful insights from this work, we also learned many lessons about secondary data collaborations that are relevant to users of data as well as their audiences. These lessons cover participation in surveying, loss of control over the data available, applying data beyond its original purpose, and engaging highly diverse audiences, as well as the general public, with the findings.

Dr Kai Ruggeri is an Affiliated Lecturer in the Department of Psychology and directs the Policy Research Group. He specialises in quantitative methods, research design and analysis. His work is primarily focused on the link between health and economic policy using major population datasets. His background is in quantitative research with an emphasis on evidence-based policy, mainly in global health.

Fiona Nielsen

“Addressing the Problem of Human Genomic Data Discoverability”

Access to raw experimental research data, and the reuse of that data, is a common hurdle in scientific research. Despite mounting requirements from funding agencies that raw data be deposited as soon as (or even before) a paper is published, multiple factors often prevent data from being accessed and reused by other researchers.

The situation with human genomic data is even more dramatic. On the one hand, it is probably the most important data to share: it lies at the heart of efforts to combat major health issues such as cancer, genetic diseases, and genetic predispositions for complex diseases like heart disease and diabetes. On the other hand, since human genomic data contains sensitive and personal information, it is often exempt from data sharing requirements.

We found that, on average, researchers use 4-5 genomic data repositories on a regular basis. At the same time, there are many more sources of data available that are often unknown to researchers. We have addressed the most pressing problem for public genomic data, that of data discoverability, by indexing worldwide resources for genomic research data on an online platform (repositive.io) that provides a single point of entry to find and access available genomic research data.

Fiona will present an overview of genomic data sources around the world and discuss potential solutions for improving ethical and efficient data sharing.

Fiona is a bioinformatics scientist specialising in genome analysis with 15 years of experience in software development and project management. Fiona left her job at Illumina Cambridge in 2013 to pursue her vision of enabling efficient genomic data sharing and founded the charity DNAdigest. In August 2014 she founded Repositive as a spin-out of DNAdigest.

Peter Charlton

“The processes and benefits of sharing clinical data”

This talk will present the case for appropriate sharing of clinical (patient-sensitive) data, with examples taken from an ongoing research project. The typical processes of acquiring clinical data will be illustrated, including obtaining ethical approval, obtaining informed consent from participants, and publishing the research methodology a priori. Processes for managing clinical data will be demonstrated, including anonymisation, cleaning, formatting and public dissemination. Key benefits of sharing clinical data and analysis tools will be highlighted, including enhancing the reproducibility of research, providing educational value, and laying the foundations for future developments.

Peter Charlton gained the degree of MEng in Engineering Science in 2010 from the University of Oxford. Since then he has held a research position, working jointly with Guy’s and St Thomas’ NHS Foundation Trust and King’s College London. Peter’s research focuses on physiological monitoring of hospital patients, encompassing signal processing, evaluation of wearable sensors, and the design of real-time alerting algorithms. His role involves working with patients, clinicians, and engineers to develop novel technology-based solutions for unmet clinical needs.

Prof. Henry Rzepa

“Chemical Science and Data Repository Design”

A case study in molecular sciences will be presented as an interactive demonstration, available at doi: 10.14469/hpc/646. Our experience of using data repositories over 11 years, with well in excess of 100,000 items deposited and assigned DOIs during this period, led us to identify a number of simple, fundamental features for the design of an effective new repository, features we believe will encourage working scientists to use such tools on a regular and productive basis. There are two core components: the ELN (electronic lab notebook) and the repository itself. Integrated into the ELN is a simple “publish to a repository” button, which initiates a workflow that collects core metadata and injects it into the repository against a well-defined schema. The metadata is then sent to DataCite in exchange for a DOI. The case study will explore the benefits of such a workflow, illustrated via a newly designed (open-source) data repository (doi: 10.17616/R3K64N), which is starting to be used regularly for the collaborative authoring of journal articles in which the data citations are an exposed component, as required by funding councils to demonstrate effective RDM.

Henry Rzepa is emeritus professor of computational chemistry at Imperial College London. He has been interested in applying communication technologies to molecular sciences since well before the current internet, and since 1993 his research has included the development of publishing models for chemical data guided by what are now called the FAIR principles (findable, accessible, interoperable and reusable), with an emphasis on integrating visualisation tools into the processes. He writes a regular blog where many of these themes are discussed and indeed practised.

Julie Sullivan

“InterMine: A Data Integration Model for Re-use of Biological Data”

The InterMine data warehouse is an integration and analysis system for diverse biological datasets. One of InterMine’s principal aims is to enable data re-use. This is achieved by exposing data through intuitive and accessible user interfaces, by making use of commonly used biological data sources and formats, and by conforming to many current biological standards. On a more technical level, InterMine provides comprehensive web services with API bindings for many widely used languages, making it easy to access data from computer programs. Work is ongoing to enable cloud deployments of InterMine instances through Docker, making it easy to start building new InterMine databases.

The InterMine system has been adopted by many groups as a way to present biological data to their user communities in a way that maximizes the knowledge that can be extracted from the original data. The regulation of biological systems occurs at many levels and can only be fully understood by analyzing many types of data together. The integration of biological data within InterMine enables such data sets to be filtered and combined, thus enriching their interpretation and re-use. As an example we will look at a study in yeast which made use of several integrated datasets for the prediction of chemotherapy targets.

Julie Sullivan is a software developer at InterMine, an open source biological data warehouse developed by the Micklem lab at the University of Cambridge. InterMine is part of the NIH’s BD2K initiative and is a named resource for the ELIXIR UK node. A current focus of the team is enabling greater data sharing and interoperability between resources through the creation of cloud instances and unified interfaces. Read more at http://www.intermine.org and follow us on Twitter @intermineorg.

Dr. Jamie Moore

“Project ADRCE_001: Experiences, reflections and using linked census – survey data to monitor survey non-response”

ADRCE_001 is the first project undertaken by the Administrative Data Research Centre for England (ADRCE). In it, we use linked census data to monitor survey non-response error during data collection, in order to inform modifications that optimise quality-cost trade-offs. Here, we reflect on experiences during this ongoing project, including the application process, negotiations with data owners, and conducting research. To illustrate the benefits of using linked datasets, we then briefly mention findings so far: that survey data collection can potentially be ended early, reducing costs substantially, with little impact on dataset quality.

Dr. Jamie Moore is a Research Fellow at the Administrative Data Research Centre for England (ADRCE) and the Department of Social Statistics and Demography, University of Southampton. His current research interests are in survey and record linkage methodology.



Sigourney Luz and Marta Teperek


Murray Edwards College, The University of Cambridge
Huntingdon Rd
Cambridge, CB3 0DF United Kingdom