Research Data Management

Support for storing, sharing, and preserving research data.

Overview

After a research project ends, it may be necessary to make the data, or at minimum information about those data, available to other researchers per the Data Management Plan. A good way to disseminate research data to colleagues and more widely is to deposit it in a repository or archive. Controlling access to these data can then be accomplished in a variety of ways, for example using passwords, encryption, or a permissions authorization system.

Intellectual property and licensing

Overview

The general philosophy of Mines is that research data should be made available for access and re-use where appropriate and under appropriate safeguards. Open access to research data from public funding should be easy, timely, and user-friendly. However, not all data can or should have unrestricted access. Availability of certain data may need to be restricted due to confidentiality, contractual, or other issues. Note that federal programs may differ in their definitions of restricted, confidential, sensitive, and classified data. Data ownership and control issues can also be complex in some situations.

Mines Policy

Mines policy grants ownership of research data and materials to the school. The researcher generally retains the rights and responsibilities over control and licensing of data and related materials.  Distribution is also at the discretion of the principal investigator, based on the Data Management Plan, if it exists, and barring any limits imposed by confidentiality agreements or funding agency restrictions.

Sharing Data Appropriately

Preserving data in data centers or repositories that are managed by trusted entities for long-term access is a common and perhaps preferred way to share data. Other options are to share directly with colleagues via email or collaborative networks. There are a number of important issues to consider when planning for data sharing, such as:

• Does the research project have sufficient permissions necessary to disseminate the project data?
• Does the project need to provide access to all the data produced under a grant?
• Do the data include any private information, medical information, or other information with possible confidentiality concerns?
• Would the project like Attribution/Acknowledgment to be required or requested?
• Would the project like to receive information regarding the use of the project data by users?
• Would the project like to provide permission for users to redistribute project data under certain conditions?

Exclusive rights to reuse or publish research data should not be handed over to commercial publishers or agents without retaining the rights to make the data openly available for re-use, unless this is a condition of funding.

Legal, Technical and Social Aspects of Open Data

When applying a license to your own data, you are encouraged to make it as open as appropriate, to enable others to use and build on your data. See the Open Data Handbook for more information on the legal, social and technical aspects of open data.

Tools for End Users

OVERVIEW

This page recommends tools, and sites that will help you find tools, for manipulating, visualizing, and interacting with data, metadata, web technologies, etc. Be sure to check out the CCIT software list.

RECOMMENDED SITES FOR SEARCHING FOR TOOLS
RECOMMENDED TOOLS

Ocean Data Viewer

DATA WRANGLING:

OpenRefine

STATISTICAL SOFTWARE:

R (Website) (Campus Computer Lab)

RStudio

MATLAB (Campus Computer Lab)

Controlling Access

Retrieval and access procedures for restricted data are based on the data archive and the individuals deemed responsible for providing access. Retrieval and access to both restricted and unrestricted research data should be aligned with funder/sponsor requirements and based on the Data Management Plan. Controlling access can then be accomplished in a variety of ways, such as password protection or encryption, or through a system of permissions authorizations involving one or more Gatekeepers. Note that the researcher’s ideal choice of permission restrictions may conflict with the rules of a given repository.

If data are restricted, one or more individuals responsible for authorization will need to be specified as the Gatekeeper, whose role it is to control access. Gatekeepers can be the Research Support Services group, the PI, an Office of Research Administration employee, a data center’s archivist, or whoever is thus designated in the Data Management Plan. If a permission authorization system is to be used, specific requirements and guidelines for evaluating the request and providing access need to be specified.
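
As an illustration only, the Gatekeeper idea described above can be sketched in a few lines of Python. This is not a Mines system or any repository's actual interface: the Dataset class, the approve and can_access functions, and the example.edu addresses are all hypothetical names chosen for the sketch. It assumes that unrestricted data are open to everyone and that restricted data require a recorded approval from a designated Gatekeeper.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    restricted: bool
    # Hypothetical: people designated as Gatekeepers in the Data Management Plan.
    gatekeepers: set = field(default_factory=set)
    # Requesters whose access a Gatekeeper has approved.
    approved: set = field(default_factory=set)


def approve(dataset, gatekeeper, requester):
    """Record an approval, but only if it comes from a designated Gatekeeper."""
    if gatekeeper not in dataset.gatekeepers:
        raise PermissionError(f"{gatekeeper} is not a Gatekeeper for {dataset.name}")
    dataset.approved.add(requester)


def can_access(dataset, requester):
    """Unrestricted data are open; restricted data need a prior approval."""
    return (not dataset.restricted) or requester in dataset.approved


# Example with placeholder addresses (assumptions, not real accounts).
survey = Dataset("field-survey", restricted=True, gatekeepers={"pi@example.edu"})
print(can_access(survey, "student@example.edu"))
approve(survey, "pi@example.edu", "student@example.edu")
print(can_access(survey, "student@example.edu"))
```

In practice the criteria a Gatekeeper uses to evaluate a request would come from the Data Management Plan and the chosen repository's rules; the sketch only shows where that decision is recorded and checked.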

Sharing Data

OVERVIEW

Depending on the research discipline, data can often be deposited in one or more data centers (or repositories) that will provide access to the data. These repositories may have specific requirements with regard to subject/research domain, data re-use and access, file format and data structure, and metadata.

INSTITUTIONAL REPOSITORY INFORMATION

The Mines Repository is a stable platform for sharing Mines’ work. The IR fosters research, learning, and discovery by sharing Mines’ scholarly, educational, and creative content with the world.

See library.mines.edu/research/ir/ for more information on depositing your work.

The institutional repository has file size limitations; therefore researchers must contact the Library to ensure sufficient storage availability. For large datasets, other arrangements with IT&S may be required to support local storage and access.  Alternatively, the researcher may opt for off-campus archival storage.

EXTERNAL REPOSITORY

There are many different external repositories in which to deposit data. In many cases, repositories and data centers will have their own policies and procedures regarding transfer, access permissions, data formats, metadata creation, retention periods, and costs. If you are going to use a repository/data center, check its policies before including it in a Data Management Plan. For any data deposited externally, metadata must still be created and added to the Mines institutional repository in order to facilitate discovery and re-use.
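
To make the idea of a metadata record for externally deposited data more concrete, here is a minimal Python sketch. The field names are loosely modeled on Dublin Core element names (title, creator, date, identifier, rights, description). The list of required fields, the sample values, and the example.org address are assumptions made for illustration; the fields the Mines repository actually requires should be confirmed with the Library.

```python
import json

# Assumed set of required fields; the real requirements come from the repository.
REQUIRED_FIELDS = ("title", "creator", "date", "identifier", "rights")


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Placeholder record; every value below is invented for the example.
record = {
    "title": "Example dataset title",
    "creator": ["Researcher, Example"],
    "date": "2020-01-01",
    # Placeholder link to the copy held in the external repository.
    "identifier": "https://repository.example.org/dataset/123",
    "rights": "CC BY 4.0",
    "description": "Short summary of what the data contain and how they were collected.",
}

problems = missing_fields(record)
if problems:
    print("Missing:", ", ".join(problems))
else:
    # Serialize the complete record so it can be shared or submitted.
    print(json.dumps(record, indent=2))
```

Checking the record for gaps before submission is a design choice of the sketch; it simply catches empty fields early, whatever the final field list turns out to be.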

 

DISCIPLINE-RELATED REPOSITORIES
CHEMISTRY
  • Cambridge Structural Database – small molecule crystal structures
  • ChemSpider – free-to-access collection of chemical structures and their associated information
  • eCrystals – x-ray crystallographic data
  • PubChem – NCBI’s repository of bioactivity/bioassay data and information for “small” molecules (i.e., not macromolecular). Both text-based and structure-based search tools are provided.
COMPUTER SCIENCE
ENVIRONMENTAL AND GEOSCIENCES
GIS AND GEOGRAPHY
LIFE AND BIOLOGICAL SCIENCES
PHYSICS
SOCIAL SCIENCES
  • ICPSR (Inter-university Consortium for Political and Social Research) A non-profit, membership-based data archive located at the University of Michigan. The UO is a member of ICPSR, which allows students, staff, and faculty to access ICPSR data files and documentation for research.
  • Dataverse Network is a collection of social science research data contained in virtual data archives called “dataverses”. It is maintained by the IQSS (Institute for Quantitative Social Science at Harvard); you can create your own “dataverse” and upload your data, subject to certain terms.
DIRECTORIES OF REPOSITORIES
  • re3data (“REgistry of REsearch REpositories”) List of repositories
  • DataBib List of repositories
  • DataCite List of Repositories Compiled by the British Library, BioMed Central, and the UK’s Digital Curation Centre.
  • Distributed Data Curation Center: Other Data Repositories Managed by Purdue University Libraries, the Distributed Data Curation Center lists more than 50 open data repositories from a range of science disciplines.
  • Gene Expression Omnibus The Gene Expression Omnibus (GEO) is an open data repository which provides access to microarray, next-generation sequencing, and other forms of functional genomic data submitted by the scientific community.
  • Global Change Master Directory The Global Change Master Directory, maintained by the Earth Sciences Directorate at the National Aeronautics and Space Administration (NASA), provides access to more than 25,000 earth and environmental science data sets, relevant to global change and Earth science research.
  • MIT Data Management and Publishing: Sharing Your Data The MIT Libraries’ subject guide on data management and publishing includes a list of open data repositories spanning the disciplines of astronomy, atmospheric science, biology, chemistry, earth science, oceanography and space science.
  • Oceanographic Data Repositories Funded by the National Science Foundation, the Biological and Chemical Oceanography Data Management Office (BCO-DMO) provides access to several oceanographic data repositories created by the US Joint Global Ocean Flux Study and US Global Ocean Ecosystems Dynamics programs.
  • Open Access Directory: Data Repositories Launched in 2008 and hosted by the Graduate School of Library and Information Science at Simmons College, the Open Access Directory is a wiki that lists links to over 50 open data repositories in the disciplines of archaeology, biology, chemistry, environmental sciences, geology, geosciences and geospatial data, marine sciences, medicine and physics, as well as multidisciplinary open data repositories.
  • Public Data Sets on Amazon Web Services Amazon Web Services provides a centralized place to download public domain and non-proprietary astronomy, biology, chemistry and climatology data sets.