Presentation: Research Data Management Basics

January 24, 2014
Faculty Development Day
John Jay College of Criminal Justice

Slides (PDF, 3MB)
Handout (PDF, 1MB)

These works are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


This presentation covers the basics of data management. It’s structured around the Research Data Management Guide I wrote for the library.

(These slides also feature an OPP/DMP pun, which fell completely flat at the time, but which an audience member told me five months later was “hilarious.” File that under Longest Awkward Silence After Bad Joke.)

Outline

What does data look like? 

  • Databases & spreadsheets
  • Interview audio & transcriptions
  • Code, programs, & software
  • Text corpora
  • Photos, video

What does data management look like?

  • Planning for data collection
  • Processing data for analysis
  • Curating data to minimize error
  • Storing data securely
  • Sharing data with the research community
  • Archiving data in a repository

Data storage

  • Remember LOCKSS: Lots Of Copies Keep Stuff Safe!
  • Backup options:
    • Local backup (e.g., secondary server or hard drive)
    • Off-site backup (e.g., office & home)
    • Cloud backup (e.g., paid service)
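
To make the LOCKSS idea concrete, here is a minimal Python sketch that copies one file into several backup folders and checks each copy against a SHA-256 checksum of the original. It was not part of the slides; the function names (`sha256_of`, `back_up`) and the folder layout are hypothetical, and a real setup would more likely rely on dedicated backup software.

```python
from __future__ import annotations

import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destinations: list[Path]) -> dict[Path, bool]:
    """Copy `source` into each destination folder and verify every copy.

    Returns a mapping from each copied file to True if its checksum
    matches the original, False otherwise.
    """
    expected = sha256_of(source)
    results = {}
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        # copy2 preserves timestamps and returns the path of the new file
        copy = Path(shutil.copy2(source, folder))
        results[copy] = sha256_of(copy) == expected
    return results
```

Calling `back_up(Path("survey.csv"), [Path("/mnt/local"), Path("/mnt/offsite")])` would place one copy in each folder (those paths are placeholders). A `False` value in the result flags a copy that should not be trusted.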

Data security

  • CUNY requires encryption of confidential data. We have access to McAfee encryption software and can use other standards, like PGP.
  • Store non-public info on a secure server, rather than desktop computers, flash drives, etc.
  • Password-protect
  • Back up!

Metadata

  • The who, what, when, where, why, how of your research
  • Disciplinary standards
    • Follow community convention as much as possible for better understanding and reuse
  • Discovery standards
    • When storing/sharing data, providing basic metadata means your data will be more discoverable
  • Protect your reputation and your data
    • Metadata gives your data and project the context necessary for understanding
    • Prevent data misinterpretation!
    • Prevent data rip-offs!
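
As an illustration of the "who, what, when, where, why, how" bullet, here is a small Python sketch that writes those six answers to a JSON file stored next to the data file. It is not from the slides and does not follow any disciplinary standard: the `write_metadata` helper and the `.metadata.json` naming are assumptions made for the example.

```python
import json
from pathlib import Path

# The six questions from the slide, used as required field names.
REQUIRED_FIELDS = ("who", "what", "when", "where", "why", "how")


def write_metadata(data_file: Path, **fields: str) -> Path:
    """Write a JSON sidecar describing `data_file`.

    Raises ValueError if any of the six fields is missing or empty,
    so an incomplete record never gets saved.
    """
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError(f"missing metadata fields: {', '.join(missing)}")
    # Hypothetical convention: interviews.csv -> interviews.csv.metadata.json
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    record = {name: fields[name] for name in REQUIRED_FIELDS}
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Keeping the description in a plain-text file beside the data means the context travels with the data whenever it is copied, shared, or deposited. Where your discipline has its own metadata standard, use that instead of a homemade scheme like this one.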

Sharing your data

  • Why share?
    • Reproducibility
    • Impact
    • Funder requirements & recommendations
    • For science! / For the humanities!
  • How can you share your data?
    • Submit to a repository
      • Open, restricted, or embargoed
    • Include data as supplemental material in your publication
    • Give your data a DOI and cite it
    • Personally provide data (or a form of it) on request
  • Relevant repositories
    • ICPSR (social science data)
    • National Archive of Criminal Justice Data
    • Institutional repositories

Data management plans (DMPs) 

  • A data management plan (DMP) is a document that describes the data a research project will collect and the steps that project investigators will take to ensure that the data is secure, standardized, documented, and (optionally) shareable.
  • Some funding agencies require a DMP to be submitted with project proposals. The biggies:
    • NSF
    • NIH
    • NOAA
    • NEH-ODH
  • Most DMPs ask for information like the following:
    1. Summarize your project
    2. Who is responsible for the data?
    3. List the kind(s) of data your project will create and which format(s) you’ll be using
    4. Identify the standards to which your data must adhere
      • disciplinary standards
      • metadata standards
      • naming conventions used
    5. Define your plans for
      • data storage and security
      • sharing the data and policies of reuse
      • how you will archive and preserve the data (e.g. a repository)
  • Resources
    • DMPTool
      • Step-by-step guide
      • Provides examples
    • DMP templates & examples are on the John Jay library’s Research Data Management subject guide
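
The five questions most DMPs ask can also be treated as a checklist. The Python sketch below is only an illustration, not a funder template: the `DataManagementPlan` class, its field names, and the `unanswered` helper are hypothetical. It records one answer per question and reports which are still blank.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """One free-text answer per question in the outline above."""
    summary: str = ""                    # 1. Summarize your project
    responsible_party: str = ""          # 2. Who is responsible for the data?
    data_types_and_formats: str = ""     # 3. Kinds of data and formats
    standards: str = ""                  # 4. Disciplinary/metadata/naming standards
    storage_sharing_archiving: str = ""  # 5. Storage, sharing, and archiving plans


def unanswered(plan: DataManagementPlan) -> list:
    """Return the names of sections that are still blank."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]
```

For example, `unanswered(DataManagementPlan(summary="Survey of court records"))` lists the four sections that still need text. A funder's own template always takes precedence over a generic structure like this.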
