Monthly Roundup - April 2021

Updates

  • DCN met with COAR director Kathleen Shearer on April 20, 2021

  • Value of Curation group presented at IDCC April 19, 2021

  • Our DEI consultant Fay Cobb Payton has delivered our co-created action plan and final report! We’ll be discussing these more at the AHM in June.

  • AHM Draft Schedule is up on the website!

  • Note: going forward, the interest group updates will appear in the newsletter on a quarterly basis

News Spotlight

New guide offers institutions a path toward public access for research data

In April 2021, the Association of American Universities (AAU) and the Association of Public and Land-grant Universities (APLU) released their “Guide to Accelerate Public Access to Research Data” to prompt campus discussions on how each institution might improve public access to data resulting from federally funded research and open science.

The Data Curation Network applauds this milestone in a multi-year effort in which our member institutions have been actively participating. And the DCN is thrilled that we’re highlighted in the guide as a model for collaboration across institutions for providing long-term stewardship for research data.

Read the guide and join in the discussion at your institution!

Curation

Monthly Reporting

New submissions

DCN-258: Long term forest regeneration data from the BWCAW, a forestry and forest sciences database (Access) dataset, was submitted by Melinda Kernik for Minnesota and was curated by Dave Fearon for JHU.

DCN-259: Ferric iron triggers greenalite formation in simulated Archean seawater Dataset (Hinz_Ferriciron_tt44pn13w), an earth and environmental sciences tabular dataset, was submitted by Rachel Woodbrook for Michigan and was curated by Wanda Marsolek for Minnesota.

DCN-260: Single-molecule microscopy image data and analysis files for “A translational riboswitch coordinates nascent transcription–translation coupling” (Chatterjee_Single-moleculemicroscopy_7h149q145), a chemistry code (MATLAB) and genomic dataset, was submitted by Rachel Woodbrook for Michigan and curated by Katie Barrick for Minnesota.

DCN-261: Gonzalez 2021 tagged CO model archive, an atmospheric chemistry netCDF dataset, was submitted by Katie Barrick for Minnesota and curated by Jon Petters for Virginia Tech.

DCN-263: Canadian Cordillera Fault Gouge XRD and isotopes (Lynch_CanadianCordillera_tt44pn145), an earth and environmental sciences tabular dataset, was submitted by Rachel Woodbrook for Michigan and curated by Xuying Xin for Penn State.

DCN-264: Temperate and chronic virus competition, a quantitative biology code (MATLAB) dataset, was submitted by Hoa Luong for Illinois and curated by Chen Chiu for JHU.

Resolutions

DCN-246: Bedform tracking tool, a civil and environmental engineering code (MATLAB) dataset, was submitted by Lisa Johnston for Minnesota and curated by Seth Erickson for Penn State. Lisa resolved this dataset last week, reporting that, due to Seth's expertise, major curation actions were taken. Seth noted that the initial documentation was very good but could be improved by adding information about the versions of MATLAB used and the operating systems the code had been tested on. He also recommended that the license be included within the source code, that the author include sample outputs with the dataset, and that the README be provided as a plain text file instead of a Word document. All of Seth's recommendations were accepted and followed up on.

DCN-253: Dataset for "Quantification of Magnetic Resonance Spectroscopy data using a combined reference: Application in typically developing infants", a neurosciences scientific image dataset, was recently resolved by Hoa Luong for Illinois. This dataset was curated by Chen Chiu for JHU, and Hoa reports that, due to Chen's expertise, major curation actions were taken. Chen recommended that the author include the scripts used to process/analyze the data with the dataset instead of pointing to GitHub (the link was already broken), and that they fix a few other small things (an erroneous link in the documentation and an empty directory). Hoa reported that some new documentation was added to the dataset along with the code.

DCN-256: Seq-Scope processed datasets for liver and colon results (RDS) (Cho_Seq-Scopeprocessed_9c67wn05f), a physiology code (R) and genomic dataset, was submitted by Rachel Woodbrook for Michigan and curated by Katie Barrick for Minnesota. Rachel resolved this dataset last week, reporting that, due to Katie's expertise, minimal curation actions were taken. Katie recommended that the author create a README for the dataset including descriptions of the files, information about how the files relate to each other, and version information for the code packages used. Katie also recommended the author provide links to the experimental outputs deposited. Rachel reports that, among other things, links were added to articles and additional data.

DCN-260: Single-molecule microscopy image data and analysis files for “A translational riboswitch coordinates nascent transcription–translation coupling” (Chatterjee_Single-moleculemicroscopy_7h149q145), a chemistry code (MATLAB) and genomic dataset, was submitted by Rachel Woodbrook for Michigan and curated by Katie Barrick for Minnesota. Rachel closed out this dataset last week, reporting that, due to Katie's expertise, minimal curation actions were taken. Katie recommended that the author include information about the version of MATLAB used to write the code, take additional steps to capture and document details of the coding environment, and share information about how long the code should take to run to completion. Rachel reported that the software version and runtime were added to the documentation, as well as an updated citation for the associated manuscript.

DCN-261: Gonzalez 2021 tagged CO model archive, an atmospheric chemistry simulation and code (netCDF and Fortran) dataset, was submitted by Katie Barrick for Minnesota and curated by Jon Petters for Virginia Tech. Jon recommended that the submitter improve the documentation for this dataset by including the names of the data authors, describing the code used as well as the model output, and indicating whether the code had been adapted. Jon also recommended changing the naming conventions of various files and directories in this dataset and explaining the conventions in the documentation. Katie resolved this dataset, reporting that, due to Jon's expertise, minimal curation actions were taken. She reported that the author responded to all of Jon's suggestions: a new README file was added and some small changes were made to the dataset.