Monthly Roundup - March 2021


News Spotlight

The DCN hit a major milestone of our Alfred P. Sloan grant to grow and sustain our project by publishing our Data Curation Network “Sustainability and Transition Plan.” This plan starts to take shape this summer as we formally move from a grant-supported project to a member-driven organization. Stay tuned for more information, which will be presented at the 2021 All Hands Meeting.


Big Data

The Big Data Interest Group is planning to meet in late April or early May 2021, when they will revisit plans for a best-practices type document as their output.


The DCN Education subgroup met on March 30th. We discussed an extension of the current IMLS grant, as well as platform and/or partnership options for creating self-service data curation training modules. We also began discussing the future of the group within the new DCN governance structure. A follow-up discussion on DCN Educational work will take place at the bi-weekly team meeting on April 27, 2021.

End User Satisfaction

The End User Satisfaction working group has been hard at work since last fall, meeting frequently to develop a survey assessing researcher satisfaction with curation services and to enroll DCN partners. The survey is currently being reviewed by participating partner IRBs and we hope to send it out to researchers before the end of April 2021!

Human Subjects

The Human Subjects working group has primarily been focused on publishing new primers: the Consent Forms primer was published in late 2020, and the Human Subjects Essentials was published in early 2021. These primers were discussed in a recent blog post. The group hasn’t met recently and plans to meet on an “as needed” basis going forward.

Institutional Outreach & Communications

One of the main things we are working on is sharing the different communication and outreach strategies we are using at our institution through a series we are calling “Steal this Idea.” We have also been thinking about other ways to support collecting end user stories that might be used in marketing, by collaborating with the End User Satisfaction Group, and about producing other communications outputs to promote curation, but that work is still very exploratory.

Racial Justice

The working group met in March 2021 to do a deep dive on how our CURATED model enables us to incorporate elements of racial justice and diversity, equity, accessibility and inclusion into our work. The session was led and facilitated by our consultant Fay Cobb Payton and compared our model to other data models, including CARE, FATE, FEAT, and FAIR, using three example data sets (1, 2, 3) to illustrate the different facets of how our work may address racial justice issues.

Value of Curation

The working group has been meeting bi-weekly to analyze the results of our survey, which closed in January ‘21 and received 120 responses describing curation practices and perceptions of their “value-add” from staff working at generalist, disciplinary and institutional data repositories in the US. Our team plans to publish these results in a journal and will present some preliminary findings to the DCN at an upcoming webinar.


Monthly Reporting


New submissions

DCN-251: Data for: An Examination of Data Reuse Practices within Highly Cited Articles of Faculty at a Research University, a library and information sciences code and tabular dataset, was submitted by Hoa Luong for Illinois and curated by Seth Erickson for Penn State.

DCN-252: Utilizing Virtualized Hardware Logic Computations to Benefit Multi-User Performance, a computer science code dataset, was submitted by Jennifer Moore for WashU and curated by Greg Janée for UCSB.

DCN-253: Dataset for "Quantification of Magnetic Resonance Spectroscopy data using a combined reference: Application in typically developing infants", a neuroscience scientific image dataset, was submitted by Hoa Luong for Illinois and curated by Chen Chiu for JHU.

DCN-254: Scopus API Scripts for Data Reuse Project, an LIS code dataset, was submitted by Hoa Luong for Illinois and curated by Susan Borda for Michigan.

DCN-255: N fixation in green manure, a crop sciences code (R) dataset, was submitted by Sarah Wright for Cornell and curated by Susan Braxton for Illinois.

DCN-256: Seq-Scope processed datasets for liver and colon results (RDS) (Cho_Seq-Scopeprocessed_9c67wn05f), a physiology code (R) and genomics dataset, was submitted by Rachel Woodbrook for Michigan and curated by Katie Barrick for Minnesota.

DCN-257: Public Support for State Investment in Early Childhood Education, a communications code (R) dataset, was submitted by Sarah Wright for Cornell and curated by Marley Kalt for JHU.


DCN-244: Deep-learning model for occulted hard X-ray flare detection, an astrophysics simulation dataset, was submitted by Lisa Johnston for Minnesota and curated by Henrik Spoon for Cornell. Henrik recommended that the authors enhance their documentation (readme) to provide better instructions on how to run the code so that users unfamiliar with the programming language could still run it successfully! Lisa resolved this dataset and reported that due to Henrik's expertise major curation actions were taken. The authors did update their readme as recommended, a sample input file was added, and more contextual information was added to the documentation as well.

DCN-245: Data for: The Potential Impact of a Clean Energy Society On Air Quality, an atmospheric simulation (netCDF) dataset, was submitted by Hoa Luong for Illinois and curated by Susan Borda for Michigan. Susan had a few recommendations for improving the documentation: define the acronyms used in file names, clarify which output files relate to each model, clarify whether output files are raw or processed (and, if processed, include the code used for processing), define which data or parameters were used as model inputs, and include a citation to the paper mentioned in the dataset's description. Hoa reports that though the author did agree with the recommendations, no changes have been made yet.

DCN-249: Southeastern South America Soil Moisture Alteration Experiment Using CESM2, an atmospheric simulation (netCDF) dataset, was submitted by Hoa Luong for Illinois and curated by Susan Borda for Michigan. Susan's only recommendation was that the author clarify whether the data files were raw or processed and, if they were processed, include any code used in the processing. Hoa reported that minimal curation actions occurred based on Susan's recommendation, but that the author did include the clarification as recommended.

DCN-250: Characterization of pubertal development of girls in rural Bangladesh, a public health statistical (Stata) dataset, was submitted to the DCN by Marley Kalt and curated by Sophia Lafferty-Hess. Sophia reported that the dataset was generally in very good shape but recommended that the author add definitions of and reasons for missing data within the dataset to the data dictionary, and that the local curation team request a copy of the consent form/waiver of consent, if they wished. Marley reported that due to Sophia's curation efforts minimal action was taken and the author did update the data dictionary as recommended.

DCN-252: Data from Virtualized Hardware Logic Computations, a computer science code dataset, was submitted by Jennifer Moore and curated by Greg Janée for UCSB. Greg’s recommendations were primarily concerned with documentation. Specifically, he recommended that the author document how to run the code and any external dependencies the code relies on, define the structure of the output file, and address copyright statements found in some of the files. Jennifer reports that based on Greg’s expertise minimal curation actions were taken and the researcher did address the recommendations.

Curator’s Corner

Get to Know DCN Curator Susan Braxton!

How did you come to your current position?

When we moved to Urbana-Champaign I wasn’t a librarian yet, but I’d done a lot of library-type work (ex. I’d worked at the USDA in the Biological Control Documentation Unit developing a database)….

Click to read the full interview!