
Monthly Roundup - June 2021

Updates

  • Education: IMLS confirmed the no-cost extension to Penn State to wrap up the Specialized Data Curation Workshops grant. The education group will work on getting the workshop curricula up on GitHub over the next 6 months. (See extended update from Hannah Hadley below.)

  • Governance: The Governance Board will hold its first meeting on July 20th at 10am (CST), and all are welcome to join the Zoom call. The board will vote to ratify the proposed Governance Model. If you have any questions, please contact your DCN Representative.

  • Welcome Michael J. Fox Foundation! In case you missed it during the All Hands Meeting, the Foundation is the newest sustaining member joining us for Year 4! They will be represented by Leslie Kirsch and Josh Gottesman.

News Spotlight

DCN Holds 4th Annual All Hands Meeting!

The 4th annual All Hands Meeting was held virtually June 22-24, 2021, from 1-4pm EST each day. Special thanks to the planning team that made the virtual format fun, participatory, and informative: Liza Coburn (DCN/Minnesota), Katie Wissel (NYU), Wendy Kozlowski (Cornell), Dorris Scott (WashU), Wanda Marsolek (Minnesota), and Jennifer Moore (WashU).

Read the full post, including highlights from the meeting, on the DCN blog!

Research

Big Data

The Big Data group met during a breakout session hosted by Wendy Kozlowski. After jumping right into conversations about what data to keep, for how long, best practices for preservation and replicability, and curation processes, the Big Data interest group has decided to continue to convene. We will meet quarterly (next meeting in September) and will revisit options for format and possible outputs. The interest group will be open to DCN members and interested curation community members, and logistics will be organized by Erin Clary and Wendy Kozlowski.

Education

An education interest gathering was held in the AHM breakout sessions. Recent discussions from the DCN Group CURATED for Graduate Students were highlighted. The breakout participants also shared data-themed education resources (e.g., data management, curation, format expertise) or training they have offered, and discussed interests or needs for the same. One takeaway was a mutual need for practice datasets or code to enhance training experiences. Anyone interested in continuing these discussions by forming an interest group may contact Hannah Hadley. Logistics will be discussed and worked out with interested persons. An education interest group would be open to all members of DCN and others in the curation community who express interest.

Additionally, members of the Education Committee are using remaining IMLS grant funds to publish training content to GitHub Pages. The generous IMLS funding allowed us to provide three free Specialized Data Curation Workshops and facilitate the creation of twenty-four Data Curation Primers. We were not able to provide an additional workshop due to the pandemic. However, publishing essential training content will assist in building expertise broadly. The training content will be conveniently adjacent to the Data Curation Primers and promote their usage. We understand that online training cannot replace the hands-on experience normally offered in our workshops, so we plan to further develop our curriculum content and resume in-person training opportunities in the future.

End User Satisfaction

At the AHM, the End User Satisfaction Survey SIG, represented by Hoa Luong and Sarah Wright, shared the early results of the short survey that was sent to all researchers who deposited data with the participating repositories between Jan 1, 2019 and March 15, 2021. With a 40% response rate (227 responses), the feedback was overwhelmingly positive: 95% of respondents agreed that data curation adds value to the data sharing process, 98.2% would recommend their colleagues submit data to our repositories, and numerous text responses were glowing endorsements of both our services and repository curators, who were often specifically named in the praise. Next steps for the SIG include publishing de-identified survey results, sharing the survey instrument and possibly creating a question bank for other repositories to use to assess user satisfaction, and writing a paper to further share our findings. Contact Sarah Wright with any questions.

Human Subjects

Jen Darragh hosted a breakout session at the AHM. In the session they discussed what they’d worked on to date as well as possible future directions around geospatial data and human subjects confidentiality concerns (and other ethical concerns).

Institutional Outreach & Communications

During the AHM, DCN members met informally to chat about how we communicate and advocate for data sharing with specific stakeholders, such as the IRB, and to share other experiences. The date of the next Institutional Outreach and Communications Interest Group meeting is still TBD. The group will discuss next steps, including potential collaborations with the End User Satisfaction group.

Racial Justice

On behalf of the Racial Justice Working Group, Mara Blake (JHU) and Lisa Johnston (Minnesota) presented a summary of the report from consultant Fay Cobb Payton. The group does not yet have a meeting schedule, but when we do, we will share the time so others can join us to discuss our next steps in implementing the short-term recommendations.

Refining the Curation Protocol

At the All Hands Meeting, Susan Braxton reported on the group’s work and their goals in refining the curation protocol. The refinement of the model and checklist highlights the overlapping and interdependent nature of the CURATED steps, identifies format-agnostic essential tasks, and articulates ethics considerations for each part of the curation process.

Value of Curation

Renata Curty, Wendy Kozlowski, and Sophia Lafferty-Hess presented preliminary results from a survey project undertaken by the Value of Curation Interest Group to examine the level of curation performed by US-based repositories and the perceived value of curation activities. The group is currently working on drafting a publication to share the results of this study with the broader community.

Curation

Monthly Reporting

New Submissions

DCN-269: Simulation data for "Adsorption of Charge Sequence-Specific Polydisperse Polyelectrolytes", a chemical engineering code/simulation dataset, was submitted by Lisa Johnston for Minnesota and curated by Susan Borda for Michigan.

DCN-270: Dataset for "Safety and data quality of EEG recorded simultaneously with multi-band fMRI", a neuroscience code (MATLAB) dataset, was submitted by Hoa Luong for Illinois and curated by Seth Erickson for Penn State.

DCN-271: Seeding Rates for Cover Crop Mixtures, a crop sciences code (R) dataset, was submitted by Sarah Wright for Cornell and curated by Jess Herzog for Dryad.

Resolved Datasets

DCN-194: Floral resource diversity drives bee community diversity in prairie restorations along an agricultural landscape gradient, an entomology R code dataset, was submitted by Valerie Collins for Minnesota and curated by Katie Barrick for Minnesota and Briana Ezray for Penn State. Briana recommended that the researcher provide documentation describing the data files, the process for development of the R script, and the version of R used, and that they update the licensing statement. She also recommended some changes to the R script itself. Katie recently closed this dataset, reporting that as a result of Briana’s expertise, major curation actions were taken, including the addition of data files and updates to the documentation.

DCN-255: Data from: Estimating agronomically relevant symbiotic N fixation in green manure breeding programs, a plant sciences code (R) dataset, was submitted by Sarah Wright for Cornell and curated by Susan Braxton for Illinois. Susan's recommendations included adding a file manifest or documenting file relationships in some way, looking into some possible missing files, and updating labels of photographs by tagging them with species names. Sarah resolved this dataset on Friday, reporting that the researcher addressed all of Susan's feedback and that major curation updates were made: the files were updated (some added, some removed), the manifest included in the readme was updated, and species name labels were added to thumbnails.

DCN-268: Data from: Therapeutic Frequency Profile of Subthalamic Nucleus Deep Brain Stimulation is Shaped by Antidromic Spike Failure, a neurobiology code (MATLAB) dataset, was submitted by Jen Darragh for Duke and curated by Chen Chiu for Johns Hopkins. Chen made substantial recommendations on the initial readmes, including to: provide information about all of the packages and toolboxes needed to run the code (and better citations so the user can find them easily), update file names to make the code run better, remove unnecessary/unused directories from the dataset, define the variables in the experimental data, convert the experimental data to an open format (CSV), and choose a software license for the code. Jen finalized this dataset, reporting that due to Chen's expertise, major curation actions were taken and the depositor accepted Chen's recommendations and made the necessary changes.

Curator’s Corner

Get to Know DCN Curator Melinda Kernik!

How did you come to your current position?

I was a graduate student at University of Minnesota in the Department of Geography, Environment, and Society. In my first year, I had a summer job at the Map Library – georeferencing scanned maps and digitizing historical building footprints…

Read the full interview on the DCN website!