How to disclose data for double-blind review and make it archived open data upon acceptance
Openness in science is key to fostering progress through transparency, reproducibility, and replicability. Although open access and open data are two essential pillars in open science, open data forms the foundation for excellence in evidence-based research. For years, I have promoted open science practices, both open access and open data, within the software engineering research field.
While I strive to turn these recommendations into requirements (we are making progress), numerous authors have expressed concerns about potential missteps when double-blind review and data sharing occur simultaneously. The process is indeed straightforward, but once the data is appropriately open and archived, there’s no turning back [3].
Firstly, I would like to reiterate that you should not distribute preprints, postprints, and datasets from non-persistent systems such as personal websites (whether on your personal server or your institution’s server) or consumer cloud storage (e.g., Dropbox, Google Drive). These systems are extremely volatile, and data can disappear from them, sometimes within a short time [6].
Scientific and research communities require your knowledge to last forever–and yes, any knowledge you produce is valuable. This is why your data (and pre/postprints) should be released under a proper license and preserved in archived repositories, where no one, not even you, can delete it.
Here’s where figshare.com [7] and Zenodo.org come into the picture. Figshare and Zenodo are two discipline-agnostic platforms for archived open data. Figshare is a for-profit enterprise managed by open science proponents, whereas Zenodo is a non-profit institution supported by OpenAIRE and CERN. Both are free for users and offer the same functionality, but their content archival methods differ. Figshare’s content is secured and distributed with CLOCKSS, while Zenodo, to my knowledge, uses no additional digital preservation or archival system; it is, however, hosted at CERN, which provides reasonable confidence in its data preservation capabilities.
Without further ado, let’s get into it. I’ll explain the process for figshare first, followed by Zenodo. I’m assuming that you already have:
- An anonymized dataset ready for third-party people and machines to view.
- An account on either figshare or Zenodo. You don’t need an anonymous/dummy account.
I have created a dummy dataset named test.csv to demonstrate the process for both figshare and Zenodo.
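Before uploading, it can help to scan the dataset for strings that would reveal who you are. The following is a minimal sketch and not a feature of figshare or Zenodo: the function name `find_identifying_cells`, the sample contents, and the search terms are all hypothetical, and a plain substring search like this one only catches the terms you list.

```python
import csv
import io


def find_identifying_cells(csv_text, terms):
    """Return (row_number, column_name, term) for every cell containing a term.

    Matching is a case-insensitive substring search. Row numbers start at 1
    for the first data row; the header row is not counted or searched.
    """
    lowered = [term.lower() for term in terms]
    hits = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        for column, value in row.items():
            # Skip missing cells and overflow columns (not plain strings).
            if not isinstance(value, str):
                continue
            cell = value.lower()
            for term, needle in zip(terms, lowered):
                if needle in cell:
                    hits.append((row_number, column, term))
    return hits


# Hypothetical contents standing in for a file such as test.csv.
sample = "participant,affiliation,score\nP1,Example University,4\nP2,Other Lab,5\n"
for hit in find_identifying_cells(sample, ["example university"]):
    print(hit)
```

An empty result does not prove the dataset is anonymous; it only means none of the listed terms appear in the cells.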
Double-blind data submission on figshare
In figshare, when you initiate a submission, you’ll need to fill in details such as title, authors, submission categories, item type, keywords, and description.
You can provide unblinded author details at this stage. The author details will be blinded once the item is published for double-blind review. Ensure that the title and description are blinded, as these fields will become visible.
Now for the crucial part. Make sure to select Generate a private link and do not check the Publish option. The following screenshot displays the correct settings.
There’s no need to check the options Make file(s) confidential [8] and Reserve Digital Object Identifier.
After verifying that Publish is not checked, you can save the item.
Include the private URL in your submission. In my case, the private URL was
Here’s what reviewers see when they open your dataset from the private URL:
The dataset is presented nicely and can be downloaded. The title, category, and description are shown, while the author details and your username are not displayed. You’re all set. The dataset is private to you and those who know the private URL. Furthermore, the dataset is not indexed by figshare.
Important: private URLs at figshare expire after 12 months and can be extended by contacting support. This is designed to prevent the use of private URLs as final placeholders.
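To avoid a private link expiring in the middle of a long review cycle, you may want to note the expected expiry date when you create it. This is a small sketch of the date arithmetic only. It assumes the 12 months are counted as calendar months from the day the link was created, which is my assumption and not something figshare documents here; `private_link_expiry` is a hypothetical helper name.

```python
import calendar
from datetime import date


def private_link_expiry(created, months=12):
    """Return the date `months` calendar months after `created`.

    The day is clamped to the last day of the target month, so a link
    created on 29 February maps to 28 February of the following year.
    """
    month_index = created.month - 1 + months
    year = created.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(created.day, last_day))


print(private_link_expiry(date(2018, 9, 5)))  # 2019-09-05
```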
If you have multiple files for double-blind review (e.g., an appendix, a dataset, and some scripts), upload them separately. Then, create a figshare Collection. Collections can also be shared privately via a single URL.
Open data upon acceptance on figshare
Congratulations on getting your paper accepted. Now, you only need to perform three steps to turn your figshare submission into open data.
- Edit the item and check the Publish option that you left unchecked for the review.
- Choose either the CC0 or the CC-BY 4.0 license. CC0 equates to rendering the data as public domain (it can’t be freer), while CC-BY requires attribution when reusing the data and permits any form of reuse [9]. Earlier versions of the CC-BY license are not appropriately worded for data.
- Use the DOI you’ve just received to properly reference your open data in your paper.
Here is a sample of my published test file:
Graziotin, Daniel (2018): Test upload to demonstrate private sharing for peer review and DBR. figshare. Dataset. DOI: 10.6084/m9.figshare.7048631.v1.
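If you manage references with BibTeX, a citation like the one above can be turned into an entry for your paper. The sketch below is one possible way to format it; the `@misc` entry type, the citation key, and the helper name `dataset_bibtex` are my own choices and not something figshare prescribes. The `url` field relies on the standard `https://doi.org/` resolver prefix.

```python
def dataset_bibtex(key, author, year, title, publisher, doi):
    """Format a dataset reference as a BibTeX @misc entry."""
    fields = [
        ("author", author),
        ("year", str(year)),
        ("title", title),
        ("publisher", publisher),
        ("doi", doi),
        ("url", "https://doi.org/" + doi),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@misc{{{key},\n{body}\n}}"


print(
    dataset_bibtex(
        key="graziotin2018test",  # hypothetical citation key
        author="Graziotin, Daniel",
        year=2018,
        title="Test upload to demonstrate private sharing for peer review and DBR",
        publisher="figshare",
        doi="10.6084/m9.figshare.7048631.v1",
    )
)
```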
Done. You are now an open science hero and my personal hero.
Double-blind data submission on Zenodo
Zenodo also allows submissions of blinded data, but I find the process less intuitive, with a twist. As far as I know, the only way of making a submission compatible with double-blind review is to publish the submission as open access [sic], with all details blinded. Other options will either render the file inaccessible (Embargo access or Closed access options) or expose your username during the request process (Restricted access option).
Important: your dataset will be anonymous and compliant with double-blind review. However, the file will be publicly accessible and indexed on Zenodo.
Start by uploading your dataset. Ensure that you click the green Start upload button after selecting your dataset.
Enter various details, but avoid revealing identifying information. The following part varies from the figshare approach: enter Anonymous as the author first and last name. Even though you’re using your account to upload the item, the published data will display Anonymous as the author, and your username won’t be exposed.
As mentioned earlier, choose Open Access as your access option.
Save and publish the dataset. Utilize the acquired DOI to reference the dataset in your double-blind submission.
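The blinded settings described in this section can also be written down as data, which is handy if you script your uploads. The sketch below only builds a metadata dictionary and sends nothing over the network. The field names (`upload_type`, `creators`, `access_right`, `license`) reflect my understanding of Zenodo's REST deposit API and should be checked against the current Zenodo documentation before use; `blinded_zenodo_metadata` and the license identifier in the example are assumptions.

```python
import json


def blinded_zenodo_metadata(title, description, license_id):
    """Build deposit metadata with the author blinded and open access set.

    The title and description are passed through unchanged, so they must
    already be free of identifying information.
    """
    return {
        "metadata": {
            "title": title,
            "upload_type": "dataset",
            "description": description,
            # Placeholder author, replaced with real names after acceptance.
            "creators": [{"name": "Anonymous"}],
            # Open access, as other access options break the review workflow.
            "access_right": "open",
            "license": license_id,
        }
    }


payload = blinded_zenodo_metadata(
    title="Dataset for a submission under double-blind review",
    description="Blinded dataset accompanying a paper under review.",
    license_id="cc-by-4.0",  # assumed identifier format; verify with Zenodo
)
print(json.dumps(payload, indent=2))
```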
This is what reviewers will see:
Open data upon acceptance on Zenodo
Congratulations on getting your paper accepted. There are only two steps required to make your Zenodo submission open data. Technically, it was already open data. You just need to reveal it now.
- Replace the Anonymous entry with the actual author details.
- Use the DOI to properly reference your open data in your paper.
That’s all. For any questions, feel free to contact me or leave a comment below. This is particularly relevant if I have overlooked something about Zenodo.
This post is aimed at the software engineering research community, but it applies to any discipline.
I have collaborated with CHASE, MSR, PROFES, ESEM, and ESEC/FSE [to be defined] on open science policies, serving as the open science chair for all except MSR. Fun fact: the CHASE workshop was the first software engineering venue to adopt open science policies in 2016. At this point, I would like to extend my sincere gratitude to Daniel Méndez for his support and his attempts to incorporate open science practices into conferences while involving me in the process.
Both figshare and Zenodo are versioned. Each time you save a published submission, a new version is created and also published. Unpublishing a submission and its versions is not straightforward. Proceed with caution.
Archiving means there are mechanisms to ensure that data is appropriately preserved, duplicated, and distributed in such a way that it will stand the test of time even if the hosting platform should fail. See, for example, LOCKSS and CLOCKSS.
This approach is also compatible with single-blind review and open peer review.
Edit 2019-09-26: As of September, I am no longer an advisor for any company, including figshare.
Full disclosure: I am a figshare advisor. If you ask me to pick between figshare and Zenodo, my answer will be biased.
Make file(s) confidential is an option used when only metadata should be made publicly available. In software engineering research, it’s unlikely we will need to use this.
I do not use a commenting system anymore, but I would be glad to read your feedback. Feel free to contact me.