Data & Software for Authors
What is Needed?
AGU requires that the underlying data needed to understand, evaluate, and build upon the reported research be available at the time of peer review and publication. Additionally, authors should make available software that has a significant impact on the research. This entails:
- Depositing the data and software in a community accepted, trusted repository, as appropriate, and preferably with a DOI
- Including an Availability Statement as a separate paragraph in the Open Research section explaining to the reader where and how to access the data and software
- Including citation(s) to the deposited data and software in the References section
See the sections below for detailed information on:
- Models & Simulations
- Data and Software Sharing Guidance for Authors Submitting to AGU Journals
- International Geo Sample Numbers
Most of your questions regarding data and software should be answered by the resources below. Additionally, this example paper from Koymans et al. demonstrates the use of an Open Research section with data and software citations. If you still have questions, contact [email protected].
What Data Needs to be Available?
Primary and processed data used for your research should be preserved and made available. Generally, the underlying data are considered to be the types of data usually preserved in domain repositories for each discipline. These may include raw data, but are usually the processed or refined data that support and lead to the described results and allow other readers to assess your conclusions and build on your work.
In your paper, cite these data, as well as any data you used from other sources, and include information about access to the data in the Availability Statement. For model or simulation data, follow the Data and Software Sharing Guidance for Authors Submitting to AGU Journals on prioritizing preserved output; in general, availability of the software is most important.
Very large data (greater than 1 terabyte, or TB) can be a challenge to preserve, as fees and additional resources are often required. One option to consider: institutions often offer solutions for data preservation and compliance. Again, refer to the Data and Software Sharing Guidance for Authors Submitting to AGU Journals for more information or email [email protected].
Repository Selection
The data that support the research reported in your paper must be deposited in a community-accepted, trusted repository. When identifying the most appropriate repositories for your data, first refer to the Data and Software Sharing Guidance for Authors Submitting to AGU Journals below. We recommend a repository that specializes in data for your scientific domain, as this will maximize the probability that the deposited data will be findable, accessible, interoperable, and reusable (FAIR). Otherwise, look to your institutional repository, your computing center, or a general repository. Note that any cited sources must be in English or include an English translation. For your reference:
- Data and Software Sharing Guidance for Authors Submitting to AGU Journals
- National Repository
- Institutional Repository - information (US-based)
- Generalist Repository comparison chart
Note: Starting March 2021, AGU authors funded by the U.S. NSF will have their data publication fees waived when using the Dryad repository. Learn more about the AGU-Dryad partnership.
An Availability Statement, located in the Open Research section of the paper (available in AGU's LaTeX and Word templates), contains information about your data, software, and other research objects (e.g., notebooks) and how readers can access them. The Statement should include:
- A brief description of the type(s) of data or software
- Repository Name(s) where they are deposited
- Version (of software)
- DOI, Persistent Identifier Link to Data or Software (and Identifier)
- Link to publicly accessible development platform (in the case of Software, e.g. GitHub)
- Access Conditions (e.g. if Registration is Required)
- Licensing/Permissions (e.g. Creative Commons Attribution)
- In-text citation in References (optional)
When developing the Availability Statement, consider how best to direct the reader/reviewer to your data (or software). For instance, do not simply provide a web link to the homepage of the repository. Directly link to the data (or software) or provide information/guidance necessary to get to the data (or software) efficiently.
Check to see if the repository or data/software source has an “Acknowledgements” or “How to Cite” page to follow when putting together your Availability Statement and citation in the References section.
It is not sufficient to place your data in the supplementary information of your paper or to state that your data will be available upon request. Locations such as FTP sites and project web pages are also not suitable preservation choices (see Repository Selection). See Availability Statements and Template Examples for guidance on data owned by others that is not in a preservation repository.
For data that is not initially available upon submission, authors should describe where the data will be shared in the Data Availability Statement and can share information within the supplementary information for peer review purposes only (Note: Use file upload type "Data File(s) for Peer Review (will not publish)" for this purpose).
Availability Statement Templates:
- The [type of data] data used for [brief context, description] in the study are available at [repository, source name] via [DOI, persistent identifier link] with [license, access conditions] [optional in-text citation in References]
- [Version number] of the [software name] used for [brief context, description of what the software was used for] is preserved at [DOI, persistent identifier link], available via [license type, access conditions] and developed openly at [software development platform link].* [optional in-text citation in References]
* For Jupyter Notebooks, R Script(s)/Markdown guidance, please see the following resources:
The Methodology section of your paper should also describe how your data/software pertains to your research.
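The data template above can be filled in programmatically, which helps catch a missing field before submission. The following is a minimal sketch, not an AGU tool: the class name, field names, and example values (including the placeholder repository and DOI) are hypothetical and chosen only to mirror the bracketed slots in the template.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DataAvailability:
    """One field per bracketed slot in the data template (names are illustrative)."""
    data_type: str
    context: str
    repository: str
    identifier_link: str
    access: str
    citation: Optional[str] = None  # optional in-text citation


def data_statement(d: DataAvailability) -> str:
    """Fill the data Availability Statement template, rejecting empty required slots."""
    missing = [
        f.name for f in fields(d)
        if f.name != "citation" and not getattr(d, f.name).strip()
    ]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    text = (
        f"The {d.data_type} data used for {d.context} in the study are "
        f"available at {d.repository} via {d.identifier_link} with {d.access}"
    )
    if d.citation:
        text += f" ({d.citation})"
    return text + "."


# Placeholder values only; replace with the details of your own deposit.
example = DataAvailability(
    data_type="elevation",
    context="mapping reflectors",
    repository="Example Repository",
    identifier_link="https://doi.org/10.1234/example",
    access="a Creative Commons Attribution license",
)
print(data_statement(example))
```

The identifier link should point directly at the deposited data rather than at the repository homepage, as noted above.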
Data & Software Citation
In your References/Bibliography section, please include a formal citation to the data/software described in the Availability Statement. Doing so provides citation credit for the data/software. Please also cite data and software created by others and used in your research, to ensure proper credit for that work. If the data or software is described in a separate data or software paper, please include both that paper and the deposited data or software as separate citations. Citations should include:
- Author(s) or project name(s)
- Date / Software published
- Title / Software name
- Data or software release/version (optional)
- Bracketed description type (e.g., [Dataset], [Software], [Collection], [ComputationalNotebook])
- Repository name / Publication venue
- DOI, persistent identifier, URL
- Retrieved date (required when using URL)
For more information on citations, reference the Data and Software Sharing Guidance for Authors Submitting to AGU Journals.
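The citation elements listed above can be assembled mechanically. The sketch below is illustrative only: the `Citation` class and `format_citation` function are hypothetical names, the element order follows the examples given in this guidance, and the "Retrieved ..., from ..." wording for plain URLs is an assumption based on APA convention rather than something this guidance specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Citation:
    authors: str                     # author(s) or project name(s)
    date: str                        # publication year or full date
    title: str                       # dataset title or software name
    kind: str                        # "Dataset", "Software", "Collection", ...
    repository: str                  # repository name or publication venue
    link: str                        # DOI link, persistent identifier, or URL
    version: Optional[str] = None    # optional release/version
    retrieved: Optional[str] = None  # retrieval date, needed for plain URLs
    persistent_link: bool = True     # set to False when `link` is a plain URL


def format_citation(c: Citation) -> str:
    """Assemble a data or software citation from its elements."""
    if not c.persistent_link and not c.retrieved:
        raise ValueError("a retrieved date is required when the link is a plain URL")
    version = f" (Version {c.version})" if c.version else ""
    # Assumed APA-style placement of the retrieval date.
    where = f"Retrieved {c.retrieved}, from {c.link}" if c.retrieved else c.link
    return f"{c.authors} ({c.date}). {c.title}{version} [{c.kind}]. {c.repository}. {where}"


print(format_citation(Citation(
    authors="Bell, S. W.",
    date="2020",
    title="samwbell/saturn_counts: April 26, 2020 Release",
    kind="Software",
    repository="Zenodo",
    link="https://doi.org/10.5281/ZENODO.3766959",
    version="1.1.0",
)))
```

The example values are taken from one of the software citations in this guidance, so the printed line can be compared directly against it.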
Data Citation Examples:
- Fiechter, J., & Cheresh, J. (2020). Physical and biogeochemical drivers of alongshore pH and oxygen variability in the California Current System (Version 7) [Dataset]. Dryad. https://doi.org/10.7291/D1D96Q
- Edmunds, P. J., Didden, C., & Frank, K. (2021). Mean percentage cover of corals and Porites astreoides at each site by year at St. John, VI from 1992 to 2019 (Version 1) [Dataset]. Biological and Chemical Oceanography Data Management Office (BCO-DMO). https://doi.org/10.26008/1912/BCO-DMO.843284.1
- Alwarda, R., & Smith, I. (2021). Elevation data for Reflectors within the CO2 Deposit in Planum Australe, Mars [Dataset]. Zenodo. https://doi.org/10.5281/ZENODO.4639669
- Gries, C., Downs, R. R., O’Brien, M., Parr, C., Duerr, R., Koskela, R., et al. (2019). Return on Investment Metrics for Data Repositories in Earth and Environmental Sciences [Dataset]. Environmental Data Initiative. https://doi.org/10.6073/PASTA/D49BEC63F51603512EFA7E0FD2717203
Software Citation Examples:
- Lab for Exosphere and Near Space Environment Studies. (2019, March 20). lenses-lab/LYAO_RT-2018JA026426: Original Release (Version 1.0.0) [Software]. Zenodo. http://doi.org/10.5281/zenodo.2598836
- Bell, S. W. (2020). samwbell/saturn_counts: April 26, 2020 Release (Version 1.1.0) [Software]. Zenodo. https://doi.org/10.5281/ZENODO.3766959
- Shaoqian Hu. (2019, December 25). Direct surface wave radial anisotropy tomography package (Version 1.0) [Software]. Zenodo. http://doi.org/10.5281/zenodo.3592528
For more information on citation examples, reference the Data and Software Sharing Guidance for Authors Submitting to AGU Journals. Note: See Software Citation - 5 Tips.
Need help with formatting your data and software citations? Try CiteAs or the DOI Citation Formatter and select the Formatting Style “apa” and “en-US” for Language and Country. Note: The Formatter resolves and negotiates DOIs from DataCite, Crossref, and mEDRA from the full list of DOI registration agencies.
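The same formatting can be requested from a script. The sketch below assumes the DOI content negotiation convention documented by Crossref and DataCite, in which a request to `https://doi.org/` carrying an `Accept: text/x-bibliography` header returns a formatted citation. Whether the DOI Citation Formatter uses exactly this request internally is an assumption, and the function name `citation_request` is illustrative. The code only builds the request, so it runs without network access.

```python
from urllib.request import Request


def citation_request(doi: str, style: str = "apa", locale: str = "en-US") -> Request:
    """Build (but do not send) a content-negotiation request for a formatted citation."""
    accept = f"text/x-bibliography; style={style}; locale={locale}"
    return Request(f"https://doi.org/{doi}", headers={"Accept": accept})


# DOI taken from one of the data citation examples in this guidance.
req = citation_request("10.5281/ZENODO.4639669")
print(req.full_url)
print(req.get_header("Accept"))

# To fetch the citation text (requires network access):
# from urllib.request import urlopen
# with urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Always check the returned text against the citation elements listed earlier, since registry metadata may lack a version or a bracketed description type.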
Models & Simulations
For research involving models and simulations, refer to the community guidelines regarding what specifically must be made available and cited. In all cases, the model and configuration information must be made available.
When the primary data for the research comes from models & simulations, follow these guidelines:
Citation of the model (most important).
BEST OPTION (model in repository): Cite the model using a repository that registers the version used for the paper with a persistent identifier (e.g., Digital Object Identifier) and metadata that describes the model using community standards. If a published paper has the complete description, please cite that also. Your citation should accurately capture the authors/creators of the model.
GOOD OPTION (model described in paper): Cite the publication where the model is described with information about the version used for this paper.
Description of the model.
Include a description of the model in the text of the paper that is adequate to support reproducibility. If a publication describes the model thoroughly, cite that paper.
Information about the configuration/parameters used to run the model.
Include this information in the paper text and provide any script/workflow used. The script/workflow should be preserved in a repository and cited. Any forcing datasets used should be described and cited.
Data that Supports the Summary Results, Tables and Figures.
BEST OPTION: Cite a package in an appropriate repository that includes scripts/workflows, provenance information, and summary files that support the research, figures and tables, consistent with archives maintained for transparency and traceability by assessments such as the IPCC.
GOOD OPTION: Cite files (e.g., scripts, descriptive detail) in an appropriate repository that support evaluating the research and provide the details behind the tables and figures.
ACCEPTABLE OPTION: Provide the necessary information for transparency and traceability of the analysis using your community standards or guidance.
Model Output Data (optional)
If certain model output data are instrumental to evaluating the research, then deposit these in a community-accepted, trusted repository. There are currently limited resources for preserving very large files. Selecting representative output from one or a few model runs, as recommended by a specific community, may be necessary.
If the model or software is not available because of the sensitivity of the research or proprietary concerns, then provide as much information as possible to support evaluation of the research and responsibility. Acceptance in such cases is at the discretion of the editors. Papers where the primary results depend on proprietary scripts that are not available will usually not be allowed.
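One way to capture the model version, configuration, and forcing datasets described in the guidelines above is a small machine-readable manifest deposited alongside the run script. The sketch below is a hypothetical layout, not an AGU or community standard: the function names, the key names, and the example values (model name, placeholder DOIs, parameters) are all assumptions made for illustration.

```python
import hashlib
import json


def sha256_of(content: bytes) -> str:
    """Checksum that ties the manifest to the exact script that was run."""
    return hashlib.sha256(content).hexdigest()


def build_manifest(model_name, model_version, model_identifier,
                   parameters, forcing_datasets, run_script: bytes) -> dict:
    """Collect the information needed to trace a model run (illustrative layout)."""
    return {
        "model": {
            "name": model_name,
            "version": model_version,
            "identifier": model_identifier,
        },
        "parameters": parameters,
        "forcing_datasets": forcing_datasets,
        "run_script_sha256": sha256_of(run_script),
    }


# Placeholder values only; none of these refer to a real model or dataset.
manifest = build_manifest(
    model_name="ExampleModel",
    model_version="1.0.0",
    model_identifier="https://doi.org/10.1234/example-model",
    parameters={"time_step_s": 600, "horizontal_resolution_km": 25},
    forcing_datasets=[
        {"name": "Example forcing data", "identifier": "https://doi.org/10.1234/example-forcing"},
    ],
    run_script=b"#!/bin/sh\n./run_model --config config.yml\n",
)
print(json.dumps(manifest, indent=2, sort_keys=True))
```

A file like this would be deposited with the scripts/workflow in the repository and cited, while the same details are still described in the paper text.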
Data and Software Sharing Guidance for Authors Submitting to AGU Journals
AGU editors, staff, and community members have developed the Data and Software Sharing Guidance for Authors Submitting to AGU Journals. Sections in the guidance are listed below for quick reference:
- Considerations for publication related to data and software
- Guidelines for Research Primarily Based on Numerical Models or Theory
- Selecting Your Repository
- During Peer Review
- Paper Acceptance
- Availability Statements and Template Examples
Fox, Peter, Erdmann, Chris, Stall, Shelley, Griffies, Stephen M., Beal, Lisa M., Pinardi, Nadia, Hanson, Brooks, Friedrichs, Marjorie A. M., Feakins, Sarah, Bracco, Annalisa, Pirenne, Benoît, & Legg, Sonya. (2021). Data and Software Sharing Guidance for Authors Submitting to AGU Journals. Zenodo. https://doi.org/10.5281/zenodo.5124741
AGU editors, staff, and community members have also provided a list of Domain-Discipline Repositories Useful to AGU Journals.
International Geo Sample Numbers
AGU recommends the use of IGSNs (International Geo Sample Numbers) for citing samples reported in research papers. The IGSN provides a unique identifier that allows samples to be linked across publications and searched through a central metadata repository. We strongly encourage authors to register samples with an IGSN Allocating Agent, obtain IGSNs, and use them throughout their manuscript, tables, and archived data sets. We recognize IGSNs during our production process and will provide links in the manuscript and tables to the registered sample descriptions. IGSNs can be reserved before field seasons or assigned afterwards. For more information, see http://www.igsn.org.