Sharing code and workflows

In the era of FAIR (Findable, Accessible, Interoperable and Reusable) and Open Science, all research outputs should be shared and made available to the public. This includes the code and workflows produced during the course of the research project. This page, while not exhaustive, aims to give an introduction to the topic of FAIR code.

Learn more about the FAIR principles

What are code & workflows?

Code

Code in this context ranges from individual instructions and scripts to software. Individual instructions and scripts are pieces of code that execute specific tasks or algorithms within research contexts. This could include anything from simple data manipulation or visualisation to complex simulations. Software encompasses code in a packaged, documented and operational form, designed to conduct a range of scientific tasks, thereby becoming an essential tool in the research project.

Workflows

Scientific workflows or pipelines are code that describes the structured sequences that organise and automate research tasks. They detail the flow of data between tasks, often managed by Workflow Management Systems (WMS), and encapsulate a higher level of complexity by integrating multiple software components, scripts, and tools.

Why share code & workflows?

In life science research, code and workflows are often as important as the data itself. Whether you are sharing a data processing script or a full analysis pipeline, sharing your code is about openness, improving your science, gaining recognition, and allowing others to build on your work.



Benefits of sharing code and workflows © Media Elements sourced from Canva.com

Openly sharing code and workflows not only fulfils key scientific and ethical principles, but also catalyses new discoveries and collaborations across the life sciences. It ensures that research methodologies are transparent, reproducible, and accessible to all, laying the groundwork for a more inclusive and innovative scientific community.

  • Strengthen scientific integrity and utility - Clear documentation and the ability to replicate analyses build trust in scientific results, and are essential for verifying and understanding research findings. Sharing workflows allows them to be applied to new datasets or challenges, maximising the utility and lifespan of research tools, and preventing the loss of valuable methodologies.

  • Enhance research efficiency and compliance - Modular, well-documented code facilitates smoother research processes, from rerunning experiments to adapting methodologies for new data. Publishing code and data independently of each other, and of associated research articles, simplifies compliance with licensing requirements. Adherence to open code mandates set by journals and funders both enhances research credibility and broadens dissemination. Open code earns scholarly recognition through citations, emphasising the importance of code as a research output.

  • Foster collaboration and global innovation - Sharing tools broadens their reach, enabling equitable access and serving as vital resources for training the next generation of scientists, particularly in resource-limited settings. Open code encourages community input and interdisciplinary exchanges, leading to more robust, flexible solutions, and sparking innovation across research domains.

Learn more about practical benefits from Software Sustainability Institute

Read about FAIR Research Software Principles at RSQKit

How to share code & workflows?

How you share code and workflows depends on your goals: whether you seek transparency, collaboration, or community contributions. Keep in mind that your objectives may evolve over time, and every step towards FAIR practices adds value.

Consult the SciLifeLab Open Science Software Checklist for comprehensive guidance

Where can code and workflows be shared?

To ensure transparency, make a copy of your code or workflow available for long-term access, independent of any research paper. This should be accompanied by a README file with a clear description, usage instructions, and licence(s). Remember to also describe the compute environment (platform, specific software versions, dependencies or libraries used) in order to ensure compatibility and eliminate potential version conflicts or discrepancies. Collaborative version control platforms, such as GitHub, are ideal for both sharing and ongoing development. Use tags or releases so that specific versions of your code can be referenced in publications and elsewhere.
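As an illustration of describing the compute environment, the following is a minimal Python sketch that collects the platform, the Python version, and the versions of the installed packages. It only uses the Python standard library. The function name `describe_environment` and the suggested output file name are assumptions made for this example, and projects in other languages or managed with tools such as Conda will have their own ways of exporting this information.

```python
import json
import platform
from importlib import metadata


def describe_environment():
    """Collect the platform, Python version, and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = dist.version
    return {
        "platform": platform.platform(),
        "python_version": platform.python_version(),
        # Sort case-insensitively so the listing is easy to scan and compare.
        "packages": dict(sorted(packages.items(), key=lambda item: item[0].lower())),
    }


if __name__ == "__main__":
    # Print the snapshot as JSON; the output could be saved next to the
    # README, for example as environment.json (a hypothetical file name).
    print(json.dumps(describe_environment(), indent=2))
```

Saving such a snapshot together with each tagged release gives readers a record of the versions the code was actually run with.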

To enhance FAIRness, consider obtaining a persistent identifier for your code or workflow. One way to obtain a persistent identifier is by publishing your code or workflow in a research output repository, such as Zenodo or the SciLifeLab Data Repository. This can be achieved by linking your GitHub repository or uploading files directly. Another way to obtain a persistent identifier is via the Research Resource Identification Portal (RRID).

If your code or workflow is intended for reuse, distribute it through a package manager (e.g., BioConda or CRAN). For workflows, consider joining communities like nf-core or Galaxy Europe, and register your workflow on platforms such as WorkflowHub. These steps help others build upon your work.

For software, consider publishing in a dedicated software journal (e.g. Journal of Open Source Software) or as a software article in a life sciences journal (e.g. PLOS Computational Biology or BMC Bioinformatics).

Referencing and citing content on GitHub
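One common way to make a GitHub repository citable is to add a `CITATION.cff` file to it. The sketch below shows how the text of a minimal file in the Citation File Format (version 1.2.0) could be generated with Python. The helper name `build_citation_cff` and all values in the usage example (title, author, version, date) are hypothetical placeholders, not taken from any real project, and the optional `doi` field would only be filled in once a persistent identifier has been obtained.

```python
import json


def build_citation_cff(title, authors, version, date_released, doi=None):
    """Return the text of a minimal CITATION.cff file.

    authors is a list of (family_name, given_name) tuples and
    date_released is a string in YYYY-MM-DD form.
    """

    def quote(text):
        # JSON double-quoted strings are also valid YAML scalars.
        return json.dumps(text, ensure_ascii=False)

    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {quote(title)}",
        "authors:",
    ]
    for family_name, given_name in authors:
        lines.append(f"  - family-names: {quote(family_name)}")
        lines.append(f"    given-names: {quote(given_name)}")
    lines.append(f"version: {quote(version)}")
    lines.append(f"date-released: {date_released}")
    if doi is not None:
        lines.append(f"doi: {quote(doi)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_citation_cff(
        title="Example analysis pipeline",
        authors=[("Doe", "Jane")],
        version="1.0.0",
        date_released="2024-01-01",
    ))
```

Keeping the version and release date in this file in step with your tags or releases makes it clear which version of the code a citation refers to.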

Tips & Tricks

  • Start from a template or “cookie cutter” to streamline setup, and follow best practices.
  • Write readable code by using meaningful variable names, a logical structure, and clear comments. Some coding languages offer packages/tools that lint and format code for consistency.
  • Expand your README file beyond the basics. See our README topic page for inspiration.
  • Ask a colleague to test whether they can access and understand how to use the code.
  • Remember to cite your code in your research publications.
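To illustrate the tip on readable code, here is a small Python sketch of a function written with meaningful names, a docstring, and a clear structure. The function `mean_above_threshold` and the example values are invented for this illustration.

```python
from statistics import mean


def mean_above_threshold(values, threshold):
    """Return the mean of all values strictly greater than threshold.

    Raises ValueError if no value exceeds the threshold.
    """
    selected = [value for value in values if value > threshold]
    if not selected:
        raise ValueError("no values above threshold")
    return mean(selected)


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    expression_levels = [0.5, 2.0, 4.0, 0.1]
    print(mean_above_threshold(expression_levels, threshold=1.0))  # prints 3.0
```

Compare this with an equivalent one-liner such as `f(x, t)`: the behaviour is the same, but a colleague reading the named version can tell what it does, and what happens in the empty case, without running it.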

Resources

Please find below resources concerning the sharing of code and workflows in the form of training, guidance, and/or tools.

Training resources

Guiding resources

Tools