lead image from https://www.nature.com/articles/s41559-017-0160
Posting an interesting article from the journal Nature Ecology & Evolution by people who work on a massive ocean health indexing project. The authors, all from the National Center for Ecological Analysis and Synthesis at the University of California, Santa Barbara, describe themselves this way: "We are environmental scientists whose impetus for upgrading approaches to collaborative, data-intensive science was driven by our great difficulty reproducing our own methods."
They wrote a very readable narrative with many interesting moments: moving to open data standards; forming teams where each person has a couple of skill sets, so that at least one overlaps with someone else's and collaboration is easier; and sharing the methods themselves (in R, for example) publicly so that others can iterate on them and annotate exactly why changes were made.
In their words: "But when we began to reproduce our workflow a second time and repeat our methods with updated data, we found our approaches to reproducibility were insufficient. However, by borrowing philosophies, tools, and workflows primarily created for software development, we have been able to dramatically improve the ability for ourselves and others to reproduce our science, while also reducing the time involved to do so: the result is better science in less time."