As discussed at the 2017 TRB Annual Meeting

Reproducible Research


Important question: is the journal model becoming obsolete?
Maybe we should be creating a new venue for sharing.

Distinction to be made: behavioral research vs. implementation. (The way you explore behavior is by testing the model; many behavioral assumptions tend to be built into models.)

Why is reproducible research important?

It enables hypotheses to be tested, which facilitates verifiable progress. This is hard to do strictly from the words of a scientific paper.

Once something is published as an innovation, the field should expect it to be tested. Lack of reproducibility allows singular results, which could be clouded by enthusiasm bias, commercial interests, data anomalies, and errors, to move forward as “truths” rather than being weighed in a meta-analysis of multiple studies. In other scientific fields, another researcher often must replicate results.

If research is made more reproducible, the hope is that it will generate results that are more useful in practice.

Why is research not reproducible?

NDAs and human subject requirements

Suggestions:

  • “scrubbed” datasets
  • remote computing
  • easy-to-reference NDA usage templates
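
As one illustration of what a “scrubbed” dataset could involve, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function name `scrub_record`, the field names (`person_id`, `name`, `home_address`, `home_lat`, `home_lon`), and the choice of a salted hash plus coordinate rounding. These steps alone would not necessarily satisfy an NDA or an IRB, since salted hashing is pseudonymization rather than full anonymization.

```python
import hashlib


def scrub_record(record, salt, id_field="person_id",
                 drop_fields=("name", "home_address"),
                 coord_fields=("home_lat", "home_lon"),
                 coord_decimals=2):
    """Return a scrubbed copy of one record (all field names are hypothetical)."""
    # Drop direct identifiers outright.
    scrubbed = {k: v for k, v in record.items() if k not in drop_fields}

    # Replace the raw ID with a salted hash so that records belonging to the
    # same person can still be linked without exposing the original ID.
    raw_id = str(scrubbed[id_field]).encode("utf-8")
    scrubbed[id_field] = hashlib.sha256(salt + raw_id).hexdigest()[:16]

    # Coarsen coordinates so that an exact home location is not published.
    for field in coord_fields:
        if field in scrubbed:
            scrubbed[field] = round(scrubbed[field], coord_decimals)
    return scrubbed


example = {
    "person_id": 101,
    "name": "A. Traveler",
    "home_address": "1 Main St",
    "home_lat": 37.77493,
    "home_lon": -122.41942,
    "mode": "bike",
}
scrubbed_example = scrub_record(example, salt=b"keep-this-salt-private")
```

The salt would have to stay with the data owner; anyone holding it could re-derive the hashed IDs from known raw IDs.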

Learning from other fields:

  • Medical records

Complex systems don’t easily lend themselves to being described in paragraph form

Large model systems are:

a - hard to describe in a single research paper, where space needs to be reserved for “the innovation”;
b - impossible to replicate from a written description alone.

Suggestions:

  • Well-documented code

If the code were available, you could dig into it; not a small task, but possible. The code needs to be documented and have metadata.
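
To make “documented and have metadata” a little more concrete, here is a hedged Python sketch of a metadata record that could be published alongside a model run. The function name `build_run_metadata`, the keys of the record, and the example model and file names are all hypothetical; the notes do not prescribe any particular metadata format.

```python
import hashlib
import json


def build_run_metadata(model_name, code_version, inputs, parameters):
    """Describe one model run so others can check they have the same setup.

    `inputs` maps a label (for example a file name) to the raw bytes of that
    input. Only a checksum is stored, so the record can be shared even when
    the data itself cannot.
    """
    return {
        "model_name": model_name,
        "code_version": code_version,
        "input_sha256": {
            label: hashlib.sha256(data).hexdigest()
            for label, data in sorted(inputs.items())
        },
        "parameters": dict(parameters),
    }


# Hypothetical example run; none of these names come from the discussion.
metadata = build_run_metadata(
    model_name="example_mode_choice",
    code_version="v0.1.0",
    inputs={"households.csv": b"hh_id,size\n1,3\n"},
    parameters={"random_seed": 42},
)
print(json.dumps(metadata, indent=2, sort_keys=True))
```

Someone attempting a replication could recompute the checksums on their own copies of the inputs and compare them against the published record before comparing results.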

Innovators aren’t necessarily the best communicators…or coders

Suggestion:

  • Need to value the communication as much as the innovation

Economic incentives are needed, but proprietary methods are impossible to validate

Suggestion:

  • Standardized validation datasets for benchmarking
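
A rough Python sketch of how a standardized validation dataset might be used for benchmarking: every model, open or proprietary, is scored against the same observed values with the same metrics. The function name `benchmark_scores` and the choice of metrics (RMSE, percent RMSE, mean error) are assumptions made for illustration, and the numbers below are made up.

```python
import math


def benchmark_scores(observed, predicted):
    """Score predictions against a shared set of observed values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal length")
    mean_observed = sum(observed) / len(observed)
    if mean_observed == 0:
        raise ValueError("percent RMSE is undefined when the observed mean is zero")

    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {
        "rmse": rmse,
        "pct_rmse": 100.0 * rmse / mean_observed,
        "mean_error": sum(errors) / len(errors),
    }


# Made-up observed counts and one model's predictions for the same locations.
observed_counts = [100, 200, 300]
model_predictions = [110, 190, 300]
scores = benchmark_scores(observed_counts, model_predictions)
```

Because only predictions are needed, a vendor could submit results for scoring without disclosing the method itself.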

What activities should Zephyr undertake?

Templates

Ethics template - A template IRB proposal that people can drop into their own IRB application, with language that protects their ability to share data (copy and paste this paragraph and it will protect you).

NDA template - The language people write into NDAs is often too strict.

Use Foundation Money to Replicate Important Research

Has been done in other fields. It is hard to get people to replicate research if nobody is publishing papers on the replication.

Could alternatively fund a prize for finding problems in something important.

Peer Reviews

-

Example Projects

The foundation could fund a repository with good documentation and a wiki, which would be more valuable to the community. The foundation could also reverse the usual priorities and give professors an incentive to attend to documentation.

Establish Reproducibility Standards

Should Zephyr encourage journals to require reproducibility?

Require at least pseudo-code. This is usually more readable than well-documented “real” code.

However, precedents for sharing actual code exist. In the physical sciences, publishing code within an operable context is part of the process. Social science data is often messy, so having operational code, or at least standardized inputs, is important.

What research is most important to be reproducible?

-

How important is this versus everything else on other tables?

-

What do you think?