Building Incremental and Reproducible Data Pipelines for Tackling Climate Change

Talk

SPEAKERS

Patrick Ferris
University of Cambridge
Michael Dales
University of Cambridge

ABSTRACT

We present the good and the bad of building a dataflow engine in OCaml. The engine underpins a complex ecological analysis of avoided deforestation projects in tropical moist rainforests. We will discuss:

  • Onboarding experienced developers who are new to OCaml.
  • Building an operating system in OCaml to run Python/R code.
  • Developing geospatial libraries, and how that work benefited from Outreachy internships and the compiler's backwards compatibility.
  • Managing a transition from monadic, asynchronous libraries to direct-style code.

This work is part of a multi-year collaboration between the departments of Computer Science, Ecology, Zoology and Geography at the University of Cambridge.
