Building Incremental and Reproducible Data Pipelines for Tackling Climate Change

Talk

SPEAKERS

Patrick Ferris
Cambridge University
Michael Dales
Cambridge University

ABSTRACT

We present the good and the bad of building a dataflow engine in OCaml. The engine underpins a complex ecological analysis of avoided deforestation projects in tropical moist rainforests. We will discuss:

  • Onboarding experienced developers who are new to OCaml.
  • Building an operating system in OCaml to run Python/R code.
  • Developing geospatial libraries, and how that work benefited from Outreachy internships and the compiler's backwards compatibility.
  • Managing a transition from monadic, asynchronous libraries to direct-style code.

This work is part of a multi-year collaboration between the departments of Computer Science, Ecology, Zoology and Geography at the University of Cambridge.
