Climate datasets in R

by Karthik Ram.

As an ecologist working on climate change questions, I’ve always found it rather tedious to acquire and process climate data, especially when dealing with large spatiotemporal scales. Although many agencies provide free access to climate data, there is often some overhead (typically one to two days) before the data are made available for download via FTP. Next, one has to process the data to match the structure of the biological information. Some of these data are provided in one of many binary formats, which require additional processing. While individual scientists and labs have workflows to complete such disparate steps, these workflows are rarely included as part of a publication, thereby leaving out critical data provenance. Even when peer-reviewed articles include one-off scripts (and associated data), missing provenance information makes it difficult to reproduce the results [cite]10.1038/nm1107-1276b[/cite]. Workflow repositories are needed to address the larger issue. In the meantime, one way to address the problem would be to encapsulate the above-mentioned steps (data acquisition, format conversion and interpolation) as part of the code that is already included in supplementary materials.

On that note, I’m pretty excited by the announcement of a new R package called RNCEP in the current issue of Methods in Ecology and Evolution [cite]10.1111/j.2041-210X.2011.00138.x[/cite]. The package provides an interface to atmospheric data from the National Centers for Environmental Prediction (NCEP) and NCEP/DOE reanalysis datasets. By handling every step from within R, from data acquisition and format conversion to interpolation and aggregation, the package provides a way to document an entire workflow as part of an article supplement. As more data repositories open up APIs, similar packages will go a long way towards promoting more open science.
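To make this concrete, here is a minimal sketch of what such a supplement script might look like. It assumes the RNCEP package is installed, that a network connection is available, and that `NCEP.gather` and `NCEP.aggregate` take the arguments shown, as I understand them from the package documentation. The wrapper name `monthly_mean_air`, the variable, the pressure level, the dates and the bounding box are all illustrative choices of mine, not anything prescribed by the package or the paper.

```r
# Sketch only: needs the RNCEP package and network access to run the download.
# monthly_mean_air() is a hypothetical wrapper; the variable ("air"), level
# (850 mb), dates and bounding box are illustrative assumptions.
monthly_mean_air <- function(years = c(2006, 2007), months = c(8, 9)) {
  # Step 1: acquire air temperature for a small latitude/longitude box.
  wx <- RNCEP::NCEP.gather(
    variable = "air", level = 850,
    months.minmax = months, years.minmax = years,
    lat.southnorth = c(50, 55), lon.westeast = c(0, 5),
    reanalysis2 = FALSE, return.units = TRUE
  )
  # Step 2: aggregate over days and hours, keeping one mean per month and year.
  RNCEP::NCEP.aggregate(
    wx.data = wx, YEARS = TRUE, MONTHS = TRUE,
    DAYS = FALSE, HOURS = FALSE, fxn = "mean"
  )
}

# Only hit the remote server when run by hand in an interactive session.
if (interactive()) {
  wx_monthly <- monthly_mean_air()
  str(wx_monthly)
}
```

Because the download and the aggregation live in one function, the script itself records where the climate data came from and how they were summarised, which is exactly the provenance that tends to go missing.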

8 comments on ‘Climate datasets in R’

  1. Inundata says:

    New blog post: Climate datasets in R http://t.co/Zw7Zh2v #ecology #rstats

  2. Rich Grenyer says:

    RT @_inundata: New blog post: Climate datasets in R http://t.co/Zw7Zh2v #ecology #rstats

  3. New blog post: Climate datasets in R http://t.co/Zw7Zh2v #ecology #rstats

  4. [...] a nice piece by Karthik Ram, of Inundata, about RNCEP, an application whose introduction we recently published [...]

  5. Dylan Childs says:

    Much more of this is needed. RT @rich_ RT @_inundata: New blog post: Climate datasets in R http://t.co/Zw7Zh2v #ecology #rstats

  6. Much more of this is needed. RT @rich_ RT @_inundata: New blog post: Climate datasets in R http://t.co/Zw7Zh2v #ecology #rstats

  7. Adam Sparks says:

    Much more of this is needed. RT @rich_ RT @_inundata: New blog post: Climate datasets in R http://t.co/Zw7Zh2v #ecology #rstats

  8. [...] of RNCEP: global weather and climate data at your fingertips, which has been receiving fantastic interest since we published it in July of this year. This open source package, written in R, is intended [...]
