Exploring how climate will impact plant-insect distributions and interactions using open data and informatics


Open data repositories, including those from citizen science efforts, are rich sources of research-grade data that are becoming key to asking and answering questions in ecology. Simultaneously, informatics tools are becoming increasingly accessible to the non-specialist and are more commonly integrated into the college curriculum of biology students. This series of 3 classes (~360 minutes of in-class activity time) guides students through collecting, curating, and analyzing citizen science data using common research computing tools: R, RStudio, Git, and GitHub. These are in silico experiments examining (1) the species distributions of butterflies and their host plants based on observations submitted to the web platform iNaturalist and (2) how those distributions may change in the future due to global climate change. Students will download and install software, retrieve and curate citizen science data, model the occurrence data to produce species distributions for a butterfly and its host plant, and develop hypotheses on how climate change may or may not affect the future distribution of the butterfly and host plant. Students then test these hypotheses using estimates of future climate variables, evaluate the strength of their results, and present a summary of these explorations to their peers, using additional class time if desired. This series of experiments will result in 4 group products and 1 individual product for evaluation.


Wendy L. Clement1,4, Kathleen L. Prudic2, and Jeffrey C. Oliver3

1Department of Biology, The College of New Jersey, 2000 Pennington Road, Ewing, NJ 08638

2Department of Entomology, University of Arizona, PO Box 210036, Tucson, AZ 85721

3Office of Digital Innovation & Stewardship, University of Arizona Libraries, 1510 E. University Blvd. Tucson, AZ 85721

4Corresponding Author: Wendy L. Clement (clementw@tcnj.edu)


This module requires three 2-hour lab periods plus time for in-class presentations (allotted time varies depending on group and class size and whether presentations are face-to-face or online). In the initial 2-hour lab period, class time is used to introduce students to the topics of ecoinformatics, open data repositories, citizen science, and climate change. Student groups then identify a butterfly-host plant interaction and learn about its natural history. In the second 2-hour period, students download the relevant data from iNaturalist and are then given instructions on how to create a species distribution map. Students develop a hypothesis concerning how climate change may impact this specific butterfly-host plant pair in the next 50 years. In the third 2-hour lab period, students evaluate their hypothesis by creating another series of species distribution maps using predicted climate variables for the next 50 years. Students then synthesize the information into a final group oral presentation and an individual written assignment.
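
The second and third lab periods can be pictured with a toy example. The lesson's own analysis uses R scripts from the GitHub repository; the Python sketch below is only an illustration of one very simple modeling idea, a rectangular climate envelope, and is not necessarily the method those scripts implement. All observation values, grid cells, and the "+2 C, -10% precipitation" scenario are made up for the example, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Min/max of each climate variable across observed occurrences."""
    lows: tuple
    highs: tuple

    def contains(self, climate):
        # A cell is "suitable" if every variable lies within the observed range.
        return all(lo <= v <= hi
                   for lo, v, hi in zip(self.lows, climate, self.highs))


def fit_envelope(occurrence_climates):
    """Fit a rectangular envelope from climate values sampled at occurrences."""
    columns = list(zip(*occurrence_climates))
    return Envelope(tuple(min(c) for c in columns),
                    tuple(max(c) for c in columns))


def suitable_cells(envelope, grid):
    """Return ids of grid cells whose climate falls inside the envelope."""
    return {cell for cell, climate in grid.items() if envelope.contains(climate)}


def overlap_fraction(butterfly_cells, host_cells):
    """Share of the butterfly's suitable cells that also suit the host plant."""
    if not butterfly_cells:
        return 0.0
    return len(butterfly_cells & host_cells) / len(butterfly_cells)


# Hypothetical toy data: (mean annual temperature in C, annual precipitation in mm)
butterfly_obs = [(18.0, 300.0), (22.0, 450.0), (20.0, 380.0)]
host_obs = [(16.0, 350.0), (21.0, 500.0), (19.0, 420.0)]

current_grid = {
    "A": (17.0, 360.0),
    "B": (19.0, 400.0),
    "C": (21.0, 440.0),
    "D": (23.0, 480.0),
}
# The same cells under a made-up warmer, drier scenario (not a real forecast).
future_grid = {cell: (t + 2.0, p * 0.9) for cell, (t, p) in current_grid.items()}

butterfly_env = fit_envelope(butterfly_obs)
host_env = fit_envelope(host_obs)

for label, grid in (("current", current_grid), ("future", future_grid)):
    b = suitable_cells(butterfly_env, grid)
    h = suitable_cells(host_env, grid)
    print(label, sorted(b), sorted(h), overlap_fraction(b, h))
```

In this toy scenario the share of butterfly-suitable cells that also suit the host plant drops from 1.0 to 0.5, which is the kind of change a student hypothesis about a butterfly-host plant pair might predict or rule out.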


Instructors’ preparation time will depend on their familiarity with research computing resources, including R, RStudio, and GitHub. Students may spend several hours gathering natural history data on their butterfly-host plant interaction, (re-)analyzing their data, interpreting results, creating figures, and developing presentations. We estimate 6-8 hours total of out-of-class time, depending on prior experience and skills. Collaborative platforms such as Google Docs, Open Science Framework, and Slack can help facilitate group work outside of class.


Students will have a variety of written, visual, and oral products for evaluation. Student Product 1: Natural History of Butterfly-Host Plant Interaction; Student Product 2: Species distribution maps (SDMs) and hypothesis; Student Product 3: Future Species Distribution Models and Hypothesis Evaluation; Student Product 4: Presentation of project and results; Student Product 5: Synthesis and reflection of group projects. Student Products 1-4 are assessed by group and Student Product 5 is assessed individually.


Students will complete this exercise using a computer with access to the internet. Possible implementations include: 1) a shared computer lab with the ability to install software, or 2) a wi-fi enabled classroom in which students can use individual laptops with software installed prior to class. Guides for downloading relevant software have been included as supplementary documents.


This exercise was first implemented in an upper-level undergraduate course, Ecology and Evolution of Plant Insect Interactions, at The College of New Jersey in Fall 2017. This course meets twice a week for two hours per meeting (4 hours/week total) and is capped at 24 students. The class was divided into groups of three or four students during this module. This activity could be scaled up to hundreds of students, depending on the instructor's research computing skills and the available class support. The module would be particularly amenable to large-enrollment courses with a laboratory or recitation component.


This activity was performed at a public, 4-year, primarily undergraduate institution with upper-level biology students.


This exercise is designed to scale in a face-to-face or online environment, provided that students can download the freely distributed software (R [https://www.r-project.org/], RStudio [https://www.rstudio.com/products/rstudio/download/], Git [https://git-scm.com/downloads]) and source code (GitHub [https://github.com]). Species pairs can be exchanged for any symbiotic species interaction based on course content and the learning goals. This content is designed to be flexible; any species on iNaturalist can be analyzed with minor modifications to the code. These activities could be scaled to larger classes; however, such a venture would depend greatly on instructor knowledge, student experience, and the staffing necessary to properly implement the activities at scale. Of particular importance in a large classroom setting are skilled assistants (e.g., teaching assistants) who can troubleshoot software challenges during instruction. Archiving the materials on Open Science Framework allows the authors to update the code and work with instructors as needed. As programming and data science skills become more prevalent in pre-college instruction and curricula, we anticipate this activity becoming more applicable and accessible to introductory college courses.
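
To illustrate why swapping in a different species pair needs only minor changes, the sketch below builds query URLs for georeferenced, research-grade observations of two species. It is a Python illustration only, not the lesson's R code. The endpoint and parameter names are assumptions based on the public iNaturalist API (v1) and should be checked against its current documentation. The helper `observations_url` is hypothetical, the example species are the pair named in the photograph caption, and no network request is made.

```python
from urllib.parse import urlencode

# Assumed endpoint for the iNaturalist API (v1); verify before relying on it.
API_BASE = "https://api.inaturalist.org/v1/observations"


def observations_url(taxon_name, page=1, per_page=200):
    """Build a query URL for georeferenced, research-grade observations.

    Parameter names are assumptions about the iNaturalist API, not part of
    the lesson materials.
    """
    params = {
        "taxon_name": taxon_name,
        "quality_grade": "research",
        "geo": "true",
        "per_page": per_page,
        "page": page,
    }
    return f"{API_BASE}?{urlencode(params)}"


# Changing the species pair only requires changing these two names.
species_pair = {"butterfly": "Papilio rumiko", "host plant": "Citrus reticulata"}

for role, name in species_pair.items():
    print(role, observations_url(name))
```

Keeping the species names in a single place, as in `species_pair` here, is one way to make a shared script reusable across student groups.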


Description of other Resource Files:

Supporting Files Per Class:

Class 1

  1. PreClass1-Assignment.docx - Details for an assignment each student should complete prior to the first class that familiarizes them with citizen science efforts, iNaturalist, and the effects of climate change.
  2. Class1-Slides.pptx - Slides to support an active lecture introducing concepts of ecoinformatics, biodiversity science, citizen science, climate change, and the goals of the research project.
  3. SP1-InstructionsandAssignment-NaturalHistoryofButterfly-HostPlantInteraction.docx - An in-class assignment that guides student groups on selecting a butterfly-host plant interaction for this research project as well as pertinent information about the natural history of the butterfly and host plant.
  4. SpeciesPairsSuggestions.xlsx - An Excel spreadsheet of suggested butterfly-host plant interactions that includes common and Latin names of butterflies and plants, numbers of host plants per butterfly, conservation status of the butterfly, and additional information about the butterfly-host plant interaction. Note - it is at the discretion of the instructors whether to offer the butterfly-host plant suggestions to student groups.

Class 2

  1. PreClass2-Assignment.docx - Details for an assignment students complete prior to the second class to download necessary software and complete a reading about distribution changes in butterfly-host plant interactions in Europe.
  2. Class2-Slides.pptx - Slides to accompany an introduction to species distribution models.
  3. SP2-Instructions-SpeciesDistributionMapsandHypothesis.docx - Instructions to guide students on generating species observation maps and species distribution models using citizen science data from iNaturalist and R - to be completed during class.
  4. SP2-SimplifiedInstructions-SpeciesDistributionMapsandHypothesis.docx - Instructions to guide students on generating species observation maps and species distribution models using citizen science data from iNaturalist for users familiar with R - to be completed during class.
  5. SP2-Assignment-SpeciesDistributionMapsandHypothesis.docx - Details for the assignment based on the species observation maps and species distribution models generated during class. Note - student groups should work on this during class and can complete outside of class if need be.
  6. SP2-Rubric-SpeciesDistributionMapandHypothesis.docx - Rubric to support grading of SP2-Assignment-SpeciesDistributionMapsandHypothesis.
  7. SP2-AnalysisScriptInformation.docx - Detailed information for each of the scripts used during class to generate species distribution models in R.
  8. DownloadInstructions-Git.docx - Download instructions for Git that students should complete prior to class (if personal computers are used).
  9. DownloadInstructions-R.docx - Download instructions for R that students should complete prior to class (if personal computers are used).
  10. DownloadInstructions-RStudio.docx - Download instructions for RStudio that students should complete prior to class (if personal computers are used).
  11. Computer code and instructions: https://github.com/jcoliver/biodiversity-sdm-lesson - This is a link to the GitHub repository for the most up-to-date version of the R scripts supporting this module. An overview of the module, example files, and help documents are available as well.
  12. HelpDocumentForCommonErrorsAndHelpfulWebsites.docx - A document that provides information on common errors when generating species distribution models in R. This document should be provided to students as they work through SP2-Instructions-SpeciesDistributionMapsandHypothesis.docx.

Class 3

  1. SP3-Instructions-FutureSpeciesDistributionModels.docx - Instructions to guide students on generating forecast species distribution models using citizen science data from iNaturalist and the GFDL-ESM2G climate model for 2070 - to be completed during class.
  2. SP3-Assignment-FutureSpeciesDistributionModels.docx - Details for the assignment based on present and forecast species distribution models generated during class. Note - student groups should work on this during class and can complete outside of class if need be.
  3. SP3-Rubric-FutureSpeciesDistributionModels.docx - Rubric to support grading of SP3-Assignment-FutureSpeciesDistributionModels.

Class 4

  1. SP4-InstructionsandAssignment-PresentationofProjectandResults.docx - Instructions guiding student groups to prepare a PowerPoint presentation of their research project results.
  2. SP4-Rubric-PresentationofProjectandResults.docx - Rubric to support grading of SP4-InstructionsandAssignment-PresentationofProjectandResults.
  3. SP5-InstructionsandAssignment-SynthesisandReflectiononGroupProjects.docx - Instructions for an individual written assignment to be completed outside of class. This assignment contains two prompts - one asking students to reflect on their own experiences while working on this research project and a second asking students to synthesize information from the research projects they heard during in-class presentations.

Student Product Examples

  1. SP1-Example-NaturalHistoryofButterfly-HostPlantInteraction.docx - An example of the work student groups should complete for SP1-InstructionsandAssignment-NaturalHistoryofButterfly-HostPlantInteraction.
  2. SP2-Example-SpeciesDistributionMapandHypothesis.docx - An example of the work student groups should complete for SP2-Assignment-SpeciesDistributionMapsandHypothesis.
  3. SP3-Example-FutureSpeciesDistributionModels.docx - An example of the work student groups should complete for SP3-Assignment-FutureSpeciesDistributionModels.


This activity would not have been possible without the amazing open science and data sources, specifically iNaturalist (Ken-ichi Ueda and Scott Loarie) and the National Oceanic and Atmospheric Administration (NOAA) climate predictions, and the willingness of the students enrolled in the Fall 2017 Plant-Insect Interactions course at The College of New Jersey to participate in this project.


All authors contributed to this activity. WLC's course on Plant-Insect Interactions was the catalyst for developing this group project. WLC worked with KLP and JCO to design an exercise and test it in the classroom. WLC designed and implemented the learning evaluation assessments. JCO designed and packaged the code for the students. KLP worked with available data on iNaturalist to assess feasibility of the group project and wrote a first draft of the manuscript which the other authors edited.


Wendy L. Clement, Kathleen L. Prudic, and Jeffrey C. Oliver. 16 August 2018, posting date. Exploring how climate will impact plant-insect distributions and interactions using open data and informatics. Teaching Issues and Experiments in Ecology, Vol. 14: Experiment #1 [online]. https://tiee.esa.org/vol/v14/experiments/clement/abstract.html

Western giant swallowtail, Papilio rumiko, caterpillar feeding on mandarin citrus, Citrus reticulata. Photograph by Jeffrey C. Oliver, licensed under Creative Commons (CC BY-SA 4.0, https://commons.wikimedia.org/wiki/File:Papilio_rumiko.jpg).
