Module #10 - Build Your Own R Package

This assignment called for creating a new R package with at least one unique function.

My R package "nerrsclean" provides one function, clean(), which prepares data collected under the National Monitoring Program of NOAA's National Estuarine Research Reserve System (NERRS) for analysis. The function accepts two arguments:

  • df: data frame read from a CSV file
  • normalize: whether to add normalized columns (TRUE / FALSE)

The function returns a cleaned dataframe with or without additional columns of normalized values. The data sets appropriate for this tool can be found here: http://cdmo.baruch.sc.edu/dges/
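To make the interface above concrete, here is a minimal sketch of a function with the same two arguments. It is a hypothetical stand-in, not the code in nerrsclean: the name clean_sketch(), the cleaning rule (dropping rows that are entirely NA), the "_norm" column suffix, and the toy column names are all assumptions for illustration. The package lists scales as a dependency, so the real function may normalize differently; this sketch uses base R min-max scaling so that it runs on its own.

```r
# Hypothetical stand-in for nerrsclean::clean(); the real function may differ.
clean_sketch <- function(df, normalize = FALSE) {
  stopifnot(is.data.frame(df), is.logical(normalize), length(normalize) == 1)

  # Assumed cleaning rule: drop rows in which every value is missing.
  df <- df[rowSums(!is.na(df)) > 0, , drop = FALSE]

  if (normalize) {
    # Treat every numeric column as a measurement to be normalized.
    num_cols <- names(df)[vapply(df, is.numeric, logical(1))]
    for (col in num_cols) {
      x <- df[[col]]
      if (all(is.na(x))) next            # nothing to scale
      rng <- range(x, na.rm = TRUE)
      span <- diff(rng)
      # Min-max scaling to [0, 1]; a constant column becomes all zeros.
      df[[paste0(col, "_norm")]] <- if (span == 0) x - rng[1] else (x - rng[1]) / span
    }
  }
  df
}

# Toy data with made-up values; the last row is entirely missing.
toy <- data.frame(
  StationCode = c("abc", "abc", "abc", NA),
  Temp        = c(10, 20, 30, NA),
  Sal         = c(5, 5, 5, NA)
)

cleaned <- clean_sketch(toy, normalize = TRUE)
plain   <- clean_sketch(toy, normalize = FALSE)
```

With normalize = TRUE the result keeps the original columns and appends one "_norm" column per numeric column; with normalize = FALSE only the row cleaning is applied.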

The purpose of this package is to serve as a preliminary data cleaning tool for any of the CSV files downloadable from the above site. I saw value in a package for this specific task because anyone who regularly handles this type of environmental data would benefit from a simple tool for what is often the most time-consuming part of analysis: the data cleaning.

When setting up the package, I was confused at first, mostly because I had not read all of the documentation for "roxygen2". For example, as I was tweaking the DESCRIPTION file and other details, I noticed that roxygen2 would only write the .Rd and NAMESPACE files if they were not already present. As far as I can tell, this is because roxygen2 does not overwrite files that it did not generate itself.
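For readers who have not used roxygen2, the sketch below shows the kind of comment block it turns into a .Rd help page and a NAMESPACE export. The wording of the documentation and the placeholder function body are assumptions, not copied from nerrsclean; only the function name and its two arguments come from the description above.

```r
#' Clean NERRS monitoring data
#'
#' Hypothetical documentation sketch; the wording in the real package may differ.
#'
#' @param df A data frame read from a NERRS CSV export.
#' @param normalize Logical; if TRUE, add normalized columns.
#' @return A cleaned data frame.
#' @export
#' @examples
#' clean(data.frame(Temp = c(10, 20, 30)), normalize = FALSE)
clean <- function(df, normalize = FALSE) {
  # Placeholder body so the sketch parses and runs; not the package's real logic.
  df
}

# After editing the comments above, regenerate man/*.Rd and NAMESPACE with:
# roxygen2::roxygenise()   # or devtools::document()
```

If a NAMESPACE or .Rd file was created by hand or by a project template, roxygen2 will typically leave it alone and warn that it was not generated by roxygen2. Deleting that file and regenerating is the usual way to let roxygen2 take over.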

Knowing what I know now, I would make these two changes before creating my next R package:

  1. Decide on the name and objective of the package before creating anything in RStudio.
  2. Use a smaller dataset when testing the functions. My test file had over 500,000 records and took a few minutes to load with each pass.
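One way to act on the second point is to read only the first rows of a large file while testing. The sketch below uses the nrows argument of read.csv(). The temporary file, its columns, and the 1,000-row cutoff are made up so the example runs on its own; a real NERRS export would take the place of tmp.

```r
# Build a throwaway CSV so the example is self-contained.
tmp  <- tempfile(fileext = ".csv")
full <- data.frame(id = seq_len(5000), Temp = seq(10, 30, length.out = 5000))
write.csv(full, tmp, row.names = FALSE)

# Read only the first 1,000 data rows instead of the whole file.
sample_df <- read.csv(tmp, nrows = 1000)
nrow(sample_df)
```

Because the rows are taken from the top of the file, the sample may not cover every station or season, so a final check against the full file is still worthwhile.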

It was exciting to create a new package and know that I have the ability to share content with the R community. I had a great feeling after opening a new project and seamlessly installing my package via GitHub into RStudio.

Below is the link to the package on GitHub and the DESCRIPTION file:

nerrsclean Package on GitHub

Link to DESCRIPTION file on GitHub




    Package: nerrsclean
    Type: Package
    Title: National Estuarine Research Reserve System Data Cleaning
    Version: 0.1.0
    Author: "Kevin Hitt  [aut, cre]"
    Description: Package for easily cleaning data collected under NOAA's National
        Estuarine Research Reserve System's (NERRS) National Monitoring Program.
        Data from http://cdmo.baruch.sc.edu/dges/
    Depends:
        R (>= 3.1.2),
        scales
    License: CC0
    Encoding: UTF-8
    LazyData: true
    RoxygenNote: 7.1.0.9000