Six not-so-basic base R functions

reference
There are so many goodies in base R. Let’s explore some functions you may not know.
Published

January 17, 2024

Leonetto Cappiello, Benedictine

R is known for its versatility and extensive collection of packages. As of the publishing of this post, there are over 23 thousand packages on R-universe. But what if I told you that you could do some pretty amazing things without loading any packages at all?

There’s a lot of love for base R, and I am excited to pile on. In this blog post, we will explore a few of my favorite “not-so-basic” (i.e., maybe new to you!) base R functions. Click ‘Run code’ to see them in action, made possible by webR and the quarto-webr extension! [1]

Note

This post includes examples from the base, graphics, datasets, and stats packages, which are automatically loaded when you open R. Additional base R packages include grDevices, utils, and methods. [2]

  1. invisible(): Return an invisible copy of an object
  2. noquote(): Print a character string without quotes
  3. coplot(): Visualize interactions
  4. nzchar(): Find out if elements of a character vector are non-empty strings
  5. with(): Evaluate an expression in a data environment
  6. lengths(): Determine lengths of a list or vector elements
  7. Null coalescing operator %||%: Return first input if not NULL, otherwise return second input
Note

I accidentally had seven functions in my post, even though it’s titled “Six not-so-basic base R functions.” Oops! Consider the null-coalescing operator a bonus, as it’s not part of base R yet. 😊

1. invisible

The invisible() function “returns a temporarily invisible copy of an object”: the value is returned as usual, but it is not auto-printed in the console. When you wrap a function call in invisible(), the call executes normally and its result can be assigned to a variable or used in other operations, but the result isn’t printed out.

Below are examples where the functions return their argument x, but one does so invisibly.
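A minimal sketch of that contrast, modeled on the example in the invisible() help page. The names f1 and f2 are illustrative only, and withVisible() is used here just to confirm the visibility flag.

```r
# f1 returns its argument visibly; f2 returns it invisibly
f1 <- function(x) x
f2 <- function(x) invisible(x)

f1(1)  # auto-prints the value at the console
f2(1)  # returns the same value, but nothing is printed

# withVisible() reports whether a result would be auto-printed
vis_f1 <- withVisible(f1(1))$visible
vis_f2 <- withVisible(f2(1))$visible
```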

To see invisible output, either pass it to print() or save it to a variable and then print that variable. Both of the below will print:
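As a rough illustration (f2 is the same hypothetical invisible-returning function, redefined so the chunk stands alone):

```r
f2 <- function(x) invisible(x)

# Option 1: print explicitly
print(f2(1))

# Option 2: save to a variable, then print the variable
y <- f2(1)
y

# Wrapping the call in parentheses also forces printing
(f2(1))
```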

Let’s try another example. Run the chunk below to install the purrr and tidytab packages. Installing the CRAN version of purrr from the webR binary repository is as easy as calling webr::install(). The tidytab package is compiled into a WebAssembly binary on R-universe and needs the repos argument to find it. mount = FALSE is due to a bug in the Firefox WebAssembly interpreter. If you’re not using Firefox, then I suggest you try the code below with mount = TRUE! (Note: this might take a few seconds, and longer with mount = FALSE.)

Using purrr and tidytab::tab2() together results in two NULL list items we do not need.

Running invisible() eliminates that!
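The same effect can be sketched with base R only. This is not the purrr and tidytab code from the post: describe() is a hypothetical stand-in for any function that prints something and returns NULL, and lapply() stands in for the purrr mapping call.

```r
# Hypothetical stand-in: prints a summary as a side effect, returns NULL
describe <- function(x) {
  cat("mean:", mean(x), "\n")
  NULL
}

# Auto-printing the result shows the cat() lines AND a list of two NULLs
lapply(list(1:3, 4:6), describe)

# Wrapped in invisible(), only the cat() lines appear
invisible(lapply(list(1:3, 4:6), describe))
```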

When writing a function, R can print a lot of stuff implicitly. Using invisible(), you can return results while controlling what is displayed to a user, avoiding cluttering the console with intermediate results.

Per the Tidyverse design guide, “if a function is called primarily for its side-effects, it should invisibly return a useful output.” In fact, many of your favorite functions use invisible(), such as readr::write_csv(), which invisibly returns the saved data frame.
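Here is a small sketch of that design advice using base R only. save_csv() is a hypothetical helper written for this example; it is not how readr implements write_csv().

```r
# Hypothetical helper: writing the file is the side effect,
# and the input data frame is returned invisibly
save_csv <- function(df, path) {
  write.csv(df, path, row.names = FALSE)
  invisible(df)
}

path <- tempfile(fileext = ".csv")

# Nothing is printed by save_csv(), yet the data frame keeps flowing down the pipe
n_rows <- head(mtcars) |> save_csv(path) |> nrow()
```

Because the data frame is still returned, the function can sit in the middle of a pipeline without cluttering the console.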

2. noquote

The noquote() function “prints character strings without quotes.”
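A quick sketch of the difference in printed output:

```r
x <- c("base", "R")

print(x)           # [1] "base" "R"
print(noquote(x))  # [1] base R

# noquote() only adds a class that changes printing; the strings are unchanged
nq <- noquote(x)
```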

I use noquote() in a function url_make that converts Markdown reference-style links into HTML links. The input is a character string of a Markdown reference-style link mdUrl and the output is the HTML version of that URL. With noquote(), I can paste the output directly in my text.

Try it out in an anonymous function below!
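The original url_make code is not reproduced here, so the following is only a sketch of the idea. It assumes the input looks like "[label]: url"; the regular expression, the argument name md_url, and the example link are all assumptions made for illustration.

```r
html <- (\(md_url) {
  # Assumed input shape: "[label]: url"
  pattern <- "^\\[(.*)\\]:\\s*(\\S+)\\s*$"
  label <- sub(pattern, "\\1", md_url, perl = TRUE)
  url   <- sub(pattern, "\\2", md_url, perl = TRUE)
  # noquote() drops the surrounding quotes when the result is printed
  noquote(sprintf('<a href="%s">%s</a>', url, label))
})("[Example]: https://example.com")

html
```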

Learn more about this syntax in my previous blog post!

3. coplot

The coplot() function creates conditioning plots, which are helpful in multivariate analysis. They allow you to explore pairs of variables conditioned on a third so you can understand how relationships change across different conditions.

The syntax of coplot() is coplot(y ~ x | a, data), where y and x are the variables you want to plot, a is the conditioning variable, and data is the data frame. The variables provided to coplot() can be either numeric or factors.

Using the built-in quakes dataset, let’s look at the relationship between the latitude (lat) and the longitude (long) and how it varies depending on the depth in km of seismic events (depth).
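A call along these lines produces the plot, using coplot()'s default settings:

```r
# quakes ships with the datasets package, so no loading is needed
coplot(lat ~ long | depth, data = quakes)
```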

To interpret this plot:

  • Latitude is plotted on the y-axis
  • Longitude is plotted on the x-axis
  • The six plots show the relationship of these two variables for different values of depth
  • The bar plot at the top indicates the range of depth values for each of the plots
  • The plots in the lower left have the lowest range of depth values and the plots in the top right have the highest range of depth values
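To see the depth ranges as numbers rather than bars, one option is co.intervals(), which computes overlapping conditioning intervals. The sketch below assumes coplot()'s documented defaults of six intervals with 0.5 overlap.

```r
# One row per panel: lower and upper depth bound of each conditioning interval
depth_ranges <- co.intervals(quakes$depth, number = 6, overlap = 0.5)
depth_ranges
```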

The orientation of plots might not be the most intuitive. Set rows = 1 to make the coplot easier to read.
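For example, with the same formula as before:

```r
# rows = 1 lays all panels out in a single row, ordered by increasing depth
coplot(lat ~ long | depth, data = quakes, rows = 1)
```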

Here, you can see how the area of Fiji earthquakes grows smaller with increasing depth.

You can also condition on two variables with the syntax coplot(y ~ x | a * b), where the plots of y versus x are produced conditional on the two variables a and b. Below, the coplot shows the relationship with depth from left to right and the relationship with magnitude (mag) from top to bottom. Check out a more in-depth explanation of this plot on StackOverflow.
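A sketch of a two-variable coplot on the quakes data. The number argument is my own choice to keep the grid small; the original chunk may have used different settings.

```r
# Columns are conditioned on depth, rows on mag
# number = c(4, 3) is an assumption: 4 depth intervals by 3 magnitude intervals
coplot(lat ~ long | depth * mag, data = quakes, number = c(4, 3))
```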

I first learned about coplot() thanks to Eric Leung’s tweet. Thanks, Eric!

4. nzchar

From the documentation, “nzchar() is a fast way to find out if elements of a character vector are non-empty strings”. It returns TRUE for non-empty strings and FALSE for empty strings. This function is particularly helpful when working with environment variables - see an example in the tuber documentation!
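A short sketch of both uses. MY_HYPOTHETICAL_API_KEY is a made-up variable name for illustration; Sys.getenv() returns "" for variables that are not set, which is what makes nzchar() a convenient check.

```r
x <- c("apple", "", " ")
non_empty <- nzchar(x)  # TRUE FALSE TRUE: a single space still counts as non-empty

# Check whether an environment variable has been set
key <- Sys.getenv("MY_HYPOTHETICAL_API_KEY")
has_key <- nzchar(key)
if (!has_key) message("No API key found; set MY_HYPOTHETICAL_API_KEY first.")
```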

I have written about nzchar in the past and I’ve also explained how to create a GIF using asciicast!

5. with

If you use base R, you’ve likely encountered the dollar sign $ when evaluating expressions with variables from a data frame. The with() function lets you reference columns directly, eliminating the need to repeat the data frame name multiple times. This makes your code more concise and easier to read.

So, instead of writing plot(mtcars$hp, mtcars$mpg), you can write:
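For instance:

```r
# hp and mpg are looked up inside mtcars
with(mtcars, plot(hp, mpg))

# The same idea works for any expression, not just plots
ratio <- with(mtcars, mean(mpg / wt))
```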

This is particularly handy to use with the base R pipe |>:
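A sketch with the native pipe (R 4.1 or later). The subset() step is only there to show a multi-step chain.

```r
# The pipe passes the data frame as the first argument of with()
mtcars |>
  subset(cyl == 4) |>
  with(plot(hp, mpg))

# Non-plot expressions work the same way
r <- mtcars |>
  subset(cyl == 4) |>
  with(cor(hp, mpg))
```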

Michael Love’s Tweet shows how to connect a dplyr chain to a base plot function using with():

6. lengths

lengths() is a more efficient version of sapply(x, length). Whereas length() returns the number of elements in an object as a whole, lengths() returns the length of each element of a list or vector. For a data frame, that means the length of each column.
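On a data frame, a sketch looks like this:

```r
length(mtcars)                 # number of columns
col_lengths <- lengths(mtcars) # length of each column, as a named integer vector
col_lengths
```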

Pretty straightforward but I think it is a neat function :)

Note 2024-01-21: As @ProfBootyPhD mentioned on Twitter, a better example of lengths() would be a list “since all the columns of a df are required to be the same length.” Here is the example from StackOverflow:
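The StackOverflow example itself is not reproduced here. As a stand-in, this made-up list shows why lengths() is more informative when the elements differ in size:

```r
# A hypothetical list with elements of different lengths
x <- list(a = 1:3, b = "z", c = NULL, d = list(1, 2))

elem_lengths <- lengths(x)
elem_lengths
```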

7. Null-coalescing operator in R, %||%

OK, this one isn’t in base R – yet! In the upcoming release, R will automatically provide the null-coalescing operator, %||%. Per the release notes:

‘L %||% R’ newly in base is an expressive idiom for the ‘if(!is.null(L)) L else R’ or ‘if(is.null(L)) R else L’ phrases.

Or, in code:

# Return x unless it is NULL, in which case return y
`%||%` <- function(x, y) {
  if (is.null(x)) y else x
}

Essentially, this means: if the first (left-hand) input x is NULL, return y. If x is not NULL, return x.
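A few calls make the behavior concrete. The operator is defined in the chunk with base R's is.null() so that the sketch also runs on R versions that do not yet ship it.

```r
# Base-only definition, so the chunk is self-contained
`%||%` <- function(x, y) if (is.null(x)) y else x

a <- NULL %||% "fallback"     # left side is NULL, so the right side is returned
b <- "value" %||% "fallback"  # left side is not NULL, so it is returned
d <- NA %||% "fallback"       # NA is not NULL, so NA is returned
```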

It was great to see Jenny Bryan and the R community celebrate the formal inclusion of the null-coalescing operator into the R language on Mastodon. The null-coalescing operator is particularly useful for R package developers, as highlighted by Jenny in her useR! 2018 keynote. The tidyverse team uses it to check whether an argument has been supplied or has been left at its default value, which is commonly NULL.

Jenny Bryan’s Code smell and feels null-coalescing operator example


However, the null-coalescing operator can also be useful in interactive use, for functions that take NULL as a valid argument. In this case, using the operator directly in the argument makes the call behave differently depending on whether the left-hand value is NULL. For example:
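The post's own example is not reproduced here, so this is an illustrative sketch only. It uses paste(), whose collapse argument accepts NULL (meaning "do not collapse"), and a hypothetical user_collapse setting.

```r
# Base-only definition, so the chunk runs on R versions without the operator
`%||%` <- function(x, y) if (is.null(x)) y else x

user_collapse <- NULL  # hypothetical user setting that was never filled in

# NULL is a valid value for collapse: the vector is left as is
kept <- paste(c("x", "y"), collapse = user_collapse)

# With %||% in the argument, a fallback separator is used instead
joined <- paste(c("x", "y"), collapse = user_collapse %||% "+")
```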

There’s more discussion about the utility of the function.

The fun-ctions never stop

Want even more functions (base R or not)? Here are some other resources to check out:

Thanks to all community members sharing their code and functions!

Footnotes

  1. Many thanks to the following resources for making this post possible:

  2. This is a handy guide for seeing the packages loaded in your R session!