Chapter 17 Testing

As your app gets more complicated it becomes impossible to hold it all in your head simultaneously. Testing is a way to capture how you expect code to work and turn it into a permanent artefact that you can run again and again and again to ensure that your code keeps working the way that you expect.

We’ll discuss three basic levels of testing in this chapter:

  • Testing functions with testthat.

  • Testing reactivity with testthat and shiny.

  • Testing client side with shinytest.

Start simplest, and work our way up. This should also be how you write tests — use the simplest test. This matches the advice that I’ve been giving your throughout the entire book — if you can write code outside of Shiny, you should, because reactivity is complicated, and isolating it to the small part of your code as possible will make your app as easy to understand as possible.

We’ll talk about the structure of tests in more detail very soon, but to give you an idea, this is what a test looks like:

test_that("", {
  
})

There’s a mix of regular R code that generates some output, and then some expectations, functions that start with expect_ that define what the results should be. If the expectation is not true, the test will fail, and you’ll learn about it.

Why write tests? As mentioned in Chapter 13, you don’t have to write automated tests. But if you don’t, you need to carefully prepare a script to ensure that you do the same things as consistently as possible. You only need to work through such a script a few times to get a feel for how tedious is. Sure, turning that script into code is going to be painful when you first do it (because you’ll need to carefully turn every key press and mouse click into a line of code), but every time you need to run it, it is much much easier.

When should you write tests? There are three basic options

  • Before you write the code. This is a style of code called test driven development, and if you know exactly how a function should behave, it makes sense to capture that knowledge as code before you start writing the implementation.

  • After you write the code. While writing code you’ll often build up a mental to-do list of worries about your code. After you’ve written the function, turn these into tests so that you can be confident that the function works the way that you expect.

  • When you find a bug. Whenever you find a bug, it’s good practice to turn it into an automated test case. This has two advantages. Firstly, to make a good test case, you’ll need to relentlessly simplify the problem until you have a very minimal reprex that you can include in a test. Secondly, you’ll make sure that the bug never comes back again!

But there’s a balance — you want to make sure that your tests are fast to run. If they’re not fast, it’ll be annoying to run time, and you won’t run them as often as you should.

library(shiny)
library(testthat)
library(shinytest)
#> 
#> Attaching package: 'shinytest'
#> The following object is masked _by_ '.GlobalEnv':
#> 
#>     testApp

17.1 Testing non-Shiny functions

Easiest to test code that is outside of the server functions, i.e. functions that you’ve extracted out as in Chapter 14. And for complex apps you as much code as possible should be independent of Shiny, mostly because it’s so much easier to reason about, test, and document. This will also serve as your intro to testthat independent of Shiny so you get the basics of testing down in a simpler environment. More details in https://r-pkgs.org/tests.html.

17.1.1 Basic structure

Tests have three levels of hierarchy:

  • File. All files live in tests/testthat. Each file should correspond to a file in R/, e.g. R/module.R it will create tests/testthat/module.R. Easiest way to create (or find!) that file is to run use_test().

  • Test. Each file broken down into tests. Each test is a call to test_that().

  • Expectation: each test contains one or more expectations. These are very low level assertions. The two most important for Shiny apps expect_equal(), expect_error(), and expect_snapshot_output(). Many expectations others can be found on the testthat website.

You can run all the tests for the current file with devtools::test_file(). Like use_test(), it looks at which file you currently have open in RStudio, and then automatically runs the appropriate tests.

17.1.2 Server side functions

Take load_file() from 14.3.1:

load_file <- function(name, path) {
  ext <- tools::file_ext(name)
  switch(ext,
    csv = vroom::vroom(path, delim = ","),
    tsv = vroom::vroom(path, delim = "\t"),
    validate("Invalid file; Please upload a .csv or .tsv file")
  )
}

There are three main things we want to test — can it load a csv file, can it load a tsv file, and does it give an error message?

test_that("load_file() handles inputs types", {
  # Create sample data
  df <- tibble::tibble(x = 1, y = 2)
  path_csv <- tempfile()
  path_tsv <- tempfile()
  write.csv(df, path_csv, row.names = FALSE)
  write.table(df, path_tsv, sep = "\t", row.names = FALSE)
  
  expect_equal(load_file("test.csv", path_csv), df)
  expect_equal(load_file("test.tsv", path_tsv), df)
  expect_error(load_file("blah", path_csv), "Invalid file")
})

Note that I create a very simple data frame and use it to create temporary data inside of my test. This is good practice because you want your tests to be as self-contained as possible. That makes it much easier to debug them when they fail, and you need to figure out why.

17.1.3 User interface functions

Hard to describe exactly what HTML should like. So best we can do is check that it doesn’t change unexpectedly.

expect_snapshot_output()

17.2 Workflow

Take a brief digression to work on your workflow before diving into testing Shiny specific code.

17.2.1 Code coverage

devtools::test_coverage() and devtools::test_coverage_file() will perform “code coverage”, running all the tests and recording which lines of code are run. This is useful to check that you have tested the lines of code that you think you have tested, and gives you an opportunity to reflect on if you’ve tested the most important, highest risk, or hardest to program parts of your code.

Won’t cover in detail here, but I highly recommend trying it out. Main thing to notice is that green lines are tested; red lines are not.

Basic workflow: Write tests. Inspect coverage. Contemplate why lines were tested. Add more tests. Repeat.

Not a substitute for thinking about corner cases — you can have 100% test coverage and still have bugs. But it’s a fun and a useful tool to help you think about what’s important, particularly when you have complex nested code.

17.2.2 Keyboard shortcuts

If you use RStudio, it’s worth setting up some keyboard shortucts:

  • Cmd/Ctrl + Shift + T is automatically bound to devtools::test()

  • Cmd/Ctrl + T to devtools::test_file()

  • Cmd/Ctrl + Shift + R to devtools::test_coverage()

  • Cmd/Ctrl + R to devtools::test_coverage_file()

You’re of course free to choose whatever makes sense to you. Keyboard shortcuts using Shift apply to the whole package. Without shift apply to the current file. Use the file based keyboard shortcuts for rapid iteration on a small part of your app. Use the whole package shortcuts to check that you haven’t accidentally broken something unrelated.

This is what my keyboard shortcuts look like for the mac.

17.2.3 Summary

  • From the R file, use usethis::use_test() to create the test file (the first time its run) or navigate to the test file (if it already exists).

  • Write code/write tests. Press cmd/ctrl + T to run the tests and review the results in the console. Iterate as needed.

  • If you encounter a new bug, start by capturing the bad behaviour in a test. In the course of making the minimal code, you’ll often get a better understanding of where the bug lies, and having the test will ensure that you can’t fool yourself into thinking that you’ve fixed the bug when you haven’t.

  • Press ctrl/cmd + R to check that you’re testing what you think you’re testing

  • Press ctrl/cmd + shift + T to make you have accidentally broken anything else.

17.3 Testing reactivity

Now that you have your non-reactive code tested, it’s time to move to Shiny specific stuff. We’ll start by testing the flow of reactivity in the server function simulating everything in R. This allows you to check for the vast majority of reactivity issues. In the next section, we’ll talk about problems that require a full browser loop.

Let’s start with a simple app, that has a few inputs, an output, and some reactives.

ui <- fluidPage(
  numericInput("x", "x", 0),
  numericInput("y", "y", 1),
  numericInput("z", "z", 2),
  textOutput("out")
)
server <- function(input, output, session) {
  xy <- reactive(input$x - input$y)
  yz <- reactive(input$z + input$y)
  xyz <- reactive(xy() * yz())
  output$out <- renderText(paste0("Result: ", xyz()))
}

myApp <- function(...) {
  shinyApp(ui, server, ...)
}

Testing this code using the approach above because all the complexity is in the reactivity, and the reactivity is sealed inside the server function in a way that’s hard to access. Shiny 1.5.0 provides a new tool to help with this challenge: testServer(). It takes a Shiny app, and allows you to run code as if it was inside the server function:

testServer(myApp(), {
  session$setInputs(x = 1, y = 1, z = 1)
  print(xy())
  print(output$out)
})
#> [1] 0
#> [1] "Result: 0"

Note the use of session$setInputs() — this is the key way in which you interact with the app, as if you were a user. You can then access and inspect the values of reactives and outputs. To turn this into a test, you just wrap it up in a test_that() block and use some expectations:

test_that("reactives and output updates", {
  testServer(myApp(), {
    session$setInputs(x = 1, y = 1, z = 1)
    expect_equal(xy(), 0)
    expect_equal(yz(), 2)
    expect_equal(output$out, "Result: 0")
  })
})

Note that unlike a real Shiny app, all inputs start as NULL. That’s because this is a pure server side simulation; while we give it that app object that contains both the UI and server, it only uses the server function. We’ll talk more about this limitation and how to work around shortly.

17.3.1 Modules

You could test modules in the same way as you test an app, assuming you’ve followed my advice in Chapter 17.3.1, because every module will have an app function already. But you can also test the module server directly:

datasetServer <- function(id) {
  moduleServer(id, function(input, output, session) {
    reactive(get(input$dataset, "package:datasets"))
  })
}

test_that("can find dataset", {
  testServer(datasetServer, {
    dataset <- session$getReturned()
    
    session$setInputs(dataset = "mtcars")
    expect_equal(dataset(), mtcars)
    
    session$setInputs(dataset = "iris")
    expect_equal(dataset(), iris)
  })
})

Here I use session$getReturned() to capture the reactive returned from the session so that I can later check that it varies as I expect as I change the input name.

Do we need to test what happens if input$dataset isn’t a dataset? In this case, no because we know that the module UI restricts the options to valid choices.

17.3.2 Timers

Time does not advanced automatically, so if you are using reactiveTimer() or invalidateLater(), you’ll need to manually trigger the advancement of time by calling session$elapse(millis = 300)

17.3.3 Limitations

testServer() is a simulation of your app. The simulation is useful because it lets you quickly test reactive code, but it is not complete. Importantly, much of Shiny relies on javascript. This includes:

  • The update functions, because they send JS to the browser which pretends that the user has changed something.

  • req() and validate().

If you want to test them, you’ll need to use the next technique.

17.4 Testing interaction

Manual usage of the shinytest package. You can use it as the website recommends, https://rstudio.github.io/shinytest (https://blog.rstudio.com/2018/10/18/shinytest-automated-testing-for-shiny-apps/). But I’m not going to cover that here, because I think it’s a little too fragile for use. (It’s great if you don’t know how to use testthat, but since I’ve explained testthat here, I don’t think you get any particularly great benefits from the snapshotting function).

Pros: Very high fidelity, since it actually starts up an R process (since Shiny apps are blocking), and a browser in the background.

Cons: Slower. Can only test the outside of the app, i.e. you can’t see the values of specific reactives, only their outcomes on the app itself. You have to manually turn every action you’d usually perform with the mouse and keyboard into a line of code.

17.4.1 Basic operation

Requires an app on disk. So what to do in a package? Just create app.R like shiny::shinyApp(myPackage::myApp()).

test_that("app works", {
  app <- shinytest::ShinyDriver$new("apps/shiny-test")
  app$setInputs(x = 1)
  expect_equal(app$getValue("y"), 2)
})

Possible to do more advanced things like simulating keypresses, taking screenshots, etc.

ShinyDriver$new() is relatively expensive, which means that you’ll tend to have fairly large tests. The best way to fight this tendency is to test everything else at a lower-level.

17.5 Challenges

  • Complex output (like plots and htmlwidgets). Focus on testing the inputs. Snapshot testing