Chapter 10 Reducing duplication

If you have been creating a lot of your app via copy and paste, parts of your app may be very similar to each other. The do not repeat yourself, or DRY, principle of software engineering (popularised by the Pragmatic Programmers) states that “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system”.

Copy and paste is a great starting technique but if you rely on it too much you’ll end up with apps that are hard to understand (because it’s hard to see the important differences when you have a lot of copy and pasted code) and are fragile to changes (because it’s easy to forget to update one of the places that you’ve duplicated code).

A good rule of thumb is that if you have copy and pasted something three times, it’s a good time to make a function or use some other technique to reduce the amount of duplication.

These techniques also allow you to spread your app code across multiple files. As your app grows, sandwhiching all of your code into a single app.R will start to become painful. This chapter describes the techniques you can use to break your app apart into smaller independent pieces, starting with functions and culminating with modules.

If you learned Shiny with an older version of Shiny, you might be more familiar with using separate files for the front end (ui.R) and back end (server.R). That organisation continues to work, but is no longer recommended: if you have an older app, I recommend doing a little copy and paste to combine the two files into a single app.R. Similarly, if you’re using global.R inline it into app.R.

Advantages of functions and modules:

  • Clearly isolated behaviour through specified inputs and outputs means it is easier to understand how parts of your app fit together, and you don’t have to worry about spooky action at a distance where changing one part of your app changes the way an apparently unrelated part works.

  • Reducing duplication makes it easier to respond to changing needs because instead of having to track down and change every place you duplicated code, you can just change it in one place.

  • You can spread your app across multiple files, so that it can be more easily digested in chunks. Because you’re using functions and modules you can read the files independently. You don’t have to load all the pieces into your head to understand how the whole thing hangs together.

Often the hardest part is decomposing your big problem into smaller independent pieces. I include some case studies here to help you get a sense of how this feels, but ultimately it’s a skill that can only be learned with practice. Try and set aside some time each week where you’re not improving the behaviour or appearance of your app, but simply making it easier to understand. This will make your app easier to change in the future, and as you practice these skills your first attempt will become higher quality.

library(shiny)

10.1 Using functions

Sometimes you can extract out duplicated code using functions. For example, if you’ve copied and pasted some UI code to create variants with different names:

Or you have a self contained set of reactives:

However, a function alone with only take you so far because typically you’ll have some connection between the front end and back end, and you need some way to coordinate the two. Shiny uses identifiers so you need some way to share them. This gives rise to Shiny modules.

10.1.1 Helper functions

If, given specific values, your app requires complex calculation, first start by pulling that calculation out into separate function:

server <- function(input, output, session) {
  data <- reactive({
    # complex data calculation involving input$x, input$y, input$z
  })
}
my_helper <- function(x, y, z) {
  ...
}

server <- function(input, output, session) {
  data <- reactive(my_helper(input$x, input$y, input$z))
}

When extracting out such helpers, avoid putting any reactive component inside the function. Instead, pass them in through the arguments.

There are two advantages to using a function:

  • It allows you to move it to a separate file

  • It makes it clear from the outside exactly what inputs your function takes. When looking at a reactive expression or output, there’s no way to easily tell exactly what values it depends on, except by carefully reading the code block. The function definition is a nice signpost that tells you exactly what to inspect.

A function also enforces this independence — if you try and refer to an input that you did not pass into the function, you’ll get an error. This enforced independence becomes increasingly important as you create bigger and bigger apps because it ensures that pieces of your app are independent and can be analysed in isolation.

As your collection of helper functions grow, you might want to pull them out into their own files. I recommend putting that file in a R/ directory underneath the app directory. Then load it at the top of your app.R:

library(shiny)

source("R/my-helper-function.R")

server <- function(input, output, session) {
  data <- reactive(my_helper(input$x, input$y, input$z))
}

(A future version of shiny will automatically source all files in R/, https://github.com/rstudio/shiny/pull/2547, so you’ll be able to remove the source() line.)

10.1.2 UI functions

You can apply these same ideas to generating your UI. If you have a bunch of controls that you use again and again and again, it’s worth doing some up front work to make a function that saves some typing.

This can be useful even if all you’re doing is changing three or four default arguments. For example, imagine that you’re creating a bunch of sliders that need to each run from 0 to 1, starting at 0.5, with a 0.1 step. You could do a bunch of copy and paste:

ui <- fluidRow(
  sliderInput("alpha", "alpha", min = 0, max = 1, value = 0.5, step = 0.1),
  sliderInput("beta", "beta", min = 0, max = 1, value = 0.5, step = 0.1),
  sliderInput("gamma", "gamma", min = 0, max = 1, value = 0.5, step = 0.1),
  sliderInput("delta", "delta", min = 0, max = 1, value = 0.5, step = 0.1)
)

But even for this simple case, I think it’s worthwhile to pull out the repeated code into a function:

sliderInput01 <- function(id, label = id) {
  sliderInput(id, label, min = 0, max = 1, value = 0.5, step = 0.1)
}

ui <- fluidRow(
  sliderInput01("alpha"),
  sliderInput01("beta"),
  sliderInput01("gamma"),
  sliderInput01("delta")
)

If you’re comfortable with functional programming, you could reduce the code still further as below. htmltools (the package that provides the underlying html code to Shiny) supports tidy dots only in the development version. fluidRow(!!!list(a, b)) is equivalent to fluidRow(a, b). This technique is sometimes called splatting because you’re splatting the elements of a list into the arguments of a function.

if (packageVersion("htmltools") >= "0.3.6.9004") {
  vars <- c("alpha", "beta", "gamma", "delta")
  sliders <- purrr::map(vars, sliderInput01)
  ui <- fluidRow(!!!sliders)
}

I’m not going to teach functional programming here, but I will show off some examples. It’s a good example of where improving your general R programming skills pays off in your Shiny apps.

10.1.3 Reactives

Note that you want to keep as much reactivity inside the server function as possible. So it takes a generic path and it returns a data frame, not a reactive.

10.1.4 Case study

Lets explore this idea with a realistic Shiny app, inspired by a post, https://community.rstudio.com/t/38506, on the RStudio community forum. The post contained some code that looks like this:

fluidRow(
  box(
    width = 4,
    solidHeader = TRUE,
    selectInput("traffickingType",
      label = "Choose a trafficking type: ",
      choices = sort(unique(ngo$Trafficking.Type)),
      multiple = TRUE
    )
  ),
  box(
    width = 4,
    solidHeader = TRUE,
    selectInput("traffickingSubType",
      label = "Choose a trafficking sub type: ",
      choices = sort(unique(ngo$Trafficking.Sub.Type)),
      multiple = TRUE
    )
  ),
  box(
    width = 4,
    solidHeader = TRUE,
    selectInput("gender",
      label = "Choose a gender: ",
      choices = sort(unique(ngo$Victim.Gender)),
      multiple = TRUE
    )
  )
)

It’s a little hard to see what’s going on here because repeated code makes the differences harder to see. When looking at this code I see two places where I could extract out a function:

  • The call to box() repeats width = 4 and solidHeader = TRUE. It appears that the intent of this code is making a header, so I’ll call the function headerBox.

  • The calls to selectInput() repeat multiple = TRUE and all use the same strategy for determining the choices: pulling unique values from a data frame column. This function is tied to a specific dataset, so I’ll call it ngoSelectInput().

That leads me to:

ngoSelectInput <- function(var, label, multiple = TRUE) {
  choices <- sort(unique(ngo[[var]]))
  label <- paste0("Choose a ", label, ": ")
  selectInput(var, label, choices = choices, multiple = multiple)
}
boxHeader <- function(...) {
  box(width = 4, solidHeader = TRUE, ...)
}

fluidRow(
  boxHeader(ngoSelectInput("Trafficking.Type", "trafficking type")),
  boxHeader(ngoSelectInput("Trafficking.Sub.Type", "trafficking sub type")),
  boxHeader(ngoSelectInput("Victim.Gender", "gender"))
)

I made one simplifying assumption that would also require changes on the server side: when filtering based on a variable, the input name should be the same as the variable name. I think this sort of consistency generally makes for code that’s easier to read and remember. For example, the names of the new inputs will match up perfectly to the data frame columns if I produce a reactive with only the selected rows:

ngo_filtered <- reactive({
  filter(ngo, 
    Trafficking.Type %in% input$Trafficking.Type,
    Trafficking.Sub.Type %in% input$Trafficking.Sub.Type,
    Victim.Gender %in% input$Victim.Gender
  )
})

You might consider genearalising to handle multiple datasets:

dfSelectInput <- function(df, var, label, multiple = TRUE) {
  choices <- sort(unique(df[[var]]))
  label <- paste0("Choose a ", label, ": ")
  selectInput(var, label, choices = choices, multiple = multiple)
}

This would be a good idea if you saw that pattern repeated in multiple places. But you’ll probably also need to introduce some additional component for the id. Otherwise dfSelect(df1, "x") and dfSelect(df2, "x") would generate a control with the same id, which is obviously going to cause problems. This is the problem of namespacing; we want somehow to have a hierarchy in the names. We’ll come back to this in modules, as this is one of the big problems that they solve.

If you had a lot more controls, I’d consider using functional programming to generate them. Again, I’ll just show an example so if you’re already familiar with FP you can see my basic approach. The key idea is to capture all the data you need to generate the columns in a single data frame, which is convenient to create with tibble::tribble(). A data frame is useful here because it easily generalises to any number of arguments

library(purrr)
vars <- tibble::tribble(
  ~ var,                  ~ label,
  "Trafficking.Type",     "trafficking type",
  "Trafficking.Sub.Type", "trafficking sub type",
  "Victim.Gender",        "gender"
)

Then we use purrr::pmap() to turn each row in the data frame to a call to ngoSelectInput(), use map() to wrap each select input into a boxHeader, and then !!! to

vars %>% 
  pmap(ngoSelectInput) %>% # create one select input for each row
  map(boxHeader) %>%       # wrap each in a boxHeader()
  fluidRow(!!!.)           # collapse into a single fluidRow()

If you have really advanced FP skills, you can even generate the call to dplyr::filter():

library(rlang)
#> 
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#> 
#>     %@%, as_function, flatten, flatten_chr, flatten_dbl, flatten_int,
#>     flatten_lgl, flatten_raw, invoke, list_along, modify, prepend,
#>     splice
select <-  map(vars$var, function(v) expr(.data[[!!v]] == input[[!!v]]))
select
#> [[1]]
#> .data[["Trafficking.Type"]] == input[["Trafficking.Type"]]
#> 
#> [[2]]
#> .data[["Trafficking.Sub.Type"]] == input[["Trafficking.Sub.Type"]]
#> 
#> [[3]]
#> .data[["Victim.Gender"]] == input[["Victim.Gender"]]

If you haven’t seen .data before, it comes from tidy evaluation, the system that allows you to program with tidyverse packages that are designed for interactive exploration (like dplyr). It’s not necessary when writing interactive code (and it’s not strictly necessary here) but it makes the parallel between the data frame and the inputs more clear. We’ll talk more about tidy evaluation in Chapter XXX.

Again we’d use !!! to splat the generated expressions into filter():

filter(ngo, !!!select)

Don’t worry if this all looks like gibberish: you can just use copy and paste instead.

10.2 Using modules

Functions are great but they’re effectively only useful for exracting out pure UI code or pure computation used inside reactives. They don’t help (why?) if you want to build more complicated components that link UI and server.

A Shiny module is a pair of functions, corresponding to the front end UI and the backend server function.

Modules are way to create an app within an app. They force isolation of behaviour so that one module can’t affect another, and code outside of a module can only affect the inside in a way that the module explicitly allows.

10.2.1 Without modules

To illustrate why we need modules, and can’t just use regular functions, consider the following simple app. It allows the user to input their birthday as a string: this is a little faster than using a dateInput() since there’s no need to scroll through a calendar. But it means that we need to check that they’ve entered a correct date and give an informative message if they haven’t.

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date

ui <- fluidPage(
  textInput("date", "When were you born? (yyyy-mm-dd)"),
  textOutput("error"),
  textOutput("age")
)

server <- function(input, output, session) {
  birthday <- reactive({
    req(input$date)
    ymd(input$date, quiet = TRUE)
  })
  
  output$error <- renderText({
    if (is.na(birthday())) {
      "Please enter valid date in yyyy-mm-dd form"
    }
  })
  age <- reactive({
    req(birthday())
    (birthday() %--% today()) %/% years(1)
  })
  output$age <- renderText({
    paste0("You are ", age(), " years old")
  })
}

It seems plausible that as your app gets bigger you might want to use this date control in multiple places, so lets have a go at extracting it out into functions. We’ll need two functions – one to generate the UI, and one to do the computation on the server side:

ymdInputUI <- function(label) {
  label <- paste0(label, " (yyyy-mm-dd)")
  
  fluidRow(
    textInput("date", label),
    textOutput("error")
  )
}

ymdInputServer <- function(input, output, session) {
  date <- reactive({
    req(input$date)
    ymd(input$date, quiet = TRUE)
  })
  
  output$error <- renderText({
    if (is.na(date())) {
      "Please enter valid date in yyyy-mm-dd form"
    }
  })

  date
}

Note that the function we’ll use in the server function takes input, output, and session as arguments. We don’t strictly need session here but if we’re using input and output we might as well make it as similar to a regular server function as possible.

That leads to the following app:

ui <- fluidPage(
  ymdInputUI("When were you born?"),
  textOutput("age")
)

server <- function(input, output, session) {
  birthday <- ymdInputSever(input, output, session)
  age <- reactive({
    req(birthday())
    (birthday() %--% today()) %/% years(1)
  })
  
  output$age <- renderText({
    paste0("You are ", age(), " years old")
  })
}

There are two problems with this approach:

  • It always assumes that the control is called date. This means that we can’t have two controls in the same app.

    ui <- fluidPage(
      ymdInputUI("When was your mother born?"),
      ymdInputUI("When was your father born?")
    )
  • The UI has a output with id error that you can’t see from reading just the UI code. This makes it very easy to accidentally break the app.

    ui <- fluidPage(
      ymdInputUI("When were you born?"),
      textOutput("error")
    )

    Debugging the problem that this creates will be painful because it will reveal itself through failure of reactivity – the output won’t update as you expect, or you’ll get weird errors because two controls are fighting for the same input value.

These problems arise because we’ve used functions to isolate local variables; the code is simpler to understand because any variables created inside of ymdInputUI() and ymdInputServer() can’t be accessed outside. But there’s another important way that Shiny code can interface: through the names of input and output controls.

This is the problem that modules are designed to solve: creating inputs and reactives that are completely isolated from the rest of your app. Learning how to use modules will take a little time, but it will pay off by giving you the ability to write components that are guaranteed to be isolated from everything else in your app.

10.2.2 Making a module

To convert the code above into a module, we need to make two changes. First we need to add an id argument to our UI component, and use it with special NS() function. NS is short for namespace: it creates a “space” of “names” that is unique to the module.

ymdInputUI <- function(id, label) {
  ns <- NS(id)
  label <- paste0(label, " (yyyy-mm-dd)")
  
  fluidRow(
    textInput(ns("date"), label),
    textOutput(ns("error"))
  )
}

The key idea is that the argument to NS() is supplied by the person using the component, and the arguments to the function it produces is supplied by the person who wrote the component. This two-phase creation ensures that the final name combines properties needed by both the app author and the module author. This is a bit confusing at first, because you’re likely to be both the app and module author.

We now need to specify an id when creating the UI. This is important because it puts this id in the same place as all the others, so it’s easy to spot if you’ve used the same input id in multiple places.

ui <- fluidPage(
  ymdInputUI("birthday", "When were you born?"),
  textOutput("age")
)

We need to make a similar change to the server side of the module. Here instead of NS() we use callModule(). callModule() automatically tweaks the input and output so it looks for date inside the id namespace. You can think of it as doing input[[id]]birthday (but it’s actually input[[paste(id, "-", birthday)]]).

ymdInput <- function(id) {
  callModule(id = id, function(input, output, session) {
    date <- reactive({
      req(input$date)
      ymd(input$date, quiet = TRUE)
    })
    
    output$error <- renderText({
      if (is.na(date())) {
        "Please enter valid date in yyyy-mm-dd form"
      }
    })
  
    date
  })
}

(You may have seen modules used a little differently elsewhere. But I think this organisation makes it easier to understand what’s going on.)

Now the arguments to ymdInput() have changed: we pass in the id, and Shiny takes care of automatically plumbing up the input, output, and session in the appropriate namespaced way. That why I’ve removed the Server from the name - since the details are hidden from the interface.

server <- function(input, output, session) {
  birthday <- ymdInput("birthday")
  
  age <- reactive({
    req(birthday())
    (birthday() %--% today()) %/% years(1)
  })
  
  output$age <- renderText({
    paste0("You are ", age(), " years old")
  })
}

To help cement the ideas of modules in your head, the following case studies use module to develop a few simple reusable components.

10.2.3 Limited selection + other

Consider the following app, which provdies a way to select gender that is sensitive to the many possible ways that people can express their gender.25

ui <- fluidPage(
  radioButtons("gender", "Gender:",
    choiceValues = list("male", "female", "self-described", "na"),
    choiceNames = list(
      "Male",
      "Female",
      textInput("gender_self", NULL, placeholder = "Self-described"),
      "Prefer not to say"
    ),
    selected = "na",
  ),
  textOutput("txt")
)

server <- function(input, output, session) {
  observeEvent(input$gender_self, {
    req(input$gender_self)
    updateRadioButtons(session, "gender", selected = "self-described")
  })
  
  gender <- reactive({
    if (input$gender == "self-described") {
      input$gender_self
    } else {
      input$gender
    }
  })
  
  output$txt <- renderText({
    paste("You chose", gender())
  })
}

The gender and gender_self components are tightly bound together. We haven’t talked about updateRadioButtons() yet, but this is just a small convenience so that if you start typing a self-described gender, that radio button is automatically selected.

Convert to a module and generalise a little.

radioButtonsWithOther <- function(id, label, choices, selected = NULL, placeholder = NULL) {
  ns <- NS(id)
  
  radioButtons(ns("primary"), "Gender:",
    choiceValues = c(names(choices), "other"),
    choiceNames = c(
      unname(choices),
      list(textInput(ns("other"), NULL, placeholder = NULL))
    ),
    selected = selected
  )
}

radioButtonsWithOtherServer <- function(input, output, session) {
  observeEvent(input$primary, {
    req(input$other)
    updateRadioButtons(session, "primary", selected = "other")
  })
  reactive({
    if (input$primary == "other") {
      input$other
    } else {
      input$primary
    }
  })
}

ui <- fluidPage(
  radioButtonsWithOther("gender", 
    label = "Gender", 
    choices = list(
      male = "Male",
      female = "Female",
      na = "Prefer not to say"
    ), 
    placeholder = "Self-described", 
    selected = "na"
  ),
  textOutput("txt")
)

server <- function(input, output, session) {
  gender <- callModule(radioButtonsWithOtherServer, "gender")

  output$txt <- renderText({
    paste("You chose", gender())
  })
}

10.2.4 Hierarchical select boxes

library(tidyverse)
country_df <- countrycode::codelist %>% 
  as_tibble() %>% 
  select(iso3c, continent, country = cow.name) %>% 
  filter(!is.na(continent), !is.na(country))

continents <- sort(unique(country_df$continent))

ui <- fluidPage(
  selectInput("continent", "Continent", choices = continents),
  selectInput("country", "Country", choices = NULL)
)

server <- function(input, output, session) {
  countries <- reactive({
    country_df[country_df$continent == input$continent, , drop = FALSE]
  })
  
  observeEvent(input$continent, {
    updateSelectInput(session, "country", choice = countries()$country)
  })
}

shinyApp(ui, server)

10.2.6 Returning multiple reactives

  • Leave in a list.
  • Use zeallot

10.2.7 Exercises

  1. The following app plots user selected variables from the msleep dataset for three different types of mammals (carnivores, omnivores, and herbivores), with one tab for each type of mammal. Remove the redundancy in the selectInput() definitions with the use of functions.
library(tidyverse)

ui <- fluidPage(
  selectInput(inputId = "x",
              label = "X-axis:",
              choices = c("sleep_total", "sleep_rem", "sleep_cycle", 
                          "awake", "brainwt", "bodywt"),
              selected = "sleep_rem"),
  selectInput(inputId = "y",
              label = "Y-axis:",
              choices = c("sleep_total", "sleep_rem", "sleep_cycle", 
                          "awake", "brainwt", "bodywt"),
              selected = "sleep_total"),
  tabsetPanel(id = "vore",
              tabPanel("Carnivore",
                       plotOutput("plot_carni")),
              tabPanel("Omnivore",
                       plotOutput("plot_omni")),
              tabPanel("Herbivore",
                       plotOutput("plot_herbi")))
)

server <- function(input, output, session) {

  # make subsets
  carni <- reactive( filter(msleep, vore == "carni") )
  omni  <- reactive( filter(msleep, vore == "omni")  )
  herbi <- reactive( filter(msleep, vore == "herbi") )

  # make plots
  output$plot_carni <- renderPlot({
    ggplot(data = carni(), aes_string(x = input$x, y = input$y)) +
      geom_point()
  })
  output$plot_omni <- renderPlot({
    ggplot(data = omni(), aes_string(x = input$x, y = input$y)) +
      geom_point()
  })
  output$plot_herbi <- renderPlot({
    ggplot(data = herbi(), aes_string(x = input$x, y = input$y)) +
      geom_point()
  })

}

shinyApp(ui = ui, server = server)
  1. Continue working with the same app from the previous exercise, and further remove redundancy in the code by modularizing how subsets and plots are created.

  2. Suppose you have an app that is slow to launch when a user visits it. Can
    modularizing your app code help solve this problem? Explain your reasoning.

10.3 Packages

For more complicated apps, particularly apps that multiple people contribute to, there are substantial advantages to turning your app into a package. In that case, you might want to check out the golem package and accompanying “Buidling Big Shiny Apps” book.


  1. For a deeper dive on this issue, and a discussion of why many commonly used way of asking about gender can be hurtful to some people, I recommend reading “Designing forms for gender diversity and inclusion” by Sabrina Fonseca: https://uxdesign.cc/d8194cf1f51.↩︎