
Wraps gtsummary::tbl_summary() to create a data summary table of the kind often seen in regulatory submissions. Continuous variables are summarized on multiple lines with additional summary statistics, and percentages are shown for categorical variables. Precision levels are estimated from the observed values.


Usage

tbl_reg_summary(
  data,
  by = NULL,
  label = NULL,
  statistic = NULL,
  digits = NULL,
  type = NULL,
  value = NULL,
  missing = c("no", "yes", "ifany"),
  missing_text = NULL,
  sort = NULL,
  percent = NULL,
  include = everything()
)



Arguments

data
A data frame.


by
A column name (quoted or unquoted) in data. Summary statistics will be calculated separately for each level of the by variable (e.g. by = trt). If NULL, summary statistics are calculated using all observations.


label
List of formulas specifying variable labels, e.g. list(age ~ "Age", stage ~ "Path T Stage"). If a variable's label is not specified here, the label attribute (attr(data$age, "label")) is used. If the label attribute is NULL, the variable name is used.
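As a minimal sketch of the label argument, the call below overrides the displayed labels for two variables. It assumes the gtreg package is installed, and it reuses the df_patient_characteristics data and the trt, marker, and status columns from this page's own example. The label strings themselves are hypothetical.

```r
library(gtreg)

# Columns trt, marker, and status come from this page's example;
# the label text below is made up for illustration.
tbl_labelled <- tbl_reg_summary(
  data = df_patient_characteristics,
  by = trt,
  include = c(marker, status),
  label = list(
    marker ~ "Biomarker level",
    status ~ "Study status"
  )
)
```

Variables not named in the list keep their label attribute or, failing that, their column name.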


statistic
List of formulas specifying types of summary statistics to display for each variable.


digits
List of formulas specifying the number of decimal places to round summary statistics. If not specified, tbl_summary guesses an appropriate number of decimals to round statistics. When multiple statistics are displayed for a single variable, supply a vector rather than an integer. For example, if the statistic being calculated is "{mean} ({sd})" and you want the mean rounded to 1 decimal place and the SD to 2, use digits = list(age ~ c(1, 2)). You may also pass a styling function, e.g. digits = age ~ style_sigfig.
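The interaction between statistic and digits can be sketched as follows. This example assumes that marker is a continuous column in df_patient_characteristics (the page does not state its type) and that a single "{mean} ({sd})" string is accepted for it. The digits vector then supplies one value per statistic, in order.

```r
library(gtreg)

# Assumption: marker is numeric/continuous in df_patient_characteristics.
# "{mean} ({sd})" contains two statistics, so digits gets two values:
# 1 decimal place for the mean, 2 for the SD.
tbl_rounded <- tbl_reg_summary(
  data = df_patient_characteristics,
  by = trt,
  include = marker,
  statistic = list(marker ~ "{mean} ({sd})"),
  digits = list(marker ~ c(1, 2))
)
```

If the number of values in the digits vector does not match the number of statistics in the glue string, expect a warning or error rather than silent recycling.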


type
List of formulas specifying variable types. Accepted values are c("continuous", "continuous2", "categorical", "dichotomous"), e.g. type = list(age ~ "continuous", female ~ "dichotomous"). If type is not specified for a variable, the function will default to an appropriate summary type.


value
List of formulas specifying the value to display for dichotomous variables. gtsummary selectors, e.g. all_dichotomous(), cannot be used with this argument.


missing
Indicates whether to include counts of NA values in the table. Allowed values are "no" (never display NA values), "ifany" (only display if any NA values), and "always" (includes NA count row for all variables; the usage signature spells this option "yes"). Default is "no", the first value listed in the usage signature.


missing_text
String to display for count of missing observations. Default is "Unknown".
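A short sketch of how missing and missing_text work together: the call below asks for a missing-value row only where NA values exist and relabels that row. It assumes gtreg is installed and reuses the data and columns from this page's example. The "Not recorded" text is a hypothetical label.

```r
library(gtreg)

# "ifany" adds a missing-count row only for variables that contain NA values.
# "Not recorded" is an illustrative replacement for the default "Unknown".
tbl_with_missing <- tbl_reg_summary(
  data = df_patient_characteristics,
  by = trt,
  include = c(marker, status),
  missing = "ifany",
  missing_text = "Not recorded"
)
```

If neither variable contains NA values, the table simply has no missing-count rows.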


sort
List of formulas specifying the type of sorting to perform for categorical data. Options are "frequency", where results are sorted in descending order of frequency, and "alphanumeric", e.g. sort = list(everything() ~ "frequency").


percent
Indicates the type of percentage to return. Must be one of "column", "row", or "cell". Default is "column".
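To illustrate the percent argument, the sketch below requests row percentages, so that each category's percentages add up across the levels of the by variable rather than within each column. It assumes gtreg is installed and that status is summarized as a categorical variable, which this page does not state explicitly.

```r
library(gtreg)

# Assumption: status is treated as categorical, so percentages are reported.
# With percent = "row", each status level is split across the trt groups.
tbl_row_pct <- tbl_reg_summary(
  data = df_patient_characteristics,
  by = trt,
  include = status,
  percent = "row"
)
```

Switching to percent = "cell" would instead express every count as a share of all observations in the table.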


include
Variables to include in the summary table. Default is everything().


Value

A 'tbl_reg_summary' object.

Example Output

Example 1

See also

See gtsummary::tbl_summary() help file

See vignette for detailed tutorial


Examples

library(gtreg)

# Summarize marker and status separately for each level of trt
tbl_reg_summary_ex1 <-
  df_patient_characteristics %>%
  tbl_reg_summary(by = trt, include = c(marker, status))