The {rmarkdown} package allows you to dynamically generate documents by mixing formatted text and results produced by R code. The generated documents can be in HTML, PDF, Word, and many other formats. It is therefore a very practical tool for exporting, communicating and disseminating analysis results.
There is a whole book on Rmarkdown, so we can only cover some of the essentials here.
This document was itself generated from R Markdown files.
In RStudio, click on the File menu, and select New Project…. Then click on New Directory. Give your project a name, and select a directory into which to place it. (Make sure you remember where you put it!) Once you have these fields filled out, click Create Project.
Next we’re going to set up some folders inside the project. Go to the Files pane, and click on New Folder. Name this one “data”, and click OK. This is where you will put the data related to this project. Create one more called “rmd”. R Markdown documents will go here.
An R Markdown document is a simple text file saved with the
.Rmd
extension.
In RStudio, you can create a new document by going to the File menu then choosing New file then R Markdown…. The first time you create an R Markdown document, you may be asked to install several packages. Go ahead and install those. Once RStudio has the appropriate packages, the following dialog box appears :
For now, you can leave all the default values and click OK. A file with sample content is then displayed.
Try editing some of the text in the file. Notice that is made up of some free text and some code sections.
Save your file with Cmd/Ctrl
+ S
,
remembering to give it the extension “.Rmd”. E.g. “ebola_analysis.Rmd”.
Be sure to save it in the “rmd” folder you just created.
You can now try rendering the document by clicking on the “knit” button at the top right:
This will create an HTML output that looks like this:
This new rendered file is stored in the same directory as your Rmd. It has the same name, except it ends with “.html” instead of “.rmd”.
HTML stands for Hyper text markup language and is the format that is used for most documents on the web.
Now let’s return to the rest of the Rmd to consider it part by part.
The first part of the document is its *header*. (It is also called “YAML”, which stands for “Yet another markup language”.) (The name is intended to be humorous.)
---
title: "Untitled"
output: html_document
date: "2022-10-09"
---
The YAML header must be located at the very beginning of the document, delimited by three dashes (`---`) before and after.
This header contains the document’s metadata, such as its title, author, date, plus a whole host of possible options that will allow you to configure or customize the entire document and its rendering. Here, for example, the line `output: html_document` indicates that the generated document must be in HTML format.
We can change the html_document
text to try out some
other formats.
First you can make so
With the output set to “word_document”, we get something like this:
Note that this creates a “.docx” version of our document in the “rmd” folder.
With the output set to “powerpoint_document”, it comes out like this:
If we change the output setting to “pdf_document”, we can get the same document in PDF format (for this you may be prompted to install tinytex on your computer, see below):
For PDF generation, you must have a working LaTeX
installation on your system. If not, Yihui Xie’s tinytex
extension aims to make it easier to install a minimal LaTeX
distribution regardless of your machine’s operating system. To use it,
you must first install the extension with
install.packages('tinytex')
, then run the following command
in the console (expect a download of about 200MB):
tinytex::install_tinytex()
More information on the tinytex website.
There is also a file format called “prettydoc”. To try this out, type
install.packages('prettydoc')
into the console and hit
enter. The output format for prettydoc is a little different
than the previous three we’ve seen, you need to type
prettydoc::html_pretty
in the output
section.
When you knit a prettydoc, you should see something like this:
We can even get a simple dashboard format. First we need to
install.packages('flexdashboard')
. Then if we set the
output
to flexdashboard::flex_dashboard
, and
knit, we get something like the following:
Note that it does not yet have tabs. To create tabs in a
flexdashboard, change some of your double hashtags ##
to
single hashtags #
. This will change the header style for
those sections, and get flexdashboard to render those headers as tabs
instead.
Many other formats are possible, and we encourage you to explore on your own!
Rmarkdown documents can be edited in either a “Source” mode or a “Visual” mode.
You can switch into visual mode for a given document using the
toolbars. For older RStudio versions, you may have an A
button at the top-right of the document toolbar
For newer RStudio versions, there is a pair of buttons to toggle between the modes:
What’s the difference between these two modes?
In source mode you see the raw markdown syntax.
Markdown is a simple set of conventions for adding formatting to
plain text. For example, to italicize text, you wrap it in as asterisk
*text here*
, and to start a new header, you use the pound
sign #
. We will learn these in detail below
But in visual mode, you instead see a Microsoft-word like WYSIWIG view:
with a toolbar for easy formatting:
That means you do not have remember the syntax for markdown elements. For example, if you want to make a section of text bold, you can simply highlight that piece of text and click on the bold button in the toolbar.
Now, while visual mode is much easier to use, we will teach you markdown syntax here for three reasons:
Visual mode is sometimes a buggy experience, and to debug this you’ll need to switch to source mode
Understandin markdown syntax is useful outside of Rmarkdown
Visual mode is not available in RStudio’s collaborative mode, which you may make use of
In the “Help” tab, if you look up “Markdown Quick Reference”, you will be able to find a wide variety of RMD options available to you.
You can define titles of different levels by starting a line with one
or more #
:
# Level 1 title
## Level 2 Title
### Level 3 Title
The body of the document consists of text that follows the Markdown syntax. A Markdown file is a text file that contains lightweight markup that helps set heading levels or format text. For example, the following text:
This is text with *italics* and **bold**.
You can define bulleted lists:
- first element
- second element
Will generate the following formatted text:
This is text with italics and bold.
You can define bulleted lists:
- first element
- second element
Note that you need spaces before and after lists, as well as keeping the listed items on separate lines, or else they will all crunch together rather than making a list.
We see that words placed between asterisks are italicized, lines that begin with a dash are transformed into a bulleted list, etc.
The Markdown syntax allows for other formatting, such as the ability to insert links or images. For example, the following code:
[Example Link](https://example.com)
… will give the following link:
We can also embed images. If you’re in Source mode, type:
[what you want the subtitle to say](images/picture_name.jpg)
,
replacing “what you want the subtitle to say” (it can also be blank),
“images” with the name of the image folder in your project, and
“picture_name.jpg” with the name of the image you want to use. Of
course, it is easier to do in Visual mode. From here, you can
just open the folder that holds your image on your computer and
drag-and-drop the image from the folder onto the page you’re building.
Or you can place the cursor where you want the image, click the button
above marked with a “picture” icon, follow the prompts, and insert your
image where the cursor is. Note that this will also create an “images”
folder in your project (if it doesn’t already exist) and put the image
file into the “images” folder.
When titles have been defined, if you click on the Show document outline icon completely to the right of the toolbar associated with the R Markdown file, a table of contents automatically generated from the titles is displayed and allows you to navigate easily in the document:
The customization of the generated document is done by modifying options in the preamble of the document. However, RStudio offers a small graphical interface to change these options more easily. To do this, click on the gear icon to the right of the Knit button and choose Output Options…
A dialog box appears allowing you to select the desired output format and, depending on the format, different options:
For the HTML format for example, the General tab allows you to specify if you want a table of contents, its depth, the themes to apply for the document and the syntax highlighting of the R blocks, etc. The Figures tab allows you to change the default dimensions of the graphics generated.
When you change options, RStudio will actually change the preamble of your document. So if you choose to show a table of contents and change the syntax highlighting theme, your header will become something like:
---
title: "R Markdown Review"
output:
html_document:
highlight: kate
knock: yes
---
You can modify the options directly by editing the preamble.
Note that it is possible to specify different options depending on the format, for example:
---
title: "R Markdown Review"
output:
html_document:
highlight: kate
knock: yes
pdf_document:
fig_caption: yes
highlight: kate
---
The complete list of possible options is present on the official documentation site (very complete and well done) and on the cheat sheet and the reference guide, accessible from RStudio via the Help menu then Cheatsheets.
In addition to free text in Markdown format, an R Markdown document contains, as its name suggests, R code. This is included in blocks (chunks) written the following way in Source mode:
```{r}
r_code <- 2+2
```
Which will produce the following in Visual mode:
<- 2+2 r_code
As this sequence of characters is not very easy to enter, you can use
the Insert menu of RStudio and choose R[^3], or use
the keyboard shortcut Command+Option+i
on Mac or
Ctrl+Alt+i
on Windows.
Note that it is possible to use other languages in code chunks.
In RStudio blocks of R code are usually displayed with a slightly different background color to distinguish them from the rest of the document.
When your cursor is in a block, you can enter the R code you want and execute it with Command + Enter. You can also execute all the code contained in a block by clicking on the green “play” button at the top right of the code chunk.
In RStudio, by default, the results of a block of code (text, table or graphic) are displayed directly in the document editing window, allowing them to be easily viewed and kept for the duration of the session.
This behavior can be changed by clicking the gear icon on the toolbar and choosing Chunk Output in Console.
It is also possible to pass options to each block of R code to modify its behavior.
Remember that a block of code looks like this:
```{r}
x <- 1:5
The options of a code block are to be placed inside the braces
{r}
, with a comma separating each option.
The first possibility is to give a name to the block. This
is indicated directly after the r
:
{r block_name}
It is not mandatory to name a block, but it can be useful in the event of a compilation error, to identify the block that caused the problem. Be careful, you cannot have two blocks with the same name.
In addition to a name, a block can be passed a series of options in
the form option=value
. Here is an example of a block with a
name and options:
```{r blockName, echo = FALSE, warning = TRUE}
x <- 1:5
And an example of an unnamed block with options:
```{r echo = FALSE, warning = FALSE}
x <- 1:5
One of the useful options is the echo
option. By default
echo
is TRUE
, and the block of R code is
inserted into the generated document, like this:
<- 1:5
x print(x)
## [1] 1 2 3 4 5
But if we set the echo=FALSE
option, then the R code is
no longer inserted into the document, and only the result is
visible:
## [1] 1 2 3 4 5
Here is a list of some of the available options:
Option | Values | Description |
---|---|---|
echo | TRUE/FALSE | Show (or hide) this R code chunk in the resulting knitted document |
eval | TRUE/FALSE | Run (or not) the code in this code chunk in the resulting knitted document |
include | TRUE/FALSE | Combines the options “echo and eval”; either show and run, or hide and don’t run |
message | TRUE/FALSE | Show (or hide) any system messages generated by running this code chunk in the resulting knitted document |
warning | TRUE/FALSE | Show (or hide) any warnings generated by running this code chunk in the resulting knitted document |
There are many other options described in particular in R Markdown reference guide{target = “_blank”} (PDF in English).
It is possible to modify the options manually by editing the header of the code block, but you can also use a small graphical interface offered by RStudio. To do this, simply click on the gear icon located to the right of the header line of each block:
You can then modify the most common options, and click on Apply to apply them.
You may want to apply an option to all the blocks in a document. For example, one may wish by default not to display the R code of each block in the final document.
You can set an option globally using the
knitr::opts_chunk$set()
function. For example, inserting
knitr::opts_chunk$set(echo = FALSE)
into a code block will
set the echo = FALSE
option to default for all subsequent
blocks.
In general, we place all these global modifications in a special
block called setup
and which is the first block of the
document:
```{r, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
It is also possible to write code chunks embedded in the text. If you go to Source mode and type
“The sum of a pair of 2s is ` r 2+2 `”
and then knit the RMD, the resulting document will evaluate the r code between the backticks. Note that you have to include the “r” at the beginning of your inline code chunk to get it to recognize it as R code.
You could also pass variables around your document just like in a regular R program. For example, on one line you could run,
``` {r} max_height <- max(women$height) ```
“The maximum height in the women data set is ` r max_height ` .”
The advantages of such a system are numerous:
a single document can show your entire analysis workflow, since the code, results and text explanations are included
the document can be very easily regenerated and updated, for example if the source data has been modified.
the variety of output formats (HTML, PDF, Word, slides, dashboards, etc.) makes it easy to present your work to others.
There are a number of ways for R Markdown Documents to show data tables. To start, you can see how our RMD displays a table with no formatting:
women
## height weight
## 1 58 115
## 2 59 117
## 3 60 120
## 4 61 123
## 5 62 126
## 6 63 129
## 7 64 132
## 8 65 135
## 9 66 139
## 10 67 142
## 11 68 146
## 12 69 150
## 13 70 154
## 14 71 159
## 15 72 164
It looks pretty basic. Next, to follow along you’ll want to load the following packages:
::p_load(flextable, gt, reactable) pacman
Flextable is better for showing simple tables supported by many formats. GT is better for showing complex tables in HTML documents. Reactable is better for showing very large tables in HTML by giving your audience the option to scroll through the tables.
"This is a flextable"
## [1] "This is a flextable"
::flextable(women) flextable
height | weight |
---|---|
58 | 115 |
59 | 117 |
60 | 120 |
61 | 123 |
62 | 126 |
63 | 129 |
64 | 132 |
65 | 135 |
66 | 139 |
67 | 142 |
68 | 146 |
69 | 150 |
70 | 154 |
71 | 159 |
72 | 164 |
"This is a GT table"
## [1] "This is a GT table"
::gt(women) gt
height | weight |
---|---|
58 | 115 |
59 | 117 |
60 | 120 |
61 | 123 |
62 | 126 |
63 | 129 |
64 | 132 |
65 | 135 |
66 | 139 |
67 | 142 |
68 | 146 |
69 | 150 |
70 | 154 |
71 | 159 |
72 | 164 |
"This is a reactable"
## [1] "This is a reactable"
::reactable(women) reactable
You can see many other types of table formats people have created at https://www.rstudio.com/blog/rstudio-table-contest-2022/
We have seen here the production of “classic” documents, but R Markdown allows you to create many other things.
The extension’s documentation site offers a gallery of the different possible outputs. You can create slides, websites or even entire books, like this document.
An interesting use is the creation of slideshows for presentations in the form of slides. The principle remains the same: we mix text in Markdown format and R code, and R Markdown transforms everything into presentations in HTML or PDF format. In general, the different slides are separated at certain heading levels.
Some slide templates are included with R Markdown, including:
ioslides
and Slidy
for HTML
presentationsbeamer
for PDF presentations via
LaTeX
When you create a new document in RStudio, these templates are accessible via the Presentation entry:
Other extensions, which must be installed separately, also allow slideshows in various formats. These include in particular:
Once the extension is installed, it generally offers a starting template when creating a new document in RStudio. These are accessible from the From Template entry.
There are also different templates allowing you to change the format and presentation of the generated documents. A list of these formats and their associated documentation can be accessed from the formats documentation page.
Note in particular:
Finally, the rmdformats extension offers several HTML templates particularly suitable for long documents.
Again, most of the time, these document templates offer a starting template when creating a new document in RStudio (entry From Template):
The following resources are all in English…
The book R for data science, available online, contains a chapter dedicated to R Markdown.
The extension’s official site contains very complete documentation, both for beginners and for advanced users.
Finally, the RStudio help (Help menu then Cheatsheets) provides access to two summary documents: a synthetic “cheat sheet” (R Markdown Cheat Sheet) and a more complete “reference guide” ( R Markdown Reference Guide).
Now we can put the tools we just used to work!
First, create a new R Markdown Project in RStudio.
Then open a web browser, go to https://bit.ly/view-ebola-data , and download the CSV. Here are the data you need.
Open your “downloads” folder, and copy or move the CSV to the “data” folder of your new project.
Create a new R Markdown Document in this project.
Open a web browser, go to https://tinyurl.com/ebola-script , highlight the code
(under “ebola-script”, starting with
1 # Ebola Sierra Leone analysis
and ending with 29 num_vars_plot
), and copy that text.
Paste the text into your new RMD under the setup
chunk.
We’re going to need to find the data, so paste “data/ebola_sierra_leone.csv” inside the readcsv function.
Next we’ll try running through the code to make sure it works. Click the Knit button above.
It didn’t work! One small difficulty with R Markdown is that it has a
very limited vision, and only looks in its own folder. You need to tell
it more explicitly where to look. See the new package in the
# Load packages
section, here
? Here will help
us change the frame of reference so we can access our data. Change our
previous attempt to readcsv
to the following:
ebola_sierra_leone <- readcsv(here("data/ebola_sierra_leone.csv"))
Now if you Knit, everything should run nicely. However, the document still looks disorganized, with visible code fragments and very basic displays. But we have the tools to change it!
Move the # Load packages ----
line and all the package
loading below it up into the setup code chunk. If you try
Knitting again you’ll see a bunch of messages spitting out from
all these load statements. We don’t want everyone to see that, so add
, message=FALSE
to your setup code chunk.
It’s also common to do all your data loading in one place, so move your
# Load data ----
and the readcsv
line up into
the setup code chunk as well.
Next let’s break this document apart into various sections.
Cut the # Cases by district —-
line, and paste it into
the white space between the code chunks. We probably don’t want to show
the parts where we’re adding data to the tabyl, so in this code chunk
add echo = false
.
Remember how we inserted code chunks earlier? We can use that same command to break code chunks apart, too. Break this code block apart into three individual chunks by using Command+Option+i or CTRL+Alt+i between each section. Pull the other two headers outside of the code.
Knit again and see what happens! You can play around with options here to see how they change the results.
As an example you can go back to the source document and use one of
the new tables you learned about. Add flextable,
inside the
p_load(
function, and try to display
district_tab
as a flextable.
Knit once more. You did it!