Scroll down to below this first code chunk.
R, like other programming languages, stores information (such as data) in objects, which are given labels so that we can refer to them as we are working.
Some kinds of objects are numbers, text strings, and the logical values TRUE and FALSE (note the all caps). There are also objects called vectors, which are like lists whose entries can be text strings, numbers, or logical values.
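As a small illustration of these kinds of objects, the sketch below builds one vector of each kind with c(), R's built-in function for combining values, and asks R what kind of entries each one holds. The particular values are made up for illustration.

```r
# One vector of each kind (the values are made up for illustration)
c(2, 3, 5)
c("red", "green", "blue")
c(TRUE, FALSE, TRUE)

# class() reports what kind of entries a vector holds
class(c(2, 3, 5))
class(c("red", "green", "blue"))
class(c(TRUE, FALSE, TRUE))
```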
When we want to use an object a lot (such as a numeric value, like a mean, from some statistical computation), it is helpful to give it a name so we can refer to it by what it represents, instead of by its value.
We can give a name to an object using an expression of the form name <- value.
This process is called assignment, because we are “assigning” the value to a container with the name we’ve chosen. The named thing is called a variable (which means something a bit different than a variable in the statistical sense, although a variable in code can refer to a statistical variable).
For example:
my.name <- "Colin Dawson"
You can read the <- symbol as “gets”, as in “(the name) my.name gets (the value) "Colin Dawson"”. Notice that there is just one hyphen in the arrow. A common error is to add an extra hyphen to the arrow; R will read that second hyphen as a minus sign applied to the value.
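To see why the extra hyphen matters, here is a minimal sketch using a numeric value. The variable name my.number is hypothetical and chosen only for this illustration.

```r
# Correct assignment: one hyphen in the arrow
my.number <- 5
my.number        # the stored value is 5

# Extra hyphen: R reads this as  my.number <- (-5)
my.number <-- 5
my.number        # the stored value is now -5
```

With a numeric value R raises no error at all, which is what makes this mistake easy to overlook.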
It is also legal to use underscores and digits in variable names, but none of these can be used at the beginning of a name.
## digits are ok (after the first character)
variable3 <- 57
## underscores are ok
variable_3 <- 57
## spaces are not
## my Name <- "Colin Dawson" # gives an error
## can't start a variable name with a digit
## 3rdvariable <- 57 # not a valid variable name
We can also store the result of a command in a named variable. A simple example is the following:
Now if I type the name of the new variable at the console, or refer to it by itself in a chunk, R will print out its contents:
## [1] 5
The 1 in brackets is there to indicate that the next value shown is the first entry in the variable called myResult. (Note that if you try to access the variable MyResult, you will get an error, because you defined it with a lower case “m”.) In this case the variable has only one entry, but sometimes we will hold lists of data or other values in a variable.
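The following sketch puts these pieces together. The expression 2 + 3 is an assumption, chosen only because it produces the value 5 shown above; any command that returns 5 would behave the same way.

```r
# Store the result of a command (2 + 3 is an assumed example)
myResult <- 2 + 3

# Referring to the variable by itself prints its contents
myResult

# Names are case sensitive: myResult exists, MyResult does not
exists("myResult")
exists("MyResult")
```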
We can also use variables as the values of arguments to a function.
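As one possible illustration, the sketch below passes two variables to R's built-in round() function. The variable names and values are hypothetical and chosen only for this example.

```r
# Hypothetical variables, chosen only for illustration
someValue <- 3.14159
nDigits   <- 2

# Both arguments are supplied as variables rather than as literal values
roundedValue <- round(someValue, digits = nDigits)
roundedValue
```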
Notice that if we run the chunk that defines these variables, we will see them appear in the Environment tab in the upper right pane. This shows us everything we’ve defined.
The Environment tab will also contain any variables that we defined in other documents, at the console, or in chunks that we’ve since deleted. This can cause problems, because variables can wind up referring to things that don’t exist in the current document, or to things that should have a different value in the current document.
Fortunately, when we Knit our document, the rendering program ignores the interactive environment and creates its own encapsulated environment that only contains variables we’ve defined in the current document (and similarly, only allows us to use datasets and functionality from packages that have been loaded in our document).
This means that we can only use variables in our document that have been defined prior to the point when we refer to them. If we try to use a variable above the chunk where it’s defined, it may work when we’re running chunks interactively (provided we’ve previously run the chunk where it’s defined), but it won’t work when we try to Knit. This is another reason why Knitting every so often is a good idea, since it helps us catch errors in our document that we might otherwise miss.
Define a variable called theAnswer at the console (rather than in a code chunk), and assign it the value 42. Then, create a code chunk that refers to theAnswer in an expression that computes twice the answer. What should happen is that the chunk will run fine when you just try to run it by itself, but if you try to Knit you’ll get an error.
Fix the error by defining theAnswer within the document.
Most of what we do in R consists of applying functions to data objects, specifying some options for the function, which are called arguments. The application of a function, together with its arguments, is called a command.
Many of the commands in R look a lot like functions from math class; that is, invoking R commands means supplying a function with some number of “inputs” (the arguments), which yields some kind of output, much as the sin() function in math takes a number as input and returns another number corresponding to its trigonometric sine.
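R has a built-in sin() function that works this way. A minimal sketch, noting that R measures the input angle in radians and that pi is a built-in constant:

```r
# One input (an angle in radians), one output (its sine)
sin(0)
sin(pi / 2)
```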
A useful analogy is that commands are like sentences, where the function is the verb, and the arguments (one of which usually specifies the data object) are the nouns.
There is often a “main” argument that comes first. This is like the direct object of the command.
For example, in the English command, “Draw a picture for me with some paint”, the verb “draw” acts like the function (what is the listener supposed to do?); the noun “picture” is the direct object (draw what?), and “me” and “paint” are extra (in this case, optional) details, that we might call the “recipient” and the “instrument”.
In the grammar of R, I could write this sentence like:
draw("picture", recipient = "me", material = "paint")
We are applying the function draw() to the object "picture", and adding some additional detail about the recipient and material. Here the function is called draw, and we have a main argument with the value "picture", and additional arguments recipient and material with the values "me" and "paint", respectively.
Technically speaking, "picture" is the value of an argument too; we might have written its name out explicitly, just as we did for recipient and material. However, in practice, there is often a required first “main” argument whose name is left out of the command.
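R has no built-in draw() function, so the sketch below defines a made-up one purely to make the analogy runnable. The argument name object and the default values are assumptions introduced for this illustration.

```r
# A made-up draw() function; it only assembles a sentence so that
# the command has something to return
draw <- function(object, recipient = "someone", material = "a pencil") {
  paste("Drawing a", object, "for", recipient, "with", material)
}

# Main argument given by position, the others by name
draw("picture", recipient = "me", material = "paint")

# The same command with the main argument's name spelled out
draw(object = "picture", recipient = "me", material = "paint")
```

Both calls return the same text, because R matches an unnamed first value to the first argument of the function.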
In R, arguments always go inside parentheses, and are separated by commas when there is more than one. For arguments whose names are explicitly given, the name goes to the left of the =, and the value goes to the right.
The command
log(100, base = 10)
finds the logarithm of the number 100, using base 10 log.
We are applying the log() function to the value 100 and modifying the behavior of log() through the optional argument base, which in this case specifies what kind of logarithm we want.
## [1] 6.643856
## [1] 4.60517
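The two outputs above are consistent with taking the log of 100 in base 2 and with leaving out the base argument entirely. The commands below are a reconstruction under that assumption, not necessarily the ones originally used.

```r
log(100, base = 10)   # base 10 logarithm of 100
log(100, base = 2)    # base 2 logarithm of 100
log(100)              # no base given, so R uses the natural log (base e)
```

Because base is optional, leaving it out does not cause an error; R falls back on its default, the natural logarithm.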
As we have seen, when we apply a function to some arguments, it produces a result. If we simply call the function, most of the time the result is just printed out. But often we want to refer to or use that result later. In that case we can assign the result of the function call to a “container”; that is, to a named variable.
For example, if I have a variable called Income, I might want to compute and store the log of that income variable:
logIncome <- log(Income, base = 10)
Now logIncome is a variable whose value is the log (base 10) of the Income value.
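Here is a small sketch of why storing the result is useful. The Income values are made up for illustration; the point is that logIncome can be used again in a later command.

```r
# Hypothetical income values, made up for illustration
Income <- c(1000, 10000, 100000)

# Store the base 10 log instead of just printing it
logIncome <- log(Income, base = 10)
logIncome

# Use the stored result later: raising 10 to these powers recovers Income
10^logIncome
```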
.html file in the ~/stat213/turnin/R-orientation/ folder.
It’s useful to record some information about how your file was created at the very end of the file. I will typically include the following ‘footer’ in the templates I provide you.
mosaic package version: 1.5.0
tidyverse package version: 1.3.1
## R version 3.6.0 (2019-04-26)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.5 purrr_0.3.4
## [5] readr_1.4.0 tidyr_1.1.3 tibble_3.1.1 ggplot2_3.3.3
## [9] tidyverse_1.3.1
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.1.1 xfun_0.19 haven_2.4.1 colorspace_1.4-1
## [5] vctrs_0.3.8 generics_0.0.2 htmltools_0.3.6 yaml_2.2.0
## [9] utf8_1.1.4 rlang_0.4.11 pillar_1.6.0 glue_1.4.2
## [13] withr_2.4.2 DBI_1.0.0 dbplyr_2.1.1 modelr_0.1.8
## [17] readxl_1.3.1 lifecycle_1.0.0 munsell_0.5.0 gtable_0.3.0
## [21] cellranger_1.1.0 rvest_1.0.0 evaluate_0.14 knitr_1.25
## [25] curl_3.3 fansi_0.4.0 broom_0.7.6 Rcpp_1.0.2
## [29] scales_1.0.0 backports_1.1.4 jsonlite_1.7.2 fs_1.3.1
## [33] hms_1.0.0 digest_0.6.21 stringi_1.4.3 grid_3.6.0
## [37] cli_2.5.0 tools_3.6.0 magrittr_2.0.1 crayon_1.4.1
## [41] pkgconfig_2.0.3 ellipsis_0.3.0 xml2_1.3.2 reprex_2.0.0
## [45] lubridate_1.7.10 assertthat_0.2.1 rmarkdown_2.5 httr_1.4.2
## [49] rstudioapi_0.13 R6_2.4.0 compiler_3.6.0
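A footer like the one above could be produced by a chunk along the following lines. The exact code used in the course templates is not shown here, so this is an assumption based on R's built-in sessionInfo() and packageVersion() functions.

```r
# Print the R version, platform, and loaded packages for this session
sessionInfo()

# Report the installed version of a single package. "stats" ships with R;
# substitute a package such as "mosaic" if it is installed
packageVersion("stats")
```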