Unit II Assignment RCH
A Quick Tour of the R Commander
This chapter introduces the R Commander graphical user interface (GUI) by demonstrating its use for a simple problem: constructing a contingency table to examine the relationship between two categorical variables. In developing the example, I explain how to start the R Commander, describe the structure of the R Commander interface, show how to read data into the R Commander, how to modify data to prepare them for analysis, how to draw a graph, how to compute numerical summaries of data, how to create a printed report of your work, how to edit and re-execute commands generated by the R Commander, and how to terminate your R and R Commander session—in short, the typical work flow of data analysis using the R Commander. I also explain how to customize the R Commander interface.
In the course of this chapter, you’ll get an overview of the operation of the R Commander. Later in the book, I’ll return in more detail to many of the topics addressed in the chapter.
I assume that you have installed R and the Rcmdr package, as described in the preceding chapter. As well, if you haven’t read Chapter 1, now is a good time to do so—Chapter 1 explains some typographical conventions used in this book, discusses the general characteristics and origin of R and the R Commander, and introduces the web site for the book.
Start R in the normal manner for your computer, for example, by double-clicking on the R desktop icon in Windows, by double-clicking on R.app in the Mac OS X Applications folder, or by clicking on the R icon in the Mac OS X Launchpad. 1 On a Linux or Unix machine, you’d normally start R by typing R at the command prompt in a terminal window.
Once R starts up, type the command library(Rcmdr) at the R > command prompt, and then press the Enter or Return key. This command should load the Rcmdr package and—after a brief delay—start the R Commander GUI, as shown in Figure 3.1 for Windows 2 or Figure 3.2 for Mac OS X. If you encounter a problem in starting R or the R Commander, see the sections on troubleshooting in Chapter 2 ( Section 2.2.1 for Windows, 2.3.4 for Mac OS X, or 2.4.1 for Linux/Unix).
Under Windows, the R Commander (Figure 3.1) looks like a standard program. In contrast, under Mac OS X (Figure 3.2), the R Commander has its own main menu bar, unlike a standard application, which would use the menu bar at the top of the Mac OS X desktop. 3
As you can see, the main R Commander window looks very similar under Windows and Mac OS X. After this introductory chapter, I will show R Commander dialog boxes as they appear under Windows 10. As well, all dialogs and graphs in the text are rendered in monochrome (gray-scale) rather than in color. 4
At the top of the R Commander window there is a menu bar with the following top-level menus:
File contains menu items for opening and saving various kinds of files, and for changing the R working directory—the folder or directory in your file system where R will look for and write files by default.
Edit contains common menu items for editing text, such as Copy and Paste, along with specialized items for R Markdown documents (discussed in Section 3.6.2).
Data contains menu items and submenus for importing, exporting, and manipulating data (see in particular Sections 3.3 and 3.4, and Chapter 4).
Statistics contains submenus for various kinds of statistical data analysis (discussed in several subsequent chapters), including fitting statistical models to data (Chapter 7).
Graphs contains menu items and submenus for creating common statistical graphs (see in particular Chapter 5).
Models contains menu items and submenus for performing various operations on statistical models that have been fit to data (see Chapter 7).
Distributions contains a menu item for setting the R random-number-generator seed for simulations, and submenus for computing, graphing, and sampling from a variety of common (and not so common) statistical distributions (see Chapter 8).
Tools contains menu items for loading R packages and R Commander plug-in packages (see Chapter 9), for setting and saving R Commander options (see Section 3.9), for installing optional auxiliary software (see Section 2.5), and, under Mac OS X, for managing app nap for R.app (see Section 2.3.3).
Help contains menu items for obtaining information about the R Commander and R, including links to a brief introductory manual and to the R Commander and R web sites; information about the active data set; and a link to a web site with detailed instructions for using R Markdown to create reports (see Section 3.6).
The complete R Commander menu tree is shown in the appendix to this book (starting on page 199).
FIGURE 3.1: The R Console and R Commander windows at startup under Windows 10.
FIGURE 3.2: The R.app and R Commander windows at startup under Mac OS X.
Below the menus is a toolbar, with a button showing the name of the active data set (displaying <No active dataset> at startup), buttons to edit and view the active data set, and a button showing the active statistical model (displaying <No active model> before a statistical model has been fit to data in the active data set). The Data set and Model buttons may also be used to choose from among multiple data sets and associated statistical models if more than one data set or model resides in the R workspace—the region of your computer’s main memory where R stores data sets, statistical models, and other objects.
Below the toolbar there is a window pane with two tabs, labelled respectively R Script and R Markdown, that collect the R commands generated during your R Commander session. The contents of the R Script and R Markdown tabs can be edited, saved, and reused (as described in Section 3.6), and commands in the R Script tab can be modified and re-executed by selecting a command or commands with the mouse (left-click and drag the mouse cursor over the command or commands) and pressing the Submit button below the R Script tab. If you know how, you can also type your own commands into the R Script tab and execute them with the Submit button (see Section 3.7). 5 The R Markdown tab, initially behind the R Script tab, also accumulates the R commands that are generated during a session, but in a dynamic document that you can edit and elaborate to create a printed report of your work (as described in Section 3.6.2).
The R Commander Output pane appears next: The Output pane collects R commands generated by the R Commander along with associated printed output. The text in the Output pane is also editable, and it can be copied and pasted into other programs (as described in Section 3.6.1).
Finally, at the bottom of the R Commander window, the Messages pane records messages generated by R and the R Commander—numbered and color-coded notes (dark blue), warnings (green), and error messages (red). For example, the startup note indicates the R Commander version, along with the date and time at the start of the session.
Once you have started the R Commander GUI, you can safely minimize the R Console window—this window occasionally reports messages, such as when the R Commander causes other R packages to be loaded, but these messages are incidental to the use of the R Commander and can almost always be safely ignored. 6
3.3 Reading Data into the R Commander
Statistical data analysis in the R Commander is based on an active data set in the form of an R data frame. A data frame is a rectangular data set in which the rows (running horizontally) represent cases (often individuals) and the columns (running vertically) represent variables descriptive of those cases. Columns in data frames can contain various forms of data—numeric variables, character-string variables (with values such as “Yes”, “No”, or “Maybe”), logical variables (with values TRUE or FALSE), and factors, which are the standard representation of categorical data in R. Typically, data frames used in the R Commander consist of numeric variables and factors, and character and logical variables, if present, are treated as factors.
R and the R Commander permit you to have as many data frames in your workspace as will fit, 7 but only one is active at any given time. You can read data into data frames from several sources using the R Commander menus: 8 See the Data > Import data submenu, and the Data > Data in packages > Read data set from an attached package menu item and associated dialog. If more than one data frame resides in your workspace, you can choose among them by pressing the Data set button in the toolbar or via the menus: Data > Active data set > Select active data set.
One convenient source of data is a plain-text (“ASCII”) file with one line per case, variable names in the first line, and values in each line separated by a simple delimiter such as spaces or a comma. An example of a plain-text data file with comma-separated values, GSS.csv, is shown in Figure 3.3. 9
The data in the file GSS.csv are drawn from the U.S. General Social Survey (GSS), and were collected between 1972 and 2012. The GSS is a periodic cross-sectional sample survey of the U.S. population conducted by the National Opinion Research Center at the University of Chicago. Many of the questions in the GSS are repeated in each survey, while other questions are repeated at intervals. To compile the GSS data set, I selected instances of the GSS that asked the question, “There’s been a lot of discussion about the way morals and attitudes about sex are changing in this country. If a man and a woman have sex relations before marriage, do you think it is always wrong, almost always wrong, wrong only sometimes, or not wrong at all?” I also included information about the year of the survey, and the respondents’ gender, education, and religion. Table 3.1 shows the definition of the variables in the GSS data set.
FIGURE 3.3: The GSS.csv file, with comma-delimited data from the U.S. General Social Survey from 1972 to 2012. Only a few of the 33,355 lines in the file are shown; the widely spaced ellipses (…) represent elided lines. The first line in the file contains variable names.
TABLE 3.1: Variables in the GSS data set.
| Variable | Values |
| year | numeric, year of survey, between 1972 and 2012 |
| gender | character, female or male |
| premarital.sex | character, always wrong, almost always wrong, sometimes wrong, or not wrong at all |
| education | character, less than high school, high school, or post-secondary |
| religion | character, Protestant, Catholic, Jewish, other, or none |
This is a natural point at which to explain how objects, including data sets and variables, are named in R: Standard R names are composed of lower- and upper-case letters (a–z, A–Z), numerals (0–9), periods (.), and underscores (_), and must begin with a letter or a period. As well, R is case sensitive; so, for example, the names education, Education, and EDUCATION are all distinct.
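The naming rules can be tried out at the R command prompt. The following is a minimal sketch in base R: the values assigned to the three names are made up for illustration, and make.names() is used only to show how R itself would repair an illegal name such as GSS data.

```r
# Names that differ only in case refer to different objects.
education <- "lower case"
Education <- "capitalized"
EDUCATION <- "upper case"
same_object <- identical(education, Education)   # FALSE: two distinct objects

# make.names() converts arbitrary strings into syntactically valid R names.
# The embedded blank in "GSS data" is replaced by a period, and a name that
# starts with a numeral is given a leading letter.
legal <- make.names(c("GSS", "GSS data", "premarital.sex", "2012data"))
print(legal)
```

Names that are already legal, such as GSS and premarital.sex, pass through make.names() unchanged.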
In order to keep this introductory example as simple as possible, when I compiled the GSS data set from the original source, I eliminated cases with missing values for any of the four substantive variables (of course, there were no missing values for the year of the survey). In R, missing values are represented by NA (“not available”), and in the R Commander, NA is the default missing-data code for text-data input, although another missing-data code (such as ?, ., or 99) can be specified. This and some other complications and variations are discussed in Chapter 4 on reading and manipulating data in the R Commander.
To read simply formatted data in plain-text files into the R Commander, you can use Data > Import data > from text file, clipboard, or URL. As the name of this menu item implies, the data can be copied to the clipboard (e.g., from a suitably formatted spreadsheet) or read from a file on the Internet, but most often the data will reside in a file on your computer.
The resulting dialog box is shown in Figure 3.4. This is a comparatively simple R Commander dialog box—for example, it doesn’t have multiple tabs—but it nevertheless illustrates several common elements of R Commander dialogs:
FIGURE 3.4: The Read Text Data dialog as it appears on a Windows computer (left) and under Mac OS X (right).
• There is a check box to indicate whether variable names are included with the data, as they are in the GSS.csv data file.
• There are radio buttons for selecting one of several choices—here, where the data are located, how data values are separated, and what character is used for decimal points (e.g., commas are used in France and the Canadian province of Québec).
• There are text fields into which the user can type information—here, the name of the data set, the missing-data indicator, and possibly the data-field separator.
I’ve taken all of the defaults in this dialog box, with the following two exceptions: I changed the default data set name, which is Dataset, to the more descriptive GSS. Recall the rules, explained above, for naming R objects. For example, GSS data, with an embedded blank, would not be a legal data set name. I also changed the default field separator from White space (one or more spaces or a tab) to Commas, as is appropriate for the comma-separated-values file GSS.csv.
The Read Text Data dialog also has buttons at the bottom that are standard in R Commander dialogs:
• The Help button opens an R help page in a web browser, documenting either the use of the dialog or the use of an R command that the dialog invokes. In this case, pressing the Help button opens up the help page for the R read.table function, which is used to input simple plain-text data. R help pages are hyper-linked, so clicking on a link will open another, related help page in your browser. (Try it!)
FIGURE 3.5: The Open file dialog with the data file GSS.csv selected.
• Pressing the OK button generates and executes an R command (or, in the case of some dialogs, a sequence of R commands). 10 These commands are usually entered into the R Script and R Markdown tabs, and the commands and associated printed output appear in the Output pane. If graphical output is produced, it appears in a separate R graphics-device window.
Clicking OK in the Read Text Data dialog brings up a standard Open file dialog box, as shown in Figure 3.5 . I navigated to the location of the data file on my computer and selected the GSS.csv file. Notice that files of type .csv, .txt, and .dat (and their upper-case analogs) are listed by default—these are common file types associated with plain-text data files.
Clicking OK causes the data to be read from GSS.csv, creating the data frame GSS, and making it the active data set in the R Commander. The read.table command invoked by the dialog converts character data in the input file to R factors (here, the variables gender, premarital.sex, education, and religion).
• Clicking the Cancel button simply dismisses the Read Text Data dialog.
As is apparent, the order of the buttons at the bottom of the dialog box is different in Windows and in Mac OS X, reflecting differing GUI conventions on these two computing platforms.
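The choices made in the Read Text Data dialog correspond to arguments of the read.table function. The sketch below shows a call of roughly that form. The three data rows are invented purely for illustration and are not taken from GSS.csv, the temporary file stands in for the file selected in the Open file dialog, and the command that the R Commander actually generates may differ in its details.

```r
# Write a tiny comma-separated file with variable names in the first line.
# These rows are made up for illustration; they are not real GSS records.
csv_lines <- c(
  "year,gender,premarital.sex,education,religion",
  "1972,female,not wrong at all,high school,Protestant",
  "1985,male,always wrong,less than high school,Catholic",
  "2012,female,sometimes wrong,post-secondary,none"
)
path <- tempfile(fileext = ".csv")
writeLines(csv_lines, path)

# header = TRUE            variable names are in the first line (the check box)
# sep = ","                values are separated by commas (the radio buttons)
# na.strings = "NA"        the missing-data indicator (a text field)
# dec = "."                the decimal-point character (the radio buttons)
# stringsAsFactors = TRUE  character data are converted to factors
GSS <- read.table(path, header = TRUE, sep = ",", na.strings = "NA",
                  dec = ".", strip.white = TRUE, stringsAsFactors = TRUE)

str(GSS)   # year is numeric; the other four variables are factors
```

A different missing-data code, such as ?, would be passed through na.strings in the same way.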
FIGURE 3.6: The R Commander data-set viewer displaying the GSS data set.
3.4 Examining and Recoding Variables
Having read data into the R Commander from an external source, it’s generally a good idea to take a quick look at the data, if only to confirm that they’ve been read properly. Clicking the View data set button in the R Commander toolbar brings up the data-viewer window shown in Figure 3.6. Variable names remain at the top of the display as the rows are scrolled using the scrollbar at the right of the data-viewer window. Row numbers appear to the left of the data; if the rows of the data set were named, the row names would appear here (and row numbers or names remain at the left if it’s necessary to scroll the data viewer horizontally). You may leave the data-viewer window open on your desktop as you continue to work in the R Commander, or you may close the data viewer. If you leave it open, the data viewer will be automatically updated if you make subsequent changes to the active data set.
Although the GSS data set contains a moderately large number of cases (with n = 33,354 rows), there are only five variables, and so I request a summary of all the variables in the data set, invoked by Statistics > Summaries > Active data set. The result is shown in Figure 3.7:
• R commands generated in the R Commander session are accumulated in the R Script tab (and in the R Markdown tab, which is currently behind the R Script tab and consequently isn’t visible).
• These commands, along with associated printed output, appear in the Output pane; the scrollbar at the right of the pane allows you to examine previous input and output that has scrolled out of view. If some printed material is wider than the pane, you can similarly use the horizontal scrollbar at the bottom to inspect it. The R Commander makes an effort to fit output to the width of the Output pane, but it isn’t always successful.
• Notice that the Messages pane now includes a note about the dimensions of the GSS data set, generated when the data set was read, and which appears below the initial start-up message.
The output produced by the summary(GSS) command includes a “five-number summary” for the numeric variable year, reporting the minimum, first quartile, median, third quartile, and maximum values of the variable, along with the mean. The other variables are factors, and the count in each level (category) of the factor is shown.
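To see what summary() reports for each kind of variable, here is a small sketch that uses a made-up data frame (called toy, a hypothetical name) in place of the GSS data set.

```r
# A small made-up data frame standing in for the GSS data set.
toy <- data.frame(
  year   = c(1972, 1978, 1985, 1994, 2012),
  gender = factor(c("female", "male", "female", "female", "male"))
)

# For the numeric variable: minimum, first quartile, median, mean,
# third quartile, and maximum. For the factor: the count in each level.
summary(toy)

# The same quantities, computed one variable at a time.
year_summary  <- summary(toy$year)
gender_counts <- table(toy$gender)
print(year_summary)
print(gender_counts)
```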
By default, the levels of a factor are ordered alphabetically. This is inconsequential in the case of gender, with levels “female” and “male”, but the levels of premarital.sex and education have natural orderings different from the alphabetic orderings. Although the categories of religion are unordered, I’d still prefer an ordering different from alphabetic, for example, putting the categories “other” and “none” after the others.
I won’t use all of the variables in the GSS data set in this chapter, but to illustrate reordering the levels of a factor, I’ll put the levels of education into their natural order. Clicking on Data > Manage variables in active data set > Reorder factor levels produces the dialog box at the left of Figure 3.8. I select education in the variable list box in the dialog, leave the name for the factor at its default value <same as original>, and keep the Make ordered factor box unchecked. 11 Because the variable name is unchanged, the new education variable will replace the original variable in the GSS data frame, and so the R Commander will ask for confirmation when I click the OK button.
Variable list boxes are a common feature of R Commander dialogs:
• In general, left-clicking on a variable in an R Commander variable list selects the variable.
• If more than one variable is to be selected—which is not the case in the Reorder Factor Levels dialog—you can Ctrl-left-click to choose additional variables—that is, simultaneously hold down the Ctrl ( Control) key on your keyboard and click the left mouse button.
• Ctrl-clicking “toggles” a selection, so if a variable is already selected, Ctrl-clicking on its name will de-select it.
• The Ctrl key is used in the same way on Macs and on PCs, although on a Mac keyboard, the key is named control. You cannot use the Mac command key here instead of control.
• Similarly, Shift-clicking may be used to select a contiguous range of variables in a list: Click on a variable at one end of the desired range and then Shift-click on the variable at the other end.
• Finally, you can use the scrollbar in a variable list if the list is too long to show all of the variables simultaneously, and pressing a letter key scrolls to the first variable whose name begins with that letter. It’s unnecessary to scroll the variable list here because there are only four factors in the data set.
FIGURE 3.7: The R Commander window after summarizing the active data set.
FIGURE 3.8: The Reorder Factor Levels dialog with education selected (left), and the Reorder Levels sub-dialog showing the reordered levels of education (right).
Clicking OK in the Reorder Factor Levels dialog brings up the sub-dialog at the right of Figure 3.8 . I type in the natural order for the educational levels prior to clicking OK.
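In R terms, reordering amounts to recreating the factor with its levels listed in the desired order. The following is a minimal base-R sketch on a few made-up values; the command that the R Commander generates may be written differently.

```r
# A few made-up values of education; factor() orders the levels alphabetically.
education <- factor(c("high school", "post-secondary",
                      "less than high school", "high school"))
levels(education)
# "high school" "less than high school" "post-secondary"

# Recreate the factor, listing the levels in their natural order.
education <- factor(education,
                    levels = c("less than high school", "high school",
                               "post-secondary"))
levels(education)
# "less than high school" "high school" "post-secondary"
```

Only the order of the levels changes; the value recorded for each case is unaffected.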
My aim is eventually to construct a contingency table to explore whether and how attitude towards premarital sex has changed over time. To this end, I’ll group the survey years into decades. As well, because relatively few respondents answered “almost always wrong” to the premarital-sex question, I’ll combine this response category with “always wrong.” Both operations can be performed with the Recode dialog, invoked via Data > Manage variables in active data set > Recode variables. The resulting dialog, completed to recode year into decade, is displayed in Figure 3.9 .
The following syntax is employed in the Enter recode directives box in the dialog:
• A colon (:) is used to specify a range of values of the original numeric variable year.
• Factor levels (such as “1970s”) are enclosed in double quotes.
• The special values lo and hi can be used for the minimum and maximum values of a numeric variable.
• An equals sign (=) associates each set of old values with a level of the factor to be created.
• Because just two surveys were conducted in the 2010s, I decided to include these implicitly with the surveys conducted in the 2000s; an equivalent final recode directive would be else = “2000s”.
• When there is more than one recode directive, as is typical and as is the case here, each directive appears on a separate line; press the Enter or return key on your keyboard when you finish typing each recode directive to move to the next line before typing the next directive.
• Press the Help button (and read Section 4.4.1) to see more generally how recode directives are specified.
FIGURE 3.9: The Recode Variables dialog, recoding year into decade.
In this example, I select the variable year from the variable list. I replace the default variable name (which is variable) with decade. I also leave the box checked to make decade a factor. 12
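The binning of year into decade can be sketched in base R with cut(). This is a substitute technique chosen so that the example runs without additional packages; it is not the command that the Recode Variables dialog generates. The years below are made up, and -Inf and Inf play the role that lo and hi play in recode directives.

```r
# A few made-up survey years, including the endpoints of each decade.
year <- c(1972, 1979, 1980, 1989, 1990, 1999, 2000, 2012)

# Each interval includes its upper break, so 1979 falls in "1970s",
# and every year after 1999 (including the 2010s) falls in "2000s".
decade <- cut(year,
              breaks = c(-Inf, 1979, 1989, 1999, Inf),
              labels = c("1970s", "1980s", "1990s", "2000s"))

table(decade)   # the result is a factor with one level per decade
```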
In addition to the now-familiar Help, OK, and Cancel buttons, there are also Apply and Reset buttons in the Recode Variables dialog:
• Clicking the Apply button is like clicking OK, except that, after generating and executing a command or set of commands, the dialog reopens in its previous state.
• As a general matter, R Commander dialogs “remember” their state from one invocation to the next if that’s sensible—for example, if the active data set hasn’t changed. Pressing the Reset button in a dialog restores the dialog to its pristine state.
After clicking the Apply button, the Recode Variables dialog reappears in its previous state. Because premarital.sex is to be recoded entirely differently from year, I then press the Reset button, and specify the desired recode, shown in Figure 3.10 . I select premarital.sex as the variable to be recoded, and enter premarital as the name of the new factor to be created; I could have used the same name as the original variable, in which case the R Commander would have asked for confirmation. Because I intend to leave the levels sometimes wrong and not wrong at all alone, I don’t have to recode them.
The single recode directive in the dialog changes “almost always wrong” and “always wrong” to “wrong”, with the values of the original factor premarital.sex on the left of = separated by a comma. This recode directive is too long to appear in its entirety in the dialog box, but the scroll bar at the bottom of the Enter recode directives text box allows you to see it; the whole directive is “almost always wrong”, “always wrong” = “wrong”. Because the levels of the new factor premarital are already in their natural order (alphabetically, not wrong at all, sometimes wrong, wrong), I don’t have to reorder them subsequently.
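Combining two levels into one can likewise be sketched in base R. This is only an illustration on made-up values, not the command produced by the dialog: the two old levels are replaced by "wrong", and the factor is then rebuilt so that its levels are in alphabetical order.

```r
# Made-up responses for illustration.
premarital.sex <- factor(c("always wrong", "almost always wrong",
                           "sometimes wrong", "not wrong at all",
                           "always wrong"))

# Replace the two levels to be combined, leaving the other levels alone.
values <- as.character(premarital.sex)
values[values %in% c("almost always wrong", "always wrong")] <- "wrong"

# factor() sorts the distinct values alphabetically to form the levels.
premarital <- factor(values)
levels(premarital)
# "not wrong at all" "sometimes wrong" "wrong"

table(premarital)
```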
FIGURE 3.10: The Recode Variables dialog, recoding premarital.sex into premarital. Because of its length, the recode directive “almost always wrong”, “always wrong” = “wrong” isn’t entirely visible and the scrollbar below the Enter recode directives box is activated.
An advantage of a graphical user interface like the R Commander is that choices usually are made by pointing and clicking, minimizing the necessity to type and thus tending to reduce errors. The R Commander doesn’t entirely eliminate typing, however: You must be careful when typing recode directives, and more generally in the R Commander when you type text into a dialog box. If, for example, you type an existing factor level incorrectly in a recode directive, the directive will have no effect. You have to include spaces, and other punctuation such as commas, if these appear in level names. Also remember that R names are case sensitive.
Notice that I’ve used the Recode dialog for two distinct purposes:
1. I created a factor (decade) from a numeric variable (year) by dissecting the range of the numeric variable into class intervals, often called bins. Binning is useful because it allows me to make a contingency table (in Section 3.5 ), relating attitude towards premarital sex to the date of the survey; there are too many distinct values of year to treat them as separate categories in the contingency table, and in the extreme case of a truly continuous numeric variable, all of the values of the variable may be unique.
2. I combined some categories of a factor (premarital.sex) to create a new factor (premarital). Combining factor categories was useful because one of the levels of premarital.sex, “almost always wrong”, was chosen by relatively few respondents.
FIGURE 3.11: The Bar Graph dialog, showing the Data and Options tabs. The x-axis label, Attitude Towards Premarital Sex, is too long to be visible in its entirety, so the scrollbar below the label is activated.
I could examine the distribution of the recoded variable by selecting Statistics > Summaries > Frequency distributions, picking premarital in the resulting dialog, but—primarily to illustrate drawing a graph—I instead construct a bar graph of the distribution: Selecting Graphs > Bar graph produces the dialog box in Figure 3.11. The same dialog in Mac OS X is shown in Figure 3.12.
As is common in R Commander dialogs, there are two tabs in the Bar Graph dialog box, in this case named Data and Options. I select premarital from the variable list in the Data tab. Were I to click the Plot by groups button, a sub-dialog box would open, permitting me to select a grouping factor, with one bar graph constructed for each group. I type Attitude towards Premarital Sex into the x-axis label box in the Options tab, replacing an automatically generated axis label, denoted by <auto>; because the axis label is longer than the text box, the scrollbar below the label is activated. Clicking OK opens a graphics-device window with the graph shown in Figure 3.13.
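Both the frequency distribution and the bar graph can be produced with a few lines of base R. The sketch below uses made-up responses, and it draws to a null graphics device so that it runs without a display; the command that the Bar Graph dialog generates may differ.

```r
# Made-up values of the recoded factor, for illustration only.
premarital <- factor(c("wrong", "not wrong at all", "sometimes wrong",
                       "not wrong at all", "wrong", "not wrong at all"))

# Frequency distribution: the count in each level of the factor.
counts <- table(premarital)
print(counts)

# Bar graph of the counts with a custom x-axis label.
# pdf(NULL) opens a device that creates no file; in an interactive session,
# omit the pdf() and dev.off() lines to see the graph in a window.
pdf(NULL)
barplot(counts, xlab = "Attitude towards Premarital Sex", ylab = "Frequency")
invisible(dev.off())
```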
FIGURE 3.12: The Bar Graph dialog as it appears under Mac OS X (showing only the Data tab).
FIGURE 3.13: R graphics-device window with a bar graph for recoded attitude towards premarital sex.
3.5 Making a Contingency Table
I’m now ready to construct a contingency table to examine the relationship between the factors decade and premarital. From the R Commander menus, I choose Statistics > Contingency tables > Two-way table, producing the dialog box in Figure 3.14. In the Data tab, shown in the upper panel of the figure, I select the row and column variables for the table, premarital and decade, respectively. The Subset expression box near the bottom of the tab can be used to make a table for a subset of the full data set. For example, entering gender == “male” in this box would restrict the table to male respondents. Notice the double equals sign (==) for testing equality and the use of quotes around the factor level “male”. 13
The lower panel of Figure 3.14 shows the Statistics tab. Here, I press the radio button for Column percentages because decade, the column variable, is the explanatory (“independent”) variable, and premarital, the row variable, is the response (“dependent”) variable in the contingency table: It is standard practice to compute percentages within categories of the explanatory variable so as to make comparisons among these categories. The default in the dialog is No percentages. I leave the Chi-square test box checked—it’s checked by default. 14 Clicking OK produces R commands and associated output in the R Commander Output pane; the commands and output are shown in Figure 3.15.
Examining the percentage table, it’s evident that disapproval of premarital sex has declined over time, and, from the chi-square test, the relationship in the table is highly statistically significant: The p value for the chi-square test statistic in the printout, given as p-value < 2.2e-16, is to be read as p < 2.2 × 10^-16, that is, less than 0.00000000000000022 (15 zeroes to the right of the decimal point followed by 22)—effectively 0. It is common for computer software like R to report very large or (as here) very small numbers in this format, called scientific notation.
In addition to the chi-square test and statistics associated with it (chi-square components and expected frequencies), the Statistics tab in the Two-Way Table dialog includes an option for computing Fisher’s exact test for association in a contingency table.
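The steps carried out by the Two-Way Table dialog can be approximated in base R as follows. The counts are made up for illustration and are not the GSS results, and the R Commander's own commands may use different functions; the sketch relies only on xtabs(), prop.table(), and chisq.test().

```r
# Made-up data: 100 respondents in each of two decades.
responses <- c("wrong", "sometimes wrong", "not wrong at all")
toy <- data.frame(
  decade = factor(rep(c("1970s", "2000s"), times = c(100, 100))),
  premarital = factor(c(rep(responses, times = c(50, 25, 25)),
                        rep(responses, times = c(30, 20, 50)))),
  gender = factor(rep(c("female", "male"), times = 100))
)

# Rows are the response variable, columns the explanatory variable.
tab <- xtabs(~ premarital + decade, data = toy)

# Column percentages: percentages within each category of decade.
col_pct <- 100 * prop.table(tab, margin = 2)
round(col_pct, 1)

# Chi-square test of independence.
test <- chisq.test(tab, correct = FALSE)
test$p.value

# A subset expression restricts the table to some of the cases.
tab_male <- xtabs(~ premarital + decade, data = toy, subset = gender == "male")

# Scientific notation: 2.2e-16 means 2.2 times 10 to the power -16.
format(2.2e-16, scientific = FALSE)
```

Because the percentages are computed within columns, each column of col_pct sums to 100, which makes the decades directly comparable even when they contain different numbers of respondents.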
FIGURE 3.14: The Two-Way Table dialog, Data tab (top) and Statistics tab (bottom).
FIGURE 3.15: Contingency table for attitude towards premarital sex by decade.
I’ve explained how R commands accumulate in the R Script tab as your R Commander session progresses. These commands can be edited and saved in a .R file via the R Commander menus, File > Save script or File > Save script as. As well, at the end of the session, the R Commander offers to save the script (see Section 3.8 ). A saved script can be reloaded into the R Script tab in a subsequent R Commander session, or used in an R editor independent of the R Commander, such as RStudio (discussed briefly in Section 1.4 ).
3.6.1 Creating a Report by Cutting and Pasting
The text in the Output pane is also editable, and you can copy and paste text from the Output pane into a text-editor or word-processor document to create a simple record of your work: Just use the R Commander Edit menu; the right-click context menu, after first clicking in the Output pane; or standard edit-key combinations. 15 If you go this route, however, be sure to use a monospaced (typewriter) font such as Courier, or your R output won’t be properly aligned. The text in the Output pane can also be saved to a file, via File > Save output or File > Save output as.
You can similarly save graphs (such as the bar graph in Figure 3.13 on page 34) from an R graphics device. Under Windows, graphs can be copied to the clipboard and subsequently pasted into a word-processor document or saved in a graphics file and subsequently imported into a document. To save a graph, use either the File menu in the graphics device or the right-click context menu. If you activate History > Recording from the Windows R graphics device, you’ll be able to scroll through graphs in the graphics device via the Page Up and Page Down keys.
On Mac OS X, the R Commander creates graphs in a Quartz graphics-device window. The Quartz graphics device also supports copying to the clipboard via the key-combination command-c, and you can save graphs to PDF files via File > Save As. A graph copied to the clipboard can be pasted into most Mac word processors via command-v; similarly, a graph saved as a PDF file can typically be imported into a word processor document. Plot history is saved by default in the Quartz graphics device, and you can move back and forth among your graphs via the command-← and command-→ key combinations.
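Graphs can also be written to a file by an R command rather than through the graphics-device menus. The sketch below uses the standard pdf and dev.off functions; the file name and the plotted numbers are hypothetical, chosen only so that the example runs on its own.

```r
# Write a graph to a PDF file instead of the on-screen graphics device.
# The file location is a temporary path chosen only for this illustration.
graph_file <- file.path(tempdir(), "example-graph.pdf")

pdf(graph_file, width = 6, height = 4)      # open a PDF device (size in inches)
barplot(c(a = 3, b = 5, c = 2),             # made-up frequencies, just to draw something
        xlab = "Category", ylab = "Frequency")
invisible(dev.off())                        # close the device so the file is written

print(file.exists(graph_file))
```

A PDF file produced this way can be imported into a document in the same manner as one saved from the graphics-device window.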
3.6.2 Creating a Report as a Dynamic Document
In addition to the relatively crude approach of copying and pasting R output and graphs, the R Commander supports writing reports in the simple Markdown mark-up language. 16 Just as R commands accumulate during your session in the R Script tab, they are also written into the R Markdown tab. The advantage of using R Markdown in comparison to cutting and pasting output is that the R Markdown document that you create is a permanent, reproducible record of your work, intermixing executable R commands (essentially, the contents of the R script for your session) with your explanatory text. The resulting R Markdown document is then compiled into a report that includes R commands along with associated printouts and graphs. The contents of the R Markdown tab can also be saved (via File > Save R Markdown file or File > Save R Markdown file as), to be reloaded and reused in a subsequent R Commander session or in a compatible R editor, such as RStudio (see Section 1.4 ).
The R Markdown tab begins with one of two (customizable) R Markdown templates, depending upon whether you have installed the optional auxiliary Pandoc software: 17 If Pandoc is installed, the R Commander uses the newer rmarkdown package (Allaire et al., 2015a) to convert the R Markdown document into a Word file, an HTML file (a “web” page), or (if LaTeX is additionally installed) a PDF file. If Pandoc isn’t installed on your computer, the R Commander uses the older markdown package (Allaire et al., 2015b) to convert the R Markdown document into an HTML file. The initial contents of these alternative R Commander R Markdown templates are shown in Figure 3.16.
Both forms of the R Markdown template begin with title, author, and date fields. As is probably obvious, you should replace the generic title (Replace with Main Title) with your own descriptive title (as in the example below), and Your Name with your name. Leave the date field alone (in both templates a date and time stamp will be generated automatically) unless you want to hard-code the date. For the rmarkdown version of the template, you must retain the pairs of quotes ("…") in the title, author, and date fields.
Both R Markdown templates then include a block of commands to customize the document and to load packages necessary for executing subsequent commands. This command block—which in both templates begins with a line of the form ```{r etc.} and ends with the line ``` (i.e., three back-ticks)—should be left as is (unless you know what you’re doing!).
With few exceptions (such as R commands that require direct user intervention), each time the R Commander generates a command or set of commands, they are entered into an R command block in the R Markdown tab, delimited by ```{r} at the top of the block and ``` at the bottom. Except in two cases, you probably shouldn’t alter these command blocks (again, unless you know what you’re doing):
• You should feel free to delete an entire command block, including the initial ```{r} and terminating ```, either by directly editing the text in the R Markdown tab or via the R Commander menus: Edit > Remove last Markdown command block, which deletes the command block generated by the preceding R Commander action. You may wish to do this, for example, if you generate incorrect or unwanted output. You should be careful, however, to ensure that deleting a command block doesn’t disturb the logic of the R session: For example, you shouldn’t delete a block that reads a data set and retain subsequent blocks that perform computations on the data set.
• You can control the size of graphs drawn in a command block by adding fig.height and fig.width arguments to the initial ```{r} line. For example, ```{r fig.height=4, fig.width=6} sets figure height to 4 inches and width to 6 inches.
As in the R Script tab, you can edit text in the R Markdown tab by typing in the tab, by using the Edit menu when the cursor is in the tab, by right-clicking when the cursor is in the tab and selecting edit actions from the resulting context menu, or by using standard edit key combinations. Generally more conveniently, however, you can open an editor window via Edit > Edit R Markdown document, by right-clicking in the R Markdown tab when the cursor is in the tab and selecting Edit R Markdown document from the context menu, or by the key combination Ctrl-e when the cursor is in the R Markdown tab. 18 The R Markdown document editor for the current session is shown in Figure 3.17 .
(a) Template for use with rmarkdown (and requiring Pandoc).
(b) Template for use with markdown (in the absence of Pandoc).
FIGURE 3.16: R Markdown templates used in the R Commander.
FIGURE 3.17: The R Markdown editor window.
The R Markdown editor includes File, Edit, and Help menus, the last providing help both on using R Markdown and on the editor itself, a toolbar with buttons for common editing operations (hover your mouse above the buttons to see “tool tips” describing the action associated with each button) along with a button for generating a report (described below), and Help, OK, and Cancel buttons at the bottom of the window. Clicking OK closes the editor, saving your edits, while clicking Cancel closes the editor and discards your edits. The editor is a “modal” dialog: While the editor window is open, operation of the R Commander is suspended.
You can edit the R Markdown document periodically as your R Commander session develops, inserting explanatory text as you go, or you can edit the document at the end of your session. If you save the R Markdown document during or at the end of your session (via the main R Commander menus, File > Save R Markdown file or File > Save R Markdown file as), you can edit it in any text editor, including the RStudio programming editor (see Section 1.4): The file is saved as a plain-text document with file extension .Rmd.
In editing the R Markdown document, however, you should be very careful only to type text between different R command blocks, each of which, recall, is delimited by ```{r} at the top and ``` at the bottom. Typing arbitrary text within a command block (between the initial ```{r} and terminating ``` of the block) almost inevitably will cause R syntax errors when the commands in the block are executed.
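To make the layout of such a document concrete, here is a minimal sketch that writes a small .Rmd file from R. It is not the R Commander template of Figure 3.16; the title, the file name, and the made-up numbers are all my own assumptions, and the three back-ticks are built with strrep only to keep the listing easy to read.

```r
# Three back-ticks, the delimiter of an R Markdown command block
fence <- strrep("`", 3)

# One element per line of the document. Explanatory text sits BETWEEN
# command blocks; only R commands sit inside them.
rmd_lines <- c(
  "---",
  'title: "Attitudes Towards Premarital Sex"',   # hypothetical title
  'author: "Your Name"',
  "---",
  "",
  "This sentence is ordinary text, typed outside any command block.",
  "",
  paste0(fence, "{r}"),
  "x <- c(2, 4, 6, 8)   # made-up numbers",
  "mean(x)",
  fence,
  "",
  "More explanatory text can follow the block.",
  "",
  paste0(fence, "{r fig.height=4, fig.width=6}"),   # graph size in inches
  "barplot(x)",
  fence
)

# Save the document as a plain-text file with extension .Rmd
rmd_file <- file.path(tempdir(), "example-report.Rmd")
writeLines(rmd_lines, rmd_file)

# Every opening delimiter should have a matching closing delimiter
n_delimiters <- sum(startsWith(readLines(rmd_file), fence))
print(n_delimiters)
```

Counting the delimiter lines is a quick way to catch a block that was opened but never closed after hand-editing.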
An illustrative edited R Markdown document for the current session appears in Figure 3.18 . Because it’s just for purposes of illustration, I’ve kept this document brief and only part of the document is displayed; in a real application, I’d include more descriptive and explanatory text.
Pressing the Generate report button in the editor or, if the editor isn’t open, in the main R Commander window brings up the dialog box in Figure 3.19. If Pandoc isn’t installed on your computer, 19 pressing Generate report will simply create an HTML report without the intervening dialog. I select .html (web page) and click OK. That causes the R Markdown document to be compiled, including running embedded R commands in the various code blocks in an independent R session. The resulting .html file opens in the default web browser, as in Figure 3.20.
3.6.2.1 Using Markdown: The Basics
Markdown is a punningly named, simple text markup language. R Markdown is an extension of Markdown that, as I’ve explained, accommodates embedded, executable R commands. An R Markdown document, stored in a file of type .Rmd, is compiled into a corresponding standard Markdown file of type .md, containing R input and output, including graphs. This Markdown document, in turn, is compiled into a typeset report in one of several formats, such as an HTML web page (i.e., a file of type .html). The R Commander manages the compilation process automatically when a report is generated.
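The first step of this process, from .Rmd to .md, can be sketched outside the R Commander with the knit function in the knitr package. This is only an illustration under the assumption that knitr is installed (the code skips the step otherwise); the file names are hypothetical, and I am not claiming that these are the exact calls the R Commander makes internally.

```r
# Build a tiny R Markdown file to work with (file names are hypothetical)
fence    <- strrep("`", 3)                       # three back-ticks
rmd_file <- file.path(tempdir(), "pipeline-demo.Rmd")
md_file  <- file.path(tempdir(), "pipeline-demo.md")
writeLines(c("Some explanatory text.",
             "",
             paste0(fence, "{r}"),
             "1 + 1",
             fence),
           rmd_file)

# Step 1: .Rmd -> .md, running the R commands and recording their output.
# The step is skipped if the knitr package isn't available.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::knit(rmd_file, output = md_file, quiet = TRUE)
  print(readLines(md_file))
}
```

The resulting .md file is ordinary Markdown, containing both the command and its output, and it is this file that is then converted to HTML or another format in the second step.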
Although the Markdown specification is very simple, it is also very flexible and powerful: As has been said of the children’s programming language LOGO (Papert, 1980), Markdown has a low threshold but a high ceiling. I’m concerned here, however, with very basic use of R Markdown, so we’ll just step over the low threshold. Much more information is available on line at http://rmarkdown.rstudio.com/ , a web site that’s accessible through the R Commander Help menu.
Basic Markdown syntax is illustrated in Figure 3.21 , alongside the corresponding typeset HTML document. Some of this basic Markdown syntax is used in the R Markdown document for the current R Commander session (in Figure 3.18 ).
FIGURE 3.18: An edited R Markdown document for the example in Chapter 3 , showing only part of the document (with elided lines marked by …).
FIGURE 3.19: The Select Output Format dialog for creating a report from the R Commander R Markdown document.
FIGURE 3.20: The compiled HTML report, as it appears in a web browser, with only the top of the page shown.
FIGURE 3.21: Basic Markdown syntax. A Markdown file is shown on the left and the corresponding HTML page on the right.
3.6.2.2 Adding a Little LaTeX Math to an R Markdown Document*
The last line of the document in Figure 3.18 also demonstrates how to embed LaTeX math inside an R Markdown document. Although the details of LaTeX are well beyond the scope of this section, simple LaTeX math is reasonably intuitive. 20
In-line LaTeX math is enclosed in dollar signs ($…$). In the example document, the embedded math $df = 4$, $p < 2.2 \times 10^{-16}$ is typeset as df = 4, p < 2.2 × 10⁻¹⁶.
Similarly, a displayed equation can be specified using double dollar signs ($$). For example,
$$
y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki} + \epsilon_i
$$
would be typeset as the displayed equation
yᵢ = β₀ + β₁x₁ᵢ + ⋯ + βₖxₖᵢ + εᵢ
These simple LaTeX examples illustrate how to use underscores for subscripts, as in y_i and x_{1i} (where the curly braces { and } are used for grouping), and carets for superscripts, as in 10^{-16}. Greek letters are specified as \beta (β), \epsilon (ε), and so on. The LaTeX symbol \cdots is typeset as three centered dots (⋯).
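As one further sketch using the same conventions, the Pearson chi-square statistic could be written as a displayed equation. The symbols O_i and E_i (for observed and expected frequencies) and the commands \sum and \frac are my additions for illustration; they do not appear in the example document.

```latex
$$
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
$$
```

Here \chi is another Greek letter, the caret raises the 2 to a superscript, and \frac{…}{…} takes a numerator and a denominator, each grouped by curly braces.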
You can think of the R Commander R Script tab as a simple programming editor. As I’ve explained, as an interactive R Commander session progresses, the commands generated by the R Commander GUI accumulate in the R Script tab. The resulting R script of commands can be saved in a .R file, to be reloaded into the R Commander in a subsequent session, or into another R programming editor such as RStudio (discussed briefly in Section 1.4), to be modified or re-executed.
You can also type R commands directly into the R Commander Script tab, or modify commands previously generated by the GUI. The R Script tab isn’t a full-featured programming editor, but it does support basic editing functions, such as cut, copy, paste, undo, and so on, via a largely self-explanatory right-click context menu (shown at the left of Figure 3.22 ), the R Commander Edit menu (see Figure A.1 on page 200 ), and standard key-combinations (with the available key bindings listed in Table 3.2 ).
To provide a simple example of R-command editing, I return to the bar graph constructed for attitude towards premarital sex in the GSS data set, displayed in Figure 3.13 (on page 34 ). This graph was created by the R command
with(GSS, Barplot(premarital, xlab="Attitude Towards Premarital Sex", ylab="Frequency"))
which, along with the other commands generated during the current R Commander session, appears in the R Script tab. As a general matter, computing in R is performed by functions, which are called by name with their arguments in parentheses. The arguments may be given by position or may be named, with each named argument associated with a value by an equals sign (=). In this example, two functions are called: with and Barplot.
FIGURE 3.22: R Script tab right-click context menu (left) and R Markdown tab right-click context menu (right).
TABLE 3.2: R Commander edit-key bindings.
| Key-Combination | Action |
|---|---|
| Ctrl-x | cut selection to clipboard |
| Ctrl-c | copy selection to clipboard |
| Ctrl-v | paste selection from clipboard |
| Ctrl-z or Alt-backspace | undo last operation (may be repeated) |
| Ctrl-w | redo last undo |
| Ctrl-f or F3 | open find-text dialog |
| Ctrl-a | select all text |
| Ctrl-s | save file |
| Ctrl-r or Ctrl-Tab | submit (“run”) current line or selected lines (R Script tab) |
| Ctrl-e | open document editor (R Markdown tab) |
Except as noted, these key bindings work in the R Script and R Markdown tabs and in the Output and Messages panes. Under Mac OS X, either the command or control key may be used. On keyboards with “function” keys, the F3 function key may be used as an alternative to Ctrl-f.
• The with function takes two arguments, and both are specified here by position: The first argument is a data set, GSS in the example. The second argument is an expression referencing variables in the data set—in this case, a call to the Barplot function.
• The Barplot function is called with three arguments, the first specified by position and the other two by name: premarital is the variable in the GSS data set to be used for the bar graph; the arguments xlab and ylab are character strings (enclosed in quotes) specifying the horizontal and vertical axis labels.
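The difference between positional and named arguments can be tried out on a small scale. In the sketch below, toy is a made-up data frame standing in for GSS (the real data aren’t reproduced here), and I use the base-R table function in place of Barplot so that the example runs without the R Commander.

```r
# A tiny made-up data set standing in for GSS
toy <- data.frame(
  premarital = factor(c("not wrong", "always wrong", "not wrong", "sometimes wrong"))
)

# Arguments given by position: first the data set, then the expression
by_position <- with(toy, table(premarital))

# The same call with both arguments named (data and expr are the
# argument names of the with function)
by_name <- with(data = toy, expr = table(premarital))

# Without with(), the variable must be qualified by the data set
qualified <- table(toy$premarital)

print(by_position)
```

All three calls count the same values; with simply lets the expression refer to premarital without repeating the name of the data set.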
To see the complete list of arguments for the Barplot function, once again select Graphs > Bar graph from the R Commander menus and press the Help button in the resulting dialog. In addition to an explanation of its arguments, you’ll see that the Barplot function calls the barplot function (with a lower-case b) to draw the graph. 21 Clicking on the hyperlink for barplot brings up the help page for that function.
The barplot function has an optional argument, horiz, which, if set to TRUE, draws the bar graph horizontally rather than vertically. The R Commander Bar Graph dialog, however, makes no provision for this option.
To draw a horizontal bar graph of attitude towards premarital sex, I left-click and drag the mouse over the original Barplot command in the R Script tab, selecting the command, copy this text by Ctrl-c, and paste it by Ctrl-v at the bottom of the script. 22 I then edit the command to draw the horizontal bar graph:
with(GSS, Barplot(premarital, ylab="Attitude Towards Premarital Sex", xlab="Frequency", horiz=TRUE))
box()
In addition to setting horiz=TRUE, I also exchange the xlab and ylab arguments, so that the axes are properly labelled, and I add a second command, a call to the box function (with no arguments 23), to draw a box around the plotting region in the graph. Selecting these commands with the mouse, I press the Submit button, obtaining the modified, horizontal bar graph in Figure 3.23.
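The same modification can be tried without the GSS data or the R Commander. This sketch calls the base-R barplot function directly (rather than Barplot) on made-up frequencies; the object names are my own, and the pdf(NULL) line is there only so that the code also runs where no screen device is available.

```r
# Made-up frequencies (not the GSS counts), used only to have something to plot
freq <- c("always wrong" = 12, "sometimes wrong" = 7, "not wrong" = 21)

# When R isn't running interactively, draw to a null device instead of the screen
if (!interactive()) pdf(NULL)

# horiz = TRUE draws the bars horizontally, so frequency goes on the x-axis
bar_positions <- barplot(freq, horiz = TRUE,
                         xlab = "Frequency",
                         ylab = "Attitude Towards Premarital Sex")

# box() is called with no arguments and frames the plotting region
box()
```

The value returned by barplot gives the position of each bar along the category axis, one number per bar.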
It’s a limitation of the R Commander R Script tab that you have to submit complete commands: You may continue a command over as many lines as necessary, and you may simultaneously submit more than one complete command (as I’ve done in this example), but submitting a partially complete command results in an error. Moreover, submitting a command or commands for execution in this manner from the R Script tab also causes the commands to be entered into the document in the R Markdown tab.
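The distinction between complete and incomplete commands can be illustrated with R’s own parser. This is a sketch of the general idea, not a description of how the R Commander processes submitted text; the command strings are arbitrary examples.

```r
# A complete command may be continued over several lines
complete <- "mean(c(1, 2,\n       3))"
parsed <- parse(text = complete)
print(eval(parsed[[1]]))

# An incomplete command (the closing parentheses are missing) can't be parsed
incomplete <- "mean(c(1, 2,"
result <- tryCatch(parse(text = incomplete), error = function(e) e)
print(inherits(result, "error"))

# Two complete commands can be handled together
both <- parse(text = "x <- 1:3\nsum(x)")
print(length(both))
```

In each case the test is whether the text forms one or more whole R expressions; where the line breaks fall doesn’t matter.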
R command blocks automatically incorporated into the R Markdown tab (discussed in Section 3.6) are similarly modifiable. Moreover, if you know what you’re doing, you can write your own R command blocks. It’s important to pay attention to the order of commands, however: For example, you can’t use a data set in a computation before you input the data.
FIGURE 3.23: Horizontal bar graph of attitude towards premarital sex in the GSS data set. This graph was created by modifying the Barplot command produced by the R Commander Bar Graph dialog.
3.8 Terminating the R Commander Session
In most cases, the simplest and safest method of terminating an R Commander session is to select File > Exit > From Commander and R from the R Commander menus. In addition to closing the R Commander window, you will also end your R session without saving the R workspace. On exit, the R Commander will give you an opportunity to save the contents of the R Script tab, the R Markdown tab, and the Output pane, however.
You can also exit from the R Commander without terminating your R session by selecting File > Exit > From Commander. Again, the R Commander will prompt you to save your work. You can subsequently exit from the R Console via File > Exit on Windows or File > Close on Mac OS X.
Upon exiting from the R Console, R will ask whether you want to save your workspace, with saving the workspace as the default response. You should almost surely not save the workspace. A saved workspace, which will be automatically reloaded in a subsequent R session, may cause the R Commander to fail to function properly. 24
If you exit from the R Commander without closing R, you can restart the R Commander interface by entering Commander() at the R > command prompt. It’s important to spell Commander() with an upper-case C and to include the parentheses. Restarting the R Commander in this manner initiates a fresh session—your previous work doesn’t appear in the R Script tab, the R Markdown tab, or the Output pane. If, however, you saved the script or R Markdown document before exiting from the R Commander, you can reload these documents via the R Commander File menu.
Finally, you can exit from both R and the R Commander by closing the R Console directly. I don’t recommend this procedure, however, because you won’t have a chance to save your work in the R Commander.
3.9 Customizing the R Commander*
The default configuration of the R Commander should be fine for most users, but some aspects of the appearance and behavior of the software can be customized to reflect your preferences and needs. Specific R Commander features are set via the R options command, and can be saved so that they are persistent across R Commander sessions.
3.9.1 Using the Commander Options Dialog
The most convenient way to set many R Commander options is via Tools > Options, which produces the dialog box in Figure 3.24 . The several tabs in this dialog show the default selections, some of which vary from one operating system to another: For example, the default Theme in the Other Options tab is vista on Windows and clearlooks on Mac OS X.
Most of the settings in the Commander Options dialog are self-explanatory, and you can experiment with the settings to see their effects on the R Commander. A few comments on the options are in order, however:
FIGURE 3.24: The Commander Options dialog, with Exit, Fonts, Output, and Other Options tabs.
• Clicking on one of the text color-selection buttons in the Fonts tab brings up a color-selection sub-dialog. See the discussion of colors in Section 3.9.3 .
• To display the R Commander on a screen for a presentation with a data projector, I typically set the Dialog text font size to 14 points and the Script and output font size to 15 points in the Fonts tab. You might try that as a starting point.
• If you want to use LaTeX for creating reports (see Section 3.6), check the Use knitr box in the Output tab. You’ll probably also want to uncheck the Use R Markdown box, unless you wish to create both LaTeX and Markdown documents. For information about using the knitr package with LaTeX, see Xie (2015) and http://yihui.name/knitr/ .
• Possibly starting with the document templates supplied by the Rcmdr package, you can prepare your own R Markdown or knitr LaTeX template and place it anywhere on your file system, indicating the location in the Output tab.
• If you are unable to install the rgl package for 3D dynamic graphs (see Section 5.4.1 ), then you can uncheck the Use rgl package box in the Other Options tab.
• As described in Section 7.2.4, the R Commander uses dummy-coded contrasts created by the contr.Treatment function for factors in linear-model formulas and orthogonal-polynomial contrasts created by the contr.poly function for ordered factors. You can change these choices in the Other Options tab.
• By default, the R Commander orders variables alphabetically in variable list boxes. If you prefer to retain the order in which variables appear in the active data set, uncheck Sort variables names alphabetically in the Other Options tab.
3.9.2 Setting Rcmdr Package Options
R Commander options, including those selected via the Commander Options dialog, are set with the R options command. There are many available options beyond those accessible through the dialog, and you can use the options command directly at the R command prompt to specify them. The general format of this command is options(Rcmdr=list(option.1=setting.1, option.2=setting.2, …, option.n=setting.n)).
For example, to prevent the R Commander from checking at start-up whether all recommended packages are installed, and to cause the RcmdrPlugin.survival package to be loaded when the R Commander starts, 25 enter options(Rcmdr=list(check.packages=FALSE, plugins="RcmdrPlugin.survival")) before loading the Rcmdr package via library(Rcmdr).
To see the full range of available options, type help("Commander", package="Rcmdr") at the R command prompt, or select Help > Commander help from the R Commander menus.
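Because the Rcmdr options are stored as an ordinary R list, you can set them and then inspect what was stored. The following sketch uses only the base-R options and getOption functions, so it runs whether or not the Rcmdr package is installed; the object names old and rcmdr_opts are my own.

```r
# Set the options, remembering any earlier setting so it can be restored
old <- options(Rcmdr = list(check.packages = FALSE,
                            plugins = "RcmdrPlugin.survival"))

# Inspect what was stored
rcmdr_opts <- getOption("Rcmdr")
print(names(rcmdr_opts))
print(rcmdr_opts$check.packages)

# library(Rcmdr)   # would now start the R Commander with these settings

# Restore the previous value (NULL if the option wasn't set before)
options(old)
```

Note that the whole list is replaced each time options(Rcmdr=...) is called, so all of the settings that you want should be given in a single list.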
3.9.3 Managing Colors in the R Commander
Many of the graphics functions employed by the R Commander use the R color palette for color selection. You can modify the palette by selecting Graphs > Color palette from the R Commander menus, producing the dialog box at the top of Figure 3.25, which initially displays the current palette. Typically, the first color in the palette (black in the default palette) is used for most graphical elements, with the remaining colors used successively. You may wish to change the default colors if you are red-green color blind, for example.
FIGURE 3.25: Set Color Palette dialog (top) and the Windows Select a Color sub-dialog (bottom). A color version of this figure appears in the insert at the center of the book.
Pressing one of the color buttons—for example, the second (red) button—brings up the Select a Color sub-dialog shown at the bottom of Figure 3.25. The structure of the color-selection dialog varies by operating system but its use is simple and direct: In the Windows version of the dialog, displayed in the figure, you can select a basic color by left-clicking on it, select a color by clicking in the color-selection box or moving the slider at the right, define a color by hue, saturation, and luminosity, or define a color by the intensity (0–255) of its red, green, and blue additive primary-color components.
R maintains a list of more than 650 named colors. If you choose a color that’s near a named color, then the name appears underneath the color button in the Set Color Palette dialog. As you can see, all eight of the colors in the default palette have names.
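The palette and the named colors can also be examined from the R command prompt. The following is a minimal sketch using the standard palette, colors, and col2rgb functions. It assumes that the palette shown in the dialog is the same one that palette() reports. The eight replacement colors are an assumption of this sketch (hex values from the widely used Okabe-Ito scheme) and are not taken from the R Commander.

```r
# Remember the current palette so that it can be restored afterwards
old_palette <- palette()
print(old_palette)

# Number of color names that R recognizes
print(length(colors()))

# Red, green, and blue intensities (0-255) of a named color
print(col2rgb("red"))

# Replace the palette with eight colors intended to be easier to
# distinguish for readers with red-green color blindness
# (hex values are an assumption of this sketch)
palette(c("#000000", "#E69F00", "#56B4E9", "#009E73",
          "#F0E442", "#0072B2", "#D55E00", "#CC79A7"))
new_palette <- palette()
print(new_palette)

# Restore the original palette
palette(old_palette)
```

Saving and restoring the palette keeps the change from affecting graphs drawn later in the session.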
3.9.4 Saving R Commander Options
R has an elaborate start-up process employing several configuration files. 26 Users who just want to customize the R Commander, however, can cut through the details: If you make selections in the Commander Options dialog or employ the R options command directly to configure the R Commander, then simply select Tools > Save Rcmdr options from the R Commander menus, producing the dialog in Figure 3.26. Prior to this menu selection, I changed the R Commander dialog text font size to 14 points, and the script and output font size to 15 points.
FIGURE 3.26: The R Commander Save Commander Options editor. The dialog edits or creates the file .Rprofile in your home directory.
The Save Commander Options dialog in Figure 3.26 is a simple text editor. The dialog creates the R configuration file .Rprofile in your home directory, where it will be found when you next start R. If you already have a .Rprofile file in your home directory, then the dialog will modify it with your current R Commander options—those lines in the file between
###! Rcmdr Options Begin !###
and
###! Rcmdr Options End !###
Near the bottom of this block of lines is a command that you can “uncomment” (by removing the #s) to start the R Commander whenever R starts up:
# Uncomment the following 4 lines (remove the #s)
# to start the R Commander automatically when R starts:
# local({
# old <- getOption('defaultPackages')
# options(defaultPackages = c(old, 'Rcmdr'))
# })
Any other contents of a pre-existing .Rprofile file are undisturbed.
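To see which lines of a .Rprofile file belong to the R Commander, you can pick out the block between the two marker lines. The following is a minimal sketch. The function extract_rcmdr_block and the sample file contents are hypothetical, and the option line inside the block is only a stand-in for whatever the dialog actually writes.

```r
# Marker lines that delimit the R Commander block (quoted in the text)
begin_marker <- "###! Rcmdr Options Begin !###"
end_marker   <- "###! Rcmdr Options End !###"

# Return the lines from the begin marker to the end marker, inclusive,
# or character(0) if the block is missing or malformed
extract_rcmdr_block <- function(lines) {
  start <- which(lines == begin_marker)
  end   <- which(lines == end_marker)
  if (length(start) != 1 || length(end) != 1 || end < start) {
    return(character(0))
  }
  lines[start:end]
}

# Hypothetical .Rprofile contents, held in memory for illustration
profile_lines <- c(
  "options(digits = 5)",
  begin_marker,
  "options(Rcmdr = list(check.packages = FALSE))",
  end_marker,
  "message('profile loaded')"
)

block <- extract_rcmdr_block(profile_lines)
print(block)

# To inspect a real file, read it first, for example:
# profile_lines <- readLines(path.expand("~/.Rprofile"))
```

Everything outside the returned block corresponds to the other contents of the file, which the dialog leaves alone.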
1 If you plan to use R and the R Commander frequently under Mac OS X, it is convenient to add the R icon to the dock.
2 This is how the R Console appears in the R for Windows single-document interface (SDI) that I recommended in the installation instructions in Chapter 2. If you instead installed R with the default multiple-document interface (MDI), then the R Console appears inside a larger RGui window, which is not an ideal arrangement for the R Commander.
3 As explained in Chapter 2, the Tcl/Tk GUI builder installed with R and used by the R Commander employs the X11 windowing system rather than the native Mac Quartz windowing system. This is why the R Commander can’t use the standard Mac OS X top-level menu bar.
4 In the few instances in which color is important to the interpretation of a figure, the figure is repeated in its original color version in the center insert to the book; these instances are noted in the figure captions.
5 If a command is self-contained on a single line, then it can be executed by pressing Submit when the cursor is anywhere in the line; if a command extends over several lines, however, then all lines must be selected and submitted simultaneously for execution.
6 Before minimizing the Mac OS X R Console, however, make sure that app nap is turned off, or (as explained in Section 2.3.3) the R Commander may become unresponsive!
7 Unless you’re working with massive data sets, in which case the R Commander is probably not a good choice of interface to R, fitting data into the R workspace will not be an issue.
8 Data input from various sources is discussed in more detail in Chapter 4.
9 GSS.csv and other data files employed in this book are available for download on the web site for the book: See Section 1.5.
10 OK is also the default button in the dialog (under Windows, the button is outlined in blue), and so pressing the Return or Enter key is equivalent to left-clicking on the button.
11 An ordered factor in R is a factor whose levels are recognized to have an intrinsic ordering. I could use an ordered factor here, but there is no real advantage to doing so, and I will not employ ordered factors in this book.
12 More than one variable can be selected if the same recode directives are to be applied to each. If several variables are to be recoded, the name supplied is treated as a prefix for the new variable names, which are formed by pasting the prefix onto the names of the recoded variables.
13 The double equals, ==, is used to test equality because the ordinary equals sign (=) is used for other purposes in R: in particular, to assign a value to a function argument (as in log(100, base=10)) or to assign a value to a variable (as in x = 10). Most R programmers prefer to use the left-pointing arrow (<-) for the latter purpose (as in x <- 10, read “the variable x gets, or is assigned, the value 10”). R expressions, including relational operators such as ==, are discussed in Section 4.4.2.
14 If you’re unfamiliar with the chi-square test of independence in a two-way table, don’t worry; you’ll almost surely study it in your introductory statistics course.
15 On Windows or Linux/Unix, you can use the key combinations Ctrl-x for cut, Ctrl-c for copy, and Ctrl-v for paste. On Mac OS X, you can use command-x for cut, command-c for copy, and command-v for paste, in addition to the various control-key combinations. See Section 3.7 for more information on editing in the R Commander.
16 The R Commander also supports reports written in the more sophisticated LaTeX mark-up language: See Section 3.9 on customizing the R Commander.
17 See the instructions for installing optional software in Section 2.5.
18 Under Mac OS X, you can also use command-e.
19 See Section 2.5 for information on installing Pandoc.
20 If you’re interested in pursuing the topic, a good place to start is the Wikipedia article on LaTeX, at https://en.wikipedia.org/wiki/LaTeX, which includes several useful links and references.
21 The Barplot function resides in the RcmdrMisc package, which is loaded when the R Commander starts, while the barplot function is in the graphics package, which is a standard part of R.
22 Alternatively, I could have edited the original command in place and resubmitted it for execution, but by copying the command I retain both versions.
23 Even though the box() command has no arguments, it’s still necessary to include the parentheses so R knows that this is a function call.
24 If you experience problems caused by an inadvertently saved workspace, you can remove the .RData file containing the saved workspace by following the troubleshooting instructions in Section 2.2.1 for Windows, 2.3.4 for Mac OS X, or 2.4.1 for Linux/Unix.
25 See Chapter 9 for a discussion of R Commander plug-in packages.
26 Enter the command ?Startup at the command prompt in the R console for an explanation of R’s start-up procedure.