Analysis with R: screenshots of the R session are needed
3.2 Free-Format Input
Free-format data are text files containing numbers or character strings separated by spaces. Optionally the file may have a header containing variable names. Here's an example of a data file containing information on three variables for 20 countries in Latin America:
               setting effort change
Bolivia             46      0      1
Brazil              74      0     10
Chile               89     16     29
Colombia            77     16     25
CostaRica           84     21     29
Cuba                89     15     40
DominicanRep        68     14     21
Ecuador             70      6      0
ElSalvador          60     13     13
Guatemala           55      9      4
Haiti               35      3      0
Honduras            51      7      7
Jamaica             87     23     21
Mexico              83      4      9
Nicaragua           68      0      7
Panama              84     19     22
Paraguay            74      3      6
Peru                73      0      2
TrinidadTobago      84     15     29
Venezuela           91      7     11
This small dataset includes an index of social setting, an index of family planning effort, and the percent decline in the crude birth rate between 1965 and 1975. The data are available at http://data.princeton.edu/wws509/datasets/ in a file called effort.dat which includes a header with the variable names.
R can read the data directly from the web:
> fpe <- read.table("http://data.princeton.edu/wws509/datasets/effort.dat")
The function used to read data frames is called read.table. The argument is a character string giving the name of the file containing the data, but here we have given it a fully qualified URL (uniform resource locator), and that's all it takes.
Alternatively, you could download the data and save them in a local file, or just cut and paste the data from the browser to an editor such as Notepad, and then save them. Make sure the file ends up in R's working directory, which you can find out by typing getwd(). If that is not the case you can use a fully qualified path name or change R's working directory by calling setwd with a string argument. Remember to double up your backward slashes (or use forward slashes instead) when specifying paths.
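The two ways of pointing R at a local file can be sketched as follows. This is only an illustration: the file name sample.dat and the temporary folder are assumptions made so the example runs anywhere, and the two data rows are copied from the table above.

```r
# Write a two-row sample file into a temporary folder.
# "sample.dat" is a hypothetical name used only for this sketch.
folder <- tempdir()
path <- file.path(folder, "sample.dat")   # file.path() joins with forward slashes
writeLines(c("setting effort change",
             "Bolivia 46 0 1",
             "Brazil 74 0 10"), path)

# Option 1: leave the working directory alone and pass the full path.
# On Windows a full path could be typed as "C:/data/sample.dat" or
# "C:\\data\\sample.dat" (hypothetical paths).
by_path <- read.table(path)

# Option 2: change the working directory, then use the bare file name.
old <- setwd(folder)                 # setwd() returns the previous directory
by_name <- read.table("sample.dat")
setwd(old)                           # restore the original working directory

identical(by_path, by_name)          # both routes give the same data frame
```

Saving the value returned by setwd makes it easy to go back to where you started.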
The special symbol <- is R's assignment operator, which we have encountered already. Here we assigned the data to an object named fpe. To print the data simply type the name of the object.
> fpe
               setting effort change
Bolivia             46      0      1
Brazil              74      0     10
... output edited ...
Venezuela           91      7     11
In this example R detected correctly that the first line in our file was a header with the variable names. It also inferred correctly that the first column had the observation names. (Well, it did so with a little help; I made sure the row names did not have embedded spaces, hence CostaRica. Alternatively, I could have used "Costa Rica" in quotes as a row name.)
You can always tell R explicitly whether or not you have a header by specifying the optional argument header=TRUE or header=FALSE to the read.table function. This is important if you have a header but lack row names, because R's guess is based on the fact that the header line has one fewer entry than the next row, as it did in our example.
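A small sketch of how the guess plays out, using read.table's text argument so that no file is needed. The rows are a subset of the data above; the quoted "Costa Rica" spelling and the version without row names are variations made up for illustration.

```r
# Header has 3 entries and the data rows have 4, so R infers both the
# header and the row names. Quotes protect the embedded space.
with_rows <- read.table(text = c('setting effort change',
                                 'Bolivia 46 0 1',
                                 '"Costa Rica" 84 21 29'))
rownames(with_rows)

# Same header, but no row names: every line now has 3 entries.
no_row_names <- c("setting effort change",
                  "46 0 1",
                  "84 21 29")

# Left to guess, R treats the header line as a row of data.
guessed <- read.table(text = no_row_names)
names(guessed)

# Stating header = TRUE removes the ambiguity.
stated <- read.table(text = no_row_names, header = TRUE)
names(stated)
```

In the guessed version the variables get the default names V1, V2 and V3, and the header line ends up as the first observation.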
If your file does not have a header line, R will use the default variable names V1, V2, ..., etc. To override this default use read.table's optional argument col.names to assign variable names. This argument takes a vector of names. So, if our file did not have a header we could have used the command
> fpe = read.table("noheader.dat",
+   col.names=c("setting","effort","change"))
Incidentally this is the first time that our command did not fit on one line. R code can be continued automatically on a new line simply by making it obvious that we are not done, for example ending the line with a comma, or having an unclosed left parenthesis. R responds by prompting for more with the continuation symbol + instead of the usual prompt >.
If your file does not have observation names, R will simply number the observations from 1 to n. You can specify row names using read.table's optional argument row.names, which works just like col.names; type ?data.frame for more information.
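As a rough illustration, row.names accepts either a vector of names or the number of the column that holds them. The three rows below come from the data above; the column name country is a placeholder introduced for this sketch.

```r
vars <- c("setting", "effort", "change")
values <- c("46 0 1", "74 0 10", "89 16 29")

# No observation names in the input: rows are numbered 1 to n.
numbered <- read.table(text = values, col.names = vars)
rownames(numbered)

# Supply the observation names as a vector.
named <- read.table(text = values, col.names = vars,
                    row.names = c("Bolivia", "Brazil", "Chile"))

# Or point to the column that contains them. The placeholder name
# "country" is dropped once that column becomes the row names.
from_col <- read.table(text = c("Bolivia 46 0 1",
                                "Brazil 74 0 10",
                                "Chile 89 16 29"),
                       col.names = c("country", vars),
                       row.names = 1)

identical(named, from_col)
```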
There are two closely related functions that can be used to get or set variable and observation names at a later time. These are called names (for the variable names), and row.names (for the observation names). Thus, if our file did not have a header we could have read the data and then changed the default variable names using the names function:
> fpe = read.table("noheader.dat")
> names(fpe) = c("setting","effort","change")
Technical Note: If you have a background in other programming languages you may be surprised to see a function call on the left hand side of an assignment. These are special 'replacement' functions in R. They extract an element of an object and then replace its value.
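To make the replacement idea concrete, here is a minimal sketch. It builds a two-row data frame by hand (values from the first two rows of the data) so that no file is needed, and the name decline in the last step is made up for illustration.

```r
# A small data frame with the default variable names.
fpe <- data.frame(V1 = c(46, 74), V2 = c(0, 0), V3 = c(1, 10))
fpe2 <- fpe

# The usual replacement form ...
names(fpe) <- c("setting", "effort", "change")

# ... is shorthand for calling the function `names<-` and assigning
# its result back to the same object.
fpe2 <- `names<-`(fpe2, value = c("setting", "effort", "change"))
identical(fpe, fpe2)

# row.names has a replacement form too.
row.names(fpe) <- c("Bolivia", "Brazil")

# A replacement can also target a single element.
names(fpe)[3] <- "decline"
names(fpe)
```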
In our example all three variables were numeric. R handles string variables with no problem. If one of our variables were sex, coded M for males and F for females, R could store it as a factor, which is basically a categorical variable that takes one of a finite set of values called levels. Versions of R before 4.0.0 create the factor automatically; from 4.0.0 onward read.table leaves strings as character vectors unless you specify the optional argument stringsAsFactors=TRUE. In Section 5 we will use a data frame with categorical variables to illustrate logistic regression. Another way to generate factors is by grouping a numeric covariate. An example appears in Section 4 below.
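Both routes to a factor can be sketched with a toy example. The four observations, the variable age and the cut points are all made up for illustration, and stringsAsFactors = TRUE is spelled out because R 4.0.0 and later do not convert strings to factors by default.

```r
# Made-up data: sex coded M/F plus a numeric covariate.
people <- read.table(text = c("sex age",
                              "M 23",
                              "F 31",
                              "F 45",
                              "M 52"),
                     header = TRUE, stringsAsFactors = TRUE)

is.factor(people$sex)
levels(people$sex)        # the finite set of values the factor can take

# Group the numeric covariate into intervals; cut() returns a factor.
# The break points are arbitrary choices for this sketch.
people$agegroup <- cut(people$age, breaks = c(0, 30, 50, Inf))
table(people$agegroup)
```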
Exercise: Use a text editor to create a small file with the following three lines:
a b c
1 2 3
4 5 6
Read this file into R so the variable names are a, b and c. Now delete the first row and read the file again so the variable names are still a, b and c.
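One possible approach to the exercise is sketched below. To keep it self-contained the file is written from R with writeLines instead of a text editor, and the name abc.dat in a temporary folder is an assumption.

```r
# Create the three-line file ("abc.dat" is a hypothetical name).
path <- file.path(tempdir(), "abc.dat")
writeLines(c("a b c",
             "1 2 3",
             "4 5 6"), path)

# The header has as many entries as the data rows, so R cannot guess
# that it is a header; say so explicitly.
with_header <- read.table(path, header = TRUE)

# Delete the first row, then supply the names through col.names.
writeLines(c("1 2 3",
             "4 5 6"), path)
without_header <- read.table(path, col.names = c("a", "b", "c"))

identical(with_header, without_header)
```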