Analyze data using R Programming
Documenting Research Guide
Last Revised: 4/22/2020
1. Organizing your Research and Documentation
2. Writing Tips
3. Example Outline, based on the analysis in Unit 3 Part 1
Organizing your Research and Documentation
I. Introduction (the section header is the title of the paper, not the word “Introduction”)
a. Describe the broader context in which the problem exists, the topic
b. Lead the reader to the problem statement.
c. Do not explicitly state the problem, research questions, methodology, etc.
d. The introduction is a summary of the paper at a very high level.
II. Statement of the Problem
a. This section may come straight from an assignment’s instructions
b. You can even begin with “The problem to be addressed by this study is…”
c. Describe the ideal situation, the current situation, and the intent of the research
III. Research Methodology
a. This section should have a short introduction to your research questions
b. State the research questions; they must be declared unambiguously in research papers and presentations
c. Identify the methods of analysis that will be used to address the research questions
i. If the methodology has statistical assumptions that must be met, then these are also declared
ii. If the method has limitations, state what they are
d. Identify the population and sample; in our course, this is the data set and source of the data
IV. Results
a. Written with no opinion, this is a statement of facts uncovered in the analysis
b. Include all findings, even if they do not support your conclusion
c. Include the extent to which the data met the statistical assumptions if the method of analysis has assumptions.
V. Impact of the Results (or Discussion)
a. Introduce the results in terms of the data. Discuss how the results impact the problem statement.
b. State the first research question
i. Address each research question individually in terms of the results
ii. Explain how the results address the research question
iii. Report any themes or patterns that are identified
c. State the second research question
i. Explain how the results address the research question
ii. Report any themes or patterns that are identified
d. Discuss any results from the analysis not specific to the research questions and the impact of these results.
VI. Recommendations for future analysis
a. Did the data analysis unveil any ideas or insights that may require further analysis?
b. Did your analysis lead you to believe a different method may result in more meaningful findings?
c. Use the research you did to recommend further research
VII. Conclusion
a. The conclusion is a summary of everything in the documentation.
b. Nothing new shall be introduced in the conclusion
c. Highlight key points of the research or findings
Writing Tips
• When writing a paper or developing a presentation, always include an introduction and conclusion.
• The basic principles of writing remain constant. Every paragraph should have a topic sentence. Transitional
sentences should be written when moving from one topic to the next.
• Think about basic compositional writing:
o In the introduction, there is a declaration of the main idea and supporting ideas. The body paragraphs
are on each of the supporting ideas. The conclusion is essentially restating the introduction.
o In research papers, section headings are your supporting ideas. Use the outline to organize your
graduate-level writing.
• Do not rely on plagiarism checking software to determine if you have plagiarized. Use common sense!
o Did you take material from an outside source?
o Did you use that material in your work? Then cite your source.
• Every source cited in the text must appear in the reference list, and every reference in the list must be cited in the text.
• When you cite a source, the statement that is cited must be properly paraphrased or it must be a quote.
• Use APA 7 in your writing throughout this course.
Example Outline, based on the analysis in Unit 3 Part 1
Saving for a Home in an Unknown Market
I. Saving for a Home in an Unknown Market
a. A family is moving across the country to buy a home in Boston,
Massachusetts.
b. What neighborhoods are good or bad
c. What home is a good buy, a steal, or over-priced
d. Neighborhoods with risk
II. Statement of the Problem
Most people mortgage homes. If you have lived in the same area, then it is likely
that you have a good understanding of home values. However, if you move to a
new area, you may not. Home values change, so how can you determine how
much the home you may want will cost? Understanding what characteristics of
the physical environment influence the values of homes will provide insight into
the values.
III. Research Methodology
a. Based on the data by Harrison and Rubinfeld (1978):
i. When considering the homes within the Boston dataset, what physical
characteristics impact the value of a home the most?
ii. Considering the characteristics that impact values the most, can the
value of a home, given these physical characteristics, be predicted
using the Boston data?
iii. The sample data contains the physical characteristics of the Boston area from 1978:
1. Crime rate per capita by area,
2. Zone ratio of lots over 25,000 square feet by area,
3. Industrial versus residence land ratio by area,
4. Charles riverfront property (yes or no) by area,
5. Nitrogen oxide concentration in the air in parts per million by area,
6. The average number of rooms per dwelling by area,
7. Homes built before 1940 versus newer homes ratio by area,
8. The weighted average of the distance to employment centers by area,
9. Radial highway accessibility index by area,
10. Tax rate per $10,000 by area,
11. The student-teacher ratio in schools by area,
12. The proportion of individuals in town denoted as black (racial discrimination) by area,
13. Poverty income status of the population as a percentage by area, and
14. Median value of owned homes in units of $1,000 (closer to units of $10,000 by today’s standards) by area
Annotations for this outline:
• The topic should be a complete sentence.
• Supporting ideas can be incomplete sentences.
• The entire problem statement is written out.
• Cite the source of the data, the sample.
• All research questions are declared in the outline.
• Include the reference of the data, in APA 7.
• Declare the variables in words. Declare what the numbers represent, such as parts per million or by $1000.
• Do not use variable names.
iv. Harrison, D., & Rubinfeld, D. L. (1978). Hedonic prices and the demand
for clean air. Journal of Environmental Economics and Management,
5(1), 81-102. https://doi.org/10.1016/0095-0696(78)90006-2
b. To address both research questions, a random forest model will be created.
i. Use feature importance to identify each characteristic’s impact on explained variance
ii. Use the model to assess the accuracy of predictions
iii. There are no statistical assumptions for this method of analysis.
iv. Limitations
1. Random forest models use existing data to find patterns for prediction; a model cannot predict values outside what it was trained on
2. Interpretability can be difficult
IV. Results
V. Impact of the Results
VI. Recommendations for future analysis
VII. Conclusion
Annotations for this outline:
• Identify the plan for analysis, based on the research questions.
• Declare how this method can address each of the research questions.
• Declare any statistical assumptions for this method of analysis with a credible reference.
• If there are known limitations to this method, declare them.
• Declare the headings for the remaining sections.
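
The first limitation listed in the example outline (a random forest cannot predict what it was not trained for) can be illustrated with a small sketch. The code below is written in Python rather than R and uses only the standard library. It builds a deliberately simplified forest of one-feature regression trees; it is not the course's method or any library's implementation. The data are hypothetical toy values (average rooms versus median value in units of $1,000), not the Harrison and Rubinfeld (1978) data, and every function name is made up for illustration.

```python
import random
from statistics import mean


def fit_tree(rows, depth=0, max_depth=3, min_size=2):
    """Fit a small regression tree on (x, y) pairs with one numeric feature.

    A leaf is a float (the mean of y); an internal node is a tuple
    (threshold, left_subtree, right_subtree).
    """
    ys = [y for _, y in rows]
    if depth >= max_depth or len(rows) < 2 * min_size:
        return mean(ys)
    best = None
    xs = sorted({x for x, _ in rows})
    # Try a split halfway between each pair of neighbouring x values.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in rows if x <= t]
        right = [y for x, y in rows if x > t]
        if len(left) < min_size or len(right) < min_size:
            continue
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t)
    if best is None:
        return mean(ys)
    t = best[1]
    return (
        t,
        fit_tree([r for r in rows if r[0] <= t], depth + 1, max_depth, min_size),
        fit_tree([r for r in rows if r[0] > t], depth + 1, max_depth, min_size),
    )


def predict_tree(node, x):
    """Walk down the tree until a leaf (a float) is reached."""
    while isinstance(node, tuple):
        t, left, right = node
        node = left if x <= t else right
    return node


def fit_forest(rows, n_trees=25, seed=0):
    """Fit each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_tree([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]


def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return mean(predict_tree(tree, x) for tree in forest)


# Hypothetical toy data: rooms from 4.0 to 8.0, value = 5 * rooms - 10.
rows = [(4.0 + 0.25 * i, 5 * (4.0 + 0.25 * i) - 10) for i in range(17)]
forest = fit_forest(rows)

print("largest training value:", max(y for _, y in rows))
print("prediction for 12 rooms:", predict_forest(forest, 12.0))
print("prediction for 100 rooms:", predict_forest(forest, 100.0))
```

Because every leaf stores an average of training values, the predictions for 12 rooms and for 100 rooms are identical and never exceed the largest training value, even though the toy relationship would continue to rise. This is the kind of behavior worth declaring under Limitations.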