assignment work (AVD)


11/4/20 Assignment 2 Sp Au Br.docx P a g e | 1

Research Assignment 2

Both the Outline for Research Assignment 2 and Research Assignment 2 itself will use this document.

Use the Documenting Research Guide to understand how to use the information in this document for

either of these submissions. Ask questions, if needed!

Problem:

Employers’ external job postings need to be posted to the one job board that targets their

model candidate and only receive applicants that are perfect for the role. In reality jobs are

typically posted in numerous places, and both suitable and unsuitable candidates apply for the

role. Using specific candidate characteristics and a specific job board, considering what may or

may not influence the use of a specific job board will lead to better targeting of candidates,

reducing redundant job postings, and decreasing the number of unfit candidates.

Question 1:

What are the most influential features when predicting whether a survey respondent has used

the SO job board or is aware of the board but has never used it when considering respondents

who reported residing in the country of Spain, Australia, or Brazil; reported their age as

somewhere between 18 and 65 years old; and that indicated that they were either not at all,

somewhat, or very confident in their manager; reported an undergraduate major in an

engineering field, information systems, web design, or statistics; in addition to the responses

these respondents reported regarding employment; how often the respondent contributes to

open source; and whether or not they code for a hobby; when the respondent indicated that

the number of years they have been coding is somewhere within one to 49 years using the

data from SO (2019)?

Question 2:

You are responsible for developing a second research question. This question must meet the

criteria from Unit 1 Part 1. Additionally, it must relate to the problem statement. It does not

have to use the same subset of data as the other research question. The analysis method

must be an analysis method demonstrated in one of the lectures. When completing the outline,

make sure to include both the given question and the well-developed, sound research question

you have developed.

Data:

• The data and data dictionaries are online.

o Note: The raw data in your program must be in the original form. Do not modify the data

outside of the programming. Use the data dictionary to understand the data.

o The data and data dictionary are downloaded together. When you visit this site, ensure

you select the 2019 survey and you cite and reference the source in your work.

▪ Stack Overflow. (2019). Stack overflow annual developer survey [Data set and

code book]. https://insights.stackoverflow.com/survey/

• Create a subset of data to represent the sample of secondary data in this analysis, based on

the research questions.
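
A minimal sketch of the subsetting step in R, assuming the survey has already been read into a data frame. The toy data frame and the column names `Country`, `Age`, and `YearsCode` are assumptions for illustration only; confirm the real column names and response values against the data dictionary.

```r
# Toy stand-in for the survey data. The column names and the text response
# "Less than 1 year" are assumptions; check them against the data dictionary.
survey <- data.frame(
  Country   = c("Spain", "Australia", "Brazil", "Canada", "Spain", "Brazil"),
  Age       = c(25, 70, 40, 30, NA, 33),
  YearsCode = c("5", "20", "12", "10", "3", "Less than 1 year"),
  stringsAsFactors = FALSE
)

countries <- c("Spain", "Australia", "Brazil")

# Text responses such as "Less than 1 year" become NA when converted.
years <- suppressWarnings(as.numeric(survey$YearsCode))

# Build each condition separately so it can be inspected on its own.
in_country <- survey$Country %in% countries
in_age     <- !is.na(survey$Age) & survey$Age >= 18 & survey$Age <= 65
in_years   <- !is.na(years) & years >= 1 & years <= 49

# The raw object `survey` is left untouched; the sample is a new object.
sample_df <- survey[in_country & in_age & in_years, ]

print(sample_df)
```

Only the conditions named in the research question decide which rows stay in the sample. Missing values in any other column are left in place.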


Data Cleaning:

• Do not remove missing values during cleaning.

• When changing an object or part of an object, validate that the change occurred as expected.

• The steps that are taken in cleaning are not discussed in the research paper.
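
One way to validate a change, sketched in base R. The vector `hobby_raw` and its values are made up for illustration. The idea is to compare the object before and after the change and to confirm that missing values were kept.

```r
# Made-up responses, including one missing value.
hobby_raw <- c("Yes", "No", NA, "Yes", "No", "Yes")

# The change: convert the text responses to a factor with a fixed level order.
hobby <- factor(hobby_raw, levels = c("No", "Yes"))

# Validation 1: the number of missing values is unchanged.
stopifnot(sum(is.na(hobby)) == sum(is.na(hobby_raw)))

# Validation 2: every value maps back to its original text.
stopifnot(identical(as.character(hobby), hobby_raw))

# Validation 3: a cross-tabulation of old against new shows the mapping.
print(table(Before = hobby_raw, After = hobby, useNA = "ifany"))
```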

Analyze:

• When analyzing the given research question, you must use a random forest model.

o You must attempt to improve the model performance by one of the methods covered in

Unit 5.

o The research question you write must make use of a method of analysis demonstrated

in the lectures from this course.

o Accuracy alone is not sufficient to determine the validity and reliability of the model.

• The sub-stages of Analyze (profile, prepare, and apply) are necessary at least two times. This method is for programming, not documenting research.

• Ensure you establish that the model is valid and reliable before discussing the influential

indicators.
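
The points above can be illustrated with a short sketch, assuming the `randomForest` package is installed. The data are simulated, and the names `hobbyist`, `yearscode`, `age`, and `used_board` are hypothetical placeholders, not the survey columns. The sketch fits a random forest, reports sensitivity and specificity alongside accuracy, and only then prints variable importance. It does not show a Unit 5 improvement method.

```r
library(randomForest)  # assumed to be installed

set.seed(42)
n <- 300

# Simulated predictors; these names are placeholders, not the survey columns.
toy <- data.frame(
  hobbyist  = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  yearscode = sample(1:49, n, replace = TRUE),
  age       = sample(18:65, n, replace = TRUE)
)

# Made-up outcome loosely tied to yearscode so the model has some signal.
p <- plogis((toy$yearscode - 25) / 8)
toy$used_board <- factor(ifelse(runif(n) < p, "Used", "AwareNotUsed"),
                         levels = c("AwareNotUsed", "Used"))

# Hold out 30% of the rows for testing.
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

rf <- randomForest(used_board ~ ., data = train, ntree = 500, importance = TRUE)

# Confusion matrix on the held-out rows.
pred <- predict(rf, newdata = test)
cm <- table(Predicted = pred, Actual = test$used_board)

# Class-wise measures, treating "Used" as the positive class.
tp <- cm["Used", "Used"]
tn <- cm["AwareNotUsed", "AwareNotUsed"]
fp <- cm["Used", "AwareNotUsed"]
fn <- cm["AwareNotUsed", "Used"]

accuracy    <- (tp + tn) / sum(cm)
sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)

print(cm)
print(round(c(accuracy = accuracy,
              sensitivity = sensitivity,
              specificity = specificity), 3))

# Variable importance is read only after the measures above are judged adequate.
print(importance(rf))
```

Reporting sensitivity and specificity next to accuracy shows whether the model predicts both outcome classes well or only the more common one.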

Results section and discussion section:

• Ensure that assertions and assessments in the results and discussion sections are derived

from the analysis in R.

• Do not speculate. Use evidence. When documenting the results, consider the generalizability.

• Explain what was done to improve model performance in words, not in programming functions,

variable names, or argument names. Assume the reader cannot see the programming code or

raw data, but needs to understand what you did to improve the performance.

Future recommendations:

• Include recommendations for future analysis, based on the research in R.

• Explore the insights you can gain from this model and provide your interpretations when

documenting your research.

Bonus challenge:

Compare the influential indicators in predicting the outcome depending on the country by creating

separate models for each country. Describe if there were or were not distinct differences in the

contribution of the different predictors. Do not speculate when discussing the findings.

Tip: An additional research question that meets the five criteria from the first lecture will bring

this additional analysis into the focus of the research. The challenge does not replace the original

research requirements for this assignment. If you were to complete the challenge, there would be

three research questions.
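
A possible shape for the per-country comparison, sketched in R under the assumption that the `randomForest` package is installed. The data are simulated with a random outcome, so the importance values printed here carry no meaning. The names `country`, `hobbyist`, `yearscode`, and `used_board` are hypothetical placeholders.

```r
library(randomForest)  # assumed to be installed

set.seed(1)
n <- 450

# Simulated data; the outcome is pure noise and serves only to run the code.
toy <- data.frame(
  country   = factor(rep(c("Spain", "Australia", "Brazil"), each = n / 3)),
  hobbyist  = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  yearscode = sample(1:49, n, replace = TRUE)
)
toy$used_board <- factor(sample(c("AwareNotUsed", "Used"), n, replace = TRUE))

# One data frame per country, without the country column itself.
by_country <- split(toy[, c("hobbyist", "yearscode", "used_board")], toy$country)

# Fit one model per country and keep the mean decrease in accuracy
# for each predictor (importance type 1).
imp <- sapply(by_country, function(d) {
  fit <- randomForest(used_board ~ ., data = d, ntree = 300, importance = TRUE)
  importance(fit, type = 1)[, 1]
})

# Rows are predictors, columns are countries.
print(round(imp, 2))
```

Each country's model still needs to be shown valid and reliable before its importance values are compared with those of the other countries.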

Required files to submit:

1) Research paper in APA 7 format; MS Word document file type

2) R Script; final version with file type .r


Important Information:

• You will receive an email confirming the submission. If you receive that email, your submission has been received.

o Any error shown comes from the use of SafeAssign.

o SafeAssign does not recognize .r file types. The warning does not affect the submission.

• The research paper will be written in a professional writing style, following APA 7 student paper format. Use the student paper template.

o The document shall be 3-5 pages and at least 1000 words. The page count does not include the cover page, tables, figures, or the reference page.

o Ensure that every reference in the reference list is also cited in the text.

o Do not forget to cite and reference the source of the data.

• It is ill-advised to modify the problem statement and research question provided.

• If the research problem or research questions are modified, neither the requirements of the analysis nor the objective outlined in the original research question will change.

• There are several different versions of this assignment. If the submitted work is in line with a

different version than assigned, the submitted work is a demonstration of academic

dishonesty. Do not share the work with peers. Do not accept work that you did not do.

• Take a look at the rubric to get the best grade possible.
