Empirical Methods in Software Engineering Assignment

CSCI 848: Empirical Methods in Software Engineering: Selecting Research Methods

Gursimran Singh Walia

Example

Revisiting Lecture 1 slide

Some Example Questions

Existence

Does model merging ever happen in practice?

Description/Classification

What are different types of model merging that occur in practice on large scale systems?

Descriptive – Comparative

How does model merging with explicit representation of relationships differ from model merging without such representation?

Causality

Does an explicit representation of relationship between models cause developers to explore different ways of merging models?

Causality – Comparative

Does the algebraic representation of relationship in Stu’s tool lead developers to explore more than do point cuts in AOM (Aspect oriented modeling)?

How should Stu select research method?

Next: Selecting Research Methods and Data Collection

CSCI 783: Empirical Software Engineering

What is an empirical truth?

For specified research questions, what to accept as valid answer?

RQ: “Do fisheye-views cause an improvement in efficiency for file navigation?”

Jane’s PhD Advisor: “the only conclusive way to prove that A causes B is to manipulate A in a controlled setting, and measure the effect on B”

Vs.

Jane’s Committee Member: “Jane should conduct her research in the field, investigating what developers actually do on real projects”

Having specified the research question (s), it is worth considering what to accept as valid answers.

Different people make different assumptions about scientific truth

Jane’s PhD advisor insists that the only trustworthy evidence to answer this question comes from experiments conducted under controlled laboratory conditions, pointing out that the only conclusive way to prove that A causes B is to manipulate A in a controlled setting, and measure the effect on B.

Vs.

Another member of Jane’s thesis committee is an experienced software practitioner and he claims that laboratory experiments are useless, as they ignore the messy complexity of real software projects. He points out that judgments about “improvements” to file navigation are subjective, and contextual factors such as distractions have a major impact. He suggests that Jane should conduct her research in the field, investigating what developers actually do on real projects.

Philosophical stances

The stance you adopt affects the methods you choose (those that you believe lead to acceptable evidence) in response to your research question(s)

The stance you adopt affects which methods you believe lead to acceptable evidence in response to your research question(s). Being explicit about your stance also helps when talking and writing about research. You might not be able to convince other people to change their stance, but you will be able to argue cogently for why you chose the methods you did.

• Positivism states that all knowledge must be based on logical inference from a set of basic observable facts. Positivists are reductionist, in that they study things by breaking them into simpler components. This corresponds to their belief that scientific knowledge is built up incrementally from verifiable observations, and inferences based on them. Positivism has been much attacked over the past century due to doubts about the reliability of our observations of the world, and the complication that scientific “fact” built up in this manner sometimes turns out to be wrong. While positivism still dominates the natural sciences, most positivists today might more accurately be described as post-positivists, in that they tend to accept the idea (due to Popper) that it is more productive to refute theories than to prove them, and we increase our confidence in a theory each time we fail to refute it, without necessarily ever proving it to be true. Positivists prefer methods that start with precise theories from which verifiable hypotheses can be extracted, and tested in isolation. Hence, positivism is most closely associated with the controlled experiment; however, survey research and case studies are also frequently conducted with a positivist stance. Note that a belief in reductionism is needed to accept laboratory experiments as valid in software engineering – you have to convince yourself that the phenomenon you are interested in can be studied in isolation from its context.

• Constructivism, also known as interpretivism, rejects the idea that scientific knowledge can be separated from its human context. In particular, the meanings of terms used in scientific theories are socially constructed, so interpretations of what a theory means are just as important in judging its truth as the empirical observations on which it is based. Constructivists concentrate less on verifying theories, and more on understanding how different people make sense of the world, and how they assign meaning to actions. Theories may emerge from this process, but they are always tied to the context being studied. For example, an anthropologist studying the culture of a software design team might seek to find out how different members of the team think about and use the tools they have available, and build local theories that explain why this particular team uses tools in the way that they do. This stance is often adopted in the social sciences, where positivist/reductionist approaches have little to say about the richness of social interactions. Constructivists prefer methods that collect rich qualitative data about human activities, from which local theories might emerge. Constructivism is most closely associated with ethnographies, although constructivists often use exploratory case studies and survey research too.

• Critical Theory judges scientific knowledge by its ability to free people from restrictive systems of thought (Calhoun, 1995). Critical theorists argue that research is a political act, because knowledge empowers different groups within society, or entrenches existing power structures. Critical theorists therefore choose what research to undertake based on whom it helps. They prefer participatory approaches in which the groups they are trying to help are engaged in the research, including helping to set its goals. Critical theorists therefore tend to take emancipatory or advocacy roles. In sociology, critical theory is most closely associated with Marxist and feminist studies, along with research that seeks to improve the status of various minority groups. In software engineering, it includes research that actively seeks to challenge existing perceptions about software practice, most notably the open source movement, and, arguably, the process improvement community and the agile community. Critical theorists often use case studies to draw attention to things that need changing.

However, it is action research that most closely reflects the philosophy of critical theorists.

• Pragmatism acknowledges that all knowledge is approximate and incomplete, and its value depends on the methods by which it was obtained. For pragmatists, knowledge is judged by how useful it is for solving practical problems. Put simply, truth is whatever works at the time. This stance therefore entails a degree of relativism: what is useful for one person to believe might not be useful for another; therefore truth is relative to the observer. To overcome the obvious criticisms, many pragmatists emphasize the importance of consensus – truth is uncovered in the process of rational discourse, and is judged by the participants as whatever has the better arguments. Pragmatism is less dogmatic than the other three stances described above, as pragmatists tend to think the researcher should be free to use whatever research methods shed light on the research problem. In essence, pragmatism adopts an engineering approach to research – it values practical knowledge over abstract knowledge, and uses whatever methods are appropriate to obtain it. Pragmatists use any available methods, and strongly prefer mixed methods research, where several methods are used to shed light on the issue under study.

Although there are examples of research from each of these stances in the software engineering literature, the underlying philosophies are never mentioned. We believe this has contributed to confusion around the selection of empirical methods and appropriate evaluation of empirical research. In particular, it is impossible to avoid some commitment to a particular stance, as you cannot conduct research, and certainly cannot judge its results, without some criteria for judging what constitutes valid knowledge.

The role of Theory building

Theories (building blocks of scientific knowledge) explain how and why certain phenomena occur and allow predictions to be made.

A scientific theory identifies and defines a set of phenomena, and makes assertions about the nature of those phenomena

A good theory

Precisely defines the theoretical terms, so that a community of scientists can measure and observe them

also explains why certain relationships occur

A distinguishing feature of scientific study is the development of theories that explain how and why certain phenomena occur, and allow predictions to be made. Theories are therefore the building blocks of scientific knowledge. The different philosophical stances differ in their ideas about the role of theory. To the positivist, science is the process of verifying theories by testing hypotheses derived from them. To the constructivist, science is the process of seeking local theories that emerge from (and explain) the data. To the critical theorist, theories are assertions of knowledge (and therefore power), to be critiqued in terms of how they shape that power. To the pragmatist, theories are the products of a consensual process among a community of researchers, to be judged for their practical utility.

A scientific theory identifies and defines a set of phenomena, and makes assertions about the nature of those phenomena and the relationships between them.

A good theory precisely defines the theoretical terms, so that a community of scientists can observe and measure them. A good theory also explains why certain relationships occur. Positivists expect their theories to have strong predictive power, and so look for generalized models of cause-and-effect as the basis for theories. In contrast, constructivists expect theories to strengthen their understanding of complex situations, and so tend to make more use of categorizations and analogies. Theories are also judged for aesthetic value. Often there is more than one theory that explains empirical observations, so the theories that are simpler or more elegant are preferred.

THE ROLE OF THEORY BUILDING: An Example

Joe might develop a theory around the use of UML diagrams as a stylized form of external memory.

According to Joe’s Theory: “UML diagrams are used to summarize the results of meetings and discussions, to remind participants of a shared understanding that they have already developed”.

Joe’s theory must precisely define:

Meaning of terms such as: “diagram”, “participants”, and “discussions”

Should also explain why people chose to use UML in some circumstances and not others; and

It should be able to predict qualities of the diagram that a software team might produce based on certain factors

As an example, Joe might develop a theory around the use of UML diagrams as a stylized form of external memory. According to his theory, UML diagrams are used to summarize the results of meetings and discussions, to remind participants of a shared understanding that they have already developed.

Joe’s theory must precisely define the meaning of terms such as “diagram”, “participants”, “discussions,” in order to identify them in any studies performed. Joe’s theory should also explain why people choose to use UML in some circumstances but not others, and why they include certain things in their diagrams and exclude others. And finally, it should be able to predict qualities of the diagrams that a software team might produce based on certain factors.

It is important to understand that in any empirical study, theories have a strong impact on how things are observed and interpreted. The theory becomes a “lens” through which the world is observed. This happens whether or not theories are explicitly acknowledged, because real-world phenomena are simply too rich and complex to study without a huge amount of filtering. In quantitative research methods, the theoretical lens is used explicitly to decide which variables to isolate and measure, and which to ignore or exclude. In qualitative methods, the theoretical lens is often applied after data is collected, to focus the process of labeling and categorizing (“coding”) the data.

Grounded Theory

Grounded Theory is a technique for developing theory iteratively from qualitative data

In Grounded Theory

Initial analysis of the data begins without any preconceived categories.

As interesting patterns emerge, the researcher repeatedly compares these with existing data,

Collects more data to support or refute the emerging theory

Few scientists give thought to how theories are created. A notable exception is Grounded Theory, a technique for developing theory iteratively from qualitative data.

In grounded theory, initial analysis of the data begins without any preconceived categories. As interesting patterns emerge, the researcher repeatedly compares these with existing data, and collects more data to support or refute the emerging theory. Despite its close association with the constructivist stance, Grounded Theory probably approximates how most scientists end up developing theories. The difference is that Grounded Theory makes the process explicit and systematic.

Theories also play a role in connecting research to the relevant literature. By defining the key terms, the results of empirical studies can be compared. Furthermore, theories support the process of empirical induction because an individual study can never offer conclusive results. Each study adds more evidence for or against the propositions of the theory. Without the theory, we have no way of making sense of the accumulation of empirical results. Software Engineering researchers have traditionally been very poor at making theories explicit. Many of the empirical studies conducted over the past few decades fail to relate the collected data to an underlying theory. The net result is that results are hard to interpret, and studies cannot be compared.

Selecting Research Methods

You want to select a research method to collect and analyze empirical data, in order to

support or refute your theory, or

Answer your research questions

Research Design is the process of selecting a method for a particular research problem

The validity of the results depends on how well the research design compensates for the weaknesses of the methods

A method is a set of organizing principles around which empirical data is collected and analyzed. A variety of methods can be applied to any research problem, and it is often necessary to use a combination of methods to fully understand the problem. The choice of methods depends upon the theoretical stance of the researcher(s), access to resources (e.g., students or professionals as subjects/participants) and how closely the method aligns with the question(s) that have been posed.

Research Design is the process of selecting a method for a particular research problem, tapping into its strengths, while mitigating its weaknesses.

The validity of the results depends on how well the research design compensates for the weaknesses of the methods.

Research Methods: Controlled Experiments

A controlled experiment is an investigation of a testable hypothesis where

One or more independent variables are manipulated to measure the effect on one or more dependent variables

Each combination of values of the independent variables is a treatment

Simple vs. Complex experiment designs

Human Subjects

A controlled experiment is an investigation of a testable hypothesis where one or more independent variables are manipulated to measure their effect on one or more dependent variables.

Controlled experiments allow us to determine in precise terms how the variables are related and, specifically, whether a cause-effect relationship exists between them. Each combination of values of the independent variables is a treatment.

The simplest experiments have just two treatments representing two levels of a single independent variable (e.g. using a tool vs. not using a tool). More complex experimental designs arise when there are more than two levels or more than one independent variable is used.
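The idea that each combination of values of the independent variables is a treatment can be sketched in a few lines of Python. This is only an illustration: the function name `enumerate_treatments` and the second variable ("experience") are assumptions made for the example and do not come from the lecture.

```python
from itertools import product


def enumerate_treatments(independent_variables):
    """Return one dict per treatment: every combination of variable levels."""
    names = list(independent_variables)
    level_lists = [independent_variables[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


# Simplest design: a single independent variable with two levels.
simple = enumerate_treatments({"tool": ["with tool", "without tool"]})

# A more complex design: a second, hypothetical independent variable.
factorial = enumerate_treatments({
    "tool": ["with tool", "without tool"],
    "experience": ["novice", "expert"],  # assumed for illustration only
})

for treatment in factorial:
    print(treatment)
```

The number of treatments grows multiplicatively with the number of levels, which is one reason complex designs need more subjects.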

Most software engineering experiments require human subjects to perform some task. We measure the effect of the treatments on the subjects.

Controlled Experiments: Preconditions

What should guide your experiment design?

For example, Jane might decide to run an experiment to test the hypothesis that:

“Fisheye views cause more efficient file navigation than traditional file tree explorer views”

The above hypothesis is drawn from a theory that explains the effect.

What is the theory?

What are the treatments?

Independent vs. Dependent Variables?

The theory is that fisheye views correspond well to the way that people see and navigate in the world, by offering more detail of a specific area of focus, together with a less detailed overview of the peripheral regions, and a smooth way of moving the focus of attention. The theory suggests that less time spent scrolling and fewer clicks should reduce navigation time.

Independent Variable - type of file explorer view used

Dependent Variable – length of time to navigate to a file

A precondition for conducting an experiment is a clear hypothesis. The hypothesis (and the theory from which it is drawn) guide all steps of the experimental design, including deciding which variables to include in the study and how to measure them.

For example, Jane might decide to run an experiment to test the hypothesis that fisheye views cause more efficient file navigation than traditional file tree explorer views.

This hypothesis is drawn from a theory that explains the effect. The theory is that fisheye views correspond well to the way that people see and navigate in the world, by offering more detail of a specific area of focus, together with a less detailed overview of the peripheral regions, and a smooth way of moving the focus of attention. The theory suggests that less time spent scrolling and fewer clicks should reduce navigation time. This suggests the treatments should be the type of file explorer view used: fisheye view versus the traditional scrolled view, and the dependent variable should be the length of time to navigate to a file.

Controlled Experiments: Subjects and Tasks

“Fisheye views cause more efficient file navigation than traditional file tree explorer views”

The theory also helps to decide who the subjects are, and what the tasks should be

the idea is to demonstrate that the hypothesis applies to the whole population by testing it on a representative sample

For example, Jane recruits CS grad students

and screens them to select subjects with lots of programming experience

In ESE, it is common to recruit students as subjects

The theory also helps to decide who the subjects are, and what the tasks should be. To ensure the results of the experiment are valid, the subjects should be drawn from a well-defined population – the idea is to demonstrate that the hypothesis applies to the whole population by testing it on a representative sample.

For her experiment, Jane recruits computer science grad students as subject programmers, and screens them to select subjects with lots of programming experience.

In SE, it is common to recruit students as subjects. This makes it easier to recruit a large group of subjects, but reduces external validity – an analytical argument is needed for why results on students might still apply to software developers in industry.

Controlled Experiments: Control

Control is important – variables other than the chosen independent variables must not be allowed to affect the experiment

E.g., in Jane’s case, the “difference in skill levels” of her subjects needs to be controlled. What can she do to control this variable?

She might first divide her subjects into groups (or blocks) according to their skill level, and

randomly assign subjects from each block to the two treatments

Jane needs to choose either:

Between Subjects Design, or

Within Subjects Design (each subject uses all treatments; however, this introduces learning effects). You need to make a DECISION!

Control is important – variables other than the chosen independent variables must not be allowed to affect the experiment.

In Jane’s case, differences in skill levels of her subjects may affect the experiment, so she might first divide her subjects into groups (or blocks) according to their skill level, and randomly assign subjects from each block to the two treatments, for a “between subjects design”.

An alternative is to use a “within subjects design”, in which each subject uses all treatments; however this might introduce learning effects from one treatment to the next, so this needs to be accounted for in the design. Jane needs to decide which confounding factor is more important to control.
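The two designs can be sketched in Python as follows. This is a minimal sketch, not a prescribed procedure: the subject IDs, the skill labels, and the function names `blocked_assignment` and `counterbalanced_orders` are all hypothetical. Rotating the treatment order (counterbalancing) is one common way to account for learning effects in a within subjects design; the lecture does not say which technique Jane would use.

```python
import random
from collections import defaultdict


def blocked_assignment(skill_by_subject, treatments, seed=0):
    """Between subjects: shuffle within each skill block, then deal
    subjects out to the treatments in turn so each block is balanced."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject, skill in skill_by_subject.items():
        blocks[skill].append(subject)
    assignment = {}
    for skill in sorted(blocks):
        members = sorted(blocks[skill])
        rng.shuffle(members)  # random assignment inside the block
        for i, subject in enumerate(members):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment


def counterbalanced_orders(subjects, treatments, seed=0):
    """Within subjects: every subject uses all treatments, with the
    starting treatment rotated so order effects are spread evenly."""
    rng = random.Random(seed)
    shuffled = sorted(subjects)
    rng.shuffle(shuffled)
    orders = {}
    for i, subject in enumerate(shuffled):
        k = i % len(treatments)
        orders[subject] = treatments[k:] + treatments[:k]
    return orders


# Hypothetical subjects and skill levels, for illustration only.
skills = {"s1": "high", "s2": "high", "s3": "high", "s4": "high",
          "s5": "low", "s6": "low", "s7": "low", "s8": "low"}
views = ["fisheye", "traditional"]

groups = blocked_assignment(skills, views)
orders = counterbalanced_orders(list(skills), views)
print(groups)
print(orders)
```

Blocking keeps skill level from being confounded with the treatment, while counterbalancing spreads learning effects across both views rather than removing them.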

Research Methods: Controlled Experiments

Controlled Experiments – reduce complexity by allowing only a few variables to vary in a controlled manner

If critical variables are ignored, the experiment might not generalize to real world settings

Variants on experiments are possible when true experiments are not available:

Quasi-Experiments

Time-Series Experiments

The experimental method is closely tied to the positivist stance. This is because experiments are essentially reductionist – they reduce complexity by allowing only a few variables of interest to vary in a controlled manner, while controlling all other variables.

If critical variables are ignored or controlled, the experimental results might not generalize to real world settings. For example, in choosing to focus on efficiency as a dependent measure, Jane ignores other possible measures, such as awareness of the file structure that may result from other navigation techniques. The reduction can also mask critical interaction effects, such as the interaction between expertise and preferred navigation environment. For these reasons, if Jane’s experiment confirms her hypothesis, it means she has evidence that fisheye views are more efficient (as she defines efficiency), but it doesn’t necessarily mean that fisheye views are better suited to navigation!

Variants on experiments are possible and can be used in circumstances where a true experiment is not possible. For example, in quasi-experiments the subjects are not assigned randomly to the treatments. Quasi-experiments may be used, for example, when, for ethical reasons, subjects must be allowed to choose their treatment. Quasi-experiments are also used in the field. For example if an experiment is performed in a company, there may be constraints on which employees can work on which tasks. In time-series experiments, the effect of a treatment is measured in discrete time steps over a period of time. These variations are less powerful than true experiments, and require more careful interpretation.

Research Methods: Case Studies

Yin (2002) - “An empirical inquiry that investigates a contemporary phenomenon within its real-life context”

Case studies offer

in-depth understanding of how and why certain phenomena occur, and

can reveal the mechanisms by which cause-effect relationships occur.

Exploratory Case Studies - used as initial investigations of some phenomena to derive new hypotheses

Confirmatory Case Studies – used to test existing theories

There is much confusion in the SE literature over what constitutes a case study. The term is often used to mean a worked example. As an empirical method, a case study is something very different.

Yin (2002) introduces the case study as “an empirical inquiry that investigates a contemporary phenomenon within its real-life context, especially when the boundaries between phenomenon and context are not clearly evident.” Case studies offer in-depth understanding of how and why certain phenomena occur, and can reveal the mechanisms by which cause-effect relationships occur.

Exploratory case studies are used as initial investigations of some phenomena to derive new hypotheses and build theories, and confirmatory case studies are used to test existing theories. The latter are especially important for refuting theories: a detailed case study of a real situation in which a theory fails may be more convincing than ‘failed’ experiments in the lab. The detailed insights obtained from confirmatory case studies can also be useful for choosing between rival theories.

Conducting a Case Study: Precondition

Precondition – a clear research question concerned with how or why certain phenomena occur

guides the selection of cases and the types of data to collect

E.g., Imagine that Jane is upset as her tool is not adopted by developers after her experiment

Jane noticed in the post-experiment interviews that subjects frequently mentioned using additional advanced features for navigation that do not involve the file explorer

RQ: “How do developers use navigation tool support for large systems under development?”, and

Specifically she focuses on remarks that: “expert developers use many different strategies for navigation, and move between them very rapidly”

She chose a LOCAL COMPANY with several very experienced developers

Focused on observations rather than interview data to find out what developers actually do in more detail and in-person

A precondition for conducting a case study is a clear research question concerned with how or why certain phenomena occur. This is used to derive a study proposition that states precisely what the study is intended to show, and to guide the selection of cases and the types of data to collect.

As an example, imagine that Jane is upset as her tool is not adopted by developers after her experiment. She noticed in the post-experiment interviews that subjects frequently mentioned using additional advanced features for navigation that do not involve the file explorer (the only navigation tool available in the experiment). Hence, she poses the research question “How do developers use navigation tool support for large systems under development?”, and decides to focus on a specific proposition suggested by the post-experiment interviews that “expert developers use many different strategies for navigation, and move between them very rapidly”.

This leads her to choose a local company with several very experienced developers as her case, and to focus on observational rather than interview data, to find out what the developers actually do at a fine grain of detail.

Conducting a Case Study: Selecting Cases

The selection of cases is critical

Case study uses purposive sampling

Aim is to select cases that are most relevant to study proposition

Single Case Design

if the theory holds for this critical case, it is likely to be true for many others; OR

extreme or unique case that is expected to yield interesting insights about what happens under extreme conditions

Multiple Case Design

E.g., if Jane’s theory predicted that experienced developers do file navigation differently from novices

A multiple case study could include both experts and novices

The selection of cases is a crucial step in case study research. Case study research uses purposive sampling rather than random sampling. The aim is to select cases that are most relevant to the study proposition.

Sometimes a single case is sufficient. This might be because it is a critical case for testing a well-formulated theory: if the theory holds for this case, it is likely to be true for many others. Or it might be an extreme or unique case that is expected to yield interesting insights about what happens under extreme conditions, such as a crisis. Sometimes it is sufficient to identify a typical case to gain more insight into common situations.

However, a multiple case design usually offers greater validity. The different cases are best thought of as replications, rather than members of a sample. For confirmatory case studies, these can be chosen as literal replications, where each case is expected to show the same results, or as theoretical replications, where cases are expected to show contrasting results for predictable reasons. An example of the latter would be if Jane’s theory predicted that experienced developers do file navigation differently from novices. A multiple case study could include both experts and novices, to confirm that the theory adequately explains both.

Conducting a Case Study: Selecting Data Sources

Variety of data sources are used

Qualitative data, including interviews and observation, play a central role, as these offer rich insights into the case.

Data collection is always performed w.r.t a well-defined unit of analysis

Unit of analysis might be a company, a project, a team, an individual developer, a particular episode or event, a specific work product, etc

In Jane’s Case - Unit of analysis

individual developer (will help her focus on personal style/pref.)

A project - allow her to identify whether project teams develop shared navigational styles

A variety of different data sources are typically used in case study research. Qualitative data, including interviews and observation, play a central role, as these offer rich insights into the case.

Data collection is always performed with respect to a well-defined unit of analysis. In software engineering, the unit of analysis might be a company, a project, a team, an individual developer, a particular episode or event, a specific work product, etc. Choosing an appropriate unit of analysis is important, to ensure the study focuses on the intended phenomena. In Jane’s case, she chooses the individual developer as her unit of analysis, allowing her to focus on the personal styles of different developers. Other choices would lead the case study in different directions. For example, choosing a project as the unit of analysis would allow her to identify whether project teams develop shared navigational styles, but would offer fewer insights into individual styles.

Note that Jane’s case (a company) has multiple embedded units of analysis (the developers). In some studies, the case is the same as the unit of analysis.

Conducting a Case Study: Appropriateness?

Case study is most appropriate where

Context is expected to play a role

e.g., if the stresses of a real project affect developers’ behavior or

Effects are expected to be wide ranging, or take a long time

e.g. weeks, months, years to appear

Major weakness of case studies is that the data collection and analysis is more open to interpretation and researcher bias

Case study research is most appropriate for cases where the reductionism of controlled experiments is inappropriate. This includes situations where the context is expected to play a role in the phenomena (for example if the stresses of a real project affect developers’ behaviour), or where effects are expected to be wide ranging, or take a long time (e.g. weeks, months, years) to appear.

The major weakness of case studies is that the data collection and analysis is more open to interpretation and researcher bias. For this reason, an explicit framework is needed for selecting cases and collecting data. Although an individual case study often reveals deep insights, the validity of the results depends on a broader framework of empirical induction. For example, in confirmatory case studies, evidence builds when subsequent case studies also support the theory and/or fail to support rival theories.

Research Methods: Survey Research

Survey Research may involve

Use of questionnaires for data collection, or

Structured Interviews

The defining characteristic of survey research is the selection of a representative sample; a key practical question is how to reach that sample

e.g., Joe wished to understand more about how UML is used in industrial settings, and how UML supports collaborative design.

Survey research is used to identify the characteristics of a broad population of individuals. It is most closely associated with the use of questionnaires for data collection. However, survey research can also be conducted by using structured interviews or data logging techniques. The defining characteristic of survey research is the selection of a representative sample from a well-defined population, and the data analysis techniques used to generalize from that sample to the population, usually to answer base-rate questions.

As an example, recall that Joe wished to understand more about how UML is used in industrial settings, and how UML supports collaborative design.

Research Methods: Survey Research

Joe wished to understand more about how UML is used in industrial settings, and how UML supports collaborative design.

He conducts a survey of software companies across the country to ask them whether they use UML, and if so how.

He decides to use individual developers as his unit of analysis, so that he can focus on how different developers perceive the utility of UML.

He posts his survey to a number of carefully selected developer email lists, and has a response rate of 10%.

The results from the survey are interesting. He discovers that only about 20% of the respondents use UML, and that the diagrams are rarely used in shared settings.

He also learns that class diagrams are the most frequently used diagram, with sequence diagrams a close second.

As an example, recall that Joe wished to understand more about how UML is used in industrial settings, and how UML supports collaborative design. He conducts a survey of software companies across the country to ask them whether they use UML, and if so how. He decides to use individual developers as his unit of analysis, so that he can focus on how different developers perceive the utility of UML. He posts his survey to a number of carefully selected developer email lists, and has a response rate of 10%. The results from the survey are interesting. He discovers that only about 20% of the respondents use UML, and that the diagrams are rarely used in shared settings. He also learns that class diagrams are the most frequently used diagram, with sequence diagrams a close second.
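
Generalizing from a sample to a population usually means reporting an interval rather than a single percentage. The sketch below computes a Wilson score interval for a proportion such as Joe's "about 20%". The number of respondents is not given in the example, so the 200 respondents and 40 UML users used here are hypothetical values chosen only for illustration, and `wilson_interval` is an illustrative helper name.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical numbers (not from Joe's study): 200 respondents, 40 report using UML.
low, high = wilson_interval(40, 200)
print(f"Estimated UML usage: 20% (95% interval {low:.1%} to {high:.1%})")
```

Such an interval only reflects random sampling error. It says nothing about whether the respondents are representative of the target population.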

Research Methods: Survey Research

A major challenge in survey research is to control for sampling bias

because the respondents to the survey may not be representative of the target population

if the 10% who responded to Joe’s survey were the least busy of his targeted developers

it may be that the survey missed the most skilled, or most senior developers; OR

only people who are frustrated with UML answered his survey

Another challenge is to ensure that the questions are designed in a way that yields useful and valid data

hard to phrase the questions such that all participants understand them in the same way
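
The effect of sampling bias can be illustrated with a small simulation. This is a sketch under invented assumptions, not a model of Joe's actual data: the true UML usage rate (40%) and the two response probabilities are hypothetical values, picked so that the simulated survey shows roughly a 10% response rate and roughly 20% reported usage even though the true rate is twice that.

```python
import random


def simulate_survey(
    population_size: int = 100_000,
    true_uml_rate: float = 0.40,      # hypothetical true share of UML users
    p_respond_user: float = 0.05,     # hypothetical: UML users rarely respond
    p_respond_nonuser: float = 0.13,  # hypothetical: non-users respond more often
    seed: int = 0,
) -> dict[str, float]:
    """Simulate a survey in which the chance of responding depends on UML usage."""
    rng = random.Random(seed)
    users = respondents = responding_users = 0
    for _ in range(population_size):
        uses_uml = rng.random() < true_uml_rate
        p_respond = p_respond_user if uses_uml else p_respond_nonuser
        responded = rng.random() < p_respond
        users += uses_uml  # a bool counts as 0 or 1
        if responded:
            respondents += 1
            responding_users += uses_uml
    return {
        "true_rate": users / population_size,
        "response_rate": respondents / population_size,
        "observed_rate": responding_users / respondents,
    }


result = simulate_survey()
print(result)
```

Because responding is correlated with the quantity being measured, the rate observed among respondents falls well below the true rate, and collecting more responses of the same kind would not correct it.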

Research Methods: Ethnographies

Ethnography is a form of research focusing on the sociology of meaning through field observation

One conclusion from Joe’s study is that people don’t seem to use UML in the way Joe expected

An ethnography would allow Joe to understand more about how developers use and share UML

He identifies a development team that allows him to observe design meetings for several weeks

He could supplement his notes on what he observes with a series of individual and group interviews

A special form of ethnography is participant observation

Research Methods: Action Research

Action researchers aim to intervene in the studied situations for the explicit purpose of improving the situation

E.g., education, where major changes in educational strategies cannot be studied without implementing them

In the process of studying the use of UML, imagine that Joe’s colleagues discussed with him their difficulty in integrating software components and predicting the effects of such integration

Joe sees this as an opportunity to work with them to try out ideas from model-driven development (MDD), and to study firsthand how UML changes the way that developers collaborate

Joe initiates a project to work with his colleagues to introduce MDD and to record the experiences

Research Methods: Mixed Methods Approaches

Researchers can use different methods as they learn more about their research topics.

While Jane began with the design of an experiment to test the efficiency of file navigation with the fisheye view, she went on to perform a case study to explore some of the unexpected findings from the experiment.

This approach can be characterized as mixed methods research

Conclusion

Described key elements of empirical research design:

A Clear RQ

Selecting appropriate research method

Study Design

Data Collection

Criteria for assessing the validity of results

Have not talked about

Replication of experiments

Meta Analysis

We have described the key elements of empirical research design: A clear research question provides a focus to your study. An explicit philosophical stance helps you understand your research goals, and select an appropriate research method. A research method helps you design a study, and decide what kinds of data to collect and how to collect it. A theory helps you explain the data and relate it to the research question and to previous studies in the literature. An appropriate set of criteria for assessing validity helps improve the study design, and clarify the nature of the conclusions.

We have not addressed a number of related topics, including replication and meta-analysis. As the number of empirical studies in software engineering increases, these become more important. In particular, it is only through empirical induction that we come to trust the results of empirical research – i.e. the results need to hold up across many different studies to be considered reliable. Meta-analysis is the process of systematically comparing the results of multiple studies, taking into account differences in the design and context of each individual study. In current software engineering research, meta-analysis is hard to accomplish because of huge variability in the style and quality of the published reports of empirical work.
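
As a rough illustration of how results from several studies can be combined, the sketch below pools effect estimates with inverse-variance weights (a fixed-effect model). This is only the simplest variant and is not taken from the lecture: it assumes every study measures the same underlying effect on the same scale, so it does not account for the differences in design and context mentioned above. The effect sizes and standard errors are hypothetical, and `fixed_effect_pool` is an illustrative helper name.

```python
import math


def fixed_effect_pool(effects: list[float], std_errors: list[float]) -> tuple[float, float]:
    """Pool study effects with inverse-variance weights; return (estimate, std error)."""
    if not effects or len(effects) != len(std_errors):
        raise ValueError("need one standard error per effect, and at least one study")
    # More precise studies (smaller standard error) receive larger weights.
    weights = [1.0 / se**2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical effect sizes and standard errors from three studies.
pooled, pooled_se = fixed_effect_pool([0.30, 0.10, 0.25], [0.10, 0.20, 0.10])
print(f"pooled effect = {pooled:.3f}, standard error = {pooled_se:.3f}")
```

Applying even this simple calculation requires each published report to state an effect estimate and its uncertainty in a comparable form, which is where the variability in reporting style becomes an obstacle.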