Finance research report based on provided data

Lecture6.pptx

Data Analysis – 1 Introduction

FINA305/405

1

Is economics a science or an art?

2

Conflicts between theory and practice

3

Conflicts between forecasts and realizations

4

Why analyze data?

Researchers: To discover reliable business knowledge

Employees: To make your reports believable

Governments: To evaluate policy effects

Bottom line: To think in an innovative and interesting way

5

6

7

The key/challenge of data analysis

Causality

Pure effect

Ceteris paribus

Holding other factors constant

8

Data Analysis – 2 Fundamentals


9

Agenda

Function notation

Types of variables

Types of data

Statistics

Practice in Excel

10

Function notation

Often we are interested in the relationship between two or more variables, denoted using a function: 𝑌 = 𝑓(𝑋)

Linear function: 𝑌 = 𝛼 + 𝛽𝑋

Non-linear function:

11

We can think of our world as a data-generating function, though we do not know its true functional form.

We want to estimate it!

Terminology

𝑌: dependent variable or explained variable

𝑋: independent variable or explanatory variable

𝛼: intercept term

the value of 𝑌 when 𝑋 = 0

𝛽: coefficient or slope

a one-unit change in 𝑋 is associated with a 𝛽 change in 𝑌
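A minimal sketch of estimating an intercept and a slope from data. It assumes ordinary least squares as the estimator, which the slides do not name at this point; the function name `fit_line` and the numbers are hypothetical and chosen so the true line is known.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of X.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data generated exactly by Y = 2 + 3X (no noise).
xs = [0, 1, 2, 3, 4]
ys = [2 + 3 * x for x in xs]

intercept, slope = fit_line(xs, ys)
print(f"intercept = {intercept}, slope = {slope}")
```

Because the data lie exactly on a line, the estimates recover the intercept (the value of Y when X = 0) and the slope (the change in Y per one-unit change in X). With real data the points scatter around the line and the estimates are only approximations.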

13

Example

14

Types of variables

Numerical

Discrete (binary or multinomial)

number of bedrooms in a house, number of children in a family

Continuous

salary, GDP, education, etc.

Categorical

Nominal scale:

Marital status: not married, married, de facto, divorced, etc.

Ordinal scale: ranking

Evaluation scale: A+, A, A-, B+, …
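A small sketch of why the nominal/ordinal distinction matters in practice: nominal labels can only be compared for equality, while ordinal labels can also be ranked. The helper `grade_rank` and the sample lists are hypothetical; the grade order uses only the four grades listed on the slide.

```python
# Nominal scale: labels with no inherent order, so only counting
# and equality checks are meaningful.
marital_status = ["married", "divorced", "not married", "married"]
n_married = sum(1 for status in marital_status if status == "married")

# Ordinal scale: labels with a ranking, listed here from best to worst.
GRADE_ORDER = ["A+", "A", "A-", "B+"]


def grade_rank(grade):
    """Position of a grade on the ordinal scale (0 = best)."""
    return GRADE_ORDER.index(grade)


grades = ["B+", "A", "A+", "A-"]
best_to_worst = sorted(grades, key=grade_rank)

print(n_married)
print(best_to_worst)
```

Note that the ranks say only which grade is higher, not by how much, so averaging them would treat an ordinal variable as if it were numerical.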

15

Types of data

Cross-sectional data

Time-series data

Pooled cross-sectional data

Panel data

16

Cross-sectional data

YEAR  CEO Name             Gender  Age  Salary    Total Compensation  Industry Name
2009  David P. Storch      MALE    57   799.208   4955.641            Aerospace & Defense
2009  Robert E. Switz      MALE    62   695.711   2521.879            Communications Equipment
2009  Gerard Arpey         MALE    50   669.646   3765.152            Airlines
2009  John S. Gilbertson   MALE    66   706       879.3               Electronic Components
2009  Donald E. Brandt     MALE    54   890.568   3997.26             Electric Utilities

17

Time-series data

YEAR  CEO Name             Gender  Age  Salary    Total Compensation  Industry Name
2005  David P. Storch      MALE    53   716.6     12728.39            Aerospace & Defense
2006  David P. Storch      MALE    54   741.5     12855.4             Aerospace & Defense
2007  David P. Storch      MALE    55   768.248   8326.946            Aerospace & Defense
2008  David P. Storch      MALE    56   791.295   3313.996            Aerospace & Defense
2009  David P. Storch      MALE    57   799.208   4955.641            Aerospace & Defense

18

Pooled cross-sectional data

YEAR  CEO Name             Gender  Age  Salary    Total Compensation  Industry Name
2009  David P. Storch      MALE    57   799.208   4955.641            Aerospace & Defense
2009  Robert E. Switz      MALE    62   695.711   2521.879            Communications Equipment
2009  Gerard Arpey         MALE    50   669.646   3765.152            Airlines
2009  John S. Gilbertson   MALE    66   706       879.3               Electronic Components
2009  Donald E. Brandt     MALE    54   890.568   3997.26             Electric Utilities
2008  V. James Marino      MALE    58   856.25    3681.247            Personal Products
2008  Stanley M. Kuriyama  MALE    54   400       669.394             Marine
2008  Paul J. Evanson      MALE    66   1121.343  45342.249           Electric Utilities
2008  David Cote           MALE    54   1825.962  20090.174           Aerospace & Defense
2008  David J. Aldrich     MALE    51   583.404   1783.255            Semiconductors

19

Panel data

YEAR  CEO Name             Gender  Age  Salary    Total Compensation  Industry Name
2009  David P. Storch      MALE    57   799.208   4955.641            Aerospace & Defense
2009  Robert E. Switz      MALE    62   695.711   2521.879            Communications Equipment
2009  Gerard Arpey         MALE    50   669.646   3765.152            Airlines
2009  John S. Gilbertson   MALE    66   706       879.3               Electronic Components
2009  Donald E. Brandt     MALE    54   890.568   3997.26             Electric Utilities
2008  David P. Storch      MALE    56   791.295   3313.996            Aerospace & Defense
2008  Robert E. Switz      MALE    61   742.415   2768.56             Communications Equipment
2008  Gerard Arpey         MALE    49   666.348   4039.601            Airlines
2008  John S. Gilbertson   MALE    65   706       968.324             Electronic Components
2008  Donald E. Brandt     MALE    53   725       1730.574            Electric Utilities

20

Summary

Cross-sectional data:

Observations on multiple entities collected at a single point in time

Time-series data:

A series of observations on one entity over successive periods of time

Pooled cross-sectional data:

Cross-sections from two or more points in time, with different entities in each period

Panel data:

Cross-sections from two or more points in time, with the same entities followed across periods
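The four definitions can be sketched as a rule of thumb that looks only at which (year, entity) pairs appear in a data set. The function `classify` is a hypothetical helper, not a formal test: an unbalanced panel, for example, would be labelled as pooled cross-sectional. The CEO surnames are shortened from the example tables.

```python
def classify(observations):
    """Classify a data set from its (year, entity) pairs (simplified rule)."""
    years = {year for year, _ in observations}
    entities = {entity for _, entity in observations}
    if len(years) == 1:
        return "cross-sectional"
    if len(entities) == 1:
        return "time-series"
    # Panel: every entity is observed in every period.
    if len(set(observations)) == len(years) * len(entities):
        return "panel"
    return "pooled cross-sectional"


ceos_2009 = ["Storch", "Switz", "Arpey", "Gilbertson", "Brandt"]
ceos_2008 = ["Marino", "Kuriyama", "Evanson", "Cote", "Aldrich"]

cross_section = [(2009, ceo) for ceo in ceos_2009]
time_series = [(year, "Storch") for year in range(2005, 2010)]
pooled = cross_section + [(2008, ceo) for ceo in ceos_2008]
panel = cross_section + [(2008, ceo) for ceo in ceos_2009]

for name, data in [("cross_section", cross_section), ("time_series", time_series),
                   ("pooled", pooled), ("panel", panel)]:
    print(name, "->", classify(data))
```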

21

Notations

Subscripts (i or t) are used to denote different observations of a variable

We use subscript i for cross-sectional observations (e.g. states, individuals) and t for time-series observations (e.g. years, months, quarters)

Summation operator, capital sigma: $\sum_{i=1}^{n} X_i = X_1 + X_2 + \dots + X_n$

22

Statistics

Mean vs median

Mean: the average value of the entire set of numbers ($\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$).

Median: the middle value when the numbers are sorted from smallest to largest.

23

Variance vs standard deviation

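Both measure the spread (dispersion) of the numbers in a data set, i.e. how far each number is from the mean. The variance is the average squared deviation from the mean; the standard deviation is its square root, expressed in the same units as the data.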
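As a quick illustration, the mean, median, variance, and standard deviation can be computed for the five salaries in the 2009 cross-sectional example using Python's standard `statistics` module. The choice of the sample versions (dividing by n - 1) is an assumption; the slides do not say which version is intended.

```python
import statistics

# Salaries of the five CEOs in the 2009 cross-sectional example.
salaries = [799.208, 695.711, 669.646, 706, 890.568]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)

# Assumption: sample statistics (divide by n - 1).
# statistics.pvariance / statistics.pstdev give the population versions.
var_salary = statistics.variance(salaries)
sd_salary = statistics.stdev(salaries)

print(f"mean     = {mean_salary:.3f}")
print(f"median   = {median_salary:.3f}")
print(f"variance = {var_salary:.3f}")
print(f"std dev  = {sd_salary:.3f}")
```

The mean exceeds the median here because the largest salary pulls the average up, while the median depends only on the middle observation.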

24

Example

25

Theoretical distribution vs histogram

A theoretical distribution is a function showing all the possible values of the data and how often they occur.

A histogram (frequency chart) is a graphical representation of the distribution of numerical data. It is an estimate of the theoretical distribution.
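A minimal sketch of how a histogram is built: group the observations into bins and count how many fall in each. It uses the ten CEO ages from the pooled cross-sectional example; the bin width of 5 years is an assumption made for illustration.

```python
from collections import Counter

# Ages of the ten CEOs in the pooled cross-sectional example.
ages = [57, 62, 50, 66, 54, 58, 54, 66, 54, 51]

BIN_WIDTH = 5  # assumed bin width; the slides do not specify one

# Map each age to the lower edge of its bin, e.g. 57 -> 55.
counts = Counter((age // BIN_WIDTH) * BIN_WIDTH for age in ages)

# Print a text histogram, one row per bin.
for lower in sorted(counts):
    label = f"{lower}-{lower + BIN_WIDTH - 1}"
    print(f"{label}: {'#' * counts[lower]}")
```

Changing `BIN_WIDTH` changes the shape of the histogram, which is one reason it is only an estimate of the underlying distribution.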

26

Example

27

Correlation

Correlation measures the strength and direction of the linear relationship between two variables.
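A short sketch of the coefficient of correlation, assuming the usual Pearson definition: the sum of the products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the sample numbers are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson coefficient of correlation between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2 * x + 1 for x in xs])     # exact increasing line
r_down = pearson_r(xs, [10 - 3 * x for x in xs])  # exact decreasing line

print(r_up, r_down)
```

An exact increasing line gives a coefficient of +1 and an exact decreasing line gives -1, up to floating-point rounding. The steepness of the line does not matter, only how tightly the points follow it.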

28

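Coefficient of correlation values

-1.0: Perfect Negative Correlation

Between -1.0 and 0 (e.g. -.5): Increasing Degree of Negative Correlation as the value approaches -1.0

0: No Correlation

Between 0 and +1.0 (e.g. +.5): Increasing Degree of Positive Correlation as the value approaches +1.0

+1.0: Perfect Positive Correlation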

29


Coefficient of correlation plots

[Figure: four scatter plots of Y against X; recoverable panel titles: r = 1, r = -1, r = 0]

30


Correlation matrix

A correlation matrix shows the correlation of pairs of variables.

               Value $   Land Area   Rooms     Building Area
Value $        1
Land Area      0.00045   1
Rooms          0.60722   0.19927     1
Building Area  0.70607   -0.04193    0.86599   1

Diagonals always 1
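A sketch of how such a matrix can be built: compute the coefficient of correlation for every pair of columns. The column names follow the slide, but the numbers are hypothetical and are not the data behind the slide's matrix; `pearson_r` is an illustrative helper.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson coefficient of correlation between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical values for five houses, for illustration only.
columns = {
    "Value $": [300, 450, 500, 620, 800],
    "Land Area": [600, 400, 750, 500, 650],
    "Rooms": [3, 4, 4, 5, 6],
    "Building Area": [110, 150, 160, 200, 240],
}

names = list(columns)
matrix = {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

# Print the full (symmetric) matrix, one row per variable.
for a in names:
    print(f"{a:<14}" + "".join(f"{matrix[a][b]:>10.3f}" for b in names))
```

The matrix is symmetric, which is why the slide shows only the lower triangle, and each diagonal entry is 1 because a variable is perfectly correlated with itself.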

31

Question

Correlation between education and wages is strong & positive.

Does this mean education “causes” higher wages?

Possible answers/theories

Yes.

Education improves skills, skilled workers get better paying jobs.

Not necessarily.

Individuals with high innate ability pursue more education. Innate ability (not education) causes wages to increase. Education is just a signal of ability.

Individuals in rich families get more education. Family wealth (not education) may be what raises their wages.
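The "not necessarily" answer can be illustrated with a small simulation in which education has no effect on wages by construction, yet the two are strongly correlated because innate ability drives both. All numbers, including the coefficients and noise levels, are hypothetical.

```python
import random
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson coefficient of correlation between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


random.seed(0)  # fixed seed so the run is repeatable
n = 5000

ability = [random.gauss(0, 1) for _ in range(n)]
# Education depends on ability plus noise.
education = [a + random.gauss(0, 1) for a in ability]
# Wages depend on ability plus noise. Education does NOT enter this line,
# so its causal effect on wages is zero by construction.
wages = [2 * a + random.gauss(0, 1) for a in ability]

r = pearson_r(education, wages)
print(f"correlation(education, wages) = {r:.2f}")
```

The correlation comes out clearly positive even though education plays no role in generating wages. Holding ability constant (ceteris paribus) is what would reveal the true effect.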

33

Practice in Excel

http://www.rbnz.govt.nz/statistics/key-graphs/key-graph-house-price-values

Importing raw data

Summary statistics

Correlation matrix, Scatter plot

Histogram

Index and natural log transformation
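The last step, index and natural log transformation, can be sketched in Python as well as in Excel. The price values below are hypothetical and are not the RBNZ series; the base-period convention (first observation = 100) is an assumption.

```python
import math

# Hypothetical house price values for three periods, for illustration only.
prices = [400_000, 420_000, 441_000]

# Index: express each value relative to the base period (first value = 100).
base = prices[0]
index = [100 * p / base for p in prices]

# Natural log transformation: differences in logs approximate growth rates.
log_prices = [math.log(p) for p in prices]
log_growth = [later - earlier for earlier, later in zip(log_prices, log_prices[1:])]

print(index)
print(log_growth)
```

Here prices grow by 5% each period, so the index rises from 100 to 105 to 110.25, and each log difference equals ln(1.05), which is slightly below 0.05.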

Coefficient of correlation:

$$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}}$$

[Plot label: r = .85]