Econometrics homework

Economic Measurement

Class 1 for Econometrics 1

Vincent Geloso

Focus on the « econ » part

Economics is based on a series of a priori assumptions and axioms

Theories have to be logically consistent

However, theories can each be internally consistent, apply to the same phenomenon, and still be mutually exclusive

E.g. signaling versus asymmetric information

Other theories are not mutually exclusive: each explains a given applied situation to a different degree

E.g. the American revolution and tax incidence

Econometrics is about sorting theories by their relevance to applied situations

Question tells you the measurements you need

As such, the first step in any econometric effort is to ask a clear applied question:

Does class size affect schooling outcomes?

Is the effect of education on wages greater than the effect of an extra year of experience?

Did Quebec separatist governments affect Quebec’s economic growth? (Somers and Vaillancourt 2014; Geloso and Grier 2018)

Were French farmers less efficient than English farmers? (Geloso et al. 2017)

Did the stock market crash in 1929 because of news about the Smoot-Hawley tariffs? (Beaudreaux 2015)

The question will tell you the variables you need

Class size and test scores

Data on schooling achievements (e.g. which university, what degree, what field) and wages

Support for separatism and GDP per capita

Sample of farm output data and cultural markers

Stock market data and news events

Once the question is set, two things should happen in your mind

You write down the function you are looking for, so as to convert the question into an economic form (see section 1.5 on page 13)

You picture the shape of the data!
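As a hedged illustration of the first step, the class-size question could be turned into a function and then into an equation to estimate. The variable names and the symbols β0, β1 and u are illustrative assumptions, not notation taken from the slides or the textbook.

```latex
\text{score} = f(\text{class size},\ \text{other factors})
\quad\Longrightarrow\quad
\text{score}_i = \beta_0 + \beta_1\,\text{classsize}_i + u_i
```

Written this way, the applied question becomes a question about the sign and size of β1, and it tells you which columns the data must contain.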

Skimmed milk (i.e. water), Canada

[Chart: monthly series of 45 observations, October 2015 to June 2019 (dates stored as spreadsheet serial numbers 42278 to 43617); values rise from 4.93 to 5.17 and range between 4.92 and 5.25; units not stated]

Shape of data

Cross sections (different cases at a single point in time)

E.g. French and English Canadian farmers in Quebec in 1831

Time series (a single case over time)

E.g. the New York stock market daily data in 1929 and an index of good/bad news regarding trade tariffs

Pooled cross-sections (different cases over time)

E.g. taking the individuals in different annual surveys (such as the census) and putting the surveys together

Panel or longitudinal (same cases over time)

E.g. take taxpayers in 1989 and track their income to 2019, or track the French and English farms of 1831 across the successive censuses until 1871
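One way to picture the four shapes is to look only at which cases appear in which periods. The sketch below is a simplified rule of thumb, not a formal definition; the function name `classify_shape` and all identifiers in the examples are made up for illustration.

```python
def classify_shape(rows):
    """Classify a dataset from its (unit_id, period) pairs."""
    units = {unit for unit, _ in rows}
    periods = {period for _, period in rows}
    if len(periods) == 1:
        return "cross-section"          # many cases, one point in time
    if len(units) == 1:
        return "time series"            # one case, many points in time
    # Several cases and several periods: do the same cases come back?
    periods_per_unit = {}
    for unit, period in rows:
        periods_per_unit.setdefault(unit, set()).add(period)
    if all(len(seen) == 1 for seen in periods_per_unit.values()):
        return "pooled cross-sections"  # fresh cases in every period
    return "panel"                      # at least some cases are tracked


# Hypothetical identifiers, loosely based on the examples in the slides.
cross = [("farm_A", 1831), ("farm_B", 1831), ("farm_C", 1831)]
series = [("NYSE", "1929-10-28"), ("NYSE", "1929-10-29")]
pooled = [("resp_1", 1861), ("resp_2", 1861), ("resp_3", 1871), ("resp_4", 1871)]
panel = [("farm_A", 1831), ("farm_B", 1831), ("farm_A", 1871), ("farm_B", 1871)]

for name, rows in [("cross", cross), ("series", series),
                   ("pooled", pooled), ("panel", panel)]:
    print(name, "->", classify_shape(rows))
```

Note that this rule labels a dataset as a panel as soon as any case is observed twice, so an unbalanced panel and a partly overlapping survey look the same to it.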

Types of variables

Nominal/categorical

A dataset of earnings where the state of origin (e.g. Texas, California) is included

It will be included as a dummy variable (i.e. Texas or not Texas) and can be useful in many settings, such as in panels when you cannot control for certain things (in Econometrics II, I will discuss this in great detail with the fixed effects model)

Ordinal

When you can rank things in order but cannot measure the distance between them!

I can say that an unskilled worker will earn less than a skilled worker, but I cannot evaluate the actual distance in skills between the two. Here, we also use dummy variables

Interval/ratio measurement

A cardinal measure that gives the actual distance between different observations of the same variable
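The dummy-variable coding mentioned for nominal and ordinal variables can be sketched as follows. The helper `make_dummies`, the choice of "Ohio" as the omitted baseline, and the sample values are assumptions made for the illustration.

```python
def make_dummies(values, baseline):
    """Turn a categorical column into 0/1 columns, leaving out `baseline`."""
    categories = sorted(set(values) - {baseline})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


# Nominal variable: state of origin (hypothetical observations).
states = ["Texas", "California", "Texas", "Ohio"]
state_dummies = make_dummies(states, baseline="Ohio")
print(state_dummies)

# Ordinal variable: skilled versus unskilled, coded as a single dummy.
skill = ["unskilled", "skilled", "skilled", "unskilled"]
skilled_dummy = [1 if s == "skilled" else 0 for s in skill]
print(skilled_dummy)
```

One category is left out on purpose: with an intercept in the regression, a full set of dummies would be perfectly collinear with it.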

A quick note of caution (not on the exam)

Whatever results you get are only as good as the data you used to answer the question! Sometimes, people forget this and end up with false results.

As we will see later in this class, we use data to make inferences. Some of these inferences may be fallacious!

Example: Simpson’s Paradox and the ecological inference fallacy

Example: The situation of the French language in Quebec (Arsenault Morin and Geloso 2018)
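To make Simpson's paradox concrete, here is a small sketch in which a policy looks better inside every subgroup and worse once the subgroups are pooled. All counts, the policy labels X and Y, and the group labels are invented for the illustration; they are not taken from any of the cited papers.

```python
from fractions import Fraction

# Hypothetical (successes, total) counts, invented for illustration only.
counts = {
    "easy cases": {"X": (90, 100), "Y": (800, 1000)},
    "hard cases": {"X": (300, 1000), "Y": (20, 100)},
}

# Success rate of each policy inside each group.
within = {
    group: {policy: Fraction(s, t) for policy, (s, t) in by_policy.items()}
    for group, by_policy in counts.items()
}


def pooled_rate(policy):
    """Success rate of a policy once the groups are lumped together."""
    successes = sum(counts[g][policy][0] for g in counts)
    total = sum(counts[g][policy][1] for g in counts)
    return Fraction(successes, total)


pooled = {policy: pooled_rate(policy) for policy in ("X", "Y")}

for group, rates in within.items():
    print(group, {p: float(r) for p, r in rates.items()})
print("pooled", {p: round(float(r), 3) for p, r in pooled.items()})
```

X wins within both groups but loses in the pooled data, because in this made-up example X was applied mostly to the hard cases. Inferring the within-group story from the pooled figures alone would be fallacious.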

Assembling the data

As we will see later (weeks 4 and 5), there are populations and samples. Populations are generally very large and it is daunting to survey the whole population.

Censuses can do that, but they are costly and there are still problems that make them deviate from the true population features (i.e. people misreport on censuses).

You must also consider the right population (if you want to assess the wage effect of going to Harvard University, what is the right population?)

The features of a population are known as parameters

So you must assemble a sample instead!

Ideally, you would like random samples (see page 11 of textbook)

The sample counterparts of the population parameters are statistics (or estimators): the sample is used to infer statements about the parameters (which are generally unknown)
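The distinction between a parameter and a statistic can be sketched with a simulated population. Everything here is an assumption made for the illustration: the population of 100,000 wages is generated, not real, and in practice the population mean would be unknown.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of wages (simulated, log-normal shaped).
population = [rng.lognormvariate(3.0, 0.5) for _ in range(100_000)]
mu = statistics.fmean(population)        # parameter: normally unknown

# Simple random sample: every member is equally likely to be drawn.
sample = rng.sample(population, 1_000)
x_bar = statistics.fmean(sample)         # statistic: our estimate of mu

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

The sample mean will not equal the population mean exactly, and a different random sample would give a slightly different estimate.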

The way to write the data

One thing that you will frequently see in econometrics is the use of log-form (or ln with the natural logarithm)

The log of a number = the power to which a given base must be raised to equal that number

Log of 100 to base of 10 is 2 because 10^2 = 100

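The natural log uses the base e: as n goes to infinity, (1 + 1/n)^n tends to e ≈ 2.71828

Try it in Excel: fill 400 rows with n = 1 to 400 and compute the expression above in each row; the result creeps toward 2.71828 (at n = 400 it is about 2.715)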
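The spreadsheet exercise can also be sketched in Python. This assumes the limit in question is (1 + 1/n)^n, which is consistent with the value 2.71828 quoted in the slide; the function name `compound` is made up for the illustration.

```python
import math


def compound(n):
    """Value of (1 + 1/n)^n for a given n."""
    return (1 + 1 / n) ** n


# One value per "row", for n = 1, 2, ..., 400.
values = [compound(n) for n in range(1, 401)]

for n in (1, 10, 100, 400):
    print(n, round(values[n - 1], 5))
print("e =", round(math.e, 5))
```

The values rise with n and get closer to e, but slowly: 400 rows are enough to see the direction, not to reach all five decimals.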

The great virtue of logs for our purposes is that effects can be expressed in proportional terms (remember the log-linear form used in micro)
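A short sketch of why logs express effects in proportional terms: the difference between two natural logs is close to the proportional change when the change is small. The numbers and function names are arbitrary illustrations.

```python
import math


def log_change(old, new):
    """Difference in natural logs between two values."""
    return math.log(new) - math.log(old)


def pct_change(old, new):
    """Exact proportional change between two values."""
    return (new - old) / old


for old, new in [(100, 101), (100, 105), (100, 150)]:
    print(old, new,
          "log difference:", round(log_change(old, new), 4),
          "proportional change:", round(pct_change(old, new), 4))

# The same proportional change gives the same log difference at any scale.
print(round(log_change(50, 100), 4), round(log_change(500, 1000), 4))
```

The approximation is good for changes of a few percent and degrades for large ones (the log difference understates a 50% increase), which is worth remembering when reading coefficients from a log-form equation.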

Index numbers

One easy way to convert large quantities of information is to use index numbers, a little like you did in macro classes with GDP

The downside of index numbers is that they can « drift » (see Gerschenkron 1947 for an example with Soviet GDP, 1913-1945). See appendix B in the textbook (will be on the exam)
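A minimal sketch of an index number, and of one way the answer can depend on how the index is built. It assumes the issue at stake is the choice of price weights (early-year versus late-year prices), which is how the Gerschenkron effect is usually described. The two goods, their prices and their quantities are invented for the illustration.

```python
def simple_index(series, base_position=0):
    """Rescale a series so that the base observation equals 100."""
    base = series[base_position]
    return [100 * value / base for value in series]


def quantity_index(prices, q_base, q_current):
    """Value both sets of quantities at the same fixed prices (base = 100)."""
    current_value = sum(p * q for p, q in zip(prices, q_current))
    base_value = sum(p * q for p, q in zip(prices, q_base))
    return 100 * current_value / base_value


print(simple_index([50, 55, 60]))

# Hypothetical economy with two goods: (grain, machinery).
p_early, q_early = (1, 10), (100, 10)
p_late, q_late = (2, 5), (110, 50)   # machinery got cheaper and boomed

early_weights = quantity_index(p_early, q_early, q_late)
late_weights = quantity_index(p_late, q_early, q_late)
print("output index, early-year prices:", early_weights)
print("output index, late-year prices: ", late_weights)
```

With early-year prices the fast-growing good still carries its old high price, so measured growth is much larger than with late-year prices. Same quantities, two different growth stories.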

Trends and fluctuations

One important form of data is time series

We do not discuss this in great detail in Econometrics I; we will do so in Econometrics II

But there are things to know, largely because economics uses time series very often

Trends and fluctuations are the main issue

You can decompose them! One way is to use the regressions we will see later with a « time trend » variable to control for the trend.

The other way is to use moving averages (which give you the trend) and then divide the actual values by the moving average to get the detrended movements (i.e. how things move around the trend)
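Both approaches can be sketched on the monthly series from the "Skimmed milk, Canada" chart earlier in these slides (45 values, rounded to two decimals; the units are not stated in the source). The 5-month window and the function names are assumptions made for the illustration, and the trend line is fitted with the textbook least-squares formulas rather than with any particular package.

```python
series = [
    4.93, 4.93, 4.93, 4.92, 5.09, 5.09, 5.07, 5.10, 5.10, 5.13,
    5.13, 5.11, 5.13, 5.07, 5.11, 5.09, 5.10, 5.11, 5.11, 5.10,
    5.10, 5.10, 5.09, 5.10, 5.06, 5.07, 5.07, 5.07, 5.10, 5.10,
    5.11, 5.17, 5.09, 5.10, 5.08, 5.24, 5.25, 5.24, 5.23, 5.22,
    5.22, 5.20, 5.19, 5.19, 5.17,
]


def linear_trend(y):
    """Least-squares fit of y on t = 0, 1, ..., n-1; returns (intercept, slope)."""
    n = len(y)
    t_bar = (n - 1) / 2
    y_bar = sum(y) / n
    covariance = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y))
    variance = sum((t - t_bar) ** 2 for t in range(n))
    slope = covariance / variance
    return y_bar - slope * t_bar, slope


def centered_moving_average(y, window=5):
    """Average of each point and its neighbours; `window` must be odd."""
    half = window // 2
    return [sum(y[i - half:i + half + 1]) / window
            for i in range(half, len(y) - half)]


# Way 1: time trend variable, then look at what is left over.
intercept, slope = linear_trend(series)
residuals = [v - (intercept + slope * t) for t, v in enumerate(series)]

# Way 2: moving average as the trend, then actual divided by trend.
window = 5
half = window // 2
trend_ma = centered_moving_average(series, window)
ratio = [actual / ma
         for actual, ma in zip(series[half:len(series) - half], trend_ma)]

print("trend per month:", round(slope, 4))
print("largest ratio to moving average:", round(max(ratio), 4))
```

A centered moving average loses observations at both ends (two on each side with a 5-month window), while the regression keeps all 45. A ratio above 1 means the series sits above its trend in that month.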