empirical analysis


ECON 203 Project [SAMPLE]

The effect of family income on labor supply of employed youth in Argentina

1. Introduction

In basic economic theory, individuals maximize a utility function that depends on consumption of goods and leisure, subject to a budget constraint. Family income can be an important factor in the budget constraint: higher family income increases the individual's consumption possibilities and decreases their need to work in order to earn the income that maintains a similar level of consumption. At the same time, the individual would be better off by working less since they value leisure.

In this study I examine the main determinants of labor supply of working youths in Argentina, measured by hours worked. In particular, I am interested in testing whether the effect of family income satisfies the predictions of traditional economic theory. The main prediction is that higher family income decreases the number of hours worked by the individual.

In order to evaluate the determinants of hours of labor supplied, I use individual-level data for 2005 from the Permanent Survey of Households of the National Institute of Statistics (INDEC) of Argentina. I restrict the sample to youths between the ages of 18 and 25 who work a positive number of hours, in order to identify how family income affects the intensive margin of hours of work supplied. Using a multiple regression model, I am able to examine the main determinants of youth labor supply and also examine how family income affects different populations.

I show that my final model satisfies all the underlying assumptions. The data have no serious outliers, and the independent variables are not highly correlated with each other. In addition, by examining the histogram of standardized residuals and the scatter plot of residuals against predicted values of the dependent variable, I conclude that the error term is approximately normally distributed and that its variance appears to be constant. This implies that it is not necessary to transform the dependent variable.

Results suggest that family income plays a central role in determining the number of hours worked by youths in Argentina. Increasing family income by $1,000 pesos (around 16% of the mean income) is associated with a decrease of 9 hours of work per week (22% of the average hours worked). This evidence supports the theoretical predictions of the effect of family income on labor supply. I also find that age, gender, years of education and experience are important determinants of labor supply. On the other hand, family size, number of children and being married do not appear to be statistically related to hours worked. Finally, I test whether males respond differently to changes in family income compared to females by including the interaction between a dummy for male and family income. I find that males respond less to changes in family income than females.


2. Data

In this study I use data for Buenos Aires, Argentina for the year 2005, based on the Permanent Survey of Households compiled by the National Institute of Statistics (INDEC) and processed by CEDLAS of the University of La Plata. Since the main objective of this study is to analyze the effect of family income on labor supply, the dependent variable used will be total hours worked in a week (hrswrk) and the main independent variable is family monthly income (faminc). Family income does not include income earned by the individual and is measured in Argentine pesos. Other independent variables included are: age (age), a dummy variable that equals 1 if the individual is male (male), years of education (educ), number of children (nchild), a dummy variable that equals 1 if married (married), number of family members (nfamily), and a measure of potential experience (exp).

Table 1 presents descriptive statistics for my sample. The sample comprises 685 working individuals, around 66% of whom are male. On average, they work 41 hours a week, with a maximum of 98 hours and a minimum of 1 (individuals not working were excluded). The average monthly family income is $6,200 pesos. In Figure 1, I show the scatter plots of hours worked against each of the independent variables in order to identify severe outliers. From the plots it appears that there are no severe outliers. In addition, the plot of hours worked against family income suggests that there may be a curvilinear relation between them.

Before performing the regression analysis, I examined the correlations between all my variables in order to identify possible serious multicollinearity between independent variables. The correlation table is presented in Table 2. No two independent variables have a correlation above 0.8 in absolute value (the largest is -0.75, between educ and exp); therefore serious multicollinearity should not be a concern in the regression analysis.
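The pairwise screening described above can be sketched in a few lines. This is a minimal illustration, assuming a 0.8 cutoff on the absolute correlation; the function names `pearson` and `flag_collinear`, the column names `x1` to `x3`, and the toy numbers are all made up for the example and are not the survey data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def flag_collinear(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) > threshold:
            flagged.append((a, b, r))
    return flagged


# Hypothetical toy regressors (NOT the survey sample): x2 is nearly 2 * x1.
toy = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 4, 6, 8, 11],
    "x3": [5, 1, 4, 2, 3],
}

for name_a, name_b, r in flag_collinear(toy):
    print(f"{name_a} and {name_b} are highly correlated: r = {r:.2f}")
```

Only the `x1`/`x2` pair is flagged here. Checking the absolute value matters because a strong negative correlation is as much a multicollinearity concern as a strong positive one.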

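3. Regression Analysis

In order to examine the determinants of labor supply I estimate the following linear regression as the initial model:

\[
\begin{aligned}
\mathit{hrswrk}_i ={}& \alpha + \beta_1 \mathit{faminc}_i + \beta_2 \mathit{faminc}_i^2 + \beta_3 \mathit{nfamily}_i + \beta_4 \mathit{male}_i + \beta_5 \mathit{nchild}_i \\
& + \beta_6 \mathit{married}_i + \beta_7 \mathit{educ}_i + \beta_8 \mathit{exp}_i + \beta_9 \mathit{age}_i + \varepsilon_i
\end{aligned}
\]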
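Because the initial model contains family income squared, the slope of hours with respect to family income is not a single number. The short derivation below is standard calculus applied to the initial model, not an additional result of the study:

```latex
\[
\frac{\partial\, \mathit{hrswrk}_i}{\partial\, \mathit{faminc}_i} = \beta_1 + 2\beta_2\, \mathit{faminc}_i
\]
```

If \beta_2 = 0, the slope reduces to the constant \beta_1, so a test of the coefficient on the squared term is a test of whether the relation is linear in family income.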

As discussed earlier, family income (faminc) is likely an important determinant of labor supply since it enters the budget constraint of the individual's utility maximization problem. I include family income squared since the scatter plot in Figure 1 suggested that there may be a curvilinear relation between family income and hours worked. There are other independent variables that are potentially important factors in determining labor supply and are typically used in labor economics. For example, older individuals may receive less support from their parents and may be forced to work. Being married or having children may also influence the decision to work. At the same time, individuals with more education may receive more attractive job offers, increasing the opportunity cost of leisure.

The results for the initial model are presented in Table 3. In terms of overall goodness of fit, the R-squared indicates that the initial model explains 62% of the variation in hours of work. In addition, the independent variables are jointly relevant, since the null hypothesis of the F-test for overall significance of the model is rejected at the 5% level. Even though the model performs well as a whole, several individual slope coefficients are not statistically different from zero even at the 10% level: number of family members, number of children and marital status. Additionally, the quadratic term for family income has a p-value of over 20%, so I find no evidence of a curvilinear relation between family income and hours worked.

In the previous section I already ruled out serious multicollinearity between independent variables. Another reason that some of these variables may not be significant is the possibility of non-constant variance (heteroskedasticity) or non-normality of the error term. In both of these cases, the standard errors obtained would not be valid. In Figure 2 I present the histogram of standardized residuals to visually inspect for non-normality of the error term. The figure shows that the residuals closely follow a normal distribution. To check for heteroskedasticity, I plot the residuals against the predicted hours worked in Figure 3. The variance appears to be roughly constant across predicted values of hours worked.

In order to obtain a final model on which to base inference, I drop the variables that are not significant, including the quadratic income term, and re-estimate the model. Regression results are presented in Table 4. As a result of dropping these insignificant variables, the R-squared is essentially unchanged and the Adjusted R-squared increases slightly. This suggests that the three extra variables contributed nothing to explaining the variation in the dependent variable, and that the penalty for including them outweighed their contribution to the Adjusted R-squared. In addition, I performed a partial F-test to make sure that the variables excluded from the model do not significantly contribute to explaining the variation in hours worked. The partial F-test statistic is 0.0313 with a p-value of 0.9925. Therefore I cannot reject the null hypothesis that the three variables have coefficients equal to zero (i.e. that they are irrelevant in explaining hours of work). All remaining independent variables are statistically different from zero at the 10% level. The reduced model satisfies the assumptions of normality of the error term and constant variance, as shown in the histogram and scatter plot of Figures 4 and 5.

Since the assumptions appear to be satisfied, the reduced model will be my final model, which I use to make inference. This model is defined as:

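\[
\mathit{hrswrk}_i = \alpha + \beta_1 \mathit{faminc}_i + \beta_2 \mathit{male}_i + \beta_3 \mathit{educ}_i + \beta_4 \mathit{exp}_i + \beta_5 \mathit{age}_i + \varepsilon_i
\]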
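The comparison between the initial and reduced models can be partly re-derived from the sums of squares printed in Tables 3 and 4. The sketch below recomputes the Adjusted R-squared of both models and an F statistic for dropping all four terms at once (nfamily, nchild, married and the squared income term). This is a different restriction from the three-variable partial F-test reported in the text, which cannot be recomputed from the published tables alone. The helper names `adjusted_r2` and `nested_f` are illustrative assumptions.

```python
def adjusted_r2(ss_model, ss_resid, n, k):
    """Adjusted R-squared for a model with k slope coefficients and n observations."""
    r2 = ss_model / (ss_model + ss_resid)
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def nested_f(ss_resid_restricted, ss_resid_full, q, df_resid_full):
    """F statistic for the q zero-restrictions that turn the full model into the restricted one."""
    numerator = (ss_resid_restricted - ss_resid_full) / q
    denominator = ss_resid_full / df_resid_full
    return numerator / denominator


N = 685  # observations, from Tables 3 and 4

# Sums of squares and number of slopes as reported in Table 3 (initial) and Table 4 (reduced)
FULL = {"ss_model": 125857.77, "ss_resid": 76219.1037, "k": 9}
REDUCED = {"ss_model": 125680.337, "ss_resid": 76396.5367, "k": 5}

adj_full = adjusted_r2(FULL["ss_model"], FULL["ss_resid"], N, FULL["k"])
adj_reduced = adjusted_r2(REDUCED["ss_model"], REDUCED["ss_resid"], N, REDUCED["k"])

# Four restrictions: the three insignificant variables plus the squared income term
f_drop4 = nested_f(
    REDUCED["ss_resid"],
    FULL["ss_resid"],
    q=FULL["k"] - REDUCED["k"],
    df_resid_full=N - FULL["k"] - 1,
)

print(round(adj_full, 4), round(adj_reduced, 4))  # 0.6178 0.6192
print(round(f_drop4, 3))                          # 0.393
```

The recomputed Adjusted R-squared values match the tables, and the four-term F statistic is well below 1, far from conventional critical values, which points in the same direction as the reported test.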


4. Empirical Results

The coefficients in Table 4 suggest that if family income were to increase by $1,000 pesos, an individual from the sample would decrease labor supply by 9 hours per week on average (all else constant). Considering that the average number of hours worked is 41, this represents a 22% decrease in hours worked per week (9/41 = 0.219), a sizeable effect of family income on labor supply.

The older individuals are, the more hours they are likely to work. Increasing age by one year is associated with an increase of 8 hours of work per week. The dummy variable for male suggests that males in my sample work on average 2 hours more than females. Quite surprisingly, both years of education and years of potential experience are negatively related to labor supply, with effects similar in magnitude to that of age.

I found that family income has sizeable effects on labor supply. An interesting question is whether family income affects different populations differently. For example, the relation could be different for males and females. In order to test whether family income affects males differently, I estimate the reduced model including an interaction term between family income and the dummy variable for male. The results are presented in Table 5. For the interaction term I can reject the null hypothesis that the slope equals zero at the 5% significance level. The positive interaction term suggests that women reduce their labor supply more than men when family income increases.

5. Summary and Discussion

This study explored the determinants of youth labor supply for workers in Argentina. In particular, I was interested in analyzing the role of family income and testing whether the theoretical predictions are satisfied empirically. I found that family income is one of the most important factors in explaining the variation in hours of work. Holding all else constant, an increase of $1,000 pesos in family income is associated with a decrease of 9 hours of work per week (an effect of around 22% of the mean). Other factors I found to be important determinants of hours worked are age, education, experience and gender.

One significant shortfall of this study is that I consider only the intensive margin of labor supply, that is, I focus only on individuals who are working. Family income could potentially have different effects on the extensive margin: deciding whether to work or not. Since I do not include individuals with zero hours of work, I cannot extrapolate my results to this population. Future research should address this issue carefully.

A second shortfall is that I am using a very selected sample for Argentina, which may not be representative of other populations. Youths living in the United States or in European countries may respond differently to changes in family income since labor market conditions differ considerably between countries.


Figure 1: Scatter plots of hours worked vs. independent variables
[Eight panels: hrswrk (vertical axis, 0 to 100) against faminc, age, nfamily, male, nchild, married, educ and exp.]

Figure 2: Histogram of standardized residuals, initial model
[Axes: stdresid (horizontal), Density (vertical).]

Figure 3: Residuals vs. predicted hours of work, initial model
[Axes: Fitted values (horizontal), Residuals (vertical).]


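Figure 4: Histogram of standardized residuals, reduced model
[Axes: stdresid (horizontal), Density (vertical).]

Figure 5: Residuals vs. predicted hours of work, reduced model
[Axes: Fitted values (horizontal), Residuals (vertical).]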

Table 1: Descriptive Statistics

Variable   Observations   Mean       Std. Dev.   Min        Max
hrswrk     685            41.587     17.188      1          98.1
faminc     685            6184.595   1547.212    1831.163   9643.931
age        685            22.261     2.189       18         25
nfamily    685            4.632      2.350       1          15
male       685            0.663      0.473       0          1
nchild     685            0.257      0.615       0          3
married    685            0.309      0.463       0          1
educ       685            10.645     2.692       3          17
exp        685            5.625      3.264       0          16

Notes: Own calculations based on Permanent Survey of Households for Argentina in 2005. Sample of youths of Buenos Aires that work.

   

Table 2: Correlations

          hrswrk    faminc    age       nfamily   male      nchild    married   educ      exp
hrswrk    1
faminc    -0.785    1
age       0.0429    -0.0366   1
nfamily   -0.1014   0.1265    -0.1782   1
male      0.1647    -0.1451   -0.039    0.0171    1
nchild    0.0434    -0.0591   0.2269    -0.1368   0.0319    1
married   0.056     -0.0728   0.2376    -0.1385   0.0166    0.5936    1
educ      -0.0053   0.0375    0.1085    -0.2388   -0.1767   -0.1638   -0.1406   1
exp       0.0322    -0.0557   0.5797    0.0767    0.1187    0.2862    0.2745    -0.7466   1

Notes: Own calculations based on Permanent Survey of Households for Argentina in 2005. Sample of youths of Buenos Aires that work.



Table 3: Initial model regression results

Variable    Coefficient   Std. Err.   T-stat   P-value   [95% Conf. Interval]
faminc      -0.011        0.002       -6.5     0.000     -0.014     -0.007
faminc2     0.000         0.000       1.22     0.225     0.000      0.000
age         8.102         4.441       1.82     0.069     -0.617     16.821
nfamily     0.052         0.184       0.28     0.778     -0.310     0.414
male        2.165         0.883       2.45     0.014     0.432      3.898
nchild      -0.138        0.837       -0.17    0.869     -1.782     1.505
married     0.173         1.109       0.16     0.876     -2.005     2.351
educ        -7.722        4.424       -1.75    0.081     -16.407    0.964
exp         -8.008        4.451       -1.8     0.072     -16.748    0.733
Constant    45.516        27.509      1.65     0.098     -8.498     99.529

Observations: 685
R-squared: 0.6228
Adj-R-squared: 0.6178
F-stat: 123.84
Pval. F-stat: 0.000

Source     SS            df    MS
Model      125857.77     9     13984.1967
Residual   76219.1037    675   112.917191

Notes: Own calculations based on Permanent Survey of Households for Argentina in 2005. Sample of youths of Buenos Aires that work. Ordinary least squares estimates presented.

Table 4: Reduced model regression results

Variable    Coefficient   Std. Err.   T-stat   P-value   [95% Conf. Interval]
faminc      -0.009        0.000       -32.59   0.000     -0.009     -0.008
age         8.185         4.431       1.85     0.065     -0.515     16.885
male        2.109         0.880       2.4      0.017     0.381      3.837
educ        -7.844        4.414       -1.78    0.076     -16.511    0.822
exp         -8.108        4.441       -1.83    0.068     -16.829    0.613
Constant    40.544        27.004      1.5      0.134     -12.478    93.565

Observations: 685
R-squared: 0.622
Adj-R-squared: 0.6192
F-stat: 223.41
Pval. F-stat: 0.000

Source     SS            df    MS
Model      125680.337    5     25136.0675
Residual   76396.5367    679   112.513309

Notes: Own calculations based on Permanent Survey of Households for Argentina in 2005. Sample of youths of Buenos Aires that work. Ordinary least squares estimates presented.
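As a worked check of the headline magnitude, the arithmetic below uses only the rounded faminc coefficient from Table 4 and the sample means from Table 1. Because the coefficient is reported to three decimals, the result is approximate.

```latex
\[
\Delta\widehat{\mathit{hrswrk}} = \hat\beta_{\mathit{faminc}} \times \Delta\mathit{faminc}
  \approx -0.009 \times 1000 = -9 \text{ hours per week}
\]
\[
\frac{9}{41} \approx 0.22, \qquad \frac{1000}{6184.595} \approx 0.16
\]
```

That is, a change of about 16% of mean family income is associated with a change of about 22% of mean weekly hours.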


   

Table 5: Reduced model regression results with interaction

Variable      Coefficient   Std. Err.   T-stat   P-value   [95% Conf. Interval]
faminc        -0.010        0.000       -20.37   0.000     -0.011     -0.009
male          -6.781        3.718       -1.82    0.069     -14.081    0.519
maleXfaminc   0.001         0.001       2.46     0.014     0.000      0.003
age           8.539         4.417       1.93     0.054     -0.133     17.211
educ          -8.170        4.400       -1.86    0.064     -16.808    0.469
exp           -8.443        4.427       -1.91    0.057     -17.136    0.249
Constant      44.258        26.946      1.64     0.101     -8.651     97.166

Observations: 685
R-squared: 0.6253
Adj-R-squared: 0.622
F-stat: 188.57
Pval. F-stat: 0.000

Source     SS            df    MS
Model      126356.579    6     21059.4299
Residual   75720.2948    678   111.681851

Notes: Own calculations based on Permanent Survey of Households for Argentina in 2005. Sample of youths of Buenos Aires that work. Ordinary least squares estimates presented.
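The interaction term in Table 5 implies a different income slope for each gender. The derivation below uses the rounded coefficients exactly as printed, so the one-hour gap it produces is only indicative of the size of the difference.

```latex
\[
\frac{\partial\, \widehat{\mathit{hrswrk}}}{\partial\, \mathit{faminc}}
  = \hat\beta_{\mathit{faminc}} + \hat\beta_{\mathit{maleXfaminc}} \times \mathit{male}
\]
\[
\text{females } (\mathit{male}=0):\; -0.010
\qquad
\text{males } (\mathit{male}=1):\; -0.010 + 0.001 = -0.009
\]
```

Per $1,000 pesos of additional family income, this corresponds to roughly 10 fewer hours per week for females and 9 fewer for males, consistent with the finding that women respond more strongly.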