data analysis and R studio assignment

The UK’s gender pay gap in 2019

Introduction

It has long been known that women tend to be paid less than men in the UK. For instance, Olsen and Walby (2004) note that in the UK in 2003 there was an 18% gap between the earnings of women working full-time and those of men. The situation is not unique to the UK and has persisted over time. Some have hypothesised that the gap might be explained by differences in human capital or in the types of job taken. While these factors may account for some of the difference, they do not explain all of it. For example, Blackaby et al. (2005) study the UK academic labour market and find a gender pay gap among academics of the same rank. The gap persists even when controlling for a variety of socio-demographic differences, geographical variation in wages and measures of productivity.

In this paper, we use data from the UK’s Quarterly Labour Force Survey for April-June 2019 to measure the gap between the hourly pay of men and women. We test whether this difference is robust to the inclusion of a variety of common control variables, e.g., age, education, sector and seniority.

Data

We first select the variables we will use in the analysis. The following variables are used:

Table 1: Variable names and definitions

| Variable | Description |
| --- | --- |
| Hourly pay | Hourly wage in GBP |
| Sex | Sex |
| Age | Age in years |
| Region | Government office region |
| Zero-hours contract | Whether the worker is on a zero-hours contract |
| Full-time | Whether the person is full-time or part-time |
| Education | Highest educational qualification |
| Industry | Industry sector |
| Seniority | NS-SEC major group (SOC2010 based) |
| Permanent | Permanent or temporary job |
| Firm size | Number of employees at workplace |
| Married | Married, living with spouse |
| Hours worked | Basic usual hours |

We filter the data to include only relevant observations and to exclude potentially erroneous values. We also extract the variables to be used in the analysis. The full dataset has responses from 86,548 people. The data are filtered based on the following set of rules:

1. Include only observations reporting an hourly wage.

2. Exclude observations where the hourly pay is below the minimum wage (£3.90 for an apprentice).

3. Exclude people who usually work zero hours.

4. Exclude people who worked 97 hours or more. The variable is top-coded so we do not know how many hours are usually worked.

5. Include only working age people i.e., 16-64 years.

6. Exclude any negative values as these are used to indicate missing values.

7. Exclude people who are on New-Deal full-time/part-time positions (a very small number of people).

8. Exclude people who do not know what their highest level of education is.

9. Exclude people in the NS-SEC category “Small employers and own account workers” due to the small number of observations in this category.
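The first six rules above can be sketched in R as follows. This is a minimal illustration on a toy data frame, not the code used for the analysis. HOURPAY and AGE are variable names that appear in the output later in the paper, but BUSHR is an assumed name for the usual-hours variable, and the toy values are invented purely to exercise each rule.

```r
# Toy stand-in for the survey extract (values are illustrative only).
# BUSHR is an assumed name for basic usual hours; negative codes mark missing values.
lfs <- data.frame(
  HOURPAY = c(12.50, -9, 3.10, 25.00, 9.80, 15.00, 18.20),
  BUSHR   = c(37,    40, 20,   97,    0,    35,    30),
  AGE     = c(34,    45, 17,   50,    29,   70,    22)
)

analysis <- subset(
  lfs,
  HOURPAY >= 3.90 &        # rules 1, 2 and 6: a reported wage at or above the apprentice minimum
    BUSHR > 0 &            # rules 3 and 6: drop zero usual hours and negative (missing) codes
    BUSHR < 97 &           # rule 4: drop the top-coded hours category
    AGE >= 16 & AGE <= 64  # rule 5: working age only
)

nrow(analysis)
```

Because missing values are coded as negative numbers, a single lower bound on each numeric variable handles both the substantive rule and the missing-value rule at once.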

This gives a final sample size of 9,379 respondents. This is a substantial drop from the initial 86,548. Given that the survey covers the whole population and we have extracted only people in work who report an hourly wage, a substantial drop is to be expected. In Table 2, we present some basic summary statistics for the data.

Table 2: Summary statistics

| | Male (N=4338) | Female (N=5041) | Total (N=9379) |
| --- | --- | --- | --- |
| Hourly wage (£) | | | |
| Mean (SD) | 17.718 (13.349) | 14.204 (10.244) | 15.829 (11.911) |
| Range | 3.900 - 276.940 | 3.920 - 246.170 | 3.900 - 276.940 |
| Age | | | |
| Mean (SD) | 41.636 (12.060) | 41.913 (12.037) | 41.785 (12.048) |
| Range | 16.000 - 64.000 | 16.000 - 64.000 | 16.000 - 64.000 |
| Region | | | |
| London | 391 (9.0%) | 454 (9.0%) | 845 (9.0%) |
| Yorkshire and The Humber | 339 (7.8%) | 411 (8.2%) | 750 (8.0%) |
| South West | 407 (9.4%) | 453 (9.0%) | 860 (9.2%) |
| West Midlands | 339 (7.8%) | 423 (8.4%) | 762 (8.1%) |
| South East | 619 (14.3%) | 670 (13.3%) | 1289 (13.7%) |
| East of England | 428 (9.9%) | 455 (9.0%) | 883 (9.4%) |
| East Midlands | 375 (8.6%) | 393 (7.8%) | 768 (8.2%) |
| North East | 180 (4.1%) | 248 (4.9%) | 428 (4.6%) |
| North West | 448 (10.3%) | 559 (11.1%) | 1007 (10.7%) |
| Wales | 182 (4.2%) | 206 (4.1%) | 388 (4.1%) |
| Northern Ireland | 281 (6.5%) | 350 (6.9%) | 631 (6.7%) |
| Scotland | 349 (8.0%) | 419 (8.3%) | 768 (8.2%) |
| Zero hours contract | | | |
| Yes | 77 (1.8%) | 108 (2.1%) | 185 (2.0%) |
| No | 4261 (98.2%) | 4933 (97.9%) | 9194 (98.0%) |
| Full-time | | | |
| Full-time | 3946 (91.0%) | 3143 (62.3%) | 7089 (75.6%) |
| Part-time | 392 (9.0%) | 1898 (37.7%) | 2290 (24.4%) |
| Education | | | |
| Degree or equivalent | 1509 (34.8%) | 2025 (40.2%) | 3534 (37.7%) |
| Higher education | 387 (8.9%) | 503 (10.0%) | 890 (9.5%) |
| A level or equivalent | 1078 (24.9%) | 1003 (19.9%) | 2081 (22.2%) |
| GCSE A*-C or equivalent | 788 (18.2%) | 1022 (20.3%) | 1810 (19.3%) |
| Other qualification | 370 (8.5%) | 288 (5.7%) | 658 (7.0%) |
| No qualification | 206 (4.7%) | 200 (4.0%) | 406 (4.3%) |
| Industry | | | |
| Public admin, education and health | 894 (20.6%) | 2479 (49.2%) | 3373 (36.0%) |
| Agriculture, forestry and fishing | 29 (0.7%) | 17 (0.3%) | 46 (0.5%) |
| Energy and water | 142 (3.3%) | 53 (1.1%) | 195 (2.1%) |
| Manufacturing | 683 (15.7%) | 275 (5.5%) | 958 (10.2%) |
| Construction | 369 (8.5%) | 101 (2.0%) | 470 (5.0%) |
| Distribution, hotels and restaurants | 747 (17.2%) | 894 (17.7%) | 1641 (17.5%) |
| Transport and communication | 596 (13.7%) | 214 (4.2%) | 810 (8.6%) |
| Banking and finance | 709 (16.3%) | 784 (15.6%) | 1493 (15.9%) |
| Other services | 169 (3.9%) | 224 (4.4%) | 393 (4.2%) |
| Permanent | | | |
| Permanent | 4177 (96.3%) | 4835 (95.9%) | 9012 (96.1%) |
| Not permanent in some way | 161 (3.7%) | 206 (4.1%) | 367 (3.9%) |
| Firm size | | | |
| 1-10 | 696 (16.0%) | 875 (17.4%) | 1571 (16.8%) |
| 11-19 | 319 (7.4%) | 418 (8.3%) | 737 (7.9%) |
| 20-24 | 196 (4.5%) | 219 (4.3%) | 415 (4.4%) |
| Don’t know but under 25 | 68 (1.6%) | 95 (1.9%) | 163 (1.7%) |
| 25-49 | 530 (12.2%) | 762 (15.1%) | 1292 (13.8%) |
| 50-249 | 1106 (25.5%) | 1206 (23.9%) | 2312 (24.7%) |
| 250-499 | 356 (8.2%) | 343 (6.8%) | 699 (7.5%) |
| Don’t know but between 50 and 499 | 148 (3.4%) | 140 (2.8%) | 288 (3.1%) |
| 500 or more | 919 (21.2%) | 983 (19.5%) | 1902 (20.3%) |
| Seniority | | | |
| Higher managerial and professional | 1038 (23.9%) | 697 (13.8%) | 1735 (18.5%) |
| Lower managerial and professional | 1213 (28.0%) | 1651 (32.8%) | 2864 (30.5%) |
| Intermediate occupations | 423 (9.8%) | 1145 (22.7%) | 1568 (16.7%) |
| Lower supervisory and technical | 546 (12.6%) | 229 (4.5%) | 775 (8.3%) |
| Semi-routine occupations | 466 (10.7%) | 806 (16.0%) | 1272 (13.6%) |
| Routine occupations | 545 (12.6%) | 362 (7.2%) | 907 (9.7%) |
| Never worked | 107 (2.5%) | 151 (3.0%) | 258 (2.8%) |
| Usual hours worked | | | |
| Mean (SD) | 38.162 (8.689) | 30.529 (10.242) | 34.060 (10.284) |
| Range | 1.000 - 90.000 | 1.000 - 80.000 | 1.000 - 90.000 |
| Married | | | |
| No | 2001 (46.1%) | 2556 (50.7%) | 4557 (48.6%) |
| Yes | 2337 (53.9%) | 2485 (49.3%) | 4822 (51.4%) |
| Firm size, recoded (firm_size) | | | |
| Under 25 (small_firm) | 1279 (29.5%) | 1607 (31.9%) | 2886 (30.8%) |
| 25-49 | 530 (12.2%) | 762 (15.1%) | 1292 (13.8%) |
| 50-249 | 1106 (25.5%) | 1206 (23.9%) | 2312 (24.7%) |
| 250-499 | 356 (8.2%) | 343 (6.8%) | 699 (7.5%) |
| Don’t know but between 50 and 499 | 148 (3.4%) | 140 (2.8%) | 288 (3.1%) |
| 500 or more | 919 (21.2%) | 983 (19.5%) | 1902 (20.3%) |
| Log hourly wage (log_wage) | | | |
| Mean (SD) | 2.711 (0.541) | 2.515 (0.490) | 2.605 (0.524) |
| Range | 1.361 - 5.624 | 1.366 - 5.506 | 1.361 - 5.624 |

As expected, a substantial difference in hourly pay between men and women is evident. On average, men earned £17.72 per hour while women earned £14.20. We can use a t-test to assess whether a difference of this size could plausibly arise from sampling variation alone.

##
##  Welch Two Sample t-test
##
## data:  HOURPAY by SEX
## t = 14.124, df = 8064.6, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Male and group Female is not equal to 0
## 95 percent confidence interval:
##  3.026284 4.001653
## sample estimates:
##   mean in group Male mean in group Female
##             17.71763             14.20366

The t-test tests the null hypothesis that the population mean wage for men is equal to the population mean wage for women. The p-value is substantially below 0.05 and we therefore reject the null hypothesis that the population means are equal. Table 2 shows that there are some differences in the characteristics of the men and women in the sample. For instance, women are much more likely to work in public administration, education and health. Women also tend to work fewer hours. Men are more likely to occupy the most senior positions.
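The Welch statistic can be checked by hand from the summary statistics in Table 2. The sketch below is only a worked illustration of the formula, using the rounded means, standard deviations and group sizes from that table rather than the underlying data.

```r
# Summary statistics as reported in Table 2 (hourly wage, GBP).
mean_m <- 17.718; sd_m <- 13.349; n_m <- 4338
mean_f <- 14.204; sd_f <- 10.244; n_f <- 5041

# Squared standard error of each group mean.
v_m <- sd_m^2 / n_m
v_f <- sd_f^2 / n_f

# Welch t statistic: difference in means over its standard error.
se_diff <- sqrt(v_m + v_f)
t_stat  <- (mean_m - mean_f) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch <- (v_m + v_f)^2 / (v_m^2 / (n_m - 1) + v_f^2 / (n_f - 1))

# 95% confidence interval for the difference in means.
ci <- (mean_m - mean_f) + c(-1, 1) * qt(0.975, df_welch) * se_diff

round(c(t = t_stat, df = df_welch, lower = ci[1], upper = ci[2]), 3)
```

The results should be close to the values reported in the output above; small discrepancies arise because Table 2 rounds the inputs to three decimal places.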

We can compare the distributions of wages graphically using a density plot.

The vertical lines on the plot show the median wages of men and women. We can clearly see that men have higher median wages than women. Men dominate the upper part of the distribution.

Analysis

Having established that there is a difference in the average wage of men and women, we now wish to test whether this difference can be explained by differences in the characteristics of the workers. Based on our brief review of previous literature, we include some commonly used variables to describe differences between the workers. To do this, we utilise a linear regression model. One of the assumptions of the linear regression model is that the residuals of the model should follow a normal distribution. Given that the hourly wage variable is highly skewed, we use a logarithmic transformation. This is a common approach in the literature and allows us to relate changes in characteristics to percentage changes in wage.

In the table below, we present five regression models. We begin with a bivariate model with sex as the only independent variable (Model 1). We then add demographic factors (Model 2), followed by job characteristics (Model 3), firm characteristics (Model 4) and finally the variables describing the region (Model 5).

Table 3: Regression results. Dependent Variable: Log Hourly Wage (£)

| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
| --- | --- | --- | --- | --- | --- |
| Intercept | 2.71*** (0.01) | 2.61*** (0.02) | 2.88*** (0.03) | 2.69*** (0.04) | 2.82*** (0.04) |
| Female | -0.20*** (0.01) | -0.22*** (0.01) | -0.14*** (0.01) | -0.11*** (0.01) | -0.11*** (0.01) |
| Age (Years) | | 0.01*** (0.00) | 0.01*** (0.00) | 0.01*** (0.00) | 0.01*** (0.00) |
| Married | | 0.12*** (0.01) | 0.08*** (0.01) | 0.08*** (0.01) | 0.08*** (0.01) |
| Education - Ref: Degree | | | | | |
| Some higher education | | -0.32*** (0.02) | -0.18*** (0.01) | -0.17*** (0.01) | -0.16*** (0.01) |
| A level | | -0.40*** (0.01) | -0.21*** (0.01) | -0.19*** (0.01) | -0.18*** (0.01) |
| GCSE | | -0.50*** (0.01) | -0.25*** (0.01) | -0.23*** (0.01) | -0.21*** (0.01) |
| Other | | -0.57*** (0.02) | -0.25*** (0.02) | -0.24*** (0.02) | -0.24*** (0.02) |
| No qualifications | | -0.69*** (0.02) | -0.33*** (0.02) | -0.31*** (0.02) | -0.30*** (0.02) |
| Contract type | | | | | |
| Zero hours contract | | | 0.05 (0.03) | 0.03 (0.03) | 0.04 (0.03) |
| Part-time | | | -0.08*** (0.01) | -0.04*** (0.01) | -0.04*** (0.01) |
| Non-permanent job | | | -0.06** (0.02) | -0.07*** (0.02) | -0.07*** (0.02) |
| Seniority - Ref: Higher managerial and professional | | | | | |
| Lower managerial | | | -0.29*** (0.01) | -0.25*** (0.01) | -0.25*** (0.01) |
| Intermediate | | | -0.52*** (0.01) | -0.49*** (0.01) | -0.49*** (0.01) |
| Lower supervisory | | | -0.53*** (0.02) | -0.48*** (0.02) | -0.47*** (0.02) |
| Semi-routine | | | -0.69*** (0.02) | -0.62*** (0.02) | -0.62*** (0.02) |
| Routine | | | -0.69*** (0.02) | -0.66*** (0.02) | -0.65*** (0.02) |
| Never worked | | | -0.56*** (0.03) | -0.48*** (0.03) | -0.47*** (0.03) |
| Firm size (employees) - Ref: <25 | | | | | |
| 25-49 | | | | 0.05*** (0.01) | 0.06*** (0.01) |
| 50-249 | | | | 0.09*** (0.01) | 0.09*** (0.01) |
| 250-499 | | | | 0.13*** (0.02) | 0.13*** (0.02) |
| Between 50-499 | | | | 0.12*** (0.02) | 0.12*** (0.02) |
| 500+ | | | | 0.20*** (0.01) | 0.19*** (0.01) |
| Industry - Ref: Public Administration, Education & Health | | | | | |
| Agriculture | | | | 0.02 (0.06) | 0.03 (0.06) |
| Energy & Water | | | | 0.16*** (0.03) | 0.17*** (0.03) |
| Manufacturing | | | | 0.08*** (0.01) | 0.09*** (0.01) |
| Construction | | | | 0.19*** (0.02) | 0.18*** (0.02) |
| Distribution & hospitality | | | | -0.04** (0.01) | -0.04** (0.01) |
| Transport | | | | 0.14*** (0.02) | 0.12*** (0.02) |
| Banking | | | | 0.15*** (0.01) | 0.13*** (0.01) |
| Other services | | | | 0.02 (0.02) | 0.01 (0.02) |
| Region - Ref: London | | | | | |
| Yorkshire and The Humber | | | | | -0.19*** (0.02) |
| South West | | | | | -0.17*** (0.02) |
| West Midlands | | | | | -0.19*** (0.02) |
| South East | | | | | -0.10*** (0.02) |
| East of England | | | | | -0.12*** (0.02) |
| East Midlands | | | | | -0.20*** (0.02) |
| North East | | | | | -0.20*** (0.02) |
| North West | | | | | -0.19*** (0.02) |
| Wales | | | | | -0.19*** (0.02) |
| Northern Ireland | | | | | -0.19*** (0.02) |
| Scotland | | | | | -0.12*** (0.02) |
| R2 | 0.03 | 0.29 | 0.46 | 0.50 | 0.51 |
| Adj. R2 | 0.03 | 0.29 | 0.46 | 0.49 | 0.51 |
| Num. obs. | 9379 | 9379 | 9379 | 9379 | 9379 |

Standard errors in parentheses. ***p < 0.001; **p < 0.01; *p < 0.05

Model 1 reproduces our earlier result, showing that women earn less than men. In this case, the average difference is about 20%. The R-squared shows that sex alone explains only around 3% of the variation in the log hourly wage. In Model 2, we add age, marital status and education level. Age is likely to be correlated with experience, and is positively associated with wages. Being married and living with a spouse also has a positive association with the wage level (adding around 12%). The association of education with wages is as expected, with more education typically being associated with higher wages. Interestingly, taking these factors into account increases the estimated wage gap, i.e., given the age, education and marital status of the women, we would have expected them to earn more.

In Model 3 we introduce characteristics of the job. Part-time and non-permanent positions have lower hourly wages, while being on a zero-hours contract has no statistically significant association with wages. As expected, more senior positions tend to have higher wages. We also note that in this model, the estimated wage gap is smaller. This is probably because women are more likely to occupy part-time positions (which have lower wages) and less likely to occupy the most senior positions. Model 4 adds characteristics of the firm, i.e., firm size and industry. Larger firms tend to have higher wages. For industry, the reference category is public administration, education and health. Based on our descriptive statistics, we know that almost half of the women in the sample work in this sector. Note that most industries have higher wages than this sector, as indicated by the positive coefficients. For instance, wages in the construction sector (dominated by men) are around 19% higher than in public administration. Accordingly, the magnitude of the wage gap estimate is further reduced to around 11%. This model explains around half of the variation in wages.
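The way an added control can shrink the estimated gap can be illustrated with simulated data. The sketch below is not a reanalysis of the survey: the variable names, the effect sizes and the part-time shares are all invented for illustration. It simply builds in a true gap of 0.10 log points plus a part-time penalty, with women more likely to work part-time.

```r
set.seed(4)
n <- 2000

# Simulated workers: women are more likely to be part-time (assumed shares).
female    <- rbinom(n, 1, 0.5)
part_time <- rbinom(n, 1, ifelse(female == 1, 0.4, 0.1))

# True model: a direct gap of -0.10 plus a part-time penalty of -0.20.
log_wage <- 2.7 - 0.10 * female - 0.20 * part_time + rnorm(n, sd = 0.4)
sim <- data.frame(log_wage, female, part_time)

# Nested models: the second adds the job characteristic to the first.
m1 <- lm(log_wage ~ female, data = sim)
m2 <- update(m1, . ~ . + part_time)

round(c(model_1 = coef(m1)[["female"]], model_2 = coef(m2)[["female"]]), 3)
```

In the first model the female coefficient absorbs part of the part-time penalty, so it is larger in magnitude than in the second model, which mirrors the pattern seen between Models 2 and 3.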

In Model 5 we control for the region in which the respondent lives. The reference category here is London. As expected, given the centralised nature of the UK economy, wages are on average lower in every other region. In this most comprehensive model, the wage gap is estimated at 11%.
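Because the dependent variable is in logs, reading a coefficient directly as a percentage is an approximation. As a short worked example, the exact percentage difference implied by a coefficient b is 100 × (exp(b) − 1); the values below are the Female coefficients from Table 3.

```r
# Female coefficients from Models 1 to 5 in Table 3 (log points).
b_female <- c(m1 = -0.20, m2 = -0.22, m3 = -0.14, m4 = -0.11, m5 = -0.11)

# Exact percentage difference implied by a coefficient on a log outcome.
pct_gap <- 100 * (exp(b_female) - 1)

round(pct_gap, 1)
```

For coefficients of this size the two readings are close, so describing -0.11 as a gap of roughly 11% is a reasonable approximation, although the exact figure is slightly smaller in magnitude.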

Before reaching final conclusions, it is necessary to test the robustness of our model. The model assumes that the independent variables can be considered exogenous. In this case, it seems reasonable to assume this. We should also ensure that our residuals are independent. The only structure in the data likely to result in a violation of this assumption is that there is some degree of nesting within households. However, there are many households and most households are likely to consist of only one or two adults. Any departure from the assumption of independence is likely to be minor.

We should also avoid serious multicollinearity. While our model does not display signs of multicollinearity (e.g., an unexpected lack of statistical significance or incorrectly signed variables), we can examine variance inflation factors.

##               GVIF Df GVIF^(1/(2*Df))
## SEX       1.322360  1        1.149939
## AGE       1.232729  1        1.110283
## married   1.152837  1        1.073702
## HIQUL15D  1.649343  5        1.051311
## FLEXW7    1.076102  1        1.037353
## FTPT      1.358746  1        1.165653
## JOBTYP    1.069649  1        1.034239
## NSECMJ10  2.197757  6        1.067821
## firm_size 1.188808  5        1.017446
## INDE07M   1.716052  8        1.034328
## GOR9D     1.127264 11        1.005460

All variance inflation factors are low, so we can conclude that multicollinearity is not a problem in our model. We can examine the residuals versus fitted plot to find out more about our model.
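The final column of the output adjusts each generalised VIF for the number of coefficients (Df) that the term contributes, which makes multi-category factors comparable with single-coefficient variables. As a brief check of the arithmetic, the adjustment can be recomputed from the first two columns for a few of the terms.

```r
# GVIF and degrees of freedom copied from the output above.
gvif <- c(SEX = 1.322360, HIQUL15D = 1.649343, NSECMJ10 = 2.197757, GOR9D = 1.127264)
df   <- c(SEX = 1,        HIQUL15D = 5,        NSECMJ10 = 6,        GOR9D = 11)

# Adjusted value: GVIF^(1 / (2 * Df)); for Df = 1 this is the square root of the VIF.
adj <- gvif^(1 / (2 * df))

round(adj, 6)
```

Because the adjusted values are on a square-root scale, one common convention is to square them before comparing against the usual VIF rules of thumb; here they remain small either way.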

In the plot, we can see a slight non-linear pattern. The model tends to underestimate wages at the lowest and highest ends of the scale. This seems fairly minor. We can test whether this is statistically significant using a RESET test.

##
##  RESET test
##
## data:  reg5
## RESET = 67.247, df1 = 2, df2 = 9335, p-value < 2.2e-16

With a p-value well below the level of significance (0.05), we can reject the null hypothesis of a correctly specified model. While this is potentially problematic, the scale of the nonlinearity looks relatively small based on the plot.

The residuals versus fitted plot does not show classic signs of heteroscedasticity (e.g., a funnel shape) but there is perhaps some indication that the variance may be lower over some portion of the distribution. We can test for statistical significance using a variety of tests. Here, we present the Breusch-Pagan test.
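A RESET test of this form can be reproduced by hand: the model is refitted with powers of its own fitted values added, and those extra terms are tested jointly with an F-test. The sketch below uses simulated data with a deliberately omitted quadratic term, and assumes that the second and third powers are used, which is consistent with df1 = 2 in the output above. It is an illustration of the mechanics rather than the code used for Model 5.

```r
set.seed(1)
n <- 500

# Simulated data in which the true relationship is quadratic.
x <- runif(n, 0, 4)
y <- 1 + 0.5 * x + 0.3 * x^2 + rnorm(n, sd = 0.5)

# Misspecified linear model and its fitted values.
base_fit <- lm(y ~ x)
yhat     <- fitted(base_fit)

# Augmented model: add the squared and cubed fitted values.
aug_fit <- lm(y ~ x + I(yhat^2) + I(yhat^3))

# Joint F-test of the added terms; a small p-value signals misspecification.
reset <- anova(base_fit, aug_fit)
c(F = reset$F[2], df1 = reset$Df[2], p = reset$`Pr(>F)`[2])
```

With a large sample, such as the one used here, the test can detect even modest departures from linearity, which is one reason to judge the practical size of the problem from the plot as well as from the p-value.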

##
##  studentized Breusch-Pagan test
##
## data:  reg5
## BP = 340.84, df = 41, p-value < 2.2e-16

The test returns a statistically significant result, meaning we can reject the null hypothesis of homoscedasticity and conclude that the model exhibits heteroscedasticity. It seems reasonable to suppose that there may be more wage variation at the higher end of the distribution. For instance, some highly educated workers may choose to work in high-paid or low-paid jobs (which may not require their education). However, a worker with no qualifications is unlikely to have the same access to high-paying jobs and may therefore earn less. The potential for variation would be higher for the highly-educated workers. We can therefore likely conclude that the heteroscedasticity in our model is pure heteroscedasticity. This means that the coefficient estimates remain unbiased and consistent. However, the standard errors are not correct and therefore our ability to conduct hypothesis testing is compromised. This can be addressed by using heteroscedasticity robust standard errors.

The model also assumes that the residuals follow a normal distribution. There are several ways to determine whether this assumption is met. One popular way is to use a Q-Q plot as presented below.
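The studentized form of the Breusch-Pagan statistic can be computed by hand as the sample size multiplied by the R-squared from an auxiliary regression of the squared residuals on the regressors. It is compared with a chi-squared distribution whose degrees of freedom equal the number of regressors (41 in the output above). The sketch below illustrates this on simulated data with a single regressor whose error spread is made to grow with x; the data and names are assumptions for illustration only.

```r
set.seed(2)
n <- 1000

# Simulated data where the error standard deviation increases with x.
x <- runif(n, 1, 5)
y <- 2 + 0.5 * x + rnorm(n, sd = 0.3 * x)

fit <- lm(y ~ x)

# Auxiliary regression: squared residuals on the original regressors.
aux <- lm(resid(fit)^2 ~ x)

# Studentized statistic: n times the auxiliary R-squared.
bp_stat <- n * summary(aux)$r.squared
bp_df   <- length(coef(aux)) - 1   # number of regressors, excluding the intercept
bp_p    <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

c(BP = bp_stat, df = bp_df, p = bp_p)
```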

[Figure: Q-Q plot of the model residuals]

If the residuals follow a normal distribution then the points should sit on the 45 degree line. In this case, there is a departure particularly at the upper end of the distribution. This is not unexpected given that the hourly wage rate is highly skewed. We have already used a logarithmic transformation to bring it closer to a normal distribution but, as shown in the density plots presented earlier, there is still a skewed element. We have not included many variables in our model which may help explain why some people have extremely high wages. Without additional variables, there is not much we can do. However, the OLS model is relatively robust to departures from normality so we can likely still rely on our hypothesis testing.

We therefore select Model 5 as our final specification, but present it with heteroscedasticity robust standard errors as Model 6.

Table 4: Model 5 with heteroscedasticity robust standard errors. Dependent Variable: Log Hourly Wage (£)

| | Model 6 |
| --- | --- |
| Intercept | 2.82*** (0.04) |
| Female | -0.11*** (0.01) |
| Age (Years) | 0.01*** (0.00) |
| Married | 0.08*** (0.01) |
| Education - Ref: Degree | |
| Some higher education | -0.16*** (0.01) |
| A level | -0.18*** (0.01) |
| GCSE | -0.21*** (0.01) |
| Other | -0.24*** (0.02) |
| No qualifications | -0.30*** (0.02) |
| Contract type | |
| Zero hours contract | 0.04 (0.04) |
| Part-time | -0.04*** (0.01) |
| Non-permanent job | -0.07*** (0.02) |
| Seniority - Ref: Higher managerial and professional | |
| Lower managerial | -0.25*** (0.01) |
| Intermediate | -0.49*** (0.02) |
| Lower supervisory | -0.47*** (0.02) |
| Semi-routine | -0.62*** (0.02) |
| Routine | -0.65*** (0.02) |
| Never worked | -0.47*** (0.03) |
| Firm size (employees) - Ref: <25 | |
| 25-49 | 0.06*** (0.01) |
| 50-249 | 0.09*** (0.01) |
| 250-499 | 0.13*** (0.02) |
| Between 50-499 | 0.12*** (0.02) |
| 500+ | 0.19*** (0.01) |
| Industry - Ref: Public Administration, Education & Health | |
| Agriculture | 0.03 (0.05) |
| Energy & Water | 0.17*** (0.03) |
| Manufacturing | 0.09*** (0.01) |
| Construction | 0.18*** (0.02) |
| Distribution & hospitality | -0.04** (0.01) |
| Transport | 0.12*** (0.02) |
| Banking | 0.13*** (0.01) |
| Other services | 0.01 (0.02) |
| Region - Ref: London | |
| Yorkshire and The Humber | -0.19*** (0.02) |
| South West | -0.17*** (0.02) |
| West Midlands | -0.19*** (0.02) |
| South East | -0.10*** (0.02) |
| East of England | -0.12*** (0.02) |
| East Midlands | -0.20*** (0.02) |
| North East | -0.20*** (0.02) |
| North West | -0.19*** (0.02) |
| Wales | -0.19*** (0.02) |
| Northern Ireland | -0.19*** (0.02) |
| Scotland | -0.12*** (0.02) |

Robust standard errors in parentheses. ***p < 0.001; **p < 0.01; *p < 0.05

There are no major changes in statistical significance as a result of using the new standard errors. As expected, the sex variable remains highly statistically significant in the model.
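Heteroscedasticity robust standard errors replace the usual variance formula with a "sandwich" estimator that uses each observation's own squared residual. The sketch below computes the simplest variant (often labelled HC0) by hand on simulated data. The paper does not state which variant was used for Model 6, so the choice of HC0 and all names here are assumptions for illustration.

```r
set.seed(3)
n <- 1000

# Simulated data with error spread that grows with x.
x <- runif(n, 1, 5)
y <- 2 + 0.5 * x + rnorm(n, sd = 0.3 * x)
fit <- lm(y ~ x)

X <- model.matrix(fit)   # design matrix, including the intercept column
e <- resid(fit)          # OLS residuals

# Sandwich estimator: (X'X)^-1 [X' diag(e^2) X] (X'X)^-1
bread    <- solve(crossprod(X))
meat     <- crossprod(X * e)   # multiplies row i of X by e[i], then forms X'X
vcov_hc0 <- bread %*% meat %*% bread

se_classic <- sqrt(diag(vcov(fit)))
se_robust  <- sqrt(diag(vcov_hc0))

cbind(classic = se_classic, robust = se_robust)
```

The coefficient estimates are unchanged by this procedure; only the standard errors, and therefore the test statistics and p-values, differ.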

Discussion and conclusion

This paper set out to test whether there was still a gender pay gap in the UK in 2019. It also aimed to measure the magnitude of this gap when some important differences between workers and their jobs were controlled for. We used data from the UK’s Quarterly Labour Force Survey for April-June 2019. We find that there is indeed still a gap between the hourly wages of men and women. Without accounting for differences between the workers and jobs, the gap was estimated to be around 20%. In our preferred model specification (controlling for differences in people, job type, industry, and region) the gap shrinks to around 11%. The gap therefore remains substantial.

Our model does not provide an explanation for why the gap exists. One hypothesis is that women are discriminated against when it comes to setting wages. We showed that when controlling for differences in jobs, the magnitude of the gap shrinks. However, we may debate whether the choice of industry is really a free choice. For instance, wages in the male-dominated construction industry tend to be higher than in the female-dominated public administration, health and education sector. One interpretation of the results might be to say that women moving into the construction industry could improve their wages. However, we might question whether women face barriers to moving into the construction industry. Further research would be required to explore the underlying mechanisms and causes of the observed disparities.

Our model has some additional limitations. There were some variables which have been used in previous literature which were unavailable in our dataset. For instance, it has been suggested that part of the reason for a gender wage gap may be related to career breaks. We were also unable to include variables describing differences in experience. Including these variables may have had an impact on the estimated magnitude of the gap. Nevertheless, we have controlled for several important factors and demonstrated that the existence of a gap is robust to the inclusion of a large number of control variables.

Word count

| Method | koRpus | stringi |
| --- | --- | --- |
| Word count | 2305 | 2272 |
| Character count | 14022 | 14021 |
| Sentence count | 170 | Not available |
| Reading time | 11.5 minutes | 11.4 minutes |

Reflective statement

I believe this work should be awarded around an A3. In-depth evidence is presented for attainment of all of the course’s learning outcomes and everything asked for in the assignment brief has been covered. A research question is posed and appropriate data is used to address it. The data is cleaned and processed before being described and visualised. A series of regression models are then presented, discussed and tested. In the end, a relatively robust model is produced and the research question is answered. I enjoyed the process of writing the assignment, though it was time consuming and not without challenges.

References

Blackaby, D., Booth, A. L., & Frank, J. (2005). Outside offers and the gender pay gap: Empirical evidence from the UK academic labour market. The Economic Journal, 115(501), F81-F107.

Olsen, W. K., & Walby, S. (2004). Modelling gender pay gaps. Working Paper Series No. 17, Equal Opportunities Commission.
