Presentation slides

Renaaz
Economicsapplications.pdf

Economic Applications of

Big Data & Predictive Analytics

FINM4100

Analytics in Accounting,

Finance and Economics

Week 9

Lesson Learning Outcomes

1 Define and review ideas around micro- and

macroeconomics

2 Review the concept of correlation

3 Analyse Macroeconomic data

Why Build Models?

“Just because you

have more data

doesn’t mean that

you’re going to make

better decisions.”

Models encapsulate

patterns that exist in

data, helping us make

sense of them.

Christina Zhu, Assistant Professor of Accounting, Wharton School of the University of Pennsylvania

SELTS

• Student feedback is usually done in week 9

• You may be asked to fill in a survey


Software for today

1. Google Colab

• Either

A. watch the teacher demonstrate analytics and accounting in Python OR

B. run the Python scripts yourself in Google Colab

• If you want to run the code provided, make sure you have access (signed in) to Google Colab https://colab.research.google.com

2. Exploratory

A. watch the teacher demonstrate analytics and accounting in Exploratory OR

B. run each step yourself online (access is explained on the next slide)

Dataset

• Data: countries of the world.csv (1970 to 2017)

• Business Problem: How do we determine factors affecting a country's GDP per capita and make a model using the data of many countries?

• We have data from 227 countries and variables (factors) such as GDP, population, literacy, crops (%), birthrate, and others.

• We will explore correlations between each factor and GDP across various countries in Python

• Make charts (try multiple linear regression in Exploratory)


What is Economics?

• Economics is the study of how society allocates scarce

resources to satisfy unlimited wants

• We can consider two branches of economics:

▪ Microeconomics is the study of how single economic

units of society make economic decisions

▪ Macroeconomics is the study of how an aggregated

economy makes economic decisions

What is Economics?

Economics is the study of how society allocates scarce resources to satisfy unlimited wants. It covers:

• Production, distribution and consumption

• Scarcity, choice and decision making

Microeconomics

Focus:

• How individual consumers and companies make decisions

• How they respond to changes in price

• Why different goods have different prices

• How humans may trade in an optimal way

Typical topics in this area are:

• Demand and supply

• Costs of producing goods (production, revenue and costs)

• Market structure, e.g. perfect competition


Macroeconomics

Focus:

The overall economy of a region, e.g. country, using aggregated data

Typical topics in this area are:

• Economic cycles

• Economic growth

• Fiscal and monetary policy

• Unemployment rates

• Gross Domestic Product (GDP) which is a broad measure of a

country’s economic performance


We will be analysing GDP data today

Why is Economic Growth important?

• It is an indicator of a healthy economy

• One theory says increasing GDP leads to more employment in some sectors

• It leads to a better standard of living

• Key components of economic growth are thought to be:

– Natural resources

– Infrastructure

– Population/labour

– Human capital

– Technology

– Law


GDP per capita 2021

How are we doing?

Activity 1: Think – pair – share

Economics

• Watch the video below, which compares micro- and macroeconomics

• https://www.youtube.com/watch?v=nJbWj_kHCJQ

• Form pairs

• Person 1 will explain macroeconomics to person 2, then person 2 will explain microeconomics to person 1

• Report back to class with comments and questions

Review of concepts

• Before analysing today’s data, we need to review the ideas of:

– Covariance and correlation

– correlation heatmaps


Two Measures of Association

▪ Covariance (is there any pattern to the way two variables

move together?)

a. Only concerned with the direction of the relationship

b. No causal effect is implied

c. Is affected by units of measurement

▪ Correlation coefficient which incorporates part of the

covariance formula (how strong is the linear relationship

between two variables?)

Correlation coefficient

Also called Standardised Covariance and is between –1 and 1

• The closer to –1, the stronger the negative linear relationship

• The closer to 1, the stronger the positive linear relationship

• The closer to 0, the weaker the linear relationship


Visualising correlation coefficient

• Method 1: Correlation heatmap
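A correlation heatmap is a coloured picture of a correlation matrix. The sketch below only builds the matrix, using pandas and a few invented values (not taken from countries of the world.csv). The column names are illustrative, and the course script may build its heatmap differently.

```python
import pandas as pd

# Hypothetical values for five imaginary countries (illustration only)
df = pd.DataFrame({
    "GDP": [1000, 2000, 2500, 4000, 5500],
    "Literacy": [50, 60, 70, 80, 90],
    "Birthrate": [45, 38, 30, 20, 12],
})

# Pairwise Pearson correlation coefficients between all numeric columns
corr = df.corr()

# Each cell is the correlation between the row variable and the column variable
print(corr.round(2))
```

A plotting library can then colour each cell by its value. One option is to pass the matrix to `seaborn.heatmap(corr, annot=True)`.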


Visualising correlation coefficient

Method 2: Plots of pairs of variables

[Figure: three scatter plots of Y against X, labelled r = -1.0, r = 0 and r = +0.3]

Formulae for Covariance and Correlation

The correlation coefficient measures the relative strength of the linear relationship between two variables

Sample covariance:

$$\mathrm{COV}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

Correlation coefficient:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where $\bar{x}$ is the mean of the x’s and $\bar{y}$ is the mean of the y’s
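As a rough illustration of the sample covariance and correlation coefficient defined above, the sketch below computes both directly from their definitions. The data values and variable names are invented for illustration and do not come from the course dataset.

```python
from math import sqrt

def sample_covariance(xs, ys):
    """Sum of cross-deviations from the means, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Cross-deviations divided by the square root of the product of squared deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical values (illustration only)
literacy = [50.0, 60.0, 70.0, 80.0, 90.0]
gdp = [1000.0, 2000.0, 2500.0, 4000.0, 5500.0]

cov = sample_covariance(literacy, gdp)
r = correlation(literacy, gdp)

# Same GDP figures expressed in thousands of dollars
gdp_thousands = [g / 1000 for g in gdp]
cov_thousands = sample_covariance(literacy, gdp_thousands)
r_thousands = correlation(literacy, gdp_thousands)

print(cov, cov_thousands)  # covariance changes with the unit of measurement
print(r, r_thousands)      # correlation does not
```

Rescaling GDP changes the covariance by the same factor but leaves r unchanged, which is why the correlation coefficient is the easier of the two measures to compare across variables.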

countries of the world.csv data

• In today’s data some of the variables are obvious while others are not

• It also uses commas instead of dots as the decimal separator (which we will deal with later)

• Variables:

– Agriculture

– Industry

– Service

• These three represent labour force by sector. If Agriculture for Liberia is shown as 0,769, it is really 0.769 and means that 76.9% of the workforce in Liberia works in the agricultural sector. The same applies to Industry and Service.

• Climate measure is a classification between 1 (drier) and 4 (milder)
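The decimal commas can also be handled in Python when the file is read. The sketch below assumes a comma-delimited file in which decimal-comma values are quoted. The two-line sample is invented to show the idea: only the Liberia Agriculture value comes from the slide, and the Industry and Service values are hypothetical.

```python
import csv
import io

# Hypothetical sample in the assumed layout; in practice, open the real file instead
sample = io.StringIO(
    'Country,Agriculture,Industry,Service\n'
    'Liberia,"0,769","0,068","0,163"\n'
)

def to_float(text):
    """Convert a decimal-comma string such as '0,769' to the float 0.769."""
    return float(text.replace(",", "."))

rows = []
for row in csv.DictReader(sample):
    rows.append({
        "Country": row["Country"].strip(),
        "Agriculture": to_float(row["Agriculture"]),
        "Industry": to_float(row["Industry"]),
        "Service": to_float(row["Service"]),
    })

# Share of the workforce in agriculture, as a percentage
agri_pct = rows[0]["Agriculture"] * 100
print(rows[0]["Country"], round(agri_pct, 1))
```

If pandas is used instead, `pandas.read_csv` accepts a `decimal=","` argument that performs the same conversion while loading.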

Activity: Open the script and run it, or watch the demo

• Download the data countries of the world.csv to a directory of your choice

• Open the script below

https://colab.research.google.com/drive/15LsR6QoH858T4e2U4LHFtlzWSLEJrWMG?usp=sharing

• You will be prompted in the second block of code to choose the data file

• Click in the box and find your countries of the world.csv to be uploaded

• Run the rest of the script and analyse the output as it is generated, e.g. correlation heatmap, countries with the highest GDP, etc.

Sample Output

Sample Output

Sample Output

Data Modification

• Make a copy of the data file in your folder

• Open the data in Microsoft Excel

• A dot is normally used as the decimal separator, but this file uses a comma

• Highlight the data columns with commas

• Go to the “Editing” menu

• Click on Find & Select and scroll down to “Replace”

• Replace commas (,) with dots (.) (enter the symbols as below) and click on Replace All

• Save your file

Data Modification

• Create a new column heading in column U called “GDP Low_High”

• Type =IF(I2<3000, 0,1) in cell U2 and press Enter

• Click on the corner of that cell (you should see a cross), hold and drag it down

the column to repeat the formula in rows down to cell U228

• You should see a zero if GDP < $3000 per capita and a one otherwise

• Save your file
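For those following along in Python rather than Excel, the same low/high flag can be sketched as below. The function name and the GDP values are hypothetical. The 3000 threshold and the 0/1 coding follow the Excel formula.

```python
def gdp_low_high(gdp_per_capita, threshold=3000):
    """Return 0 if GDP per capita is below the threshold, otherwise 1."""
    return 0 if gdp_per_capita < threshold else 1

# Hypothetical GDP per capita values in dollars (illustration only)
gdp_values = [500, 2999, 3000, 27000]

flags = [gdp_low_high(g) for g in gdp_values]
print(flags)
```

A value of exactly 3000 is flagged as 1, matching the Excel formula, which only returns 0 when the value is strictly below 3000.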

Exploratory

• Access Exploratory

• Start a new project called GDP analysis

• Use Data Frames + to find and import the modified data file

• Change variable GDP Low_High from numeric to logical before clicking on save

• Select Analytics

• We are going to go through a simple guided Decision tree model then you can experiment and try to interpret your own

• Instructions for the model type and variables are on the next slide

Exploratory analytics model

• Select Decision Tree as the type

• GDP Low_High as the Target variable

• Phones, birthrate and Agriculture as the predictor variables

• Leave sample size as is and run

• You will see a tree which is to be read from the top

• We will start to interpret this (first see next slide)

Simple Decision Tree

• The model makes its own

thresholds if you don’t make

all variables binary

• Positive of each condition is

to the right and negative to

the left

• If you add the percentages

from the bottom of the tree,

they sum at each level, e.g.

• 7% + 4% make up the 11%,

• 11% + 25% make up the

36%
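To give a feel for how a tree can choose its own threshold, the sketch below tries every midpoint between neighbouring values and keeps the one with the lowest weighted Gini impurity. Gini impurity is one common splitting criterion. Whether Exploratory uses exactly this criterion is an assumption, and the phone counts and labels are invented.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 means the group is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_threshold(values, labels):
    """Return (threshold, weighted impurity) for the best single split."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = None
    # Candidate thresholds are midpoints between neighbouring distinct values
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in pairs if x < t]
        right = [y for x, y in pairs if x >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or score < best[1]:
            best = (t, score)
    return best

# Hypothetical data: phones per 1000 persons and the GDP Low_High flag
phones = [10, 30, 50, 60, 90, 200, 400, 600]
gdp_flag = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_threshold(phones, gdp_flag)
print(threshold, impurity)
```

With different predictor variables or different data the best midpoint moves, so the thresholds shown in the tree change whenever the inputs change.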

Simple Decision Tree


• Rule 1: “< 75 phones per 1000 persons”

• Here “no” means “>=75 phones per 1000 persons”

• 64% of the countries have >=75 phones per 1000 persons (dark blue)

• These countries have a 0.92 (92%) chance of having a GDP >=$3000 per capita

• Of the countries with <75 phones per 1000 persons (36%), only 0.15 (15%) have a GDP >=$3000 per capita
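The Rule 1 figures can be combined into an overall share by weighting each branch's probability by the share of countries in that branch. The sketch below uses only the rounded numbers quoted on this slide, so the result is approximate. The variable names are illustrative.

```python
# Figures quoted for Rule 1 (rounded percentages from the tree)
share_many_phones = 0.64   # countries with >=75 phones per 1000 persons
p_high_given_many = 0.92   # chance of GDP >= $3000 per capita in that branch
share_few_phones = 0.36    # countries with <75 phones per 1000 persons
p_high_given_few = 0.15    # chance of GDP >= $3000 per capita in that branch

# The two branches together cover every country
total_share = share_many_phones + share_few_phones

# Weighted average of the branch probabilities
overall_high = (share_many_phones * p_high_given_many
                + share_few_phones * p_high_given_few)

print(round(total_share, 2), round(overall_high, 4))
```

On these rounded figures, roughly 64% of all countries in the data would have a GDP of at least $3000 per capita.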

Simple Decision Tree

• Rule 2: “Agricultural workforce >=20%”

• We split the group with >=75 phones per 1000 persons further, by whether the Agricultural workforce is >=20% or not

• We find that 59% of countries have >=75 phones per 1000 persons and an Agricultural workforce >=20%

• This raises the chance of the country having a GDP >=$3000 per capita to 0.96 (96%), given the two conditions

Simple Decision Tree

• Rule 3: “Birthrate >=29” (thought to be roughly 29 births per 1000 persons)

• 11% of countries have <75 phones and a birth rate <29, both per 1000 persons

• These countries have a 43% chance of having a GDP >=$3000 per capita

• 4% of the countries have <75 phones and a birth rate <29, both per 1000 persons, and an Agricultural workforce <16%. 62% of the countries in this category have a GDP >=$3000 per capita

If you look at the “Importance” menu (green), the order of importance is phones, birth rate, agriculture

Decision Tree Exploration

• Try some different combinations of predictor variables and attempt to interpret the results

• You will find that the thresholds change a lot

• Report back to class as needed


Visualising poverty with satellite data

• If time permits (or in your own time), look at the report at

• https://www.kaggle.com/reubencpereira/visualizing-poverty-w-satellite-data/report

• and interact with the maps on Kaggle

• You may have to sign in