Intro to using R and DBnomics

Political economy of finance in Europe: Week 3

Benjamin Braun

Why do we do this?

Agenda

  1. Why do we do this?

  2. Current account balance

  3. International investment position

  4. Bilateral international investment position

Why R?

  1. Much more efficient than Excel and much, much more fun

  2. Python just as useful (but I can only teach you R)

  3. Possibilities are endless: For example, all slides for this course are written using R and Quarto

How to get started in R

  1. Install R from here

  2. Install RStudio from here

  3. Open RStudio and open a new R script (the top left or File -> New File -> R script

  4. Type or copy & paste code from the seminar slides

  5. Execute code chunks by pressing ‘control/command + enter’

How to learn R

  1. Start with R for Data Science (2e), by Hadley Wickham, father of the tidyverse.

  2. Datacamp or similar for interactive learning.

  3. Stack Overflow has answers to almost all your questions

  4. AI models got trained on all of the above and are very good at answering coding questions.

DBnomics

Install and load packages

Install packages (you need to do this only once; code can be deleted or commented out afterwards):

install.packages("tidyverse") 
install.packages("rdbnomics") 
install.packages("janitor") 
install.packages("thematic")

Load the packages (you need to do this every time you open this script):

library(tidyverse) #https://www.tidyverse.org/
library(rdbnomics) #https://git.nomics.world/dbnomics/rdbnomics
library(janitor) #https://github.com/sfirke/janitor
library(thematic) #For styling plots: https://rstudio.github.io/thematic/index.html

Current account balance

Agenda

  1. Why do we do this?

  2. Current account balance

  3. International investment position

  4. Bilateral international investment position

Let’s say we want to download and plot a time series of the UK’s current account balance.

Loading the data

  1. On DBnomics, select provider (IMF) and series (BOP)
  2. Use filters to select the UK and, in the field ‘Indicator’, the relevant balance of payments item. Here, we’ll go with Current Account, Total, Net, US Dollars

Filter selection

Loading the data

  1. Move to {rdbnomics}. We’ll write our download code using the most efficient method, called ‘masking’ (explained in this vignette):
  • Provider: IMF | Series: BOP
  • Frequency: A (annual)
  • Country: GB
  • Indicator: BCA_BP6_USD
bop_imf <- rdb("IMF", "BOP", mask = "A.GB.BCA_BP6_USD")

Standard steps

We’ll be working in the Tidyverse: A beginner-friendly ‘dialect’ of R supported by a collection of packages.

  1. In the Tidyverse, dataframes should be formatted as tibbles. We re-format what {rdbnomics} delivers into a tibble:
bop_imf <- as_tibble(bop_imf)

Standard steps

  1. Inspect variable names: Many are standard across DBnomics datasets. Unfortunately, the names are all over the place:
#Inspect names of some variables
bop_imf |> 
  select(7, 12, 14, 15, 18)
# A tibble: 54 × 5
  Indicator           period     REF_AREA `Reference Area`  value
  <chr>               <date>     <chr>    <chr>             <dbl>
1 Current Account, T… 1970-01-01 GB       United Kingdom    1970.
2 Current Account, T… 1971-01-01 GB       United Kingdom    2717.
3 Current Account, T… 1972-01-01 GB       United Kingdom     533.
4 Current Account, T… 1973-01-01 GB       United Kingdom   -2412.
5 Current Account, T… 1974-01-01 GB       United Kingdom   -7448.
6 Current Account, T… 1975-01-01 GB       United Kingdom   -3465.
# ℹ 48 more rows

Standard steps

  1. clean_names from the {janitor} package let’s us rename variables according to a consistent scheme:
bop_imf <- clean_names(bop_imf)

#Inspect variable names again
bop_imf |> 
  select(7, 12, 14, 15, 18)
# A tibble: 54 × 5
  indicator             period     ref_area reference_area  value
  <chr>                 <date>     <chr>    <chr>           <dbl>
1 Current Account, Tot… 1970-01-01 GB       United Kingdom  1970.
2 Current Account, Tot… 1971-01-01 GB       United Kingdom  2717.
3 Current Account, Tot… 1972-01-01 GB       United Kingdom   533.
4 Current Account, Tot… 1973-01-01 GB       United Kingdom -2412.
5 Current Account, Tot… 1974-01-01 GB       United Kingdom -7448.
6 Current Account, Tot… 1975-01-01 GB       United Kingdom -3465.
# ℹ 48 more rows

Standard steps

  1. We use {ggplot2} for plotting. Years go on the x-axis, values on the y-axis. The plot is still empty because we haven’t yet told ggplot what geom to use to populate it.
ggplot(
  data = bop_imf,
  aes(x = period, 
      y = value)
  )

Standard steps

  1. For a bar chart thats geom_col. And let’s give the plot a title.
ggplot(
  data = bop_imf,
  aes(x = period, 
      y = value)
  ) +
  geom_col() +
  labs(title = 'CA balance UK')

Now that’s nice, but this is a class about finance, not trade. So what about the UK’s international investment position?

International investment position:

  • Shows an economy’s stock of external financial assets and liabilities at a particular point

  • This stock is the result of past external transactions measured according to current market values

  • Components:

    • Credit assets (and debt liabilities)
    • Equity assets and liabilities
    • Derivatives
    • Monetary gold and special drawing rights

International investment position

Agenda

  1. Why do we do this?

  2. Current account balance

  3. International investment position

  4. Bilateral international investment position

Loading the data: IIP

#Load data using mask method
bop_imf_raw <- rdb("IMF", "BOP", mask = "A.DE+BR+CN+GB+IN+MX.IADF_BP6_USD+IAD_BP6_USD+IAO_BP6_USD+IAPD_BP6_USD+IAPE_BP6_USD+IAR_BP6_USD+ILD_BP6_USD+ILF_BP6_USD+ILO_BP6_USD+ILPD_BP6_USD+ILPE_BP6_USD+ILRRL_NRES_S_BP6_USD")

#Convert data frame to tibble format
bop_imf <- as_tibble(bop_imf_raw)

#Remove rows with missing values
bop_imf <- bop_imf |> 
  filter(!is.na(value))

#Clean variable names via 'janitor' package
bop_imf <- clean_names(bop_imf)

#Change confusing country variable name
bop_imf <- bop_imf |> 
  rename(country = "reference_area")

To produce the plot on the next slide, we need corresponding assets and liabilities to have identical character values. So let’s create a variable bop_items that fulfils this requirement:

bop_imf <- bop_imf |> 
  mutate(bop_item = case_when(
    str_detect(indicator, "Derivatives") ~ "Derivatives",
    str_detect(indicator, "Direct Investment") ~ "Direct investment",
    str_detect(indicator, "Other Investment") ~ "Other investment",
    str_detect(indicator, "Debt Securities") ~ "PF: Debt securities",
    str_detect(indicator, "Direct Investment") ~ "Direct investment",
    str_detect(indicator, "Equity") ~ "PF: Equity and fund shares",
    str_detect(indicator, "Reserve Assets") ~ "Reserve assets"))

Since bop_items no longer contains any information about whether we’re looking at an asset or a liability, let’s add a separate asset/liability label variable:

bop_imf <- bop_imf |>
  mutate(label = case_when(
    str_detect(indicator, "Assets") ~ "A",
    str_detect(indicator, "Liabilities") ~ "L"))

It turns out the indicator column contained two variables—a no-go for tidy data. What we’ve done: We’ve separated the names of the financial instruments from the asset/liability label.

Now we can give a negative sign to all liability values:

bop_imf <- bop_imf |> 
  mutate(value = if_else(label == "L", -1*value, value)) 

Plot the UK’s IIP

bop_imf |> 
  filter(country == "United Kingdom") |> 
  ggplot(aes(x = period, 
             y = value/1000000, 
             fill = bop_item)) +
  geom_col() +
  geom_hline(yintercept=0) +
  scale_fill_manual(values=okabe_ito()) +
  labs(y = "USD trillion", 
       x = NULL,
       fill = "BoP item")

Three imbalanced growth models

p <- bop_imf |> 
  filter(ref_area %in% c("BR", "CN", "DE")) |> 
  filter(period > "1999-01-01") |>
  filter(!is.na(bop_item)) |> 
  ggplot(aes(x = period, y = value/1000000, fill = bop_item)) +
  geom_col() +
  geom_hline(yintercept=0) +
  scale_fill_manual(values=okabe_ito()) +
  labs(y = "USD trillion", 
       x = NULL,
       fill = "BoP item") +
  facet_wrap(vars(country), scales = "free_y") +
  theme(legend.position = "bottom")

Three imbalanced growth models

Now that’s nice, but I want to know about my country’s bilateral IIP.

Bilateral international investment position

Agenda

  1. Why do we do this?

  2. Current account balance

  3. International investment position

  4. Bilateral international investment position

Sourcing bilateral IIP data

Bilateral IIP data is not available from the IMF. For European countries, we can go to Eurostat, which luckily is also available via DBnomics.

I started selecting the same BOP items as above for Italy here.

Loading the data

#Eurostat BoP data via rdbnomics package: Assets, liabilities, net-position
#To break down 'Other investment', use FA__O__F2+FA__O__F4+FA__O__F519+FA__O__F81
bop_bil_raw <- rdb("Eurostat", "bop_iip6_q", mask = "Q.MIO_EUR.FA__D__F+FA__F__F7+FA__O__F+FA__P__F3+FA__P__F5.S1.S1.A_LE+L_LE+N_LE.AT+BE+HR+CY+EE+FI+FR+GR+IT+LV+LT+MT+PT+SI+SK+ES+IE+LU+NL+CH+OFFSHO+CN_X_HK+UK+HK+JP+US+IN+RU+BR+WRL_REST.IT")

#Remove missing values
bop_bil <- bop_bil_raw[!is.na(value)]

#Transform data frame into tibble
bop_bil <- as_tibble(bop_bil)

#Clean variable names
bop_bil <- clean_names(bop_bil)

#Convert values to € million to € billion 
bop_bil <- bop_bil |> 
  mutate(value = value/1000)

Renaming

#Rename BOP items for plotting
bop_bil <- bop_bil |> 
  mutate(bop_item = case_when(
    str_detect(bop_item_2, "direct investment") ~ "Direct investment",
    str_detect(bop_item_2, "other investment") ~ "Other",
    str_detect(bop_item_2, "debt securities") ~ "PF: Debt securities",
    str_detect(bop_item_2, "equity") ~ "PF: Equity and fund shares",
    str_detect(bop_item_2, "derivatives") ~ "Derivatives"))
#Rename BOP items for plotting
bop_bil <- bop_bil |> 
  rename("country" = geopolitical_entity_reporting,
         "country_partner" = geopolitical_entity_partner)

Further processing

#Select relevant variables
bop_bil <- bop_bil |> 
  select(geo, country, country_partner, period, stk_flow, bop_item, value)

#Rename partner countries
bop_bil <- bop_bil |> 
  pivot_wider(names_from = country_partner, 
              values_from = value) |> 
  rename("China" = "China except Hong Kong") |> 
  pivot_longer("Austria":"Rest of the world", 
               names_to = "country_partner", 
               values_to = "value")

#Add negative sign to liabilities 
bop_bil <- bop_bil |> 
  mutate(value = if_else(stk_flow == "L_LE", -1*value, value)) 

Plot assets and liabilities for Italy

p <- bop_bil |> 
  filter(stk_flow != "N_LE") |> 
  filter(!country_partner %in% c(
    "Rest of the world", "Cyprus", "Malta", "Estonia",
    "Slovakia", "Slovenia", "Latvia", "Lithuania")) |>
  ggplot(aes(x = period, y = value, fill = bop_item)) +
  geom_col(position = "stack") +
  scale_fill_manual(values= okabe_ito()) +
  labs(y = "EUR billion", 
       x = NULL,
       fill = "BoP item") +
  facet_wrap(vars(country_partner), 
             scales = "free", nrow = 4) +
  theme(legend.position = "bottom")

Aggregate countries into groups

bop_bil <- bop_bil |> 
  mutate("country_group" = case_when(
    country_partner %in% c("Brazil", "China", "India", "Russia") ~ "BRICs",
    country_partner %in% c("Hong Kong", "Offshore financial centers", "Switzerland") ~ "Financial centers - non-EU",
    country_partner %in% c("Ireland", "Luxembourg", "Netherlands", "Malta", "Cyprus") ~ "Euro area - Financial centers",
    country_partner %in% c("Austria", "Belgium", "Croatia", "Cyprus", "Estonia", 
                                           "Finland", "France", "Greece", "Italy", "Latvia", "Lithuania",
                                           "Portugal", "Slovenia", "Slovakia", "Spain") ~ "Euro area - rest",
    .default = country_partner
  ))

Plot assets and liabilities for Italy

p <- bop_bil |> 
  filter(stk_flow != "N_LE") |> 
  filter(period >= "2008-01-01") |> 
  filter(!country_group %in% c("Rest of the world", "Japan")) |> 
  group_by(period, country_group, bop_item, stk_flow) |> 
  summarise("value" = sum(value, na.rm = TRUE)) |> 
  ggplot(aes(x = period, y = value/1000, fill = bop_item)) +
  geom_col(position = "stack") +
  scale_fill_manual(values= okabe_ito()) +
  labs(y = "EUR trillion", 
       x = NULL,
       fill = "BoP item") +
  facet_wrap(vars(country_group), scales = "free_x") +
  theme(legend.position = "bottom")