To what extent do technology, tourism, pharmaceutical, and energy stocks differ in terms of their price volatility and recovery dynamics during and after the COVID-19 pandemic?
Stock Market Data
Source: Yahoo Finance –> accessed via the quantmod package in R
Type of data: Daily Stock Prices from January 2019 until December 2023
Stocks used:
Apple Inc. (AAPL) – Technology
BioNTech SE (BNTX) – Pharmaceutical
Deutsche Lufthansa AG (LHA.DE) – Tourism/Aviation
Shell plc (SHEL) – Energy
Restricted to closing prices, which were extracted from each series
Combining all stocks' closing prices in one table –> difficult because the series differ in their number of rows
Solution: instead of combining the vectors directly, all time series were first aligned by date using merge() on xts objects. Missing trading days appear as NA values.
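A minimal sketch of the date alignment described above, assuming the xts package is installed. The prices and dates are made up for illustration; only merge() on xts objects is taken from our procedure.

```r
library(xts)

# Made-up closing prices on two slightly different trading calendars.
# In the real pipeline the series would come from quantmod, e.g.
# Cl(getSymbols("AAPL", src = "yahoo", auto.assign = FALSE)).
aapl <- xts(cbind(AAPL = c(39.5, 35.5, 37.1)),
            order.by = as.Date(c("2019-01-02", "2019-01-03", "2019-01-04")))
lha  <- xts(cbind(LHA.DE = c(19.8, 20.1, 20.4)),
            order.by = as.Date(c("2019-01-02", "2019-01-04", "2019-01-07")))

# Outer join on the date index: a date missing in one series becomes NA there
closes <- merge(aapl, lha, all = TRUE)
print(closes)
```

The merged table has one row per date that occurs in either series, so the differing row counts no longer matter.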
Data type: Daily confirmed COVID-19 cases; continuously updated and widely used in academic research and policy analysis
Changing from wide format –> long format, so every row corresponds to one region on one date
Provincial –> country aggregation
Cumulative –> daily values
Global aggregation
Final COVID Dataset: Daily global confirmed cases, daily global deaths & time index
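The reshaping steps above can be sketched in base R as follows. The toy table only imitates the wide layout of the source file (ID columns followed by one column per date); the counts, region names, and variable names are made up. The same steps would apply to the deaths file.

```r
# Toy table in the wide layout of the source file (counts are made up)
wide <- data.frame(
  `Province/State` = c("A1", "A2", NA),
  `Country/Region` = c("A", "A", "B"),
  Lat = 0, Long = 0,
  `1/22/20` = c(1, 0, 2),
  `1/23/20` = c(3, 1, 2),
  `1/24/20` = c(6, 1, 5),
  check.names = FALSE
)

id_cols   <- c("Province/State", "Country/Region", "Lat", "Long")
date_cols <- setdiff(names(wide), id_cols)

# Wide -> long: one row per region and date
long <- data.frame(
  country    = rep(wide[["Country/Region"]], times = length(date_cols)),
  date       = rep(as.Date(date_cols, format = "%m/%d/%y"), each = nrow(wide)),
  cumulative = unlist(wide[date_cols], use.names = FALSE)
)

# Provincial -> country aggregation
by_country <- aggregate(cumulative ~ country + date, data = long, FUN = sum)
by_country <- by_country[order(by_country$country, by_country$date), ]

# Cumulative -> daily values, computed separately within each country
by_country$daily <- ave(by_country$cumulative, by_country$country,
                        FUN = function(x) c(x[1], diff(x)))

# Global aggregation: sum of daily values over all countries
global <- aggregate(daily ~ date, data = by_country, FUN = sum)
print(global)
```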
The time series plots can be used to identify initial structural breaks, which become particularly important later in the analysis.
ADF test: H0: time series is not stationary (unit root), H1: time series is stationary
Stationary if the tau value is smaller than the critical value (significance level α = 5%)
Value of test-statistic is: -1.2465 2.6347
| Test Statistic | 1% | 5% | 10% |
|---|---|---|---|
| τ₂ | -3.43 | -2.86 | -2.57 |
| φ₁ | 6.43 | 4.59 | 3.78 |
-1.2465 > -2.86 (τ₂ at the 5% level)
–> H0 cannot be rejected
–> Not stationary
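The decision rule can be written out as a short sketch. The first part uses only the reported statistic and the 5% critical value. The second part assumes the urca package and its ur.df() function (the output format above resembles it, but that is an assumption) and runs on a simulated random walk, not on our stock data.

```r
library(urca)

# Decision rule applied to the reported statistic
tau2_reported   <- -1.2465
crit_5pct       <- -2.86
reject_reported <- tau2_reported < crit_5pct   # FALSE: unit root not rejected

# Same rule on a simulated random walk (illustrative data only)
set.seed(1)
y   <- cumsum(rnorm(500))
adf <- ur.df(y, type = "drift", lags = 1)      # lag choice is an assumption

tau2      <- adf@teststat[1, "tau2"]
crit      <- adf@cval["tau2", "5pct"]
reject_h0 <- unname(tau2 < crit)               # TRUE would indicate stationarity

print(adf@cval)
```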
First differences: make the time series stationary
remove trend
changes in prices with stable variance around the mean
–> provides reliable basis for future calculations
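First differencing replaces each price by its change from the previous day. A small base R sketch with made-up prices:

```r
# Illustrative prices (made up), not the project's data
prices   <- c(100, 102, 101, 105, 107, 106)

# First differences: p_t - p_(t-1); the result is one observation shorter
d_prices <- diff(prices)
print(d_prices)

# The original levels can be rebuilt from the first value and the differences
rebuilt <- cumsum(c(prices[1], d_prices))
```

The differenced series would then be tested again with the ADF test before being used further.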
Our Objective: To generate a forecast for each stock indicating the range within which it would have traded in the absence of the pandemic
We chose an ARIMA model, guided by https://otexts.com/fpp3/arima-reading.html
Time plot was not continuous: missing dates had to be cleaned
Missing days: NA dates –> forward filling
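A sketch of the gap handling in base R, with made-up prices: build a complete daily calendar, join the observed closes onto it, and carry the last known value forward. The helper ffill() is hypothetical and written here for illustration; zoo::na.locf() offers the same behaviour if the zoo package is available.

```r
# Hypothetical helper: carry the last non-missing value forward
ffill <- function(x) {
  idx <- cumsum(!is.na(x))   # index of the last observed value so far
  idx[idx == 0] <- NA        # leading NAs have nothing to carry forward
  x[!is.na(x)][idx]
}

# Made-up closes with a weekend gap (Friday, Monday, Tuesday)
obs <- data.frame(
  date  = as.Date(c("2019-01-04", "2019-01-07", "2019-01-08")),
  close = c(10, 11, 12)
)

# Complete daily calendar; dates without a trade get NA after the join
full   <- data.frame(date = seq(min(obs$date), max(obs$date), by = "day"))
filled <- merge(full, obs, by = "date", all.x = TRUE)

# NA dates -> forward filling
filled$close <- ffill(filled$close)
print(filled)
```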
Dark blue: 80% forecast interval
Light blue: 95% forecast interval
–> Range where values are expected
–> Limitation: small data set
–> rough overview only, subject to these limitations
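How the 80% and 95% forecast intervals arise can be sketched with base R's arima() and predict(). This is a stand-in under stated assumptions, not our actual model: the series is simulated, the order (0,1,1) is picked for illustration only, and the intervals assume normally distributed forecast errors.

```r
set.seed(42)

# Simulated "pre-pandemic" daily closes (random walk with drift), not real data
pre <- 100 + cumsum(rnorm(250, mean = 0.05, sd = 1))

# Fit on the pre-pandemic window only; the order is an illustrative assumption
fit <- arima(pre, order = c(0, 1, 1))

h  <- 60                               # forecast horizon in trading days
fc <- predict(fit, n.ahead = h)        # point forecasts and standard errors

z80 <- qnorm(0.90)                     # two-sided 80% interval
z95 <- qnorm(0.975)                    # two-sided 95% interval

bands <- data.frame(
  step = 1:h,
  mean = as.numeric(fc$pred),
  lo80 = as.numeric(fc$pred - z80 * fc$se),
  hi80 = as.numeric(fc$pred + z80 * fc$se),
  lo95 = as.numeric(fc$pred - z95 * fc$se),
  hi95 = as.numeric(fc$pred + z95 * fc$se)
)
print(head(bands))
```

The 95% band always contains the 80% band, and both widen with the horizon, which is why a short training window gives only a rough range.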
Stock data set needed cleaning because the rows were misaligned
Stocks are quoted in different currencies –> not as important at the moment, but could become a problem later in the analysis
COVID data was not optimal in its original form either –> the data had to be broken down
Generally filtering the enormous amount of data according to our “needs”
Only considering stock prices in relation to confirmed Covid cases, also due to the limited time frame
Should we use the stationary time series for the further procedure?
Rejected the idea of the first question and settled on the current topic.
Found, extracted, and cleaned both data sets
Initial thought: divide the stock data set into quarterly segments
–> Rejected because it would have simplified/misrepresented our results
Summarized and compared the data collection in a compatible table
Visualization of both data sets
Ran stationarity tests
Initial creation of forecasts based on the ARIMA model
Test for long-run relationships (Cointegration)
Analyze short-run dynamics (Granger causality test)
Resulting in: possible sector-specific COVID–market linkages
Robustness checks with alternative specifications (possibly a monthly forecast)
Stock price data obtained from Yahoo Finance via the quantmod package
Apple (AAPL): https://finance.yahoo.com/quote/AAPL
BioNTech (BNTX): https://finance.yahoo.com/quote/BNTX
Lufthansa (LHA.DE): https://finance.yahoo.com/quote/LHA.DE
Shell (SHEL): https://finance.yahoo.com/quote/SHEL
https://github.com/CSSEGISandData/COVID-19
Path of Confirmed cases (global): csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.cs
Path of Deaths (global): csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv