HamiltonFilters.jl is a lightweight implementation of the Hamilton
filter for time series analysis. It is not registered in the Julia
package registry, but can be installed directly from GitHub:
using Pkg
Pkg.add(url="https://github.com/enweg/HamiltonFilters.jl")
To pin the package to a specific version:
Pkg.add(PackageSpec(url="https://github.com/enweg/HamiltonFilters.jl",
rev="v2.0.0"))
Once installed, load the package with:
using HamiltonFilters
The Hamilton filter estimates trend and cyclical components by running the following regression:

$$
y_{t+h} = \beta_0 + \beta_1 y_t + \beta_2 y_{t-1} + \dots + \beta_p y_{t-p+1} + v_{t+h}
$$

You must choose two parameters:
- `h`: forecast horizon
- `p`: number of most recent values at time `t` (including `y_t`)
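To make the roles of `h` and `p` concrete, the following is a minimal sketch of how the outcome vector and regressor matrix could be laid out. It assumes the standard Hamilton (2018) form, in which `y[t + h]` is regressed on a constant and `y[t], y[t-1], ..., y[t-p+1]`. The helper name `hamilton_design` is hypothetical and not part of the package API.

```julia
# Hypothetical helper (not part of HamiltonFilters.jl): build the outcome
# vector and regressor matrix for horizon h and p most recent values.
function hamilton_design(y::AbstractVector{<:Real}, h::Integer, p::Integer)
    n = length(y)
    ts = p:(n - h)                         # dates t with p values available and y[t + h] observed
    X = ones(Float64, length(ts), p + 1)   # first column is the intercept
    for (row, t) in enumerate(ts), j in 1:p
        X[row, j + 1] = y[t - j + 1]       # columns hold y[t], y[t-1], ..., y[t-p+1]
    end
    outcome = Float64[y[t + h] for t in ts]
    return outcome, X
end

y = collect(1.0:20.0)
outcome, X = hamilton_design(y, 8, 4)
size(X)       # (9, 5)
outcome[1]    # 12.0, i.e. y[p + h]
```

With `h = 8` and `p = 4`, the first usable outcome is `y[12]`, which is why the first `(p-1) + h = 11` observations cannot be filtered.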
Hamilton (2018) recommends:
Frequency | h | p |
---|---|---|
Yearly | 2 | ? |
Quarterly | 8 | 4 |
Monthly | 24 | ? |
In general, he recommends that both `h` and `p` be integer multiples of the number of observations in a year if the original data is seasonal.
If the series is integrated of order `d` or stationary around a `d`-th-order polynomial, choose `p >= d` to obtain a stationary cycle.
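The `h` values in the table all correspond to two years of observations (2 × 1, 2 × 4, 2 × 12). A small sketch of that rule and of the seasonal-multiple check follows; both helper names are hypothetical and only illustrate the arithmetic.

```julia
# Hypothetical helpers, not part of the package.
# Horizon equal to two years of observations; reproduces the h column above.
two_year_horizon(obs_per_year::Integer) = 2 * obs_per_year

# Check that a parameter is an integer multiple of the observations per year.
is_seasonal_multiple(x::Integer, obs_per_year::Integer) = x % obs_per_year == 0

two_year_horizon(1)           # 2  (yearly)
two_year_horizon(4)           # 8  (quarterly)
two_year_horizon(12)          # 24 (monthly)
is_seasonal_multiple(8, 4)    # true
is_seasonal_multiple(6, 4)    # false
```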
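The cyclical component is defined as the regression residuals:

$$
\hat{v}_{t+h} = y_{t+h} - \hat{\beta}_0 - \hat{\beta}_1 y_t - \hat{\beta}_2 y_{t-1} - \dots - \hat{\beta}_p y_{t-p+1}
$$

Here $\hat{\beta}_0, \dots, \hat{\beta}_p$ are the least-squares estimates from regressing $y_{t+h}$ on a constant and $y_t, \dots, y_{t-p+1}$.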
The trend is the difference between the observed data and the cycle.
Only observations starting from index `p + h` can be used for filtering.
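The definitions above can be sketched in a few lines of plain Julia. This is an illustration under stated assumptions, not the package's implementation: it assumes a vector without missing values, ordinary least squares via `\`, and `NaN` padding. The function name `hamilton_sketch` is hypothetical.

```julia
using Random

# Hypothetical sketch, not the package's implementation: Hamilton filter
# for a vector without missing values, using ordinary least squares.
function hamilton_sketch(y::AbstractVector{<:Real}, h::Integer, p::Integer)
    n = length(y)
    ts = p:(n - h)    # dates t with p values available and y[t + h] observed
    # Regressors: intercept, then y[t], y[t-1], ..., y[t-p+1].
    X = [j == 0 ? 1.0 : Float64(y[t - j + 1]) for t in ts, j in 0:p]
    target = [Float64(y[t + h]) for t in ts]
    beta = X \ target                       # least-squares coefficients
    cycle = fill(NaN, n)
    cycle[(p + h):n] = target .- X * beta   # regression residuals
    trend = y .- cycle                      # observed data minus cycle
    return trend, cycle
end

y = cumsum(randn(MersenneTwister(1), 60))   # simulated random walk
trend, cycle = hamilton_sketch(y, 8, 4)
count(isnan, cycle)   # 11, i.e. (p - 1) + h
```

Because the trend is computed as data minus cycle, `trend + cycle` reproduces the data wherever the filter is defined.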
Load the package:
using HamiltonFilters
Construct a filter instance:
h = 8
p = 4
hfilter = HamiltonFilter(h, p)
Apply the filter to a time series:
trend, cycle = apply(hfilter, data)
Applying the filter always returns a tuple of the form `(trend, cycle)`.
- If `data` is a `Vector{<:Real}`, then `trend` and `cycle` are vectors of the same length as `data`. The first `(p-1) + h` values are filled with `NaN`.
- If `data` is a `Vector{Union{Missing,<:Real}}`, then `trend` and `cycle` are vectors of the same length as `data`. The first `(p-1) + h` values are filled with `missing`.
- If `data` is a `Matrix{<:Real}`, then `trend` and `cycle` are matrices of the same size as `data`. The filter is applied to each column independently, and the first `(p-1) + h` rows are filled with `NaN`.
- If `data` is a `Matrix{Union{Missing,<:Real}}`, then `trend` and `cycle` are matrices of the same size as `data`. The filter is applied to each column independently, and the first `(p-1) + h` rows are filled with `missing`.
- If `data` is a `DataFrame`, then `trend` and `cycle` are DataFrames of the same shape and with the same column names. The filter is applied column-wise, and the first `(p-1) + h` rows are filled with either `NaN` or `missing`, depending on the type of the column.
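The padding rule in the list above depends only on the element type of the input. A minimal sketch of that rule follows; the helper names `padding_value` and `padded_rows` are hypothetical and not taken from the package.

```julia
# Hypothetical helpers, not part of the package.
# Pad with `missing` when the element type allows it, otherwise with NaN.
padding_value(x::AbstractArray) = Missing <: eltype(x) ? missing : NaN

# Number of leading values (or rows) that cannot be filtered.
padded_rows(h::Integer, p::Integer) = (p - 1) + h

padding_value([1.0, 2.0])            # NaN
padding_value([1.0, missing, 3.0])   # missing
padded_rows(8, 4)                    # 11
```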
The implementation has been compared to the `hfilter` function in Matlab using real GDP data. The `test` folder contains:
- `logGDPC1.csv`: log real GDP from FRED
- `matlab_hfilter.csv`: Matlab output for comparison
The filter output matches the benchmark.
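A comparison of this kind has to treat the `NaN` padding at the start of both series as equal. The sketch below shows one way to do that with `isapprox` and its `nans` keyword. The helper name, the tolerance, and the numbers are illustrative assumptions and are not taken from the package's test suite.

```julia
# Hypothetical helper: compare filter output with a benchmark column,
# treating NaN padding in the same position as equal.
matches_benchmark(ours, benchmark; atol = 1e-8) =
    length(ours) == length(benchmark) &&
    all(isapprox(a, b; atol = atol, nans = true) for (a, b) in zip(ours, benchmark))

ours = [NaN, NaN, 0.25, -0.10]            # illustrative values, not real output
bench = [NaN, NaN, 0.25 + 1e-12, -0.10]   # illustrative values, not real output
matches_benchmark(ours, bench)            # true
matches_benchmark(ours, [NaN, 0.0, 0.25, -0.10])   # false
```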
Missing values are handled internally using the following procedure:
- We create both the regressor matrix and the outcome vector for the regression that needs to be estimated.
- We then find all the rows in the regressor matrix that contain `NaN`, `Inf`, or `missing`. We do the same for the outcome vector. The union of the two sets contains all the observations that we cannot use for the estimation of the regression.
- The regression is estimated using all those rows that do not contain `NaN`, `Inf`, or `missing` in either the regressor matrix or the outcome vector.
- The trend is obtained via the fitted values. Thus, the trend can be obtained for all observations for which none of the regressors is `NaN`, `Inf`, or `missing`.
- The cycle is obtained as the difference between the observation and the trend. Thus, computing the cycle requires that neither the regressors nor the observation is `NaN`, `Inf`, or `missing`.
- Depending on the missing type (`NaN`, `Inf`, `missing`), values that cannot be computed are replaced by `NaN`, `Inf`, or `missing`, respectively. For this step we rely on Julia's internal handling of mathematical computations using `NaN`, `Inf`, and `missing`.
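The procedure above can be sketched for a single series as follows. This is an illustration under stated assumptions, not the package's code: the function name is hypothetical, the outputs cover only the regression rows (dates `p + h` onward) without padding, and the regression is estimated by ordinary least squares.

```julia
using Random

# Hypothetical sketch of the procedure above; not the package's code.
unusable(v) = ismissing(v) || isnan(v) || isinf(v)

function hamilton_missing_sketch(y::AbstractVector, h::Integer, p::Integer)
    ts = p:(length(y) - h)
    # Regressor matrix (intercept first) and outcome vector.
    X = Union{Missing,Float64}[j == 0 ? 1.0 : y[t - j + 1] for t in ts, j in 0:p]
    target = Union{Missing,Float64}[y[t + h] for t in ts]
    # Rows with an unusable value in the regressors or in the outcome.
    bad = [any(unusable, X[i, :]) || unusable(target[i]) for i in eachindex(target)]
    # Estimate the regression on the remaining rows only.
    good = .!bad
    beta = Float64.(X[good, :]) \ Float64.(target[good])
    # Fitted values; missing and NaN regressors propagate through the sum.
    fitted = [sum(X[i, j] * beta[j] for j in 1:(p + 1)) for i in eachindex(target)]
    # Cycle: observation minus trend, again relying on propagation.
    resid = target .- fitted
    return fitted, resid
end

y = Vector{Union{Missing,Float64}}(cumsum(randn(MersenneTwister(1), 60)))
y[20] = missing
fitted, resid = hamilton_missing_sketch(y, 8, 4)
count(ismissing, fitted)   # 4: rows where y[20] is a regressor
count(ismissing, resid)    # 5: those rows plus the row where y[20] is the outcome
```

The row where `y[20]` is the outcome still has a trend value, since all of its regressors are observed, but no cycle value.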
The precedence of missingness is `missing` > `NaN` > `Inf`. Thus:
- If values needed for the computation of trend or cycle only contain one of `missing`, `NaN`, or `Inf`, then missing values are filled with that value.
- If values needed for the computation of trend or cycle contain more than one of `missing`, `NaN`, or `Inf`, then missing values are filled using the highest precedence value. E.g. `missing` is used if both `missing` and `NaN` are present.
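This precedence matches how Julia's own arithmetic propagates these values, which can be checked directly in the REPL:

```julia
# Julia's arithmetic propagates missing over NaN, and NaN over Inf.
missing + NaN    # missing
missing + Inf    # missing
NaN + Inf        # NaN
1.0 + Inf        # Inf
```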