Copulas.jl
A Distributions.jl-compliant copula package.
The Copulas.jl package provides a large collection of models for dependence structures of real random vectors, known as copulas, with a wide selection of features:
random number generation
evaluation of (log)density and distribution functions
copula-based multivariate distributions via Sklar's theorem
fitting procedures, including marginal models
evaluation of dependence metrics
Since copulas are distribution functions, we fully comply with the Distributions.jl API. This compliance allows direct interoperability with other packages based on this API, such as Turing.jl.
Usually, users who work with copulas turn to the R package copula. While still well-maintained and regularly updated, the R package copula is a complicated code base in terms of readability, extensibility, reliability, and maintenance. This package aims to provide a lightweight, fast, reliable, and maintainable copula implementation in native Julia. Among other benefits, a notable feature of such a native implementation is floating point type agnosticism, i.e., compatibility with BigFloat, DoubleFloats, MultiFloats, and other numeric types.
The package revolves around two main types:
Copula, the abstract supertype of all copulas
SklarDist, the type for multivariate compound distributions via Sklar's theorem
The package is registered in Julia's General registry so you may simply install the package by running :
] add CopulasThe API contains random number generation, cdf and pdf evaluation, and the fit function from Distributions.jl. A typical use case might look like this:
using Copulas, Distributions, Random, Plots
X₁ = Gamma(2,3)
X₂ = Beta(1,4)
X₃ = Normal()
C = ClaytonCopula(3,5.2) # A 3-variate Clayton Copula with θ = 0.7
D = SklarDist(C,(X₁,X₂,X₃)) # The final distribution
simu = rand(D,1000) # Generate a dataset
D̂ = fit(SklarDist{ClaytonCopula,Tuple{Gamma,Normal,Normal}}, simu) # estimate a model
plot(D̂) # plot the result
The list of availiable copula models is very large, check it out on our documentation ! The general implementation philosophy is for the code to follow the mathematical boundaries of the implemented concepts. For example, this is the only implementation we know (in any language) that allows for all Archimedean copulas to be sampled: we use the Williamson transformation for non-standard generators, including user-provided black-box ones.
There are competing packages in Julia, such as BivariateCopulas.jl which only deals with a few models in bivariate settings but has very nice graphs, or DatagenCopulaBased.jl, which only provides sampling and does not have exactly the same models as Copulas.jl. Since recently, we cover both of these packages’ functionalities completely, while still bringing, as a key feature, compliance with the broader ecosystem. The following table provides a feature comparison between the three:
Copulas.jl | DatagenCopulaBased.jl | BivariateCopulas.jl | |
|---|---|---|---|
Distributions.jl's API | ✔️ | ❌ | ✔️ |
| Fitting | ✔️ | ❌ | ❌ |
| Plotting | ✔️ | ❌ | ✔️ |
| Conditioning | ✔️ | ❌ | ⚠️ Bivariate Only |
| Available copulas | |||
| - Classic Bivariate | ✔️ | ✔️ | ✔️ |
| - Obscure Bivariate | ✔️ | ❌ | ❌ |
| - Classic Multivariate | ✔️ | ✔️ | ❌ |
| - Archimedeans | ✔️ All of them | ⚠️ Selected ones | ⚠️Selected ones |
| - Extreme Value Copulas | ⚠️ Bivariate only | ❌ | ❌ |
| - Archimax | ⚠️ Bivariate only | ❌ | ❌ |
| - Archimedean Chains | ❌ | ✔️ | ❌ |
Since our primary target is maintainability and readability of the implementation, we did not consider the efficiency and the performance of the code yet. Proper benchmarks will come in the near future.
Here is a quick, practical tour of the public API. It shows how to construct copulas, build Sklar distributions, compute dependence metrics, subset and condition models, use Rosenblatt transforms, and fit models. For background, theory, details, and model descriptions, see the Manual.
You can construct a copula object with their respective constructors. They behave like multivariate distributions from Distributions.jl and respect their API:
using Copulas, Distributions, Random, StatsBase
# A 3-variate Clayton copula
C = ClaytonCopula(3, 2.0)
U = rand(C, 5)
Distributions.loglikelihood(C, U)6.164855199439504To build multivariate distributions, you can compose a copula with marginals via Sklar’s theorem:
X₁, X₂, X₃ = Gamma(2,3), Beta(1,5), LogNormal(0,1)
C2 = GumbelCopula(3, 1.7)
D = SklarDist(C2, (X₁, X₂, X₃))
rand(D, 3)
pdf(D, rand(3))0.027317111453056257You can get scalar dependence metrics at copula level:
(
kendall_tau = Copulas.τ(C),
spearm_rho = Copulas.ρ(C),
blomqvist_beta = Copulas.β(C),
gini_gamma = Copulas.γ(C),
entropy_iota = Copulas.ι(C),
lower_tail_dep = Copulas.λₗ(C),
upper_tail_dep = Copulas.λᵤ(C)
)(kendall_tau = 0.5, spearm_rho = 0.6822338340468779, blomqvist_beta = 0.19419223902606517, gini_gamma = 0.5310295802132414, entropy_iota = -0.9980063467115957, lower_tail_dep = 0.5773502691896257, upper_tail_dep = 0.0)Pairwise matrices of bivariate versions are available through StatsBase.corkendall(C), StatsBase.corspearman(C), Copulas.corblomqvist(C), Copulas.corgini(C), Copulas.corentropy(C), Copulas.corlowertail(C), and Copulas.coruppertail(C).
Same functions work passing a dataset instead of the copula for their empirical counterpart.
The measure function measures hypercubes under the distribution of the copula. You can access the Rosenblatt transformation of a copula (or a Sklar distribution) through the rosenblatt and inverse_rosenblatt functions:
Copulas.measure(C, (0.1,0.2,0.3), (0.9,0.8,0.7))
x = rand(D, 100)
u = rosenblatt(D, x)
x2 = inverse_rosenblatt(D, u)
maximum(abs.(x2 .- x))3.768073630894264e-10You can subset the dimensions of a model through subsetdims(), and you can condition a model on some of its marginals with condition():
S23 = subsetdims(C2, (2,3))
StatsBase.corkendall(S23)
Dj = condition(C2, 2, 0.3) # Distributions of (U₁, U₃) | U₂ = 0.3 (d=2)
Distributions.cdf(Dj, [0.95, 0.80])
Dc = condition(D, (2,3), (0.3, 0.2))
rand(Dc, 2)2-element Vector{Float64}:
4.959723957455608
2.91597319428078Fit both marginals and copula from raw data (Sklar):
X = rand(D, 500)
M = fit(CopulaModel, SklarDist{GumbelCopula, Tuple{Gamma,Beta,LogNormal}}, X; copula_method=:mle)────────────────────────────────────────────────────────────────────────────────
[ CopulaModel: SklarDist ] (Copula=Archimedean d=3, Margins=(Gamma, Beta, LogNormal))
────────────────────────────────────────────────────────────────────────────────
Copula: Archimedean d=3
Margins: (Gamma, Beta, LogNormal)
Methods: copula=mle, sklar=ifm
Number of observations: 500
────────────────────────────────────────────────────────────────────────────────
[ Fit metrics ]
────────────────────────────────────────────────────────────────────────────────
Null Loglikelihood: -1710.4098
Loglikelihood: -1363.7405
LR (vs indep.): 693.34 ~ χ²(1) ⇒ p = <1e-16
AIC: 2729.481
BIC: 2733.696
Converged: true
Iterations: 4
Elapsed: 0.744s
────────────────────────────────────────────────────────────────────────────────
[ Dependence metrics ]
────────────────────────────────────────────────────────────────────────────────
Kendall τ: 0.4427
Spearman ρ: 0.6152
Blomqvist β: 0.1390
Gini γ: 0.4601
Upper λᵤ: 0.4300
Lower λₗ: 0.0000
Entropy ι: -0.7089
────────────────────────────────────────────────────────────────────────────────
[ Copula parameters ] (vcov=hessian)
────────────────────────────────────────────────────────────────────────────────
Parameter Estimate Std.Err z-value p-val 95% Lo 95% Hi
θ 1.7942 0.0451 39.781 <1e-16 1.7058 1.8826
────────────────────────────────────────────────────────────────────────────────
[ Marginals ]
────────────────────────────────────────────────────────────────────────────────
Margin Dist Param Estimate Std.Err 95% CI
#1 Gamma α 2.0075 0.1179 [1.7763, 2.2386]
θ 3.0530 0.2036 [2.6540, 3.4521]
#2 Beta α 0.9076 0.0503 [0.8091, 1.0061]
β 4.3729 0.2971 [3.7907, 4.9552]
#3 LogNormal μ 0.0058 0.0482 [-0.0887, 0.1003]
σ 1.0786 0.0342 [1.0117, 1.1456]Directly fit a copula from pseudo-observations U:
U = pseudos(X)
Ĉ = fit(GumbelCopula, U; method=:itau)GumbelCopula{3, Float64}(θ = 1.7622073397446856,)Notes
fit chooses a reasonable default per family; pass method/copula_method to control it.
Common methods: copulas :mle, :itau, :irho, :ibeta; Sklar :ifm (parametric CDFs) and :ecdf (pseudo-observations).
If you want to contribute to the package, ask a question, found a bug or simply want to chat, do not hesitate to open an issue on the Copulas.jl repository
Do not hesitate to star this repository to show support. If you use this package in your researches, please cite it with the following bibtex code:
@article{LavernyJimenez2024,
author = {Oskar Laverny and Santiago Jimenez},
title = {Copulas.jl: A fully Distributions.jl-compliant copula package},
journal = {Journal of Open Source Software},
doi = {10.21105/joss.06189},
url = {https://doi.org/10.21105/joss.06189},
year = {2024},
publisher = {The Open Journal},
volume = {9},
number = {94},
pages = {6189}
}