Empirical models
Pseudo-observations
During the statistical process leading to the estimation of a copula, one usually observes data on the scale of the marginals, not on the copula scale. This discrepancy between the observed information and the modeled distribution must be taken into account. A key concept is that of pseudo-observations.
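If $\mathbf x \in \mathbb R^{N \times d}$ is an $N$-sample of a $d$-variate random vector $\mathbf X$, the pseudo-observations are the normalized ranks of the marginals of $\mathbf x$:
$$u_{i,j} = \frac{\mathrm{Rank}(x_{i,j}, \mathbf x_{\cdot,j})}{N+1}, \quad i = 1,\dots,N,\ j = 1,\dots,d,$$
where $\mathrm{Rank}(y, \mathbf x) = \sum_{x_i \in \mathbf x} \mathbb 1_{x_i \le y}$ is the rank of $y$ within the sample $\mathbf x$.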
In Copulas.jl, we provide a function pseudos that implements this transformation directly.
Copulas.pseudos Function
pseudos(sample)
Compute the pseudo-observations of a multivariate sample. Note that the sample has to be given in wide format (d,n), where d is the dimension and n the number of observations.
Warning: the order used is ordinal ranking (see https://en.wikipedia.org/wiki/Ranking#Ordinal_ranking_.28.221234.22_ranking.29); see StatsBase.ordinalrank for the ordering we use. If you want more flexibility, check out NormalizeQuantiles.sampleranks.
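To make the transformation concrete, here is a minimal sketch in plain Julia, independent of the package. It assumes ordinal ranks (ties broken by order of appearance) divided by n + 1; the helper names ordinal_ranks and pseudo_observations are hypothetical and are not part of Copulas.jl.

```julia
# Ordinal ranks of a vector: the i-th smallest entry gets rank i.
# sortperm is stable by default, so ties are broken by order of appearance.
function ordinal_ranks(v::AbstractVector)
    r = similar(v, Int)
    r[sortperm(v)] .= 1:length(v)
    return r
end

# Pseudo-observations of a d×n sample (wide format): ranks divided by n + 1,
# computed independently on each row (each marginal).
function pseudo_observations(x::AbstractMatrix)
    d, n = size(x)
    u = Matrix{Float64}(undef, d, n)
    for j in 1:d
        u[j, :] = ordinal_ranks(x[j, :]) ./ (n + 1)
    end
    return u
end

x = [0.3 -1.2 2.5 0.9;
     10.0 40.0 20.0 30.0]
u = pseudo_observations(x)
# u == [0.4 0.2 0.8 0.6;
#       0.2 0.8 0.4 0.6]
```

Dividing by n + 1 rather than n keeps every pseudo-observation strictly inside (0, 1).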
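Deheuvels' empirical copula
From these pseudo-observations, an empirical copula is defined and analysed in [58] as follows:
The empirical distribution function of the normalized ranks,
$$\hat C_N(\mathbf u) = \frac{1}{N} \sum_{i=1}^N \mathbb 1_{\mathbf u_i \le \mathbf u},$$
is called the empirical copula function.
$\hat C_N$ is an exhaustive estimator of $C$ [58].
Under suitable regularity conditions on $C$, the empirical copula process $\sqrt{N}(\hat C_N - C)$ then converges (weakly) to a centered Gaussian process.
Despite its name, $\hat C_N$ is not a copula, since its marginals are not uniform.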
In the package, this copula is implemented as the EmpiricalCopula:
Copulas.EmpiricalCopula Type
EmpiricalCopula{d, MT}
Fields:
- u::MT - pseudo-observation matrix of size (d, N).
Constructor
EmpiricalCopula(u; pseudo_values=true)
The empirical copula in dimension $d$ is defined from a matrix of pseudo-observations $u = (u_{\cdot,1}, \dots, u_{\cdot,N})$, with columns in $[0,1]^d$, as
$$C(x) = \frac{1}{N} \sum_{i=1}^N \mathbb 1\{u_{\cdot,i} \le x\},$$
where the inequality is componentwise. If pseudo_values=false, the constructor first ranks the raw data into pseudo-observations; otherwise it assumes u already contains pseudo-observations in $[0,1]^d$.
Notes:
- This is an empirical object based on pseudo-observations; it is not necessarily a true copula for finite $N$ but is widely used for nonparametric inference.
- Supports cdf, logpdf at observed points, random sampling, and subsetting.
References:
- [3] Nelsen, Roger B. An introduction to copulas. Springer, 2006.
Distortions: available via the generic implementation (partial-derivative ratios). For the empirical copula, derivatives are stepwise; interpret results carefully near sample jumps.
Conditional copulas: available via the generic implementation. No specialized fast path is provided.
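The following plain-Julia sketch evaluates the empirical copula formula by direct counting and shows why it is not a true copula for finite samples: its first margin at 0.3 equals 0.25. The function name empirical_cdf is hypothetical and is not the package implementation.

```julia
# u is a d×n matrix of pseudo-observations; x is a point in [0,1]^d.
# The empirical copula is the proportion of columns of u that are
# componentwise below x.
empirical_cdf(u::AbstractMatrix, x::AbstractVector) =
    count(i -> all(u[:, i] .<= x), 1:size(u, 2)) / size(u, 2)

u = [0.4 0.2 0.8 0.6;
     0.2 0.8 0.4 0.6]

c_mid = empirical_cdf(u, [0.5, 0.5])     # 0.25: only the column (0.4, 0.2) qualifies
c_margin = empirical_cdf(u, [0.3, 1.0])  # 0.25, whereas a true copula would give 0.3
```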
Visual: empirical copula from pseudo-observations
using Copulas, Distributions, Plots
# generate data with known dependence, then compute pseudo-observations
X = SklarDist(ClaytonCopula(2, 1.2), (Normal(), Beta(1, 4)))
x = rand(X, 1000)
Ĉ = EmpiricalCopula(x, pseudo_values=false)
plot(plot(X.C), plot(Ĉ); layout=(1,2))
Beta copula
The empirical copula function is not a copula. An easy way to fix this problem is to smooth out the marginals with beta distribution functions. The Beta copula is thus defined and analysed in [59] as follows:
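Denoting $F_{N,r}(x) = \sum_{s=r}^{N} \binom{N}{s} x^s (1-x)^{N-s}$ the distribution function of a $\mathrm{Beta}(r, N+1-r)$ random variable, the function
$$\hat C_N^{\beta}(\mathbf x) = \frac{1}{N} \sum_{i=1}^N \prod_{j=1}^d F_{N,(N+1)u_{i,j}}(x_j),$$
where $(N+1)u_{i,j}$ is the rank of the $i$-th observation within the $j$-th marginal sample,
is a genuine copula, called the Beta copula.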
In the package, this copula is implemented as BetaCopula:
Copulas.BetaCopula Type
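BetaCopula{d, MT}
Fields:
- ranks::MT - ranks matrix (d × n), each row contains integers 1..n
Constructor
BetaCopula(u)
The empirical beta copula in dimension $d$ is defined as
$$C(u) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d F_{n,R_{j,i}}(u_j),$$
where $F_{n,r}$ is the distribution function of a $\mathrm{Beta}(r, n+1-r)$ random variable and $R_{j,i}$ is the rank of the $i$-th observation in the $j$-th margin.
Notes:
- This is always a valid copula for any finite sample size n.
- Supports cdf, logpdf at observed points and random sampling.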
References:
- [60] Segers, J., Sibuya, M., & Tsukahara, H. (2017). The empirical beta copula. Journal of Multivariate Analysis, 155, 35-51.
Distortions: specialized fast path returning a MixtureModel of Beta components for efficient evaluation and sampling.
Conditional copulas: available via the generic implementation (no dedicated fast path).
Performance notes
- Construction is O(d·n) after pseudo-observations are computed. Evaluation at a point uses O(d·n) basis lookups; consider subsampling for very large n.
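As an illustration of the O(d·n) evaluation, here is a plain-Julia sketch of the empirical beta copula CDF. It writes the Beta(r, n + 1 - r) distribution function as a binomial tail sum, which is only practical for small n; the names beta_cdf and beta_copula_cdf are hypothetical and this is not the package implementation.

```julia
# Distribution function of Beta(r, n + 1 - r) for integer r, written as the
# binomial tail sum P(Binomial(n, x) >= r).
beta_cdf(n::Integer, r::Integer, x::Real) =
    sum(binomial(n, s) * x^s * (1 - x)^(n - s) for s in r:n)

# ranks is a d×n integer matrix whose rows are permutations of 1:n.
function beta_copula_cdf(ranks::AbstractMatrix{<:Integer}, x::AbstractVector)
    d, n = size(ranks)
    return sum(prod(beta_cdf(n, ranks[j, i], x[j]) for j in 1:d) for i in 1:n) / n
end

ranks = [2 1 4 3;
         1 4 2 3]
margin = beta_copula_cdf(ranks, [0.3, 1.0])  # ≈ 0.3: the first margin is uniform
```

The margin is exactly uniform because the tail probabilities P(Binomial(n, x) >= r), summed over r = 1, ..., n, give the binomial mean n·x.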
Bernstein Copula
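Bernstein copulas are simply another smoothing of the empirical copula using Bernstein polynomials.
Mathematically, given a base copula $C$ and a vector of degrees $\mathbf m = (m_1, \dots, m_d)$, the Bernstein copula is
$$B_{\mathbf m}(C)(\mathbf u) = \sum_{s_1=0}^{m_1} \cdots \sum_{s_d=0}^{m_d} C\!\left(\frac{s_1}{m_1}, \dots, \frac{s_d}{m_d}\right) \prod_{j=1}^d \binom{m_j}{s_j} u_j^{s_j} (1-u_j)^{m_j-s_j}.$$
It is a multivariate Bernstein polynomial approximation of $C$.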
In the package, this copula is implemented as BernsteinCopula:
Copulas.BernsteinCopula Type
BernsteinCopula{d}
Fields:
- m::NTuple{d,Int} - polynomial degrees (smoothing parameters)
- weights::Array{Float64, d} - precomputed grid of box measures
Constructor
BernsteinCopula(C; m=10)
BernsteinCopula(data; m=10)
The Bernstein copula in dimension $d$ is defined as
$$B_{m}(C)(u) = \sum_{s_1=0}^{m_1} \cdots \sum_{s_d=0}^{m_d} C\!\left(\frac{s_1}{m_1}, \dots, \frac{s_d}{m_d}\right) \prod_{j=1}^d \binom{m_j}{s_j} u_j^{s_j} (1-u_j)^{m_j-s_j}.$$
It is a polynomial approximation of the base copula $C$ using Bernstein polynomials.
Implementation notes:
- The grid of box measures (weights) is fully precomputed and stored as a $d$-dimensional array at construction. This enables fast evaluation of the copula and its density, but can be memory-intensive for large $d$ or $m$.
- The choice of m controls the smoothness of the approximation: larger m yields a finer approximation but exponentially increases memory and computation cost ($\prod_j m_j$ boxes).
- For high dimensions or large $m$, memory usage may become prohibitive; see documentation for scaling behavior.
- If $C$ is an EmpiricalCopula, the constructor produces the empirical Bernstein copula, a smoothed version of the empirical copula.
- Supports cdf, logpdf, and random generation via mixtures of beta distributions.
References:
[61] Sancetta, A., & Satchell, S. (2004). The Bernstein copula and its applications to modeling and approximations of multivariate distributions. Econometric Theory, 20(3), 535-562.
[60] Segers, J., Sibuya, M., & Tsukahara, H. (2017). The empirical beta copula. Journal of Multivariate Analysis, 155, 35-51.
Distortions: specialized fast path returning a MixtureModel of Beta components (weights from Bernstein grid finite differences conditioned on the value of the conditioning coordinate).
Conditional copulas: available via the generic implementation (no dedicated fast path).
Performance notes
Complexity grows with the grid size ∏_j (m_j+1) for cdf and ∏_j m_j for pdf. In higher dimensions, keep m small or prefer the 2D specialized paths provided.
Small negative finite differences from numerical noise are clipped to zero before normalization.
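To illustrate how the cost grows with the grid size, here is a plain-Julia sketch of the bivariate Bernstein approximation of a copula passed as a function, using a common degree m on both axes. It sums over all (m + 1)² grid nodes; the names bernstein and bernstein_cdf are hypothetical and this is not the package implementation.

```julia
# Bernstein basis polynomial of degree m and index s, evaluated at u.
bernstein(m::Int, s::Int, u::Real) = binomial(m, s) * u^s * (1 - u)^(m - s)

# Bivariate Bernstein approximation of a copula given as a function C(u, v):
# a double sum over the (m + 1)^2 nodes of the regular grid.
function bernstein_cdf(C, m::Int, u::Real, v::Real)
    return sum(C(s / m, t / m) * bernstein(m, s, u) * bernstein(m, t, v)
               for s in 0:m, t in 0:m)
end

indep(u, v) = u * v
b_indep = bernstein_cdf(indep, 10, 0.3, 0.7)  # ≈ 0.21
b_comon = bernstein_cdf(min, 10, 0.5, 0.5)    # smoothed comonotone copula
```

The independence copula is reproduced exactly because Bernstein operators preserve linear functions of each coordinate, whereas the comonotone copula min(u, v) is genuinely smoothed.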
Checkerboard Copulas
There are other nonparametric estimators of the copula function that are true copulas. Of interest to our work is the Checkerboard construction (see [62, 63]), detailed below.
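First, for any $\mathbf m \in \mathbb N^d$, let $\{B_{\mathbf i,\mathbf m},\ \mathbf i < \mathbf m\}$ be the partition of the unit hypercube defined by
$$B_{\mathbf i,\mathbf m} = \left]\frac{\mathbf i}{\mathbf m}, \frac{\mathbf i + \mathbf 1}{\mathbf m}\right],$$
where $\mathbf i$ ranges over the integer vectors with $0 \le i_j < m_j$ and the operations are componentwise.
Furthermore, for any copula $C$, we denote by $\mu_C$ the associated measure. For example, for the independence copula $\Pi$, $\mu_\Pi$ is the Lebesgue measure restricted to the unit hypercube.
Let $\mathbf m \in \mathbb N^d$. The $\mathbf m$-Checkerboard copula $\hat C_{N,\mathbf m}$, defined by
$$\hat C_{N,\mathbf m}(\mathbf x) = \Big(\prod_{j=1}^d m_j\Big) \sum_{\mathbf i < \mathbf m} \mu_{\hat C_N}(B_{\mathbf i,\mathbf m})\, \mu_\Pi\big(B_{\mathbf i,\mathbf m} \cap [\mathbf 0, \mathbf x]\big),$$
is a genuine copula as soon as $m_1, \dots, m_d$ all divide $N$.
If all $m_1, \dots, m_d$ divide $N$ and grow with $N$, then $\hat C_{N,\mathbf m}$ is a consistent estimator of $C$.
This copula is called Checkerboard, as it fills the unit hypercube with hyperrectangles of the same shape $B_{\mathbf i,\mathbf m}$, each carrying a uniform distribution and weighted by its empirical frequency.
It can be noted that there is no need for the hyperrectangles to be filled with a uniform distribution ($\mu_\Pi$): any copula measure can be used inside each of them.
Denoting
$$\mathbf x^{(\mathbf i,\mathbf m)} = \min\big(\max(\mathbf m \mathbf x - \mathbf i, \mathbf 0), \mathbf 1\big),$$
where we intend the product, minimum and maximum componentwise, the Checkerboard copula can be rewritten as $\hat C_{N,\mathbf m}(\mathbf x) = \sum_{\mathbf i < \mathbf m} \mu_{\hat C_N}(B_{\mathbf i,\mathbf m})\, \Pi(\mathbf x^{(\mathbf i,\mathbf m)})$.
This allows for an easy generalization in the framework of patchwork copulas [64–66]:
Let $\mathbf m \in \mathbb N^d$ have all its components dividing $N$, and let $\{C_{\mathbf i},\ \mathbf i < \mathbf m\}$ be a collection of copulas. The distribution function
$$\mathbf x \mapsto \sum_{\mathbf i < \mathbf m} \mu_{\hat C_N}(B_{\mathbf i,\mathbf m})\, C_{\mathbf i}(\mathbf x^{(\mathbf i,\mathbf m)})$$
is a copula.
In fact, replacing $\hat C_N$ by any genuine copula in this construction still yields a genuine copula, this time without any divisibility condition on $\mathbf m$.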
Convergence results for this kind of copulas can be found in [66], with a slightly different parametrization.
In the package, this copula is implemented as CheckerboardCopula:
Copulas.CheckerboardCopula Type
CheckerboardCopula{d, T}
Fields:
- m::Vector{Int} - length d; number of partitions per dimension (grid resolution).
- boxes::Dict{NTuple{d,Int}, T} - dictionary-like mapping from grid box indices to empirical weights. Typically Dict{NTuple{d,Int}, Float64} built with StatsBase.proportionmap.
Constructor:
CheckerboardCopula(X; m=nothing, pseudo_values=true)
Builds a piecewise-constant (histogram) copula on a regular grid. The unit cube in each dimension i is partitioned into m[i] equal bins. Each observation is assigned to a box k ∈ ∏_i {0, …, m[i]-1}; the empirical box weights w_k sum to 1. The copula density is constant inside each box, with
c(u) = w_k × ∏_i m[i] when u ∈ box k, and 0 otherwise.
The CDF admits the multilinear overlap form
C(u) = ∑_k w_k × ∏_i clamp(m[i]·u_i − k_i, 0, 1),
which this type evaluates directly without storing all grid corners.
Notes:
- If m is nothing, we use m = fill(n, d) where n = size(X, 2).
- When pseudo_values=true (default), X must already be pseudo-observations in [0,1]. Otherwise pass raw data and set pseudo_values=false to convert via pseudos(X).
- Each m[i] must divide n to produce a valid checkerboard on the sample grid; this is enforced by the constructor.
References
Neslehova (2007). On rank correlation measures for non-continuous random variables.
Durante, Sanchez & Sempi (2013) Multivariate patchwork copulas: a unified approach with applications to partial comonotonicity.
Segers, Sibuya & Tsukahara (2017). The empirical beta copula. J. Multivariate Analysis, 155, 35-51.
Genest, Neslehova & Rémillard (2017) Asymptotic behavior of the empirical multilinear copula process under broad conditions.
Cuberos, Masiello & Maume-Deschamps (2019) Copulas checker-type approximations: application to quantiles estimation of aggregated variables.
Fredricks & Hofert (2025). On the checkerboard copula and maximum entropy.
Distortions: specialized for conditioning on a single coordinate (p=1) via a histogram-bin distortion on the corresponding slice.
Conditional copulas: specialized projection onto remaining axes, renormalizing the mass in the fixed bins, still returns a Checkerboard.
Performance notes
Construction cost scales with sample size but stores only occupied boxes (sparse). CDF evaluation is O(#occupied boxes) at query time.
For large n choose coarser m to reduce occupied boxes; for small n a finer grid is possible but may leave many empty boxes.
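The sparse storage and the O(#occupied boxes) evaluation can be sketched in plain Julia as follows. The binning rule floor(m·u) is an assumption that matches pseudo-observations of the form rank / (n + 1) when each m[i] divides n; the names box_weights and checkerboard_cdf are hypothetical and this is not the package implementation.

```julia
# Assign each pseudo-observation to a box index k with k_i in 0:(m_i - 1)
# and accumulate the empirical proportion of points per occupied box.
function box_weights(u::AbstractMatrix, m::Vector{Int})
    d, n = size(u)
    w = Dict{NTuple{d,Int},Float64}()
    for i in 1:n
        k = ntuple(j -> min(floor(Int, m[j] * u[j, i]), m[j] - 1), d)
        w[k] = get(w, k, 0.0) + 1 / n
    end
    return w
end

# Multilinear overlap form: C(x) = Σ_k w_k Π_i clamp(m_i x_i - k_i, 0, 1),
# summing over occupied boxes only.
checkerboard_cdf(w, m, x) =
    sum(wk * prod(clamp(m[i] * x[i] - k[i], 0, 1) for i in eachindex(m))
        for (k, wk) in w)

u = [0.4 0.2 0.8 0.6;
     0.2 0.8 0.4 0.6]
m = [2, 2]
w = box_weights(u, m)                         # four boxes, each with weight 0.25
margin = checkerboard_cdf(w, m, [0.3, 1.0])   # ≈ 0.3: uniform first margin
```

Only occupied boxes appear in the dictionary, so empty regions of the grid cost nothing at evaluation time.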
Empirical Extreme-Value copula (Pickands estimator)
In addition to the empirical, beta, Bernstein and checkerboard constructions, we provide a nonparametric bivariate Extreme Value copula built from data by estimating the Pickands dependence function. The tail implementation EmpiricalEVTail supports several classical estimators (Pickands, CFG, OLS intercept), and a convenience constructor EmpiricalEVCopula builds the corresponding ExtremeValueCopula directly from pseudo-observations.
Typical workflow:
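A minimal sketch of such a workflow, using only the constructors documented on this page. The Gumbel parameter, the sample size, the :cfg method and the evaluation point are arbitrary illustrative choices, and the sketch assumes that cdf is available on the resulting copula as for other copulas in the package.

```julia
using Copulas, Distributions

# Simulate bivariate data with extreme-value dependence (a 2×500 matrix).
x = rand(GumbelCopula(2, 2.5), 500)

# Move to the copula scale.
u = pseudos(x)

# Estimate the Pickands dependence function, here with the CFG estimator.
tail = EmpiricalEVTail(u; method=:cfg)

# Wrap the estimated tail into a bivariate extreme-value copula.
Ĉ = ExtremeValueCopula(2, tail)

# Evaluate the fitted copula at a point of the unit square.
p = cdf(Ĉ, [0.5, 0.5])
```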
See the Extreme Value manual page for background and the bestiary entry for the full API of EmpiricalEVTail.
Copulas.EmpiricalEVTail Type
EmpiricalEVTail
Fields:
- tgrid::Vector{Float64} - evaluation grid in (0,1)
- Ahat::Vector{Float64} - estimated Pickands function values on tgrid
- slope::Vector{Float64} - per-segment slopes for linear interpolation
Constructor
EmpiricalEVTail(u; method=:ols, grid=401, eps=1e-3, pseudo_values=true)
ExtremeValueCopula(2, EmpiricalEVTail(u; ...))
The empirical extreme-value (EV) copula (bivariate) is defined from pseudo-observations u = (U₁, U₂) and a nonparametric estimator of the Pickands dependence function. Supported estimators are:
- :pickands - classical Pickands estimator
- :cfg - Capéraà–Fougères–Genest (CFG) estimator
- :ols - OLS-intercept estimator
For stability, the estimated function is always projected onto the class of valid Pickands functions (convex, bounded between max(t,1-t) and 1, with endpoints fixed at 1).
Its Pickands function is Â(t), t ∈ (0,1), evaluated via piecewise linear interpolation on the grid tgrid.
References
[caperaa1997nonparametric] Capéraà, Fougères, Genest (1997) Biometrika
[gudendorf2011nonparametric] Gudendorf, Segers (2011) Journal of Multivariate Analysis
Empirical Archimedean generator (Kendall inversion)
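Beyond copula estimators, we also provide a nonparametric estimator of a $d$-monotone Archimedean generator, following [37].
Approximating the (unknown) radial law of the Williamson $d$-transform by a discrete distribution $\sum_j w_j \delta_{r_j}$ yields the piecewise-polynomial generator
$$\hat\phi(t) = \sum_j w_j \left(1 - \frac{t}{r_j}\right)_+^{d-1}.$$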
The radii and weights are recovered from the empirical Kendall distribution via a triangular recursion (see the example page for details), and the resulting generator is exposed as EmpiricalGenerator.
Usage:
- Build from data u::d×n (raw or pseudos): Ĝ = EmpiricalGenerator(u; pseudo_values=true)
- Use directly in an Archimedean copula: Ĉ = ArchimedeanCopula(d, Ĝ)
- Access the fitted radial law: R̂ = williamson_dist(Ĝ, d)
Copulas.EmpiricalGenerator Function
EmpiricalGenerator(u::AbstractMatrix)
Nonparametric Archimedean generator fit via inversion of the empirical Kendall distribution.
This function returns a WilliamsonGenerator{TX} whose underlying distribution TX is a Distributions.DiscreteNonParametric, rather than a separate struct. The returned object still implements all optimized methods (ϕ, derivatives, inverses) via specialized dispatch on WilliamsonGenerator{<:DiscreteNonParametric}.
Usage
G = EmpiricalGenerator(u)
where u::AbstractMatrix is a d×n matrix of observations (already on copula or pseudo scale).
Notes
- The recovered discrete radial support is rescaled so its largest atom equals 1 (scale is not identifiable).
- We keep the old documentation entry point for backward compatibility; existing code that relied on the EmpiricalGenerator type should instead treat the result as a Generator.
References
[42]
[43]
[37] Genest, Neslehova and Ziegel (2011), Inference in Multivariate Archimedean Copula Models
Performance notes
- The Kendall sample computation is currently O(n^2) in the number of observations. For large n, future versions may switch to Fenwick-tree–based sweeps to reach ~O(n log n) in bivariate cases and ~O(n log^{d-1} n) for higher d.
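The quadratic cost comes from comparing every pair of observations. Here is a plain-Julia sketch of such a double loop. The normalization by n - 1 and the use of strict inequalities are assumptions, since conventions differ between references; the name kendall_sample is hypothetical and this is not the package implementation.

```julia
# Kendall sample: for each observation i, the proportion of the other
# observations that are strictly smaller in every coordinate.
# The double loop over pairs makes this O(n^2).
function kendall_sample(u::AbstractMatrix)
    n = size(u, 2)
    W = zeros(n)
    for i in 1:n, j in 1:n
        if j != i && all(u[:, j] .< u[:, i])
            W[i] += 1
        end
    end
    return W ./ (n - 1)
end

# Perfectly comonotone pseudo-observations: point i dominates i - 1 others.
u = [0.2 0.4 0.6 0.8;
     0.2 0.4 0.6 0.8]
W = kendall_sample(u)   # ≈ [0, 1/3, 2/3, 1]
```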
See the example page for more details and example usages.
Available models
EmpiricalCopula
BernsteinCopula
CheckerboardCopula
BetaCopula
EmpiricalEVTail
References
R. B. Nelsen. An Introduction to Copulas. 2nd Edition, Springer Series in Statistics (Springer, New York, 2006).
C. Genest, J. Nešlehová and J. Ziegel. Inference in Multivariate Archimedean Copula Models. TEST 20, 223–256 (2011).
A. J. McNeil and J. Nešlehová. Multivariate Archimedean copulas, d-monotone functions and ℓ1-norm symmetric distributions. Annals of Statistics 37, 3059–3097 (2009).
R. E. Williamson. Multiply monotone functions and their Laplace transforms. Duke Mathematical Journal 23, 189–207 (1956).
P. Deheuvels. La Fonction de Dépendance Empirique et Ses Propriétés. Académie Royale de Belgique. Bulletin de la Classe des Sciences 65, 274–292 (1979).
J. Segers, M. Sibuya and H. Tsukahara. The Empirical Beta Copula. Journal of Multivariate Analysis 155, 35–51 (2017).
J. Segers, M. Sibuya and H. Tsukahara. The empirical beta copula. Journal of Multivariate Analysis 155, 35–51 (2017).
A. Sancetta and S. Satchell. The Bernstein copula and its applications to modeling and approximations of multivariate distributions. Econometric theory 20, 535–562 (2004).
A. Cuberos, E. Masiello and V. Maume-Deschamps. Copulas Checker-Type Approximations: Application to Quantiles Estimation of Sums of Dependent Random Variables. Communications in Statistics - Theory and Methods, 1–19 (2019).
P. Mikusiński and M. D. Taylor. Some Approximations of N-Copulas. Metrika 72, 385–414 (2010).
F. Durante, E. Foscolo, J. A. Rodríguez-Lallena and M. Úbeda-Flores. A Method for Constructing Higher-Dimensional Copulas. Statistics 46, 387–404 (2012).
F. Durante, J. Fernández Sánchez and C. Sempi. Multivariate Patchwork Copulas: A Unified Approach with Applications to Partial Comonotonicity. Insurance: Mathematics and Economics 53, 897–905 (2013).
F. Durante, J. Fernández-Sánchez, J. J. Quesada-Molina and M. Úbeda-Flores. Convergence Results for Patchwork Copulas. European Journal of Operational Research 247, 525–531 (2015).
O. Laverny. Empirical and Non-Parametric Copula Models with the Cort R Package. Journal of Open Source Software 5, 2653 (2020).