Today, we will go further on the inference of copula functions. Some codes (and references) can be found on a previous post, on nonparametric estimators of copula densities (among other related things).  Consider (as before) the loss-ALAE dataset (since we've been working a lot on that dataset)

> library(MASS)
> library(evd)
> X=lossalae
> U=cbind(rank(X[,1])/(nrow(X)+1),rank(X[,2])/(nrow(X)+1))

The standard tool to plot nonparametric estimators of densities is to use multivariate kernels. We can look at the density using

> mat1=kde2d(U[,1],U[,2],n=35)
> persp(mat1$x,mat1$y,mat1$z,col="green",
+ shade=TRUE,theta=s*5,
+ xlab="",ylab="",zlab="",zlim=c(0,7))

or level curves (isodensity curves) with more detailed estimators (on grids with shorter steps)

> mat1=kde2d(U[,1],U[,2],n=101)
> image(mat1$x,mat1$y,mat1$z,col=
+ rev(heat.colors(100)),xlab="",ylab="")
> contour(mat1$x,mat1$y,mat1$z,add=
+ TRUE,levels = pretty(c(0,4), 11))

http://freakonometrics.blog.free.fr/public/perso6/3dcop-est1.gif

Kernels are nice, but we clearly observe some border bias, extremely strong in corners (the estimator is 1/4th of what it should be, see another post for more details). Instead of working on sample http://latex.codecogs.com/gif.latex?(U_i,V_i) on the unit square, consider some transformed sample http://latex.codecogs.com/gif.latex?(Q(U_i),Q(V_i)), where http://latex.codecogs.com/gif.latex?Q:(0,1)\rightarrow\mathbb{R} is a given function. E.g. a quantile function of an unbounded distribution, for instance the quantile function of the http://latex.codecogs.com/gif.latex?\mathcal{N}(0,1) distribution. Then, we can estimate the density of the transformed sample, and using the inversion technique, derive an estimator of the density of the initial sample. Since the inverse of a (general) function is not that simple to compute, the code might be a bit slow. But it does work,

> gaussian.kernel.copula.surface <- function (u,v,n) {
+   s=seq(1/(n+1), length=n, by=1/(n+1))
+   mat=matrix(NA,nrow = n, ncol = n)
+ sur=kde2d(qnorm(u),qnorm(v),n=1000,
+ lims = c(-4, 4, -4, 4))
+ su<-sur$z
+ for (i in 1:n) {
+     for (j in 1:n) {
+ 	Xi<-round((qnorm(s[i])+4)*1000/8)+1;
+ 	Yj<-round((qnorm(s[j])+4)*1000/8)+1
+ 	mat[i,j]<-su[Xi,Yj]/(dnorm(qnorm(s[i]))*
+ 	dnorm(qnorm(s[j])))
+     }
+ }
+ return(list(x=s,y=s,z=data.matrix(mat)))
+ }

Here, we get

http://freakonometrics.blog.free.fr/public/perso6/3dcop-est2.gif

Note that it is possible to consider another transformation, e.g. the quantile function of a Student-t distribution.

> student.kernel.copula.surface =
+  function (u,v,n,d=4) {
+  s <- seq(1/(n+1), length=n, by=1/(n+1))
+  mat <- matrix(NA,nrow = n, ncol = n)
+ sur<-kde2d(qt(u,df=d),qt(v,df=d),n=5000,
+ lims = c(-8, 8, -8, 8))
+ su<-sur$z
+ for (i in 1:n) {
+     for (j in 1:n) {
+ 	Xi<-round((qt(s[i],df=d)+8)*5000/16)+1;
+ 	Yj<-round((qt(s[j],df=d)+8)*5000/16)+1
+ 	mat[i,j]<-su[Xi,Yj]/(dt(qt(s[i],df=d),df=d)*
+ 	dt(qt(s[j],df=d),df=d))
+     }
+ }
+ return(list(x=s,y=s,z=data.matrix(mat)))
+ }

Another strategy is to consider kernel that have precisely the unit interval as support. The idea is here to consider the product of Beta kernels, where parameters depend on the location

> beta.kernel.copula.surface=
+  function (u,v,bx=.025,by=.025,n) {
+  s <- seq(1/(n+1), length=n, by=1/(n+1))
+  mat <- matrix(0,nrow = n, ncol = n)
+ for (i in 1:n) {
+     a <- s[i]
+     for (j in 1:n) {
+     b <- s[j]
+ 	mat[i,j] <- sum(dbeta(a,u/bx,(1-u)/bx) *
+     dbeta(b,v/by,(1-v)/by)) / length(u)
+     }
+ }
+ return(list(x=s,y=s,z=data.matrix(mat)))
+ }

http://freakonometrics.blog.free.fr/public/perso6/3dcop-est3.gif

On those two graphs, we can clearly observe strong tail dependence in the upper (right) corner, that cannot be intuited using a standard kernel estimator...