[R] Cancor


Irena Komprej

2004-09-11 09:18:39 UTC

Dear R's!
I am struggling with the cancor procedure in R. I cannot figure out the
meaning of xcoef and ycoef.
Are these:
1. standardized coefficients
2. structural coefficients
3. something else?

I have tried to simulate canonical correlation analysis by checking the
eigenstructure of the expression:

solve(Sigma_xx) %*% Sigma_xy %*% solve(Sigma_yy) %*% t(Sigma_xy).

The resulting eigenvalues were the same as the squared values of
cancor$cor. I have normalized the resulting eigenvectors, the a's with

sqrt(t(a) %*% Sigma_xx %*% a), and similarly the b's with
sqrt(t(b) %*% Sigma_yy %*% b).

The results differed considerably from xcoef and ycoef of the cancor.
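A minimal sketch of the eigen approach described above, assuming random illustrative data (not the poster's) and assuming the expression is meant with inverses of Sigma_xx and Sigma_yy. The names M, a_unit and rho2 are made up for this example. It suggests that eigenvectors normalized to unit variance differ from cancor's xcoef only by a constant factor and possibly a column sign:

```r
set.seed(1)                         # illustrative data, not from the thread
x <- matrix(rnorm(150), 50, 3)
y <- matrix(rnorm(250), 50, 5)
n <- nrow(x)
cxy <- cancor(x, y)

Sxx <- var(x); Syy <- var(y); Sxy <- cov(x, y)

# Eigen-decomposition of Sxx^{-1} Sxy Syy^{-1} Syx
M <- solve(Sxx, Sxy) %*% solve(Syy, t(Sxy))
e <- eigen(M)
rho2 <- Re(e$values)                # squared canonical correlations
a <- Re(e$vectors)

# Normalize each eigenvector so that t(a) %*% Sxx %*% a == 1
a_unit <- sweep(a, 2, sqrt(diag(t(a) %*% Sxx %*% a)), "/")

max(abs(sqrt(rho2) - cxy$cor))                         # close to zero
# Dividing by sqrt(n - 1) recovers xcoef up to the sign of each column
max(abs(abs(a_unit) / sqrt(n - 1) - abs(cxy$xcoef)))   # close to zero
```

Signs of eigenvectors are arbitrary, hence the comparison of absolute values.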

I thought then, maybe these coefficients are structural coefficients and
therefore I multiplied them, the a's with
a%*%Sigma_xx, and the b's with
b%*%Sigma_yy,

but the results are nevertheless far from those of cancor. Now I
really don't know how to interpret xcoef and ycoef any more.
Thank you in advance.

Irena Komprej
(an R enthusiast as of last year)

Gabor Grothendieck

2004-09-12 18:04:22 UTC


Post by Irena Komprej
I am struggling with the cancor procedure in R. I cannot figure out the
meaning of xcoef and ycoef. Are these:
1. standardized coefficients
2. structural coefficients
3. something else?

Look at the examples at the bottom of ?cancor, from which it's evident
that xcoef is such that x %*% cxy$xcoef are the canonical variables. (More
at the end of this post.)
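As a small illustration of that statement, here is a sketch on random data of the same shape as in the help page example. The names u, v and pair_cor are introduced only for this example; xcenter and ycenter are the centering vectors that cancor returns.

```r
set.seed(1)
x <- matrix(rnorm(150), 50, 3)
y <- matrix(rnorm(250), 50, 5)
cxy <- cancor(x, y)

# Canonical variables: center with the stored means, then apply the coefficients
u <- sweep(x, 2, cxy$xcenter) %*% cxy$xcoef
v <- sweep(y, 2, cxy$ycenter) %*% cxy$ycoef

# Paired variates correlate at the canonical correlations
pair_cor <- diag(cor(u, v[, 1:3]))
max(abs(pair_cor - cxy$cor))     # close to zero

# Variates within one set are mutually uncorrelated
max(abs(cor(u) - diag(3)))       # close to zero
```

Only the first three columns of v are paired with u here, because x has three columns.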

Post by Irena Komprej
I have tried to simulate canonical correlation analysis by checking the
eigenstructure of the expression:
solve(Sigma_xx) %*% Sigma_xy %*% solve(Sigma_yy) %*% t(Sigma_xy).
The resulting eigenvalues were the same as the squared values of
cancor$cor. I have normalized the resulting eigenvectors, the a's with
sqrt(t(a) %*% Sigma_xx %*% a), and similarly the b's with
sqrt(t(b) %*% Sigma_yy %*% b).
The results differed considerably from xcoef and ycoef of the cancor.

Run the example in the help page to get some data and some
output:

set.seed(1)
example(cancor)

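# Also, define isqrt as the inverse square root of a positive definite matrix

isqrt <- function(x) {
  e <- eigen(x, symmetric = TRUE)
  stopifnot(all(e$values > 0))
  # for a symmetric input the eigenvector matrix is orthonormal, so t() inverts it
  e$vectors %*% diag(1 / sqrt(e$values), nrow = length(e$values)) %*% t(e$vectors)
}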

# we can reconstruct the canonical correlations and xcoef
# in the way you presumably intended like this:

z <- svd(cov(x,y) %*% solve(var(y), cov(y,x)) %*% solve(var(x)))
sqrt(z$d) # canonical correlations
isqrt((nrow(x)-1)*var(x)) %*% z$u # xcoef
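The product passed to svd() above is not symmetric, so its singular values need not coincide exactly with its eigenvalues. A possible alternative, offered as a sketch and not as what cancor does internally, is to take the SVD of the symmetrically whitened cross-covariance. The names K, s, rho and xcoef_svd are made up for this example, and the data are random stand-ins for the help page example.

```r
set.seed(1)
x <- matrix(rnorm(150), 50, 3)
y <- matrix(rnorm(250), 50, 5)
n <- nrow(x)
cxy <- cancor(x, y)

# Inverse symmetric square root of a positive definite matrix
isqrt <- function(m) {
  e <- eigen(m, symmetric = TRUE)
  stopifnot(all(e$values > 0))
  e$vectors %*% diag(1 / sqrt(e$values), nrow = length(e$values)) %*% t(e$vectors)
}

# SVD of Sxx^{-1/2} Sxy Syy^{-1/2}
K <- isqrt(var(x)) %*% cov(x, y) %*% isqrt(var(y))
s <- svd(K)

rho <- s$d                                     # canonical correlations
xcoef_svd <- isqrt((n - 1) * var(x)) %*% s$u   # same scaling as cancor()

max(abs(rho - cxy$cor))                        # close to zero
max(abs(abs(xcoef_svd) - abs(cxy$xcoef)))      # close to zero, up to column sign
```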

Another thing you can do is to type

cancor

at the R prompt to view its source and see how it works using
the QR decomposition.

Irena Komprej

2004-09-13 11:30:22 UTC


Dear Gabor,
thank you for your answer, but I am still a little confused because the
values of xcoef and ycoef are so small. It is true that I get results
very similar to those of cancor if I use the proposed formula:

z <- svd(cov(x,y) %*% solve(var(y), cov(y,x)) %*% solve(var(x)))
sqrt(z$d) # canonical correlations
isqrt((nrow(x)-1)*var(x)) %*% z$u # xcoef

But why do I need the (nrow(x)-1)* factor in isqrt()?
In the literature, if you use the proposed calculation, the a's are
calculated as z$u %*% isqrt(var(x)).

I have this problem because I need structural coefficients to calculate
the redundancy measure and, according to the literature, they are
calculated as a's %*% var(x).
The coefficients from cancor are so small that my redundancy measure is
almost zero, despite the high correlation coefficient.

I have, on the other hand, calculated correlation between x and their
corresponding canonical variables as:
cor(x, x%*%xcoef)
and results were good.

Can I use these results as structural correlations in Redundancy measure
calculation?
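A sketch of that calculation on random illustrative data. It assumes one common textbook definition of the redundancy index (mean squared structure correlation times squared canonical correlation), which may differ from the definition in the poster's references. The names loadings_x, var_extracted and redundancy_x are made up for this example.

```r
set.seed(1)
x <- matrix(rnorm(150), 50, 3)
y <- matrix(rnorm(250), 50, 5)
cxy <- cancor(x, y)

# Structure correlations: each x variable against each x canonical variate
loadings_x <- cor(x, x %*% cxy$xcoef)

# Assumed definition: share of x variance carried by each variate,
# multiplied by the squared canonical correlation
var_extracted <- colMeans(loadings_x^2)
redundancy_x  <- var_extracted * cxy$cor^2

# Correlations do not depend on how xcoef is scaled
loadings_scaled <- cor(x, x %*% (cxy$xcoef * sqrt(nrow(x) - 1)))
max(abs(loadings_x - loadings_scaled))   # close to zero
```

Because the structure correlations are scale free, the small magnitude of xcoef does not affect a redundancy measure computed this way.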

Thank you again and best regards

Irena Komprej

Gabor Grothendieck

2004-09-13 12:00:11 UTC


Irena Komprej <irena.komprej <at> telemach.net> writes:

: thank you for your answer, but I am still a little confused because the
: values of the xcoef and ycoef are so small. It is true, that I receive
: very similar results to the cancor, if I use the proposed formula with
:
: z <- svd(cov(x,y) %*% solve(var(y), cov(y,x)) %*% solve(var(x)))
: sqrt(z$d) # canonical correlations
: isqrt((nrow(x)-1)*var(x)) %*% z$u # xcoef
:
: But why do I need the (nrow(x)-1)* factor in isqrt()?

I don't think the scaling really matters. If you want the canonical
variables to have an identity variance matrix, or some other scaling,
you can rescale. I chose the above scaling since the objective was to
give the same result as cancor.
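A short check of what that scaling amounts to, on random illustrative data: with the default centering, the canonical variates built from xcoef appear to have unit sums of squares, so their variance is 1/(n - 1). The names xc and u are introduced only for this sketch.

```r
set.seed(1)
x <- matrix(rnorm(150), 50, 3)
y <- matrix(rnorm(250), 50, 5)
n <- nrow(x)
cxy <- cancor(x, y)

xc <- sweep(x, 2, cxy$xcenter)   # centered x
u  <- xc %*% cxy$xcoef           # canonical variates for the x set

round(crossprod(u), 10)          # identity matrix: unit sums of squares
round(var(u) * (n - 1), 10)      # same matrix, so var(u) is diag(1 / (n - 1))
```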

If the values we calculate above are slightly different from cancor's,
I would go with cancor, since it uses the QR decomposition, which is
presumably more stable numerically than what we have above.

Don't know about the rest of your questions.


Irena Komprej

2004-09-15 08:14:37 UTC


Dear Gabor,
thank you for your answer related to the normalization of cancor
coefficients. If you want to interpret the coefficients in terms of the
variables' contributions to the canonical variables, loadings, redundancy
measure, etc., you have to normalize the results so that the canonical
variables have an identity variance matrix. Multiplying the cancor
coefficients xcoef and ycoef by as.numeric(sqrt(nrow(x)-1)) does the job.
(I sometimes miss such information in the R help.)
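A sketch of that rescaling on random illustrative data (the names a, u, cov_xu and struct are made up for this example). It also shows how the literature formula var(x) %*% a relates to the structure correlations once the variates have unit variance:

```r
set.seed(1)
x <- matrix(rnorm(150), 50, 3)
y <- matrix(rnorm(250), 50, 5)
n <- nrow(x)
cxy <- cancor(x, y)

a <- cxy$xcoef * sqrt(n - 1)     # rescaled coefficients
u <- x %*% a

round(var(u), 10)                # identity variance matrix

# With unit-variance variates, cov(x, u) equals var(x) %*% a ...
cov_xu <- var(x) %*% a
# ... and dividing each row by that variable's sd gives the correlations
struct <- cov_xu / sqrt(diag(var(x)))
max(abs(struct - cor(x, u)))     # close to zero
```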
Best regards,
Irena Komprej

Prof Brian Ripley

2004-09-15 08:39:31 UTC


Post by Irena Komprej
Dear Gabor,

This is R-help, not `Gabor', although you keep sending mail addressed to
`Gabor' to this list.

Post by Irena Komprej
thank you for your answer related to the normalization of cancor
coefficients. If you want to interpret the coefficients in terms of the
variables' contributions to the canonical variables, loadings, redundancy
measure, etc., you have to normalize the results so that the canonical
variables have an identity variance matrix. Multiplying the cancor
coefficients xcoef and ycoef by as.numeric(sqrt(nrow(x)-1)) does the job.
(I sometimes miss such information in the R help.)

I think you miss it in the references given on the R help pages. Please
do read them. (Any good book will tell you that the scaling of canonical
variates is arbitrary.)

Please also ask your local IT advisors to set your computer to a
sensible time: the world is 6.7 years ahead of you.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595