Godmar Back
2009-07-07 14:25:39 UTC
Hi,
I am trying to use R for some survey analysis, and need to compute the
significance of some correlations. I read the man pages for cor and
cor.test, but I am confused about
- whether these functions are intended to work the same way
- how these functions handle NA values
- whether cor.test supports 'use = complete.obs'.
Some example output may explain why I am confused:
-----------------------------------------------
> cor(q[[9]], q[[10]])
                  perceivedlearningcurve
overallimpression              0.7440637
-----------------------------------------------
> cor.test(q[[9]], q[[10]])
Error in `[.data.frame`(x, OK) : undefined columns selected
-----------------------------------------------
(I assume that's because of R's generous type coercions.... does R
have a "typeof" operator to learn what the type of q[[9]] is?)
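For reference, base R does provide typeof(), along with class(), is.data.frame() and str(). The following is a minimal sketch, assuming q[[9]] is a one-column data frame (which the error message and the matrix-shaped cor() output suggest); the object `q` built here is a hypothetical stand-in, not the survey data.

```r
# Hypothetical stand-in for the survey object: a list holding a one-column data frame.
q <- list(data.frame(overallimpression = c(4, 5, 3, 4)))

typeof(q[[1]])         # "list": the storage type of a data frame
class(q[[1]])          # "data.frame"
is.data.frame(q[[1]])  # TRUE
class(q[[1]][, 1])     # "numeric": [, 1] drops the single column to a plain vector
```

class() is usually the more informative of the two here, since typeof() reports only the underlying storage mode.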
-----------------------------------------------
> cor.test(q[[9]][,1], q[[10]][,1])
Pearson's product-moment correlation
data: q[[9]][, 1] and q[[10]][, 1]
t = 12.9877, df = 136, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.6588821 0.8104055
sample estimates:
cor
0.7440637
-----------------------------------------------
> cor(q[[9]], q[[51]])
                  usefulnessautodetectionbox_ord
overallimpression                             NA
-----------------------------------------------
WORKS, and uses complete observations only
> cor(q[[9]], q[[51]], use="complete.obs")
                  usefulnessautodetectionbox_ord
overallimpression                      0.2859895
-----------------------------------------------
WORKS, apparently, but does not require 'use="complete.obs"' (!?)
> cor.test(q[[9]][,1], q[[51]][,1])
Pearson's product-moment correlation
data: q[[9]][, 1] and q[[51]][, 1]
t = 3.1016, df = 108, p-value = 0.002456
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.1043351 0.4491779
sample estimates:
cor
0.2859895
-----------------------------------------------
The help page for cor.test states that the na.action argument defaults to
'getOption("na.action")':
> getOption("na.option")
NULL
-----------------------------------------------
No action is taken, yet cor.test appears to only use complete observations (!?)
Others believe that cor.test accepts 'use=complete.obs':
http://markmail.org/message/nuzqeouqhbb7f6ok
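The NA behaviour described above can be reproduced without the survey data. The following is a minimal sketch on synthetic vectors (the names x, y, r_complete and ct are made up for illustration). It relies on two things worth stating explicitly: the option is spelled "na.action" (in a default R session getOption("na.action") is "na.omit"), and that option is consulted only by the formula interface of cor.test. The default method, cor.test(x, y), keeps only the complete pairs before computing anything.

```r
set.seed(1)
x <- rnorm(20)
y <- x + rnorm(20)
y[c(3, 7)] <- NA                  # introduce two missing values

cor(x, y)                         # NA, because cor() defaults to use = "everything"
r_complete <- cor(x, y, use = "complete.obs")

ct <- cor.test(x, y)              # default method drops incomplete pairs itself

all.equal(unname(ct$estimate), r_complete)  # TRUE: same estimate as complete.obs
unname(ct$parameter)                        # 16 = 18 complete pairs - 2
```

The degrees of freedom reported by cor.test are a convenient check on how many pairs were actually used.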
--------------
Needless to say, this makes writing robust code very hard.
I'm wondering what the rationale for the inconsistencies between cor
and cor.test is.
Thanks!
- Godmar