Discussion:
[R] Proportion data GAM
Rodrigo Tardin
2014-10-15 15:48:19 UTC
Hi all,

I am not sure whether this is the right place for this question or whether
there is a more specific list.
Anyway, I hope somebody can help me.

I am trying to fit a GAM with a beta distribution using the mgcv package.
My dependent variable is a proportion ranging continuously from 0 to 1
(whale density), and I have three covariates: Depth, Distance to Coast and
Seabed Slope.
From what I have read, the beta distribution is more appropriate for my
response variable than the binomial.
According to the mgcv manual, a beta distribution can be specified in a
GAM with the "betar" function, but I get the following error:
could not find function "betar"

My code is:
library(mgcv)
a2=gam(Density~s(DEPTH,k=4)+s(DISTCOAST_1,k=4)+s(SLOPE,k=4),
family=betar(link="logit"),data=misti,gamma=1.4)

The beta family is specified exactly as it is shown in the manual:
bm <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=betar(link="logit"),data=dat)

Does anyone know what the problem might be?
Thanks in advance,
Rodrigo


Rodrigo Tardin

Short Term Scholar - Duke Marine Lab. - Duke University
Ph.D. candidate in Ecology and Evolution - IBRAG - UERJ
M.Sc. in Animal Biology - PPGBA - UFRRJ
Specialist in Higher Education Teaching - IAVM
Laboratory of Bioacoustics and Cetacean Ecology - UFRRJ/ IF/ DCA

Simon Wood
2014-10-15 19:22:50 UTC
Can you give the result of typing
sessionInfo()
in the session where this happens, please?
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603 http://people.bath.ac.uk/sw283
Simon Wood
2014-10-15 19:42:34 UTC
Rodrigo,

OK, it looks as if your mgcv help files/manual are somehow out of sync
with the package version you have loaded. 'betar' is only available from
mgcv 1.8. If you update to the current mgcv from CRAN then this problem
should be solved.
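
A quick way to confirm this diagnosis is to compare the installed mgcv version with the 1.8 threshold and to check whether the namespace exports `betar`. This is only a sketch; the variable names `mgcv_version` and `has_betar` are illustrative and do not come from the thread.

```r
# Version of mgcv installed in the library that this R session uses.
mgcv_version <- packageVersion("mgcv")

# TRUE if the installed mgcv namespace exports a function called 'betar'.
has_betar <- "betar" %in% getNamespaceExports("mgcv")

cat("mgcv version:", as.character(mgcv_version), "\n")
cat("betar exported:", has_betar, "\n")

if (mgcv_version < "1.8.0") {
  # Updating from CRAN is the fix suggested in the thread.
  message("mgcv is older than 1.8; consider install.packages(\"mgcv\")")
}
```

After updating, restart R before calling library(mgcv) again, so that the old version is not still loaded in the session.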

best,
Simon

ps. beta regression is only available with REML (or ML) smoothness
selection in mgcv, so the 'gamma' parameter will be ignored.

pps. Do you really want to limit all your smooths to a maximum of 3
degrees of freedom by setting k=4? I'd be inclined to let the
smoothing parameter selection do its thing with a higher k, and only get
really restrictive on k if the resulting models somehow don't make sense.
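
The two postscripts can be illustrated with a small sketch: a beta regression GAM fitted with REML smoothness selection, without the 'gamma' argument and with the default basis dimension instead of k=4. The data frame `sim` is simulated here as a stand-in (an assumption), because the 'misti' data are not part of the thread. Only the column names are borrowed from the original call.

```r
library(mgcv)  # needs mgcv >= 1.8 for betar()

set.seed(1)
n <- 400

# Simulated stand-in for the 'misti' data (hypothetical values).
sim <- data.frame(DEPTH       = runif(n, 5, 50),
                  DISTCOAST_1 = runif(n, 0, 10),
                  SLOPE       = runif(n, 0, 5))

# Beta-distributed response with mean mu and precision phi.
mu  <- plogis(-1 + sin(sim$DEPTH / 8) - 0.1 * sim$DISTCOAST_1)
phi <- 20
sim$Density <- rbeta(n, mu * phi, (1 - mu) * phi)

# Default k for each smooth; REML chooses the amount of smoothing.
fit <- gam(Density ~ s(DEPTH) + s(DISTCOAST_1) + s(SLOPE),
           family = betar(link = "logit"),
           data = sim, method = "REML")

# Effective degrees of freedom and approximate tests for each smooth.
print(summary(fit)$s.table)
```

With a higher k, the penalty rather than the basis size limits the wiggliness, so a term that needs little flexibility is still shrunk towards a straight line.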
Rodrigo Tardin
2014-10-15 20:20:36 UTC
Thanks Simon!

That worked!
As you suggested, I did not constrain k, but in my results the effective
degrees of freedom are no larger than about 1, the REML score is negative
and none of the covariates is significant (which does not make sense). Is
there something wrong?

Here are the results of summary(a2):

Family: Beta regression(76.885)
Link function: logit

Formula:
MDENS1 ~ s(DEPTH2) + s(DISTCOAST_1) + s(DIST_DIVE) + s(TOUR)

Parametric coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -6.7613     0.0252  -268.3   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                 edf Ref.df Chi.sq p-value
s(DEPTH2)      1.005  1.009  0.004   0.948
s(DISTCOAST_1) 1.004  1.008  0.077   0.785
s(DIST_DIVE)   1.004  1.008  0.065   0.801
s(TOUR)        1.004  1.008  0.010   0.922

R-sq.(adj) =  -0.00255   Deviance explained = 11.7%
-REML = -19521  Scale est. = 1         n = 1560

Thanks a lot one more time!
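
The thread does not answer this question, so what follows is only a sketch of two checks one might run on output like this. First, the intercept is on the logit scale, so back-transforming it gives the fitted mean proportion. Second, the beta density is defined on the open interval (0, 1), so it can help to count observations that sit exactly on 0 or 1. The vector `dens` is hypothetical; the MDENS1 values are not shown in the thread. Separately, a negative -REML value is not by itself an error: a continuous density can exceed 1, which makes the log-likelihood positive.

```r
# Intercept from the summary output, on the logit scale.
intercept_logit <- -6.7613

# Back-transform to the proportion scale (inverse logit).
mean_prop <- plogis(intercept_logit)
cat("Fitted mean proportion:", signif(mean_prop, 3), "\n")

# Hypothetical response values standing in for MDENS1.
dens <- c(0, 0, 0.0004, 0.0012, 0, 0.0031, 0.0009, 0)

# Share of observations lying exactly on the boundary of (0, 1).
boundary_share <- mean(dens == 0 | dens == 1)
cat("Share of exact 0/1 values:", boundary_share, "\n")
```

A fitted mean this close to zero suggests the response is concentrated near the lower boundary, which is worth inspecting (for example with hist()) before interpreting the smooth terms.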

Rodrigo Tardin


Rodrigo Tardin
2014-10-15 19:34:07 UTC
Hi Simon,

The result of sessionInfo() is:
R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] gamlss_4.3-1      gamlss.dist_4.3-1 MASS_7.3-31       gamlss.data_4.2-7 MuMIn_1.10.0
[6] ggplot2_1.0.0     mgcv_1.7-29       nlme_3.1-117      betareg_3.0-5

loaded via a namespace (and not attached):
 [1] colorspace_1.2-4  digest_0.6.4      flexmix_2.3-12    Formula_1.1-2
 [5] grid_3.1.0        gtable_0.1.2      lattice_0.20-29   lmtest_0.9-33
 [9] Matrix_1.1-3      modeltools_0.2-21 munsell_0.4.2     nnet_7.3-8
[13] plyr_1.8.1        proto_0.3-10      Rcpp_0.11.2       reshape2_1.4
[17] sandwich_2.3-1    scales_0.2.4      stats4_3.1.0      stringr_0.6.2
[21] survival_2.37-7   tools_3.1.0       zoo_1.7-11

Is this what you were asking for, or did you mean the sessionInfo() of the
model (that is, sessionInfo(a2))?

If it is sessionInfo(a2), the result is:
Error in if (pkg == "base") file.path(.Library, "base") else if (pkg %in%  :
  missing value where TRUE/FALSE needed
In addition: There were 50 or more warnings (use warnings() to see the
first 50)
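
For reference, sessionInfo() describes the R session, not a fitted model. Its optional argument is a character vector of package names, which is why passing the model object a2 fails. A minimal sketch of pulling just the relevant versions out of the result (the names `info` and `mgcv_attached` are illustrative):

```r
library(mgcv)

# Called without arguments: reports R, the platform and the loaded packages.
info <- sessionInfo()

# R version string for the current session.
cat(info$R.version$version.string, "\n")

# Version of the attached mgcv, taken from its DESCRIPTION entry.
mgcv_attached <- info$otherPkgs$mgcv$Version
cat("mgcv attached:", mgcv_attached, "\n")
```

In the output above the corresponding entry is mgcv_1.7-29, which is older than the 1.8 release that introduced 'betar'.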

Thanks

Rodrigo Tardin
