Hello,
I am new to the list, but I have searched for an answer to this question in the list archives and couldn't find anything. Question: Does anybody have experience with fitting nested variogram models in R? I am not having success. Details: I am trying fit a nested variogram to an empirical variogram. I have tried both gstat and geoR which appear able to do this, but I have come across these problems: geoR: I can only run nested variogram models using cov.spatial and this is not suitable to fit a model to an empirical variogram. In the help manual it says that I should inherit cov.spatial in the variofit function, however, it does not seem to work. I get: Error in match.arg(cov.model, choices = c("matern", "exponential", "gaussian", : there is more than one match in match.arg my code: F2.1Y.wls.exp = variofit(F2.1Y, ini.cov.pars=starting.values, fix.nugget=F, nugget=starting.values.vec, cov.model=c("exp","spherical")) gstat: I can run nested models using F2.BA1.fit = fit.variogram (F2.BA1.variogram, vgm(psill=33,"Lin",6, add.to = vgm(psill=33,"Lin",6, nugget=20) ) , print.SSE=T, fit.sills=T, fit.ranges=T ) but I never seem to get a good fit, i.e., it always fails to fit the empirical variogram. If anyone has a solution, I also need to calculate the Residual Sums of Squares to test for the relative fit. Thank you, Eliot -- Eliot McIntire NSERC Post Doctoral Fellow Department of Ecosystems and Conservation Science College of Forestry and Conservation University of Montana, Missoula, MT 59812 406-243-5239 fax: 406-243-4557 emcintire at forestry.umt.edu |
Dear Elliot
I can only comment on the geoR related question, since I am unfamiliar with gstat. Reading the help-file for variofit, then I think it is rather clear that this functions assumes one variogram model [i.e. the format of the estimated parameters is a vector]. So fitting nested model using that function seems not possible. I agree with you that the help-file for cov.spatial gives the impression that nested variables are allowed for that function. And it also is bit misleading that the help-file for variofit refers to cov.spatial, hinting that the full functionality of cov.spatial can be used in variofit. I hope the author of the package can provide more details on this [therefore copy to him]. I have no experience fitting nested variogram models myself, but my general opinion is that nested variograms aren't really useful, since what matters the most is to make a good fit of the empirical variogram near the origin. And if one really wants to make a very careful fit of a variogram-model to the data, then the likelihood function should be used rather than fitting to the empirical variogram. Ole Eliot McIntire wrote: > Hello, > > I am new to the list, but I have searched for an answer to this > question in the list archives and couldn't find anything. > > Question: > Does anybody have experience with fitting nested variogram models in > R? I am not having success. > > Details: > I am trying fit a nested variogram to an empirical variogram. I have > tried both gstat and geoR which appear able to do this, but I have > come across these problems: > > geoR: I can only run nested variogram models using cov.spatial and > this is not suitable to fit a model to an empirical variogram. In > the help manual it says that I should inherit cov.spatial in the > variofit function, however, it does not seem to work. I get: > Error in match.arg(cov.model, choices = c("matern", "exponential", > "gaussian", : > there is more than one match in match.arg > my code: F2.1Y.wls.exp = variofit(F2.1Y, > ini.cov.pars=starting.values, fix.nugget=F, > nugget=starting.values.vec, cov.model=c("exp","spherical")) > > gstat: I can run nested models using > F2.BA1.fit = fit.variogram (F2.BA1.variogram, vgm(psill=33,"Lin",6, > add.to = vgm(psill=33,"Lin",6, nugget=20) ) > , print.SSE=T, fit.sills=T, fit.ranges=T ) > > > but I never seem to get a good fit, i.e., it always fails to fit the > empirical variogram. > > If anyone has a solution, I also need to calculate the Residual Sums > of Squares to test for the relative fit. > > Thank you, > Eliot > -- Ole F. Christensen BiRC - Bioinformatics Research Center University of Aarhus |
Dear Elliot
I entirely agree with Ole's comments and would just add the following: - with geoR the only thing you can do is to overimpose to an epirical variogram a theoretical nested variogram using lines.variomodel() which internally calls cov.spatial() - cov.spatial() is able to compute values of nested variogram, so you could use it as ingredient for other computation bearing in mind whether this is really useful and whether these things are really estimable best P.J. On Thu, 14 Oct 2004, Ole F. Christensen wrote: > Dear Elliot > > I can only comment on the geoR related question, since I am unfamiliar > with gstat. > > Reading the help-file for variofit, then I think it is rather clear that > this functions assumes one variogram model [i.e. the format of the > estimated parameters is a vector]. > So fitting nested model using that function seems not possible. > I agree with you that the help-file for cov.spatial gives the impression > that nested variables are allowed for that function. > And it also is bit misleading that the help-file for variofit refers > to cov.spatial, hinting that the full functionality of cov.spatial can > be used in variofit. > I hope the author of the package can provide more details on this > [therefore copy to him]. > > I have no experience fitting nested variogram models myself, but my > general opinion is that nested variograms aren't really useful, since > what matters the most is > to make a good fit of the empirical variogram near the origin. And if > one really wants to make a very careful fit of a variogram-model to the > data, then the likelihood function should be used rather than fitting to > the empirical variogram. > > Ole > > > Eliot McIntire wrote: > > > Hello, > > > > I am new to the list, but I have searched for an answer to this > > question in the list archives and couldn't find anything. > > > > Question: > > Does anybody have experience with fitting nested variogram models in > > R? I am not having success. > > > > Details: > > I am trying fit a nested variogram to an empirical variogram. I have > > tried both gstat and geoR which appear able to do this, but I have > > come across these problems: > > > > geoR: I can only run nested variogram models using cov.spatial and > > this is not suitable to fit a model to an empirical variogram. In > > the help manual it says that I should inherit cov.spatial in the > > variofit function, however, it does not seem to work. I get: > > Error in match.arg(cov.model, choices = c("matern", "exponential", > > "gaussian", : > > there is more than one match in match.arg > > my code: F2.1Y.wls.exp = variofit(F2.1Y, > > ini.cov.pars=starting.values, fix.nugget=F, > > nugget=starting.values.vec, cov.model=c("exp","spherical")) > > > > gstat: I can run nested models using > > F2.BA1.fit = fit.variogram (F2.BA1.variogram, vgm(psill=33,"Lin",6, > > add.to = vgm(psill=33,"Lin",6, nugget=20) ) > > , print.SSE=T, fit.sills=T, fit.ranges=T ) > > > > > > but I never seem to get a good fit, i.e., it always fails to fit the > > empirical variogram. > > > > If anyone has a solution, I also need to calculate the Residual Sums > > of Squares to test for the relative fit. > > > > Thank you, > > Eliot > > > > -- > Ole F. Christensen > BiRC - Bioinformatics Research Center > University of Aarhus > > > > Paulo Justiniano Ribeiro Jr Departamento de Estat?stica Universidade Federal do Paran? Caixa Postal 19.081 CEP 81.531-990 Curitiba, PR - Brasil Tel: (+55) 41 361 3573 Fax: (+55) 41 361 3141 e-mail: paulojus at est.ufpr.br http://www.est.ufpr.br/~paulojus /"\ \ / Campanha da fita ASCII - contra mail html X ASCII ribbon campaign - against html mail / \ |
In reply to this post by Eliot McIntire-3
Eliot McIntire wrote: > > gstat: I can run nested models using > F2.BA1.fit = fit.variogram (F2.BA1.variogram, vgm(psill=33,"Lin",6, > add.to = vgm(psill=33,"Lin",6, nugget=20) ) > , print.SSE=T, fit.sills=T, fit.ranges=T ) > > > but I never seem to get a good fit, i.e., it always fails to fit the > empirical variogram. > > If anyone has a solution, I also need to calculate the Residual Sums > of Squares to test for the relative fit. The problem is your starting values for the ranges. Use the values that you fit `by eye' from the sample variogram; they should be sufficiently distinct (and present in the sample variogram) for gstat to fit them with success. Also, be aware that the linear model with range is not a valid variogram model for data in more than one dimension. Best regards, -- Edzer |
In reply to this post by Ole F. Christensen
Ole F. Christensen wrote: > > I have no experience fitting nested variogram models myself, but my > general opinion is that nested variograms aren't really useful, since > what matters the most is > to make a good fit of the empirical variogram near the origin. And if > one really wants to make a very careful fit of a variogram-model to > the data, then the likelihood function should be used rather than > fitting to the empirical variogram. This reasoning has been put forward in the 1999 book by Michael Stein which contains besides this one a few very provocative statements, such as "forget about sample variograms, only look at likelyhood profiles". Although I like the book, the problem I have with it is that it contains hardly any analysis of real data. The argument therefore is based on theory; mathematicians do that, and they may prove right. However, nested variograms have been very useful in the past, especially for describing spatial variability in larger data sets. There are theoretical arguments for using them, think e.g. of the nugget effect: it consists of measurement error (a "true" nugget effect) and spatially correlated microvariation: a nested variogram model with a range so small that it's usually not detected by the data; see Cressie (1993) for more on this. Given it's not in the data, ML or REML will never pick it up, it's only something you can (and should) impose when you know for instance the true measurement error from other sources than the observed data. I would like to see papers where both approaches (ML without nested vs. nested models, traditionally fit) were compared with large data sets; I find it hard to embrace theoretical ideas without having them seen work in practice. Geostatistics is about modelling what's out there. -- Edzer |
Dear Edzer
Thanks for your comment. I don't think we strongly disagree on these matters. I hope this response clarifies my current view point. * I certainly don't want to debunk the empirical variogram. I find it very useful as an exploratory tool. For example, the emperical variogram might reveal pseudo-periodicity in the data and it might reveal directional effects. For some projects there is also the questions whether there is actually any spatial structure in the data, which a variogram plot of residuals [or standardised residuals if you having a GLM model] would reveal. Also plotting the empirical variogram might reveal if something has gone wrong when fitting by m.l.e. My recommandation : "Always plot the empirical variogram [of standardised residuals]" * I agree that the micro-scale variation component may be an important component. Since the data does not contain any information about whether a non-spatial component is part of the signal of interest or just random noise then the user has to specify this himself. This is an issue no matter what inference-machinery you are using [m.l.e. or fitting to variograms]. I can't see we disagree about anything here [and if you see my paper in the september 2004 issue of Journal of computational and Graphical Statistics, then there is a discussion about micro-scale issues for likelihood inference in a spatial Poisson model]. * Nested variogram models. My objection to them is based on what I have sometimes seen : a very elaborate fitting to empirical variograms, where a lot of effort is going into fitting the variogram away from the origin, and where the number of variogram models used in the nested structure seems to decided by this fitting to the empirical variogram in mind. A nested model for the variogram really says that the phenomenon we are modelling is Y(x) = Y_1(x) + Y_2(x) + Y_3(x) + Y_4(x) etc. , where the different components have different spatial structure. Rather than letting the empirical variogram decide the number of components, then shouldn't we start thinking about at the data generating mechanisms instead ? When having more than one spatial component Y_i(x), shouldn't we attempt interpreting the different components ? How about the implicit additivity assumption of the components when using a nested model ? [The data generating mechanism may suggest otherwise ... ]. A blind use of nested variogram models seems silly to me. * Fitting a nested variogram model. In case you want to use such a model, then you may fit the parameters by maximum likelihood, which was one point I tried to make in my previous mail. I see now that I may have stressed that point a bit too hard. I expect that a procedure for finding the maximum of the likelihood, for some data sets might have convergence problems due to identifiability problems of parameters. So probably good starting values are needed, but from your previous e-mail I see that there seems to be a similar issue for fitting to variograms. As you wrote in your previous e-mail, good starting values can be found by fitting a nested model by eye. I also have to admit, that currently there seems to be no procedure available in packages in R for fitting nested variogram models using maximum likelihood [so we are lacking behind in that respect]. * Using the likelihood function : A certain type of books and papers about geostatistics may have emphasised the likelihood function too strongly. Being brought up as a statistician, then using the likelihood function for inference is the natural thing to me. But I have also been taught to be be careful about the model. A model should catch the important structure of the data [here you need input from subject matter people]. Considering and investigating the structure of a model in many aspect is where we should spend our time. I give my applaud to the final sentence in your e-mail ``Geostatistics is about modelling what's out there." * Last comment : Your suggested comparison (ML without nested vs. nested models, traditionally fit) is missing the point entirely, since such a comparison would be a comparison of two different models, rather than two procedures for inference. Best regards Ole Edzer J. Pebesma wrote: > > > Ole F. Christensen wrote: > >> >> I have no experience fitting nested variogram models myself, but my >> general opinion is that nested variograms aren't really useful, since >> what matters the most is >> to make a good fit of the empirical variogram near the origin. And if >> one really wants to make a very careful fit of a variogram-model to >> the data, then the likelihood function should be used rather than >> fitting to the empirical variogram. > > > This reasoning has been put forward in the 1999 book by Michael Stein > which > contains besides this one a few very provocative statements, such as > "forget about > sample variograms, only look at likelyhood profiles". Although I like > the book, > the problem I have with it is that it contains hardly any analysis of > real data. The > argument therefore is based on theory; mathematicians do that, and > they may prove > right. > > However, nested variograms have been very useful in the past, > especially for > describing spatial variability in larger data sets. There are > theoretical arguments > for using them, think e.g. of the nugget effect: it consists of > measurement error > (a "true" nugget effect) and spatially correlated microvariation: a > nested variogram > model with a range so small that it's usually not detected by the > data; see > Cressie (1993) for more on this. Given it's not in the data, ML or > REML will never pick > it up, it's only something you can (and should) impose when you know for > instance the true measurement error from other sources than the > observed data. > > I would like to see papers where both approaches (ML without nested vs. > nested models, traditionally fit) were compared with large data sets; > I find > it hard to embrace theoretical ideas without having them seen work in > practice. > > Geostatistics is about modelling what's out there. > -- > Edzer > > _______________________________________________ > R-sig-Geo mailing list > R-sig-Geo at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/r-sig-geo > > -- Ole F. Christensen BiRC - Bioinformatics Research Center University of Aarhus |
In reply to this post by Edzer J. Pebesma
Thank you for your comments Edzer, Ole and Paulo,
Edzer: > Also, be aware that the linear model with range is not a valid variogram > model for data in more than one dimension. Yes, for some reason, I copied the code for the Linear models. I believe I had been trying nesting two exponentials or spherical and exponential, which was resulting in the error. I tried to simplify the model by putting in the linear models and had the same problem. Ole and Edzer: > * Nested variogram models. My objection to them is based on what I have > sometimes seen : a very elaborate fitting to empirical variograms, where > a lot of effort is going into fitting the variogram away from the > origin, and where the number of variogram models used in the nested > structure seems to decided by this fitting to the empirical variogram in > mind. > A nested model for the variogram really says that the phenomenon we are > modelling is Y(x) = Y_1(x) + Y_2(x) + Y_3(x) + Y_4(x) etc. , where the > different components have different spatial structure. > Rather than letting the empirical variogram decide the number of > components, then shouldn't we start thinking about at the data > generating mechanisms instead ? > When having more than one spatial component Y_i(x), shouldn't we attempt > interpreting the different components ? > How about the implicit additivity assumption of the components when > using a nested model ? [The data generating mechanism may suggest > otherwise ... ]. > A blind use of nested variogram models seems silly to me. I am an ecologist, and it is actually the mechanisms I am interested in, which is arguably more than "modeling what's out there". The data that I am fitting using WLS fitted to empirical variograms or ML fit to the data are only just numbers. They do, however, represent the outcome of potentially several underlying ecological mechanisms. The context for this study is identifying competition among trees in plantations in Chile. The expected variogram shape reflecting underlying environmental variation would be a traditional increasing function, like an exponential: closer points are more alike. The expected variogram shape reflecting competition among trees (which is expected to be assymetrical, i.e., the larger tree negatively affects the smaller tree, the not the inverse) is more like a wave function: neighbouring trees are expected to have MORE variation than the average variation across the forest stand. To test this mechanistic hypothesis, I have tree ring growth data from 7 different forests, so I have growth increment each year for every tree. To test for competition, I would like to fit a wave function; to test for environmental variation, I would like to fit an exponential function; to test for the two, I would like to test a nested function. I will be comparing the relative fits of these three models using AIC (since the nested model now has more parameters, an information criterion like AIC becomes appropriate). I the end, I have decided to model my own variograms and fit them using weighted non-linear least squares, unrelated to the "built in" functions in geoR or gstat because I couldn't get them to work. I am having success fitting the models in this way (i.e., coding my own functions). Ole: I am glad you do not bunk fitting empirical variograms, as you point out, there seems to be useful places for either approach. And it seems to me that there are fewer distributional assumptions in fitting an empirical variogram using WLS, and that it does "a pretty good job" most of the time. I still have not had a conclusive statement that I can model nested variograms in any of these R functions. I will write the authors. Thank you for your comments, Eliot On Sat, 16 Oct 2004 10:49:17 +0200, Edzer J. Pebesma <e.pebesma at geog.uu.nl> wrote: > > > Eliot McIntire wrote: > >> >> gstat: I can run nested models using >> F2.BA1.fit = fit.variogram (F2.BA1.variogram, vgm(psill=33,"Lin",6, >> add.to = vgm(psill=33,"Lin",6, nugget=20) ) >> , print.SSE=T, fit.sills=T, fit.ranges=T ) >> >> >> but I never seem to get a good fit, i.e., it always fails to fit the >> empirical variogram. >> >> If anyone has a solution, I also need to calculate the Residual Sums >> of Squares to test for the relative fit. > > The problem is your starting values for the ranges. Use the values that > you > fit `by eye' from the sample variogram; they should be sufficiently > distinct > (and present in the sample variogram) for gstat to fit them with success. > > > Best regards, > -- > Edzer -- Eliot McIntire NSERC Post Doctoral Fellow Department of Ecosystems and Conservation Science College of Forestry and Conservation University of Montana, Missoula, MT 59812 406-243-5239 fax: 406-243-4557 emcintire at forestry.umt.edu |
Eliot
Despite of the very interesting debate and exchange of ideias on validity/merit etc here it is example on how to overimpose variograms variogram models using nested models in geoR. It's is a try and error exercise with lines.variomodel: > data(s100) > v <- variog(s100, max.d=1) > plot(v) # a single structure variogram > lines.variomodel(seq(0,1,l=100), cov.model="exp", cov.pars=c(1, .25), nug=0) # a nested variogram model > lines.variomodel(seq(0,1,l=100), cov.model="exp", cov.pars=rbind(c(.5, .1), c(.5,.3)), nug=0, col=2) Besides there is an tcl/tk based function with a menu and sliding bars. The function is eyefit() and is based on lines.variomodel. However it does not cope with nested models, but can be modifyed to do so Best P.J. On Mon, 18 Oct 2004, Eliot McIntire wrote: > Thank you for your comments Edzer, Ole and Paulo, > > Edzer: > > Also, be aware that the linear model with range is not a valid variogram > > model for data in more than one dimension. > > Yes, for some reason, I copied the code for the Linear models. I believe > I had been trying nesting two exponentials or spherical and exponential, > which was resulting in the error. I tried to simplify the model by > putting in the linear models and had the same problem. > > Ole and Edzer: > > > * Nested variogram models. My objection to them is based on what I have > > sometimes seen : a very elaborate fitting to empirical variograms, where > > a lot of effort is going into fitting the variogram away from the > > origin, and where the number of variogram models used in the nested > > structure seems to decided by this fitting to the empirical variogram in > > mind. > > A nested model for the variogram really says that the phenomenon we are > > modelling is Y(x) = Y_1(x) + Y_2(x) + Y_3(x) + Y_4(x) etc. , where the > > different components have different spatial structure. > > Rather than letting the empirical variogram decide the number of > > components, then shouldn't we start thinking about at the data > > generating mechanisms instead ? > > When having more than one spatial component Y_i(x), shouldn't we attempt > > interpreting the different components ? > > How about the implicit additivity assumption of the components when > > using a nested model ? [The data generating mechanism may suggest > > otherwise ... ]. > > A blind use of nested variogram models seems silly to me. > > I am an ecologist, and it is actually the mechanisms I am interested in, > which is arguably more than "modeling what's out there". The data that I > am fitting using WLS fitted to empirical variograms or ML fit to the data > are only just numbers. They do, however, represent the outcome of > potentially several underlying ecological mechanisms. The context for > this study is identifying competition among trees in plantations in > Chile. The expected variogram shape reflecting underlying environmental > variation would be a traditional increasing function, like an exponential: > closer points are more alike. The expected variogram shape reflecting > competition among trees (which is expected to be assymetrical, i.e., the > larger tree negatively affects the smaller tree, the not the inverse) is > more like a wave function: neighbouring trees are expected to have MORE > variation than the average variation across the forest stand. To test > this mechanistic hypothesis, I have tree ring growth data from 7 different > forests, so I have growth increment each year for every tree. To test for > competition, I would like to fit a wave function; to test for > environmental variation, I would like to fit an exponential function; to > test for the two, I would like to test a nested function. I will be > comparing the relative fits of these three models using AIC (since the > nested model now has more parameters, an information criterion like AIC > becomes appropriate). > > I the end, I have decided to model my own variograms and fit them using > weighted non-linear least squares, unrelated to the "built in" functions > in geoR or gstat because I couldn't get them to work. I am having success > fitting the models in this way (i.e., coding my own functions). > > Ole: > I am glad you do not bunk fitting empirical variograms, as you point out, > there seems to be useful places for either approach. And it seems to me > that there are fewer distributional assumptions in fitting an empirical > variogram using WLS, and that it does "a pretty good job" most of the time. > > I still have not had a conclusive statement that I can model nested > variograms in any of these R functions. I will write the authors. > > Thank you for your comments, > Eliot > > > On Sat, 16 Oct 2004 10:49:17 +0200, Edzer J. Pebesma > <e.pebesma at geog.uu.nl> wrote: > > > > > > > Eliot McIntire wrote: > > > >> > >> gstat: I can run nested models using > >> F2.BA1.fit = fit.variogram (F2.BA1.variogram, vgm(psill=33,"Lin",6, > >> add.to = vgm(psill=33,"Lin",6, nugget=20) ) > >> , print.SSE=T, fit.sills=T, fit.ranges=T ) > >> > >> > >> but I never seem to get a good fit, i.e., it always fails to fit the > >> empirical variogram. > >> > >> If anyone has a solution, I also need to calculate the Residual Sums > >> of Squares to test for the relative fit. > > > > The problem is your starting values for the ranges. Use the values that > > you > > fit `by eye' from the sample variogram; they should be sufficiently > > distinct > > (and present in the sample variogram) for gstat to fit them with success. > > > > > > Best regards, > > -- > > Edzer > > > > -- > Eliot McIntire > NSERC Post Doctoral Fellow > Department of Ecosystems and Conservation Science > College of Forestry and Conservation > University of Montana, Missoula, MT 59812 > 406-243-5239 > fax: 406-243-4557 > emcintire at forestry.umt.edu > > _______________________________________________ > R-sig-Geo mailing list > R-sig-Geo at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/r-sig-geo > > Paulo Justiniano Ribeiro Jr Departamento de Estat?stica Universidade Federal do Paran? Caixa Postal 19.081 CEP 81.531-990 Curitiba, PR - Brasil Tel: (+55) 41 361 3573 Fax: (+55) 41 361 3141 e-mail: paulojus at est.ufpr.br http://www.est.ufpr.br/~paulojus /"\ \ / Campanha da fita ASCII - contra mail html X ASCII ribbon campaign - against html mail / \ |
In reply to this post by Eliot McIntire-3
Dear Elliot
I am glad that there is an interpretation behind the the nested modellig you are doing. Given that the likelihood approach to the nested variogram models is not implemented I would also do what do, fit by eye a variogram-model to the empirical variogram. [I just noticed that Paulo replied about how to do such things using his software] Just a thought : Could there be some information in the locations of the trees, i.e. in the neighbourhood of a large three there would probably be no other trees at all ? Marked point processes may (at least in principle) provide a framework for a type of modelling (but don't ask me, I don't know much about such models). > Ole: > I am glad you do not bunk fitting empirical variograms, as you point > out, there seems to be useful places for either approach. And it > seems to me that there are fewer distributional assumptions in > fitting an empirical variogram using WLS, and that it does "a pretty > good job" most of the time. Well, I actually wrote that I do not bunk the "empirical variogram". I don't really like the "fitting to empirical variograms". Note the distinction here : I very am happy with the empirical variograms in themselfves but less happy with the fitting to them. When you use the likelihood function, you are fitting a model to the data, instead of fitting to a summary of the data [which emprirical variograms are] I disagree with "there are fewer distributional assumptions in fitting an empirical variogram". When the distributional assumptions are not true, say Gaussian, you may still use that specific likelihood function estimate your parameters. Neither the likelihood approach nor the fitting to empirical variograms have a theoretical justification when the model is wrong. The difference is that likelihood approach has a justification when the model is true. In practice, where one should be worried, is when the distribution of the data is skewed. Here you should really transform the data [and it does not matter which approach you are using to estimate parameters]. The fitting to empirical variograms may do "a pretty good job" in practice, but actually you need to fit by likelihood before you can confirm this :-) I have probably already written too much about likelihood ...... We seem to agree on modelling which is the important part. Ole |
In reply to this post by Paulo Justiniano Ribeiro Jr
Dear Eliot
I've been checking the geoR code and have now done a change for lines.variomodel() be able to plot nested variograms models You will need to update geoR from the version on its we-site just loaded (1 min ago) Below there is an example code on how to do that and also on how to compute a "goodness-of- fit" measure (bearing in mind all the previous debate on this list) best P.J. require(geoR) data(s100) v <- variog(s100, max.dist=1) plot(v) ## overimposing a variogram with a single structure lines.variomodel(seq(0,1,l=101), cov.model="exp", cov.pars=c(1, .25), nug=0) ## and accessing a "quality of fit" by sum of squares teo1 <- cov.spatial(v$u, cov.model="exp", cov.pars=c(1, .25)) ss1 <- sum((v$v - teo1)^2) ss1 ## overimposing a variogram with two exponential structures lines.variomodel(seq(0,1,l=101), cov.model="exp", cov.pars=rbind(c(0.6, .15), c(0.4, .65)), nug=0, col=2) teo2 <- cov.spatial(v$u, cov.model="exp", cov.pars=rbind(c(0.6, .15), c(0.4, .65))) ss2 <- sum((v$v - teo2)^2) ss2 ## overimposing a variogram with exponential and spherical structures lines.variomodel(seq(0,1,l=101), cov.model=c("exp", "sph"), cov.pars=rbind(c(0.6, .15), c(0.4, .65)), nug=0, col=3) teo3 <- cov.spatial(v$u, cov.model=c("exp", "sph"), cov.pars=rbind(c(0.6, .15), c(0.4, .65))) ss3 <- sum((v$v - teo3)^2) ss3 On Mon, 18 Oct 2004, Eliot McIntire wrote: > Paulo, > > Sorry for one more follow-up... > > Your example here has a nested exponential within an exponential. Is it > possible to do two different models? > > Also, I can't see how this will help me arrive at a "quantitative" fit > measurement, like residual sums of squares, that I can use to compare the > fit of different models. Am I corrent? > > Thank you, > Eliot > > On Mon, 18 Oct 2004 14:38:17 -0200 (BRST), Paulo Justiniano Ribeiro Jr > <paulojus at est.ufpr.br> wrote: > > > Eliot > > > > Despite of the very interesting debate and exchange of ideias on > > validity/merit etc > > here it is example on how to overimpose variograms variogram models > > using nested models in geoR. > > It's is a try and error exercise with lines.variomodel: > >> data(s100) > >> v <- variog(s100, max.d=1) > >> plot(v) > > # a single structure variogram > >> lines.variomodel(seq(0,1,l=100), cov.model="exp", cov.pars=c(1, .25), > > nug=0) > > # a nested variogram model > >> lines.variomodel(seq(0,1,l=100), cov.model="exp", cov.pars=rbind(c(.5, > > .1), c(.5,.3)), nug=0, col=2) > > > > Besides there is an tcl/tk based function with a menu and sliding bars. > > The function is eyefit() and is based on lines.variomodel. > > However it does not cope with nested models, but can be modifyed to do so > > > > > > Best > > P.J. > > > > > > > > > > On Mon, 18 Oct 2004, Eliot McIntire wrote: > > > >> Thank you for your comments Edzer, Ole and Paulo, > >> > >> Edzer: > >> > Also, be aware that the linear model with range is not a valid > >> variogram > >> > model for data in more than one dimension. > >> > >> Yes, for some reason, I copied the code for the Linear models. I > >> believe > >> I had been trying nesting two exponentials or spherical and exponential, > >> which was resulting in the error. I tried to simplify the model by > >> putting in the linear models and had the same problem. > >> > >> Ole and Edzer: > >> > >> > * Nested variogram models. My objection to them is based on what I > >> have > >> > sometimes seen : a very elaborate fitting to empirical variograms, > >> where > >> > a lot of effort is going into fitting the variogram away from the > >> > origin, and where the number of variogram models used in the nested > >> > structure seems to decided by this fitting to the empirical variogram > >> in > >> > mind. > >> > A nested model for the variogram really says that the phenomenon we > >> are > >> > modelling is Y(x) = Y_1(x) + Y_2(x) + Y_3(x) + Y_4(x) etc. , where the > >> > different components have different spatial structure. > >> > Rather than letting the empirical variogram decide the number of > >> > components, then shouldn't we start thinking about at the data > >> > generating mechanisms instead ? > >> > When having more than one spatial component Y_i(x), shouldn't we > >> attempt > >> > interpreting the different components ? > >> > How about the implicit additivity assumption of the components when > >> > using a nested model ? [The data generating mechanism may suggest > >> > otherwise ... ]. > >> > A blind use of nested variogram models seems silly to me. > >> > >> I am an ecologist, and it is actually the mechanisms I am interested in, > >> which is arguably more than "modeling what's out there". The data that > >> I > >> am fitting using WLS fitted to empirical variograms or ML fit to the > >> data > >> are only just numbers. They do, however, represent the outcome of > >> potentially several underlying ecological mechanisms. The context for > >> this study is identifying competition among trees in plantations in > >> Chile. The expected variogram shape reflecting underlying environmental > >> variation would be a traditional increasing function, like an > >> exponential: > >> closer points are more alike. The expected variogram shape reflecting > >> competition among trees (which is expected to be assymetrical, i.e., the > >> larger tree negatively affects the smaller tree, the not the inverse) is > >> more like a wave function: neighbouring trees are expected to have MORE > >> variation than the average variation across the forest stand. To test > >> this mechanistic hypothesis, I have tree ring growth data from 7 > >> different > >> forests, so I have growth increment each year for every tree. To test > >> for > >> competition, I would like to fit a wave function; to test for > >> environmental variation, I would like to fit an exponential function; to > >> test for the two, I would like to test a nested function. I will be > >> comparing the relative fits of these three models using AIC (since the > >> nested model now has more parameters, an information criterion like AIC > >> becomes appropriate). > >> > >> I the end, I have decided to model my own variograms and fit them using > >> weighted non-linear least squares, unrelated to the "built in" functions > >> in geoR or gstat because I couldn't get them to work. I am having > >> success > >> fitting the models in this way (i.e., coding my own functions). > >> > >> Ole: > >> I am glad you do not bunk fitting empirical variograms, as you point > >> out, > >> there seems to be useful places for either approach. And it seems to me > >> that there are fewer distributional assumptions in fitting an empirical > >> variogram using WLS, and that it does "a pretty good job" most of the > >> time. > >> > >> I still have not had a conclusive statement that I can model nested > >> variograms in any of these R functions. I will write the authors. > >> > >> Thank you for your comments, > >> Eliot > >> > >> > >> On Sat, 16 Oct 2004 10:49:17 +0200, Edzer J. Pebesma > >> <e.pebesma at geog.uu.nl> wrote: > >> > >> > > >> > > >> > Eliot McIntire wrote: > >> > > >> >> > >> >> gstat: I can run nested models using > >> >> F2.BA1.fit = fit.variogram (F2.BA1.variogram, vgm(psill=33,"Lin",6, > >> >> add.to = vgm(psill=33,"Lin",6, nugget=20) ) > >> >> , print.SSE=T, fit.sills=T, fit.ranges=T ) > >> >> > >> >> > >> >> but I never seem to get a good fit, i.e., it always fails to fit the > >> >> empirical variogram. > >> >> > >> >> If anyone has a solution, I also need to calculate the Residual Sums > >> >> of Squares to test for the relative fit. > >> > > >> > The problem is your starting values for the ranges. Use the values > >> that > >> > you > >> > fit `by eye' from the sample variogram; they should be sufficiently > >> > distinct > >> > (and present in the sample variogram) for gstat to fit them with > >> success. > >> > > >> > > >> > Best regards, > >> > -- > >> > Edzer > >> > >> > >> > >> -- > >> Eliot McIntire > >> NSERC Post Doctoral Fellow > >> Department of Ecosystems and Conservation Science > >> College of Forestry and Conservation > >> University of Montana, Missoula, MT 59812 > >> 406-243-5239 > >> fax: 406-243-4557 > >> emcintire at forestry.umt.edu > >> > >> _______________________________________________ > >> R-sig-Geo mailing list > >> R-sig-Geo at stat.math.ethz.ch > >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo > >> > >> > > > > Paulo Justiniano Ribeiro Jr > > Departamento de Estat?stica > > Universidade Federal do Paran? > > Caixa Postal 19.081 > > CEP 81.531-990 > > Curitiba, PR - Brasil > > Tel: (+55) 41 361 3573 > > Fax: (+55) 41 361 3141 > > e-mail: paulojus at est.ufpr.br > > http://www.est.ufpr.br/~paulojus > > > > /"\ > > \ / Campanha da fita ASCII - contra mail html > > X ASCII ribbon campaign - against html mail > > / \ > > > > > > -- > Eliot McIntire > NSERC Post Doctoral Fellow > Department of Ecosystems and Conservation Science > College of Forestry and Conservation > University of Montana, Missoula, MT 59812 > 406-243-5239 > fax: 406-243-4557 > emcintire at forestry.umt.edu > > Paulo Justiniano Ribeiro Jr Departamento de Estat?stica Universidade Federal do Paran? Caixa Postal 19.081 CEP 81.531-990 Curitiba, PR - Brasil Tel: (+55) 41 361 3573 Fax: (+55) 41 361 3141 e-mail: paulojus at est.ufpr.br http://www.est.ufpr.br/~paulojus /"\ \ / Campanha da fita ASCII - contra mail html X ASCII ribbon campaign - against html mail / \ |
Free forum by Nabble | Edit this page |