\documentclass[nojss]{jss} \usepackage{amssymb} \usepackage{latexsym} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \newcommand{\jm}{J_{\!\mbox{\tiny{$M$}}}} %% just as usual \author{Robin K. S. Hankin\\Auckland University of Technology} \Plainauthor{Robin K. S. Hankin} \title{Introducing \pkg{untb}, an \proglang{R} Package For Simulating Ecological Drift Under the Unified Neutral Theory of Biodiversity} \Plaintitle{Introducing untb, an R Package For Simulating Ecological Drift Under the Unified Neutral Theory of Biodiversity} \Shorttitle{Ecological Drift With \proglang{R}} %\VignetteIndexEntry{A vignette for the untb package} \Abstract{ The distribution of abundance amongst species with similar ways of life is a classical problem in ecology. The {\em unified neutral theory of biodiversity}, due to Hubbell, states that observed population dynamics may be explained on the assumption of per capita equivalence amongst individuals. One can thus dispense with differences between species, and differences between abundant and rare species: all individuals behave alike in respect of their probabilities of reproducing and death. It is a striking fact that such a parsimonious theory results in a non-trivial dominance-diversity curve (that is, the simultaneous existence of both abundant and rare species) and even more striking that the theory predicts abundance curves that match observations across a wide range of ecologies. Here I introduce the \pkg{untb} package of \proglang{R} routines, for numerical simulation of ecological drift under the unified neutral theory. A range of visualization, analytical, and simulation tools are provided in the package and these are presented with examples in the paper. This vignette is based on~\cite{hankin2007a}. For reasons of performance, some of the more computationally expensive results are pre-loaded. To calculate them from scratch, change ``\code{calc\_from\_scratch <- TRUE}'' to ``\code{calc\_from\_scratch <- FALSE}'' in chunk \code{time\_saver}. } \Keywords{unified neutral theory, neutral theory, biodiversity, island biogeography} \Plainkeywords{unified neutral theory, neutral theory, biodiversity, island biogeography} %% publication information %% NOTE: Typically, this can be left commented and will be filled out by the technical editor %% \Volume{13} %% \Issue{9} %% \Month{September} %% \Year{2004} %% \Submitdate{2004-09-29} %% \Acceptdate{2004-09-29} %% The address of (at least) one author should be given %% in the following format: \Address{ Robin K. S. Hankin\\ Auckland University of Technology\\ Wakefield Street\\ Auckland, New Zealand\\ E-mail: \email{hankin.robin@gmail.com}\\ } %% It is also possible to add a telephone and fax number %% before the e-mail in the following format: %% Telephone: +43/1/31336-5053 %% Fax: +43/1/31336-734 %% for those who use Sweave please include the following line (with % symbols): %% need no \usepackage{Sweave.sty} %% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \SweaveOpts{echo=FALSE} \begin{document} \hfill\includegraphics[width=1in]{\Sexpr{system.file("help/figures/untb.png",package="untb")}} <>= <>= require("untb",quietly=TRUE) @ <>= calc_from_scratch <- FALSE @ <>= if(calc_from_scratch){ set.seed(0) rn <- rand.neutral(5e6, theta=50) } else { load("rn.Rdata") } @ \section{Island biogeography and biodiversity} In the field of ecological dynamics, one simple system of interest is that of similar species, at the same trophic level, that compete for the same or similar resources. One might choose forest ecology as the canonical example: trees are long-lived, stationary, and easily identified. Determining the statistical properties of such systems is a classical problem in ecology~\citep{bell2000}; the observed range of abundances between apparently similar species does not have a ready explanation. Many studies involve the tabulation of individuals, collected according to some formal sampling method, and it is a universal finding that the species abundances exhibit a large range: the majority of individuals are accounted for by a relatively small number of species, while the rarest species typically are sufficiently rare to have been sampled only once (`singletons'). \subsection{The unified neutral theory of biodiversity} One natural null hypothesis which arises in this field is that differences between species may be ascribed to random, species-independent, processes. The origin of such ideas may be traced back to at least the 1960s, in which \citet{macarthur1963} discussed the observation that island faunas tend to become progressively impoverished with distance from the nearest landmass. Their insight was to treat the number of species on an island as the {\em independent} variable. \cite{hubbell2001} considers that \citeauthor{macarthur1963}'s theory raised the possibility that ``chance {\em alone} could play a \ldots role in structuring ecological communities'' (my italics), thus paving the way for later, more explicitly stochastic, theories. In the context of theoretical studies, and in particular the \pkg{untb} package, it is standard practice to consider a metacommunity of fixed size~$\jm=\sum_{i=1}^S n_i$ individuals where the abundance of the $i^\mathrm{th}$ species, $1\leqslant i\leqslant S$, is~$n_i$. The evolution of the community is then modelled as a Markov chain with one state change corresponding to the death of a single individual, and the simultaneous birth of another individual of the same or a different species. Because the community size is fixed, it is common to refer to an individual being associated with a ``site'', thus notionally carving the ecosystem resource into discrete units. The transition matrix may thus be described, following~\citet{macarthur1963,macarthur1967}, in terms of the behaviour of the~$i^\mathrm{th}$ species: \begin{equation}\label{neutral.markoff.chain} n_i(t+1)= \left\{ \begin{array}{ll} n_i(t) -1 &\mbox{with probability~$p_{-1}$}\\ n_i(t) &\mbox{with probability~$p_{0}$}\\ n_i(t) +1 &\mbox{with probability~$p_{+1}$.}\\ \end{array}\right. \end{equation} The three probabilities~$\left(p_{-1},p_0,p_{+1}\right)$ may be functions of the species~$i$, and the abundances~$n_i$ of each species present. A theory is {\em species neutral} if it assumes that the probabilities are independent of~$i$. Thus, in a species neutral theory, all species are equivalent. Note that this model does not preclude abundant species being at a competitive disadvantage\footnote{One mechanism might be increased incidence of parasitism among an abundant species; or a rare species might enjoy the competitive advantage of dispensing with territoriality.} or advantage\footnote{For example, an abundant species might gain a competitive advantage from cooperative behaviour such as swarming; or, conversely, a rare species might suffer a competitive disadvantage due to the difficulty of finding a mate.}. A theory is {\em neutral} (sometimes {\em unified neutral}) if it assumes that all individuals in a community are ``strictly equivalent regarding their prospects of reproduction and death''~\citep{chave2005}; a neutral theory is thus species neutral. One consequence of this assumption is that both~$p_B(i)$ and~$p_D(i)$, the probabilities of a birth and death respectively being an individual of species~$i$, are both proportional to~$n_i$, the number of individuals of that species present in the ecosystem. Then we have~$p_D(i)=p_B(i)$ for all~$i$, and hence~$p_{-1}=p_{+1}$. Thus the number of individuals of species~$i$---and hence that of all species---is a Martingale. \cite{hubbell2001,hubbell1979} discusses, motivates, and assesses the unified neutral theory of biodiversity (henceforth UNTB) and shows that it makes a wide variety of quantitative predictions, all of which appear to be statistically consistent with predictions---or to be incorrect by only a small margin, as discussed by~\cite{volkov2003}, for example. The neutral theory also provides a useful null model against which to compare temporal data: \cite{leigh1993}, for example, use neutral dynamics in this way and found evidence that tree diversity on newly isolated tropical islands declines faster than expected by neutral models, suggesting that the discrepancy may be explained by non-neutral animal-plant interactions. More recent work adduces subtle analyses that cast doubt on the verisimilitude of the UNTB. For example, \citet{nee2005} shows that the UNTB implies that the expected age of common lineages (of trees, in his example) is ``impossibly old''; \citet{mcgill2006} present a hierarchy of increasingly demanding statistical tests of the UNTB and finds it lacking at the highest levels. Nevertheless, the UNTB furnishes a broadly realistic mechanism for reproduction and speciation~\citep{maurer2004}; it makes predictions that are closer to field observations than one might expect for so parsimonious a theory. \cite{mcgill2006}, after rejecting the UNTB in favour of a non-mechanistic alternative\footnote{The lognormal distribution~\citep{volkov2003}}, state that the UNTB remains ``an extraordinary and perhaps uniquely elegant ecological theory. Such elegance is indicative that the neutral theory will have some important role in ecology''. Similarly, \cite{nee2005} states that the UNTB can still provide ``useful null models for the interpretation of data''. It is in this spirit that I present \pkg{untb}, an \proglang{R}~\citep{rcore2007} package that uses the UNTB to simulate and analyze such datasets. The theory is then applied to a number of oceanographic datasets. In the context of computer simulations, it is possible to simulate neutral ecological drift by the following algorithm. Consider an system of~$\jm$ individuals, and a timestep sufficiently short for only a single individual to die. Then: \label{intro} \begin{enumerate} \item Choose a site at random; the individual at this site dies. \item With probability~$1-\nu$, an individual of species chosen at random from the remaining population is born \item With probability~$\nu$, an individual of a new species is born. \end{enumerate} Subsequent timesteps may be enacted using the new community as a pool in step~2. The meaning of ``new'' species is subject to differing interpretations: the new species might be the result of a point mutation, or in the case of island biogeography, the successful immigration of an individual from a metacommunity (typically the mainland). This system is not the only way to simulate neutral ecological drift: it ignores, for example, changes in competitive advantage accruing to older individuals. Also, alternative models of speciation may be adopted, such as the random fission model~\citep{ricklefs2003,hubbell2003}. \subsection[Computational population ecology and the untb package]{Computational population ecology and the \pkg{untb} package} The \proglang{R} package \pkg{untb} associated with this paper may be used to analyze ecosystem data in the context of Hubbell's neutral theory. The package contains routines that summarize census data, estimate biodiversity parameters, and generate synthetic datasets under conditions of exact neutrality. In an \proglang{R} session, typing {\tt library(help = "untb")} at the command line shows a list of topics that are documented in the package. It is envisaged that the untb package will be useful to ecologists in both research and teaching. Many of the routines are intended to be useful to teachers covering the neutral theory: a number of visualization tools are included that display census datasets in a consistent and standardized format; function \code{display.untb()} displays a ``movie'' of a system undergoing neutral drift in a visually striking, and hopefully memorable, manner. Researchers will find useful functionality in the package, which can perform a range of relevant analyses on real data. It is hoped that the package will avoid other workers having to `reinvent the wheel' by virtue of having all of these features together in a unified, clearly described framework implemented in an open-source software environment. \section[Package untb in use]{Package \pkg{untb} in use} \subsection{Analysis of species abundance data} Consider the \code{saunders} dataset, which lists species abundances of various types of marine animals living in kelp holdfasts~\citep{saunders2007,anderson2005}. The dataset may be examined using the functionality of \pkg{untb}: <>= data("saunders") summary(saunders.tot) @ Thus the \code{saunders.tot} dataset comprises~\Sexpr{no.of.ind(saunders.tot)} individuals of~\Sexpr{no.of.spp(saunders.tot)} species, of which~\Sexpr{no.of.singletons(saunders.tot)} are ``singletons''---that is, species sufficiently rare that they have only one representative in the entire sample. Further details of the data are given by function \code{preston()}, which carries out the analysis of~\cite{preston1948}: <>= preston(saunders.tot,n=9) @ <>= jj.preston <- preston(saunders.tot) jj.summary <- summary(saunders.tot) @ Thus there are~\Sexpr{jj.preston[1]} species with abundance 1 (that is, singletons); \Sexpr{jj.preston[2]} species with abundance~2; \Sexpr{jj.preston[3]} species with abundance 3-4; \Sexpr{jj.preston[4]} species with abundance~5-8, and so on. A more graphical representation of the same dataset is given in figure~\ref{ranked.abundance.curve}, which shows the ranked abundance of each species in red on the logarithmic vertical axis. The highest red dot thus corresponds to \Sexpr{jj.summary[[5]]} with \Sexpr{jj.summary[[4]]} individuals. On this figure, grey lines show synthetic datasets generated using the maximum likelihood estimate for~$\theta$ (section~\ref{parameter.estimation}). <>= n <- 10 J <- no.of.ind(saunders.tot) unc <- list() theta_hat <- optimal.theta(saunders.tot) for(i in 1:n){ unc[[i]] <- rand.neutral(J=J, theta=theta_hat) } @ \begin{figure}[htbp] \begin{center} <>= plot(saunders.tot,uncertainty=FALSE) for(i in 1:n){ points(seq_along(unc[[i]]),unc[[i]],type="l",col="grey") } @ \caption{The\label{ranked.abundance.curve} ranked abundance curve of the Saunders dataset showing real data (red) and~10 simulated ranked abundance curves, generated randomly using \code{rand.neutral()} using the maximum likelihood estimate for $\theta$ (grey)} \end{center} \end{figure} \subsection{Estimation of parameters} \label{parameter.estimation} One distinguishing feature of Hubbell's unified neutral theory is that it provides a mechanism for speciation: the algorithm shown in section~\ref{intro} specifies that a birth of an individual will be a new species with probability~$\nu$. Although~$\nu$ is small and $\jm$ large, \citeauthor{hubbell2001} (\citeyear[page 116]{hubbell2001}) observes that their product is a `moderately sized number'. The Fundamental Biodiversity parameter~$\theta$, defined as $\theta=2\jm\nu$, arises naturally when investigating neutral ecological drift and in the present context may be regarded as a parameter susceptible to statistical inference. If~$\phi_a$ is the number of species with abundance~$a$, the ``Ewens sampling formula'' \citep{ewens1972} gives the probability of observing a particular species abundance distribution (SAD) as \begin{equation}\label{hubbellpage122} \mathrm{Pr}\left\{S, n_1,n_2,\ldots,n_S|\theta\right\}= \frac{\jm!\theta^S}{1^{\phi_1}2^{\phi_2}\cdots\jm^{\phi_{\jm}}\, \phi_1!\phi_2!\cdots\phi_{\jm}!\, \prod_{k=1}^{\jm}\left(\theta+k-1\right)} \end{equation} [\code{theta.prob()} in the package] from which it is clear that the number of species~$S$ occurring in a fixed sample size of~$\jm$ individuals is a sufficient statistic for~$\theta$, and optimizing the likelihood~$\mathcal{L}$ given as \begin{equation} \mathcal{L}=\frac{\theta^S}{\prod_{k=1}^{\jm}\theta+k-1} \end{equation} [\code{theta.likelihood()} in the package] over positive values thus furnishes a maximum likelihood estimate for~$\theta$. Function \code{optimal.theta()} carries out this procedure numerically: <>= optimal.theta(saunders.tot) @ The accuracy of this estimate may be assessed by examining the likelihood curve~\citep{alonso2004}; figure~\ref{theta.likelihood} shows the support ($\log\mathcal{L}$) for a range of $\theta$. Using an exchange rate of two units of support per degree of freedom \citep{edwards1992} suggests that the true value of~$\theta$ lies in the range 26-37. \begin{figure}[htbp] \begin{center} <>= S <- no.of.spp(saunders.tot) J <- no.of.ind(saunders.tot) theta <- seq(from=25,to=39,len=55) jj <- theta.likelihood(theta=theta,S=S,J=J,give.log=TRUE) support <- jj-max(jj) plot(theta,support,xlab=paste("Biodiversity parameter",expression(theta)),ylab="support") abline(h= -2) @ \caption{The\label{theta.likelihood} support curve for the Saunders dataset as a function of the Fundamental Biodiversity Parameter $\theta$} \end{center} \end{figure} \subsubsection{Estimating the probability of immigration: The Etienne sampling formula} Neutral theories may allow for the possibility of restricted immigration to, for example, islands. Consider a partially isolated ecosystem of~$J$ individuals, and a neutral metacommunity of~$\jm$ individuals and biodiversity parameter~$\theta$. In the algorithm presented in section~\ref{intro}, reinterpret the birth of a new species (an event with probability~$\nu$) as the arrival of an organism from the metacommunity. This probability is conventionally labelled~$m$; the probability of mutation on the island is negligible. It is often of great interest to estimate~$m$; \citet{etienne2005} shows that the probability of a given set of abundances~$D$ is just equation~\ref{hubbellpage122}, modified by a factor that is a complicated function of~$m$, the species abundances, and~$\theta$. \citeauthor{etienne2005}'s sampling probability is given by function \code{etienne()} in the package, and is maximized by function \code{optimal.params()}. From a numerical perspective, the Etienne sampling formula is challenging: the intermediate steps involved in the calculation of Etienne's $K(D,A)$ [\code{logkda()} in the package] involve very large numbers. Standard IEEE floating-point arithmetic, being unable to represent numbers larger than about~$1.7\times 10^{308}$, restricts one to~$J\lesssim 100$. The \pkg{untb} package circumvents this restriction in two ways: one can use either the logarithmic representation employed by the \pkg{Brobdingnag} package~\citep{hankin2007}; or the \proglang{PARI}/\proglang{GP} system~\citep{batut2006}. These options are controlled by setting a flag in function \code{logkda()}; full documentation is given in the online help page. \subsubsection{The Barro Colorado dataset} The BCI dataset~\citep{condit2005}---a standard resource for ecologists~\citep{hubbell2001,etienne2005}--- contains location and species identity for all trees of~$\geqslant 10\,{\rm cm}$ diameter at breast height on Barro Colorado Island for a number of years. This dataset was used in~\citet{hankin2007a} but is not included in the \pkg{untb} package (and therefore not used in this vignette), because its licence appears to be inconsistent with the GPL. Further details may be found in the package's online documentation file {\tt bci.Rd}. \subsubsection{Estimation of parameters from field data} Some of the difficulties of estimating~$\theta$ and~$m$ from field data are shown in Figure~\ref{mle.theta}, which shows three ensembles of maximum likelihood estimates for local communities of size $J=100,1000,10000$, partially isolated ($m=0.01$) from a metacommunity of size~$\jm=5\times 10^{6}$ with~$\theta=50$. The local communities were allowed to evolve for sufficiently many timesteps for the results to be unaffected by further simulation~\citep{chisholm2004,hubbell2004}. Note the ensemble behaviour of the maximum likelihood estimator: the estimates are severely biased, especially for the smaller sample sizes. In general, $\theta$ is underestimated, while~$m$ is overestimated. Even with the largest local community, $\E(\hat{\theta})=40.9$ and~$\E(\hat{m})=0.047$, showing under- and over-estimation respectively. <>= f <- function(local_size,gens){ jj <- isolate(rn,size=local_size) a <- untb(start=jj, prob=0.01, D=local_size, gens=gens, meta=rn) optimal.params(a) } if(calc_from_scratch){ x100 <- t(replicate(100,f(100 ,999))) x1000 <- t(replicate(100,f(1000 ,999))) x10000 <- t(replicate(100,f(10000 ,999))) } else { load("dat.Rdata") x100 <- dat[,,1] x1000 <- dat[,,2] x10000 <- dat[,,3] } @ \begin{figure}[htbp] \begin{center} <>= plot(x100,log="xy",xlim=c(1,80),ylim=c(0.001,1),col="black",pch=1, main=paste("Maximum likelihood estimates of ", expression(m), " and ",expression(theta)) ) points(x1000,col="red",pch=2) points(x10000,col="blue",pch=3) points(50,0.01,pch=4,lwd=3,cex=2) legend( "bottomleft", c("100","1000","10000"),col=c("black","red","blue"), pch=1:3, title="Local community size") @ \caption{Ensemble of~100 maximum\label{mle.theta} likelihood estimates of~$m$ and~$\theta$ for three different sizes of local community. Here, the metacommunity is of size~$\jm=5\times 10^6$, the biodiversity parameter~$\theta$ is~50, and the immigration probability~$m$ is~0.01. The large diagonal cross corresponds to the true values $(50,0.01)$. Note the systematic bias present for each~$J$, being most severe for the smallest ($J=100$; black circles)} \end{center} \end{figure} In particular, note that many of the ensembles give rise to estimates for~$m$ very close to~$1$. This represents an incorrect assessment of (almost) no dispersal limitation, as in fact~$m=0.01$. The probability of~$\hat{m}$ being greater than, say, 0.95 decreases with increasing local community size~$J$ but is definitely appreciable even when~$J=10000$. In cases where~$\hat{m}\simeq 1$, inspection of the likelihood surface in the~$\left(\theta,m\right)$ plane shows that there is no extremum (that is, $\partial{\mathcal L}/\partial\theta=\partial{\mathcal L}/\partial m=0$) anywhere in the half-plane~$m\leqslant 1$. Such cases are challenging for the method of maximum likelihood because the numerically determined maximum will fall on the line~$m=1$. In practice, this means that~$\hat{m}$ being close to 1 is not inconsistent with a much smaller value of~$m$ unless~$J\gg 10000$, at least for~$\theta\sim 50$. Such considerations may be relevant to any work in which~$\hat{m}\simeq 1$; rather than using~$\left(\hat{\theta},\hat{m}\right)$ as a basis for making inferences, one might adopt the Bayesian approach and use the posterior joint distribution of~$\left(\theta,m\right)$. \subsection[Exact combinatorial analysis for small $J_M$]{Exact combinatorial analysis for small $\boldmath{J}_{\!\boldmath{M}}$} The neutral theory may be used to produce various analytical results, but many are tractable only for small~$\jm$; here I determine exact expected abundances for~$\jm=20$. \citeauthor{hubbell2001} (\citeyear[page 122]{hubbell2001}) shows that the expected abundance of the $i^\mathrm{th}$ most abundant species in a sample is \begin{equation} \sum_{k=1}^C r_i(k)\mathrm{Pr}\left\{ \underbrace{S,r_1,\ldots,r_S,0,\ldots ,0}_{\jm} \right\} \end{equation} where~$r_i(k)$ is the abundance of the $i^\mathrm{th}$ species under configuration~$k$, and summation extends over partitions of~$\jm$ in the sense of~\cite{hankin2006} (\citeauthor{hubbell2001}'s ``configurations'' are our ``partitions''). Function \code{expected.abundance()} in the package calculates this, as shown in Figure~\ref{exp.ab} which plots expected abundance as a function of species rank~$R$. The very small expected abundances for, say, $R>10$, are consistent with the overwhelming majority of $\jm=20$, $\theta=2$ ecosystems comprising fewer than 11 species (thus giving any species with $R>10$ a zero abundance). As an extreme case, the value of about~$10^{-11}$ for~$R=20$ is the probability of the ecosystem comprising only singletons. <>= n <- 20 x <- expected.abundance(J=n, theta=3) e.low <- expected.abundance(J=n,theta=4) e.high <- expected.abundance(J=n,theta=2) @ \begin{figure}[htbp] \begin{center} <>= plot(x) segments(x0=1:n,x1=1:n,y0=e.low,y1=e.high) @ \caption{The\label{exp.ab} expected abundances for the $i^{\mathrm{th}}$ ranked species with $\jm=20$ and~$\theta=3$ for $1\leqslant i\leqslant 20$. Red dots show exact expected abundances and the vertical black lines show the range if~$2\leqslant\theta\leqslant 4$.} \end{center} \end{figure} The abundance of each ranked species will exhibit variance; it is possible to build up a histogram of the abundance of, say, the third ranked species by repeatedly generating a neutral ecosystem and recording the abundance of the third most abundant species amongst the~$\jm$ individuals present. Figure~\ref{hist.abund} shows such a histogram generated, again using~$\jm=20$ and~$\theta=2$. Note the absence of zero abundance: in none of the~1000 replicates did the ecosystem comprise only two species (thus rendering the third ranked species nonexistent). Also note the impossibility of it being~7 or greater: if the third ranked species had abundance~7, then the first and second ranked species would have abundances~$\geqslant 7$, so the total number of individuals would exceed~20. <>= rank3 <- table(replicate(1000,rand.neutral(J=20,theta=2)[3])) @ \begin{figure}[htbp] \begin{center} <>= plot(rank3,xlab="abundance of third ranked species",ylab="frequency") @ \caption{Abundance of third\label{hist.abund} ranked species of an ecosystem of $\jm=20$ individuals and a biodiversity parameter $\theta=2$: 1000 replicates} \end{center} \end{figure} \subsection{Creation and analysis of synthetic datasets} Package \pkg{untb} includes a number of functions that generate synthetic datasets in the context of visualization or verification of predictions of the neutral theory. Function \code{untb()} generates random neutral ecosystems using the system specified in equations~\ref{neutral.markoff.chain}. A sample output is shown graphically in Figure~\ref{punctuated.equilibrium}, which exhibits sudden overturning events during which the dominant species suffers a drastic reduction in abundance, to be replaced by a different species. This is a direct numerical verification of the neutral theory's prediction of punctuated equilibrium~\citep[page~233]{hubbell2001}. <>= if(calc_from_scratch){ set.seed(0); synthetic.spp <- species.table(untb(start=rep(1,60),prob=0.002, gens=40000, keep=TRUE)) } else { load("synthetic_spp.Rdata") } @ \begin{figure}[htb] \begin{center} <>= plot(1:10,xlim=c(0,40000),ylim=c(0,60),type="n",xlab="time (generation)",ylab="abundance") "showabundance" <- function(x, ...){ jj <- rle(x) x <- cumsum(jj$lengths) y <- jj$values segments(x0=c(1,x),y0=y,x1=c(x,x[length(x)]),y1=y, ...) #horizontal ones segments(x0=x[-length(x)],y0=y[-1],x1=x[-length(x)],y1=y[-length(y)], ...) # vertical ones } showabundance(synthetic.spp[,1],col="black") showabundance(synthetic.spp[,2],col="red") showabundance(synthetic.spp[,3],col="green") showabundance(synthetic.spp[,4],col="blue") showabundance(synthetic.spp[,5],col="cyan") showabundance(synthetic.spp[,6],col="magenta") if(FALSE){matplot(synthetic.spp,type="l",lty=1,xlab="time (generation)",ylab="abundance")} @ \caption{Synthetic dataset generated using neutral dynamics.\label{punctuated.equilibrium} Lines show the abundance of each species in time; different colours correspond to different (equivalent) species. A sudden displacement of the initially monodominant species (black line) with another species (red line) occurs at $t\simeq 17000$; a similar overturning event occurs at around time 35000} \end{center} \end{figure} Function~\code{display.untb()} displays an animation of successive iterations of function \code{untb()} using different coloured dots to represent the different species; Figure~\ref{display.untb} shows stills from this function at various points in time under differing values of~$\theta$. Take the top row, which uses~\mbox{$\theta=0$}. Here, species count decreases monotonically\footnote{Hubbell shows that the probability of having just one species tends to 1 as time~$t\longrightarrow\infty$ for~$\theta=0$. Running the \code{display.untb()} function illustrates just how long it takes to get to a monoculture.}, with only three species remaining at~$t=100$. The other rows show an equilibrium being approached, the species richness of which depends on magnitude of $\theta$. Also note the first column, showing the the ecosystem after~$t=20$: the four plots are visually indistinguishable, illustrating the difficulty of estimating~$\theta$ in ecosystems far from equilibrium (after a disturbance, for example). \begin{figure}[htbp] \begin{center} <>= set.seed(0) f <- function(gens,p){ display.untb(start=sample(as.census(untb(start=1:100,gens=gens,D=100,prob=p))),gens=0,main="",cex=1.7, asp=1) } g <- function(u="title", ...){ par(srt=0) par(mai=c(0,0,0,0)) plot.new() text(0.5,0.5,u,...) } h <- function(u="title", ...){ par(mai=c(0,0,0,0)) par(srt=90) plot.new() text(0.5,0.5,u, ...) } nf <- layout(matrix( c(00,01,00,02,00,03, 04,05,00,06,00,07, 00,00,00,00,00,00, 08,09,00,10,00,11, 00,00,00,00,00,00, 12,13,00,14,00,15, 00,00,00,00,00,00, 16,17,00,18,00,19),8,6,byrow=TRUE), c(1,4, 1,4, 1,4), c(1,4, 1,4, 1,4, 1,4), TRUE) g(expression(t==10)) g(expression(t==50)) g(expression(t==100)) h(expression(theta==0)) f(10,0) f(50,0) f(100,0) h(expression(theta==0.1)) f(10,0.001) f(50,0.001) f(100,0.001) h(expression(theta==1)) f(10,0.01) f(50,0.01) f(100,0.01) h(expression(theta==10)) f(10,0.1) f(50,0.1) f(100,0.1) @ \caption{Simulated ecosystems of size~$\jm=100$ for varying parameters, \label{display.untb} with each organism represented by a dot, the different colours representing different species; spatial locations are functionally identical. Initial conditions are a system of maximal diversity (i.e. 100 singletons)} \end{center} \end{figure} \clearpage \section{Conclusions} This paper presents Hubbell's Unified Neutral Theory of Biodiversity %' from a computational ecology perspective, and introduces the \pkg{untb} package of \proglang{R} routines for numerical simulation of neutral ecosystem dynamics, analysis of field data, and visualization of real and synthetic datasets. Examples taken from oceanographic datasets are presented and discussed using the package. A number of synthetic datasets are generated and analyzed using the package's simulation functions and %' analyzed using the visualization suite. \section*{Acknowledgements} I would like to acknowledge the many stimulating and helpful comments made on the \proglang{R}-help list while preparing this paper; and the helpful reviews of two anonymous referees which considerably improved the clarity of an earlier draft. The manuscript was prepared with the help of the \pkg{weaver} package~\citep{falcon2007}. \bibliography{untb} \end{document}