# Bayesian versus Neural Network Analysis of Algae Data Population - A New Method to Predict and Analyse Cause and Effect

### Jen J. Lee, Jorge A. Achcar, Emílio A. C. Barros, Carlos D. Maciel

#### Abstract

Biology calls for advanced modelling techniques because data on environmental and biological relationships mix qualitative, linguistic and numerical information. Experiments and data collection are also expensive and time-consuming, so determining which variables are relevant and using inference models that demand less data are highly desirable. In this work, using a set of 200 multivariate data samples of algae populations and environmental variables, we propose a Bayesian method to predict the compositional population distribution. This is a good application example, since measuring environmental variables is easier to automate, faster and less expensive than population counting, which usually requires a large amount of specialized human involvement. An additive log-ratio transformation and a regression model were applied to the data, and 255,000 Gibbs samples were simulated using the OpenBUGS software. An Artificial Neural Network (ANN) was also designed in MATLAB to predict the distribution for benchmarking purposes. Both models showed similar prediction performance, but the Bayesian model additionally allows an analysis of the credible intervals of the regression parameters associated with each variable, showing that most of the variables in this study are relevant, which is consistent with the expected results in this case.
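The additive log-ratio (ALR) transformation mentioned above maps a composition (non-negative parts that sum to one) to unconstrained real values, on which an ordinary regression model can then be fitted. The following is a minimal sketch of the transform and its inverse, not the authors' implementation: it assumes the last component is used as the reference part and that all parts are strictly positive, neither of which is stated in the abstract. The function names `alr` and `alr_inverse` and the example counts are hypothetical.

```python
import numpy as np


def alr(composition):
    """Additive log-ratio transform, last component as reference (assumption).

    Rows are first rescaled to sum to 1, so raw counts are accepted.
    Parts must be strictly positive; zeros would produce infinite values.
    """
    x = np.asarray(composition, dtype=float)
    x = x / x.sum(axis=-1, keepdims=True)
    return np.log(x[..., :-1] / x[..., -1:])


def alr_inverse(y):
    """Map ALR coordinates back to proportions that sum to 1."""
    y = np.asarray(y, dtype=float)
    ones = np.ones(y.shape[:-1] + (1,))
    expanded = np.concatenate([np.exp(y), ones], axis=-1)
    return expanded / expanded.sum(axis=-1, keepdims=True)


# Hypothetical counts for three algae groups in two samples.
counts = np.array([[10.0, 30.0, 60.0],
                   [5.0, 5.0, 90.0]])
y = alr(counts)               # shape (2, 2): one fewer column than parts
recovered = alr_inverse(y)    # proportions, each row summing to 1
```

A regression fitted on `y` predicts in the transformed space; applying `alr_inverse` to those predictions returns a valid composition, which is what makes the transformation convenient for predicting population distributions.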

#### Paper Citation

#### in Harvard Style

Lee, J. J., Achcar, J. A., Barros, E. A. C. and Maciel, C. D. (2013). **Bayesian versus Neural Network Analysis of Algae Data Population - A New Method to Predict and Analyse Cause and Effect**. In *Proceedings of the 5th International Joint Conference on Computational Intelligence - Volume 1: NCTA, (IJCCI 2013)*, ISBN 978-989-8565-77-8, pages 482-488. DOI: 10.5220/0004552304820488

#### in Bibtex Style

@conference{ncta13,

author={Jen J. Lee and Jorge A. Achcar and Emílio A. C. Barros and Carlos D. Maciel},

title={Bayesian versus Neural Network Analysis of Algae Data Population - A New Method to Predict and Analyse Cause and Effect},

booktitle={Proceedings of the 5th International Joint Conference on Computational Intelligence - Volume 1: NCTA, (IJCCI 2013)},

year={2013},

pages={482-488},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0004552304820488},

isbn={978-989-8565-77-8},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the 5th International Joint Conference on Computational Intelligence - Volume 1: NCTA, (IJCCI 2013)

TI - Bayesian versus Neural Network Analysis of Algae Data Population - A New Method to Predict and Analyse Cause and Effect

SN - 978-989-8565-77-8

AU - Lee, Jen J.

AU - Achcar, Jorge A.

AU - Barros, Emílio A. C.

AU - Maciel, Carlos D.

PY - 2013

SP - 482

EP - 488

DO - 10.5220/0004552304820488