Quantity Distribution Search using Sparse Representation Generated with Kernel-based Regression

Akinori Asahara, Hideki Hayashi

Abstract

A quantity distribution (e.g., temperature or rainfall) is represented by a very large number of records, which imposes a heavy overhead on data management. To address this problem, we propose a method that stores only a subset of the records. The method uses an approximation derived in advance with kernel ridge regression to determine the minimal dataset to be loaded into the database system. An advantage of the proposed method is that the processes for reconstructing the original dataset can be implemented entirely in Structured Query Language (SQL), the language used by relational database systems, so users can easily analyze the quantity distribution. In experiments using digitized elevation map data, we confirmed that the proposed method reduces the number of records to less than 1/10 of the original number when the acceptable error is set to 125 m.
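
The idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the subset is chosen, so the greedy rule used here (repeatedly add the record with the largest reconstruction error until every record is within the acceptable error) is an assumption, as are the Gaussian kernel, the one-dimensional profile, and all function and parameter names (`fit_weights`, `reconstruct`, `select_sparse_subset`, `width`, `ridge`). The reconstruction step is a plain weighted sum over the stored records, which is the kind of computation an SQL aggregate can express.

```python
import math


def gaussian_kernel(a, b, width):
    """Gaussian (RBF) kernel between two 1-D positions (assumed kernel choice)."""
    return math.exp(-((a - b) ** 2) / (2.0 * width ** 2))


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_weights(centers, values, width, ridge):
    """Kernel ridge regression: solve (K + ridge * I) w = values."""
    gram = [[gaussian_kernel(a, b, width) + (ridge if i == j else 0.0)
             for j, b in enumerate(centers)]
            for i, a in enumerate(centers)]
    return solve_linear(gram, values)


def reconstruct(x, centers, weights, width):
    """Approximate the quantity at x as a weighted sum of kernels."""
    return sum(w * gaussian_kernel(x, c, width)
               for c, w in zip(centers, weights))


def select_sparse_subset(xs, ys, tol, width, ridge=1e-8):
    """Greedy subset selection (hypothetical rule, not from the paper).

    Returns the kept positions, their weights, and the largest absolute
    reconstruction error over all original records.
    """
    chosen = [max(range(len(xs)), key=lambda i: abs(ys[i]))]
    while True:
        centers = [xs[i] for i in chosen]
        weights = fit_weights(centers, [ys[i] for i in chosen], width, ridge)
        errors = [abs(reconstruct(x, centers, weights, width) - y)
                  for x, y in zip(xs, ys)]
        remaining = [i for i in range(len(xs)) if i not in chosen]
        if max(errors) <= tol or not remaining:
            return centers, weights, max(errors)
        # Add the not-yet-kept record that is currently reconstructed worst.
        chosen.append(max(remaining, key=lambda i: errors[i]))
```

Only the kept positions and their weights would need to be stored; any other record is recovered by calling `reconstruct` at its position, with an error no larger than the tolerance passed to `select_sparse_subset`.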


Paper Citation


in Harvard Style

Asahara A. and Hayashi H. (2017). Quantity Distribution Search using Sparse Representation Generated with Kernel-based Regression. In Proceedings of the 3rd International Conference on Geographical Information Systems Theory, Applications and Management - Volume 1: GISTAM, ISBN 978-989-758-252-3, pages 209-216. DOI: 10.5220/0006316402090216


in Bibtex Style

@conference{gistam17,
author={Akinori Asahara and Hideki Hayashi},
title={Quantity Distribution Search using Sparse Representation Generated with Kernel-based Regression},
booktitle={Proceedings of the 3rd International Conference on Geographical Information Systems Theory, Applications and Management - Volume 1: GISTAM},
year={2017},
pages={209-216},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006316402090216},
isbn={978-989-758-252-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Geographical Information Systems Theory, Applications and Management - Volume 1: GISTAM
TI - Quantity Distribution Search using Sparse Representation Generated with Kernel-based Regression
SN - 978-989-758-252-3
AU - Asahara A.
AU - Hayashi H.
PY - 2017
SP - 209
EP - 216
DO - 10.5220/0006316402090216