Authors:
Fatemeh Tavakoli 1; Kshirasagar Naik 1; Marzia Zaman 2; Richard Purcell 3; Srinivas Sampalli 3; Abdul Mutakabbir 4; Chung-Horng Lung 4 and Thambirajah Ravichandran 5
Affiliations:
1 Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada
2 Research and Development, Cistel Technology, Ottawa, ON, Canada
3 Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada
4 Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada
5 Research and Development, Hegyi Geomatics International Inc., Ottawa, ON, Canada
Keyword(s):
Forest Fire, Classification, Machine Learning, Supervised Learning, Dataset, Big Data, Random Forest, XGBoost, LightGBM, SMOTE, NearMiss, SMOTE-ENN.
Abstract:
Forest fires have been escalating in frequency and intensity across Canada in recent years. This study builds a dataset that combines Copernicus climate reanalysis data with historical fire records and uses it to develop a machine learning framework for fire classification. Three algorithms, Random Forest, XGBoost, and LightGBM, were evaluated. Given the pronounced class imbalance of 154:1 between “non-fire” and “fire” events, we applied two re-sampling strategies: Spatiotemporal, which selects samples based on spatial and seasonal considerations, and Technique-Driven, which relies on algorithmic re-sampling methods such as NearMiss. XGBoost combined with NearMiss Version 3 at a sampling ratio of 0.09 between “non-fire” and “fire” events yielded the best results: 98.08% precision, 86.06% sensitivity, and 93.03% specificity.
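For readers less familiar with the three reported metrics, the following is a minimal sketch of how precision, sensitivity, and specificity are computed from a binary confusion matrix, with “fire” treated as the positive class. The function name `classification_metrics` and the counts in the usage example are hypothetical and chosen only for illustration; they are not the study's data, and this is not presented as the authors' evaluation code.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute precision, sensitivity, and specificity from confusion counts.

    tp: "fire" events predicted as "fire"
    fp: "non-fire" events predicted as "fire"
    tn: "non-fire" events predicted as "non-fire"
    fn: "fire" events predicted as "non-fire"
    """

    def safe_div(numerator: int, denominator: int) -> float:
        # Assumption for this sketch: an undefined ratio is reported as 0.0.
        return numerator / denominator if denominator else 0.0

    return {
        # Of all events predicted as "fire", the fraction that were fires.
        "precision": safe_div(tp, tp + fp),
        # Of all actual fires, the fraction that were detected (recall).
        "sensitivity": safe_div(tp, tp + fn),
        # Of all actual non-fires, the fraction correctly rejected.
        "specificity": safe_div(tn, tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    metrics = classification_metrics(tp=80, fp=5, tn=900, fn=20)
    for name, value in metrics.items():
        print(f"{name}: {value:.2%}")
```

Reporting sensitivity and specificity alongside precision matters under heavy class imbalance, because a classifier that labels almost everything “non-fire” can still reach high overall accuracy while missing most fires.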