
A Comparative Effectiveness of Hierarchical and Non-hierarchical Regionalisation Algorithms in Regionalising the Homogeneous Rainfall Regions

Zun Liang Chuan, Wan Nur Syahidah Wan Yusoff, Azlyna Senawi, Mohd Romlay Mohd Akramin, Soo-Fen Fam, Wendy Ling Shinyie and Tan Lit Ken

Pertanika Journal of Science & Technology, Volume 30, Issue 1, January 2022

DOI: https://doi.org/10.47836/pjst.30.1.18

Keywords: Anderson Darling statistical test, bootstrap, hierarchical, non-hierarchical, regionalisation algorithm, unbiased statistical test

Published on: 10 January 2022

Descriptive data mining has been widely applied in hydrology, in the form of regionalisation algorithms, to identify statistically homogeneous rainfall regions. However, the agglomerative hierarchical and non-hierarchical regionalisation algorithms employed in previous studies require post-processing techniques to validate and interpret the analysis results. The main objective of this study is to investigate the effectiveness of automated agglomerative hierarchical and non-hierarchical regionalisation algorithms in identifying homogeneous rainfall regions based on a new regionalised feature set built on statistically significant differences. To pursue this objective, this study collected historical monthly rainfall time series from 20 rain gauge stations located in the Kuantan district. In practice, these 20 rain gauge stations can be categorised into two statistically homogeneous rainfall regions with distinct spatial and temporal variability in rainfall amounts. The results of the analysis show that the Forgy K-means non-hierarchical (FKNH), Hartigan-Wong K-means non-hierarchical (HKNH), and Lloyd K-means non-hierarchical (LKNH) regionalisation algorithms are superior to the other automated agglomerative hierarchical and non-hierarchical regionalisation algorithms. Furthermore, FKNH, HKNH, and LKNH yielded the highest regionalisation accuracy among the algorithms compared. Based on the regionalisation results of this study, the reliability and accuracy of risk assessments for extreme hydro-meteorological events in the Kuantan district can be improved. In particular, regional quantile estimates based on an appropriate statistical distribution can be more accurate than at-site quantile estimates.
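As a rough illustration of the non-hierarchical approach described above, the following minimal Python sketch clusters 20 stations into two regions with a Lloyd-style K-means and scores the result against reference labels. It is not the authors' implementation: the abstract does not give the regionalised feature set, so the two features per station are synthetic placeholders, and the names `lloyd_kmeans` and `regionalisation_accuracy` are hypothetical. The accuracy measure (share of stations placed in the reference region, maximised over cluster relabellings) is likewise an assumption about how regionalisation accuracy could be computed. The Hartigan-Wong variant is not shown.

```python
import random
from itertools import permutations


def lloyd_kmeans(points, k, n_iter=100, seed=0):
    """Lloyd-style K-means: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    # Forgy-style initialisation: pick k observations as starting centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid (squared Euclidean).
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def regionalisation_accuracy(pred, truth, k):
    """Share of stations in the reference region, maximised over cluster relabellings."""
    best = 0
    for perm in permutations(range(k)):
        hits = sum(1 for p, t in zip(pred, truth) if perm[p] == t)
        best = max(best, hits)
    return best / len(truth)


# Synthetic placeholder features for 20 stations in two well-separated groups
# (assumed values, NOT the study's data).
region_a = [(1.0 + 0.05 * i, 2.0 - 0.03 * i) for i in range(10)]
region_b = [(5.0 + 0.05 * i, 6.0 + 0.03 * i) for i in range(10)]
stations = region_a + region_b
reference = [0] * 10 + [1] * 10

labels, centroids = lloyd_kmeans(stations, k=2)
accuracy = regionalisation_accuracy(labels, reference, k=2)
print(f"regionalisation accuracy: {accuracy:.2f}")
```

Because K-means cluster indices are arbitrary, the accuracy function tries every relabelling before counting matches; with only two regions this is a two-way comparison.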

  • Ahmad, N. H., Othman, I. R., & Deni, S. M. (2013). Hierarchical cluster approach for regionalisation of Peninsular Malaysia based on the precipitation amount. Journal of Physics: Conference Series, 423, 1-10. https://doi.org/10.1088/1742-6596/423/1/012018

  • Awan, J. A., Bae, D. H., & Kim, K. J. (2014). Identification and trend analysis of homogeneous rainfall zones over the East Asia monsoon region. International Journal of Climatology, 35(7), 1422-1433. https://doi.org/10.1002/joc.4066

  • Burn, D. H., Zrinji, Z., & Kowalchuk, M. (1997). Regionalization of catchments for regional flood frequency analysis. Journal of Hydrologic Engineering, 2(2), 76-82. https://doi.org/10.1061/(ASCE)1084-0699(1997)2:2(76)

  • Chuan, Z. L., Deni, S. M., Fam, S. F., & Ismail, N. (2020). The effectiveness of a probabilistic principal component analysis model and expectation maximisation algorithm in treating missing daily rainfall data. Asia-Pacific Journal of Atmospheric Sciences, 56, 119-129. https://doi.org/10.1007/s13143-019-00135-8

  • Chuan, Z. L., Ismail, N., Shinyie, W. L., Ken, T. L., Fam, S. F., Senawi, A., & Yusoff, W. N. S. W. (2018a). The efficiency of average linkage hierarchical clustering algorithm associated multi-scale bootstrap resampling in identifying homogeneous precipitation catchments. IOP Conference Series: Materials Science and Engineering, 342, 1-10. https://doi.org/10.1088/1757-899X/342/1/012070

  • Chuan, Z. L., Ismail, N., Yusoff, W. N. S. W., Fam, S. F., & Romlay, M. A. M. (2018b). Identifying homogeneous rainfall catchments for non-stationary time series using TOPSIS algorithm and bootstrap k-sample Anderson-Darling test. International Journal of Engineering & Technology, 7(4), 3228-3237.

  • Chuan, Z. L., Senawi, A., Yusoff, W. N. S. W., Ismail, N., Ken, T. L., & Chuan, M. W. (2018c). Identifying the ideal number Q-components of the Bayesian principal component analysis model for missing daily precipitation data treatment. International Journal of Engineering & Technology, 7(4.30), 5-10. https://doi.org/10.14419/ijet.v7i4.30.21992

  • Dash, M., & Liu, H. (2003). Feature selection for clustering. In T. Terano, H. Liu & A. L. P. Chen (Eds.), Knowledge discovery and data mining current issues and new applications (pp. 110-121). Springer. https://doi.org/10.1007/3-540-45571-X_13

  • Forgy, E. (1965). Cluster analysis of multivariate data: Efficiency versus interpretability of classification. Biometrics, 21(3), 768-769.

  • Guttman, N. B. (1993). The use of L-moments in the determination of regional precipitation climates. Journal of Climate, 6(12), 2309-2325. https://doi.org/10.1175/1520-0442(1993)006<2309:TUOLMI>2.0.CO;2

  • Hamdan, M. F., Suhaila, J., & Jemain, A. A. (2015). Clustering rainfall pattern in Malaysia using functional data analysis. AIP Conference Proceedings, 1643, 349-355. https://doi.org/10.1063/1.4907466

  • Hartigan, J. A., & Wong, M. A. (1979). Algorithm AS 136: A k-means clustering algorithm. Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(1), 100-108. https://doi.org/10.2307/2346830

  • Lloyd, S. P. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, IT-28(2), 129-137. https://doi.org/10.1109/TIT.1982.1056489

  • MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In L. M. Le Cam & J. Neyman (Eds.), Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (pp. 281-297). University of California Press.

  • Ngongondo, C. S., Xu, C. Y., Tallaksen, L. M., Alemaw, B., & Chirwa, T. (2011). Regional frequency analysis of rainfall extremes in Southern Malawi using the index rainfall and L-moments approaches. Stochastic Environmental Research and Risk Assessment, 25(7), 939-955. https://doi.org/10.1007/s00477-011-0480-x

  • Nnaji, C. C., Mama, C. N., & Ukpabi, O. (2014). Hierarchical analysis of rainfall variability across Nigeria. Theoretical and Applied Climatology, 123(1-2), 171-184. https://doi.org/10.1007/s00704-014-1348-z

  • Saeed, G. A. A., Chuan, Z. L., Zakaria, R., Yusoff, W. N. S. W., & Salleh, M. Z. (2016). Determine of the best single imputation algorithm for missing rainfall data treatment. Journal of Quality Measurement and Analysis, 12(1-2), 79-87.

  • Sahrin, S., Ismail, N., & Alias, N. E. (2018). Regional frequency analysis of Peninsular Malaysia using L-moments. Far East Journal of Mathematical Sciences, 103(8), 1379-1398. https://dx.doi.org/10.17654/MS103081379

  • Scholz, F. W., & Stephens, M. A. (1987). K-sample Anderson-Darling tests. Journal of the American Statistical Association, 82(399), 918-924. https://doi.org/10.1080/01621459.1987.10478517

  • Shimodaira, H. (2002). An approximately unbiased test of phylogenetic tree selection. Systematic Biology, 51(3), 492-508. https://doi.org/10.1080/10635150290069913

  • Tan, P. N., Steinbach, M., & Kumar, V. (2006). Introduction to data mining. Pearson Addison Wesley.

  • Terassi, P. M. D. B., & Galvani, E. (2017). Identification of homogeneous rainfall regions in the Eastern watersheds of the State of Paraná, Brazil. Climate, 5(3), 1-13. https://doi.org/10.3390/cli5030053

ISSN 0128-7680

e-ISSN 2231-8526

Article ID: JST-2029-2020
