e-ISSN 2231-8526
ISSN 0128-7680

Article ID: JST-2403-2021


Classification of Existing Health Model of India at the End of the Twelfth Plan using Enhanced Decision Tree Algorithm

Ashok Kumar, Arun Lal Srivastav, Ishwar Dutt and Karan Bajaj

Pertanika Journal of Science & Technology, Volume 29, Issue 4, October 2021


Keywords: C4.5 Algorithm, classification algorithms, decision tree, health model, Shannon entropy

Published on: 29 October 2021

Rapid urbanisation has increased the need for state-of-the-art health models that can meet society’s growing needs during a pandemic. Information-theoretic algorithms based on decision trees mine data and classify related records to derive decision rules. Classification is therefore an effective tool for analysing the existing health systems of India’s states and union territories. For this purpose, the data are categorised and then processed with an enhanced Shannon entropy-based C4.5 decision tree algorithm to derive a set of rules. These rules can identify the major gaps in the health care systems. If these gaps are properly addressed in the affected regions, the health care models can help achieve the Sustainable Development Goals.
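To make the splitting criterion concrete, the following minimal sketch computes Shannon entropy and the standard C4.5 gain ratio for categorical attributes. It illustrates only the textbook criterion, not the enhanced variant the abstract refers to, whose details are not given here. The attribute names (`doctor_density`, `bed_density`), the target `health_status`, and the four records are hypothetical and are not data from the study.

```python
from collections import Counter
from math import log2


def shannon_entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Standard C4.5 gain ratio of splitting `rows` on a categorical `attribute`."""
    labels = [row[target] for row in rows]
    # Group the class labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Expected entropy after the split, weighted by subset size.
    remainder = sum(len(g) / total * shannon_entropy(g) for g in groups.values())
    info_gain = shannon_entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = shannon_entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records; these are NOT figures from the study.
regions = [
    {"doctor_density": "low", "bed_density": "low", "health_status": "poor"},
    {"doctor_density": "low", "bed_density": "high", "health_status": "poor"},
    {"doctor_density": "high", "bed_density": "low", "health_status": "good"},
    {"doctor_density": "high", "bed_density": "high", "health_status": "good"},
]

# Pick the attribute with the highest gain ratio as the root split.
best = max(["doctor_density", "bed_density"],
           key=lambda a: gain_ratio(regions, a, "health_status"))
print(best)
```

In this toy data the first attribute separates the classes perfectly while the second carries no information, so a C4.5-style tree would split on the first attribute at the root. Each root-to-leaf path of the finished tree then reads as one classification rule.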

