e-ISSN 2231-8526
ISSN 0128-7680
Janice Allison Anak Sabang, Stephanie Chua, and Puteri Nor Ellyza Nohuddin
Pertanika Journal of Science & Technology, Volume 34, Issue 2, April 2026
DOI: https://doi.org/10.47836/pjst.34.2.20
Keywords: Deep learning, feature set, machine learning, Myers-Briggs type indicator, prediction model
Published on: 2026-04-30
The Myers-Briggs Type Indicator (MBTI) categorises individuals into one of 16 types, each denoted by a four-letter acronym drawn from four binary personality trait divisions: Extraversion versus Introversion (E/I), Intuition versus Sensing (N/S), Feeling versus Thinking (F/T), and Judging versus Perceiving (J/P). While MBTI personality types are typically determined through questionnaires, determining an individual’s MBTI personality type from their written texts can be framed as a classification task using machine learning and deep learning techniques. The objective of this paper is to compare feature sets and determine the best one for MBTI personality type classification. The methods involved in this study were text mining, feature generation, machine learning, and model evaluation. The feature generation approaches tested were the statistical analysis approach and the semantic approach, the latter involving grammar class tagging and synonym generation. Different document-term matrix representations, using both standard and synset column representations for the semantic approach’s synonym generation, were also tested. This study found that the best MBTI personality type classification performance was obtained with the Logistic Regression model using the TF-IDF Top 10,000 nouns feature set, which combines the statistical analysis approach with the semantic approach’s grammar class tagging.
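To make the feature set named in the abstract more concrete, the following is a minimal sketch of a TF-IDF document-term matrix restricted to nouns and truncated to the top k terms, together with one possible way of splitting a four-letter MBTI label into binary targets. It is not the authors' pipeline. The toy noun lexicon `TOY_NOUNS` stands in for a real grammar class (part-of-speech) tagger, the function names are hypothetical, the smoothed IDF formula is one common variant, and ranking terms by summed TF-IDF score is an assumption, since the abstract does not say how the top 10,000 nouns were chosen.

```python
import math
from collections import Counter

# Hypothetical toy noun lexicon; a real pipeline would use a POS tagger instead.
TOY_NOUNS = {"party", "friends", "book", "idea", "plan", "music"}


def split_mbti(label):
    """Map a four-letter MBTI type to four binary targets (assumed framing)."""
    label = label.upper()
    return {
        "E/I": int(label[0] == "E"),
        "N/S": int(label[1] == "N"),
        "F/T": int(label[2] == "F"),
        "J/P": int(label[3] == "J"),
    }


def noun_tokens(text):
    """Keep only whitespace-separated tokens found in the toy noun lexicon."""
    return [w for w in text.lower().split() if w in TOY_NOUNS]


def top_k_tfidf_features(docs, k):
    """Return (vocabulary, matrix) for the k nouns with the highest summed TF-IDF."""
    tokenised = [noun_tokens(d) for d in docs]
    n_docs = len(tokenised)

    # Document frequency: number of documents containing each noun.
    df = Counter(term for toks in tokenised for term in set(toks))
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}

    rows = []
    scores = Counter()
    for toks in tokenised:
        tf = Counter(toks)
        row = {t: tf[t] * idf[t] for t in tf}
        rows.append(row)
        scores.update(row)  # accumulate each term's TF-IDF over the corpus

    # Rank by summed score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    vocab = [t for t, _ in ranked[:k]]
    matrix = [[row.get(t, 0.0) for t in vocab] for row in rows]
    return vocab, matrix


docs = ["party friends music party", "book idea plan", "friends book walk"]
vocab, matrix = top_k_tfidf_features(docs, k=3)
targets = split_mbti("INFP")
```

Each row of `matrix` could then be passed to a classifier such as logistic regression, with one model per binary target or a single 16-class model; the abstract does not specify which arrangement the study used.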