Improving Credit Decision Through Machine Learning and Alternative Data: Evidence from NBFIs


Jinnajate Achalapong
Manit Satitsamitpong

Abstract

Non-banking financial institutions (NBFIs) often struggle to make accurate credit decisions, especially for customers with insufficient traditional credit histories. Conventional models, such as logistic regression, primarily depend on credit bureau data and fail to capture the full credit potential of underserved populations—thereby hindering business expansion and financial inclusion. This study investigates how NBFIs can enhance credit decision-making by applying advanced machine learning techniques—namely XGBoost and neural networks—alongside alternative data sources, including mobile phone usage patterns, utility bill payments, and social media activity. Utilizing a real-world dataset of over 300,000 individuals, the findings demonstrate that machine learning models significantly outperform traditional approaches, particularly when alternative data is incorporated. These improvements lead to more precise risk classification, enabling institutions to reduce default rates, expand lending to previously overlooked borrowers, and improve portfolio profitability. In addition, the study addresses critical ethical and privacy considerations surrounding alternative data use. The results provide actionable insights for NBFIs aiming to adopt data-driven credit strategies that balance predictive power with responsible data governance—ultimately enhancing credit operations and promoting inclusive growth.

