Improving Credit Decision Through Machine Learning and Alternative Data: Evidence from NBFIs
Abstract
Non-banking financial institutions (NBFIs) often struggle to make accurate credit decisions, especially for customers with insufficient traditional credit histories. Conventional models, such as logistic regression, primarily depend on credit bureau data and fail to capture the full credit potential of underserved populations—thereby hindering business expansion and financial inclusion. This study investigates how NBFIs can enhance credit decision-making by applying advanced machine learning techniques—namely XGBoost and neural networks—alongside alternative data sources, including mobile phone usage patterns, utility bill payments, and social media activity. Utilizing a real-world dataset of over 300,000 individuals, the findings demonstrate that machine learning models significantly outperform traditional approaches, particularly when alternative data is incorporated. These improvements lead to more precise risk classification, enabling institutions to reduce default rates, expand lending to previously overlooked borrowers, and improve portfolio profitability. In addition, the study addresses critical ethical and privacy considerations surrounding alternative data use. The results provide actionable insights for NBFIs aiming to adopt data-driven credit strategies that balance predictive power with responsible data governance—ultimately enhancing credit operations and promoting inclusive growth.