Machine Learning and the Social Safety Net: Improved beneficiary targeting and malnutrition detection in India

By Ronak Upadhyaya

Given the bureaucratic difficulty of implementing sophisticated methodologies for identifying target individuals and households in India, proxy means tests (PMT) are administratively attractive for beneficiary targeting and poverty assessment. Current poverty assessment methods are inherently inaccurate at the household level; many social safety net programs in India have very low coverage rates, which in turn imply large exclusion errors. Moreover, coverage varies considerably by geography, and non-poor households also benefit from these programs, which implies that the effectiveness of welfare services is heterogeneous.


India’s sheer population size and income heterogeneity have kept policymakers preoccupied with poverty targeting since independence. India’s effort to provide welfare services to its vulnerable populations has traditionally involved three major components. The first attempts to address low income in average Indian households by raising household earnings through growth, on the assumption that higher incomes allow households to be self-sufficient. The second, built on the foundation of providing equal opportunities to all citizens, involves providing public goods and other subsidized services. The third is rooted in the idea of selective targeting; special safety nets and welfare programs are devised for communities and individuals that are more susceptible to poverty than others. While India’s social safety net programs have performed modestly well on the first component, their record on public goods, subsidized services and selective targeting has been woeful. The Integrated Child Development Services (ICDS) programme is one such welfare service. An Indian government welfare programme that provides food, preschool education, and primary healthcare to children under 6 years of age and their mothers, it is built on the second and third components of the Indian social safety net. While several positive benefits of the programme have been well reported and documented, the World Bank has also highlighted certain key shortcomings, including inability to “target the girl child improvements, participation of wealthier children more than the poorer children and lowest level of funding for the poorest and the most undernourished states of India.” ICDS programme placement is clearly regressive across states in the country.
The states with the “greatest need for the programme and populations with greatest vulnerabilities” (the poor Northern states, which have high levels of child malnutrition and nearly half of India’s population) have the lowest programme coverage. High rates of leakage, low coverage and low incidence of children below the age of 3 corroborate the World Bank’s account of the programme. This paper presents a methodology that uses supervised machine learning techniques with the aim of reducing the undercoverage and leakage rates of the ICDS.
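
The two error rates named above can be made concrete. The following is a minimal sketch, assuming the definitions commonly used in the targeting literature: undercoverage is the share of eligible individuals who are not enrolled, and leakage is the share of enrolled individuals who are not eligible. The function name and the ten-child example are hypothetical illustrations, not ICDS data.

```python
def targeting_errors(eligible, enrolled):
    """Return (undercoverage, leakage) for two parallel lists of booleans.

    eligible[i] is True if person i should receive the service;
    enrolled[i] is True if person i actually receives it.
    """
    if len(eligible) != len(enrolled):
        raise ValueError("eligible and enrolled must have the same length")

    n_eligible = sum(eligible)
    n_enrolled = sum(enrolled)
    # Exclusion error: eligible but left out of the programme.
    wrongly_excluded = sum(1 for e, b in zip(eligible, enrolled) if e and not b)
    # Inclusion error: enrolled although not eligible.
    wrongly_included = sum(1 for e, b in zip(eligible, enrolled) if b and not e)

    undercoverage = wrongly_excluded / n_eligible if n_eligible else 0.0
    leakage = wrongly_included / n_enrolled if n_enrolled else 0.0
    return undercoverage, leakage


# Hypothetical example: 10 children, 4 of them eligible, 3 enrolled.
eligible = [True, True, True, True, False, False, False, False, False, False]
enrolled = [True, False, False, False, True, True, False, False, False, False]

undercoverage, leakage = targeting_errors(eligible, enrolled)
print(f"undercoverage={undercoverage:.2f}, leakage={leakage:.2f}")
# prints: undercoverage=0.75, leakage=0.67
```

In this toy case three of the four eligible children are missed and two of the three enrolled children are not eligible, so both error rates are high at the same time. That is the pattern the paragraph above describes for the ICDS.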


Poverty assessment tests in India usually rely on ordinary least squares (OLS) regression to minimize in-sample error, which can result in an over-fitted statistical model that describes random error instead of the underlying relationship. Dimensionality reduction of the feature space using Principal Component Analysis (PCA) can be performed to determine the most important variables, those that create an environment conducive to undernourishment in women and children. PCA finds the linear combinations of the original variables (the components) that capture the most variance in the data; in essence, it extracts a low-dimensional set of features from a high-dimensional dataset while retaining as much information as possible. Dimensionality reduction is particularly useful in evaluating datasets for poverty assessment; the technique allows us to identify the most crucial variables and economic conditions leading to malnourishment in India. Using these identified variables, classification algorithms can be built to achieve a significant gain in detecting malnutrition. These algorithms can include the ID3 (Iterative Dichotomiser 3) decision tree, the random forest, and the artificial neural network. Such classification techniques offer flexible ways to process large amounts of survey data for accurate malnutrition detection and prevention, and for reducing cases of undercoverage. Subsequently, this method can be used to develop a tool for accurate poverty targeting in core Indian social protection programs.
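
To make the dimensionality-reduction step concrete, here is a minimal sketch in plain Python that finds the first principal component by power iteration on the sample covariance matrix. The function name, the toy survey columns and the numbers are hypothetical illustrations, not taken from any dataset discussed here. A real analysis would standardize the variables first and would more likely rely on an established statistics library.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit loading vector, share of total variance it explains)."""
    n, d = len(rows), len(rows[0])
    # Center each column so that variance is measured around the mean.
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    def cov_times(v):
        return [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]

    # Power iteration: repeatedly multiplying by the covariance matrix
    # pulls the vector toward the direction of greatest variance.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = cov_times(v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along this direction (e.g. constant columns)
            return v, 0.0
        v = [x / norm for x in w]

    eigenvalue = sum(a * b for a, b in zip(v, cov_times(v)))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance


# Hypothetical toy data: each row is one household, with columns
# (meals per day, household size, months of food shortage).
households = [
    [3.0, 4.0, 0.0],
    [2.0, 6.0, 3.0],
    [1.0, 8.0, 6.0],
    [3.0, 5.0, 1.0],
    [2.0, 7.0, 4.0],
]
loadings, explained = first_principal_component(households)
print([round(x, 3) for x in loadings], round(explained, 3))
```

Variables with large absolute loadings on the first component contribute most to the dominant pattern of variation, which makes them candidate inputs for a downstream classifier. Note that PCA by itself does not use the malnutrition label, so the selected variables still need to be validated against observed outcomes.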

In this paper, I have presented methods for improving a particular type of poverty assessment and malnutrition identification tool. Applying dimensionality reduction methods and classification algorithms to the problems of undercoverage and leakage in poverty targeting should produce a significant gain in accuracy in India. The data mining methods presented here demonstrate the power of computational methods in policy making. However, a profound knowledge of the context, setting and realities on the ground is vital to the research methodology; uncritical use of neural networks and decision tree algorithms only undermines the usefulness of the process. Although I have focused on machine learning algorithms for identifying vulnerable cases of malnutrition among preschool children and among expectant and nursing mothers, the methods discussed in this paper should be considered for malnourishment identification more broadly.


  • McBride, Linden, and Austin Nichols. “Improved Poverty Targeting through Machine Learning: An Application to the USAID Poverty Assessment Tools.” 12 Jan. 2015. Web. 20 July 2016.
  • Thangamani, D., and P. Sudha. “Identification Of Malnutrition With Use Of Supervised Datamining Techniques – Decision Trees And Artificial Neural Networks.” International Journal Of Engineering And Computer Science 3.9 (2014): 8236-241. IJECS. 9 Sept. 2014. Web. 20 July 2016.
  • “Integrated Child Development Services (ICDS).” Ministry of Women & Child Development. National Informatics Centre (NIC), 15 June 2016. Web. 20 July 2016.
  • “Using Social Safety Nets to Combat Child Malnutrition.” The World Bank. The World Bank, 23 Sept. 2015. Web. 20 July 2016.
  • Lantz, Brett. Machine Learning with R. Birmingham: Packt, 2013. Print.
  • Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. N.p.: Springer, n.d. Print.
  • Grosh, Margaret E., and Judy L. Baker. “Proxy Means Tests for Targeting Social Programs: Simulations and Speculation.” World Bank Group. World Bank Group, July 1995. Web. 20 July 2016.
  • “Safety Nets and Transfers.” The World Bank Group. The World Bank Group, n.d. Web. 20 July 2016.
  • Ajwad, Mohamed Ihsan. “Performance of Social Safety Net Programs in Uttar Pradesh.” SSRN Electronic Journal (2007): n. pag. World Bank Group. World Bank Group, Oct. 2007. Web. 20 July 2016.
  • “Social Safety Nets.” Social Safety Nets in India (1998): 197-206. IHDS India Human Development Survey. IHDS India Human Development Survey. Web. 20 July 2016.

About the Author

Ronak Upadhyaya

Ronak Upadhyaya, 18, is a freshman at the University of Southern California, where he is majoring in Computer Science/Business Administration and is working towards a minor in Mathematical Finance. “Right now, I’m thinking about becoming a strategy consultant upon graduation. I wish to help businesses solve organizational performance problems,” he said. “But whatever I do, it will have something to do with tackling complex, multidisciplinary problems.” He graduated from Oberoi International School, Mumbai, India, where he pursued the International Baccalaureate Diploma Programme. His interests in entrepreneurship, mathematics and public policy are a result of a curious mind with a passion for problem solving. Over the past three years, he has been actively involved in his local community. He cofounded a pro-bono consulting group to offer information about traditional financial and welfare services like savings accounts, identification cards and money transfers to lower socio-economic groups in Mumbai. He has also worked as a counselor at Teach for India, teaching young students programming languages. Currently, Ronak is an active member of several clubs and organizations on campus. His love for finance inspired him to join the Trojan Hedge Fund Group and the Trojan Investing Group at USC. He also joined the Los Angeles Technical Consulting Group, which offers technical consulting to organizations in the Greater Los Angeles Area, as he continues pursuing hobbies that increase his appreciation for uncovering creative solutions to difficult problems.