Conditional average treatment effect (CATE) estimates are increasingly used in policy decision-making because they can profile and prioritize the individuals who benefit most from a treatment. This paper studies the specific case of an imbalanced covariate in the data set. We posit that standard parametric and non-parametric methods perform differently for the minority and majority groups, which biases the CATE estimates. We first provide theoretical derivations for reweighting methods in a parametric setting to build intuition about the problem. We then propose a repository of tools that address the issues of imbalanced covariates, including reweighting in causal forests and data augmentation through generative modeling. We demonstrate the effectiveness of these methods through extensive simulation studies. Finally, we apply these methods to a real-world data set on job training programs.
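To make the reweighting idea concrete, here is a minimal sketch of inverse-frequency weighting on toy data. It is not the paper's estimator: the function names, the weighting formula, and the numbers are all assumptions chosen for illustration. Each sample is weighted so that the minority and majority groups carry equal total weight, and a weighted difference in means is then compared with the unweighted one.

```python
from collections import Counter


def inverse_frequency_weights(groups):
    """Weight each sample by n / (k * group_count) so all k groups have equal total weight."""
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return [n / (k * counts[g]) for g in groups]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_diff_in_means(y, t, w):
    """Weighted mean outcome of treated (t == 1) minus that of control (t == 0)."""
    treated = [(yi, wi) for yi, ti, wi in zip(y, t, w) if ti == 1]
    control = [(yi, wi) for yi, ti, wi in zip(y, t, w) if ti == 0]
    mean_treated = weighted_mean([yi for yi, _ in treated], [wi for _, wi in treated])
    mean_control = weighted_mean([yi for yi, _ in control], [wi for _, wi in control])
    return mean_treated - mean_control


# Made-up toy data: 8 majority samples (true effect 1) and 2 minority samples (true effect 3).
groups = ["maj"] * 8 + ["min"] * 2
t = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
y = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 3.0, 0.0]

unweighted = weighted_diff_in_means(y, t, [1.0] * len(y))
reweighted = weighted_diff_in_means(y, t, inverse_frequency_weights(groups))
```

In this toy example the unweighted estimate is dominated by the majority group, while the reweighted estimate gives both groups equal influence.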
Learning Generative 3D Scene Layouts from a Single Image
Linan Zhao, Zeqing Yuan, Yunzhi Zhang, and 2 more authors
What is a scene, conceptually? It can be decomposed into multiple objects, their spatial arrangement, and the background. While recent works have pushed the boundary on modeling 3D objects, the scene layout, which indicates how objects are arranged in 3D space, remains under-explored. In this work, we build a generative model that learns the 3D scene layout distribution from a single 2D image, such as a photo of a parking lot containing several cars. We first retrieve the object geometry from segmented instances. Next, we build a permutation-equivariant model to generate layout parameters, which are combined with the geometry to render scene images. We then leverage a patch-based discriminator on 2D images along with auxiliary losses to guide layout learning. Experiments demonstrate that our model successfully learns a wide range of layout distributions, each from a single Internet image. Our method achieves superior results on multiple downstream tasks, including extrapolating to different numbers of instances and transferring learned layouts to other objects.
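As an illustration of what permutation equivariance means here, the sketch below implements a simple DeepSets-style layer in plain Python. It is not the architecture used in the paper: the function name, the coefficients `a` and `b`, and the toy features are assumptions. Each object's output depends on its own features plus the mean over all objects, so reordering the objects reorders the outputs in the same way.

```python
def equivariant_layer(xs, a=1.0, b=0.5):
    """Map per-object feature vectors xs to out_i = a * x_i + b * mean_j(x_j).

    The mean is invariant to object order, so the layer is permutation-equivariant.
    """
    n = len(xs)
    dim = len(xs[0])
    mean = [sum(x[d] for x in xs) / n for d in range(dim)]
    return [[a * x[d] + b * mean[d] for d in range(dim)] for x in xs]


# Hypothetical 2D features for three objects (e.g. three cars in a layout).
objects = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
perm = [2, 0, 1]

out = equivariant_layer(objects)
out_of_permuted = equivariant_layer([objects[i] for i in perm])
permuted_out = [out[i] for i in perm]
```

Applying the layer to the permuted objects gives the same result as permuting the layer's output, which is the property that lets a layout model treat object instances as an unordered set.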
2023
COMBOU: Leveraging Unlabeled Data in Conservative Offline Model-Based RL
Linan Zhao, Haozhuo Li, Rafael Rafailov, and 1 more author
Long-term unemployment has significant societal impact and is of particular concern for policymakers with regard to economic growth and public finances. This paper constructs advanced ensemble machine learning models to predict citizens’ risks of becoming long-term unemployed using data collected from European public authorities for employment services. The proposed model achieves 81.2% accuracy in identifying citizens at high risk of long-term unemployment. This paper also examines how to dissect black-box machine learning models by offering explanations at both a local and global level using SHAP, a state-of-the-art model-agnostic approach, to explain the factors that contribute to long-term unemployment. Lastly, this paper addresses an under-explored question in applying machine learning in the public domain: the inherent bias in model predictions. The results show that popular models such as gradient boosted trees may produce unfair predictions against senior age groups and immigrants. Overall, this paper sheds light on the growing shift among governments toward adopting machine learning models to profile and prioritize employment resources, with the aim of reducing the detrimental effects of long-term unemployment and improving public welfare.
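One simple way to surface the kind of prediction bias described above is to compare accuracy across demographic groups. The sketch below is an assumption-laden illustration, not the paper's evaluation: the function name, the group labels, and the toy labels and predictions are all made up.

```python
def accuracy_by_group(y_true, y_pred, groups):
    """Return the fraction of correct predictions within each group."""
    totals, correct = {}, {}
    for yt, yp, g in zip(y_true, y_pred, groups):
        totals[g] = totals.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + int(yt == yp)
    return {g: correct[g] / totals[g] for g in totals}


# Made-up labels and predictions for two hypothetical age groups.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
groups = ["under_50"] * 4 + ["50_plus"] * 4

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = per_group["under_50"] - per_group["50_plus"]
```

A large gap between groups suggests the model serves one group worse than another and is a signal to look more closely, for instance with per-group error rates.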