Spatial data are being collected at unprecedented scale and variety: the volume of Earth observation data at NASA alone is projected to reach 250 PB by 2025, and the number of GPS receivers surpassed 6 billion in 2021. Such datasets are at the core of decision-making across many critical sectors, including agriculture, transportation, and public health, and are broadly used to tackle some of the most pressing challenges, including climate change and the COVID-19 pandemic. While machine learning is important for automating the analysis of such massive datasets, direct applications of these methods often fall short due to the unique challenges posed by spatial data. This talk will focus on the common heterogeneity problem, where not only are data distributions non-stationary in space, but the footprints of those distributions are also unknown. We will discuss two model-agnostic frameworks that address this challenge from two different perspectives: performance (e.g., F1 scores) and fairness. The talk will conclude with a brief discussion of other challenges and emerging opportunities.
Yiqun Xie is an Assistant Professor in Geospatial Information Science at the University of Maryland, College Park. He received his PhD in Computer Science from the University of Minnesota. His research addresses challenges facing machine learning and data mining for spatial data. His current work focuses on: (1) heterogeneity-aware learning in space, (2) knowledge/physics-guided learning for data-sparse applications, and (3) fairness-aware learning for data with locations. His research is supported by NSF, NASA, Google, and Amazon, and has received several recognitions, including the Best Paper Awards from IEEE ICDM 2021 and SSTD 2019, and the Great Innovative Ideas by CRA’s Computing Community Consortium. More information is available at: https://terpconnect.umd.edu/~