Unsupervised learning concerns organizing and understanding patterns in datasets without access to an oracle that provides ground-truth labels. A primary objective of unsupervised learning is not merely to perform density estimation or generate realistic samples, but to uncover and characterize the latent structure inherent in observational data. Central to this objective is the challenge of modeling high-dimensional dependencies in a manner that admits interpretable or functionally meaningful representations. This task of factoring data into meaningful representations is known as \textit{disentanglement}. In this context, \textit{unsupervised disentanglement} has emerged as a critical problem: the task of isolating distinct generative factors of variation, corresponding to semantic concepts, directly from data, and encoding them into statistically and structurally orthogonal latent subspaces, all without recourse to supervision. Despite significant progress, it has been established that, in the absence of additional constraints, generative models may replicate the observed distribution yet fail to yield disentangled representations, a limitation fundamentally tied to identifiability issues analogous to those in nonlinear independent component analysis. We first discuss these theoretical foundations and then examine the role of inductive biases, which we investigate in this thesis as a necessary mechanism for achieving disentanglement in unsupervised settings.
Following the theoretical background, we discuss specific methodologies for embedding these inductive biases within generative neural network architectures, with particular emphasis on regularization strategies that systematically influence the structural formation of latent representations. In our first work, we introduce a novel approach that imposes the inductive bias of local isometry by leveraging self-supervised metric learning, thereby encouraging the preservation of local geometric structure within the learned representations. In our second work, building upon foundational concepts from sparse coding and discretization frameworks, we present a principled probabilistic approach that employs structured nonparametric priors to strengthen the inductive biases, thereby enhancing the interpretability of the learned representations.
Further, we propose strategies that systematically relax restrictive assumptions commonly imposed in prior frameworks, particularly those constraining representational flexibility and scalability, thereby rendering disentangled representation learning a more tractable and broadly applicable paradigm. Specifically, we leverage disentangled representations to achieve compositional generalization, enabling systematic extrapolation to novel combinations of learned latent factors and improving both sample efficiency and robustness in out-of-distribution scenarios. We also adopt tailored nonparametric prior structures that facilitate the continual incorporation of novel semantic factors within the representational framework, thereby substantially enhancing the scalability of the overall learning paradigm.
Vaishnavi Patil is a PhD student in the Department of Computer Science, studying under Professor Joseph JaJa. Her research draws on representation learning and probability theory, with a focus on the subfield of disentanglement. Disentanglement is concerned with guiding powerful artificial neural networks to learn meaningful representations, that is, representations that make sense to human observers. This problem has so far proven theoretically difficult even with powerful generative neural networks, and it requires new methods that go beyond established neural network techniques. She is specifically interested in unsupervised disentanglement, whose solution would allow us, for the first time, to sift through large amounts of data without supervised labels and make sense of the underlying patterns.

