Principled Models for Machine Learning
- Tan Nguyen (UCLA)
Designing deep learning models for practical machine learning applications is an art that often involves an expensive search over candidate architectures. We develop novel frameworks that facilitate this design process through three principled approaches: optimization, numerical analysis, and statistical modeling. First, using optimization as a tool, we develop new families of deep learning models that incorporate momentum methods to speed up convergence and improve the ability to capture long-range dependencies, in models such as deep neural networks and neural ordinary differential equations. Second, we employ tools from numerical analysis, such as diffusion with a source term, to develop new graph neural networks with better accuracy and efficiency. Finally, we develop new generative models that help explain the attention mechanism in transformers and suggest new directions for improving the performance of these models.
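As background for the momentum methods mentioned above, the following is a minimal sketch of the classical heavy-ball (Polyak) momentum iteration compared with plain gradient descent on an ill-conditioned quadratic. It is not the speaker's model: the test function, the function names, and the step-size and momentum values (the textbook tuning for quadratics) are all assumptions chosen for illustration. It only shows the optimization-side intuition, namely that carrying a velocity term across iterations can speed up convergence.

```python
from math import sqrt

# Illustrative objective (an assumption, not from the talk):
# f(x) = 0.5 * sum(a_i * x_i^2), with curvatures a_i spanning a wide range.
curvatures = [1.0, 100.0]
mu, L = min(curvatures), max(curvatures)


def grad(x):
    """Gradient of the quadratic: a_i * x_i in each coordinate."""
    return [a * xi for a, xi in zip(curvatures, x)]


def gradient_descent(x0, lr, steps):
    """Plain gradient descent: x <- x - lr * grad(x)."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


def heavy_ball(x0, lr, beta, steps):
    """Heavy-ball momentum: v <- beta * v - lr * grad(x); x <- x + v."""
    x = list(x0)
    v = [0.0] * len(x)
    for _ in range(steps):
        g = grad(x)
        v = [beta * vi - lr * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


def norm(x):
    """Euclidean distance from the minimizer at the origin."""
    return sqrt(sum(xi * xi for xi in x))


x0 = [1.0, 1.0]
steps = 100

# Textbook parameter choices for a quadratic with curvature in [mu, L].
gd_lr = 2.0 / (L + mu)
hb_lr = 4.0 / (sqrt(L) + sqrt(mu)) ** 2
hb_beta = ((sqrt(L) - sqrt(mu)) / (sqrt(L) + sqrt(mu))) ** 2

err_gd = norm(gradient_descent(x0, gd_lr, steps))
err_hb = norm(heavy_ball(x0, hb_lr, hb_beta, steps))

print(f"gradient descent error after {steps} steps: {err_gd:.3e}")
print(f"heavy-ball error after {steps} steps:       {err_hb:.3e}")
```

On this toy problem the heavy-ball iterate ends up much closer to the minimizer than plain gradient descent after the same number of steps. How such a recurrence is embedded into a network architecture is specific to the work described in the talk and is not reproduced here.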