
The puzzle of dimensionality and feature learning in modern Deep Learning and LLM


What:
Talk
Part of:
When:
1:30 PM, Monday, June 3, 2024 EDT (1 hour 30 minutes)
Theme:
Large Language Models & Understanding
Remarkable progress in AI, far surpassing the expectations of just a few years ago, is rapidly changing science and society. Never before has a technology been deployed so widely and so quickly with so little understanding of its fundamentals. I will argue that developing a mathematical theory of deep learning is necessary for a successful AI transition and, furthermore, that such a theory may well be within reach. I will discuss what such a theory might look like and some of the ingredients we already have available. At their core, modern models such as transformers implement traditional statistical models: high-order Markov chains. Yet it is not generally possible to estimate Markov models of that order given any feasible amount of data. These methods must therefore implicitly exploit low-dimensional structures present in the data, and those structures must in turn be reflected in the high-dimensional internal parameter spaces of the models. Thus, to build a fundamental understanding of modern AI, it is necessary to identify and analyze these latent low-dimensional structures. In this talk, I will discuss how deep neural networks of various architectures learn low-dimensional features, and how the lessons of deep learning can be incorporated into non-backpropagation-based algorithms that we call Recursive Feature Machines.
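The estimation barrier the abstract points to is easy to make concrete: an order-k Markov model needs one categorical distribution over the vocabulary for each of the V^k possible contexts, so its free-parameter count grows as V^k(V-1). A quick back-of-the-envelope computation (the vocabulary size is assumed for illustration, on the order of a modern LLM tokenizer):

```python
# Free parameters of an order-k Markov model over a vocabulary of size V:
# one distribution over V tokens (V - 1 free parameters) per V**k contexts.
V = 50_000  # assumed vocabulary size, for illustration only

for k in (1, 2, 4, 8):
    params = V**k * (V - 1)
    print(f"order {k}: about {params:.3e} free parameters")
```

Already at order 2 the count exceeds 10^13, and at order 8 it dwarfs any conceivable corpus, which is why such models can only work by exploiting low-dimensional structure rather than by direct estimation.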

 

References 

Adityanarayanan Radhakrishnan, Daniel Beaglehole, Parthe Pandit, and Mikhail Belkin, Mechanism for feature learning in neural networks and backpropagation-free machine learning models, Science 383 (6690), 2024.

Mikhail Belkin, Daniel Hsu, Siyuan Ma, and Soumik Mandal, Reconciling modern machine-learning practice and the classical bias–variance trade-off, PNAS 116 (32), 15849-15854, 2019.
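The first reference above attributes feature learning to the Average Gradient Outer Product (AGOP), and a Recursive Feature Machine alternates a kernel fit with an AGOP update of a learned metric. A minimal NumPy sketch under that reading; the Laplace-type kernel, bandwidth, ridge penalty, and finite-difference gradients are all illustrative choices, not the paper's exact recipe:

```python
import numpy as np

def mahalanobis_kernel(X, Z, M, bandwidth=1.0):
    """Laplace-type kernel k(x, z) = exp(-sqrt((x-z)^T M (x-z)) / bandwidth)."""
    diffs = X[:, None, :] - Z[None, :, :]                   # shape (n, m, d)
    sq = np.einsum('nmd,de,nme->nm', diffs, M, diffs)       # (x-z)^T M (x-z)
    return np.exp(-np.sqrt(np.maximum(sq, 0.0)) / bandwidth)

def rfm(X, y, n_iters=3, reg=1e-3, eps=1e-4):
    """Recursive Feature Machine sketch: alternate kernel ridge fits
    with Average Gradient Outer Product (AGOP) updates of the metric M."""
    n, d = X.shape
    M = np.eye(d)                                           # initial metric
    for _ in range(n_iters):
        K = mahalanobis_kernel(X, X, M)
        alpha = np.linalg.solve(K + reg * np.eye(n), y)     # kernel ridge fit
        # Freeze this iteration's metric and coefficients in the predictor.
        f = lambda Z, M=M, a=alpha: mahalanobis_kernel(Z, X, M) @ a
        # AGOP via central finite differences of the fitted predictor.
        G = np.zeros((n, d))
        for j in range(d):
            e = np.zeros(d)
            e[j] = eps
            G[:, j] = (f(X + e) - f(X - e)) / (2 * eps)
        M = G.T @ G / n                                     # updated metric
    return f, M

# Toy problem: the target depends only on the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(2 * X[:, 0])
f, M = rfm(X, y)
# np.diag(M) should now concentrate its mass on coordinate 0.
```

On this toy problem the learned metric recovers the single relevant direction, which is the low-dimensional feature learning the abstract describes, here without any backpropagation.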

Mikhail Belkin

Speaker

Jacob Weaver (Weaver213)

Speaker

