
Aishwarya Agrawal

Mila
Participating in 2 sessions

Aishwarya Agrawal is an Assistant Professor in the Department of Computer Science and Operations Research at the University of Montreal. She is also a Canada CIFAR AI Chair and a core academic member of Mila -- Quebec AI Institute, and she spends one day a week at Google DeepMind as a Research Scientist. Aishwarya completed her PhD at Georgia Tech in August 2019, working with Dhruv Batra and Devi Parikh.

Aishwarya's research interests lie at the intersection of computer vision, deep learning, and natural language processing, with the goal of developing artificial intelligence (AI) systems that can “see” (i.e., understand the contents of an image: who, what, where, doing what?) and “talk” (i.e., communicate that understanding to humans in free-form natural language).

Aishwarya is a recipient of the Canada CIFAR AI Chair Award, the Georgia Tech 2020 Sigma Xi Best Ph.D. Thesis Award, the Georgia Tech 2020 College of Computing Dissertation Award, a 2019 Google Fellowship (declined), a 2019-2020 Facebook Fellowship (declined), and a 2018-2019 NVIDIA Graduate Fellowship. She was one of the two runners-up for the 2019 AAAI/ACM SIGAI Dissertation Award and was also selected for Rising Stars in EECS 2018.

Talk
Multimodal Vision-Language Learning | June 14

Sessions in which Aishwarya Agrawal participates

Friday, June 14, 2024

Time zone: (GMT-05:00) Eastern Time (US & Canada)
1:30 PM EDT - 3:00 PM EDT | 1 hour 30 minutes
Large Language Models & Multimodal Grounding

Over the last decade, multimodal vision-language (VL) research has seen impressive progress. We can now automatically caption images in natural language, answer natural language questions about images, retrieve images using complex natural language queries, and even generate images given natural language descriptions.

Despite such tremendous progress, current VL research faces several challenges that limit the applicability of state-of-the-art VL systems. Even large VL systems based on multimodal l...