
Lessons from Computer Vision: We are (still!) not giving Data enough credit


What:
Talk
When:
1:30 PM, Wednesday, June 12, 2024 EDT (1 hour 30 minutes)
Theme:
Large Language Models & Multimodal Grounding
For most of Computer Vision’s existence, the focus has been solidly on algorithms and models, with data treated largely as an afterthought. Only recently has our discipline finally begun to appreciate the singularly crucial role played by data. In this talk, I will begin with some historical examples illustrating the importance of large visual data in both computer vision and human visual perception. I will then share some of our recent work demonstrating the power of very simple algorithms when used with the right data. Recent results in visual in-context learning, large vision models, and visual data attribution will be presented.

 

References

Bai, Y., Geng, X., Mangalam, K., Bar, A., Yuille, A., Darrell, T., Malik, J., & Efros, A. A. (2023). Sequential modeling enables scalable learning for large vision models. arXiv preprint arXiv:2312.00785.

Bar, A., Gandelsman, Y., Darrell, T., Globerson, A., & Efros, A. (2022). Visual prompting via image inpainting. Advances in Neural Information Processing Systems, 35, 25005-25017.

Pathak, D., Agrawal, P., Efros, A. A., & Darrell, T. (2017). Curiosity-driven exploration by self-supervised prediction. In International conference on machine learning (pp. 2778-2787). PMLR.

Alexei Efros

Speaker
