Cognitive Science Tools for Understanding the Behavior of Large Language Models

What:

Talk

Part of:

Day 10 | Large Language Models & Multimodal Grounding

When:

3:30 PM, Friday 14 Jun 2024 EDT (1 hour 30 minutes)

Theme:

Large Language Models & Understanding

Large language models have been found to have surprising capabilities, even what have been called “sparks of artificial general intelligence.” However, understanding these models involves some significant challenges: their internal structure is extremely complicated, their training data is often opaque, and getting access to the underlying mechanisms is becoming increasingly difficult. As a consequence, researchers often have to resort to studying these systems based on their behavior. This situation is, of course, one that cognitive scientists are very familiar with: human brains are complicated systems trained on opaque data and typically difficult to study mechanistically. In this talk I will summarize some of the tools of cognitive science that are useful for understanding the behavior of large language models. Specifically, I will talk about how thinking about different levels of analysis (and Bayesian inference) can help us understand some behaviors that don’t seem particularly intelligent, how tasks like similarity judgment can be used to probe internal representations, how axiom violations can reveal interesting mechanisms, and how associations can reveal biases in systems that have been trained to be unbiased.

References

Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T., Cao, Y., & Narasimhan, K. (2024). Tree of thoughts: Deliberate problem solving with large language models. Advances in Neural Information Processing Systems, 36.

Hardy, M., Sucholutsky, I., Thompson, B., & Griffiths, T. (2023). Large language models meet cognitive science: LLMs as tools, models, and participants. In Proceedings of the Annual Meeting of the Cognitive Science Society (Vol. 45, No. 45).

Tom Griffiths

Speaker
