Speaker Details

Petar Veličković

Google DeepMind

Bio:
Dr. Petar Veličković is a Staff Research Scientist at Google DeepMind, an Affiliated Lecturer at the University of Cambridge, and an Associate of Clare Hall, Cambridge. He holds a PhD in Computer Science from the University of Cambridge (Trinity College), obtained under the supervision of Pietro Liò. His research concerns geometric deep learning, devising neural network architectures that respect the invariances and symmetries in data (a topic on which he has co-written a proto-book). For his contributions, he is recognised as an ELLIS Scholar in the Geometric Deep Learning Program. In particular, he focuses on graph representation learning and its applications in algorithmic reasoning (featured in VentureBeat). He is the first author of Graph Attention Networks, a popular convolutional layer for graphs, and of Deep Graph Infomax, a popular self-supervised learning pipeline for graphs (featured in ZDNet). His research has been used to substantially improve travel-time predictions in Google Maps (featured in CNBC, Engadget, VentureBeat, CNET, The Verge and ZDNet), and to guide mathematicians' intuition towards new top-tier theorems and conjectures (featured in Nature, Science, Quanta Magazine, New Scientist, The Independent, Sky News, The Sunday Times, la Repubblica and The Conversation).

Keynote Title:
Embracing Multimodality in Neural Algorithmic Reasoning.

Tom Griffiths

Princeton University

Bio:
Dr. Tom Griffiths is the Henry R. Luce Professor of Information Technology, Consciousness and Culture in the Departments of Psychology and Computer Science at Princeton University. His research explores connections between human and machine learning, using ideas from statistics and artificial intelligence to understand how people solve the challenging computational problems they encounter in everyday life. Tom completed his PhD in Psychology at Stanford University in 2005, and taught at Brown University and the University of California, Berkeley before moving to Princeton. He has received awards for his research from organizations ranging from the American Psychological Association to the National Academy of Sciences, and is a co-author of the book Algorithms to Live By, which introduces ideas from computer science and cognitive science to a general audience.

Keynote Title:
Abstraction in Humans and Machines.

Keynote Abstract:
Machine learning has made great strides in creating systems that achieve high performance on tasks that were previously performed only by humans. However, are the solutions they find comparable to those that humans use? In this talk I will summarize recent work analyzing the behavior of deep neural networks performing multimodal tasks, which shows that these models fail to capture important abstractions that guide human performance on those tasks. I will also present some ideas on how we can better guide systems towards developing those abstractions.

Pushmeet Kohli

Google DeepMind

Bio:
Dr. Pushmeet Kohli, Vice President of Research (AI for Science, Reliable and Responsible AI), leads the science program at Google DeepMind, which uses AI to help accelerate scientific progress in areas ranging from genomics to quantum chemistry.

Pushmeet's team is responsible for AlphaFold, an AI system for predicting the 3D structure of proteins. The AlphaFold paper is one of the most cited AI biology papers ever, with over 15,000 citations. His team is also working on AI systems for materials discovery and nuclear fusion.

Before joining Google DeepMind, Pushmeet spent 10 years in research at Microsoft, rising to Director of Research of Microsoft's Cognition group. He has won a number of awards, including the British Machine Vision Association's Sullivan Doctoral Thesis Award, and is a member of the Association for Computing Machinery's (ACM) Distinguished Speaker Program.

He also leads research to ensure AI systems are safe, and was the UK government's nominee for the Responsible AI working group of the Global Partnership on AI. On Google DeepMind's Reliable and Responsible AI team, Pushmeet led the development of SynthID, a tool for watermarking and identifying AI-generated images.

Keynote Title:
TBD.

Keynote Abstract:
TBD.

Lijuan Wang

Microsoft GenAI

Bio:
Dr. Lijuan Wang is a Principal Researcher and Research Manager leading a multimodal generative AI research group within Microsoft GenAI. After earning her PhD from Tsinghua University, China, she joined Microsoft Research Asia in 2006 and moved to Microsoft Research in Redmond in 2016. Her research focuses on multimodal understanding and generation, spanning areas from 3D talking heads to vision-language pre-training, vision foundation models, and image/video generation. Her contributions to vision-language pre-training, image captioning, and object detection have been integral to the development of various Microsoft products, including Cognitive Services and Office 365. Her recent explorations of GPT-4V's advanced capabilities, contributions to the development of DALL-E 3, and work on multimodal agents have garnered significant attention.

Keynote Title:
Recent Advances in Multimodal Foundation Models.

Keynote Abstract:
Humans interact with the world through multiple modalities, naturally synchronizing and integrating diverse information. A key goal in artificial intelligence is to develop algorithms capable of understanding and generating multimodal content. Research in this area spans a broad range of tasks, from visual understanding (including image classification, image-text retrieval, image captioning, visual question answering, object detection, and various segmentation tasks) to visual generation (such as text-to-image and text-to-video generation). Recent advances include significant improvements in model capability and versatility, novel benchmarks for evaluating emergent capabilities, and a trend toward integrating understanding and generation. Influenced by the success of large-scale pre-training and large language models, the computer vision community is now emphasizing the development of general-purpose vision foundation models, moving from specialized models to versatile general-purpose assistants. This talk will explore cutting-edge learning and application strategies for multimodal foundation models. Topics include learning models for multimodal understanding and generation, benchmarking these models to evaluate emergent abilities in understanding and generation tasks, and developing advanced systems and agents based on vision foundation models.

Chelsea Finn

Stanford University

Bio:
Dr. Chelsea Finn is an Assistant Professor in Computer Science and Electrical Engineering at Stanford University, and the William George and Ida Mary Hoover Faculty Fellow. Her research interests lie in the capability of robots and other agents to develop broadly intelligent behavior through learning and interaction. To this end, her work has pioneered end-to-end deep learning methods for vision-based robotic manipulation, meta-learning algorithms for few-shot learning, and approaches for scaling robot learning to broad datasets. Her research has been recognized by awards such as the Sloan Fellowship, the IEEE RAS Early Academic Career Award, and the ACM Doctoral Dissertation Award, and has been covered by various media outlets, including The New York Times, Wired, and Bloomberg. Prior to joining Stanford, she received her Bachelor's degree in Electrical Engineering and Computer Science at MIT and her PhD in Computer Science at UC Berkeley.

Keynote Title:
TBD.

Keynote Abstract:
TBD.