Multimodal Algorithmic Reasoning (MAR)

In Conjunction with the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025

Music City Center, Nashville, TN, USA

June 11 or June 12, 2025 (exact time & date TBD)

About MAR 2025

In this workshop, we plan to gather researchers working in neural algorithmic learning, multimodal reasoning, and cognitive models of intelligence to showcase their cutting-edge research, discuss the latest challenges, and bring to the forefront problems in perception and language modeling that are often overlooked but are pivotal to achieving true artificial general intelligence. The workshop emphasizes the emerging topic of multimodal algorithmic reasoning, in which a reasoning agent must automatically deduce new algorithms or procedures for solving real-world tasks. Examples include algorithms that use multimodal foundation models for analysis, synthesis, and planning; new approaches to challenging vision-and-language mathematical (Olympiad-type) reasoning problems; winning strategies in multimodal games; and procedures for using tools in robotic manipulation. Through talks from outstanding researchers and faculty, we hope to dive deep into this topic at the intersection of multimodal learning and cognitive science, to understand what machine intelligence has achieved thus far and what it still lacks relative to the human way of thinking, and to inspire the audience to search for the missing rungs on the ladder to true intelligence.

Where

Music City Center, Nashville, TN, USA

When

June 11 or June 12, 2025 (exact time & date TBD)

Keynote Speakers

[More info about keynote speakers will be updated here]

Heng Ji

UIUC

Rishabh Agarwal

Google DeepMind

MAR 2025 Schedule

[in Nashville local time (CDT)]

[More info about the schedule will be updated here]

Call for Contributions

Deep learning–powered AI systems have rapidly advanced in their data modeling capabilities, yielding compelling applications that often seem to rival human intelligence. Despite these impressive achievements, questions remain about whether these systems possess the foundational elements of general intelligence, or whether they simply excel at task-specific computations without human-like understanding. Addressing these questions calls for new methods of both developing and assessing such models.

In this workshop, we aim to bring together researchers working in neural algorithmic learning, multimodal reasoning, and cognitive models of intelligence to showcase cutting-edge research, tackle current challenges, and highlight critical yet underexplored problems in perception and language modeling that lie at the core of achieving true artificial general intelligence. A key focus is the emerging field of multimodal algorithmic reasoning, which explores neural representations of algorithms to devise novel solutions for real-world tasks. These tasks span a wide range of areas, including multimodal alignment, algorithms over foundation models for analysis, synthesis, or planning, mathematical problem-solving, procedural learning in robotic manipulation, and more.

Our goal is to delve deeply into this exciting intersection of multimodal algorithmic learning and cognitive science, reflecting on the current progress in machine intelligence while examining the gaps that distinguish it from human cognition. Through talks by leading researchers and faculty, we aim to inspire participants to explore the "missing rungs" on the ladder to true intelligence.

We invite you to submit high-quality papers that propose innovative approaches, theoretical insights, or practical applications that advance this field and foster meaningful discussion and collaboration.



Important Dates

Paper submission deadline: March 12, 2025 (11:59pm PDT)
Rebuttal (optional): March 25-26, 2025
Notification to authors: April 3, 2025
Camera-ready deadline: April 7, 2025 (11:59pm PDT)

Topics

We invite submissions of high-quality research papers on topics related to multimodal algorithmic reasoning. The topics for MAR 2025 include, but are not limited to:
  • Multimodal machine reasoning.
  • Algorithmic reasoning in vision, including program synthesis, planning, and procedural learning.
  • Neural architectures and approaches for mathematical reasoning.
  • Architectures for aligning/integrating multimodal foundation models, including vision, language, audio, and 3D content.
  • Architectures for solving abstract multimodal reasoning/language-based IQ puzzles, e.g., using sketches, diagrams, or audio-visual clips.
  • New tasks, datasets, benchmarks, and models for multimodal reasoning including algorithmic reasoning, neuro-symbolic reasoning, abstract reasoning, and mathematical reasoning.
  • Extreme generalization to new tasks and few-shot concept induction.
  • Synthetic data and automatic verification for reasoning.
  • Multimodal agents, including programmable agents, tool-use agents, etc., for reasoning tasks.
  • Position papers on novel perspectives to understand AI and human problem solving.
  • Studies comparing AI and human problem-solving skills, including but not limited to:
    • Perspectives from psychology, neuroscience, and educational science.
    • Children's cognitive development.
    • Limitations of large vision-and-language models.

Submission Instructions

We have two tracks for paper submissions:
  1. Papers with IEEE/CVF workshop proceedings (≤ 8 pages)
  2. Papers without workshop proceedings (≤ 8 pages)
For track 1, we invite only original, previously unpublished papers; dual submissions are not allowed. Papers accepted to track 2 will not be included in the proceedings but will be publicly shared on the workshop website. Submissions to track 2 can be novel or ongoing work (limited to 4 pages) or accepted or previously published papers (limited to 8 pages). All page limits above exclude references.
  • All submissions are handled via the workshop’s CMT website.
  • Submissions should be made in PDF format and should follow the official CVPR 2025 template and guidelines.
  • Papers accepted in track 1 will be part of the CVPR 2025 workshop proceedings.
  • Authors may upload optional supplementary materials containing additional details, videos, images, etc. in a separate zip file (max 50MB); the deadline for supplementary materials is the same as that for the main paper.
  • All submissions should maintain author anonymity and should abide by the CVPR 2025 conference guidelines for double-blind review.
  • Accepted papers will be presented as oral, spotlight, or poster presentations. At least one author of each accepted submission must present the paper in person at the workshop.
  • Presentation of accepted papers at our workshop will follow the same policy as that for accepted papers at the CVPR 2025 main conference.
  • Accepted papers will be made publicly accessible on the workshop website shortly after the camera-ready deadline. CVPR 2025 will provide the official proceedings of the accepted papers.

Contact

Email: smart101@googlegroups.com

MAR 2025 Venue

Music City Center, Nashville, TN, USA

MAR 2025 will be held at Music City Center, Nashville, TN, USA on June 11 or June 12, 2025 (exact time & date TBD).

Organizers

[Contact Email: smart101@googlegroups.com]

Anoop Cherian

Mitsubishi Electric Research Laboratories (MERL)

Kuan-Chuan Peng

Mitsubishi Electric Research Laboratories (MERL)

Suhas Lohit

Mitsubishi Electric Research Laboratories (MERL)

Honglu Zhou

Salesforce AI Research

Tim Marks

Mitsubishi Electric Research Laboratories (MERL)