EgoLink Overview

Intro

EgoLink (Egocentric Language-Vision Interactive Network Knowledge Challenge) is designed to advance embodied AI in complex, real-world egocentric scenarios. Moving beyond traditional navigation and static object perception, the challenge holistically evaluates a model's capacity for social reasoning and interactive task execution. It asks models not only to perceive emotional cues, understand causal relationships, and predict behavioral intent in human interactions, but also to solve daily-life tasks through multimodal dialogue, dynamic tool use, and autonomous planning. EgoLink aims to foster integrated intelligence in which perception, reasoning, and decision-making are tightly coupled in unstructured social environments.

News

Challenge Tasks

Track 1: Social Reasoning in Egocentric Video. Rather than testing only navigation or object perception, this track evaluates emotional perception, causal understanding, behavioral intent prediction, and semantic summarization in real-world human interaction scenes.

The challenge is built on E3 (Exploring Embodied Emotion) and introduces a unified multiple-choice question (MCQ) protocol designed to be objective, reproducible, and accessible to both the multimodal learning and embodied AI communities.
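As a rough illustration of why an MCQ protocol is easy to score objectively, the sketch below computes plain accuracy over question IDs. The official Track 1 metric is not stated here, so plain accuracy, the function name `mcq_accuracy`, and the dictionary layout are all assumptions made for illustration only.

```python
def mcq_accuracy(predictions, answers):
    """Fraction of questions whose predicted option letter matches the gold one.

    Both arguments map a question ID to an option letter such as "A".
    This layout is hypothetical, not the challenge's submission format.
    """
    if not answers:
        raise ValueError("answers must contain at least one question")
    correct = 0
    for question_id, gold in answers.items():
        # A missing prediction counts as wrong; comparison ignores case and spaces.
        predicted = predictions.get(question_id, "")
        if predicted.strip().upper() == gold.strip().upper():
            correct += 1
    return correct / len(answers)


# Toy data: one of three questions is answered correctly, one is unanswered.
gold_answers = {"q1": "A", "q2": "B", "q3": "D"}
model_predictions = {"q1": "a", "q2": "C"}
score = mcq_accuracy(model_predictions, gold_answers)
```

Because each question has exactly one correct option, two independent runs of this kind of scorer on the same predictions give the same number, which is the reproducibility property the protocol aims for.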

Track 2: Interactive Agent Challenge: Multimodal Interaction Task Execution in Social Life Scenarios. This track evaluates how well an intelligent agent can solve real-world tasks in complex and dynamic social-life environments through tool use.

Unlike traditional static QA or single-modality recognition tasks, this track requires the model to act as an interactive agent. The agent receives first-person visual streams from high-frequency daily scenarios (e.g., shopping or ordering food), combines them with the user's natural-language instructions and the available external tools, and completes the user's goals through multi-turn dialogue, accurate tool invocation, autonomous planning, and closed-loop execution.

Beyond visual perception accuracy, this track primarily assesses the model's integrated intelligence in unstructured environments, where perception, reasoning, and action must be tightly coupled.
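The interactive loop described for Track 2 can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the challenge's actual agent interface: the names `run_agent`, `toy_policy`, and `price_lookup` are hypothetical, the policy is scripted rather than a real model, and the first-person visual stream is omitted so that only the instruction, tool calls, and observations are shown.

```python
def run_agent(policy, tools, instruction, max_turns=5):
    """Run a simple closed loop: decide, call a tool, observe, repeat.

    `policy` maps the dialogue history to one of two hypothetical actions:
    ("call", tool_name, argument) or ("final", answer, "").
    """
    history = ["user: " + instruction]
    for _ in range(max_turns):
        kind, name, argument = policy(history)
        if kind == "final":
            history.append("agent: " + name)
            return name, history
        tool = tools.get(name)
        # Feed tool output (or an error) back so the next decision can use it.
        observation = tool(argument) if tool else "error: unknown tool " + name
        history.append("tool[" + name + "]: " + observation)
    return None, history  # Turn budget exhausted without a final answer.


def toy_policy(history):
    """Scripted stand-in for a model: look up a price once, then answer."""
    last = history[-1]
    prefix = "tool[price_lookup]: "
    if last.startswith(prefix):
        return ("final", "The latte costs " + last[len(prefix):], "")
    return ("call", "price_lookup", "latte")


# Hypothetical tool registry with a single lookup tool.
tools = {"price_lookup": lambda item: {"latte": "4.50"}.get(item, "unknown")}
answer, history = run_agent(toy_policy, tools, "How much is a latte?")
```

Appending each observation to the history is what makes the execution closed-loop: every decision after the first is conditioned on the result of the previous tool call.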

Dataset

Track 1

The dataset builds upon E3 (Exploring Embodied Emotion), a pioneering large-scale egocentric video benchmark for embodied emotion understanding. While E3 provides foundational egocentric video data with emotion annotations, EgoLink transforms this resource into a comprehensive social reasoning benchmark through systematic re-annotation and task reformulation. The MCQ construction scheme draws on the methodology discussed in "Measuring Audio's Impact on Correctness: Audio-Contribution-Aware Post-Training of Large Audio Language Models".

We will provide the constructed MCQ training set, the MCQ validation set, and the original E3 annotation labels to participants for model training.

Track 2

This track provides benchmark data only and does not release a training set.

Evaluation

Track 1

Track 2

Competition Rules

Important Dates

Track 1

Track 2

Presentation policy

ACM Multimedia 2026 is an on-site event only. This means that all papers and contributions must be presented by a physical person on-site; remote presentations will not be hosted or allowed. Papers and contributions not presented on-site will be considered a no-show and removed from the proceedings of the conference. More details will be provided to handle unfortunate situations in which none of the authors would be able to attend the conference physically.

Organisers

Organising committee.

Jian Liu

Ant Group

Weiqiang Wang

Ant Group

Chang Yao

Zhejiang University

Jingyuan Chen

Zhejiang University

Challenge Chairs

Challenge chairs and core team.

Yueying Feng

Zhejiang University

Bohan Yu

Ant Group

Renhe Sun

Ant Group

Zitong Wang

Ant Group

Tong Niu

Ant Group

Yunqi Liu

Ant Group

Haolin He

The Chinese University of Hong Kong (CUHK)

Chang Han

Ant Group

Contact

License: CC BY-NC-SA 4.0 for non-commercial research and education usage.