NavTrust is a unified benchmark that systematically introduces realistic corruptions to RGB, depth, and instruction inputs for embodied navigation agents, covering both Vision-Language Navigation and Object-Goal Navigation tasks to evaluate robustness.

Original Post

NavTrust: Benchmarking Trustworthiness for Embodied Navigation

There are two major categories of embodied navigation: Vision-Language Navigation (VLN), where agents navigate by following natural language instructions, and Object-Goal Navigation (OGN), where agents navigate to a specified target object. However, existing work primarily evaluates model performance under nominal conditions, overlooking the corruptions that arise in real-world settings. To address this gap, we present NavTrust, a unified benchmark that systematically corrupts input modalities, including RGB, depth, and instructions, in realistic scenarios and evaluates the impact of these corruptions on navigation performance. To the best of our knowledge, NavTrust is the first benchmark that exposes embodied navigation agents to diverse RGB-Depth corruptions and instruction variations in a unified framework. Our extensive evaluation of seven state-of-the-art approaches reveals substantial performance degradation under realistic corruptions, which highlights critical robustness gaps and provides a roadmap toward more trustworthy embodied navigation systems. Furthermore, we systematically evaluate four distinct mitigation strategies to enhance robustness against RGB-Depth and instruction corruptions. Our base models include Uni-NaVid and ETPNav. We deployed them on a real mobile robot and observed improved robustness to corruptions. The project website is: https://navtrust.github.io.
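To make the idea of corrupting each input modality concrete, here is a minimal sketch in plain Python. It is not NavTrust's implementation: the abstract does not specify which corruptions are used, so the three corruption types (Gaussian pixel noise, depth dropout, word dropping) and every function name below are hypothetical choices made for illustration only. The last helper shows one common way to report degradation, as the relative drop in success rate between clean and corrupted runs.

```python
import random


def add_gaussian_noise(rgb, sigma, rng):
    """Hypothetical RGB corruption: add zero-mean Gaussian noise to a flat
    list of 0-255 pixel values and clamp the result back into range."""
    return [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in rgb]


def drop_depth(depth, drop_prob, rng):
    """Hypothetical depth corruption: mark each reading as missing (0.0)
    with probability drop_prob, mimicking sensor dropout."""
    return [0.0 if rng.random() < drop_prob else d for d in depth]


def drop_words(instruction, drop_prob, rng):
    """Hypothetical instruction corruption: remove each word with
    probability drop_prob; fall back to the original if all are removed."""
    words = instruction.split()
    kept = [w for w in words if rng.random() >= drop_prob]
    return " ".join(kept) if kept else instruction


def relative_drop(clean_sr, corrupted_sr):
    """Relative degradation of success rate under corruption."""
    if clean_sr == 0:
        return 0.0
    return (clean_sr - corrupted_sr) / clean_sr


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so corrupted inputs are repeatable
    print(add_gaussian_noise([0, 128, 255], sigma=25.0, rng=rng))
    print(drop_depth([1.2, 0.8, 3.5, 2.0], drop_prob=0.5, rng=rng))
    print(drop_words("walk past the sofa and stop at the door", 0.3, rng))
    # Placeholder success rates, not results from the paper.
    print(relative_drop(clean_sr=0.60, corrupted_sr=0.45))
```

In a real evaluation the corrupted observations and instructions would be fed to the navigation agent in place of the clean ones, and the success rates passed to `relative_drop` would come from those runs rather than from placeholder numbers.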

Source: arXiv
Authors: Huaide Jiang, Yash Chaudhary, Yuping Wang +8 more
Date: 2026-03-19
Relevance: 5
Topics: embodied_ai, navigation, benchmarking, vision_language
