You need to agree to share your contact information to access this dataset
This repository is publicly accessible, but you have to accept the conditions to access its files and content.
This dataset contains 350 (scene, floor) episodes × 32 RGB frames each =
11,200 JPG renderings of Matterport HM3D scenes, produced by
Habitat-Sim from the HM-EQA initial-pose CSV (Ren et al., 2024).
These renders are derivative works of HM3D and are subject to the
Matterport Academic EULA. Each frame depicts the interior of a real
scanned residential / commercial space; by downloading you accept the
full upstream license chain:
- HM3D (Matterport Academic EULA): the primary license governing the underlying mesh data. Accept at
  https://matterport.com/legal/matterport-end-user-license-agreement-academic-use
  (one-time signup via https://aihabitat.org/datasets/hm3d/).
- HM-EQA (Ren et al., 2024): provided the (scene, init_pose) metadata that defines each episode. Use must respect Stanford-ILIAD/explore-eqa's terms.
- QEMR (code repository): the renders were produced by scripts/render_hmeqa_frames.py in https://github.com/QEMR-2026/QEMR, which is MIT-licensed.
By requesting access you affirm:
- Use will be academic / non-commercial research only.
- You will not redistribute these renders or any derivative.
- You will cite HM3D, HM-EQA, and QEMR in any publication using
these frames.
QEMR — HM-EQA Pre-rendered Frames
Pre-rendered RGB frames for the 350 (scene, floor) episodes of HM-EQA, produced from HM3D via Habitat-Sim. They let you skip the rendering step entirely when re-running QEMR's VLM-inference pipeline: no Habitat-Sim install, no OpenGL system libraries, and no GPU for this step.
Companion datasets:
- QEMR-2026/qemr-traces — generation traces (analysis-only reproduction)
- QEMR-2026/QEMR — code repository
What's in here (350 episodes · 11,200 JPGs · 647 MB)
QEMR-2026/qemr-frames/
├── README.md, LICENSE
├── 00004-VqCaAuuoeWk_1/ ← <scene_id>_<floor> per episode
│ ├── manifest.json ← rendering metadata: init_pose, seed, FOV, ...
│ ├── frame_000.jpg ← agent at HM-EQA's specified init pose
│ ├── frame_001.jpg ← random navigable point on the same floor
│ ├── ...
│ └── frame_031.jpg
└── ... (350 episodes total)
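Since each episode directory is named `<scene_id>_<floor>`, the two parts can be recovered by splitting at the last underscore. The following is a minimal sketch: `EpisodeId` and `parse_episode_dir` are hypothetical helper names, not part of the QEMR code, and it assumes the floor index is a non-negative integer.

```python
from typing import NamedTuple


class EpisodeId(NamedTuple):
    scene: str  # e.g. "00004-VqCaAuuoeWk"
    floor: int  # e.g. 1


def parse_episode_dir(name: str) -> EpisodeId:
    """Split '<scene_id>_<floor>' at the last underscore.

    Assumption: the floor suffix is a non-negative integer.
    """
    scene, sep, floor = name.rpartition("_")
    if not sep or not scene or not floor.isdigit():
        raise ValueError(f"not an episode directory name: {name!r}")
    return EpisodeId(scene=scene, floor=int(floor))


episode = parse_episode_dir("00004-VqCaAuuoeWk_1")
print(episode.scene, episode.floor)
```

Splitting from the right keeps the sketch working even if a scene ID were to contain an underscore itself.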
Each frame_NNN.jpg is a 640×360 RGB image rendered with a 90° horizontal field of view (hFoV), a camera tilt of −30°, and an agent height of 1.5 m. These are the same intrinsics used for the QEMR paper's main result table.
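If a downstream step needs explicit camera intrinsics, they can be derived from the image size and hFoV given above. This sketch assumes an ideal pinhole camera with square pixels and the principal point at the image centre. Neither assumption is stated in this card, and `pinhole_intrinsics` is a hypothetical helper name.

```python
import math


def pinhole_intrinsics(width: int, height: int, hfov_deg: float) -> dict:
    """Derive pinhole parameters from image size and horizontal FoV.

    Assumptions (not stated in the dataset card): square pixels and a
    principal point at the image centre.
    """
    # Half the image width subtends half the horizontal field of view.
    fx = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    fy = fx  # square-pixel assumption
    vfov_deg = math.degrees(2.0 * math.atan((height / 2.0) / fy))
    return {
        "fx": fx,
        "fy": fy,
        "cx": width / 2.0,
        "cy": height / 2.0,
        "vfov_deg": vfov_deg,
    }


params = pinhole_intrinsics(640, 360, 90.0)
print(params)
```

For the values in this card, the focal length comes out at about 320 pixels and the vertical field of view at roughly 58.7°.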
manifest.json per episode (verbatim from scripts/render_hmeqa_frames.py):
{
  "scene": "00004-VqCaAuuoeWk",
  "floor": 1,
  "init_position": [1.55, 0.14, -3.00],
  "init_angle_rad": 0.0954,
  "n_frames_requested": 32,
  "n_frames_saved": 32,
  "img_width": 640, "img_height": 360,
  "hfov_deg": 90.0, "sensor_height": 1.5, "camera_tilt_deg": -30.0,
  "frames": [{"frame_index": 0, "file": "frame_000.jpg", "position": [...], "yaw_rad": ..., "camera_tilt_deg": -30.0}, ...]
}
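The manifest can also serve as a consistency check for a single episode directory. The sketch below relies only on the `n_frames_saved`, `frames`, and `file` keys shown above. `check_episode` is a hypothetical helper, not a function from the QEMR repository.

```python
import json
from pathlib import Path


def check_episode(episode_dir) -> list:
    """Return a list of human-readable problems found in one episode dir.

    Uses only manifest keys shown in the dataset card:
    'n_frames_saved', 'frames', and each frame entry's 'file'.
    """
    episode_dir = Path(episode_dir)
    manifest = json.loads((episode_dir / "manifest.json").read_text())

    problems = []
    frames = manifest["frames"]
    if len(frames) != manifest["n_frames_saved"]:
        problems.append(
            f"manifest lists {len(frames)} frames, "
            f"n_frames_saved is {manifest['n_frames_saved']}"
        )
    for entry in frames:
        if not (episode_dir / entry["file"]).is_file():
            problems.append(f"missing file: {entry['file']}")
    return problems
```

An empty return value means every frame listed in the manifest is present on disk.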
How to use
# 1. Get access via "Request access" above.
# 2. After approval, download into the layout QEMR's pipeline expects.
#    The repo is gated, so authenticate first (e.g. `huggingface-cli login`).
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="QEMR-2026/qemr-frames",
    repo_type="dataset",
    local_dir="data/processed/hmeqa_frames",
    allow_patterns=["*/manifest.json", "*/frame_*.jpg"],  # skip README/LICENSE
)
# 3. Now any QEMR script that reads `data/processed/hmeqa_frames/` works
#    without ever installing habitat-sim.
Downstream scripts that read these frames (from the code repo):
- scripts/run_hmeqa_vlm_baseline.py: VLM inference (Qwen-VL / InternVL3)
- scripts/export_semantic_crop_manifest_hmeqa.py: OWL-ViT semantic crops
- scripts/compute_uvlm_answer_verification_hmeqa.py: VLM answer verification
You still need a GPU and VLM weights to run inference; the frames only remove the Habitat-Sim dependency.
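Before launching any of the downstream scripts, it can be worth confirming that the download is complete. The sketch below counts frames per episode in a local copy and compares them against the 350 × 32 figures stated in this card. `summarize_frames` and `incomplete_episodes` are hypothetical helper names, and the directory layout is the one shown earlier.

```python
from pathlib import Path

EXPECTED_EPISODES = 350           # figure taken from this dataset card
EXPECTED_FRAMES_PER_EPISODE = 32  # figure taken from this dataset card


def summarize_frames(root) -> dict:
    """Map each episode directory name to its number of frame_*.jpg files."""
    counts = {}
    for manifest in sorted(Path(root).glob("*/manifest.json")):
        episode_dir = manifest.parent
        counts[episode_dir.name] = len(list(episode_dir.glob("frame_*.jpg")))
    return counts


def incomplete_episodes(counts: dict,
                        expected: int = EXPECTED_FRAMES_PER_EPISODE) -> list:
    """Return the names of episodes whose frame count differs from `expected`."""
    return sorted(name for name, n in counts.items() if n != expected)
```

After a full download into `data/processed/hmeqa_frames`, `summarize_frames` should return `EXPECTED_EPISODES` entries and `incomplete_episodes` should return an empty list.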
Provenance
Rendered on May 21, 2026 on a single NVIDIA RTX 5090 (habitat-sim 0.3.3 +
withbullet from the conda-forge / aihabitat channels). Per-episode rendering
parameters are recorded in each manifest.json. The full pipeline is
reproducible with scripts/render_hmeqa_frames.py from the code repo, provided
you accept the HM3D EULA and have habitat-sim set up.
License
See LICENSE. Headline:
- The rendering decisions / code (which init pose, camera params, manifest.json schema) are QEMR's MIT-licensed contribution.
- The pixel content of each .jpg derives from HM3D scenes and retains the Matterport Academic EULA.
Citation
@inproceedings{qemr,
title = {Query-Conditioned Evidence-Mode Routing for Calibrated Embodied RAG},
author = {Anonymous},
year = {2026},
note = {Under review at CoRL 2026; replace with full author block on acceptance}
}