This dataset contains 350 (scene, floor) episodes × 32 RGB frames each = 11,200 JPG renderings of Matterport HM3D scenes, produced by Habitat-Sim from the HM-EQA initial-pose CSV (Ren et al., 2024).

These renders are derivative works of HM3D and are subject to the Matterport Academic EULA. Each frame depicts the interior of a real scanned residential / commercial space; by downloading you accept the full upstream license chain:

  1. HM3D (Matterport Academic EULA) — primary license governing
    the underlying mesh data. Accept at
    https://matterport.com/legal/matterport-end-user-license-agreement-academic-use
    (one-time signup via https://aihabitat.org/datasets/hm3d/).
  2. HM-EQA (Ren et al., 2024) — provided the (scene, init_pose)
    metadata that defines each episode. Use must respect
    Stanford-ILIAD/explore-eqa's terms.
  3. QEMR (code repository) — produced by scripts/render_hmeqa_frames.py
    in https://github.com/QEMR-2026/QEMR, MIT-licensed.

By requesting access you affirm:

  • Use will be academic / non-commercial research only.
  • You will not redistribute these renders or any derivative.
  • You will cite HM3D, HM-EQA, and QEMR in any publication using
    these frames.

QEMR — HM-EQA Pre-rendered Frames

Pre-rendered RGB frames for the 350 (scene, floor) episodes of HM-EQA, produced from HM3D via Habitat-Sim. Lets you skip the rendering step entirely when re-running QEMR's VLM-inference pipeline — no Habitat-Sim install, no OpenGL system libs, no GPU for this step.

Companion datasets:


What's in here (350 episodes · 11,200 JPGs · 647 MB)

QEMR-2026/qemr-frames/
├── README.md, LICENSE
├── 00004-VqCaAuuoeWk_1/        ← <scene_id>_<floor> per episode
│   ├── manifest.json           ← rendering metadata: init_pose, seed, FOV, ...
│   ├── frame_000.jpg           ← agent at HM-EQA's specified init pose
│   ├── frame_001.jpg           ← random navigable point on the same floor
│   ├── ...
│   └── frame_031.jpg
└── ... (350 episodes total)

Each frame_NNN.jpg: 640×360 RGB, 90° hFoV, camera tilt −30°, sensor height 1.5 m. These are the same camera settings used for the QEMR paper's main result table.
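If you need a pixel-space intrinsics matrix for these frames (for example to back-project crops), the resolution and hFoV above are enough under a simple pinhole model. The sketch below is an illustration, not code from the QEMR repo: it assumes square pixels, a principal point at the image center, and that the 90° figure is the horizontal field of view across the 640 px width. The helper names `pinhole_intrinsics` and `vertical_fov_deg` are hypothetical.

```python
import math


def pinhole_intrinsics(width, height, hfov_deg):
    """Return (fx, fy, cx, cy) in pixels.

    Assumes an ideal pinhole camera with square pixels and a centered
    principal point; hfov_deg is taken to span the image width.
    """
    fx = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    fy = fx  # square-pixel assumption
    cx, cy = width / 2.0, height / 2.0
    return fx, fy, cx, cy


def vertical_fov_deg(height, fy):
    """Vertical field of view implied by the image height and fy."""
    return math.degrees(2.0 * math.atan((height / 2.0) / fy))


fx, fy, cx, cy = pinhole_intrinsics(640, 360, 90.0)
vfov = vertical_fov_deg(360, fy)
print(f"fx={fx:.1f} px, vFoV={vfov:.1f} deg")  # fx=320.0 px, vFoV=58.7 deg
```

Under these assumptions the focal length is 320 px and the vertical field of view is about 58.7°. The −30° tilt and 1.5 m height are extrinsic parameters and do not enter this calculation.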

manifest.json per episode (verbatim from scripts/render_hmeqa_frames.py):

{
  "scene": "00004-VqCaAuuoeWk",
  "floor": 1,
  "init_position": [1.55, 0.14, -3.00],
  "init_angle_rad": 0.0954,
  "n_frames_requested": 32,
  "n_frames_saved": 32,
  "img_width": 640, "img_height": 360,
  "hfov_deg": 90.0, "sensor_height": 1.5, "camera_tilt_deg": -30.0,
  "frames": [{"frame_index": 0, "file": "frame_000.jpg", "position": [...], "yaw_rad": ..., "camera_tilt_deg": -30.0}, ...]
}
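A minimal sketch of reading one manifest and checking it against the files on disk. It relies only on keys shown in the example above (`n_frames_saved`, `frames`, `frame_index`, `file`); the assumption that `frame_index` counts up from 0 in list order is inferred from that example rather than stated by the schema, and the helper names are hypothetical.

```python
import json
from pathlib import Path


def load_manifest(episode_dir):
    """Read manifest.json from one episode directory."""
    with open(Path(episode_dir) / "manifest.json", encoding="utf-8") as f:
        return json.load(f)


def manifest_problems(manifest, episode_dir=None):
    """Return a list of inconsistencies; an empty list means none were found."""
    problems = []
    frames = manifest.get("frames", [])
    if manifest.get("n_frames_saved") != len(frames):
        problems.append(
            f"n_frames_saved={manifest.get('n_frames_saved')} "
            f"but {len(frames)} frame entries"
        )
    for i, frame in enumerate(frames):
        # Assumption: frame_index is sequential and starts at 0.
        if frame.get("frame_index") != i:
            problems.append(f"entry {i} has frame_index={frame.get('frame_index')}")
        # Only check the file on disk when a directory was given.
        if episode_dir is not None:
            if not (Path(episode_dir) / frame.get("file", "")).is_file():
                problems.append(f"missing file {frame.get('file')}")
    return problems


# Runs only if the frames were downloaded to the path used in this card.
episode = Path("data/processed/hmeqa_frames/00004-VqCaAuuoeWk_1")
if (episode / "manifest.json").is_file():
    issues = manifest_problems(load_manifest(episode), episode)
    print(issues if issues else "manifest OK")
```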

How to use

# 1. Get access via "Request access" above.
# 2. Authenticate so the gated download can use your token,
#    e.g. run `huggingface-cli login` once in a shell.
# 3. After approval, download into the layout QEMR's pipeline expects:
from huggingface_hub import snapshot_download
snapshot_download(
    repo_id="QEMR-2026/qemr-frames",
    repo_type="dataset",
    local_dir="data/processed/hmeqa_frames",
    allow_patterns=["*/manifest.json", "*/frame_*.jpg"],   # skip README/LICENSE
)
# 4. Now any QEMR script that reads `data/processed/hmeqa_frames/` works
#    without ever installing habitat-sim.
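After the download finishes, a quick count can confirm nothing was truncated. This is a hedged sketch using only the standard library; `count_frames` and `incomplete_episodes` are hypothetical helpers, not part of the QEMR code repo, and they treat any subdirectory containing a manifest.json as an episode.

```python
from pathlib import Path


def count_frames(root):
    """Map each episode directory name to its number of frame_*.jpg files."""
    counts = {}
    for episode in sorted(Path(root).iterdir()):
        # Treat a subdirectory as an episode only if it has a manifest.
        if episode.is_dir() and (episode / "manifest.json").is_file():
            counts[episode.name] = len(list(episode.glob("frame_*.jpg")))
    return counts


def incomplete_episodes(counts, expected=32):
    """List episodes whose frame count differs from the expected value."""
    return [name for name, n in counts.items() if n != expected]


# Runs only if the frames were downloaded to the path used above.
root = Path("data/processed/hmeqa_frames")
if root.is_dir():
    counts = count_frames(root)
    print(f"{len(counts)} episodes, {sum(counts.values())} frames")
    print("incomplete:", incomplete_episodes(counts))
```

Going by the figures in this card, a complete download should report 350 episodes and 11,200 frames, with no incomplete episodes.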

Downstream scripts that read these frames (from the code repo):

  • scripts/run_hmeqa_vlm_baseline.py — VLM inference (Qwen-VL / InternVL3)
  • scripts/export_semantic_crop_manifest_hmeqa.py — OWL-ViT semantic crops
  • scripts/compute_uvlm_answer_verification_hmeqa.py — VLM answer verification

You still need GPU + VLM weights to run inference; the frames just remove the Habitat-Sim dependency.


Provenance

Rendered on May 21, 2026 on a single NVIDIA RTX 5090 (habitat-sim 0.3.3 + withbullet from the conda-forge / aihabitat channels). Per-episode rendering parameters are recorded in each manifest.json. The full pipeline is reproducible with scripts/render_hmeqa_frames.py from the code repo, provided you accept the HM3D EULA and have habitat-sim set up.


License

See LICENSE. Headline:

  • The rendering decisions / code (which init pose, camera params, manifest.json schema) are QEMR's MIT-licensed contribution.
  • The pixel content of each .jpg derives from HM3D scenes and retains the Matterport Academic EULA.

Citation

@inproceedings{qemr,
  title  = {Query-Conditioned Evidence-Mode Routing for Calibrated Embodied RAG},
  author = {Anonymous},
  year   = {2026},
  note   = {Under review at CoRL 2026; replace with full author block on acceptance}
}

Please also cite HM3D (Ramakrishnan et al.) and HM-EQA (Ren et al.).
