We present SAPA-Bench, a large-scale benchmark of 7,138 scenarios that measures privacy awareness in MLLM-powered smartphone agents along five dimensions: privacy recognition, localization, category, sensitivity level, and risk response. We define five metrics (PRR, PLR, PLAR, PCAR, RA) and evaluate seven representative agents.
Figure A — Introduction
Figure B — Benchmark
Figure C — Data Construction
Figure D — Risk Awareness by Model
@article{lin2025sapa,
  title   = {Mind the Third Eye! Benchmarking Privacy Awareness in MLLM-powered Smartphone Agents},
  author  = {Lin, Zhixin and Li, Jungang and Pan, Shidong and Shi, Yibo and Yao, Yue and Xu, Dongliang},
  journal = {arXiv preprint arXiv:2508.19493},
  year    = {2025}
}