3D world captured by humans and AI｜Exhibition Program｜NTT Communication Science Laboratories OPEN HOUSE 2026

Exhibition Program

Media Information Science

11	3D world captured by humans and AI: Comparing depth estimation bias using large-scale human data

Abstract

Humans can naturally estimate 3D structures from 2D images, and recent advances in artificial intelligence (AI) have enabled physical devices to develop similar abilities. Our research investigates whether these systems rely on the same visual cues as humans in depth estimation. To this end, we collected large-scale human-annotated data for indoor and outdoor images and compared them with predictions from various AI models. Our results show that many AI models exhibit estimation biases similar to those of humans (e.g., perceiving distant objects as closer than they physically are). Additionally, we identify an accuracy-similarity trade-off: highly accurate AI models often behave less like humans. By precisely modeling human-like error patterns, our work contributes to the development of AI models that better align with human perception. This may support safer and more intuitive applications, such as remote robot operation, where visual misunderstandings can lead to accidents.

References

[1] Y. Kubota, T. Fukiage, “Human-like monocular depth biases in deep neural networks,” PLOS Computational Biology, Vol. 21, No. 8, e1013020, 2025.

[2] Y. Kubota, T. Fukiage, “Accuracy does not guarantee human-likeness in monocular depth estimators,” arXiv, 2512.08163, 2025.

[3] Y. Kubota, T. Fukiage, “Benchmarking human and DNN biases in monocular depth estimation,” under review, 2026.

Contact

Yuki Kubota, Sensory Representation Research Group, Human Information Science Laboratory
