Exhibition Program
Media Information Science
11

3D world captured by humans and AI

Comparing depth estimation bias using large-scale human data

Abstract

Humans naturally estimate 3D structure from 2D images, and recent advances in artificial intelligence (AI) have enabled physical devices to acquire similar abilities. Our research investigates whether these systems rely on the same visual cues as humans when estimating depth. To this end, we collected large-scale human-annotated data for indoor and outdoor images and compared it with predictions from various AI models. Our results show that many AI models exhibit estimation biases similar to those of humans (e.g., perceiving distant objects as closer than they physically are). We also identify an accuracy-similarity trade-off: highly accurate AI models often behave less like humans. By precisely modeling human-like error patterns, our work contributes to the development of AI models that better align with human perception. This may support safer and more intuitive applications, such as remote robot operation, where visual misunderstandings can lead to accidents.

References

[1] Y. Kubota, T. Fukiage, “Human-like monocular depth biases in deep neural networks,” PLOS Computational Biology, Vol. 21, No. 8, e1013020, 2025.

[2] Y. Kubota, T. Fukiage, “Accuracy does not guarantee human-likeness in monocular depth estimators,” arXiv, 2512.08163, 2025.

[3] Y. Kubota, T. Fukiage, “Benchmarking human and DNN biases in monocular depth estimation,” under review, 2026.

Contact

Yuki Kubota, Sensory Representation Research Group, Human Information Science Laboratory
