AutoPhoto: Aesthetic Photo Capture using Reinforcement Learning
Hadi AlZayer
Hubert Lin
Kavita Bala
Left: photo of AutoPhoto deployed on a Jackal UGV. Right: photos captured by AutoPhoto.


Capturing a well-composed photo is difficult and takes years of experience to master. We propose a novel pipeline for an autonomous agent to automatically capture an aesthetic photograph by navigating within a local region in a scene. Instead of classical optimization over heuristics such as the rule of thirds, we adopt a data-driven aesthetics estimator to assess photo quality. A reinforcement learning framework is used to optimize the model with respect to the learned aesthetics metric. We train our model in simulation with indoor scenes, and we demonstrate that our system can capture aesthetic photos in both simulation and real-world environments on a ground robot. To our knowledge, this is the first system that can automatically explore an environment to capture an aesthetic photo with respect to a learned aesthetic estimator.




AutoPhoto is composed of an aesthetics model that extracts features from the current view, a common MLP+LSTM backbone that processes these features, and two separate layers that parameterize the actor and the critic. The actor selects an action to take, and the critic estimates the value of the current state. We iteratively run multiple episodes to sample action and state-value pairs, which are used to optimize the model parameters.
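The actor-critic update described above can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the function names (`sample_action`, `discounted_returns`, `actor_critic_losses`), the discount factor `gamma`, and the use of a plain advantage (return minus critic value) are all assumptions made for illustration. The page does not specify the exact loss or reward used. In a real system the logits and values would come from the actor and critic layers on top of the MLP+LSTM backbone, and the losses would be minimized with automatic differentiation.

```python
import math
import random


def softmax(logits):
    """Convert actor logits into action probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, rng):
    """Sample an action index from the actor's distribution.

    Returns the action and its log-probability, which the actor loss needs.
    """
    probs = softmax(logits)
    action = rng.choices(range(len(probs)), weights=probs)[0]
    return action, math.log(probs[action])


def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t for every step of one episode."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def actor_critic_losses(log_probs, values, rewards, gamma=0.99):
    """Compute actor and critic losses for one sampled episode.

    log_probs: log-probability of each action the actor took
    values:    critic's state-value estimate at each step
    rewards:   reward received at each step (hypothetical here; for example,
               a score derived from the learned aesthetics estimator)
    """
    returns = discounted_returns(rewards, gamma)
    # Advantage: how much better the outcome was than the critic predicted.
    # When differentiating the actor loss, it is treated as a constant.
    advantages = [g - v for g, v in zip(returns, values)]
    n = len(rewards)
    actor_loss = -sum(lp * a for lp, a in zip(log_probs, advantages)) / n
    critic_loss = sum(a * a for a in advantages) / n
    return actor_loss, critic_loss
```

A usage example under the same assumptions: `sample_action([0.1, 0.5, -0.2], random.Random(0))` draws one of three hypothetical actions, and `actor_critic_losses` is then called once per episode on the collected log-probabilities, value estimates, and rewards.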


Supplementary Video


H. AlZayer, H. Lin, K. Bala
AutoPhoto: Aesthetic Photo Capture using Reinforcement Learning
IROS, 2021.
(hosted on arXiv)


