Detailed Human Shape and Pose from Images
Automatically measuring human body shape and estimating body pose from images is central to many practical applications. While the problem is difficult in general, it can be made tractable by employing simplifying assumptions and relying on domain-specific knowledge, or by engineering the environment appropriately.
In this thesis we demonstrate that using a data-driven model of the human body supports the recovery of both human shape and articulated pose from images, and has many benefits over previous body models. Specifically, we represent the body using SCAPE, a low-dimensional, but detailed, parametric model of body shape and pose deformations. We show that the parameters of the SCAPE model can be estimated directly from image data in a variety of imaging conditions and present a series of techniques enabled by this model.
We first consider the case of multiple calibrated and synchronized camera views and assume the subject wears tight-fitting clothing. We define a cost function between image silhouettes and a hypothesized mesh and formulate the problem as an optimization over the body shape and pose parameters. Second, we relax the tight-fitting clothing assumption and develop a robust method that accounts for the fact that observed silhouettes of clothed people provide only weak constraints on the true shape. Our approach is to accumulate many weak silhouette constraints while observing the subject in various poses and combine them with strong constraints from regions detected as skin and with a prior expectation of typical shapes to infer the most likely shape under clothing. Third, we consider scenes with strong lighting and show that a point light source and the shadow of the body cast on the ground provide an additional view equivalent to a silhouette from an actual camera. This approach effectively reduces the number of cameras needed for successful recovery of the body model by taking advantage of the lighting information in the scene. Results on a novel database of thousands of images of clothed and "naked" subjects, as well as sequences from the HumanEva dataset, suggest these methods may be accurate enough for biometric shape analysis in video.
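Two of the ingredients described above can be illustrated with a minimal sketch. It is not the cost function or shadow model used in the thesis. It assumes binary silhouette masks stored as nested lists, a ground plane at z = 0, and a point light at a known 3D position. The names `silhouette_mismatch` and `cast_shadow` are hypothetical. The first function scores how badly a hypothesized silhouette disagrees with an observed one; the second shows why a cast shadow behaves like an extra camera view: each shadow point is the central projection of a body point through the light position onto the ground.

```python
def silhouette_mismatch(observed, predicted):
    """Count pixels where two same-sized binary masks disagree.

    A stand-in for a silhouette cost: 0 means the hypothesized
    silhouette matches the observed one exactly. (Assumed form;
    the thesis may use a different, e.g. distance-based, measure.)
    """
    if len(observed) != len(predicted):
        raise ValueError("masks must have the same number of rows")
    total = 0
    for row_obs, row_pred in zip(observed, predicted):
        if len(row_obs) != len(row_pred):
            raise ValueError("masks must have the same number of columns")
        total += sum(1 for o, p in zip(row_obs, row_pred) if bool(o) != bool(p))
    return total


def cast_shadow(point, light):
    """Project a 3D point onto the ground plane z = 0 from a point light.

    The ray from the light through the point is extended until it hits
    the ground. This is a central projection with the light as the
    center, which is the same geometry as a pinhole camera.
    """
    px, py, pz = point
    lx, ly, lz = light
    if lz <= pz:
        raise ValueError("the light must be above the point")
    t = lz / (lz - pz)  # ray parameter at which z reaches 0
    return (lx + t * (px - lx), ly + t * (py - ly))


if __name__ == "__main__":
    observed = [[0, 1, 1],
                [0, 1, 0]]
    predicted = [[0, 1, 0],
                 [1, 1, 0]]
    print(silhouette_mismatch(observed, predicted))
    print(cast_shadow((1.0, 1.0, 2.0), (0.0, 0.0, 4.0)))
```

In an optimization over shape and pose, a score like `silhouette_mismatch` would be summed over all available views, and under this sketch a shadow silhouette rendered with `cast_shadow` would simply contribute one more term to that sum.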