Pedestrian Pose, Gait and Trajectory Prediction

In applications such as autonomous driving and human-robot interaction, it is important to understand, infer, and anticipate the future behavior of pedestrians.

We have been investigating and developing novel computer vision and machine learning methods for pedestrian pose, gait, and trajectory prediction. Notably:

Bio-LSTM

This work focuses on forecasting a 3D mesh representation of pedestrian walking pose and gait. We developed a novel biomechanically inspired recurrent neural network (Bio-LSTM) that predicts the location and 3D articulated body pose of pedestrians in a global coordinate frame, given the (possibly inaccurate) 3D poses and locations estimated in prior frames. The pedestrian 3D body pose is represented with the Skinned Multi-Person Linear (SMPL) model, so the prediction output is a full 3D mesh. We presented prediction results on the PedX dataset, a large-scale, in-the-wild data collection at complex urban intersections in Ann Arbor, MI, USA. Results show that the proposed network can successfully learn the characteristics of pedestrian gait and produce accurate and consistent 3D pose predictions.
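
To make the setup concrete, here is a minimal sketch of sequence-to-next-frame pose forecasting with an LSTM. It is not the published Bio-LSTM architecture (which adds biomechanical constraints); the layer sizes are illustrative, and the 75-dimensional input simply stacks the 72 SMPL axis-angle pose parameters with a 3D global translation.

```python
# Minimal sketch (not the published Bio-LSTM): an LSTM consumes a window of
# per-frame SMPL parameters (72 axis-angle values + 3D global translation)
# and regresses the parameters of the next frame. Sizes are illustrative.
import torch
import torch.nn as nn

class PoseForecastLSTM(nn.Module):
    def __init__(self, param_dim=75, hidden_dim=256):
        super().__init__()
        self.lstm = nn.LSTM(param_dim, hidden_dim, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden_dim, param_dim)

    def forward(self, past_params):
        # past_params: (batch, num_past_frames, 75)
        features, _ = self.lstm(past_params)
        # Predict the next frame from the last hidden state.
        return self.head(features[:, -1])

model = PoseForecastLSTM()
past = torch.randn(8, 10, 75)      # 8 sequences, 10 observed frames each
next_frame = model(past)           # (8, 75) predicted SMPL pose + translation
```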

This work was presented at the Michigan AI Symposium 2018 and published in IEEE RA-L. It was also featured in the University of Michigan CSE newsletter (December 2018), a University Record news story (February 2019), The Michigan Engineer News Center, and TechCrunch.

Stochastic Sampling for Trajectory Simulation

This work proposes a stochastic sampling method for generating large amounts of automatically annotated, realistic, and naturalistic synthetic pedestrian trajectories that can be used to train deep learning approaches such as Social GAN. Presented at IROS 2019.
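
The sketch below illustrates the general idea of stochastic trajectory sampling, not the published method: draw a start point, goal, and preferred speed for each synthetic pedestrian, interpolate a path, and perturb it with noise. All parameter ranges here are invented for illustration.

```python
# Illustrative stochastic sampling of synthetic 2D pedestrian trajectories.
# Ranges and noise scales are made up; the real method models richer behavior.
import numpy as np

def sample_trajectory(num_steps=20, dt=0.4, rng=np.random.default_rng()):
    start = rng.uniform(-10.0, 10.0, size=2)          # meters, world frame
    goal = rng.uniform(-10.0, 10.0, size=2)
    speed = rng.uniform(0.8, 1.8)                      # typical walking speeds (m/s)
    direction = (goal - start) / (np.linalg.norm(goal - start) + 1e-8)
    steps = start + np.outer(np.arange(num_steps) * dt * speed, direction)
    noise = rng.normal(scale=0.05, size=steps.shape)   # small positional jitter
    return steps + noise                               # (num_steps, 2) x-y positions

# Automatically annotated synthetic training set of 1000 trajectories.
dataset = np.stack([sample_trajectory() for _ in range(1000)])
```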

PPLP Network for Oriented Pedestrian Detection

This work proposes a Pedestrian Planar LiDAR Pose (PPLP) Network for oriented pedestrian detection based on planar LiDAR and monocular images. Instead of the commonly used three-dimensional (3D) Light Detection and Ranging (LiDAR) sensors, which can be expensive to deploy, PPLP relies on the combination of two-dimensional (2D) LiDAR data and a monocular camera, offering a far more affordable solution to the oriented pedestrian detection problem. The PPLP network takes 2D LiDAR point clouds and monocular images as input and outputs 3D bounding box locations and orientation values for all pedestrians in the scene. Published in IEEE RA-L 2020.
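
As a rough structural sketch (not the released PPLP code), the input/output contract can be pictured as two encoding branches, one for the planar LiDAR scan and one for the image, whose fused features are regressed to oriented 3D boxes. The backbones, the flattened scan encoding, and the fixed number of detections below are simplifications for illustration.

```python
# Two-branch fusion sketch: 2D LiDAR scan + monocular image -> oriented 3D boxes.
# Each box is parameterized as (x, y, z, w, h, l, yaw). Sizes are illustrative.
import torch
import torch.nn as nn

class PPLPSketch(nn.Module):
    def __init__(self, max_detections=10):
        super().__init__()
        self.lidar_branch = nn.Sequential(
            nn.Linear(2 * 1081, 256), nn.ReLU(), nn.Linear(256, 128))   # flattened (x, y) returns
        self.image_branch = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 128))
        self.box_head = nn.Linear(256, max_detections * 7)              # 7 = center, size, yaw
        self.max_detections = max_detections

    def forward(self, scan, image):
        fused = torch.cat([self.lidar_branch(scan), self.image_branch(image)], dim=1)
        return self.box_head(fused).view(-1, self.max_detections, 7)

model = PPLPSketch()
scan = torch.randn(2, 2 * 1081)       # planar LiDAR scan, flattened
image = torch.randn(2, 3, 480, 640)   # monocular camera frame
boxes = model(scan, image)            # (2, 10, 7) oriented 3D boxes
```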

Unsupervised Pedestrian Pose Prediction

This work leverages unsupervised/self-supervised video prediction methods, such as PredNet, to address the pedestrian pose prediction problem. It develops a pipeline for video generation and pose extraction, offering a solution that no longer requires labeled pedestrian data for training, which has been a major bottleneck in the practical application of such methods. Published in the IEEE Robotics and Automation Magazine Special Issue on Deep Learning and Machine Learning in Robotics.
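
The pipeline shape can be summarized as: predict future video frames with a self-supervised model, then run an off-the-shelf pose estimator on the predicted frames. The sketch below uses hypothetical stand-ins for both components (a one-layer frame predictor and a stub pose extractor) purely to show the data flow; it is not the authors' code.

```python
# Pipeline sketch: self-supervised frame prediction followed by pose extraction,
# so no pedestrian pose labels are needed for training. Both models are stand-ins.
import torch
import torch.nn as nn

class TinyFramePredictor(nn.Module):
    """Stand-in for a PredNet-style self-supervised video prediction network."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3, 3, 3, padding=1)  # predicts the next frame from the last one

    def forward(self, frames):
        # frames: (batch, time, 3, H, W)
        return self.net(frames[:, -1])            # (batch, 3, H, W)

def extract_poses(frame):
    """Stand-in for an off-the-shelf 2D pose estimator (e.g., OpenPose)."""
    batch = frame.shape[0]
    return torch.zeros(batch, 17, 2)              # 17 COCO-style keypoints

predictor = TinyFramePredictor()
observed = torch.randn(1, 4, 3, 128, 128)         # 4 observed frames
future_frame = predictor(observed)                # predicted next frame
future_pose = extract_poses(future_frame)         # pose prediction without pose labels
```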

BiTraP and BiPOCO: Bi-directional goal-conditioned trajectory prediction

This work presents BiTraP, a goal-conditioned bi-directional multi-modal trajectory prediction method based on the Conditional Variational Autoencoder (CVAE). BiTraP estimates multi-modal goals of pedestrian trajectories and introduces a novel bi-directional decoder to improve longer-term trajectory prediction accuracy. We presented results on both first-person-view and bird's-eye-view datasets, improving trajectory prediction accuracy by around 50% over the state of the art. We also showed that the choice of a non-parametric versus parametric target model in the CVAE directly influences the predicted multi-modal trajectory distributions, which can be used to compute collision rates and guide navigation tasks. Published in IEEE RA-L 2021.
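
To illustrate goal-conditioned CVAE prediction, the sketch below encodes the observed trajectory, samples a latent code from a learned prior, predicts a goal per sample, and decodes a full trajectory conditioned on that goal. It omits BiTraP's bi-directional decoder and training losses, and all layer sizes are illustrative rather than the published architecture.

```python
# Simplified goal-conditioned CVAE sketch: each latent sample yields one goal
# and one candidate future trajectory, giving a multi-modal prediction set.
import torch
import torch.nn as nn

class GoalConditionedCVAE(nn.Module):
    def __init__(self, pred_len=12, latent_dim=32, hidden=128):
        super().__init__()
        self.encoder = nn.GRU(2, hidden, batch_first=True)
        self.prior = nn.Linear(hidden, 2 * latent_dim)        # mean and log-variance
        self.goal_head = nn.Linear(hidden + latent_dim, 2)    # 2D goal position
        self.traj_head = nn.Linear(hidden + latent_dim + 2, pred_len * 2)
        self.pred_len = pred_len

    def forward(self, obs, num_samples=20):
        _, h = self.encoder(obs)                               # obs: (batch, obs_len, 2)
        h = h[-1]
        mu, logvar = self.prior(h).chunk(2, dim=-1)
        trajs = []
        for _ in range(num_samples):                           # one trajectory per latent sample
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
            goal = self.goal_head(torch.cat([h, z], dim=-1))
            traj = self.traj_head(torch.cat([h, z, goal], dim=-1))
            trajs.append(traj.view(-1, self.pred_len, 2))
        return torch.stack(trajs, dim=1)                       # (batch, num_samples, pred_len, 2)

model = GoalConditionedCVAE()
observed = torch.randn(4, 8, 2)                                # 8 observed 2D positions
predictions = model(observed)                                  # 20 candidate futures per pedestrian
```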

Additionally, we have developed an extended model, BiPOCO (Bi-directional trajectory predictor with POse COnstraints), which incorporates compositional pose-based losses on top of BiTraP for detecting anomalous pedestrian activities in videos. Accepted at the ICML 2022 Safe Learning for Autonomous Driving (SL4AD) Workshop.
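
For intuition, a pose-constraint loss of this kind can combine a per-joint position error with a skeletal consistency term; at test time, a large loss on a predicted pose flags a potentially anomalous activity. The sketch below is an illustrative example of such a compositional loss, not BiPOCO's exact formulation, and the tiny skeleton layout is made up.

```python
# Illustrative compositional pose loss: per-joint position error plus
# bone-length consistency between predicted and reference skeletons.
import torch

BONES = [(0, 1), (1, 2), (2, 3)]   # hypothetical joint connectivity

def bone_lengths(pose):
    # pose: (batch, num_joints, 2) keypoint coordinates
    return torch.stack([(pose[:, a] - pose[:, b]).norm(dim=-1) for a, b in BONES], dim=-1)

def pose_constraint_loss(pred_pose, ref_pose, w_joint=1.0, w_bone=0.5):
    joint_term = (pred_pose - ref_pose).norm(dim=-1).mean()
    bone_term = (bone_lengths(pred_pose) - bone_lengths(ref_pose)).abs().mean()
    return w_joint * joint_term + w_bone * bone_term

pred = torch.randn(4, 4, 2)                        # predicted 4-joint skeletons
ref = torch.randn(4, 4, 2)                         # observed/reference skeletons
anomaly_score = pose_constraint_loss(pred, ref)    # larger values suggest anomalous motion
```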