Modeling Crowd Behaviors: From Representation to Understanding and Generation

Abstract
Modeling crowd behaviors is useful for emerging technologies such as autonomous driving and virtual/augmented reality, and essential for analyzing social phenomena such as stress and evacuation. However, it poses significant challenges due to the inherent diversity and indeterminacy of human behaviors. For example, a person may turn either left or right to avoid an obstacle, so no single deterministic answer exists. This dissertation addresses these challenges by presenting a holistic pipeline for crowd behavior modeling. Specifically, we introduce a data-driven approach to accurately capture and simulate crowd dynamics across three key aspects: Representation, Understanding, and Generation.

The first part of this dissertation introduces how to construct feature representations of the relationships between agents in a scene. Agents consider their surrounding environment, objects, and nearby agents when determining routes toward their destinations. Data-driven methods exist for this purpose, but the complexity of capturing these relations grows exponentially as the number of elements increases in dynamic, real-world scenarios. To mitigate this, we propose novel techniques that reduce the complexity of the representations in both the spatial and temporal dimensions. Specifically, we first propose group-based methods that follow hierarchical steps: they perceive scenes at the collective level and propagate the resulting features to the individual level. Next, methods based on agents' motion patterns recognize long-term pedestrian behaviors without encoding every individual footstep. These methods enable more efficient and tractable context recognition.
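The collective-to-individual step described above can be illustrated with a minimal sketch. It assumes group membership is already given and uses plain mean pooling; the dissertation learns the grouping, so this is only an illustration of the idea, not the author's method. The function name `group_pool_and_propagate` and its inputs are hypothetical.

```python
from collections import defaultdict


def group_pool_and_propagate(agent_feats, group_ids):
    """Pool per-agent features per group, then hand the result back to each agent.

    agent_feats: list of equal-length feature lists, one per agent (hypothetical).
    group_ids:   list of hashable group labels, one per agent (assumed given).
    Returns one list per agent: its own feature followed by its group's mean feature.
    """
    # Collective level: gather the features of all members of each group.
    members = defaultdict(list)
    for feat, gid in zip(agent_feats, group_ids):
        members[gid].append(feat)

    # Mean-pool each feature dimension within a group (an assumed pooling choice).
    group_feats = {
        gid: [sum(column) / len(feats) for column in zip(*feats)]
        for gid, feats in members.items()
    }

    # Individual level: concatenate each agent's feature with its group feature.
    return [list(feat) + group_feats[gid] for feat, gid in zip(agent_feats, group_ids)]


features = [[0.0, 0.0], [2.0, 2.0], [10.0, 10.0]]
groups = ["a", "a", "b"]
propagated = group_pool_and_propagate(features, groups)
```

With this layout, interactions can be reasoned about between a handful of groups rather than between every pair of agents, while each agent still keeps its own feature.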

The second part of this dissertation focuses on understanding realistic crowd dynamics based on the perceived information. Humans typically consider several hypotheses for their destination and then select one reasonable route. For this reason, the future trajectory of each agent should be modeled probabilistically. In the computer vision field, generative models are well known to work well on stochastic tasks. However, due to their inherent randomness, conventional generative models struggle to cover the feasible trajectories. One of the proposed methods makes use of a prior on human reasoning embedded in a pre-trained language model. Another method, a conditional generative model, infers trajectories step by step in a cascading fashion, allowing for realistic predictions even in challenging conditions. These approaches facilitate more realistic crowd dynamics modeling and more accurate path planning through a better understanding of human dynamics in scenes.

The third part of this dissertation explores generating crowd behaviors in simulation spaces. In the real world, people follow certain walking patterns, moving from a starting point to a destination while taking the spatial layout into account. We also observe that people sometimes form walking groups over time. To imitate this behavior, we present a learnable crowd emitter, which enables continuous generation of human dynamics. The proposed method accounts for key initialization factors, including agent attributes, starting and destination coordinates, and pace. In addition, locomotion trajectories are planned to guide agents toward their destinations. Here, we propose a learnable sampling technique that replaces the random sampling process with a purposive one to ensure diversity in the trajectories. This approach is well suited to simulating diverse crowd scenarios.
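To make the idea of replacing random sampling concrete, the sketch below draws latent vectors from a Halton low-discrepancy sequence instead of a pseudo-random generator. This is a fixed Quasi-Monte Carlo scheme, a topic the table of contents lists in Chapter 7, and not the learnable sampler proposed in the dissertation. The function names, the two-dimensional latent, and the Gaussian mapping are all assumptions made for illustration.

```python
from statistics import NormalDist


def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` around the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_2d(num_samples):
    """First `num_samples` points of the 2-D Halton sequence (bases 2 and 3)."""
    # Start at index 1 so that every coordinate lies strictly inside (0, 1).
    return [(radical_inverse(i, 2), radical_inverse(i, 3))
            for i in range(1, num_samples + 1)]


def halton_gaussian_latents(num_samples):
    """Map Halton points to standard-normal latents via the inverse CDF."""
    standard_normal = NormalDist()
    return [tuple(standard_normal.inv_cdf(u) for u in point)
            for point in halton_2d(num_samples)]


points = halton_2d(4)
latents = halton_gaussian_latents(4)
```

Because consecutive Halton points fill the unit square evenly, a small sample budget is less likely to cluster in one region of the latent space than independent random draws, which is the kind of coverage that a purposive sampling process aims for.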

Lastly, the dissertation summarizes the contributions and outlines future directions for further improving data-driven approaches to crowd behavior modeling.
Author(s)
배인환
Issued Date
2025
Type
Thesis
URI
https://scholar.gist.ac.kr/handle/local/19495
Alternative Author(s)
Inhwan Bae
Department
Graduate School, AI Graduate School
Advisor
Jeon, Hae-Gon
Table Of Contents
Abstract
List of Contents
List of Tables
List of Figures
1 Introduction
1.1 Problem Definition
1.2 Scope of the Research
1.2.1 Representation of Crowd Behavior
1.2.2 Understanding Crowd Behavior
1.2.3 Generation of Crowd Behavior
1.3 Outline of Dissertation
I Representation of Crowd Behavior
2 Group-Aware Crowd Dynamics Modeling in Crowded Environment
2.1 Introduction
2.2 Related Works
2.2.1 Trajectory Prediction
2.2.2 Group-aware Representation
2.2.3 Graph Node Pooling
2.3 Proposed Method
2.3.1 Problem Definition
2.3.2 Learning the Trajectory Grouping Network
2.3.3 Pedestrian Group Hierarchy Architecture
2.3.4 Implementation Details
2.4 Experiments
2.4.1 Experimental Setup
2.4.2 Quantitative Results
2.4.3 Qualitative Results
2.4.4 Analysis
2.4.5 Ablation Study
2.5 Summary
3 Stochastic Pedestrian Intention Prediction with Control Points
3.1 Introduction
3.2 Related work
3.2.1 Context-aware Trajectory Prediction
3.2.2 Endpoint Conditioned Approach
3.2.3 Trajectory Refinement
3.3 Control Point Conditioned Prediction
3.3.1 Preliminaries
3.3.2 Control Point Conditioned Endpoint Prediction
3.3.3 Multi-Relational Pedestrian Graph
3.3.4 Trajectory Refinement
3.3.5 Implementation Details
3.4 Experiments
3.4.1 Experimental Setup
3.4.2 Comparison with state-of-the-art
3.4.3 In-Depth Analysis
3.4.4 Experiments in Various Settings
3.4.5 Ablation Study
3.5 Summary
4 Understanding Crowd Movement Patterns with Low-Rank Descriptor
4.1 Introduction
4.2 Related Works
4.2.1 Pedestrian Trajectory Prediction
4.2.2 Parametric Trajectory Descriptor
4.3 Methodology
4.3.1 Problem Definition
4.3.2 EigenTrajectory (ET) Descriptor
4.3.3 Forecasting in the ET Space
4.3.4 Loss Functions
4.3.5 Implementation Details
4.4 Experiments
4.4.1 Experimental Setup
4.4.2 Evaluation Results
4.4.3 Ablation Studies
4.5 Summary
II Understanding Crowd Behavior
5 Multi-Modal and Most-Likely Generation with Large Language Models
5.1 Introduction
5.2 Related Works
5.2.1 Pedestrian Trajectory Prediction
5.2.2 Large Language Models with Multimodal Data
5.2.3 Language Models for Reasoning and Prediction
5.3 Methodology
5.3.1 Problem Definition
5.3.2 Data Space Conversion to Prompt
5.3.3 Domain Shift to Sentence Generation
5.3.4 Empowering Model with Various Input Modalities
5.3.5 Forecasting With the Language Model
5.3.6 Implementation Details
5.4 Experiments
5.4.1 Experimental Setup
5.4.2 Evaluation Results
5.4.3 Case Studies
5.4.4 Ablation Studies
5.5 Conclusion
6 Multi-Agent Trajectory Planner with Unified Motion Space and Diffusions
6.1 Introduction
6.2 Related Works
6.2.1 Pedestrian Trajectory Prediction
6.2.2 Various Trajectory Prediction Tasks
6.3 Methodology
6.3.1 Problem Definition
6.3.2 Preliminaries
6.3.3 Unifying the Motion Space
6.3.4 Adaptive Anchor
6.3.5 Diffusion-Based SingularTrajectory Model
6.4 Experiments
6.4.1 Experimental Setup
6.4.2 Evaluation Results
6.4.3 Extensive Evaluations
6.4.4 Ablation Studies
6.5 Summary
III Generation of Crowd Behavior
7 Diversifying Output Samples with Non-Probability Sampling Process
7.1 Introduction
7.2 Related Works
7.2.1 Stochastic trajectory prediction
7.2.2 Learning latent variables
7.2.3 Graph-based approaches
7.2.4 Monte Carlo Sampling Method
7.3 Generated Trajectories Are Biased
7.3.1 Problem Definition
7.3.2 Preliminaries
7.3.3 Stochastic Trajectory Prediction is Biased
7.3.4 Quasi-Monte Carlo for Trajectory Prediction
7.4 Non-Probability Sampling Network
7.4.1 Non-Probability Sampling on Multimodal Trajectory Prediction
7.4.2 NPSN Architecture
7.4.3 Implementation Details
7.5 Experiments
7.5.1 Experimental Setup
7.5.2 Results from QMC and NPSN Method
7.5.3 Analysis
7.5.4 Ablation Studies
7.5.5 Analysis
7.6 Summary
8 Continuous Crowd Behavior Generation with Crowd Emitter and Simulator
8.1 Introduction
8.2 Related Works
8.2.1 Multi-Agent Trajectory Prediction
8.2.2 Crowd Locomotion Simulation
8.2.3 Traffic Scene Generation
8.2.4 Switching Dynamical Systems
8.3 Methodology
8.3.1 Problem Definition
8.3.2 Crowd Emitter Model
8.3.3 Crowd Simulator Model
8.3.4 Continuous Crowd Behavior Generation
8.4 Experiments
8.4.1 Benchmark Method
8.4.2 Evaluation Results
8.4.3 Flexibility and Controllability
8.4.4 Ablation Studies
8.5 Summary
9 Concluding Remark
References
Degree
Doctor
Appears in Collections:
Department of AI Convergence > 4. Theses(Ph.D)
Access and License
  • Access type: Open
File List
  • No related files exist.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.