Human Activity Understanding
Human Activity Understanding (HAU) research is concerned with modeling and interpreting human behavior and interactions within physical environments. We leverage multimodal information, such as visual, tactile, and motion data, together with computer vision, sensor fusion, and artificial intelligence to develop robust models that capture the complexity of human activities and their contextual dependencies. By integrating data from multiple sensing modalities, such as RGB and depth cameras, wearable devices, and ambient environmental sensors, these approaches aim to provide a comprehensive representation of human actions and interactions.
A central motivation for this research is to enable intelligent systems that can perceive, understand, and respond to human behavior in a meaningful way. Such capabilities are essential for the development of next-generation applications in areas including smart living environments, healthcare monitoring and assisted living, human-computer interaction, human-robot collaboration, and industrial automation. Through the extraction of high-level semantic information from low-level sensory inputs, this research contributes to improving human well-being, safety, comfort, and overall quality of life.
Key Topics
- Human-Object Interaction and 4D Reconstruction
- HAU based on visual-tactile multimodal information
- Human Activity Recognition
- Multimodal Action Segmentation
Key Publications
- Wu, Yuankai; Li, Zhinan; Patsch, Constantin; Zakour, Marsil; Salihu, Driton; Steinbach, Eckehard: UMI-HOI: Unifying Multimodal Information with Semantic Multi-Head Attention for Human–Object Interaction Detection. Findings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026), 2026
- He, Xinguo; Shen, Yixin; Chaudhari, Rahul: Discrete Token Representation for Efficient Hand Mesh Reconstruction. The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2026 (CVPR 2026), 2026
- Patsch, Constantin; Wu, Yuankai; Zakour, Marsil; Salihu, Driton; Steinbach, Eckehard: MistSense: Versatile Online Detection of Procedural and Execution Mistakes. 2025 International Conference on Computer Vision (ICCV 2025), 2025
- Patsch, Constantin; Goter, Jaden; Greer, Joseph; Ma, Lingni; Sodhi, Raj: WACU: Multi-Modal Wristband Assistant for Contextual Understanding. HANDS Workshop at International Conference on Computer Vision (ICCV 2025), 2025
- Patsch, Constantin; Wu, Yuankai; Salihu, Driton; Zakour, Marsil; Steinbach, Eckehard: TSCL: Timestamp Supervised Contrastive Learning for Action Segmentation. IEEE Robotics and Automation Letters 9 (9), 2024, 7485-7492
- Wu, Yuankai; Wang, Chi; Salihu, Driton; Patsch, Constantin; Zakour, Marsil; Steinbach, Eckehard: Rethinking 3D Geometric Object Features for Enhancing Skeleton-based Action Recognition. 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
Contact
If you are interested in Human Activity Understanding or would like to learn more about our work, feel free to reach out to Emre Faik Gökce, M.Sc. You are also welcome to get in touch directly with the researchers working in this field.

