My research interests are Egocentric Vision, Action Recognition, Contextual AI, Hand-object Interaction, Video Understanding, AR/VR, Multi-modal Learning, Vision-Language Models and Self-supervised Learning.
Previously, I completed my PhD under the supervision of
Prof. Marc Pollefeys at ETH Zurich,
and I earned my Master's degree from UCLA. I
received my Bachelor's degree in
Electrical Engineering from Yonsei University, Seoul, Korea.
If you are interested in semester projects (ETHZ), master's theses (ETHZ, Oxford), or personal projects related to action recognition,
egocentric vision, video understanding, and hand-object interaction that could lead to publications, feel free to email
me. I have some projects listed here, but we can also discuss other exciting potential projects.
HoloAssist is a large-scale egocentric human interaction dataset,
where two people collaboratively complete physical manipulation tasks. By augmenting the data with action and conversational annotations and observing the rich behaviors of various participants, we present key insights into how human assistants correct mistakes, intervene in the task completion procedure, and ground their instructions to the environment.
Contact-aware Skeletal Action Recognition (CaSAR) uses novel representations
of hand-object interaction that encompass spatial information:
1) contact points where the hand joints meet the objects,
2) distant points where the hand joints are far away
from the object and barely involved in the current action.
Our framework learns how the hands touch or stay away from the objects
in each frame of the action sequence and uses this information to predict the action class.
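As an illustration only, the minimal sketch below shows one way such a contact/distant split could be computed by thresholding joint-to-object distances; the function name, threshold, and toy data are assumptions for the sketch, not the paper's implementation.

```python
import numpy as np

def split_contact_distant(hand_joints, object_points, contact_thresh=0.01):
    """Label each hand joint as 'contact' or 'distant' based on its
    distance to the nearest object point (coordinates in metres).
    hand_joints: (J, 3), object_points: (N, 3)."""
    # Pairwise distances between every hand joint and every object point.
    dists = np.linalg.norm(hand_joints[:, None, :] - object_points[None, :, :], axis=-1)
    nearest = dists.min(axis=1)               # (J,) distance to the closest object point
    contact_mask = nearest < contact_thresh   # joints touching or very near the object
    return contact_mask, nearest

# Toy example: 21 hand joints and a small object point cloud (illustrative data).
rng = np.random.default_rng(0)
hand_joints = rng.uniform(-0.05, 0.05, size=(21, 3))
object_points = rng.uniform(-0.05, 0.05, size=(200, 3))
contact_mask, nearest = split_contact_distant(hand_joints, object_points)
print(contact_mask.sum(), "joints in contact,", (~contact_mask).sum(), "distant")
```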
EgoBody is a large-scale dataset of accurate 3D human body shape,
pose and motion of humans interacting in 3D scenes, with multi-modal streams from
third-person and egocentric views, captured by Azure Kinect cameras and a HoloLens 2.
Given two interacting subjects, we leverage a lightweight multi-camera rig to
reconstruct their 3D shape and pose over time.
We propose a skeletal self-supervised learning approach that uses alignment as a pretext task.
Our approach to alignment combines a context-aware attention model, which incorporates spatial
and temporal context within and across sequences, with a contrastive learning formulation
based on 4D skeletal augmentations. Pose data provides a valuable cue for alignment and
downstream tasks, such as phase classification and phase progression, as it is robust
to different camera angles and changes in the background, while being efficient for real-time
processing.
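As a rough illustration only, and not the paper's implementation, the sketch below shows how a per-frame contrastive objective over two augmented views of a skeleton sequence might look; the encoder, the rotation augmentation, and all shapes are illustrative placeholders.

```python
import numpy as np

def random_rotation_z(seq, rng):
    """Rotate a skeleton sequence (T, J, 3) around the vertical axis,
    as a simple stand-in for a 4D skeletal augmentation."""
    theta = rng.uniform(0, 2 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return seq @ R.T

def encode(seq):
    """Toy per-frame embedding: flatten joints and L2-normalise.
    A real model would be a learned spatio-temporal network."""
    feats = seq.reshape(seq.shape[0], -1)
    return feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)

def contrastive_alignment_loss(seq, rng, tau=0.1):
    """InfoNCE-style loss: frame t in one augmented view should match
    frame t in the other view, and no other frame."""
    z1 = encode(random_rotation_z(seq, rng))
    z2 = encode(random_rotation_z(seq, rng))
    logits = z1 @ z2.T / tau                          # (T, T) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                # positives lie on the diagonal

rng = np.random.default_rng(0)
seq = rng.normal(size=(16, 21, 3))                    # 16 frames, 21 joints, xyz
print(contrastive_alignment_loss(seq, rng))
```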
We propose a method to collect a dataset of two hands manipulating objects for
first-person interaction recognition. We provide a rich set of annotations, including action labels,
object classes, 3D left & right hand poses, 6D object poses, camera poses and scene point clouds.
We further propose the first method to jointly recognize the 3D poses of two hands manipulating
objects and a novel topology-aware graph convolutional network for recognizing hand-object interactions.
We propose a sensor-equipped food container, Smart Refrigerator, which recognizes foods and
monitors their status. We demonstrated its food-detection performance and showed that
automatic monitoring of food intake can provide intuitive feedback to users.
Grant, Swiss National Science Foundation, “Beyond Frozen Worlds: Capturing functional 3D Digital
Twins from the Real World”. Role: Project Conceptualization; PI: Prof. Marc Pollefeys (2M USD) 2023
Scholarship, Recipient of Korean Government Scholarship from NIIED (150K USD) 2018
Scholarship, Yonsei International Foundation 2016
IBM Innovation Prize, Startup Weekend, Technology Competition 2015
Best Technology Prize, Internet of Things (IoT) Hackathon by the government of Korea 2014
Best Laboratory Intern, Yonsei Institute of Information and Communication Technology 2014
Scholarship, Yonsei University Foundation, Korean Telecom Group Foundation 2014, 2011, 2010
Creative Prize, Startup Competition, Yonsei University 2014
Talks
2023/06: Toward Interactive AI in Mixed Reality @ Microsoft Mixed Reality & AI Lab, Zurich
2023/06: Toward Interactive AI in Mixed Reality @ AIoT Lab, Seoul National University
2023/05: Toward Interactive AI in Mixed Reality @ KASE Open Seminar, ETH Zurich
2022/03: Context-Aware Sequence Alignment using 4D Skeletal Augmentation. Applied Machine Learning Days (AMLD) @ EPFL & Swiss JRC [Link]
2021/10: H2O: Two Hands Manipulating Objects for First Person Interaction Recognition. ICCV 2021 Workshop on Egocentric Perception, Interaction and Computing (EPIC) [Link|Video]
2021/04: H2O: Two Hands Manipulating Objects for First Person Interaction Recognition. Swiss Joint Research Center (JRC) Workshop 2021 [Link|Video]