
- DETECTS: Deep Clustering of Temporal Skeletons for Graph-based Segmentation
Unsupervised Temporal Action Localization (UTAL) aims to segment untrimmed videos into semantically coherent actions without using temporal annotations. Existing UTAL methods rely on contrastive pretext tasks or shallow clustering pipelines that decouple representation learning from segmentation, limiting their ability to capture fine-grained temporal transitions. In this work, we propose a unified deep clustering framework for skeleton-based UTAL that formulates motion segmentation as a spatio-temporal graph separation problem in the embedding space.

- UTAL-GNN: Unsupervised Temporal Action Localization using Graph Neural Networks
Fine-grained action localization in untrimmed sports videos presents a significant challenge due to rapid and subtle motion transitions over short durations. Existing supervised and weakly supervised solutions often rely on extensive annotated datasets and high-capacity models, making them computationally intensive and less adaptable to real-world scenarios. In this work, we introduce a lightweight and unsupervised skeleton-based action localization pipeline that leverages spatio-temporal graph neural representations.