EXTRACTING KEY FRAMES FROM FIRST-PERSON VIDEOS IN THE COMMON SPACE OF MULTIPLE SENSORS
- Submitted by: Yujie Li
- Last updated: 1 September 2017 - 2:09am
- Document Type: Poster
- Document Year: 2017
- Presenters: Yujie Li
- Paper Code: 3304
Selecting authentic scenes of activities of daily living (ADL) is useful for supporting our memory of everyday life, and key-frame extraction for first-person vision (FPV) videos is a core technology for realizing such a memory assistant. However, most existing key-frame extraction methods have focused on stable scenes unrelated to ADL and have used only the visual signals of the image sequence, even though daily activities manifest in our motion as well as in our visual experience. To handle the dynamically changing FPV scenes of daily activities, integrating motion and visual signals is essential. In this paper, we present a novel key-frame extraction method for ADL that integrates multi-modal sensor signals to suppress noise and detect salient activities. The proposed method projects motion and visual features into a shared space via probabilistic canonical correlation analysis (CCA) and selects key frames in that space. Experimental results on ADL datasets collected in a house suggest that running key-frame extraction in the shared space improves both the precision of the extracted key frames and their coverage of the entire video.
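To give a concrete picture of the project-then-select pipeline, here is a minimal sketch in Python. It is not the authors' implementation: classical CCA from scikit-learn stands in for the probabilistic CCA used in the paper, the feature matrices are random placeholders for time-aligned motion features (e.g., IMU statistics per frame) and visual features (e.g., per-frame image embeddings), and the distance-from-mean saliency score with greedy temporal spacing is an illustrative selection rule, not the paper's.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# Hypothetical feature matrices: one row per video frame.
# In practice, motion_feats would come from wearable-sensor windows aligned
# to frame timestamps, and visual_feats from image descriptors.
rng = np.random.default_rng(0)
n_frames = 500
motion_feats = rng.standard_normal((n_frames, 12))   # e.g., 12-D motion stats
visual_feats = rng.standard_normal((n_frames, 128))  # e.g., 128-D embeddings

# Project both modalities into a shared latent space.
# The paper uses probabilistic CCA; classical CCA is a stand-in here.
cca = CCA(n_components=4)
motion_latent, visual_latent = cca.fit_transform(motion_feats, visual_feats)

# Fuse the two projections into one shared representation per frame.
shared = (motion_latent + visual_latent) / 2.0

# Score each frame by its distance from the mean of the shared space
# (a crude saliency proxy for "unusual" activity moments).
scores = np.linalg.norm(shared - shared.mean(axis=0), axis=1)

def pick_key_frames(scores, num_keys=10, min_gap=20):
    """Greedily pick high-scoring frames at least `min_gap` frames apart,
    so the selection covers the video instead of clustering in one burst."""
    order = np.argsort(scores)[::-1]
    chosen = []
    for idx in order:
        if all(abs(idx - k) >= min_gap for k in chosen):
            chosen.append(idx)
        if len(chosen) == num_keys:
            break
    return sorted(chosen)

key_frames = pick_key_frames(scores)
print("Key frame indices:", key_frames)
```

The `min_gap` constraint in the sketch mirrors the coverage goal stated in the abstract: without it, a greedy top-k selection tends to pick near-duplicate frames from a single salient event rather than spanning the entire video.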