A 203 FPS VLSI ARCHITECTURE OF IMPROVED DENSE TRAJECTORIES FOR REAL-TIME HUMAN ACTION RECOGNITION
- Submitted by: CHIA LIN CHEN
- Last updated: 22 April 2018 - 10:53am
- Document Type: Poster
- Document Year: 2018
- Presenters: Jia-Lin Chen
- Paper Code: 1045
This paper introduces an architecture with high throughput, low on-chip memory, and efficient data access for Improved Dense Trajectories (iDT) as video representations for real-time action recognition. The iDT feature can capture long-term motion cues better than any existing deep feature, which makes it crucial in state-of-the-art action recognition systems. Our architecture design has three major features: low-bandwidth frame-wise feature extraction, a low on-chip memory architecture for point tracking, and a two-stage trajectory pruning architecture that reduces bandwidth.
Implemented in TSMC 40 nm technology, our chip has an area of 3.1 mm² and 40.8 kB of on-chip memory. The chip supports videos at a resolution of 320×240 with a throughput of 203 fps at 215 MHz, an 81.2 times speedup over a CPU implementation. At the same operating frequency, it can also provide feature extraction for six windows of size 320×240 in higher-resolution videos with a throughput of 34 fps.
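
The reported figures can be cross-checked with some back-of-the-envelope arithmetic. The sketch below is not from the paper: it only combines the numbers quoted above to derive the cycle budget per frame and per pixel, the CPU frame rate implied by the speedup, and the window rate in the six-window mode. The function name `derived_figures`, the reading of 1 kB as 1000 bytes, and the 8-bit grayscale frame size are assumptions made for illustration.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 215e6            # operating frequency
FPS = 203                   # throughput at 320x240
WIDTH, HEIGHT = 320, 240    # frame (and window) resolution
SPEEDUP_VS_CPU = 81.2       # reported speedup over the CPU baseline
WINDOWS, WINDOW_FPS = 6, 34 # multi-window mode
ON_CHIP_BYTES = 40.8e3      # assumption: 1 kB = 1000 bytes


def derived_figures():
    """Return rough quantities implied by the quoted numbers."""
    pixels = WIDTH * HEIGHT
    cycles_per_frame = CLOCK_HZ / FPS
    return {
        "pixels_per_frame": pixels,
        "cycles_per_frame": cycles_per_frame,
        "cycles_per_pixel": cycles_per_frame / pixels,
        # CPU frame rate implied by the speedup, not a reported value.
        "implied_cpu_fps": FPS / SPEEDUP_VS_CPU,
        # Six windows at 34 fps should roughly match the single-window rate.
        "windows_per_second": WINDOWS * WINDOW_FPS,
        # Assumption: one byte per pixel (8-bit grayscale frame).
        "on_chip_fraction_of_frame": ON_CHIP_BYTES / pixels,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the budget is roughly 14 clock cycles per pixel, the implied CPU baseline is about 2.5 fps, and the six-window mode processes 204 windows per second, consistent with the 203 fps single-window figure. The on-chip memory would hold only about half of one 8-bit frame, which is in line with the emphasis on low on-chip memory and efficient data access.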