A 203 FPS VLSI ARCHITECTURE OF IMPROVED DENSE TRAJECTORIES FOR REAL-TIME HUMAN ACTION RECOGNITION

This paper introduces an architecture with high throughput, low on-chip memory, and efficient data access for Improved Dense Trajectories (iDT) as video representations for real-time action recognition. The iDT feature can capture long-term motion cues better than any existing deep feature, which makes it crucial in state-of-the-art action recognition systems. Our architecture design has three major features: low-bandwidth frame-wise feature extraction, a low on-chip memory architecture for point tracking, and a two-stage trajectory pruning architecture that reduces bandwidth.
Using TSMC 40 nm technology, our chip area is 3.1 mm², and the on-chip memory size is 40.8 kB. The chip supports videos at a resolution of 320×240 with a throughput of 203 fps at 215 MHz, an 81.2 times speedup compared with a CPU. At the same operating frequency, it can also provide feature extraction for six windows of size 320×240 in higher-resolution videos with a throughput of 34 fps.
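
The throughput figures above can be cross-checked with simple arithmetic. The sketch below is not part of the paper: the function names are hypothetical, and the multi-window estimate assumes the six windows are processed one after another at the same per-window cost, which the abstract does not state. It derives the average cycle budget per frame and per pixel, the CPU frame rate implied by the quoted speedup, and the expected six-window frame rate.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 215e6          # operating frequency
FPS = 203                 # throughput at 320x240
WIDTH, HEIGHT = 320, 240  # supported resolution
SPEEDUP_VS_CPU = 81.2     # quoted speedup over a CPU
WINDOWS = 6               # windows extracted from a higher-resolution video


def cycles_per_frame(clock_hz, fps):
    """Average clock cycles available for one frame."""
    return clock_hz / fps


def cycles_per_pixel(clock_hz, fps, width, height):
    """Average clock cycles available for one pixel of one frame."""
    return cycles_per_frame(clock_hz, fps) / (width * height)


def implied_cpu_fps(fps, speedup):
    """CPU frame rate implied by the quoted speedup."""
    return fps / speedup


def multi_window_fps(fps, windows):
    """Frame rate when `windows` windows share the pipeline.

    Assumption (not from the abstract): windows are processed
    sequentially, each costing as much as one 320x240 frame.
    """
    return fps / windows


if __name__ == "__main__":
    print(f"cycles per frame: {cycles_per_frame(CLOCK_HZ, FPS):.0f}")
    print(f"cycles per pixel: {cycles_per_pixel(CLOCK_HZ, FPS, WIDTH, HEIGHT):.2f}")
    print(f"implied CPU fps:  {implied_cpu_fps(FPS, SPEEDUP_VS_CPU):.2f}")
    print(f"six-window fps:   {multi_window_fps(FPS, WINDOWS):.2f}")
```

Under these assumptions the six-window estimate is 203 / 6, which rounds to the 34 fps quoted above, and the speedup implies a CPU baseline of roughly 2.5 fps.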

ICASSP 2018.pdf