Presentation

PixelSieve: Towards Efficient Activity Analysis From Compressed Video Streams
Event Type
Research Manuscript
Virtual Programs
Hosted in Virtual Platform
Keywords
Approximate Computing for AI/ML
Topics
Design
Description
Pixel-level data redundancy in video induces additional memory and computing overhead when neural networks are employed to mine spatiotemporal patterns, e.g., in activity and event recognition on video streams. This work proposes PixelSieve to enable highly efficient CNN-based activity analysis directly from videos in compressed formats. Instead of recovering the original RGB frames from compressed video, PixelSieve uses the built-in metadata in compressed video streams to distill only the critical pixels that render relevant spatiotemporal features, and then conducts efficient CNN inference on the condensed inputs. PixelSieve removes the overhead of video decoding and significantly improves CNN-based video analysis, by 4.5x on average.
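
The idea of distilling critical pixels from stream metadata can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which metadata PixelSieve reads, so the sketch assumes per-macroblock motion vectors, and the function names (`select_blocks`, `kept_fraction`), the magnitude threshold, and the sample grid are all hypothetical.

```python
from math import hypot


def select_blocks(motion_vectors, threshold=1.0):
    """Return (row, col) indices of macroblocks whose motion-vector
    magnitude exceeds `threshold`.

    Assumption: `motion_vectors` is a 2D grid of (dx, dy) tuples, one per
    macroblock, already parsed from the compressed stream.
    """
    selected = []
    for r, row in enumerate(motion_vectors):
        for c, (dx, dy) in enumerate(row):
            if hypot(dx, dy) > threshold:
                selected.append((r, c))
    return selected


def kept_fraction(motion_vectors, selected):
    """Fraction of macroblocks that would be passed on to the CNN."""
    total = sum(len(row) for row in motion_vectors)
    return len(selected) / total if total else 0.0


# Hypothetical 3x4 macroblock grid: mostly static, with motion near the center.
mv_grid = [
    [(0, 0), (0, 0), (0, 0), (0, 0)],
    [(0, 0), (3, 4), (2, 0), (0, 0)],
    [(0, 0), (0, 1), (0, 0), (0, 0)],
]

blocks = select_blocks(mv_grid, threshold=1.0)
fraction = kept_fraction(mv_grid, blocks)
print(blocks, fraction)
```

In this toy grid only the two blocks with motion magnitude above the threshold are kept, so a downstream network would process a small fraction of the frame. A real pipeline would also need to gather the selected pixels into a condensed input tensor, which this sketch leaves out.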