Energy-Efficient LLM on RISC-V Edge Devices via Dynamic Voltage and Frequency Scaling
Speaker
Yue Wu (Andy), Visiting Student
Imperial College London MSc (Prospective PhD Student to CALAS)
Time
September 22, 2025 (Mon) at 14:00 HKT
Venue
P1402 + Zoom: https://cityu.zoom.us/j/96742093029
Abstract
Fundamentally, LLM inference consists mainly of addition and multiplication operations, and its power demands can drop significantly when processing sparse data. This observation motivates the hypothesis that embedding sparsity-awareness into models could enable more effective DVFS control. Our tentative research will explore this hypothesis from three complementary directions:
Software layer: Develop and evaluate optimization techniques to embed sparsity into LLM inference and identify sparsity patterns.
System layer: Design a DVFS governor that leverages sparsity patterns for real-time energy and thermal management.
Hardware layer: Build an FPGA-based prototype to simulate, validate, and iteratively refine the proposed approach.
We anticipate that this work will contribute new cross-layer methodologies to significantly improve the energy efficiency and thermal sustainability of edge-deployed LLMs.