Amazon
State Space Model Optimization for Edge AI
State Space Models (SSMs) such as Mamba 2 offer two advantages over attention-based architectures: compute scales linearly with sequence length rather than quadratically, and the model keeps a constant-size state instead of a KV cache that grows linearly with sequence length. Amazon Devices and Services values these properties because they reduce the RAM and compute such models require. However, there is a scientific risk that these models do not maintain accuracy once they are optimized or quantized. This student team will therefore work to invent techniques to quantize and optimize SSMs and to create multiple models that lie on the Pareto-optimal curve of accuracy versus performance. In the second part of the project, this student team will work to run the optimized models on edge hardware such as Orange Pi or Raspberry Pi using open source frameworks such as Ollama and Llama.cpp.

Research paper: https://arxiv.org/pdf/2405.21060
Accuracy: Standard benchmarks as defined in LM Harness, such as Lambada and ARC easy/challenge.
Performance criterion: Less than 3% accuracy degradation on each task after model optimization and quantization.
Demonstration: Optimized model running on edge hardware such as Orange Pi or Raspberry Pi.

The outcomes this student team will work to achieve are:
Quantized Mamba 2 or equivalent LLM architecture with <3% accuracy degradation on LM Harness
Quantized model deployed on Orange Pi using open source frameworks such as Ollama and Llama.cpp

The bonus deliverables this student team will work to achieve are:
Quantized + sparse (50%) Mamba 2 or equivalent LLM architecture with <3% accuracy degradation on LM Harness
Quantized model deployed on Orange Pi with access to the Rockchip NPU on the Orange Pi
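The per-task acceptance criterion and the Pareto-optimal selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's method. It assumes that "<3% degradation" means a relative drop against the unquantized baseline (the description does not say whether the threshold is relative or absolute), and that "performance" is a single latency figure. The `Candidate` type, the helper names, and every number below are hypothetical placeholders, not measured results.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One optimized model variant (hypothetical structure)."""
    name: str
    accuracy: dict      # task name -> accuracy in [0, 1]
    latency_ms: float   # assumed performance metric: lower is better


def relative_drop(baseline, accuracy):
    """Per-task relative accuracy drop versus the baseline."""
    return {t: (baseline[t] - accuracy[t]) / baseline[t] for t in baseline}


def passes(baseline, accuracy, max_drop=0.03):
    """True if every task degrades by less than max_drop (assumed relative)."""
    return all(d < max_drop for d in relative_drop(baseline, accuracy).values())


def mean_accuracy(c):
    return sum(c.accuracy.values()) / len(c.accuracy)


def pareto_front(cands):
    """Keep candidates that no other candidate beats on both accuracy and latency."""
    front = []
    for c in cands:
        dominated = any(
            mean_accuracy(o) >= mean_accuracy(c)
            and o.latency_ms <= c.latency_ms
            and (mean_accuracy(o) > mean_accuracy(c) or o.latency_ms < c.latency_ms)
            for o in cands
        )
        if not dominated:
            front.append(c)
    return front


# Made-up illustrative numbers only; these are not benchmark results.
baseline = {"lambada": 0.70, "arc_easy": 0.65, "arc_challenge": 0.35}
fp16 = Candidate("fp16", dict(baseline), 300.0)
int8 = Candidate("int8", {"lambada": 0.69, "arc_easy": 0.645, "arc_challenge": 0.345}, 120.0)
int8_slow = Candidate("int8_slow", {"lambada": 0.685, "arc_easy": 0.64, "arc_challenge": 0.34}, 150.0)
int4 = Candidate("int4", {"lambada": 0.66, "arc_easy": 0.62, "arc_challenge": 0.33}, 70.0)
candidates = [fp16, int8, int8_slow, int4]

for c in candidates:
    print(c.name, "passes <3% criterion:", passes(baseline, c.accuracy))
print("Pareto front:", [c.name for c in pareto_front(candidates)])
```

In this toy data the hypothetical `int4` variant sits on the Pareto front yet fails the per-task criterion, which shows why the two checks are kept separate: the front describes the available trade-offs, while the 3% threshold decides which points on it count as deliverables.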
Faculty Adviser
Radha Poovendran, Professor