INTELLIGENCE LIBRARY

243 items indexed · AI tools, prompts, hooks & techniques

On-Device LLM Inference Optimization Techniques for 2026

This guide covers methods for running large language models directly on mobile and edge devices. By combining model compression, such as quantization, with faster decoding strategies, such as speculative decoding, developers can build faster, more private, lower-cost AI applications that do not depend on a constant internet connection.

Awesome Edge AI Agents: A Curated List for On-Device Multimodal AI

A curated list of research papers and software frameworks for running AI models, such as language and image models, directly on mobile phones and other small devices. On-device execution enables faster, more private AI applications that do not depend on a constant internet connection.

General

GitHub