This guide covers methods for running large language models directly on mobile and edge devices. Techniques such as quantization (storing weights at lower numeric precision to shrink the model) and speculative decoding (letting a small draft model propose tokens that the full model then verifies) let developers build faster, more private, and lower-cost AI applications that work without a constant internet connection.
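As a rough illustration of the quantization idea mentioned above, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. The function names (`quantize_int8`, `dequantize`) and the sample weights are hypothetical and chosen only for illustration; real on-device toolchains typically use more elaborate schemes (per-channel or group-wise scales, 4-bit formats), so treat this as a toy under those assumptions rather than any framework's actual method.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [v * scale for v in q]


# Hypothetical sample weights, for illustration only.
weights = [0.12, -0.5, 0.33, 1.27, -1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Worst-case rounding error is about half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now fits in one byte instead of four (for 32-bit floats), at the cost of a bounded rounding error per value. That is the basic trade-off behind the smaller memory footprint on a phone.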
This is a curated list of research papers and software frameworks for running AI models, such as language and image models, directly on mobile phones and other small devices. On-device inference enables faster, more private AI applications that work without a constant internet connection.