ExecuTorch -- A Unified PyTorch Solution to Run AI Models On-Device
About
Local execution of AI on edge devices is important for low latency and offline operation. However, deploying models on diverse hardware remains fragmented, often requiring model conversion or complete reimplementation outside the PyTorch ecosystem where the model was originally authored. We introduce ExecuTorch, a unified PyTorch-native deployment framework for edge AI. ExecuTorch enables seamless deployment of machine learning models across heterogeneous compute environments. It scales from embedded microcontrollers to complex system-on-chips (SoCs) with dedicated accelerators, powering devices ranging from wearables and smartphones to large compute clusters. ExecuTorch preserves PyTorch semantics while supporting customization, optimizations such as quantization, and pluggable execution "backends". Together, these features enable fast experimentation: researchers can validate deployment behavior entirely within PyTorch, bridging the gap between research and production.
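To make the idea of pluggable execution backends concrete, here is a minimal toy sketch in plain Python. It is not ExecuTorch's API: the `Backend` class, the `partition` and `run` functions, and the linear list-of-ops "graph" are all hypothetical names invented for illustration. The sketch assumes a backend declares which operators it supports; consecutive supported operators are grouped into segments delegated to that backend, and everything else falls back to portable reference kernels.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Toy "graph": a linear list of (op_name, argument) steps applied to one float.
Step = Tuple[str, float]
Kernel = Callable[[float, float], float]

# Hypothetical portable reference kernels used as the fallback path.
PORTABLE_KERNELS: Dict[str, Kernel] = {
    "add": lambda x, a: x + a,
    "mul": lambda x, a: x * a,
    "relu": lambda x, _: max(x, 0.0),
}


@dataclass
class Backend:
    """Hypothetical accelerator backend that implements only some operators."""
    name: str
    kernels: Dict[str, Kernel]

    def supports(self, op: str) -> bool:
        return op in self.kernels


def partition(graph: List[Step], backend: Backend) -> List[Tuple[str, List[Step]]]:
    """Group consecutive steps into segments tagged with the executor that runs them."""
    segments: List[Tuple[str, List[Step]]] = []
    for step in graph:
        target = backend.name if backend.supports(step[0]) else "portable"
        if segments and segments[-1][0] == target:
            segments[-1][1].append(step)  # extend the current segment
        else:
            segments.append((target, [step]))  # start a new segment
    return segments


def run(segments: List[Tuple[str, List[Step]]], backend: Backend, x: float) -> float:
    """Execute each segment with its assigned kernels, in order."""
    for target, steps in segments:
        kernels = backend.kernels if target == backend.name else PORTABLE_KERNELS
        for op, arg in steps:
            x = kernels[op](x, arg)
    return x


# Example: an "accel" backend that handles add/mul but not relu.
accel = Backend("accel", {"add": lambda x, a: x + a, "mul": lambda x, a: x * a})
graph = [("add", 1.0), ("mul", 2.0), ("relu", 0.0), ("add", -10.0)]
segments = partition(graph, accel)
result = run(segments, accel, 3.0)
```

In this toy setup the graph splits into three segments (accelerated, portable, accelerated), and the result matches what the portable kernels alone would compute. That mirrors the property the abstract emphasizes: delegation changes where an operator runs, not what the model computes.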
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Processing Inference | ViT (Vision Transformer) | Average Latency (ms) | 3.81 | 16 |
| Image Processing Inference | ResNet50 | Average Latency (ms) | 0.55 | 15 |
| Image Processing Inference | MobileNet V3 | Average Latency (ms) | 0.24 | 14 |
| LLM Inference | Llama 3.2 Samsung Galaxy S25 Ultra 1B (test) | Prefill Throughput (min, tokens/sec) | 2.81e+3 | 13 |
| LLM Inference | Qwen3 Samsung Galaxy S25 Ultra 0.6B (test) | Prefill Throughput (min, tokens/sec) | 1.54e+3 | 12 |
| LLM Inference | Phi4 Mini Samsung Galaxy S25 Ultra 3.8B (test) | Prefill Throughput (min, tokens/sec) | 1.16e+3 | 10 |
| LLM Inference | Qwen3 Google Pixel 9 Pro XL 0.6B (test) | Prefill Throughput (min, tokens/sec) | 591 | 10 |
| LLM Inference | Llama 3.2 Google Pixel 9 Pro XL 1B (test) | Prefill Throughput (min, tokens/sec) | 530 | 10 |
| Image Processing Inference | Swin T | Average Latency (ms) | 3.38 | 8 |
| LLM Inference | Phi4 Mini Google Pixel 9 Pro XL 3.8B (test) | Prefill Throughput (min, tokens/sec) | 119.6 | 8 |