On the Edge: Agentic AI for Neural Processors

A practical guide to building intelligent agents optimized for NPU hardware. Learn how to design, implement, and deploy agentic systems that leverage neural processors for edge computing, with real-world patterns, performance optimization techniques, and production-ready strategies.

Preface

This book is about a narrow, awkward, increasingly important corner of applied AI: building agent...

Foundations of NPU-Optimized Agents

NPU architecture and computational constraints. Model quantization and optimization for NPU deplo...

Agent State & Decision-Making on Constrained Hardware

Managing agent context and memory within NPU limits. Efficient reasoning loops for low-latency in...

Tool Use & Integration Patterns

Designing lightweight tools for NPU-based agents. Async I/O and non-blocking integrations. Local ...

Production Deployment & Observability

Model serving architectures (ONNX, TensorRT, TVM). Monitoring latency, throughput, and reliabilit...

Real-World Case Studies & Best Practices

Building customer-facing NPU agents (chatbots, assistants). Batch vs. streaming inference strateg...

Appendices

Glossary of terms and consolidated source references for the book.