MT04
The MT04 AI accelerator supports up to 32x sparsity and combines high performance with low power consumption and industry-leading energy efficiency. It is optimized for computer vision, natural language processing, multimodal inference, and other data center AI workloads, and targets large-scale inference in internet services, telecommunications, smart cities, life sciences, and autonomous driving. Its proprietary dual-sparsity algorithm and hardware-software co-design deliver orders-of-magnitude performance gains while significantly reducing total cost of ownership (TCO).
Detailed information
1. Deeply Integrated Architecture
Hardware-software co-design tightly couples the compute cores with high-bandwidth on-chip memory. The architecture includes 4th-generation tensor cores optimized for sparse computation and a unified memory architecture with zero-copy data access.
2. Intelligent Sparsity Engine
Purpose-built for AI inference, with high-rate sparse tensor cores that support up to 32x model sparsity, adaptive precision encoding for lossless model compression, and a dynamic pruning engine that achieves near-ideal hardware utilization.
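To make the 32x figure concrete, here is a minimal Python sketch. It assumes that "32x sparsity" means roughly 1 in 32 weights is kept, and it uses plain magnitude pruning as a stand-in. It is not the MT04's proprietary dual-sparsity algorithm, and the function names `prune_to_sparsity` and `sparse_dot` are hypothetical.

```python
import random


def prune_to_sparsity(weights, factor):
    """Keep the len(weights) // factor largest-magnitude weights; zero the rest.

    Generic magnitude pruning, used here only as an illustration.
    """
    keep = max(1, len(weights) // factor)
    # Indices of the `keep` weights with the largest absolute value.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(ranked[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


def sparse_dot(weights, activations):
    """Dot product that skips zero weights and counts the multiplies performed."""
    total, macs = 0.0, 0
    for w, a in zip(weights, activations):
        if w != 0.0:
            total += w * a
            macs += 1
    return total, macs


random.seed(0)
dense = [random.uniform(-1, 1) for _ in range(1024)]
acts = [random.uniform(-1, 1) for _ in range(1024)]

pruned = prune_to_sparsity(dense, 32)
_, macs = sparse_dot(pruned, acts)
print(macs, len(dense) / macs)  # 32 multiplies instead of 1024, a 32x reduction
```

The 32x reduction in multiply-accumulate operations is an upper bound on the speedup. Real hardware gains depend on how well the sparse pattern maps onto the compute units.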
3. Flexible, Scalable Design
Integrates specialized accelerators for diverse workloads: custom sparse processing units, an 8K60 video codec engine supporting AV1/HEVC/VP9, 200 GB/s JPEG decoding throughput with zero CPU overhead, and a vector search engine optimized for 1B+ embeddings.
4. Professional Multimedia Platform
Dedicated hardware acceleration for intelligent video analytics: simultaneous processing of 32 4K30 video streams per card, real-time object detection with sub-2 ms latency, and hardware JPEG/PNG decoding at up to 1000 fps, offloaded from the CPU.
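As a rough sense of scale for the video figures, the following Python sketch computes the aggregate load implied by 32 simultaneous 4K30 streams. It assumes that "4K" means UHD resolution (3840 x 2160), which the page does not state, and the helper name `aggregate_load` is hypothetical.

```python
WIDTH, HEIGHT = 3840, 2160  # assumption: "4K" is UHD resolution
FPS = 30                    # the "30" in 4K30
STREAMS = 32                # streams per card, as stated above


def aggregate_load(streams, width, height, fps):
    """Return (frames per second, pixels per second) summed over all streams."""
    frames_per_s = streams * fps
    pixels_per_s = frames_per_s * width * height
    return frames_per_s, pixels_per_s


frames, pixels = aggregate_load(STREAMS, WIDTH, HEIGHT, FPS)
print(frames)  # 960
print(pixels)  # 7962624000
```

Under these assumptions a single card handles 960 frames, or about 8 billion pixels, per second.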