
NVIDIA H100 Tensor-Core GPU

02 Jun, 2023

NVIDIA H100 Tensor Core GPU Overview

The complexity of artificial intelligence (AI), high-performance computing (HPC), and data analytics is increasing exponentially, requiring scientists and engineers to use the most advanced computing platforms. NVIDIA Hopper GPU architecture securely delivers the highest performance computing with low latency, and integrates a full stack of capabilities for computing at data center scale.

The NVIDIA® H100 Tensor Core GPU, powered by the NVIDIA Hopper GPU architecture, delivers the next massive leap in accelerated computing performance for NVIDIA's data center platforms. H100 securely accelerates diverse workloads, from small enterprise jobs to exascale HPC to trillion-parameter AI models.

Implemented on TSMC's 4N process customized for NVIDIA, with 80 billion transistors and numerous architectural advances, H100 is the world's most advanced chip.

AI Technology

The H100 architecture, with its built-in Transformer Engine, is optimized for developing, training, and deploying generative AI, large language models (LLMs), and recommender systems. The Transformer Engine makes use of the H100's FP8 precision to offer 9x faster AI training and up to 30x faster AI inference on LLMs versus the prior-generation A100.
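FP8's advantage comes from trading numeric range and precision for half the bits of FP16, which doubles effective throughput and halves memory traffic. A minimal sketch of that trade-off, using only the publicly documented E4M3 and FP16 format parameters (nothing H100-specific):

```python
# Rough dynamic-range comparison of FP8 (E4M3) vs. FP16.
# Format parameters are the standard public definitions, not H100 internals.

def max_normal(exp_bits: int, mant_bits: int, bias: int,
               reserve_top_exp: bool) -> float:
    """Largest finite value of a simple binary floating-point format."""
    top_exp = (2 ** exp_bits - 1) - bias
    if reserve_top_exp:
        # FP16 reserves the all-ones exponent for inf/NaN.
        top_exp -= 1
        mant_max = 2 - 2 ** -mant_bits
    else:
        # E4M3 keeps the top exponent for finite values; only the
        # all-ones mantissa there is NaN, so back off one mantissa step.
        mant_max = 2 - 2 * 2 ** -mant_bits
    return mant_max * 2 ** top_exp

fp8_e4m3_max = max_normal(4, 3, 7, reserve_top_exp=False)   # 448.0
fp16_max     = max_normal(5, 10, 15, reserve_top_exp=True)  # 65504.0
```

The roughly 146x smaller maximum value is why FP8 training relies on per-tensor scaling, which the Transformer Engine manages automatically.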

 

Deploy H100 With the NVIDIA AI Platform

NVIDIA AI is the end-to-end open platform for production AI built on NVIDIA H100 GPUs. It includes NVIDIA accelerated computing infrastructure, a software stack for infrastructure optimization and AI development and deployment, and application workflows to speed time to market.

 

Up to 4X Higher AI Training on GPT-3


 

Up to 30X Higher AI Inference Performance on the Largest Models


NVIDIA H100 GPU Key Features

New Streaming Multiprocessor (SM) with many performance and efficiency improvements.

New fourth-generation Tensor Cores deliver up to 6x higher throughput chip-to-chip compared to A100, combining a per-SM speedup, a higher SM count, and the higher clock rates of H100.

New DPX Instructions accelerate Dynamic Programming algorithms by up to 7x over the A100 GPU.
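DPX targets recurrences of exactly this shape: each table cell is a min (or max) over a few neighbors plus a constant, as in sequence alignment or edit distance. A plain-Python sketch of such a recurrence, to show the class of algorithm DPX accelerates in hardware (this is the algorithm itself, not a DPX API):

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via a dynamic-programming table.
    Each cell depends on three neighbors through a min/add recurrence --
    the pattern H100's DPX instructions are designed to speed up."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                    # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j                    # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]
```

The same min/add structure appears in genomics (Smith-Waterman alignment) and graph routing (Floyd-Warshall), two workloads NVIDIA cites for DPX.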

3x faster IEEE FP64 and FP32 processing rates chip-to-chip compared to A100.

New Thread Block Cluster feature allows programmatic control of locality at a granularity larger than a single Thread Block on a single SM.

New Asynchronous Execution features include a new Tensor Memory Accelerator (TMA) unit that can transfer large blocks of data very efficiently between global memory and shared memory.
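The point of asynchronous copies is overlap: data for the next tile is staged while the current tile is being computed, hiding transfer latency. A CPU-side Python sketch of that double-buffering pattern (an analogy only; the `pipelined_process` helper is hypothetical and stands in for a TMA-driven pipeline, not any CUDA API):

```python
import queue
import threading

def pipelined_process(chunks, load, compute):
    """Double-buffering sketch: a loader thread stages chunk i+1 while
    the main thread computes on chunk i, so transfer and compute overlap --
    the same idea TMA's asynchronous copies enable between global and
    shared memory on-chip."""
    staged = queue.Queue(maxsize=2)            # two buffers in flight

    def loader():
        for c in chunks:
            staged.put(load(c))                # "copy" into staging buffer
        staged.put(None)                       # end-of-stream marker

    threading.Thread(target=loader, daemon=True).start()
    results = []
    while (buf := staged.get()) is not None:
        results.append(compute(buf))           # consume while loader runs ahead
    return results
```

With `maxsize=2` the loader can run at most one chunk ahead, mirroring the two-buffer ping-pong used in pipelined GPU kernels.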

 

Diverse Workloads in Modern Cloud Computing

 

Graphics

Scientific Computing

Data Analytics

AI deep learning training

Edge AI video analytics

Cloud gaming

Genomics

Classical machine learning

AI deep learning inference

5G private networks

 

New Technologies in Hopper H100


AI Pioneers Adopt H100

Several pioneers in generative AI are adopting H100 to accelerate their work:

OpenAI, which trained earlier models on H100's predecessor, will be using H100 on its Azure supercomputer to power its continuing AI research.

Stability AI plans to use H100 to accelerate its upcoming video, 3D and multimodal models.

Twelve Labs, a platform that gives businesses and developers access to multimodal video understanding, plans to use H100 instances on an OCI Supercluster to make video instantly, intelligently and easily searchable.

Anlatan, the creator of the NovelAI app for AI-assisted story writing and text-to-image synthesis, is using H100 instances on CoreWeave’s cloud platform for model creation and inference.

 

Manufacturer

Nvidia is a technology company that designs graphics processing units (GPUs) for the gaming, professional, and data center markets. Nvidia's GPUs are used in a variety of industries, including artificial intelligence, self-driving cars, and scientific research. The company also produces a line of computer chips called Tegra, which are used in mobile devices and embedded systems.


Disclaimer: The views and opinions expressed by individual authors or forum participants on this website do not represent the views and opinions of Chipsmall, nor do they represent Chipsmall's official policy.
