High-performance computing (HPC) is increasingly delivered as a cloud service, and cloud-based HPC is gradually replacing legacy on-premises supercomputers. The promise of cloud HPC lies in quick, easy, and affordable scalability. With easier access to high-performance computing, big data algorithms can be trained and run faster. Read on to learn about the key differences between HPC on Azure and HPC on AWS.

What Is High-Performance Computing (HPC)?

HPC is an umbrella term for resources that enable high-speed processing and performance. These resources include processors, memory, disk, and an operating system, bundled together into a computer.

Each computer in the HPC architecture is referred to as a node, and a bundle of computers is called a cluster. The nodes in the HPC cluster work together to solve a given problem. You can run Linux or Windows on your HPC cluster. 
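To illustrate how nodes in a cluster cooperate on one problem, here is a minimal sketch that splits a dataset into chunks, processes each chunk independently, and combines the partial results. It runs on a single machine and uses a thread pool as a stand-in for cluster nodes; the function names and the sum-of-squares workload are hypothetical, and a real cluster would typically distribute the chunks through a job scheduler or a message-passing library instead.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous slices of nearly equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """Work done by one (simulated) node on its share of the data."""
    return sum(x * x for x in chunk)


def cluster_sum_of_squares(items, n_nodes):
    """Scatter chunks to workers, then gather and combine partial results."""
    chunks = split_into_chunks(items, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    data = list(range(1, 101))
    print(cluster_sum_of_squares(data, n_nodes=4))
```

The pattern (scatter the data, compute locally, gather the results) is the same regardless of whether the workers are threads, processes, or physical nodes.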

The speed of HPC is measured in floating-point operations per second (FLOPS). The number of FLOPS you can reach is limited by the capabilities of your HPC components, the type of architecture, the underlying processing technology, the software you run, and your budget.
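As a rough illustration of how hardware limits FLOPS, the theoretical peak of a cluster is often estimated as nodes × cores per node × clock rate × floating-point operations per cycle. The sketch below applies that estimate; the function name and the cluster figures are hypothetical, and real workloads usually achieve only a fraction of this upper bound.

```python
def theoretical_peak_flops(nodes, cores_per_node, clock_hz, flops_per_cycle):
    """Estimate the upper bound on cluster throughput, in FLOPS."""
    return nodes * cores_per_node * clock_hz * flops_per_cycle


if __name__ == "__main__":
    # Hypothetical cluster: 4 nodes, 32 cores per node, 2.5 GHz clock,
    # 16 floating-point operations per cycle per core.
    peak = theoretical_peak_flops(4, 32, 2.5e9, 16)
    print(f"{peak / 1e12:.2f} TFLOPS")
```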

What Are the Use Cases for HPC?

HPC technology gives you the power to solve problems faster. This capability can be applied to almost any use case and industry. Among the most notable applications of HPC is big data processing.

You can use HPC to train your Machine Learning (ML) algorithms faster. This is especially useful for Deep Learning (DL) models such as Convolutional Neural Networks (CNNs), which are much more complex and computationally demanding.

The improved power of HPC can help you get big data results faster. Whether you’re doing medical research, tracking stock trends, creating meteorological predictions, or streaming live media, stronger and faster resources can help you achieve peak performance.

What Is Cloud-Based HPC?

The term cloud-based HPC refers to HPC resources that are offered by cloud vendors. That means you don’t have to set up your own on-premises operation. Rather, you use the resources provided by the cloud service. A cloud-based HPC model is typically offered on demand, which eliminates the upfront expenditure associated with setting up on-premises infrastructure.
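The trade-off between upfront on-premises spending and on-demand pricing can be sketched as a simple break-even calculation. The function name and all cost figures below are hypothetical placeholders, not vendor prices; the point is only to show how the number of usage hours determines which model costs less.

```python
def break_even_hours(on_prem_upfront, on_prem_hourly, cloud_hourly):
    """Hours of use after which on-premises becomes cheaper than on-demand.

    Returns None when the cloud hourly rate is not higher than the
    on-premises running cost, in which case on-premises never catches up.
    """
    if cloud_hourly <= on_prem_hourly:
        return None
    return on_prem_upfront / (cloud_hourly - on_prem_hourly)


if __name__ == "__main__":
    # Hypothetical figures: 500,000 upfront, 20 per hour to run on-premises,
    # 70 per hour for an equivalent on-demand cloud cluster.
    hours = break_even_hours(500_000, 20, 70)
    print(f"Break-even after {hours:.0f} hours of use")
```

Below the break-even point, the on-demand model is the cheaper option under these assumptions; above it, owning the hardware starts to pay off.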

HPC on Azure vs HPC on AWS: A Comparison

Each cloud vendor offers different HPC services, based on different technologies, and at different price points. The comparison below reviews the main aspects of HPC on Azure and HPC on AWS.

HPC Technology: Cray vs Intel

To provide cloud-based HPC technology, Microsoft Azure has partnered with Cray, a supercomputer manufacturer. Azure offers a managed Cray supercomputer service, including Cray® XC™ or Cray® CS™ supercomputers. These are attached to Cray® ClusterStor™ storage and hosted at an Azure datacenter.

AWS has partnered with Intel, which provides a wide range of processors. Intel® Xeon® processors serve as the base for CPU, GPU, and Field Programmable Gate Array (FPGA) instances. These three core instance types are delivered as EC2 instances, which are further divided into speed-based categories. AWS Auto Scaling monitors the instances and adjusts capacity as needed.

It’s worth mentioning that Cray and Intel are also partners: the two companies are working together to build one of the first exascale supercomputers in the United States.

HPC Services: Hybrid-Friendly vs Cloud-Focused

The table below reviews the key HPC services offered by Azure and AWS.

| HPC Service | Azure | AWS |
| --- | --- | --- |
| Compute | Azure compute resources are offered as CPU-based or GPU-based VMs. GPU VMs belong to the NVIDIA-based N-series. | AWS compute resources are offered as EC2 instances for CPU, GPU, and Field Programmable Gate Array (FPGA). |
| Networking | Azure ExpressRoute enables hybrid HPC by creating secure, high-performance tunnels. | Elastic Fabric Adapter (EFA) provides controls for scaling inter-instance communications. |
| Storage | The HPC Cache service enables direct, fast access to on-prem NAS devices. | Amazon FSx for Lustre is a fully managed, high-performance file storage system for processing Amazon S3 or on-prem data. |
| Workflow | Azure Batch provides controls for managing large numbers of compute nodes. Azure CycleCloud creates Azure HPC clusters and orchestrates tasks for workflows. The Azure HPC platform integrates with the Azure Kubernetes Service (AKS). | AWS Batch enables dynamic provisioning based on the type and amount of compute resources. AWS ParallelCluster is an open source cluster management tool dedicated to HPC administration. The NICE EnginFrame web portal provides browser access to HPC-enabled infrastructure. |
| Analytics | Azure Data Lake Analytics enables computations and analyses on HPC data. | NICE DCV enables the delivery of remote desktops and application streaming, for remote visualization. |

The Verdict: HPC on Azure or AWS? 

As with any technology, choose the solution that best fits your needs. There are no right or wrong answers, and no solution is necessarily better or worse. Each project has its own requirements, which can be met by one vendor or another.

Azure is known for its focus on enterprise-level solutions, and especially hybrid environments. If your HPC project is meant for a hybrid environment, you might have an easier time using Azure HPC. Azure provides a wide range of managed services, which require fewer technical skills.

AWS is more cloud-focused, and offers HPC integration with AWS cloud services. If you’re already an AWS user, and you want to integrate your HPC project with existing AWS services, setup should be straightforward. However, the more complex your project, the more skills you’ll need to set up and maintain it on AWS.

Conclusion

Hopefully, this article has helped you better understand the key differences between the HPC services offered on Azure and AWS. Assess your situation, and then determine where to set up your HPC operation. Be honest in your assessment, and experiment with the different solutions before committing to any new technology.

Author bio: Gilad David Maayan is a technology writer who has worked with over 150 technology companies including SAP, Samsung NEXT, NetApp and Imperva, producing technical and thought leadership content that elucidates technical solutions for developers and IT leadership. You can follow him on LinkedIn.