Using the GPU on an Azure NVIDIA-enabled virtual machine

GPU-accelerated computing is the use of a graphics processing unit (GPU) alongside a central processing unit (CPU) to speed up processing-intensive workloads such as analytics, machine learning, and engineering applications. It is becoming much more popular as the number of applications that can use it grows, artificial intelligence (AI) being one example.

Azure virtual machines can now be GPU-accelerated with NVIDIA cards. There are several virtual machine sizes and NVIDIA GPU accelerators to choose from: https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes-gpu

Even when the proper drivers are installed, your application may still not use the GPU. This post provides some steps to help with that issue.

Install the driver

You can install the NVIDIA GPU Driver Extension when you provision the new virtual machine, or select the drivers manually from here: https://docs.microsoft.com/en-us/azure/virtual-machines/windows/n-series-driver-setup. Once the machine is provisioned and the drivers are installed, verify the driver by checking Device Manager.
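
Besides Device Manager, the installed driver can also be checked from a script. The following is a minimal sketch, assuming nvidia-smi is reachable on the PATH (otherwise pass its full path) and that its CSV query output is available on your driver version. The helper names parse_driver_info and query_driver_info are hypothetical, chosen only for illustration.

```python
import subprocess


def parse_driver_info(csv_text):
    """Parse 'name, driver_version' CSV lines into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        # Split on the last comma so a comma inside the GPU name is kept.
        name, version = (field.strip() for field in line.rsplit(",", 1))
        gpus.append({"name": name, "driver_version": version})
    return gpus


def query_driver_info(exe="nvidia-smi"):
    """Run nvidia-smi and return one dict per GPU it reports.

    Assumption: `exe` resolves to nvidia-smi on this machine.
    """
    result = subprocess.run(
        [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,  # raise if nvidia-smi exits with an error
    )
    return parse_driver_info(result.stdout)
```

Calling query_driver_info() on the virtual machine should return one entry per GPU; an exception or an empty list suggests the driver is not installed correctly.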

Verify your application utilizes the GPU resources

Some applications, including browser-based ones, may not use the GPU accelerator. Run the following command to check the driver, and look at the GPU-Util value in its output.
     C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi
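
If you would rather read GPU-Util from a script than from the table, nvidia-smi can emit the same values as CSV. This is a rough sketch under the assumption that nvidia-smi is on the PATH; parse_gpu_util and query_gpu_util are hypothetical helper names, not part of any NVIDIA tooling.

```python
import subprocess

# Fields requested from nvidia-smi, in the order they are parsed below.
QUERY_FIELDS = "index,utilization.gpu,memory.used,memory.total"


def _to_int(field):
    """Return the field as an int, or None for values such as '[N/A]'."""
    try:
        return int(field)
    except ValueError:
        return None


def parse_gpu_util(csv_text):
    """Parse 'index, util, mem_used, mem_total' lines (no header, no units)."""
    rows = []
    for line in csv_text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 4:
            continue  # skip blank or unexpected lines
        index, util, used, total = (_to_int(f) for f in fields)
        rows.append({
            "index": index,
            "gpu_util": util,        # percent, the GPU-Util column
            "mem_used_mib": used,
            "mem_total_mib": total,
        })
    return rows


def query_gpu_util(exe="nvidia-smi"):
    """Run nvidia-smi once and return the parsed utilization rows."""
    result = subprocess.run(
        [exe, f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_util(result.stdout)
```

Run query_gpu_util() while your application is active; a gpu_util of 0 for every GPU is the symptom described in the next step.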

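If your GPU-Util is “0” while the application is running, you may need to force the driver model to WDDM (Windows Display Driver Model) for the NVIDIA GPU accelerator. Run the command from an elevated prompt; the change typically takes effect only after a reboot.
     C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi -fdm 0
     Set driver model to WDDM for GPU 00000001:00:00.0.
     All done.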
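
To confirm which driver model is active and whether a change is still waiting for a reboot, the current and pending driver model can be queried as CSV. This is a sketch only, assuming a Windows machine with nvidia-smi on the PATH; parse_driver_model and query_driver_model are hypothetical helper names.

```python
import subprocess


def parse_driver_model(csv_text):
    """Parse 'index, current_model, pending_model' CSV lines."""
    gpus = []
    for line in csv_text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        index, current, pending = fields
        gpus.append({
            "index": int(index),
            "current": current,   # model in use now, e.g. TCC or WDDM
            "pending": pending,   # model to be used after the next reboot
            "change_pending": current != pending,
        })
    return gpus


def query_driver_model(exe="nvidia-smi"):
    """Run nvidia-smi and report the driver model of each GPU."""
    result = subprocess.run(
        [exe,
         "--query-gpu=index,driver_model.current,driver_model.pending",
         "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_driver_model(result.stdout)
```

If change_pending is true for a GPU, the new driver model has been set but is not yet in effect.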

Now you can try sample applications such as FishGL (http://www.fishgl.com/), Google Earth (https://earth.google.com/web/), and the NVIDIA Island Demo (https://www.geforce.com/games-applications/pc-applications/fermi-water-demo/downloads) to confirm that they run correctly and use the GPU. With Google Earth and FishGL running, we can see that the GPU is being utilized.

CPU utilization is lower, and nvidia-smi now reports GPU utilization.
     C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 398.75                 Driver Version: 398.75                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80          WDDM  | 00000001:00:00.0 Off |                    0 |
| N/A   58C    P0    66W / 149W |    553MiB / 11520MiB |     20%      Default |
+-------------------------------+----------------------+----------------------+
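
A single nvidia-smi snapshot like the one above can miss short bursts of GPU activity, so it may help to sample a few times while the application runs. The sketch below is one way to do that, assuming nvidia-smi is on the PATH and GPU 0 is the one of interest; average_utilization and read_gpu0_util are hypothetical names introduced here for illustration.

```python
import subprocess
import time


def read_gpu0_util(exe="nvidia-smi"):
    """Return the current GPU-Util of GPU 0 as an integer percentage."""
    result = subprocess.run(
        [exe, "-i", "0",
         "--query-gpu=utilization.gpu", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


def average_utilization(read_util, samples=5, interval_s=1.0, sleep=time.sleep):
    """Call read_util() `samples` times and return (average, peak) in percent.

    `sleep` is injectable so the function can be exercised without waiting.
    """
    if samples < 1:
        raise ValueError("samples must be at least 1")
    readings = []
    for i in range(samples):
        readings.append(read_util())
        if i < samples - 1:
            sleep(interval_s)  # pause between samples, not after the last one
    return sum(readings) / len(readings), max(readings)
```

For example, average_utilization(read_gpu0_util, samples=10) takes ten readings about one second apart; a peak of 0 across the whole window is a stronger sign that the application is not using the GPU than a single reading.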

Get the most out of your GPU Accelerated Azure virtual machines. If you would like to learn more about how Azure can help provide your organization with a competitive advantage, email info@peters.com.  We are happy to help!

February 19th, 2019 | Infrastructure Services

About the Author:

As a Solutions Architect at Peters & Associates, Terry Felesena is responsible for high-level architecture, design, and review of complex virtualization solutions, as well as mentoring and troubleshooting guidance. Terry has been with Peters & Associates for over two decades.

Application Virtualization: Terry has a vast knowledge base regarding XenApp, XenDesktop, and Terminal Services. He has worked on numerous projects involving design, implementation, and support using industry best-practice methodology. Terry has recently completed projects with large numbers of servers and thousands of concurrent users. Designs and implementations include high availability and redundant access points via Internet, WAN, and local connectivity.

Server Virtualization: Through assessments, Terry has been integral in providing optimal designs and sizing to support virtualizing mission-critical applications. Implementations are based on zero impact to production and maintaining server uptime.