Category: PowerEdge
-

Introduction to NVIDIA Inference Microservice, aka NIM
At NVIDIA GTC 2024, the major release of NVIDIA Inference Microservices, aka NIM, was announced. NIM is part of the portfolio that makes up the NVIDIA AI Enterprise stack. Why the focus on inferencing? Because when we look at use cases for Generative AI, the vast majority of them are…
-

Dell is making it easy to stand up Digital Assistants!
By choosing Dell’s generative AI digital assistant solutions, businesses can bypass the complexities and high costs associated with building AI systems from scratch. Instead, they can leverage Dell Technologies’ comprehensive, integrated, and scalable solutions to quickly realize the benefits of AI, drive innovation, and stay competitive in the digital age.
-

Announcing the Dell PowerEdge XE9680L – Purpose-built for GPU density with extreme AI performance
The XE9680L sets a new standard for performance and energy efficiency in a more compact, denser 4U form factor, complete with an HGX 8-way B200 GPU configuration.
-

PowerEdge Soundbytes Ep20: Dell RAG Demo Update
In this previous episode, I sat down with David O’Dell from the Dell Technical Marketing Engineering team to talk about RAG. David also walked me through his RAG demo. Since then, David has been super busy, making some improvements to the demo. Those improvements are so awesome that I had…
-

Converting HuggingFace LLMs to TensorRT-LLM for use in the Triton Inference Server
Introduction Before getting into this blog proper, I want to take a minute to thank Fabricio Bronzati for his technical help on this topic. Over the last couple of years, HuggingFace has become the de facto standard platform for storing anything to do with generative AI. From models to datasets to…
-

Llama 2 on XE9680
Co-Authored by Damian Erangey and Fabricio Bronzati NVIDIA’s LLM Playground, part of the NeMo framework, is an innovative platform for experimenting with and deploying large language models (LLMs) for various enterprise applications. It’s currently in a private, early access stage and offers the following features: Fabricio Bronzati from Dell’s integrated…
-

GenAI Use Cases: The Top 5 GenAI Inferencing Use Cases with Dell
Thanks to innovations in computing solutions like PowerEdge servers, use cases for Generative AI (GenAI) continue to expand and have the potential to revolutionize the way businesses approach problem-solving and automation. But how do we…
-

Why no ARM-based PowerEdge server?
ARM processors are everywhere these days, from cellphones to laptops to data centers. What was once known only as the brain of the Raspberry Pi, a very inexpensive single-board computer targeted at the hobbyist maker market, is now fairly ubiquitous: most cellphones, whether iPhones or Androids, use ARM processors,…
-

Dell’s Validated Design for Generative AI Inferencing: An Exploration In Sizing
The world of artificial intelligence (AI) is undergoing rapid transformation, with Large Language Models (LLMs) at the forefront of this evolution. Ensuring the efficient deployment and operation of these models is paramount. In July 2023, Dell released what will be the first in a series of Validated Design Guides for…
-

Installing OpenManage Enterprise on VMware vCenter
OpenManage Enterprise, aka OME, is a critical component of the PowerEdge software management stack. It sits between the PowerEdge embedded out-of-band controller, called iDRAC, and our AIOps cloud-based offering, CloudIQ. OME is key to managing PowerEdge servers at scale. In this short video, I demo how to install…
