How LLM costs could be lowered "significantly"

This week has been extra crispy when it comes to the AI space! Today, it’s Nvidia that is making the moves. 🔥

Today’s Menu

Appetizer: Nvidia’s new chip to lower LLM costs “significantly” 💾

Entrée: Nvidia introduces “AI Workbench” 🦾

Dessert: AI-generated books steal author’s name and content 📗

🔨 AI TOOLS OF THE DAY

📧 Rizemail: Get to the core of your unread newsletters, long email threads, and cluttered commercial communications. This tool receives your emails, summarizes them using AI, and returns what you want to know … all within your inbox! → check it out

🦙 LLaMa Chat: A virtual assistant capable of engaging users in real-time conversations, answering queries, and providing assistance across various domains. → check it out

💬 Vrew: Easily add customizable captions to long videos with minimal editing. → check it out

NVIDIA’S NEW CHIP TO LOWER LLM COSTS “SIGNIFICANTLY” 💾

There is a lot of buzz in the AI space about the cost of large language models (LLMs), and Nvidia’s new chip might be a good start. 💸

What’s up? Nvidia, which occupies over 80% of the AI chip market, looks to maintain its dominance with the introduction of its latest innovation, the GH200 chip, aimed at enhancing AI capabilities and lowering the cost of running LLMs.

What is the chip? The GH200 pairs a robust GPU, akin to the H100, with an impressive 141GB of advanced memory and a 72-core ARM central processor, a setup tailored to boost AI model performance in large data centers. Its design optimizes AI inference, and its ample memory capacity is essential for deploying large AI models on a single system. Whereas training demands intensive, up-front compute, inference runs continuously, predicting outcomes or generating content in real time. With enhanced memory and processing power, the GH200 can handle extensive language models and other resource-intensive AI tasks efficiently.

What does this mean? Nvidia's announcement of the GH200 marks a major advance in AI hardware. By bolstering inference capabilities, the chip addresses the computational demands of real-time AI applications, from natural language processing to image generation. Its expanded memory capacity enables larger AI models to run on a single GPU, streamlining deployment and potentially reducing infrastructure costs. As Nvidia's competitors like AMD and tech giants Google and Amazon venture into custom AI chip development, the market is witnessing fierce competition and rapid innovation. The GH200 presents an exciting solution that could reshape how businesses and industries harness the power of AI, ultimately paving the way for more sophisticated and impactful AI-driven applications. 🤛

NVIDIA INTRODUCES “AI WORKBENCH” 🦾

Just when we thought Nvidia was about to be caught by other AI chip manufacturers, the giant takes another big step. 👣

What’s new? Nvidia has introduced another innovation, the AI Workbench, aimed at revolutionizing the creation of generative AI models. This move promises to simplify the complex process of developing and deploying these models across various Nvidia AI platforms, ranging from personal computers to powerful workstations.

How will this work? Nvidia acknowledges the abundance of pre-trained models available, but customization remains time-consuming and demanding. This is where the AI Workbench steps in, offering a user-friendly approach to tailor and execute generative AI models. Developers can harness enterprise-grade models with minimal effort, utilizing the diverse range of frameworks, libraries, and software development kits (SDKs) provided by Nvidia's AI platform, as well as open-source repositories like GitHub and Hugging Face.

What are the implications? With the AI Workbench, developers can efficiently share customized models across platforms. From local systems equipped with Nvidia RTX graphics cards to expansive data centers and cloud resources, the scalability is seamless for individuals and organizations. As Manuvir Das, Nvidia's Vice President of Enterprise Computing, says, “Nvidia AI Workbench provides a simplified path for cross-organizational teams to create the AI-based applications that are increasingly becoming essential in modern business.”

AI-GENERATED BOOKS STEAL AUTHOR’S NAME AND CONTENT 📗

Jane Friedman’s blog for authors and publishers can be found at JaneFriedman.com

AI generated books have gotten out of hand, and there is currently nothing stopping this train from barreling down the tracks. 🚂

What’s up? Author Jane Friedman recently had five books removed from Amazon that were listed under her name but were actually AI-generated.

How did this happen? Friedman has been blogging since 2009, so it is no surprise that AI models could pull from that vast amount of available content. The books, she said, were an attempt to profit off her name: “It feels like a violation, because it’s really low quality material with my name on it … It looks terrible. It makes me look like I’m trying to take advantage of people with really crappy books.”

What can be done? Friedman said the process of getting these books removed from Amazon was horribly tedious. She called for Amazon and other bookstores to “create a way to verify authorship.” She said, “They have no procedure for reporting this sort of activity where someone’s trying to profit off someone’s name … Unless Amazon puts some sort of policy in place to prevent anyone from just uploading whatever book they want and applying whatever name they want, this will continue, it’s not going to end with me.” In response, Amazon released a fluffy apology that offered no real solution to the issue. 😐

THOUGHTFUL THURSDAY 🤔

How often do you engage with an AI chatbot?


HAS AI REACHED SINGULARITY? CHECK OUT THE FRY METER BELOW

God has spoken, via The Pope, and He doesn’t like AI. The Singularity Meter drops 0.4%.

What do ya think of this latest newsletter?
