
An Unachievable Standard: Why Are We So Hard On AI Agents?

Welcome to this week’s Deep-Fried Dive with Fry Guy! In these long-form articles, Fry Guy conducts in-depth analyses of cutting-edge artificial intelligence (AI) developments and developers. Today, Fry Guy dives into a way to rethink the standard by which we judge AI agents. We hope you enjoy!

*Notice: We do not receive any monetary compensation from the people and projects we feature in the Sunday Deep-Fried Dives with Fry Guy. We explore these projects and developers solely to showcase interesting and cutting-edge AI developments and uses.*


🤯 MYSTERY LINK 🤯

(The mystery link can lead to ANYTHING AI-related. Tools, memes, and more…)

Imagine a judge who is trained on slews of subjective and biased data. This judge has been exposed to all sorts of internet propaganda, has been inundated by skewed facts, and as a result of these influences has somewhat sexist and racist tendencies. Do you really want that sort of judge making important decisions? Screw AI, right?

… Well, what if what was just described is not an AI system, but a human being?

It comes as no surprise that there is massive pushback against implementing AI in our society, especially in important roles. But when we step back for a second and compare AI’s performance in these roles to that of humans, we may find that we have set unrealistic expectations for AI and, as a result, are stifling opportunities for a better society. Let’s explore!

THE RISE OF AI AGENTS

We have seen emerging visions for AI doctors, AI therapists, AI educators, AI-driven cars, and more. Around every corner is another application for AI—it is being employed in all sorts of roles, from fry cooks to judges.

The purpose of implementing these AI agents is to make life a little easier and more affordable for both individuals and companies. Take the rise of AI therapists, for instance. Though not able to offer the same personal touch as a human, these therapists are often free and available 24/7. For some people, that alternative beats paying $125 an hour and booking a much-needed session on a weekday afternoon three weeks from now. For another example, look at self-driving cars, which have been all the rage for the past few years. Elon Musk recently showcased the Tesla Cybercab, powered by AI features, which aims to make driving safer and more convenient. Not to mention, the sleek design looks pretty cool.

Beyond consumer use, AI agents are also being employed in business functions of all kinds. Figure’s AI-powered humanoid robot recently got a job at BMW, working on cars and aiding in other factory functions. Tech giant Salesforce has been working with a wide variety of companies to implement always-available autonomous AI service agents that can understand and interpret customers’ questions using natural language. The hope is that implementations like these will make business operations more affordable and consistent, freeing up humans for more meaningful and creative tasks.

AI agents are making waves in almost every industry, and oftentimes they are stepping into roles currently (or previously) filled by humans. In a previous article, we discussed the impact of these AI applications on human jobs, an obvious concern for many people. In this article, however, we would like to consider something else: namely, what performance standard we should hold these agents to. By examining some of the pushback on AI agents, we may be able to shed light on this issue and suggest a change of perspective.

 

PUSHING BACK ON PUSHBACK

Some cases of AI implementation seem incredibly exciting. Take Flippy, for example. Designed by Miso Robotics in California, Flippy is an AI-powered robotic fry station that makes French fries faster and more consistently than a human. Staying on the topic of food, look at fast food chains like McDonald’s and Wendy’s, which have implemented AI-powered ordering systems in drive-thrus. The Wendy’s FreshAI system has improved customer wait times by an average of 22 seconds per order, which adds up quickly: at, say, 1,000 cars a day, that’s over six hours of waiting saved. Not to mention, it has reduced errors, filled labor gaps, and cut operating costs. So what’s the problem? Why are some people so opposed to such a system?

Well, in the fast food case, the problem is that AI has messed up some orders. Many of these mishaps have gone viral, painting such systems as a laughingstock. But let’s take a step back. Why are these small mistakes by AI systems treated as catastrophic? Why is ordering 55 cheeseburgers at an AI window so funny to people? Orders at drive-thrus get messed up all the time! In fact, traditional drive-thrus mess up orders for almost two out of every ten guests. Yet those mistakes rarely go viral, unless someone finds human fingers in their food (yes, this has actually happened, more than once). So this raises the question: what standard are we holding AI to? If AI does better than humans, why does it get such intense criticism?

This phenomenon extends beyond drive-thrus. For example, there has been talk of implementing AI judges in courts to rule on various cases. We could extensively train an AI model on the law, feed it all the details of a case and the evidence on both sides, and the AI could render a ruling in seconds. This could simplify court hearings and speed up the judicial process tremendously. However, most people immediately recoil at such an idea, pointing out that an AI judge might be unfair or biased toward certain groups of people because of skewed training data.

But wait just a second: aren’t human judges biased in fundamentally worse ways? Studies have found that judges are more likely to make favorable rulings at the beginning of the day and after a meal break than right before one. Moreover, humans are riddled with biases rooted in their upbringing, political stances, and personal experiences, and they are subject to all sorts of emotional influences that are difficult to measure and control. Even the most “unbiased,” well-respected judges cannot always escape their own human nature. AI, on the other hand, is not subject to these influences; it follows its data. So although that data can be skewed at times, if an AI judge is less biased than a human judge and outperforms the human at understanding and weighing the evidence to render a judgment, shouldn’t that be viewed as acceptable? It’s not perfect, but it’s better than the alternative. Shouldn’t the standard be that AI performs better than humans? If the standard is perfection, we shouldn’t just do away with AI systems; we shouldn’t let humans do much of anything!

Now, the goal of this article is not to minimize meaningful concerns or to green-light every AI innovation. Some implementations of AI warrant careful consideration and merited concern. Take, for instance, the controversies surrounding Waymo. The company has been putting AI-driven robotaxis on the road for years, and most of its taxis operate incredibly smoothly, getting passengers to their destinations without a hiccup. One recent incident, however, went viral for all the wrong reasons. Near Phoenix, Arizona, a police officer pulled over a Waymo car after it was found driving on the wrong side of the road in a construction zone. Thankfully, in this case, nobody was harmed. But trusting AI to operate cars that can do this sort of thing raises eyebrows and is a genuine cause for concern.

But let’s explore this case a bit more carefully. When Waymo reviewed the incident, it found that the car had responded to “inconsistent construction signage.” Some may argue that the incident was therefore due to human error rather than an error in the AI system. Nonetheless, AI systems need to learn to adjust to these sorts of conditions. Waymo acknowledged that such driving behavior is unacceptable, and incidents like this one allow AI systems to be tweaked and improved. Certainly, many AI systems are not perfect, and they will inevitably mess up from time to time, especially when first deployed. But without putting these AIs into real-world situations, we will never be able to learn from practical experience and make them better.

Furthermore, the main point remains: even when AI systems make costly mistakes like this one, why should we hold AI to a standard of perfection that even the most competent humans in the role cannot achieve? The standard should certainly be high (and perhaps higher in some roles than others), but we ought to hold AI to the same high standard we hold humans to. Asking anything beyond that is unrealistic.

RETHINKING THE STANDARD

AI agents are making their way into society, whether we like it or not. The train is on the tracks, and it’s barreling full steam ahead. In the midst of all this, seizing on every AI mistake as a reason to cancel developers and their systems is a misguided approach. Rather, we should view such mistakes as learning opportunities for these AI agents, not as reasons to do away with them altogether. So next time you find yourself mocking the AI drive-thru system for giving you a medium drink instead of the large you ordered, remember: having a little less Diet Coke to drink might be better than chewing on human fingers!
