
AI Agents As Moral Agents (Part 2/2)

Welcome to this week’s Deep-Fried Dive with Fry Guy! In these long-form articles, Fry Guy conducts in-depth analyses of cutting-edge artificial intelligence (AI) developments and developers. Today, Fry Guy dives into the ethical considerations surrounding AI agents. We hope you enjoy!

*Notice: We do not receive any monetary compensation from the people and projects we feature in the Sunday Deep-Fried Dives with Fry Guy. We explore these projects and developers solely to showcase interesting and cutting-edge AI developments and uses.*



Emerging visions for AI doctors, lawyers, educators, autonomous cars, and more are coming to fruition, raising a huge concern for the general public: How should these AI agents handle the moral dilemmas they will inevitably face?

In Part 1 last week, we discussed current approaches for programming AI agents as moral agents. We looked at three approaches:

  1. Prohibit moral stances: When faced with a moral dilemma, the AI agent will make no decision at all—a decision in itself—which may lead to an atrocity.

  2. Allow the training data to determine moral stances: When faced with a moral dilemma, the AI agent may go “rogue,” based on specific nuances and inevitable inconsistencies within the training data, possibly leading to atrocities.

  3. Implement personal/company moral stances: When faced with a moral dilemma, the AI agent will decide based on subjectively chosen training data and safeguards, leading to a moral bias issue.

Reflection on these approaches revealed many of the challenges associated with morally safeguarding these agents and underscored the seriousness of the conversation. We concluded that the most promising approach may be to safeguard these AI agents with general, widely agreed-upon moral principles. In this essay, we will look at what such safeguards might look like and why implementing AI agents might not be as big a problem for morality as we might initially think.

A GENERAL APPROACH

In March 2024, the United Nations General Assembly adopted its first global resolution on AI. The effort was led by the United States and co-sponsored by more than 120 other countries, including China. The resolution held that the two most important parameters for safeguarding AI models are to “…protect human rights and monitor for potential risks.” This pattern has been echoed in statements of all kinds from global leaders as well as from major tech players. At the end of the day, the two goals of nearly all regulatory measures and agreed-upon safeguards seem to be to 1) protect human rights and 2) mitigate potential risks. Given this overarching agreement, these principles seem like a good place to start.

According to many major governing councils, AI agents should be programmed to protect human rights and mitigate potential risks. At face value, this seems agreeable, right? But what do these fancy creeds actually look like in practical settings? What do mitigating risk and protecting human rights look like for the autonomous car choosing whether to protect the driver or the pedestrian, or for the AI HR agent that has to choose between disclosing the company’s private information and defusing a violent threat from a customer? These are the questions most people like to raise when it comes to ethics, especially the ethics of AI agents. However, these types of cases seem to be the wrong starting point.

Most acts we classify as “moral atrocities” are not these types of complex, no-win situations. Such situations are incredibly difficult, and oftentimes we have no clue what to do in them ourselves. If that is the case, then we should not treat AI’s struggles with such situations as a reason not to use the technology. Rather, we should view those struggles as a reflection of our own inability to navigate such complex moral dilemmas.

What seems relevant to problems of moral bias and atrocities, then, are not the difficult cases but the easy ones, such as whether to hurt an innocent person or treat them respectfully. To help AI agents handle such cases, it may be worthwhile to program subsidiary principles that prevent obvious atrocities (do not murder or torture innocent people, do not insult people, and so on). We might, in this way, view respecting humanity as a cluster value and tell the AI that, all else being equal, respecting humanity involves a willingness to help people develop rationality, promote happiness, protect life, and the like. If we could implement general moral principles like these into foundation models, they would handle a host of easy moral cases for AI agents, such as guiding an AI HR agent in its dialogue with an angry customer or prompting an AI-driven car to stop when a person is crossing the road, even outside a crosswalk.
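
To make the idea of subsidiary principles a bit more concrete, here is a minimal sketch of a rule-based safeguard layer sitting on top of an agent. Everything in it is hypothetical: the Safeguard class, the screen_action() helper, and the action labels are invented for illustration, and a real system would rely on far more sophisticated classifiers than simple string checks.

```python
# Hypothetical sketch: "subsidiary principles" layered on top of an agent.
# The rule names, Safeguard class, and screen_action() helper are invented
# for illustration and do not reflect any vendor's actual safety stack.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Safeguard:
    name: str                     # the general principle the rule serves
    rule: Callable[[str], bool]   # returns True if the proposed action violates it

# Subsidiary rules meant to catch the "easy" cases: obvious harms that
# almost everyone agrees an agent should never commit.
SAFEGUARDS: List[Safeguard] = [
    Safeguard("protect life", lambda action: "harm_person" in action),
    Safeguard("respect persons", lambda action: "insult_customer" in action),
    Safeguard("protect human rights", lambda action: "disclose_private_data" in action),
]

def screen_action(proposed_action: str) -> str:
    """Block any action that trips a safeguard; otherwise let it through."""
    for guard in SAFEGUARDS:
        if guard.rule(proposed_action):
            return f"BLOCKED: violates '{guard.name}'"
    return f"ALLOWED: {proposed_action}"

if __name__ == "__main__":
    # An AI HR agent deciding how to respond to an angry customer:
    print(screen_action("insult_customer_in_reply"))    # BLOCKED
    print(screen_action("offer_refund_and_apologize"))  # ALLOWED
```

The point is simply that the easy cases can be caught by general principles checked before an action is ever taken, leaving the model’s broader reasoning untouched.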

It seems the measure for whether an AI agent should be implemented into society should not be a standard of perfection (as we sometimes seem to demand) but rather a comparison with the alternative: humans. When it comes to morality, why should we compare AI to a standard of perfection that is unachievable by even the most competent humans currently in that role? The standard should certainly be high (and perhaps higher in some roles than others), but we ought to hold AI to the same high standard we hold humans to; asking anything beyond that is unrealistic. What matters for AI agent morality, then, is not whether AI agents handle difficult moral cases well, because nobody does. Rather, what matters is whether AI agents can handle easy moral cases more consistently than the humans in similar roles. And it seems that if safeguarded with general moral principles, AI would be able to do just that. Unlike humans, AI is not prone to temptation, bias, negative emotion, or selfish influence.

So maybe you are on board with the idea that if AI agents can handle easy moral cases based on general principles, there is promise for their handling of moral situations. But there is more to say about difficult moral cases. At the end of the day, we don’t just want AI agents guessing when it comes to complex moral dilemmas. Even though humans may make controversial decisions in such situations, we tend to have at least some level of trust in human intuition and rational capabilities. After all, we can trust that humans understand the weight of complex moral situations and tend to feel empathy and compassion, emotions not felt by AI bots.

In response to this concern, it may be worth reflecting on how we tend to make moral decisions in difficult cases. When we don’t know what to do in a difficult moral situation, we often try to weigh our options and deliberate about the best course of action. Oftentimes, we don’t have enough time or brainpower to consider all the relevant facts, yet we end up making a choice regardless. That can lead to regret, where we think, “I wish I had considered this or that relevant factor.” But AI agents can take in far more data than we can in any given moment, and at a much more rapid pace. What would take us many minutes to contemplate, AI can assess in less than a second. By giving AI general moral safeguards, a model might actually perform better than humans in difficult, pressurized, and often time-constrained dilemmas, because AI can take into account a plethora of considerations we never could.
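
To illustrate the deliberation point, here is a hypothetical sketch in which an agent scores candidate actions against many weighted considerations at once, something a machine can do in milliseconds. The considerations, weights, and probability estimates are all made up for illustration; a real system would derive them from far richer models of the situation.

```python
# Hypothetical sketch of rapid deliberation: score candidate actions against
# many weighted considerations at once. All weights and estimates are invented.

from typing import Dict

# Each consideration gets a negative weight reflecting the two general
# principles: protecting human rights/life and mitigating risk.
CONSIDERATIONS: Dict[str, float] = {
    "risk_to_pedestrian": -10.0,
    "risk_to_passenger":  -8.0,
    "property_damage":    -1.0,
    "traffic_disruption": -0.2,
}

def score(action_factors: Dict[str, float]) -> float:
    """Weighted sum over every factor the agent can estimate in the moment."""
    return sum(CONSIDERATIONS[k] * v for k, v in action_factors.items())

# Candidate maneuvers for a car facing a person crossing outside a crosswalk,
# with rough probability estimates (0.0 to 1.0) for each bad outcome.
candidates = {
    "brake_hard":   {"risk_to_pedestrian": 0.05, "risk_to_passenger": 0.10,
                     "property_damage": 0.0,  "traffic_disruption": 0.9},
    "swerve_right": {"risk_to_pedestrian": 0.02, "risk_to_passenger": 0.30,
                     "property_damage": 0.6,  "traffic_disruption": 0.4},
    "maintain":     {"risk_to_pedestrian": 0.80, "risk_to_passenger": 0.01,
                     "property_damage": 0.0,  "traffic_disruption": 0.0},
}

best = max(candidates, key=lambda name: score(candidates[name]))
print(best)  # prints "brake_hard": the least-bad option under these made-up numbers
```

The sketch is not a claim about how any real autonomous vehicle decides; it simply shows how a machine can weigh many factors simultaneously, something humans rarely manage under time pressure.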

So in hard cases, AI might hold moral promise that surpasses our own ability to deliberate about moral choices. It might not make a perfect choice in such cases, but then again, do we even know what that would be? If foundation models are safeguarded by general moral principles to mitigate risks and protect human life, then when faced with complex moral dilemmas, these agents will make the most rational choice that does just that … and what else could we ask for?

WHERE DOES THAT LEAVE US?

AI agents are infiltrating society, and their decisions are being guided by safeguarded training data. As these AI agents are plugged into decision-making roles in our society, they will inevitably face dilemmas with moral weight.

In this two-part series, we haven’t tried to answer all the moral questions surrounding AI agents or even give a perfect solution to the problem. Rather, our goal has been to get you thinking about the severity of this issue and introduce you to some of the current approaches to guiding these AI agents in moral decisions. This is a problem many people would like to push away or avoid altogether, but it is one we believe is here to stay. As a result, the best thing we can do is try to responsibly leverage the positive aspects of AI in the moral life while mitigating harmful biases and moral atrocities.

Next time you interact with an AI agent, pause to consider that it operates based on underlying moral programming. On one hand, it may be a saint, but on the other hand, it may not be as moral as it appears.
