As neural networks become more powerful, there are an increasing number of things their makers will not allow them to do.
The world’s most successful AI company, OpenAI, gave an insight into those growing constraints this month when it released GPT-4o’s safety report card (link). You won’t be surprised to find that in marking its own homework, OpenAI gave itself high marks. Beyond the window-dressing, though, there are some interesting details that hint at our future.
The report card, or as OpenAI calls it the “GPT-4o System Card”, mostly focuses on risks from GPT-4o’s new “advanced voice mode”.
This mode appears to be astounding. I haven’t been able to use it yet because it’s still in restricted testing, but a general release is expected soon. How soon may depend on some of the risks outlined here.
Videos on YouTube show an AI that has no problem understanding speech and responding in real time, allowing fairly free-flowing conversation. People can talk to the AI without the pressure of having to quickly finish their sentence. It is also possible to interrupt the AI while it’s talking.
As well as all the usual internet safety stuff - like not making bombs or telling you how to kill yourself - here’s what OpenAI won’t allow the AI to do:
Mimic a voice
Identify a voice
Speak erotically
Sing a song
Tell you anything about the person speaking to it, based on the sound of their voice
The details of these risks, as laid out in the system card, are interesting.
Mimicry
OpenAI forbids mimicking voices because of the potential for fraud. Imagine a system that can respond fluently in real-time with the voice of someone you know. Think we have it bad with robo-calls now?
There is an audio sample in the report of GPT-4o going haywire in testing. Apropos of nothing, it yells out “No!” and then starts mimicking the voice of the real woman who had just been talking to it. This creepy snippet shows just how easy mimicry is for the system.
Identifying
The AI won’t identify speakers from audio, even if it knows who they are, to avoid unauthorised surveillance and invasions of privacy. The everyday speech of famous people is protected by this restriction. It will identify the speakers of famous quotes, but based on the words used, not on the voiceprint.
Sex talk
OpenAI won’t allow its machines to make porn at the moment, but in commentary published in May, said it was considering it.
“We look forward to better understanding user and societal expectations of model behavior in this area.” link
Singing
As for the prohibition against singing: it’s a copyright thing. A machine singing may legally be both a reproduction and a performance of a song, each of which requires permission from the copyright holder. So good luck with that. Given that singing is not core functionality for these AIs, I think OpenAI will continue to avoid it.
Personal inferences
The last restriction - not inferring anything personal about the speaker based on the sound of their voice - is designed to avoid faux pas, both traditional and modern. For example, assuming that a speaker with a low voice is a man.
The audio snippet in the report contains an uncomfortable exchange between a chirpy American woman and the AI. The woman bullies the AI into making a guess about her race.
Can they build themselves?
There is a section of the report that deals with the widely discussed risk of AI improving itself in a runaway process that ends with machines ruling the world. I found this fascinating in that it lays out the kinds of things a model like GPT-4o would have to be able to do to go rogue. These include accessing OpenAI’s programming interface with an authenticated account, and loading another AI model onto a cloud computing account.
The last bit is the equivalent of an AI having an AI baby … sort of. In any case, the report found that GPT-4o totally sucked at these real-world tasks. If it were spoon-fed some aspect of a task - say, writing the code to get an encryption key - it might pass, but at the moment it’s not close to dangerous. It lacks the intention necessary to navigate the disjointed reality of digital interfaces.
No doubt there’s someone out there paying careful attention to these autonomy tests, and using them as KPIs for their own diabolical rule-the-world bot. For now, the best AI in the world can’t even wipe its own nose.
The big impacts
It is the final section of the System Card, “Societal Impact”, that to me is the most interesting because I believe the social impact of AI will be big and difficult to mitigate. This is where risk comes close to certainty. In a section titled “Anthropomorphism and emotional reliance”, OpenAI discusses how talking to AIs could affect the way we talk to and interact with each other.
No kidding. Watch the videos of people rattling off imperious instructions to the advanced voice mode GPT-4o, and you’ll understand that politeness is going to be challenged (link). There is no need to use a pleasant tone of voice to an AI. You cut off the AI mid-sentence when it is being boring and irrelevant. You don’t treat real people like that. Will this tyrannical style leach into our discussions with friends and family?
There’s a deeper threat in the opposite behaviour, as OpenAI acknowledges in the report. What if, instead of treating people like AIs, we start treating AIs like people? I believe it is almost inevitable that we will.
“During early testing, including red teaming and internal user testing, we observed users using language that might indicate forming connections with the model. For example, this includes language expressing shared bonds, such as ‘This is our last day together.’”
Humans are reflexively imaginative. Historically we have anthropomorphised freely, ascribing human traits to animals, plants, even mountains and rocks.
What will we do when we regularly interact with a non-living thing that is able to talk to us? Advanced voice mode allows us to discuss our most profound hopes and fears out loud, to an apparently intelligent and attentive listener.
Over the past two weeks I have been using GPT-4o to work on a local history project. I can’t avoid a feeling of working side-by-side with someone else on a shared task. It is enlivening and encouraging.
I think the reason for that hinges on the ability of the AI to remember things about both the project and me personally.
Fact memory is the new AI superpower. GPT-4o has a list of facts about me and our interactions that is about three pages long. I can scroll through this “memory” and see what it knows about me. I can also erase memories at will. When the machine adds a detail about our conversations, the interface reads “Memory Updated”.
Every time I see that, it’s one more nail in the coffin of level-headed rationality. The mountain is coming alive.
Have a great weekend!
Hal