sashank_1509
I have played with it for 20 minutes and here’s my review:

1. The low latency responses do make a difference. It feels miles better than any other voice chat out there.

2. Its pronunciation is excellent and very human-like, but it is not quite there. Somehow I can tell instantly that it’s a chatbot; it feels firmly in the uncanny valley.

3. On the same note, if I were on a call and there was a chatbot on the other side, I could instantly tell. It’s a mix of the voice and the way it responds; it just does not sound like a human talking to you. I tried a bit to make it sound more human-like, asking it to stop trying so hard in conversation, to be briefer, etc., but I wouldn’t say it made things better.

And so my final review is: it is a big achievement over anything out there, and nothing else comes close, but it is like video game console graphics. You can instantly tell it’s not the real thing, and because of that I find it harder to use than just typing to it.

tkgally
I got access to the Advanced Voice mode a couple of hours ago and have started testing it. (I had to delete and reinstall the ChatGPT app on my iPhone and iPad to get it to work. I am a ChatGPT Plus subscriber.)

In my tests so far it has worked as promised. It can distinguish and produce different accents and tones of voice. I am able to speak with it in both Japanese and English, going back and forth between the languages, without any problem. When I interrupt it, it stops talking and correctly hears what I said. I played it a recording of a one-minute news report in Japanese and asked it to summarize it in English, and it did so perfectly. When I asked it to summarize a continuous live audio stream, though, it refused.

I played the role of a learner of either English or Japanese and asked it for conversation practice, to explain the meanings of words and sentences, etc. It seemed to work quite well for that, too, though the results might be different for genuine language learners. (I am already fluent in both languages.) Because of tokenization issues, it might have difficulty explaining granular details of language—spellings, conjugations, written characters, etc.—and confuse learners as a result.

Among the many other things I want to know is how well it can be used for interpreting conversations between people who don’t share a common language. Previous interpreting apps I tested failed pretty quickly in real-life situations. This seems to have the potential, at least, to be much more useful.

(reposted from earlier item that sank quickly)

throwaway13337
I'm in Europe and was able to access the feature with a VPN.

Surprising that there isn't a 'Hey Siri' for ChatGPT yet. Obviously, that would make this sort of feature infinitely more useful. This is what monopoly gatekeeping looks like.

The limitations in this feature show the problems with both EU proactive regulation and US underregulation.

Bad regulation has become the biggest issue standing in the way of useful software for humans.

modeless
I tried asking it to practice Chinese with me. It claimed to be able to identify tones. I tested it by using the wrong tones on purpose and it said my pronunciation was "really great". Seems like it just praises you no matter what you do.
kanwisher
It's pretty amazing. If you are curious, just try asking it to do a live translation with a friend who speaks another language. It's real-time and very seamless.
m3kw9
Some review bullet points:

1. It's a bit too agreeable. Example: "that's an excellent point" etc. every single time.

2. It understands surprisingly well. Example: from experience, when I explain something vaguely, my expectation is that it will not understand, but it does most of the time. It removes the frustration of needing to spell things out in much more detail.

3. It feels like talking to a real person, but the AI talks in a somewhat monotone way. Example: it responds with a similar tone and level of excitement every time.

4. Very useful if you need to chat but don't want to chat with humans about some subjects, like ideas and explanations.

mnicky
Interesting that the release comes a day after Google's new models [1]. Seems a bit like strategic timing :) Maybe they waited until some of the competitors released something so that they could upstage that release with their own?

____

[1] Which, btw, I think deserve better sentiment. On benchmarks, the new Gemini Pro seems to be better than GPT-4o. It's just not so hyped...

martypitt
> Advanced Voice is not yet available in the EU, the UK, Switzerland, Iceland, Norway, and Liechtenstein.

That's disappointing. I wonder if it's related to legal issues, technical issues, or just doing a phased rollout?

riwsky
If "speaks faster, doesn't mind being interrupted, and still will happily spout bullshit in multiple languages" is what defines advanced voice mode, then that must mean humanity's advanced voice mode is "being from New York".
starfezzy
I was hoping to find a voice like Microsoft's "Guy" (a name, not referring to the gender) or Google Assistant's "Pink". An unambiguously white, masculine, American "radio voice" or "audiobook narrator" voice.

ChatGPT describes this as "A rich, deep, and smooth tone that is pleasant to listen to for extended periods. This often comes from good control over pitch and timbre, creating a voice that resonates well."

If you watch YouTube, voices in this theme are the Pirate Software guy and the voice of The Infographics Show.

There are similar voices for every gender, race, and nationality. As an American, Morgan Freeman comes to mind as a comfy black, masculine narrator voice.

All this is to lead up to my point that companies engage in a meticulous science when deciding who should voice roles, especially when the product itself is literally just a synthetic voice and they have near-limitless capacity to shape it.

With that in mind, here are the voices that OpenAI wants us to hear:

Breeze: ambiguous gender, white, feminine

Juniper: female, black

Maple: female, white

Spruce: male, black, masculine

Arbor: male, Australian, masculine

Sol: female, white

Ember: male, black, less masculine

Cove: male, Sal Khan, less masculine

Vale: female, British

The only one that could be considered a narrator/radio voice is unambiguously black (great if that's your preference). It just seems weird that they would intentionally exclude a masculine white male, and that sucks because those are always my preferred voices when I'm looking for audiobooks or choosing a computer voice. It sucks in particular because OpenAI is not staffed by dumb people—this exclusion was intentional, and that's obnoxious.

My last note on the Advanced Voice feature is that it makes my phone HOT within a few seconds, which will limit its usefulness on sunny days, when I need hands-free use the most while the phone is mounted to my dash. This is when the device is already liable to overheat (display forced to dim, lagging due to shutting down CPU cores, and in the worst case the phone shutting off and refusing to work until it gets cooler).