For now, it’s fairly harmless since it’s only a blooper in a lab, but there will likely be open-weights versions of this sort of thing eventually. And there will probably be people who argue that it’s a good thing, somehow.
About 15 minutes into the conversation between my kiddo and ChatGPT, the model started to take on my kiddo's vocal mannerisms. It started using more "umms" and "you knows."
At first this felt creepy, but as I explained to my kid, it's likely because their own speech had become weighted heavily enough in the context for the LLM to start incorporating it, and/or somewhere in the embedded prompts there's an instruction like "empathize with the user and emphasize clarity," and that prompting means mirroring back the user's speech style.
This is exactly the same as that, only with audio.
Software is auditable; AI models, it seems, aren't even attempted to be held accountable.
https://youtu.be/v1Y4CubBi60 (5:30)
Imagine a world where dictators are replaced rather than killed. Roll back the dictatorship over years, install a democratic process, then magically commit seppuku in a plane crash.
Brilliant. What could go wrong? /s
That makes it entirely different from the text-to-speech models we had previously; uncensored, this model could do all sorts of voice acting etc. for games. But this example shows why they try to neuter it so hard: it would spook a ton of people in its raw state.