shsbdncudx
Presumably “scraped” isn’t the right term here. They already have the raw data; they won’t be “scraping” it from the website, they’ll just be ingesting it from wherever they store it.
parasti
It's funny because the entire Facebook ecosystem is designed to disincentivize meaningful posting. Just keep watching the ads and short form videos, user.
encoderer
That’s nothing. AOL has just finished training on 29 years of emails and messages. It’s hoped that with more H100s the AI will finally be able to calculate the full amount due by BillG for the emails mom has been forwarding.
Noumenon72
Was "public" ever the default setting? I remember it as being opt-in if you ever wanted something to show beyond your friends-of-friends.
mattcantstop
I am very likely in the minority here, but I think AI SHOULD be trained on everything that is in the public sphere. I'd be disappointed if it wasn't trained on everything they had access to.

If it is trained on private information, then I would have issue with it.

gnabgib
Discussion (81 points, 3 days ago, 79 comments) https://news.ycombinator.com/item?id=41508158
tdeck
Can we talk about how most of us haven't read 80% of everything on the internet and yet we are all still better at many basic things than these AIs? At what point do we admit to ourselves that this isn't a sustainable path forward?
fidla
Well they don't really know if someone is an adult or not. Just because they say they are 13 doesn't mean that they really were when they signed up. And 13 is hardly an adult now is it?
kylehotchkiss
It's OK. Meta is training their AI on hundreds of thousands of posts with photos of veterans with toilet plunger legs celebrating their birthdays in the middle of the street while sitting as sturdy as the Lincoln memorial. The AI brain rot has already begun in this model.
orochimaaru
Why is this surprising? They’ve always done this. In fact I’d be surprised if they didn’t do this. Fwiw - llama is free to use. So I guess it’s a good enough return.

I don’t use Facebook. I’m not sure if they can peek into WhatsApp messages.

paxys
So did OpenAI and Anthropic and Google. That's what "public" means.
koolala
Skynet Ads are "said" to be preferred. "People prefer to see relevant ads." Can AI understand humans better than humans understand themselves? Can Humans understand the consciousness of Dogs and Cats better than they do?

The objective answer feels like No but the subjective answer feels like Yes. Humans will never understand how an animal truly thinks, but we understand how to control them.

autoexec
I don't believe for a moment that they haven't used the data of countless children. Especially early on, when kids just had to click an "I'm over 18" button or enter a fake birthday to get accounts and Facebook, like everyone else, just looked the other way.
duxup
We're going to create a really bad AI and get upset by that fact only to discover that ... we all made it that way.

https://www.youtube.com/watch?v=Y-Elr5K2Vuo

AlexandrB

    People just submitted it.
    I don't know why.
    They 'trust me'.
    Dumb fucks.
-Mark Zuckerberg

Things change, but this never stops being a concise summary of Meta's ethos as a company.

geertj
I imagine a future AI trained on this going into therapy to uncover childhood trauma.
greesil
I am the product.
not2b
This would include all those celebrity posts on Instagram. Great for deepfakes. They'll try to protect against that, but a bit of cleverness with prompts should be able to get around the filters.
PaulHoule
Assuming they want to build a model that can do useful things with their own data (say any kind of content filtering, summarization, etc.) it is exactly what they should do.
whoitwas
I don't understand how this surprises anyone. You choose to give them your data. It's not free. If you don't want them to have your data, don't give it away.
aplusbi
Honestly this feels like a better policy than most AI training - Meta actually has explicit rights to the content it is using. Sure it was EULA click-through but at least it's something that the content creator ostensibly agreed to.

Of course I'm sure Meta is also training their AI on content that they scraped from the internet/other sources without permission...

MisterBastahrd
Meta just created the dumbest object known to mankind. Quite an achievement given our current political landscape.
golergka
If it's publicly posted, it literally means that everybody can read it. What's exactly the issue here?
nottorp
So Facebook's "AI" will speak Australian slang instead of Nigerian business English?
WuxiFingerHold
Meta can and will use every WhatsApp, Facebook or Insta post of every user on the planet if they think they can benefit from it. They don't care about any data protection laws or ridiculously low fines anyway. Meta is the most evil and powerful company on the planet. Believing anything else is naive. No news here.
ilrwbwrkhv
Serves them right. Anyone who puts up their images on Facebook willingly deserves to be subjugated.
Cyclone_
Aren't most AIs trained on public data, i.e. this doesn't seem terribly surprising?
ado__dev
Not surprised at all. Facebook owns the platform and outlined in the ToU that they can do whatever they want with the content you post on there.

At least it's better than scraping content off platform (which I'm sure they've done) and using that, but using content posted on their own platform seems like a no-brainer.

nkmnz
Is this true for posts from people with deactivated/deleted accounts as well?
musicale
I guess that's why I'm getting recommendations for Tim Tams.
almost_usual
Can’t wait to see the memes it generates.
dboreham
Journalists discover how AI works...
annoyingnoob
Garbage in, garbage out.
CamperBob2
If the service is free, you're the product. Here's a radical idea: if you don't want Facebook to use your information and content, don't post it to Facebook.

... or does everyone around here think anything different is happening to their posts?

gmd63
"They just trust me...Dumb f**s" - Mark Zuckerberg
mylons
how is _anyone_ surprised by this?
jewelry
Why is this even news? Google scrapes all public posts to build its search index… A bunch of 3rd-party vendors scraped all public posts to build ad pricing models…
jppope
I for one am shocked. Shocked I say. There are dozens of us surprised by Facebook's actions... DOZENS.
globalnode
how is this even news? people getting outraged that data put in the public domain gets used by someone... what world am I living in here?
SoftTalker
Funny to think that the distillation of 16 years of Facebook posts is now considered "intelligence."
landedfolk
[flagged]
pbhjpbhj
In the UK I'd say they've definitely committed copyright infringement. Fair Dealing doesn't allow this.
askafriend
This isn't really that groundbreaking of a story...

Of course they'd do this! How did people think feed ranking worked?

The only reason this is being reported now is because there's a chatbot and I guess that feels different to people.