So a few days back, I discovered that with the new Flux v1.1 Pro image gen model, if you use a prompt like "IMG_XXXX.HEIC", you get some extremely realistic images.
I spent a few hours playing around with it, trying out a ton of combinations. Ended up with a very large library of images and decided to make a quick website to share it.
Some notes and observations:
1. Using a prompt like "IMG_XXXX.HEIC" tends to yield the most realistic images, but most of these are rather mundane: landscapes, flowers, poorly shot cityscapes.
2. Adding "IMG_XXXX.HEIC posted on Snapchat in [year]" yields more realistic, casual images of people. However, a lot of these look like screenshots, complete with the Snapchat UI. Most people also tend to be attractive.
3. Adding a [year] in the prompt yields some interesting images. For example, [2017] yields blurrier images than [2023], and [2021] produced images with face masks and face shields.
4. The prompt "[firstName] [lastName] selfie" gets real-looking selfies of real-looking people. You can use Indian, Hispanic, Chinese, European, American, etc. names and get realistic images of people with these ethnicities. Example: https://d1l4k1vcf8ijbs.cloudfront.net/fakeimages/pc5JnhNRUEq...
5. There is a decently high failure rate. The ~750 images on the site are hand picked. I had to delete around 220 images for not meeting the criteria (not real enough) or being just bizarre
6. If this model is any indication, it's soon going to be impossible to tell what's real online.
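The prompt recipes in the notes above can be sketched as a tiny prompt builder. The template strings come straight from the notes; the function names and structure are my own hypothetical illustration, not anything from the Flux API:

```python
import random

def camera_roll_prompt(year=None, platform=None):
    """Build an "IMG_XXXX.HEIC"-style prompt with a random 4-digit number.

    Optionally append the "posted on [platform] in [year]" suffix that the
    notes say produces casual, social-media-looking photos of people.
    """
    prompt = f"IMG_{random.randint(0, 9999):04d}.HEIC"
    if platform and year:
        prompt += f" posted on {platform} in {year}"
    return prompt

def selfie_prompt(first_name, last_name):
    """Build the "[firstName] [lastName] selfie" pattern from note 4."""
    return f"{first_name} {last_name} selfie"

print(camera_roll_prompt())                  # e.g. IMG_0421.HEIC
print(camera_roll_prompt(2021, "Snapchat"))  # e.g. IMG_8837.HEIC posted on Snapchat in 2021
print(selfie_prompt("Priya", "Sharma"))      # Priya Sharma selfie
```

These strings would then be passed as the text prompt to whatever Flux endpoint you're using.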
Wow, that's amazing, you really can't tell. What is it that gives most AI images that not-quite-real look? Is it the airbrushed images in the training data, or the cartoons, 3D renders, and illustrations mixed in, or all of the above?
I was going to say that. Pretty crazy! The seatbelt looks more like a robotic arm. Yet at the back there is a curve and a back window. It's really a hybrid car-building. Wondering what the black bar at the top is about...
The model is too small to memorise that much, and papers usually check things like "closest sample from the training set", so we know it's not memorising (unless an image appears hundreds of times in the training set, like the Mona Lisa).
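For context, those "closest sample from the training set" checks are nearest-neighbour searches under some similarity measure. A toy difference hash (dHash) in pure Python illustrates the idea of a cheap near-duplicate signature; actual memorization studies use stronger embeddings (e.g. CLIP features), so this is only a sketch of the concept:

```python
def dhash(pixels):
    """pixels: 2D list of grayscale values. Returns a bit string where each
    bit records whether a pixel is darker than its right-hand neighbour."""
    bits = []
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits.append("1" if left < right else "0")
    return "".join(bits)

def hamming(a, b):
    """Count differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))

# Two nearly identical "images" hash identically; a different one is far away.
img      = [[10, 20, 30], [40, 35, 30]]
img_copy = [[11, 21, 29], [41, 36, 29]]  # same gradients, slightly shifted values
other    = [[90,  5, 80], [ 5, 90,  5]]

print(hamming(dhash(img), dhash(img_copy)))  # 0 -> near-duplicate
print(hamming(dhash(img), dhash(other)))     # 2 -> clearly different
```

A memorization check then amounts to: hash every training image, hash the generated image, and flag it if the minimum distance falls below some threshold.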
Why would it memorize some but not others if each contributes the same amount of information? E.g. the LAION dataset is about 240TB, while FLUX is about 10GB, so on average it could only memorize about 0.004% of each image, not enough to reproduce it.
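A quick back-of-envelope check of that ratio, using the sizes quoted above:

```python
# If the model's weights were spread evenly across the training set,
# what fraction of each image could it "store"?
dataset_bytes = 240e12   # LAION, ~240 TB
model_bytes   = 10e9     # FLUX weights, ~10 GB

fraction = model_bytes / dataset_bytes
print(f"{fraction:.4%} of each image")  # 0.0042% of each image
```

Of course weights aren't spread evenly, which is exactly why frequently repeated images can still be memorized while the average image cannot.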
The vibe I'm getting is that they try to make it look real by making it dull.
The problem with AI images is that they don't make sense and by making the image dull, it reduces the urge to make sense of it I guess.
AI images have very low fidelity. You know the "A picture is worth a thousand words" phrase? I think AI images fail at that because they are not an instance of a very complex system but a very concentrated subject, if you know what I mean.
When someone captures a picture of a dog, that's actually a picture of a story about the past; that is, the surrounding environment is arranged in a way that lets you tell what happened moments ago, or years ago. AI pictures lack that story, and that's why I think the dull images are easier to pass off as real: they don't invite you to think about what happened moments before.
Congrats, they do look good at first glance, without the usual overly shiny look, and the pure Nature images do look real to me. I'd really like to visit some of these places.
I guess the question is: Who cares? What is this for, except illustrating blogspam?
It seems that more resources are being poured into verisimilitude across generative models, but what is the business model or even human use case for it?
A picture of a glorious landscape seems worthless to me without any grounding to be able to ask a question like, "where is that?", "when do those flowers bloom?", "what is on the other side of that mountain?" and receive any kind of interesting answer.
1. Scams. Human deepfakes/fake-fakes to make you believe you’re communicating with a real person.
2. Self-image editing. You want a picture of yourself doing something you are unable to do. Could be benign, but very likely being used as a scam on social media in some way.
3. Marketing. Putting your product in some setting without having to do a photoshoot. People will argue this isn’t a form of scam, but it seems suspect to me.
Can't wait for an influencer that has thousands of travel pics, thousands of likes and comments, but they're all computer-generated.
And then for the influencer to try to get freebies from companies in return for a favorable review in her feed.
Also, asking AI for a generic image is a lot cheaper than paying for stock photos (do stock photo services still charge buyers a lot and pay photographers peanuts?).
HN's guidelines say "don't shit on other people's work". That guideline is heavily enforced sometimes, but goes out of the window at other times. It seems AI related stuff tends to be more on the "go ahead and take a dump" side of things.
I tried to use it to generate images when the alternative was costly or had a prohibitive licence. However it did not reproduce the mundane subject (German mailboxes, empty apartments without light fixtures, Berliner Altbaus) well at all.
I did use it to make friends laugh in our little group chat, but in this case AI nonsense was not a bug but a feature. Artefacts added to the humour more often than not.
I did plan to use it for deliberate art practice. It can generate unlimited subjects and let me see hundreds of variations of the same detail. It also lets me see the same concept in different styles. This could be useful.
By and large though, image generation seems to be a tool for spam and porn.
Even as a big supporter of image gen AI, I don't like realistic "photos". I remember hitting photorealism and proudly showing them to my wife, and she goes "awesome, where is this?" Something about the fact that these were completely fake absolutely sank the value of them for me. It's a fraud - there's no history in the image, no true cultural value. It's not a capture of a place or emotion, it's a mimic. The architecture isn't real, the people never existed, the animals are incorrectly visualized, the light doesn't behave as real light would, etc.
I think the sweet spot currently is in obviously fake stylized images or illustration/3D-ish type stuff. I know artists reject that, but I think it will eventually become a paintbrush of sorts for them, not replacing them any more than code gen AI replaces software engineers.
Another angle on it - I think there's something really wrong with consuming fake images of real life while thinking they're real. It's like the visual equivalent of an artificial, cancer-causing sweetener. I would want to know if the blog image I'm looking at is AI-generated or not.
I wonder how far into generative AI we'll get before people develop "intractable FOMO": they see a beautiful place, then find out they can't visit it because it doesn't exist.
Are we seeing it already from people who are getting irrationally angry about generative AI?
That is fair. I am not smart enough to shape it out but I wonder if there is a difference between going into a reality expecting it to be fake and being unknowingly presented a fake reality.
Maybe I am just weird but I have definitely had at least one situation where a book got its hooks into me and my brain would be thinking about events in the book like I would real events. I forget exactly what parts caught my brain but I had a few times where I had to remind myself the book is fiction when it would pop into my head.
I block a lot of this stuff on social media. I don't want to develop feelings about places or people that don't actually exist, I think it's psychologically unhealthy.
A bunch of the people images are very clearly AI though. I’d wager about 30-50% of them could be recognised as generated by people with a bit of understanding of how these models work.
Nobody knows how these models work. Not even experts. These are black-box algorithms; people understand them only through the analogy of a best-fit curve through a series of data points. Outside of that analogy, nobody understands how generative AI works.
What made a model for a specific situation choose to generate a hand with 6 fingers instead of 5? Or 5 instead of 6? Nobody knows.
Even before generative AI was an issue, in the US you couldn't just tell a court "this is a photo, therefore it is evidence". There had to be witnesses who could testify about how that photo was produced, and the other side could cross-examine.
But yes, you're already seeing politicians caught doing or saying embarrassing things claiming it's all a deepfake, but in most prominent cases of this there were lots of witnesses who can confirm that yes, they did say that.
Until AI generated imagery has been tested by the legal system it may be a bit too optimistic to call this "Royalty-free, copyright-free gallery of images".
How different from a source image do these AI generated images need to be to be considered "copyright free"?
If I grab a series of photos from shutterstock, run them through a generative AI photo enhance process to improve the white balance, contrast and levels is that adequate enough to be considered "copyright free"?
> Until AI generated imagery has been tested by the legal system it may be a bit too optimistic to call this "Royalty-free, copyright-free gallery of images".
Although it's conceivable there's a surprise legal finding, companies like OpenAI and Anthropic are confident enough in how it will go that they are willing to insure you for any lawsuits, which would be ruinous for them if they consistently lost.
(One can certainly argue that AI means the law should change, but that's a separate question.)
> If I grab a series of photos from shutterstock, run them through a generative AI photo enhance process to improve the white balance, contrast and levels is that adequate enough to be considered "copyright free"?
No, just like it's not enough for me to grab a photo and change white balance and contrast. AI doesn't change anything here. Copyright infringement is generally tested by comparing the two works directly.
> How different from a source image do these AI generated images need to be to be considered "copyright free"?
The same way it's always tested: "substantial similarity"
https://en.wikipedia.org/wiki/Substantial_similarity
The trick here is that OpenAI might win and won’t be liable for plagiarizing the NYT (or my own GitHub repo, as I observed with GPT-3.5). But if YOU ask ChatGPT for something about American politics for your blog, and it gives you a paragraph from the NYT, then YOU are responsible for the copyright infringement. Even though you of course had no clue it was plagiarized.
I do not think this is at all likely to happen with newer LLMs, but when GPT-3.5 spat out hundreds of lines of my own F#, verbatim, it certainly convinced me that this tech is too skeezy and unethical for me to use. I really don’t like the idea of playing Plagiarism Roulette.
> and it gives you a paragraph from the NYT, then YOU are responsible for the copyright infringement. Even though you of course had no clue it was plagiarized.
I had forgotten that; I think they changed it a few months after ChatGPT launched, in response. But you'll still be the one getting the angry letter. And this is a pretty shady exception IMO:
> This section governs your use of services or features that OpenAI offers on an alpha, preview, early access, or beta basis (“Beta Services”). Beta Services are offered “as-is” to allow testing and evaluation and are excluded from any indemnification obligations OpenAI may have to you.
> Although it's conceivable there's a surprise legal finding, companies like OpenAI and Anthropic are confident enough in how it will go that they are willing to insure you for any lawsuits, which would be ruinous for them if they consistently lost.
I don't think this is the right interpretation, at all.
They can act confidently about this because corporations can't go below zero; the downside of the bet is capped at bankruptcy, so the bet works out even if they internally believe they're likely to lose in court.
>If I grab a series of photos from shutterstock, run them through a generative AI photo enhance process to improve the white balance, contrast and levels is that adequate enough to be considered "copyright free"?
Hard to say. If I had the generative AI copy the photo but change the time of day and angle then would it be copying?
What if I went to the same location and changed the angle and the time of day? Would that be copying?
AI is essentially "drawing" the same photo from a different time and angle. What if I did the same thing photorealistically by hand in Photoshop? Would I be copying if I painted the picture the same way the AI did?
I want you to consider what I'm doing here with my reply. I am admitting to a crime right now. What I have done with this reply is literally rip off different vocabulary words and certain short phrases from books all over the world and mixed up those words and phrases to produce the reply here. I am ADMITTING to copying those books.
Are you going to accuse me of a crime even though I admitted to it? No. But if I did the same thing with AI... are you going to accuse me then?
I spend an inordinate amount of time looking at plants and pictures of plants, and the first image I was shown prominently displays a species very familiar to me. Only, it's subtly wrong, with weirdness along the edges. It jumped straight off the page and distracted me from the rest of the image. Oh well, real life sucks anyway.
Still pretty easy to spot on some of these at least. The dog in a park, the collar is obviously off. The cyclist on a dirt trail, the bike is completely whackadoo. Anything with small, specific details.
Of course, in a few years (months?), models will get better at this, and even those tells will fade away.
Interesting note: my wife works with dogs, and I just took that photo over and asked her what breed she thought it was. Her first take was "it really just looks like an old mutt, I can't tell the marking colors. Maybe some Rottweiler in there?"
When I pointed out the collar she took a closer look, and then pointed out "oh, and look at its legs! They're way too short for its body!" Which, in retrospect, is the most hilarious error in the picture, hah.
There's also this LoRA for Flux that generates "1999 Digital Camera Style" images. Those are also very realistic in my opinion. It's probably the imperfections and how mundane the photos are that make them believable. https://civitai.com/models/724495?modelVersionId=810420
Wow. We have really opened Pandora’s Box. Imagine what political opponents and their supporters will be pumping out with this stuff. This makes me sad for my children’s futures.
Mountain Sheep in Snow [1] looks fake to me - the sheep are more like dogs fading back to rocks, and have different scales.
Underground Time Display [2] looks obviously fake as the clock's colon is in the wrong place and the sign has fake writing.
[0] https://d1l4k1vcf8ijbs.cloudfront.net/fakeimages/ozjUKlysLql..., https://d1l4k1vcf8ijbs.cloudfront.net/fakeimages/XcNnWM6L54Q...
[1] https://d1l4k1vcf8ijbs.cloudfront.net/fakeimages/It9DhumKX7w...
[2] https://d1l4k1vcf8ijbs.cloudfront.net/fakeimages/GygiLirdHvq...
However, details are still off, e.g.
* the guy you linked to apparently sits in a car, but the ceiling looks like a house's (at least I've never seen a vehicle like that). The reverse issue with this guy: https://d1l4k1vcf8ijbs.cloudfront.net/fakeimages/EeJBCnNsZG1...
* the bicycle guy sits in the air, and the bike is mutated in several places: https://d1l4k1vcf8ijbs.cloudfront.net/fakeimages/WrfWYZlthe2...
* The face in Yoga in the field is distorted: https://d1l4k1vcf8ijbs.cloudfront.net/fakeimages/BUcURAtyzjb...
* Hands are ok-ish but not yet solved: https://d1l4k1vcf8ijbs.cloudfront.net/fakeimages/SeO8u2HZ-V2...
* Any text is obviously fake, which also affects urban environments. Agree with this caption: https://d1l4k1vcf8ijbs.cloudfront.net/fakeimages/oc6eI5w2kQQ...
Bonus points for this portrait where the tower seems to have a face as well: https://d1l4k1vcf8ijbs.cloudfront.net/fakeimages/6ho0FIV-i2t...
https://x.com/fofrAI/status/1841854401717403944
The Apple VR goggles salespeople's ears have just perked up...
But most of the images here are just mundane enough that they could have been taken by your average smartphone user
GeoGuessr-like sites are going to get trolled hard with AI photos of nonexistent locations.
Follow the trend line. You'll be seeing stories, movies and works of art better than what humans produce in the not-too-distant future.
No. They indemnify you from copyright lawsuits that arise from your use of ChatGPT output. https://openai.com/policies/service-terms/
Why actually travel when you can easily generate photos and videos of yourself doing all sorts of things in all sorts of places?
I can't even tell what's real and what's AI. I'm actually more likely to assume an authentic image is AI!
This is such a depressing but accurate take on travel.
At least if it's true I can finally enjoy everything without the crowds of selfie sticks.
- Enhance search capabilities
- Add a paid API
- Add a whole lot more images :)
Other royalty-free image APIs out there, like Unsplash[1], have difficult licensing terms. AI is poised to disrupt this space.
[0]: https://www.lummi.ai
[1]: https://unsplash.com/developers