After I took that brief dip into Stable Diffusion, my friend Mr. Breakpoint began exploring a similar tool called Midjourney.

He was looking for a creative outlet after a bad breakup, and found a worthy distraction in learning how to wield the keyword prompts and other widgets of the Midjourney interface. There is apparently always a learning curve to these things, because there is some skew between what the person at the controls has in mind, how the software interprets the prompting, and whatever assumptions got baked into the data it was trained on. I'm going to be exploring that in a set of posts, because he made some really interesting images.

His first project was a series of national space program critters. This is "Soviet Space Hippopotamus":

[image: Soviet Space Hippopotamus]

You'll notice immediately that the quality is way better than something generated for free on the Stable Diffusion demo site, because Midjourney is a commercial product and has been "trained" a lot more, on a lot more images.

By default, Midjourney presents you with four attempts at rendering a result, based on different random number "seeds." You can then pick one and ask it to refine the image in various ways. That's how Breakpoint got the relatively polished image above. Then, just by adding the keyword "general", he got this:

[image: Soviet Space Hippopotamus, "general" variant]

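A quick aside on those "seeds," for the technically curious: Midjourney's internals aren't public, but the open-source Stable Diffusion pipeline exposes the same mechanism, so you can see for yourself how a single prompt fans out into multiple candidates. Here's a minimal sketch using the diffusers Python library; the model name and the prompt are just placeholders I picked for illustration.

    import torch
    from diffusers import StableDiffusionPipeline

    # Load an open diffusion model (assumes a CUDA-capable GPU is available).
    pipe = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
    ).to("cuda")

    prompt = "soviet space hippopotamus, cosmonaut portrait, profile view"

    # Same prompt, four different seeds: four different candidate images,
    # roughly what Midjourney shows as its initial grid of four.
    for seed in (101, 102, 103, 104):
        generator = torch.Generator("cuda").manual_seed(seed)
        image = pipe(prompt, generator=generator, num_inference_steps=30).images[0]
        image.save(f"candidate_{seed}.png")

Re-running with the same seed reproduces the same image, which is what makes the "pick one and refine it" step possible in the first place.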
Once you've got a bunch of settings dialed in, it's fun to let it ride, with minor tweaks. Here's "Chinese Dragon Taikonaut":

[image: Chinese Dragon Taikonaut]

This image was picked from several and refined, but aside from that, just swapping "soviet hippo" for "chinese dragon" in the keywords resulted in an image that was both deceptively novel and deceptively similar in composition and style. As easily as the idea for a series of themed images comes to you, the images themselves can be manufactured.

Also, while there are plenty of details in the image that are screwy - like the way the dragon's head doesn't fit in the helmet, or the position of its tail - it still makes a handy template for an experienced artist to retouch. And so we stumble again, directly into one of the imminent consequences of generative art that I pointed out in the post about Stable Diffusion: The idea problem.

An artistic process can be described as a rough combination of two things: Ideas and execution. If you hired an artist to generate a series of space-agency-themed animals, you would be contributing the initial spark of an idea, but the artist would then go on to personally iterate on that idea - sometimes with execution, by making a few rough sketches, and sometimes with more ideas, by examining photos of animals or reading about the various space agencies. Whatever the approach, the artist would be aware of the external sources they were drawing from, and would generally produce all the iterations from scratch, acting on a learned instinct based on all the other art they've seen or produced. Unless they were unscrupulous - or just very unlucky - the result of their instincts and their research would be a piece of art made to your specifications that is also unique.

But now we introduce generative art, and let's assume that the artist who uses it does not have any idea how the generative software was trained.

You give the artist the initial idea, like before. The artist then turns that idea into a few carefully worded sentences, and dumps them into the generative AI. It cranks out a dozen or so polished-looking pieces in a range of styles. Now the artist picks a few, maybe tweaks them, and presents them to you as drafts. You pick the one you like, ask for a few changes which the artist gladly applies, and the work is done. You get your art, and the artist probably put in one tenth of the work creating it for you.

Before, the execution of the art could be laid squarely at the feet of the artist you hired. They did the sketches, they did the iterating, they did the tweaks. With the generative art, the artist has taken a step back from the canvas, and most of the execution was done by the software. That by itself could reasonably be welcomed as progress, by artists and commissioners of art alike. But here's where it all goes wrong: In those iterations, who came up with the ideas?

Who researched the animals and turned them into stylized forms? Who researched the space agencies and came up with appropriate clothing, props, and color schemes? Who came up with the idea to compose them in profile, facing right, with those facial expressions?

Okay, suppose you have to tell the machine about the composition. "Make the characters all face to the side, visible from the waist up, with their helmets open, looking stoic." These are all things you would realistically have to supply to Midjourney to get this series of animals. Here's another example, "French Space Kestrel":

[image: French Space Kestrel]

Whose idea was it to put the Eiffel Tower in the background? Sure, it's the cheapest way to make someone think of France, so it feels obvious, but nevertheless, whose idea was that?

The only sensible answer you could give, when you're talking about something created by processing thousands of pieces of human-curated art and photography into tiny cross-referenced pieces, is, "everyone who contributed anything, and possibly some people more than others."

And so, how in the hell do you fairly compensate those original artists for their work?

The true origin of an idea has now been obscured in a process so interconnected and complex that it's comparable to a weather system. Can we credit that butterfly flapping its wings in Detroit with causing that hailstorm in Chicago? Well, it probably contributed, but ... how? How much?

When ordinary people began putting together videos of each other dancing to songs and posting them to YouTube, the site's developers quickly realized that they had a copyright problem: they were redistributing copyrighted music without compensating the rights holders, and those rights holders - big, powerful record labels - were annoyed. So they created software that would crawl through every newly posted video and detect the fingerprint of any commercially published music. The video containing the music would be tagged, identifying the artist, and the website would play an advertisement before or after it. The idea was that the revenue gained from the advertiser would be split, with a fragment shared with the record company. So the record company turned a loss into a gain.

You could see a similar approach eventually working for any professionally produced generative art tool: If the tool was trained using art from a particular artist, that artist would be due a certain amount of compensation from the makers of the tool. That doesn't address the "weather system" problem - which may be impossible to address directly - so much as sidestep it: each artistic work used to train the tool is assumed to have a certain value, worked out between the artist and the developer, and every time the tool is used to make artwork, everyone gets a cut of the profit according to the value they've negotiated, no matter what artwork was produced or what the prompts or settings were.
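To make the arithmetic of that scheme concrete, here's a toy sketch in Python. The artist names and per-work values are made up for illustration; the only point is that the payout depends on the negotiated weights, not on the prompt or the output.

    # Hypothetical flat-rate payout: every artist whose work went into the
    # training set gets a cut of each generated image, pro rata by negotiated value.
    NEGOTIATED_VALUES = {
        "artist_a": 3.0,   # made-up weights, standing in for negotiated terms
        "artist_b": 1.0,
        "artist_c": 0.5,
    }

    def split_revenue(revenue, weights):
        """Divide the revenue from one generated image in proportion to each
        artist's negotiated weight, regardless of prompt or output."""
        total = sum(weights.values())
        return {artist: revenue * w / total for artist, w in weights.items()}

    # Ten cents of revenue from one generated image:
    print(split_revenue(0.10, NEGOTIATED_VALUES))
    # -> roughly {'artist_a': 0.067, 'artist_b': 0.022, 'artist_c': 0.011}

Notice that everyone in the table gets paid on every image, whether or not their work resembles the output - which is exactly the point of the next caveat.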

You'd also need to account for the negative training effect I mentioned in the Stable Diffusion post. It wouldn't be fair, for example, to compensate just the people who drew foxes when you generate your latest animal iteration, "Canadian Space Fox":

[image: Canadian Space Fox]

That seems like one possible solution that could be implemented after a lot of legal dust has settled. It also has flaws.

For example, here's one related to training: If you were a company producing training data, you could get permission from a pile of artists to submit their work according to some compensation terms you negotiate. Then, you could tell the people assigning descriptions to images to label them not just according to the original artist, but according to artists who have produced similar work but didn't submit any material. The outcome would be a generative art tool that you could order to create, for example, "a sketch in the style of Bernie Wrightson", even if the training data didn't include a single sketch by Bernie Wrightson.

Would that be fair to Bernie Wrightson? Well, according to current legal precedent, yes. You can't copyright a style, which is what you would effectively be doing if you prevented other people from creating works that resembled yours from fresh material. (I mean, I'm sure that Disney would absolutely do that if they could. They would patent the very idea of singing cartoon animals.)

But what would happen to Bernie Wrightson's career? Would he go from being established enough that people want to name-drop him in their training data, to suddenly never working again?

Hard to say. Oftentimes in the art world, imitation means interest, which can be redirected to become opportunity, and no one knows yet what kind of tools the next generation of artists will use to stay relevant.

In the meantime, software projects like Stable Diffusion and companies like Midjourney operate in a Wild West environment. They built their data sets by dumping in truckloads of scraped images without the original artists' permission to have their work processed in this way, and have so far been willing to do all kinds of talking and negotiating - short of actually deleting those data sets and starting over on more equitable ground.

But even if they did, the tools and the theory they've developed are already being distributed among people who don't care about compensation or copyright. People who are basically analogous to software pirates and hackers.
