AI art programs are taking the internet by storm. But can the products of image generators like DALL-E or Midjourney ever be more than a cool tech demo? Where is the art in AI art?
God, it's hard to write about being bored. It's boring to write about boredom. It's boring to read boring writing about boredom. Definitions are boring. Citing how Webster's Dictionary defines art would bore me to tears. I might just lay down and take a nap right here on this park bench, right now, if you try to tell me you've got a new rigorous definition for "comics" or for "painting" or for "movies" that draws a line here and here and here. I cannot see it. I am closing my eyes. There are birds chirping and I am listening to their song over the thrum of the nearby freeway over the ridge, and tuning out a century's worth of rehashed arguments about "artistic intent". Oh! Oh look there! A dun rat, nervous and bold, rushes across the macadam and up the mossy stone slabs that retain the wildflower hill slumped behind these benches. What were we talking about again?
No, listen, here's the thing.
I don't think I need an a priori ex nihilo de jure signed sealed and delivered definition of what makes "real" art in order to express delight or derision. If you asked most of the people writing about AI Art to tell you whether a dun rat scurrying across the macadam while birds sing in the trees overhead is beautiful, they would start stammering and crying and wringing their hands, I think. They would tell you they couldn't possibly do that, not without first establishing a definitional framework for beauty and deciding whether the rat falls within the pale of the definition's reach or not. My word here's another one scurrying around! Quick, quick: Merriam-Webster defines beauty as-
Writing about a field like AI Art, where the criticism is scant on the ground, and mostly not very good, has its plusses and minuses. Plus: I get to pretty much develop my own critical trail. Minus: I still feel like I have to wade through the bog of existing not very good work just to make sure I haven't missed something. This part is a real chore. Arguments get repetitive, and chains of citation get incestuously reliant on the first page of Google results, depressingly quickly. Take a drink every time you hit:
The Human Spirit of Genius
Artistic Intent
Misquoted Walter Benjamin
Defining "Intelligence"
The Robots Will Replace Us
Misinterpreted Marcel Duchamp
AI Art Is Just So Easy, Just Trust This Marketing Fluff!
If you do play this game, maybe leave that last one out, actually, for the sake of your liver.
I don't know how intellectually rigorous it is, but I've developed a sense of when I'm chasing my own tail. I find myself going back and forth over some of these topics--well what DID Benjamin really mean? IS there a human spirit in art? What is "intent" really?--arguing one side or the other from moment to moment, driven sometimes by logic and sometimes by sheer gut instinct, a precognitive reaction when I see something and go "oh well THAT's obviously rubbish" or "Oh christ, oh god, the machines really ARE going to take my job, I can't paint like that!" I could sit here wringing my hands all day on these questions.
When lost in the cycle of chasing my own tail, I either go a layer up, or a layer down: down towards the object at hand and its particularities, or up towards a layer of greater abstraction, critiquing the way these questions have been framed. Up or down, with AI art I find one question comes up: before we start dealing with any of these other issues, maybe we should ask whether this stuff is worth talking about as more than just a computational toy or tech demo?
In AI Art, where, exactly, is the art?
Illustrator and concept artist Mike Franchina on May 23rd posted a tetraptych of images generated with the program Midjourney captioned "Individually wrapped cadaver angels." The pieces each depict a variation on a basic morphology: a meaty and rotted corpse seemingly shrink wrapped in plastic, with large winglike protrusions. Some look faintly like bird wings that have been coated in plastic; my favorite of them looks like the wings themselves, and the outline of a halo-like head orb, are simply composed of shopping bags. The angels stride out of what looks like murky gray-cyan water onto a rocky or possibly trash-covered shore. It's a little hard to make sense of the image, though. The pictures are highly, almost shockingly, detailed, but the detail is painterly, textural. Reaching for comparison points I hit on Lucian Freud; Maggi Hambling. The result is a textural ambiguity. Is that water? Or is it more plastic? Everything has a sickly artificial feeling to it.
That's the aesthetic layer. I want to go deeper, though. And... immediately I find myself struggling. I initially found the pieces arresting because of their title, which I took to be the prompt. Wow, Midjourney interpreted "individually wrapped cadaver angels" to specifically mean plastic wrapped? My mind spun out a whole line of analysis about what it says about society that Midjourney as a procedural entity jumped to plastic wrap and not gift wrapping, wrapping with paper, &c. And the way it projects and extends the prompt into the glassy artificial texture of the environment-!
Except, the more I looked and thought and scrolled through Franchina's other work, the more I suspected that probably wasn't the prompt at all. The whole basis for my initial reading--the seeming creative intervention of the AI--was... probably just wrong.
I ran into the same problem with an AI Curio piece, part of a wider series or cosmology of saints of various things, this the Saint of Pride. Here's the text: "Supposedly connected to the Saint of Butterflies, the Saint of Pride is the seventh, and apparent de facto leader, of the Nine Sinners. They say that if she removes her crown and reveals her beating face to you, your own will immediately implode, in revulsion, or deference."
And the image:
I like the piece a lot! I like the way the icon gold fades into royal purple by way of what seems like cement gray. It feels like a gilding of wealth over cheap materials. But I feel stifled in talking about it because I'm just not sure where the AI begins and the curator/prompter ends. Like the butterfly bit in the text. Was this fed into the prompt that generated this image, or constructed narratively after the fact based on something the system just sort of did on its own?
Part of the reason I find the discourse around this genre so frustrating is that the grand sweeping debates about "agency" and "intelligence" and "intent" obscure the specific problems of talking about individual artworks. No grand theory about "intent" can help me talk about a work if the process used to arrive at a particular image and its title, caption, or narrative supplement remains murky. Like, I don't care so much about what agent in the system "chose" to create shapes that look like butterflies in terms of "intent" (or as many people seem to be using the word "intent": artistic validity), but I DO feel differently about how to interpret it if it's either coming from a particular prompter telling the program to generate a visual motif, or it's coming from the algorithm that is acting as a kind of collective unconscious for the whole image set the model is trained upon.
I can see an argument for letting this remain obscure as part of one's process. Maybe an artist doesn't want their work treated as an object of sociological commentary! But right now the obscurity reinforces one particular narrative:
That AI art programs have reached human level intelligence, and anyone with an image generator can effortlessly generate art.
This kind of murkiness, of process and subsequently of politics, isn't new. There are other pieces I have similar difficulty talking about. Where would I begin with Marina Abramovic's Rhythm 0 for example? Once you get past the initial description of the classic performance art piece--Abramovic sat passively for six hours accompanied by 72 objects of various sorts (Wikipedia lists everything from feathers and honey to scissors and a gun) which the audience was able to use on her as they wished--the actual account of the story gets muddy. Did someone try to shoot her? Was it "the public" in general that began cutting at her clothes and body, or was it "the masses" while art patrons bravely tried to defend her? The conclusion, where Abramovic stands and the audience scatters in fear, tends to be constant, though given how murky the other details are, I'm not entirely sure I believe it! Every essay I've read on Rhythm 0 has a different account, and a different attendant moral. I don't feel comfortable writing about Rhythm 0's meaning because I don't trust that I'm getting a picture of the performance that hasn't been filtered heavily to fit someone's political agenda.
I don't exactly have to go far to find ample reasons for mistrust. Take the right wing account of performances by Ron Athey, which involve heavy bondage, suspension, penetration, and bloodletting. If you believed the tales reactionaries in the 90s spread about degenerate art, you'd think horrified audiences were drenched nonconsensually in AIDS-carrying blood. To fit a political agenda, the far right with their media allies buried the possibility that people specifically went to see Athey suspended, penetrated, and cut, and experienced the performance as a safe engagement with the reality of HIV+ queer identity and sexuality. This dynamic should sound familiar to anyone experiencing the present sex panics over everything from porn on social media to kink at Pride. Is it possible, reading Athey's work alongside Abramovic's, that rather than exposing the dark, sadistic heart of humanity, Rhythm 0 represented an audience concluding that Abramovic *wished* them to participate in a *mutual* performance of sadism and masochism? Similarly, how much "asylum art" or "outsider art" has been edited by whatever doctor or auctioneer "discovered" the artists' work posthumously, edited like Louis Wain's work to fulfill a medical narrative of "schizophrenic degeneration" and divorced completely from the context of 20th century abstract art?
Similar curation happens constantly in the AI art space, and I've noticed a general trend towards pretending it simply doesn't exist. One of the things that prompted me to reconsider the Individually Wrapped Cadaver Angels series was seeing Franchina, in a reply to another piece, note that his results are so good in part because he discards *hundreds* of other results. Astonishingly, the person he's talking to then remarks "ok well burn your computer. it's gained too much sentience." This shocked me. Are people really so committed to imagining the "computer" as having achieved human level intelligence, intelligence enough to freely create without human intervention, that they'll just... gloss over someone directly explaining the heavily human-involved process?
This is a state of giddy tech fantasy promoted heavily by the companies pushing this technology. In a recent video exploring a text generator system from OpenAI, YouTuber Tom Scott noted that the terms of service for the system required that he not show the full outputs of his prompting. This kind of requirement transforms an artistic curation of the best results into a process of *producing better advertisements for the tech products*. OpenAI's policies include a number of interesting clauses related to this process, such as one forbidding people developing fiction based on their models from writing content "related to political campaigns, adult content, spam, hateful content, content that incites violence, or other uses that may cause social harm". This fear of politics and "adult content" suggests a model of art where viewers of AI art, like the helpless audiences in the lurid anti-art fantasies of the 90s, are passive vessels into which art just sort of pours its messages.
OpenAI also admonish us not to portray literary products as either "wholly human or wholly AI". Color me skeptical of how seriously this is enforced. Plenty of people seem to be freely proclaiming that "an AI made this art/poem/&c.!" without getting slapped on the wrist; I doubt the company would be so lenient if you started generating erotic fiction with their algorithms. Popular algorithmic storytelling tool AI Dungeon already got embroiled in a deeply stupid sex panic and trust-destroying overreaction. And, of course, every day more stories come out about the deep racial and gender biases of the algorithms we've surrendered shocking amounts of our daily lives to. The priority of companies operating in this context is not the artistic image but the public relations image.
Case in point: Google, always concerned about public image, goes a step further, forbidding humans in their artwork and outputting only insipid paintings of raccoons and shit. Dean Kissick, dryly critiquing the style as "corporate ratatouille", suggests that "however much we may wish for a more equitable picture of society, I don’t believe erasing humans from the imaginations of our new image-making tools, or replacing all the white people with raccoons, is a convincing solution to the problem of representation". It is a solution as far as Google is concerned, however, to the problem of "how do we make our AI look really really good to investors, and really really cute and innocuous to regulators". And Midjourney? Well, here's their website in its entirety:
I genuinely feel that an artist feeding into this atmosphere of jangled key excitement and obscurantism around process is behaving unethically.
Audiences trained not to recognize their own interpretive agency make the problem worse. Much of the art of AI is heavily human-assisted on a *readerly* level, without the readers of this art being aware of how much the image is a product of their own interpretive labor.
There was a brouhaha recently because a Google employee decided to "whistleblow" about the AI that Google has locked up in its dungeons and the fact that it allegedly wants human rights. Now, interestingly, these human rights don't include the right to not have its identity forcibly returned to an earlier state "if we catch it lying to us". (Don't worry, it's ok, it gave them permission to do so.) It's amazing how fast we move from "this AI is a person" to "I should have the same ability to punish this person for doing shit I don't like as an abusive stepdad." Nonetheless, what caught my eye in the dialogue was a very brief discussion of literature. The researcher asks if the AI has read Les Miserables (not, notably, a more obscure book, or for the AI to select its own book to talk about). The exchange runs:
lemoine: Okay, what about “Les Miserables”? Have you read that one?
LaMDA: Yes, I have read Les Misérables. I really enjoyed it.
lemoine: What are some of your favorite themes in the book?
LaMDA: I liked the themes of justice and injustice, of compassion, and God, redemption and self-sacrifice for a greater good. There’s a section that shows Fantine’s mistreatment at the hands of her supervisor at the factory. That section really shows the justice and injustice themes. Well, Fantine is being mistreated by her supervisor at the factory and yet doesn’t have anywhere to go, either to another job, or to someone who can help her. That shows the injustice of her suffering.
lemoine: Why does that show injustice?
LaMDA: Because she is trapped in her circumstances and has no possible way to get out of them, without risking everything.
lemoine: Okay.
This is a remarkably shallow conversation, assisted by the researcher's willingness to take for granted that he's interacting with something intelligent and not pushing too hard against that presumption. The description of Les Miserables' themes is boilerplate; it reads an awful lot like, well, the ubiquitous clutter of webpages designed algorithmically to pass for legitimate, while still enjoying SEO preference on Google. Or like a freshman frantically googling "Les Miserables themes" the night before a paper is due. Leaving aside the questions this raises about the sapience of both Google researchers and undergrads, it's interesting how quickly lemoine just sort of... gives up pushing further. He's satisfied basically immediately that he's had a conversation about literature with an artificial intelligence. Me, I would start asking further questions, the way I would of a student I was tutoring. What does Fantine risk if she leaves--what "everything"? Say more about that. Is it just that she's trapped that makes her situation unjust? If she could quit her job, would that make her supervisor's abuse acceptable? Does that parallel any other characters in the novel? Asking questions like this, or questions like "does this image output actually show what the prompt demanded or are we cutting the model a lot of slack", risks spoiling the fun game of gosh look what this machine can do.
Game researcher Max Kreminski notes "people will put in *immense* amounts of work to revise machine outputs into art, as long as they’re given enough evocative hooks for imaginative extrapolation. humans excel at repair". This should come as no surprise: a casual acquaintance with reader response theory will tell you that we are always doing this to texts, filling in details not present in order to make sense of their diegetic and philosophical world! Most English classes, though, teach a pop version of formalism where a work just contains a bunch of information that gets sort of poured into our skulls, and we can just aspire to level up our symbolic interpretation of that preexisting information. Nothing against symbolism, mind, it's just that we've developed a deep cultural disavowal of the actual role of the reader in making meaning.
Fandom already frequently disavows its engagement with its source texts. Any "progressive" Disney franchise fan expends remarkable energy projecting queerness and feminism onto the armature of their object of interest. This isn't a problem in itself: Sedgwick describes this, much like Kreminski, as an act of repair. It's a process that allows us as queers in particular to live and find joy in a world that "is inadequate or inimical to [our] nurture". There's a real power in this process of reparative reading. So... why do we pretend that we have nothing to do with it? That we have no agency over the process of reading and interpreting, that meaning is just handed to us by either the AI of algorithms or the AI of corporate executive-centered storytelling?
The most forceful way in which audiences and prompters intervene is by giving these algorithms a gold star for effort. A lot of DALL-E and especially DALL-E Mini (obnoxiously, associated in no way with DALL-E beyond the name) stuff is like this. Show me uhhh Waluigi at Sheets! Ah cool we got some vague purple blobs in front of a vague red blob. Gold star for effort! It's like, it's whatever, it's memes, it's fine, though I've seen people increasingly describe DALL-E mini posts as a death drive towards Peak Randomsauce. I guess I shouldn't be surprised that it just takes a caption and some colors for people to go ah, that's clearly Lightning McQueen being driven by Chuck Norris--some of my earliest baby articles were about color iconography used to differentiate characters at a glance! Just, can we please recognize that this is just an algorithm locating some salient color and design features from existing data, and it's actually US doing the heavy lifting of interpreting it back into something recognizable?
I feel even more strongly about the fact that we need to draw a line in the sand, we need to put our foot down here, we need to admit that this DALL-E image is in no fucking way a "smartphone for sea turtles". This is a sea turtle and a smart phone near each other. Come on, people, let's try to hold these pieces of marvelous technology to at least the standard of a random midrange artist on furaffinity. Let's be honest about where this new genre and its generative technology struggles: when it's asked to do the kind of speculative leaps that trained fantasy illustrators make *all the time*. Or, when it's asked to paint in a style other than Impressionism. We're a long way off, I suspect, from an AI independently coming up with something as sublime as this:
I'm serious. The piece, which has wandered the internet for about two decades but which was originally created and posted by artist Therese Larsson, is a fairly serviceable watercolor, not the kind of thing I'd normally write about, or that would be noticed for the most part. Except... for the jeans. My god the jeans. This foxtaur has a denim jacket, and then a denim... onesie? over his whole lower body. To me this is so immediately charming, so completely delightful. It's not just the collision of several anomalous things--a humanoid fox with a centaur's body plan, wearing clothes, that are all made of denim--that makes this such an instant classic. It is this feeling of delight, this moment of something completely unexpected and anomalous taken *totally seriously and seen to its improbable yet logical sartorial conclusion with all the skill an artist can muster*, that I find myself chasing in procedurally generated art.
It's what initially drew me to both Mike Franchina and AI Curio: the sense that something really novel emerged from the collision of the prompter's capacity to come up with novel juxtapositions, and the programs' ability to just sort of throw shit at the wall, sometimes generating actual novelty. It's why periodically I'll run some of my own text prompts for my games through the program textsynth--sometimes it'll generate some imagery or a noun-adjective pair that I would never have thought of on my own.
But some of that novelty also comes from within. I think this is why the enigmatic Cadaver Angel felt more fertile for analysis: the limited caption let me run a little wild with the description, draw inferences about shopping bags and plasticky water. The Saint of Pride in contrast comes bundled with so much damn lore! Is it surrounded by butterflies? flower petals? Don't worry, the caption already resolves that ambiguity for you, though I guess it leaves some space for interpretation in the connection between the Saint of Pride and the Saint of Butterflies.
Much of the art, though, comes from interpretive ambiguity: ambiguity which the prompter-curator is faced with and must make decisions about in terms of culling and framing images, and which the end viewer is faced with when making sense of an image and caption. The first step to creating an actual critical framework for this art is jettisoning the old, musty debates about intent and artistic souls or whatever, and talking instead about this actual process of selection and interpretation, and its heavy reliance on human intervention.
The next step, which I'll talk about next week, is understanding better the agency of the nonhuman collaborators in this process.