RUE QIAN

“The Complete Works of Williams Shakesparrot” is an experiment in training a language model iteratively on its own output. Starting with the original data, a collection of Shakespeare’s texts, I trained the model three times, each time using the output text from the previous iteration as the new training data. The size of the training data shrinks on each iteration, because I select only sample passages from the last output. I then used the final iteration to generate a series of Shakespearean plays, which became the Shakesparrot text.
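As a rough sketch of what this loop looks like in code, here is a minimal Python version. To be clear, this is an illustration rather than my actual pipeline: a toy word-level Markov chain stands in for the language model I trained, and “shakespeare.txt” is a placeholder filename. Only the shape of the loop (train, generate, sample down, retrain) mirrors the experiment.

```python
import random
from collections import defaultdict

def train(text, order=2):
    """Build a toy word-level Markov model: map each n-gram of words
    to the words observed to follow it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        model[tuple(words[i:i + order])].append(words[i + order])
    return model

def generate(model, order=2, n_words=2000):
    """Sample text by walking the chain from a random starting n-gram."""
    out = list(random.choice(list(model.keys())))
    for _ in range(n_words):
        followers = model.get(tuple(out[-order:]))
        if not followers:  # dead end: no observed continuation
            break
        out.append(random.choice(followers))
    return " ".join(out)

# "shakespeare.txt" is a placeholder for the original corpus.
corpus = open("shakespeare.txt", encoding="utf-8").read()
for _ in range(3):
    model = train(corpus)
    # Each round retrains on a shrinking sample of the model's own output.
    corpus = generate(model, n_words=len(corpus.split()) // 2)

print(generate(model))  # the final "Shakesparrot"-style text
```

Even in this toy version, each pass narrows what the model can say, because it only ever sees what the previous model was already inclined to produce.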

The feedback loop I designed is a direct response to one of the dangers of language models identified in the article “On the Dangers of Stochastic Parrots” by Emily Bender, Timnit Gebru, and their colleagues:

“the risk is that people disseminate text generated by LM, meaning more text in the world that reinforces and propagates stereotypes and problematic associations, both to humans who encounter the text and to future LMs trained on training sets that ingested the previous generation LM’s output.”

If we take a closer look at the Shakesparrot text, its words and phrases have become highly repetitive. Themes like death, souls, and lords appear constantly; the model is evidently not distilling Shakespeare’s work but drastically reducing it. Nor is there a clear narrative to ground the play, even though I specifically asked it to generate one. This output makes visible the language model’s nature as a “stochastic parrot,” stitching together Shakespearean text based on algorithmic probability rather than intention.
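One rough way to make this flattening visible is to compare word frequencies in the source corpus against the final output. The sketch below is hypothetical (the filenames are placeholders), but a comparison of this shape is what shows the vocabulary collapsing toward a few dominant words:

```python
from collections import Counter

def top_words(path, n=10):
    """Return the n most frequent words in a text file,
    lowercased and stripped of edge punctuation."""
    words = open(path, encoding="utf-8").read().lower().split()
    return Counter(w.strip(".,;:!?'\"") for w in words).most_common(n)

# Placeholder filenames: the original corpus vs. the final iteration's output.
print(top_words("shakespeare.txt"))
print(top_words("shakesparrot.txt"))
```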

Working with such a small dataset allows me to really peek into how the algorithm works and to inject my perspective into the dataset itself through the iterative process. This aligns with Data Feminism’s call for embracing a multiplicity of voices, “with priority given to local, Indigenous, and experiential ways of knowing.” (D’Ignazio, Klein, 2020) In this case, the “experiential way of knowing” is my knowledge of the model’s underlying mechanics, and my active curation of and intervention in the training of the language model. This counteracts the conventional, passive reliance on automated scraping of data from the internet, which only perpetuates the biases encoded within the dataset itself.

Moreover, my practice raises questions about authorship and creativity in the age of AI. The Shakesparrot text, while rooted in Shakespeare’s work, has morphed into something distinctly different through this process of iterative learning. It challenges traditional notions of literary creation and ownership, blurring the lines between the original author, the AI, and myself as curator.

Although the experiment highlights the dangers of language models, I find something genuinely liberating here, especially in using a small dataset to make visible the algorithmic automation of language. Allison Parrish, a computational poet, writes in their article “Language models can only write poetry” that the language model “imitates… meaning” but nonetheless “overflows the borders of signification.” (Parrish, 2021) When I read the Shakesparrot text, I read it not for what it does or what it says, but for how it makes me feel; through my embodied reading experience, it has the potential to make meaning that transcends the language itself. This is different from reading something generated by a flawless language model that perfectly mimics how humans talk.

In game design, there is a theory called the “immersive fallacy,” which argues that graphics indistinguishable from “real life” do not make a game fun to play: “The word ‘bear’ would not be better if it had teeth and could attack you.” (Lantz, 2005) While an LLM can talk indistinguishably from a human, it is we humans who make meaning out of the text. Parrish makes this argument evident through the distinction between poetry and a poem, quoting Charles Hartman, an early computational poetry writer, from his Virtual Muse:

“… the program could produce a simplistic kind of poetry forever, but it could never, by itself, produce a poem. All sense of completeness, progress, or implication was strictly a reader’s ingenious doing.” (Hartman, 1996)

Poetry, according to Hartman, is a material that can result from any process conforming to some kind of literary compositional structure. A poem, on the other hand, is an intentional arrangement resulting from some action: someone decides where the poem begins and ends.

To me, this suggests that poetry dictates a closure of meaning in the form of a structured convention, while poems open it back up by deconstructing that structure and reconstructing it in a physical and social context, giving it meaning through the personal lenses of both the poet and the readers.

This distinction between poetry and poem, from my perspective, resonates with Lantz’s claim that “the gap [between simulation and reality] is where the magic happens.” (Lantz, 2005) In the case of Shakesparrot, the fragmented glitches within the text are “where the magic happens.” Such magic challenges us to find meaning not in a flawless imitation, but in the quirky, unexpected, and sometimes nonsensical output of the AI, where conversations like data feminism unfold.

Another reason I find this experiment liberating is that it breaks the hegemony the language model imposes on language. Language models, by their very design, work to minimize randomness in word selection, creating a sort of linguistic hegemony: an obligation about how language should be systematically used. The Shakesparrot experiment, however, shows the potential to counteract this hegemony by turning the model’s power against itself. The iterative process not only reveals the model’s propensity to reinforce certain patterns but also opens up avenues to challenge and subvert them. Allison Parrish calls for using the language model as a way to question its conventional expectations:

“Another approach is to dig deeper into the form, motivation and implementations of language models themselves, as a way of producing interesting poetic artifacts but also of questioning the underlying conventional assumptions about language models—that they should be constantly increasing in size and accuracy.”
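To ground the claim about minimized randomness in something concrete: at generation time, models typically sample the next word through a softmax whose “temperature” controls how much randomness survives. The sketch below is illustrative only, with made-up scores, but it shows how a low temperature collapses the choice onto the single likeliest word:

```python
import math
import random

def sample(logits, temperature=1.0):
    """Softmax-sample an index from a list of scores; lower temperature
    concentrates probability on the highest score, reducing randomness."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]  # numerically stable softmax
    return random.choices(range(len(weights)), weights=weights)[0]

scores = [2.0, 1.0, 0.1]  # made-up scores for three candidate words
print(sample(scores, temperature=0.1))  # almost always index 0
print(sample(scores, temperature=2.0))  # noticeably more varied
```

At a temperature near zero the sampler behaves almost deterministically, which is the homogenizing pull I am describing; as the temperature rises, less likely words return.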

And I would like to make a similar call to action: let’s view language models in a way that pushes against their inherent limitations and biases, and let’s use their power to foster a broader understanding and appreciation of the complexity and beauty of human language.

Bibliography:

Bender, Emily M., Timnit Gebru, et al. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” 2021.

D’Ignazio, Catherine, and Lauren F. Klein. Data Feminism. MIT Press, 2020.

Hartman, Charles O. Virtual Muse: Experiments in Computer Poetry. Wesleyan University Press, 1996, p. 31.

Lantz, Frank. “The Immersive Fallacy.” 2005. YouTube, https://www.youtube.com/watch?v=6JzNt1bSk_U.

Parrish, Allison. “Language models can only write poetry.” 2021. posts.decontextualize.com/language-models-poetry.