RUE
QIAN

Seeking the Trace of Meaning Beyond System

2024 MFA Thesis Research Paper
Keywords: modern linguistics, literary criticism, natural language processing (NLP), generative-AI, graphical user interface (GUI)

The prevalence of generative-AI shifts the interface between human and machine to natural language. Such a dynamic calls for a symmetrical representation between the meaning of a prompt and its visual form, which presupposes the presence of an absolute meaning in every word token. Drawing connections to post-structuralist linguistics and literary criticism, I intend to deconstruct the default simplicity implied by design elements like the text box, in order to envision alternative prompting experiences that allow for a more fluid freeplay of meaning through a redesign of the current Graphical User Interface of generative-AI.

I suppose every language learner has had this frustration: when you look up a word in a dictionary, it gives you more words in an attempt to help you understand that word, but this only makes you look up another word, and then another, until you finally come across some simple words you think you understand.

This exhausting process of trying to find the meaning of a word is an example of what Jacques Derrida, the thinker behind the theory of Deconstruction, termed “Différance”. Rendered in English, this French term refers to the condition of being deferred, as well as the condition of being different. (Derrida, 1997) For Derrida, a word does not lead to a universal meaning, but only to another word, which leads to yet another, so that the arrival at an absolute meaning is deferred indefinitely. In other words, the destination of what a word means can never be reached but is always held back: an eternal delay.

The exhaustion brings me to an “intellectual” crisis: Is it ever possible to pin down the true meaning of a word? Does what we say ever stand for what we mean?

Before facing this crisis head on, I tried to ask myself why I, as a designer, would frame this exhaustion as a crisis. Why would I want to take something as straightforward as the words I have been throwing around for so long and turn it into a complex mess? Why does finding the meaning of words matter so much to me at this point in time?

“Because of AI”

The development of Natural Language Processing (NLP) and generative-AI has changed the way we interface with technology, as well as the dynamics of the creative industry. Before we realized it, we could talk with machines, and machines could understand our words and generate fancy images of our wildest ideas.

As IBM confidently claims, NLP gives computers the ability to understand text and spoken words in much the same way human beings can, including the contextual nuances of the language within them. Word embedding, a technique in NLP, carries the task of capturing the meaning of words in order to analyze texts. OpenAI explains an embedding as “a vector (list) of floating point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and long distances suggest low relatedness.”

Below is the word embedding of “apple”.

According to OpenAI, there are 1536 numbers in this vector, unpacking the meaning of the word “apple” in 1536 dimensions. One can loosely imagine that one dimension, or a combination of several, places “apple” close to the word “sweet” in the vector space, another group of dimensions places it close to “round”, another close to “red”, and so on. When this pile of numbers is handed to a decoder, the word “apple” appears as the output of the decoding mechanism, or “🍎” if the output language is set to emoji, or “苹果” if it is set to Chinese.
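To make the idea of “distance as relatedness” concrete, here is a minimal sketch in Python. It assumes cosine similarity as the measure (a common choice, where the distance is one minus the similarity) and uses made-up four-dimensional vectors in place of real 1536-dimensional embeddings. The names `toy_embeddings` and `relatedness` are hypothetical and exist only for this illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 4-dimensional "embeddings", invented for illustration only.
# Real embeddings have hundreds or thousands of dimensions.
toy_embeddings = {
    "apple": [0.9, 0.8, 0.1, 0.0],
    "pear":  [0.8, 0.9, 0.2, 0.1],
    "car":   [0.0, 0.1, 0.9, 0.8],
}

def relatedness(word_a, word_b):
    """Similarity between two words of the toy vocabulary."""
    return cosine_similarity(toy_embeddings[word_a], toy_embeddings[word_b])

print("apple ~ pear:", relatedness("apple", "pear"))
print("apple ~ car: ", relatedness("apple", "car"))
```

In this toy space “apple” sits far closer to “pear” than to “car”, but only because the numbers were chosen that way. In a trained model, the numbers come from the corpus.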

The approach to training a word embedding model is based on the distributional hypothesis in linguistics: linguistic items with similar distributions have similar meanings. That is, words that occur in the same contexts tend to carry similar meanings. This hypothesis is grounded in John Rupert Firth’s linguistic theory regarding the context of word use: “a word is characterized by the company it keeps”. (Harris, 1954)
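The distributional hypothesis can be sketched with nothing more than counting. The example below is an illustration under stated assumptions, not how any production model is trained: the four-sentence corpus is invented, the window size of two is arbitrary, and `context_counts` and `shared_context` are hypothetical helper names.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, invented for illustration only.
corpus = [
    "i ate a sweet apple",
    "i ate a sweet pear",
    "she bought a new phone",
    "she bought a new laptop",
]

def context_counts(sentences, window=2):
    """For each word, count the words that appear within `window` positions of it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts

def shared_context(counts, a, b):
    """Size of the overlap between the contexts of two words."""
    return sum((counts[a] & counts[b]).values())

counts = context_counts(corpus)
print("apple / pear :", shared_context(counts, "apple", "pear"))
print("apple / phone:", shared_context(counts, "apple", "phone"))
```

“Apple” and “pear” keep the same company (“a”, “sweet”), so they share more context than “apple” and “phone”, which overlap only on the article “a”. Embedding models compress counts of this kind into dense vectors.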

To visualize such a vector space, I built a simple prototype with the Global Vectors for Word Representation (GloVe) algorithm. The corpus used to train this model is text data from Wikipedia 2014, which yields a vocabulary of four hundred thousand words to build out the vector space. Above is a cluster of words surrounding the word “apple”, suggesting the array of contexts in which “apple” appears in the corpus.

“Blackberry” appears close to “apple”, perhaps because the two often occur together in sentences that talk about fruit. More frequent are words related to technology, such as “ipad” and “iphone”, and companies like “ibm” and “microsoft”, suggesting that the meaning of “apple” in 2014 was mostly associated with Apple Inc. as a technology company. I then removed “apple” from its cluster of words and selected two words from that cluster to see their respective clusters. I ended up with this landscape of words constituting the meaning of “apple” in 2014.
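A cluster like the one around “apple” can be approximated by a nearest-neighbour lookup. The sketch below is not the code of my prototype. It assumes the plain-text layout in which pre-trained GloVe vectors are distributed (one word per line, followed by its numbers separated by spaces) and uses four made-up three-dimensional lines as a stand-in. The function names `parse_glove_lines` and `nearest` are hypothetical.

```python
import math

def parse_glove_lines(lines):
    """Parse lines of the form 'word v1 v2 ... vn' into a {word: vector} dict."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(vectors, word, k=3):
    """Return the k words whose vectors are most similar to that of `word`."""
    target = vectors[word]
    scored = [(cosine(target, vec), other)
              for other, vec in vectors.items() if other != word]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]

# Hypothetical 3-dimensional stand-in lines, invented for illustration.
sample = [
    "apple 0.9 0.7 0.1",
    "iphone 0.8 0.9 0.1",
    "blackberry 0.9 0.5 0.3",
    "river 0.1 0.0 0.9",
]
vectors = parse_glove_lines(sample)
print(nearest(vectors, "apple", k=2))
```

With a real vector file, the same lookup would be run over the full vocabulary instead of four sample lines.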

This process can repeat endlessly until all 400k words show up, since each word always stands in relation to the others within the large corpus. In this sense, the meaning of a word at some level includes the meanings of every other word in the language.
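The repetition described here can be sketched as a breadth-first walk: start from one word, visit its neighbours, then their neighbours, and so on. The neighbour lists below are invented for illustration, and `expand` is a hypothetical helper rather than part of the prototype.

```python
from collections import deque

# Hypothetical nearest-neighbour lists, invented for illustration only.
neighbours = {
    "apple": ["iphone", "blackberry"],
    "iphone": ["apple", "ipad"],
    "blackberry": ["apple", "fruit"],
    "ipad": ["iphone", "apple"],
    "fruit": ["blackberry", "apple"],
    "river": ["fruit", "apple"],
}

def expand(start, graph):
    """Breadth-first walk: visit a word, then its neighbours, then theirs."""
    seen = [start]
    queue = deque([start])
    while queue:
        word = queue.popleft()
        for other in graph.get(word, []):
            if other not in seen:
                seen.append(other)
                queue.append(other)
    return seen

reached = expand("apple", neighbours)
print("reached:", reached)
print("never reached:", set(neighbours) - set(reached))
```

In this toy graph the walk from “apple” reaches every word except “river”, which has neighbours of its own but is nobody’s nearest neighbour. Whether a walk eventually covers a whole vocabulary therefore depends on how many neighbours are kept for each word. The relation between any two words still exists in the vector space even when the walk does not surface it.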

Ferdinand de Saussure, often called the father of modern linguistics, might find his theories resonating in my prototype. Saussure posited that a word's meaning emerges from its relationships within the intricate web of language. (Saussure, 1916) For instance, the word "apple" gains its meaning not only by being distinct from "blackberry" or "microsoft," but also from its place among the myriad words in the English language. Language thus creates a network where each word is defined by its connections to others, imparting a relative yet stable value to each term. In essence, a word’s meaning is shaped by being what other words are not, a foundational principle in Saussure's linguistic theory.

The Metaphysical Nature of Sign

Derrida argues that the structuralist notion that “a word’s meaning lies in whatever it is not” necessarily imposes binary oppositions between words, like white / black or masculine / feminine, and influences people, especially in Western cultures, to think and express their thoughts in the same way. (Culler, 2011) These oppositions are hierarchies in miniature: one side is viewed as better or more valuable, while the other is less important, even if only slightly so. One such hierarchy is the priority of speech over writing in Saussure’s development of the linguistic sign and its terminology: “Language does have an . . . oral tradition that is independent of writing.” (Saussure, 1916)

As argued by Derrida, “Saussure does not recognize writing more than a narrow and derivative function. Narrow because it is nothing but one modality among others…. Derivative because representative: signifier of the first signifier, representation of the self-present voice, of the immediate, natural, and direct signification of the meaning (of the signified, of the concept, of the ideal object or what have you).” (Derrida, 1976) Saussure sees writing as a distinct system of signs that exists for the sole purpose of representing language, and as nothing more than a signifier of speech, while speech is the immediate, natural signifier of the meaning. This view reflects the structure of a certain type of writing: “phonetic writing, which we use and within whose element the episteme in general (science and philosophy), and linguistics in particular, could be founded … an ideal explicitly directing a functioning which in fact is never completely phonetic.”

Derrida’s deconstruction of the phonetic foundation of language, to me, is not meant to subvert the hierarchy between writing and speech, but to challenge the presence of the meaning of words in their immediate signifiers, as well as to stretch beyond the Saussurean perspective that meaning is accounted for by the effect of linguistic relations (“a word being what other words are not”) and the binarity it implies. The sign, like all concepts available to us, implies the “metaphysics of presence”, i.e. the tendency of Western science and philosophy, with their language and traditions, to privilege and secure the value of presence (as truth, ideal, meaning, being, etc.) through the systematic suppression of absence. One example was discussed earlier: Saussure favored speech over writing because “speech is the immediate, natural signifier of the meaning”, a logocentric line of thought that he followed in developing his linguistic theories. When I speak, my intention accompanies my words, such that there is unity between the signifier of my speech and the conceptual signified of my thought. Meaning is thus assumed to be present within my voice, relegating writing to a secondary status, a derivative version of speech that seems less “natural” in the creation and understanding of meaning. Such a hierarchy calls into question the dichotomy of signified and signifier, and the assignment of a sign to some form of “present truth” or an immutable essence that exists beyond or before the sign. This metaphysical nature of the sign is reexamined by Derrida through the lens of Deconstruction.

Deconstruct the Sign

To deconstruct is not to destruct. Rather than renouncing the notion of signs, “it is necessary to surround the critical concepts with a careful and thorough discourse - to mark the conditions, the medium, and the limits of their effectiveness…and, in the same process, designate the crevice through which the yet unnameable glimmer beyond the closure can be glimpsed.”

Let’s recall again our experience of looking up a word in the dictionary. A word leads to more words; a signifier refers to other signifiers. The static fixation of signified to signifier is, in this sense, embodied as a dynamic experience of traveling through a chain of signifiers and failing to land on the signified. Recognizing this failure is hard, since at some point we believe we have finally reached the end of the chain and grasped the meaning. Yet this recognition is precisely what Deconstruction asks for: to embrace the indeterminacy of meaning as inexhaustible opportunities that expand the meaning of a word rather than closing it, to always seek “the crevice” where meaning is absent, and to invite a freeplay of meaning through multitudes of absences. This is my personal understanding of what Derrida called the “trace” of meaning: the absence of a presence of meaning.

Despite all the above, I have to be frank that it is very difficult for me to fully understand and explain Derrida’s theories. I may well have “misread” him to a large extent, yet this might be exactly what Derrida himself would love to see. (Derrida once said that “all our readings are misreadings”.) As a multilingual speaker, I resonate strongly with this notion of “trace”, especially when I learn words in English and try to understand them by translating them back into the languages I know.

A word can be seen through the lenses of different languages to unfold a fluidity of meanings across cultural interpretations. One such example is the word “cold”. In the English dictionary, cold refers to “strongly producing the sensation which results when the temperature of the skin is lowered”, but in real life, we learn how “cold” feels before we learn what “temperature” means. “Cold” refers more to a feeling than to a number on a thermometer, and this sensory presence of meaning speaks to me more in Japanese, my second language. “Cold” translates to “冷たい” in Japanese, which is said to derive from a compound of “爪” (“claw”) and “痛い” (“painful”), inviting a sensation that is sharp and biting: coldness feels like a painful claw scratching my skin. In Chinese, my first language, “cold” translates to “寒冷”, and a representation of its contextual meaning is directly illustrated by the visual form of the word, owing to the pictographic roots of Chinese characters.

According to “Shuowen Jiezi 說文解字 ('discussing writing and explaining characters')”, an ancient Chinese dictionary that analyzes the structure of characters and gives the rationale behind them, the traditional Chinese character form of “cold” combines the meanings of "宀、人、茻、仌" to construct an illustrative image of a man (“人”) hiding beneath a shed (“宀”), covered with straw mattresses (“茻”), standing above two blocks of ice (“仌”), seeking shelter and warmth in a cold environment. It encapsulates a scenario where coldness is not just felt, but also seen and experienced in a tangible, almost narrative form.

The problems in Gen-AI

Recognizing the indeterminacy of meaning doesn’t render my effort of finding meanings pointless, but opens up a critical look at systems and institutions that presuppose and secure “true” meanings for words.

“A Prompt is a short text phrase that the Midjourney Bot interprets to produce an image. The Midjourney Bot breaks down the words and phrases in a prompt into smaller pieces, called tokens, that can be compared to its training data and then used to generate an image. A well-crafted prompt can help make unique and exciting images.” (Midjourney Prompt Guide) If we take a look at the training data of CLIP (Contrastive Language–Image Pre-training), a method developed by OpenAI for teaching AI models to understand and relate images and text, it contains millions of images with corresponding text descriptions, in which explicit pairings between text and images are made. Such datasets are sourced from diverse contexts on the Internet, and often involve manual annotation that relies on the massive use of cheap human labor. And if we take a look at how Gen-AI image generators are used, users expect them to yield specific, predictable outcomes, reflecting a belief in a direct, stable relationship between a prompt, its meaning and the generated image.

Earlier we looked at the word “cold” through the lens of different languages, which unfolds multiple layers of meaning across cultural interpretations. When “cold” is entered in different languages as a prompt for image generation in Midjourney, here’s what we get:

When “cold” is entered in English, the primary language that Midjourney is trained on, we very often get an image of a Western-looking woman feeling cold. The meaning of cold is conveyed largely through the realistic depiction of thick clothes and the gesture of hunching over to keep warm.

When “cold” is entered in Japanese as “冷たい”, what we get is mainly scenic views of snow-covered landscapes, most often Mount Fuji. The style of these images seems much less realistic than the English version, resonating with the style of Japanese woodblock prints.

Researcher Margaret Mitchell and colleagues, in “Seeing through the Human Reporting Bias” (2016), posit that “it is the words that are not spoken that encode the bias.” For example, people say “green bananas” but not “yellow bananas” because yellow is implied as the default color of the banana. Similarly, people say “woman doctor” but do not say “man doctor.” (D'Ignazio, Klein, 2020)

In our case of “cold”, what is not spoken is the language we use for the prompt. It reveals AI’s reliance on the linguistic and cultural contexts it is trained on. With an English-centric dataset, the AI necessarily produces imagery rooted in Western conceptions of coldness, and therefore embeds a form of cultural hegemony in its outputs. It is widely acknowledged that biased datasets are the problem here, and that there is a critical need to expand and diversify the linguistic and cultural datasets used in AI training. I completely agree, but I would add that the prompt is also part of the problem, and that this is grounded in the way we treat our languages on machines.

Unboxing the Textbox

Derrida once said: “The idea behind deconstruction is to deconstruct the workings of strong nation-states with powerful immigration policies, to deconstruct the rhetoric of nationalism, the politics of place, the metaphysics of native land and native tongue…The idea is to disarm the bombs… of identity that nation-states build to defend themselves against the stranger, against Jews and Arabs and immigrants…”

Power structures are, on the surface, made up of policies, places, and a network of things we interact with every day. But what they are really made up of is a tradition in their way of doing things. In the case of our discussion, one such thing could be the English-centric input system. The text box, the most foundational interface element and one primarily tailored for English input, is designed for simplicity and efficiency in both human understanding and machine processing of text. However, the fact that it seems so much like a default suggests that this limitation has not been closely examined. Text is expected to be entered linearly from left to right, letter by letter, word by word. This causes friction for languages that don't align with this manner of input. Traditional Chinese, for example, runs top to bottom and right to left, because the earliest Chinese writing was done on bamboo. Not to mention the more than 7000 other languages in our world, each with its own unique way of writing and its own cultural lineage.

Entering text as a prompt for generative-AI is, from my perspective, a way of communicating with machines to express a vision as wild as we can imagine. Yet such wildness is confined within the capability of one single way of communication, prompting in a text box: an assertive way of speaking that dictates what is there or not there, in a mechanical arrangement of text that conforms to a universality.

One thing to make clear, though, is that I’m not suggesting that the text box is the only thing to blame here. The problem is the reduction and encapsulation that a design element, along with the entire mechanical input system around it, imposes on the diversity of text. This diversity arises not only from nationality but also from the nuance inherent between a text and its meaning.

It is not just me feeling this way. To unleash the pictorial potential of words from their mechanical use in a rigid rectangular box, graphic designer Bruno Pfäffli composed this poem in a text box to break through the hard, conventional grid of typesetting.

The text box, in this case, is seen as a canvas, where a landscape is painted with strokes of text. It is intriguing to see this pictorial arrangement of text because it resonates with what we, as humans, see through our eyes, giving it a dimensionality grounded in reality. The poetic spirit of this piece reminds me of what Charles Hartman, an early writer of computational poetry, says in his book Virtual Muse, where he distinguishes between a poem and poetry:

“… the program could produce a simplistic kind of poetry forever, but it could never, by itself, produce a poem. All sense of completeness, progress, or implication was strictly a reader’s ingenious doing.”

Hartman seems to be saying that poetry is a material, which can result from any process (whether conventional composition, free-writing, or tinkering with language models). A poem, on the other hand, is an intentional arrangement resulting from some action: someone decides where the poem begins and ends. Allison Parrish, also a computational poet, suggests that “a language model can indeed (and can only) write poetry, but only a person can write a poem”, because a poem is “the site where poetry is tactically deployed in a physical and social context”. I would further break down “the site” into one that is personal, or even further, into a combination of multiple sites, an intersection between two sites, a part of one site… yet all cases resonate with my earlier exploration of the meaning of text and its indeterminacy, whether for machines or for humans. And this indeterminacy should be translated into a complexity of contexts that resists the formal, mechanical, and thorough representation within something like a text box.

Pfäffli’s piece also reminds me of concrete poetry, a genre of poetry that emphasizes nonlinguistic elements in its meaning.

Ame, by Seiichi Niikuni, is a distinct poem for every reader, depending on whether you can read the Chinese character “雨”; whether you read it digitally on a screen or physically on a sheet of paper; whether you read it on a big monitor where you can see its entirety, or on a smaller one where you have to scroll to read it part by part; or even whether it is raining outside your window… Whether these experiences make the meaning of this poem fuzzy or clear to you, they unfold beyond just the poem itself. They are unique because of you, a unique human in a specific state of being.

S Cearley, in their "How to Read a Concrete Poem", says: “…the concrete poem is silent and motionless. It does not move in time, does not go from A to B. The meaning of a concrete poem is no longer tethered to its linear movement through time, from the beginning of you viewing, to the end. because it's free, its meaning is free.”

This freedom of meaning again suggests its indeterminacy, but this time, I would argue, the indeterminacy should be translated into a fluidity among specificities: the uniqueness of the way meaning is interpreted by each reader. Together, these examples shed light for me, as a designer, on how a text box could be unboxed to carry a freeplay of meaning. In the case of prompting a generative-AI through a text box, such freeplay should take into account the complexity and the specificity of both the user and the text itself.

What if the text box could incorporate the point of view from the user in seeing the text they input? I’m certain that a female user would imagine the image of a male differently from a male user.
I would suppose the experience of reading this poem,

What if the text box is a map labeled with personal or contextual information about the place? What’s the difference between labels created by an employer of a factory and an employee?

What if a text box is literally a box, a space that the user could enter to interact with the text? Would nearsighted users have to stand really close to the text in order to see it clearly?

What if a text box has its own gravity? What if the text has gravity as well? Which texts roll like balls, and which are squishy like jelly?

Bibliography:

Saussure, Ferdinand de, and Roy Harris. Course in General Linguistics. Bloomsbury Revelations. London ; New York: Bloomsbury Academic, 2013.

Holdcroft, David. Saussure: Signs, System, and Arbitrariness. Modern European Philosophy. Cambridge [England] ; New York: Cambridge University Press, 1991.

Austin E. Quigley, Theoretical Inquiry: Language, Linguistics, and Literature. New Haven, CT: Yale University Press, 2008.

Derrida, Jacques. Of Grammatology. 1st American ed. Baltimore: Johns Hopkins University Press, 1976.

Derrida, Jacques. Deconstruction in a Nutshell: A Conversation with Jacques Derrida. Edited by John D. Caputo. First edition. New York: Fordham University Press, 2021.

Culler, Jonathan D. Literary Theory: A Very Short Introduction. 2nd ed., Fully updated new ed. Very Short Introductions 4. Oxford ; New York: Oxford University Press, 2011.

Chiang, Ted. Stories of Your Life and Others. London: Picador, 2014.

Duranti, Alessandro, Rachel George, and Robin Conley Riner, eds. A New Companion to Linguistic Anthropology. Wiley Blackwell Companions to Anthropology. Hoboken, NJ: John Wiley & Sons Ltd, 2023.

Allison Parrish, Language models can only write poetry, 2021, posts.decontextualize.com/language-models-poetry

Lorber-Kasunic, J. & Sweetapple, K., (2018) “Graphic Criticism and the Material Possibilities of Digital Texts”, Open Library of Humanities 4(2), 13. doi: https://doi.org/10.16995/olh.278

Wang Jianshuo, What does G, P, T means in ChatGPT,  2023, https://mp.weixin.qq.com/s/vXoYeA7w6l_WiKmDHogdTA 

Harris, Z. (1954). "Distributional structure". Word. 10 (2–3): 146–162. doi:10.1080/00437956.1954.11659520.

Hussein, Basel Al-Sheikh. “John Rupert Firth’s Model of Linguistics: A Critical Study.” International Journal of English Language and Literature Studies 5, no. 1 (2016): 66–71. https://doi.org/10.18488/journal.23/2016.5.1/23.1.66.71.

Catherine D'Ignazio, Lauren F. Klein, Data Feminism, MIT Press, 2023

Hartman, Charles O. Virtual Muse: Experiments in Computer Poetry. Wesleyan University Press, 1996, p31