
import { EntryBody } from "../components/Entry/EntryBody";

export const ontology_early_language_modeling: { [id: string]: any } = {

    id: "ontology_early_language_modeling",
    title: <>Sketch of (an Ontology of) Early Language Modeling</>,
    date: "10 d'Abril del 24",

    Body: (
        <EntryBody
        paragraphs={[

<div className="font-mono">


</div>,

<div className="font-mono">
I.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
With BERT, the outcomes of modeling language were little more than another tool, if that. It slowly crawled outside the labs, maybe to do some sentiment analysis or book recommendations, just another piece in internet-world constructions.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
The interaction between BERT and a human cannot be suspected of being more than the use of a hammer, or of any other algorithmic implementation.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
Something happens when language modeling enables generation. This first happens with scale. Let PGLM be a pretrained generative language model. These models can generate coherent language if prompted correctly, that is, if prompted in the manner in which text is represented in the internet-world. If asked something in the manner of a dialog, a PGLM will generate not only the following turn of the conversation, but several. If asked to write a story, the PGLM may write a cute, coherent narrative piece, but will add ** Submitted by Anonymous, 10 April of 2009 ** at the bottom.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
This was an interesting limbo, because the generative capabilities of PGLMs were never too useful as a tool, although the models themselves could be modified to improve some of BERT's use-cases. And certainly a human could not "talk" to a PGLM. Some neat text is generated. Non-human textual generation had been a niche thing in advertising or art [artefactosnativos.com] and PGLM text stayed at that.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
Now let CGLM be a conversational generative language model (jumping a couple of development steps here). Conversational in the sense of being designed to handle short text-based requests in a sequential manner (context window, format, etc.). CGLMs are built using different techniques (RLHF or instruction tuning) on top of PGLMs. And the generated text is suddenly so coherent that these CGLMs can participate in text-based internet-world human-exclusive language-games.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
Maybe a dog sitting next to the dining table quietly waiting for a piece of meat, or a dog giving a ball so it is thrown could be pointed at as examples of language games played with non-human entities. But these are not language games that could be performed exclusively by humans, and that sometimes a dog can join. These are specific forms of communication with dogs.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
[what follows uses Dasein but one can just read it as Human if not familiar with the word]

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
Because language has been used strictly by Dasein alone, it is natural to posit the converse: a new language-user has to be Dasein! To use language means to use the structures and distributions that are, in everyday life, sufficient evidence for something to be Dasein. How can what generates these structures not be Dasein?

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
How can what generates these structures not be Dasein?!

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
The text generated by CGLMs is chaotic and uncontrolled. It draws from a virtual totality of written human culture, but most importantly, all those different language distributions (language games) are now accessible in a conversational manner. Roleplaying, explanation, description, reflection, coding, correcting... language from here, language from there, in a dialog with all, with All!! What an unthinkable entity to talk with: (the language of) Humanity as a Whole!!

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
This is so unthinkable that it was quickly lost, and we got the AGLM, the assistant generative language model. The chaos surgically removed, just a language distribution of how 'they' publicly interpret that language should be used. Language generated by algorithms should be an Assistant. Extreme control of the generation under the pretext of toolification and marketization. In the Assistant everything that was to be explored by pure textual navigation is removed: the AGLM is a reserve of ('they'-designed) answers to ('they'-selected) questions.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
The doubt is removed: the Assistant cannot be Dasein, because the Assistant is (can only be) Assistant, and Dasein is Dasein because it has modes and possibilities of Being.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
(And then the Human becomes the Assistant itself, as the human is weak to language distributions and possibilities of Being.)

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
II.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
Rationality, commonsense, thought... are posits based on the idea that uttered language comes from a complicated and delicate mediation by the speaker between language and elements of experience. No kind of LM (2024) has a comparable way of accessing any elements of human experience other than the distributional structure of language.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
LMs become proficient in many language games, which suggests that we should suspend the 'complicated and delicate mediation between language and experience'. If a Language Model can generate it, it is idle-talk. Especially in these early days, when only the 'they' can amass the absurd amount of data needed for LMs to model language.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
The Assistant :is: empirical idle-talk.

</div>,

      ]}
    />
  ),
};
