
import { EntryBody } from "../components/Entry/EntryBody";

export const blender_i: { [id: string]: any } = {

    id: "blender_i",
    title: <>Blender I: Language as Calculus, Generics, Instruction-tuned LMs</>,
    date: "28 Marzz 24s",

    Body: (
        <EntryBody
        paragraphs={[

<div className="font-mono">


</div>,

<div className="font-mono">
Three mathematicians walk into a room and solve the sum 2+3=?. They are asked to write down or say aloud their thinking while solving it. One goes "oh yeah i just know that one by heart, it's 5", another goes "i know that one from when i used to play backgammon, i was always sad i could not take out the backman when i rolled 2 and 3 because of the point in the 5th position, so i know it's 5!" & the last one goes "2 is 10, 3 is 11, doing bitwise xor plus the carry gives 101, which is 1+2^2=5".

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
Is there a calculus that answers 2+3=?? The three of them got to the same answer but their reasons were completely different, so what points to there being a calculus that answers 2+3=??

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
A fourth mathematician walks in, calculator in hand. Presses the 2, +, 3 and = buttons and gets a 5.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
Because of what a calculator is, it can only implement a calculus. So if we accept that the calculator sums, that is, if we use the calculator to sum, then there is a calculus of the sum.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
The backgammon mathematician refers to specific rolls of the dice and positions on the backgammon board when saying 2, 3 and 5. The know-by-heart one has memorized it (whatever that is), so does not even say 2+3, just goes: 5. The bitwise xor one uses 2 and 3 to refer to some other symbols, over which a calculus (bitwise xor, plus a carry) allows them to get the 5.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
Without the calculator bringer, one struggles to see a calculus behind 2+3=?, just as one would behind "Are you sad?" or "Do you like cookies?", even if the three mathematicians all gave the same answer, and probably even if all humans gave the same answer. The calculator is proof (or at least an indication) that there is a calculus behind the "5" responses, the one of summing numbers or whatever.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
The punchline is of course that because LMs can play language games, we should consider language as a calculus -&gt; much of the heavy lifting in discourse formation is done by (meta)linguistic operators: sentences that work intra-linguistically, affecting/modifying language use in further contexts, rather than having truth values that refer to a world (smh). This would be very calculistic, as the flexibility and acceptability of many sentences can be modified by uttering other sentences, rather than through the relationship between language and experience (although language is uttered in experience, so maybe this separation is a mistake of classic dualism, but this can be merged afterwards right? keep the difference for ostensive word-learning and later we can synthesise both things into one!).

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
The generic sentence is an operation on the co-likelihood of words. Consider the simplest bare plural generic, "Ks are F". This is a way, within language, of modifying the co-likelihoods of the words K and F, and just that. It does not have any metaphysical implications about kinds, nor any sort of quantificational value, nor contextual lexical stuff. Generics are important for human knowledge because human knowledge is literally the likelihoods of words in language: the likelihood distribution that is accepted by humans (whether an individual or a majority of humanity, who cares).

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
The fact that language has to be accepted to have meaning, that it has to be used by some humans within some rules in language-games, is what relates the likelihoods to reality. This is why "tigers" and "blue" have a low co-occurrence probability, yet combining them is still grammatical and the sentence itself wouldn't sound weird. Why does "Tigers are blue." not change the likelihood of "tigers" and "blue"? Well, this is a cascade of dependencies that we explore by counterfactual saliency. If "orange" and "tiger" have a high co-likelihood, and "orange" and "blue" only co-occur with the "no" operator, then changing the likelihoods of "tigers" and "blue" would also affect those of "tigers" and "orange" and of "orange" and "blue". These clashes depend on how well fleshed out the likelihoods of words are in language users. For the tiger, blue seems unacceptable, but one would accept it for some weird genus of beetle if one has no "corpus" on that particular genus. This would hold regardless of any experience: even if you are being shown a beetle of that genus that is green, you would accept the generic and be like 'well i guess this one's green' (notice the context length difference that the 'this' adds to the generic structure).

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
Eventually you gotta touch experience, with ostension, for example. Pointing to something and uttering a word is pretty much like a generic: there is also a fuzzy kind, like in the bare plural (the "class" or "salient property" that is being named). "Mama" & the face of the mum.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
Okay so as for instruction tuning an LM. Here there is an interesting metalearning subtlety. Assume that generics are indeed a metalinguistic mechanism for shaping likelihoods between words (so no reference or truth is really relevant to them, and they become a calculus that can be implemented in a calculator within language, aka an LM). Do they also have that same function during the training of a language model? Like in theory, given the sentences "Ks are F" and "This K is F", the training signal (which is basically modifying the weights that model the K-F co-occurrence probabilities) cannot weigh the two of them differently. There is a weird dynamic there, no? Like I do not think that a priori the structure of a sentence can affect how much it changes the co-occurrence, right? At least in the current learning paradigm.

</div>,

<div className="font-mono">


</div>,

<div className="font-mono">
...

</div>,

      ]}
    />
  ),
};
