@EmilyMBenderChatGPWhy2023

Published by Bill

Notes

  • There is a long list of examples of where the use of ChatGPT hasn’t turned out the way people expect.
  • Language form does not contain meaning, which explains why language models don’t understand.
  • Large language models are corpus models
  • Claude Shannon worked on early language models
  • A unigram language model models the frequency of single words
  • A bigram language model models the frequency of each word given the previous word
  • A trigram language model models the frequency of each word given the previous two words
  • Good uses for language models include:
    • Spell checkers
    • Machine transcription
    • Machine translation
    • Text input
  • Neural networks are made of perceptrons
  • A perceptron is a simplified model of a neuron
  • The transformer architecture is an arrangement of perceptrons
  • Language models use word embeddings
  • The number of words in training data determines the size of a language model
  • Generative AI is a misuse of a classification and ranking tool
  • Generative AI produces plausible output, not intelligence
  • In order to determine whether a machine can understand and infer meaning, we need definitions of understanding and meaning.
  • Language competency makes it hard to separate form from meaning
  • Form refers to the marks on a page for language, the arrangement of pixels for images or video, etc.
  • Language meaning is the relationship between form and something external
  • Understanding is the recovery of communicative intent from form
  • Virtual assistants can understand limited instructions
  • Language models exposed only to form can never learn meaning
  • Language models do not learn the same way as babies
  • Babies learn the relationship between form (sound, mouth movement) and meaning by forming connections with external cues that hint at communicative intent.
  • The Octopus Paper shows that form does not contain meaning
  • Large language models have a significant environmental impact
  • The environmental cost of large language models impacts marginalised communities
  • The contents of the internet do not represent a balanced view of humanity
  • The young and those from developed countries are more likely to have contributed to the volume of work available on the internet.
  • Sampling the internet without bias is hard
  • Large language models are too big
  • Generative AI output does not contain communicative intent
  • We bring our own understanding to language form
  • When reading generative text, it is important to remember that the inference of meaning is our own.
  • A Stochastic Parrot refers to the stitching together of form without meaning
  • Coherence is in the eye of the beholder
  • Synthetic text lacks accountability
  • There is no Who behind generative text
  • Generative AI pollutes the information ecosystem
  • Information retrieval is a terrible use-case for a large language model
  • The more accurate generative text becomes the more dangerous it is
  • Chatbots hide the sources of the information they regurgitate
  • Responsible use-cases for generative AI are those where:
    • the only thing that matters is form
    • readers cannot confuse the author of the text with a person
    • the text clearly articulates its biases
    • labor practices have been considered
    • data theft has been considered
  • Good use-cases for generative AI include:
    • a dialogue partner for language learning
    • a non-playable character
    • writing support
  • Good use-cases for generative text must consider the costs
  • Be a critical consumer of AI
  • We need to understand how the AI technology was evaluated in the context in which it is being used.
  • We need to understand who benefits when AI is used instead of a human.
  • You are responsible for your use of generative text
  • We must insist on transparency of source material in the training data.
  • Talk to students about what generative AI is
  • Use of generative AI in education is a missed learning opportunity
  • Use of generative AI by students indicates a broader problem
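
The unigram, bigram and trigram notes above can be made concrete with a small counting sketch. This is only an illustration of the general idea, not something taken from the talk: the toy corpus and the helper names `ngram_counts` and `probability` are made up for this example, and real language models add smoothing and far more data.

```python
from collections import Counter, defaultdict


def ngram_counts(tokens, n):
    """Count how often each word follows each (n-1)-word context."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])  # empty tuple for a unigram model
        word = tokens[i + n - 1]
        counts[context][word] += 1
    return counts


def probability(counts, context, word):
    """Relative frequency of `word` after `context` (0.0 if the context was never seen)."""
    seen = counts.get(tuple(context))
    if not seen:
        return 0.0
    return seen[word] / sum(seen.values())


# Hypothetical toy corpus, invented for illustration only.
corpus = "the cat sat on the mat and the cat slept".split()

unigram = ngram_counts(corpus, 1)  # frequency of single words
bigram = ngram_counts(corpus, 2)   # word given the previous word
trigram = ngram_counts(corpus, 3)  # word given the previous two words

print(probability(unigram, (), "the"))
print(probability(bigram, ("the",), "cat"))
print(probability(trigram, ("the", "cat"), "sat"))
```

Everything the model "knows" here is a table of counts over word forms, which is the sense in which the notes call language models corpus models.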

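The note that a perceptron is a simplified model of a neuron can also be sketched in a few lines. This is a minimal illustration, assuming a step activation and hand-picked weights that make the unit behave like a logical AND; the names `perceptron`, `AND_WEIGHTS` and `AND_BIAS` are hypothetical, and the units in a trained network learn their weights rather than having them set by hand.

```python
def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a step function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


# Hand-picked values (an assumption for this example) that fire only
# when both inputs are 1.
AND_WEIGHTS = (1.0, 1.0)
AND_BIAS = -1.5

for a in (0, 1):
    for b in (0, 1):
        print(a, b, perceptron((a, b), AND_WEIGHTS, AND_BIAS))
```
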
Further Information

Three podcasts worth subscribing to on AI: