Demystifying AI: A Practical Guide to Key Terminology
Understanding the most common terms in modern applied AI
Understanding AI terminology like GPTs, LLMs, ML, etc. can be daunting, but today we're here to demystify some common terms.
In today's issue, you'll learn more about some popular AI terms and their connections, so you can communicate more effectively in data and AI projects.
Hopefully, this will come in handy the next time you’re collaborating with colleagues from other departments or engaging stakeholders.
A Myriad of AI Perspectives
AI terminology can be quite confusing. The main reasons are:
the complex and evolving nature of AI technology
the fact that AI practitioners have multiple perspectives
the use of different terms and jargon across specific industries, fields, and organizations
How a term is used will differ depending on whether you're talking to a researcher, a data scientist, or a business executive.
I guess we just need to accept that.
If there's one thing I've learned, it's that being strict about the use of correct terminology in a business can quickly turn into fighting windmills. Remember, it's not about the terminology itself, but about the concepts and ideas it represents.
So, while this article is a good launching pad, feel free to tailor it to your organization's needs to better navigate AI-related discussions and make more informed decisions.
A Taxonomy of Artificial Intelligence
Here's a high-level overview of the key terms in a hierarchical manner, beginning with AI and ending with ChatGPT:
Let's go through these step by step!
1. Artificial Intelligence
Artificial Intelligence (AI) has become a high-level umbrella term that generally refers to systems (machines) that behave as if they were intelligent. "Intelligence" in this sense generally refers to the ability to perform more or less cognitively complex tasks such as learning, problem solving, and decision making.
Brief history: The term AI dates back to the 1950s, with strong roots in military research. Since then, it has evolved to encompass a wide range of applications across industries.
Specifically, two main research areas have emerged: Strong AI, or Artificial General Intelligence (AGI), and Narrow AI ("weak AI").
2. Strong AI (AGI) and Narrow AI
Strong AI aims to replicate human intelligence in all aspects. The research goal here is to create systems that can solve previously unseen problems, much like a human. The ultimate goal would be superintelligence or singularity, an AI system that matches or exceeds human intelligence in all areas. However, researchers and practitioners generally place this far in the future, and it is uncertain whether it will ever be achieved.
Narrow AI, on the other hand, refers to task-specific systems that are designed for particular jobs like language translation or image recognition. Unlike Strong AI, Narrow AI is not meant to replicate human intelligence in all aspects. Instead, it is designed to perform narrowly defined tasks efficiently.
Narrow AI systems have a lot of practical use cases. When you encounter an AI application in business, it's narrow AI at work.
3. Machine Learning
There are two ways that systems can be "intelligent". Either they are based on pre-programmed rules for decision-making and do not learn from data ("Rule-Based Systems"), or they learn from data without anyone having to explicitly hand-code the rules.
This is where Machine Learning (ML) comes in.
Machine learning (ML) is a technology that lets systems iteratively learn from large amounts of data, identify patterns, and make decisions with minimal human intervention.
To be clear, ML per se doesn't imply self-awareness or emotion, as is sometimes suggested in popular media.
Rather, Machine Learning is a statistical approach to automated pattern recognition in data.
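To make the difference between a rule-based system and a learning system concrete, here is a minimal sketch in Python. The fraud scenario, the transaction amounts, and all function names are made up for illustration. The "learning" is deliberately tiny: the code only picks the threshold that makes the fewest mistakes on the labeled examples, which is far simpler than real-world ML but follows the same idea.

```python
# Toy data: (transaction amount in USD, was it labeled as fraud?)
# All numbers are invented purely for illustration.
training_data = [
    (20, False), (35, False), (50, False), (80, False),
    (400, True), (650, True), (900, True), (1200, True),
]


def rule_based_is_fraud(amount):
    """Rule-based system: a human hand-codes the threshold."""
    return amount > 1000


def learn_threshold(examples):
    """Machine learning in miniature: choose the threshold that
    makes the fewest mistakes on the labeled examples."""
    amounts = sorted(amount for amount, _ in examples)
    # Candidate thresholds are the midpoints between neighboring amounts.
    candidates = [(a + b) / 2 for a, b in zip(amounts, amounts[1:])]

    def errors(threshold):
        # Count examples where the prediction disagrees with the label.
        return sum((amount > threshold) != label for amount, label in examples)

    return min(candidates, key=errors)


learned_threshold = learn_threshold(training_data)


def learned_is_fraud(amount):
    """Learned system: the threshold comes from the data, not from a person."""
    return amount > learned_threshold


print("Learned threshold:", learned_threshold)
print("Rule-based verdict for 700:", rule_based_is_fraud(700))
print("Learned verdict for 700:", learned_is_fraud(700))
```

The hand-coded rule and the learned rule disagree on a 700 USD transaction, because the data suggests a lower threshold than the one a person happened to pick. If the data changes, the learned threshold changes with it, while the hand-coded rule stays the same until someone edits it.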
4. Deep Learning
Deep Learning is a subset of machine learning that uses a certain type of machine learning model: so-called artificial neural networks (ANNs), which typically contain many layers (hence "deep").
While the concept of ANNs was inspired by the human brain, the actual operation of these networks is vastly different from the biological processes in the brain.
These models are sophisticated pattern recognition systems and should not be mistaken for conscious entities.
Essentially, deep learning is just a very complex variant of machine learning.
Neural Network Architecture Example
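As a rough sketch of what "many layers" means in practice, here is a tiny neural network written in plain Python. The weights are hypothetical values I picked by hand; a real network would learn them from data and would be vastly larger. The code only shows the forward pass, meaning how an input flows through the layers to produce an output.

```python
import math


def dense_layer(inputs, weights, biases):
    """One layer: every neuron computes a weighted sum of the inputs
    plus a bias, then applies a non-linear activation (here: tanh)."""
    return [
        math.tanh(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


# Hypothetical, hand-picked weights; a real network learns these from data.
layers = [
    # (weights, biases): 2 inputs -> 3 neurons
    ([[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]], [0.0, 0.1, -0.1]),
    # 3 inputs -> 2 neurons
    ([[0.7, -0.5, 0.2], [0.3, 0.9, -0.6]], [0.05, 0.0]),
    # 2 inputs -> 1 output neuron
    ([[1.0, -1.0]], [0.0]),
]


def forward(inputs):
    """Pass the inputs through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


output = forward([1.0, 2.0])

# Every weight and every bias is one "parameter" of the model.
num_parameters = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

print("Output:", output)
print("Number of parameters:", num_parameters)
```

This toy network has 20 parameters spread over three layers. Stacking more and wider layers of the same kind is what makes a network "deep".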
5. Generative AI
Generative AI is a branch of deep learning that aims to create or modify original content. It doesn't inherently have a creative mind; it uses statistical patterns in data to generate similar content without "understanding" the content the way humans do.
The first popular example of this technology was Deepfakes, where visual and audio content is altered to such an extent that it's hard to tell if the content is real or not.
Generative AI spans multiple domains and approaches, from text generation to music generation to image generation to video generation.
The Generative AI hype was started by the invention of another specific type of ANN, namely Generative Adversarial Networks (GANs), in 2014 by Ian Goodfellow and his colleagues. Since then, a lot of research has been done in the Generative AI space, which ultimately led to the development of another (and in many cases better) type of ANN architecture: Transformers, introduced in 2017 by researchers at Google.
6. Large Language Models (LLMs)
The Transformer architecture, introduced in 2017, was a major breakthrough for Large Language Models (LLMs). BERT, a widely recognized LLM developed by Google in 2018, leveraged this architecture and received widespread acclaim for natural language understanding tasks such as question answering and sentiment analysis.
Today, modern LLMs typically have anywhere from several hundred million to many billions of parameters and can help us not only with tasks such as answering questions, but also with writing essays, summarizing long documents, translating languages, generating code, and much more!
LLMs don't have true language understanding; their responses are based on patterns learned during training, not on inherent knowledge or comprehension of the world.
Hence, they are sometimes called "fancy autocomplete" as they essentially try to complete text inputs by predicting what comes next based on a given context and patterns in their training data.
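To illustrate the "fancy autocomplete" idea, here is a deliberately simple sketch in Python. It is not how an LLM works internally: it merely counts which word followed which in a tiny made-up corpus and always picks the most frequent follower. Real LLMs use neural networks and much longer contexts, but the basic task of predicting what comes next is the same. The corpus and function names are assumptions for illustration only.

```python
from collections import Counter, defaultdict

# Tiny made-up "training corpus"; real LLMs train on vastly more text.
corpus = (
    "the cat sat on the mat . "
    "the cat ate the fish . "
    "the dog sat on the rug ."
)

# Count which word follows which word in the corpus.
next_word_counts = defaultdict(Counter)
words = corpus.split()
for current, following in zip(words, words[1:]):
    next_word_counts[current][following] += 1


def predict_next(word):
    """Return the word that most often followed `word` in the corpus,
    or None if the word never appeared."""
    candidates = next_word_counts.get(word)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def autocomplete(prompt, max_words=4):
    """Extend the prompt one predicted word at a time."""
    words_so_far = prompt.split()
    for _ in range(max_words):
        following = predict_next(words_so_far[-1])
        if following is None:
            break
        words_so_far.append(following)
    return " ".join(words_so_far)


print(predict_next("the"))
print(autocomplete("the dog"))
```

Notice that the completion of "the dog" ends with the dog sitting on the cat, a sentence that appears nowhere in the corpus. The model has no idea what a dog or a cat is; it only follows the statistics of its training data.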
Over the past few years, there has been a tremendous effort in LLM development, both from commercial organizations and open source contributions.
Check out the Awesome-LLM GitHub repository for an amazing overview!
Google’s BERT model was based on the transformer architecture.
7. Generative Pre-Trained Transformers (GPT)
Generative Pre-trained Transformers (GPTs) are a special type of Large Language Model that uses a transformer architecture to generate original content.
GPTs are typically trained on a very large corpus of data - such as a large share of the text on the public internet - from which they learn the patterns and structures of language to generate text.
Despite its impressive capabilities, it's crucial to note that even the most advanced GPT does not "understand" the content it generates in the way humans do.
A GPT is essentially recognizing and predicting statistical patterns in the data it was trained on – a classical machine learning approach.
GPT-4 is a specific, proprietary LLM from OpenAI that is available for both commercial and personal use. However, OpenAI is not so open about how exactly this model was built.
All we know is that it is currently one of the most powerful LLMs in the world.
Still, even the most advanced models like GPT-4 come with a good set of limitations. For example, GPT-4 is not free from AI hallucinations, where the model produces outputs that are not based on actual knowledge or data.
Also, biases in the training data can lead to biased outputs, which you should be aware of when using these models.
Nevertheless, GPT-based large language models like GPT-4 are powerful tools that have many potential applications in various domains, including natural language processing, chatbots, content creation, language translation, and more.
To learn more about how LLMs are built and how they work, I recommend watching this video by OpenAI co-founder Andrej Karpathy:
ChatGPT is a web application by OpenAI that lets users interact with their GPT-4 and GPT-3.5 models.
While ChatGPT is the most popular, many similar services are entering the space:
Bard by Google (uses PaLM 2)
Claude by Anthropic (uses their own Claude LLM)
Poe by Quora (allows accessing multiple models)
While applications like ChatGPT often seem like a thin layer around existing LLMs, it's worth noting that there's often more going on than you might expect.
For example, ChatGPT has to deal with the same limitations as the underlying GPT-4 model, such as AI hallucination or biased output. However, the application layer provides more ways to mitigate these limitations, such as setting rules and filters to keep responses within certain parameters or expected outcomes - a process often referred to as "alignment" between the model and the desired behavior.
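The idea of rules and filters in an application layer can be sketched in a few lines of Python. To be clear, this is not how ChatGPT is actually built; OpenAI has not published those details. The blocked-topic list, the length limit, and the stand-in `fake_model` function are all hypothetical and only show where such checks could sit around a model call.

```python
# Hypothetical rule list and limit, chosen only for illustration.
BLOCKED_TOPICS = ("password", "credit card")
MAX_REPLY_LENGTH = 200


def fake_model(prompt):
    """Stand-in for a real LLM call; it simply echoes the prompt."""
    return f"Here is a response about: {prompt}"


def chat_app(user_message, model=fake_model):
    """A thin application layer around a model."""
    # Rule applied before the model is called.
    if any(topic in user_message.lower() for topic in BLOCKED_TOPICS):
        return "Sorry, I can't help with that request."

    reply = model(user_message)

    # Filter applied after the model has answered.
    if len(reply) > MAX_REPLY_LENGTH:
        reply = reply[:MAX_REPLY_LENGTH].rstrip() + "..."
    return reply


print(chat_app("Explain deep learning"))
print(chat_app("What is my neighbor's password?"))
```

The model itself is unchanged in both cases. What differs is what the application lets through, which is why the same underlying model can behave differently in different products.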
Don’t be confused: Even if you hear it often, there’s no such thing as ChatGPT-4. It’s either ChatGPT (the web app) or GPT-4 (the model).
I hope this article gave you a good overview of some key AI concepts and how they are connected.
As you continue to interact with AI, feel free to come back to these explanations and keep in mind the potential applications and limitations of these technologies.
Good communication involves tailoring the message to the audience and using appropriate language to convey the intended meaning.
Feel free to build an AI vocabulary that helps you along your journey - and sometimes that might involve not talking about "AI" at all.
How did you like today's newsletter?
Feel free to let me know!
Until next Friday,