AI Terminology Explained

All too often our team gets asked to define AI.

Artificial Intelligence: Intelligence exhibited by machines, particularly computer systems, as opposed to the natural intelligence of living beings [Wikipedia]

Of course, this is extremely vague, as is the nature of language, and the definition has drifted further under the pull of marketing campaigns. This blog article covers what the technology community has broadly settled on as the technical definitions of AI and its associated phrases.

The Tech Community Definition of Artificial Intelligence

While marketing campaigns refer to a myriad of things as AI, within the tech community, AI is more narrowly defined as any system that makes a decision based on multiple pieces of information with the use of statistics.

Alongside AI we hear a number of terms like “machine learning” (ML), “deep learning” (DL), and “natural language processing” (NLP). These can seem confusing and overlapping with each other, and indeed they are. The diagram below illustrates just that.

[Figure: AI Venn diagram]

Machine Learning

Machine learning is a family of AI models specifically tailored for estimating numerical values from an input matrix. Most of these models take on a structure analogous to Y = M * X, where Y is the answer, M is the model coefficients, and X is the input. Y is most often a single number, whereas M and X are typically vectors. During training, the input (X) and output (Y) are known, and the coefficients (M) are calculated. During deployment, the coefficients (M) and input (X) are known, and Y is calculated. While multiplication is shown in this example, it is only a representation; the math is often much more complex.
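As a minimal sketch of that training-versus-deployment split (the data and variable names here are purely illustrative), consider solving for M with ordinary least squares and then reusing it on a new input:

```python
import numpy as np

# Illustrative data only: each row of X is one input vector, y holds the known answers.
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5],
              [4.0, 3.0]])
y = np.array([5.0, 4.5, 7.5, 11.0])

# Training: X and y are known; solve for the coefficients M
# (ordinary least squares is one classic way to do this).
M, *_ = np.linalg.lstsq(X, y, rcond=None)

# Deployment: M and a new input are known; calculate the answer.
x_new = np.array([2.5, 1.0])
y_pred = M @ x_new
print(M, y_pred)
```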

The simplest example of these is the classic linear regression, or more commonly its cousin the logistic regression, whose output is restricted to the range between 0 and 1. These are the building blocks that make up everything from simple regression statistics to transformer generative AI.
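Here is a hedged sketch using scikit-learn (toy data, illustrative only) of how a logistic regression squeezes its output into that 0-to-1 range:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: two features per example, binary labels.
X = np.array([[0.5, 1.0], [1.0, 1.5], [3.0, 2.0], [4.0, 3.5]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression().fit(X, y)

# The logistic function restricts each output to the range (0, 1),
# which is why it is often read as a confidence score.
print(model.predict_proba([[2.0, 2.0]]))  # one row: confidence for class 0 and class 1
```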

[Figure: types of basic machine learning]

While not always true, the output of a machine learning model is most frequently between 0 and 1, regardless of its mathematical workings under the hood. This number represents the system’s confidence that a question or state is true (this is not the same as a probability, though calibration tools can adjust it to behave like one).
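One such tool, sketched below with scikit-learn (synthetic data, illustrative only), is a calibration wrapper that maps a model’s raw scores onto something that behaves like a probability:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# An SVM's raw scores are not probabilities, but calibration can map them onto one.
X, y = make_classification(n_samples=200, random_state=0)
calibrated = CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=3).fit(X, y)
print(calibrated.predict_proba(X[:3]))  # rows now sum to 1 and read like probabilities
```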

Machine learning models have varying levels of transparency, and it becomes more and more difficult to explain a model’s inner reasoning to a user as its complexity grows.

[Figure: transparency vs. performance of AI models]

Examples of machine learning models include:

  • Regressions

  • Random forest

  • Support vector machines

  • XGBoost

  • LightGBM

  • Any deep learning system

Deep Learning

Deep learning is a field that takes concepts from machine learning and intensely scales them up. Deep learning models are complex systems made up of millions, sometimes even trillions, of mini machine learning models (usually logistic regressions) arranged in different sequences and passing data to each other. These ‘neurons’ are individually trained and weighted in importance during training. How the neurons connect to each other can get quite complicated in some deep learning architectures.
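A toy sketch of that idea in numpy (random weights standing in for trained ones): each ‘neuron’ below is just a logistic regression over its inputs, and the layers pass data forward to each other.

```python
import numpy as np

def sigmoid(z):
    # The logistic function: each "neuron" is essentially a tiny logistic regression.
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))  # layer 1: 3 neurons, each reading a 4-number input
W2 = rng.normal(size=(3, 1))  # layer 2: 1 neuron reading the 3 hidden outputs

x = np.array([0.2, 0.7, 0.1, 0.9])  # one input vector
hidden = sigmoid(x @ W1)            # each hidden neuron: weighted sum -> logistic
output = sigmoid(hidden @ W2)       # the final neuron combines the hidden layer
print(output)
```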

Deep learning architectures are able to take on much more complicated input. While standard machine learning typically accepts only vector input (a list of numbers), deep learning can handle data of much higher dimension. In most cases this means 2-dimensional input, but 3-dimensional input and beyond has use cases in applications such as 3D graphics, or any use case where time is a component.
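To make the dimensionality point concrete (shapes here are illustrative only):

```python
import numpy as np

vector_input = np.zeros(10)              # classic ML: a list of 10 numbers
image_input = np.zeros((224, 224))       # 2-D input, e.g. a grayscale image
video_input = np.zeros((16, 224, 224))   # 3-D input: 16 frames over time

for name, arr in [("vector", vector_input), ("image", image_input), ("video", video_input)]:
    print(name, arr.ndim, "dimensions, shape", arr.shape)
```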

Because of this multidimensional capability, deep learning is uniquely suited to image data (and therefore video). It has also found its place in NLP thanks to vector-based word representations (multiple words = multiple vectors = a 2D matrix), which are now popular in transformer models such as those behind large language models like OpenAI’s ChatGPT, BERT, Meta’s LLaMA, and the many similar models hosted on Hugging Face, all of which are, in the end, quite similar.
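A toy sketch of the “multiple words = multiple vectors = 2D matrix” idea (the embedding values are made up for illustration):

```python
import numpy as np

# Toy embedding table: each word maps to a vector of 4 numbers.
embeddings = {
    "the": np.array([0.1, 0.3, 0.0, 0.2]),
    "cat": np.array([0.9, 0.1, 0.4, 0.7]),
    "sleeps": np.array([0.2, 0.8, 0.5, 0.1]),
}

sentence = ["the", "cat", "sleeps"]

# Multiple words -> multiple vectors -> one 2-D matrix (rows = words).
matrix = np.stack([embeddings[word] for word in sentence])
print(matrix.shape)  # (3, 4): 3 words, 4 numbers each
```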

There are still strong limits to modern neural networks: they must have a fixed input size and a fixed output size, though clever implementations can make it appear as if no such limit exists. One common workaround is sketched below.
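The sketch pads short inputs and truncates long ones so everything matches the network’s fixed size (FIXED_LENGTH and PAD_ID are assumed placeholders for illustration, not from any specific library):

```python
FIXED_LENGTH = 8  # the network's fixed input size (assumed for illustration)
PAD_ID = 0        # a placeholder token id (assumed for illustration)

def to_fixed_length(token_ids: list) -> list:
    # Truncate anything too long, pad anything too short.
    clipped = token_ids[:FIXED_LENGTH]
    return clipped + [PAD_ID] * (FIXED_LENGTH - len(clipped))

print(to_fixed_length([5, 9, 2]))        # padded up to 8 entries
print(to_fixed_length(list(range(12))))  # truncated down to 8
```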

Examples of famous deep learning systems:

  • Feedforward neural networks

  • Siamese neural networks

  • AlexNet

  • ResNet

  • AlphaGo

  • Transformers (BERT / GPT / LLaMA / Gemini (Bard))

Natural Language Processing

This one is quite a biggy. Natural language processing is both the oldest and newest technology listed here, as its definition is extremely broad and includes any technology that involves language.

Early examples of NLP include Optical Character Recognition (OCR), a technology for reading letters of text from a page. In its early days OCR was purely logical (rule-based), but it now often features aspects of deep learning in specific scenarios.

Due to the complexity and evolving state of language, many consider this to be the pinnacle of computer science, especially in domains like healthcare, chemistry, and biology, where new concepts and distinct language are common. Punctuation, slang, words that appear overnight, and multiple meanings all make representing words especially complicated.

Examples of famous natural language processing systems:

  • Optical character recognition

  • Regular Expressions

  • Bag-of-words approaches (see the sketch after this list)

  • Language translation

  • Speech-to-text and text-to-speech

  • Word Embeddings

  • Alexa (Amazon)

  • IBM Watson, as it appeared on Jeopardy

  • Transformers/LLMs (BERT / GPT / LLaMA / Gemini (Bard))
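As one concrete taste of the list above, here is a minimal bag-of-words sketch using scikit-learn: each document becomes a vector of word counts, with word order thrown away entirely.

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat", "the cat sat on the mat", "dogs and cats"]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)

print(vectorizer.get_feature_names_out())  # the learned vocabulary
print(counts.toarray())                    # one row of word counts per document
```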

Natural Language Processing (Transformers / Large Language Models)

On the subject of transformers (BERT / GPT / ChatGPT / LLaMA / Gemini (Bard)): these are also referred to as large language models (LLMs), or, in their wider applications to audio, images, and video, as generative AI. These are systems that have carefully combined multiple novel approaches learned over the years, chaining numerous neural networks together to create an output with a relatively small amount of compute when you consider their size.

These systems have taken a unique approach: they numerically define a word by its context. This solves the case of never-before-seen words and synonyms, and adds contextual understanding, all at the same time. Modern implementations are even able to create new words in the output if necessary. These vector word representations undergo a large series of transformations to generate a single word (token) of output. Once the process is complete, it runs again, this time including the previously generated word in the input.
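That loop can be sketched in a few lines; `model` and `next_token` here are hypothetical placeholders standing in for the whole transformer stack, not a real library API:

```python
def generate(model, prompt_tokens, max_new_tokens=20, stop_token="<end>"):
    # Autoregressive generation: one forward pass produces one token,
    # which is appended to the input before the next pass.
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = model(tokens)  # hypothetical call: tokens in, one token out
        if next_token == stop_token:
            break
        tokens.append(next_token)   # feed the new word back in as input
    return tokens
```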

LLM Architecture

There are extra layers beyond the core model in systems like the one deployed by OpenAI and similar offerings. One such layer is the “system” prompt, which defines the context of what the assistant is doing; usually it is answering a question. This layer also keeps the results appropriate for users and is usually able to suppress dangerous biases in the output.
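A sketch of the widely used chat-message format that carries this layer (field names follow the common convention; exact details vary by provider):

```python
messages = [
    # The "system" entry sets the assistant's context before any user input.
    {"role": "system", "content": "You are a helpful assistant that answers questions."},
    {"role": "user", "content": "What is machine learning?"},
]
# The full list, system entry included, is what the model actually receives.
```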

Shortcomings of Transformers / LLMs

Generative AI has its hangups, some of which are close to being resolved and others that remain far from it. Generative AI systems:

  1. still fall short of human-level problem solving. One of the most famous benchmarks for these models is GSM8K, a large list of grade-school math problems [1].

  2. are not capable of expressing quantifiable confidence the way traditional machine learning models can.

  3. have a limit on the amount of input they can accommodate before generating an output. Presently the upper limit of this is around 32,000 words.

  4. are bound to only data they have access to.

  5. are prone to making up information [2].

We will cover more on these in a separate blog post.

Sources

[1] Cobbe, Karl, et al. "Training verifiers to solve math word problems." arXiv preprint arXiv:2110.14168 (2021).

[2] Brodkin, Jon. “Michael Cohen Loses Court Motion after Lawyer Cited AI-Invented Cases.” Ars Technica, 20 Mar. 2024, arstechnica.com/tech-policy/2024/03/michael-cohen-and-lawyer-avoid-sanctions-for-citing-fake-cases-invented-by-ai/.
