AI Terminology Explained

All too often our team gets asked to define AI.

Artificial Intelligence: Intelligence exhibited by machines, particularly computer systems, as opposed to the natural intelligence of living beings [Wikipedia]

Of course, this is extremely vague, as is the nature of language, and the definition has drifted further under the pull of marketing campaigns. This blog article covers what the technology community has broadly settled on as the technical definitions of AI and its associated phrases.

The Tech Community Definition of Artificial Intelligence

While marketing campaigns refer to a myriad of things as AI, within the tech community, AI is more narrowly defined as any system that makes a decision based on multiple pieces of information with the use of statistics.

Alongside AI we hear a number of terms like “machine learning” (ML), “deep learning” (DL), and “natural language processing” (NLP). These can seem confusing and overlapping with each other, and indeed they are. The diagram below illustrates just that.

[Figure: AI Venn diagram]

Machine Learning

Machine learning is a family of AI models specifically tailored for estimating numerical values from an input matrix. Most of these models take on a structure analogous to Y = M * X, where Y is the answer, M is the model coefficients, and X is the input. Y is most often a single number, whereas M and X are typically vectors. During training, the input (X) and output (Y) are known, and the coefficients (M) are calculated. During deployment, the coefficients (M) and input (X) are known, and Y is calculated. While multiplication is shown in this example, it is only a representation; the math is often much more complex.
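As a minimal sketch of that training-versus-deployment split (the data and variable names here are purely illustrative), consider solving for M with ordinary least squares and then reusing it on a new input:

```python
import numpy as np

# Illustrative data only: each row of X is one input vector, y holds the known answers.
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5],
              [4.0, 3.0]])
y = np.array([5.0, 4.5, 7.5, 11.0])

# Training: X and y are known; solve for the coefficients M
# (ordinary least squares is one classic way to do this).
M, *_ = np.linalg.lstsq(X, y, rcond=None)

# Deployment: M and a new input are known; calculate the answer.
x_new = np.array([2.5, 1.0])
y_pred = M @ x_new
print(M, y_pred)
```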

The simplest example of these is the classic linear regression, or more commonly its cousin the logistic regression, whose output is restricted to the range between 0 and 1. These are the building blocks that make up everything from simple regression statistics to transformer generative AI.
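Here is a hedged sketch using scikit-learn (toy data, illustrative only) of how a logistic regression squeezes its output into that 0-to-1 range:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: two features per example, binary labels.
X = np.array([[0.5, 1.0], [1.0, 1.5], [3.0, 2.0], [4.0, 3.5]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression().fit(X, y)

# The logistic function restricts each output to the range (0, 1),
# which is why it is often read as a confidence score.
print(model.predict_proba([[2.0, 2.0]]))  # one row: confidence for class 0 and class 1
```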

[Figure: types of basic machine learning]

While not always true, the output of a machine learning model is most frequently between 0 and 1, regardless of its mathematical workings under the hood. This number represents the system’s confidence that a question or state is true (this is not the same as a probability, though calibration tools can adjust it to behave like one).
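One such tool, sketched below with scikit-learn (synthetic data, illustrative only), is a calibration wrapper that maps a model’s raw scores onto something that behaves like a probability:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# An SVM's raw scores are not probabilities, but calibration can map them onto one.
X, y = make_classification(n_samples=200, random_state=0)
calibrated = CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=3).fit(X, y)
print(calibrated.predict_proba(X[:3]))  # rows now sum to 1 and read like probabilities
```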

Machine learning models have varying levels of transparency, and it becomes more and more difficult to explain a model’s inner reasoning to a user as its complexity grows.

[Figure: transparency vs. performance of AI models]

Examples of machine learning models include:

  • Regressions

  • Random forest

  • Support vector machines

  • XGBoost

  • LightGBM

  • Any deep learning system

Deep Learning

Deep learning is a field that takes concepts from machine learning and intensely scales them up. Deep learning models are complex systems made up of millions, sometimes even trillions, of mini machine learning models (usually logistic regressions) arranged in different sequences and passing data to each other. These ‘neurons’ are individually trained and weighted in importance during training. How the neurons connect to each other can get quite complicated in some deep learning architectures.
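A toy sketch of that idea in numpy (random weights standing in for trained ones): each ‘neuron’ below is just a logistic regression over its inputs, and the layers pass data forward to each other.

```python
import numpy as np

def sigmoid(z):
    # The logistic function: each "neuron" is essentially a tiny logistic regression.
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))  # layer 1: 3 neurons, each reading a 4-number input
W2 = rng.normal(size=(3, 1))  # layer 2: 1 neuron reading the 3 hidden outputs

x = np.array([0.2, 0.7, 0.1, 0.9])  # one input vector
hidden = sigmoid(x @ W1)            # each hidden neuron: weighted sum -> logistic
output = sigmoid(hidden @ W2)       # the final neuron combines the hidden layer
print(output)
```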

Deep learning architectures are able to take on much more complicated input. While standard machine learning typically accepts only vector input (a list of numbers), deep learning can handle data of much higher dimension. In most cases this means 2-dimensional input, but 3-dimensional input and beyond has use cases in applications such as 3D graphics, or any use case where time is a component.
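To make the dimensionality point concrete (shapes here are illustrative only):

```python
import numpy as np

vector_input = np.zeros(10)              # classic ML: a list of 10 numbers
image_input = np.zeros((224, 224))       # 2-D input, e.g. a grayscale image
video_input = np.zeros((16, 224, 224))   # 3-D input: 16 frames over time

for name, arr in [("vector", vector_input), ("image", image_input), ("video", video_input)]:
    print(name, arr.ndim, "dimensions, shape", arr.shape)
```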

Because of this multidimensional capability, deep learning is uniquely suited to image data (and therefore video). It has also found its place in NLP thanks to vector-based word representations (multiple words = multiple vectors = a 2D matrix), which are now popular in transformer models such as those behind large language models like OpenAI’s ChatGPT, BERT, Meta’s LLaMA, and the many similar models hosted on Hugging Face, all of which are, in the end, quite similar.
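A toy sketch of the “multiple words = multiple vectors = 2D matrix” idea (the embedding values are made up for illustration):

```python
import numpy as np

# Toy embedding table: each word maps to a vector of 4 numbers.
embeddings = {
    "the": np.array([0.1, 0.3, 0.0, 0.2]),
    "cat": np.array([0.9, 0.1, 0.4, 0.7]),
    "sleeps": np.array([0.2, 0.8, 0.5, 0.1]),
}

sentence = ["the", "cat", "sleeps"]

# Multiple words -> multiple vectors -> one 2-D matrix (rows = words).
matrix = np.stack([embeddings[word] for word in sentence])
print(matrix.shape)  # (3, 4): 3 words, 4 numbers each
```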

There are still strong limits to modern neural networks: they must have a fixed input size and a fixed output size, though clever implementations can make it appear as if no such limit exists. One common workaround is sketched below.
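The sketch pads short inputs and truncates long ones so everything matches the network’s fixed size (FIXED_LENGTH and PAD_ID are assumed placeholders for illustration, not from any specific library):

```python
FIXED_LENGTH = 8  # the network's fixed input size (assumed for illustration)
PAD_ID = 0        # a placeholder token id (assumed for illustration)

def to_fixed_length(token_ids: list) -> list:
    # Truncate anything too long, pad anything too short.
    clipped = token_ids[:FIXED_LENGTH]
    return clipped + [PAD_ID] * (FIXED_LENGTH - len(clipped))

print(to_fixed_length([5, 9, 2]))        # padded up to 8 entries
print(to_fixed_length(list(range(12))))  # truncated down to 8
```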

Examples of famous deep learning systems:

  • Feedforward neural networks

  • Siamese neural networks

  • AlexNet

  • ResNet

  • AlphaGo

  • Transformers (BERT / GPT / LLaMA / Gemini (Bard))

Natural Language Processing

This one is quite a biggy. Natural language processing is both the oldest and newest technology listed here, as its definition is extremely broad and includes any technology that involves language.

Early examples of NLP include Optical Character Recognition (OCR), a technology for reading letters of text from a page. In its early days OCR was purely logical (rule-based), but it now often features aspects of deep learning in specific scenarios.

Due to the complexity and evolving state of language, many consider this to be the pinnacle of computer science, especially in domains like healthcare, chemistry, and biology, where new concepts and distinct language are common. Punctuation, slang, words that appear overnight, and multiple meanings all make representing words especially complicated.

Examples of famous natural language processing systems:

  • Optical character recognition

  • Regular Expressions

  • Bag-of-words approaches (see the sketch after this list)

  • Language translation

  • Speech-to-text and text-to-speech

  • Word Embeddings

  • Alexa (Amazon)

  • IBM Watson, as it appeared on Jeopardy

  • Transformers/LLMs (BERT / GPT / LLaMA / Gemini (Bard))
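As one concrete taste of the list above, here is a minimal bag-of-words sketch using scikit-learn: each document becomes a vector of word counts, with word order thrown away entirely.

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat", "the cat sat on the mat", "dogs and cats"]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)

print(vectorizer.get_feature_names_out())  # the learned vocabulary
print(counts.toarray())                    # one row of word counts per document
```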

Natural Language Processing (Transformers / Large Language Models)

On the subject of transformers (BERT / GPT / ChatGPT / LLaMA / Gemini (Bard)): these are also referred to as large language models (LLMs), or, in their wider applications to audio, images, and video, as generative AI. These are systems that have carefully combined multiple novel approaches learned over the years, chaining numerous neural networks together to create an output with a relatively small amount of compute when you consider their size.

These systems have taken a unique approach: they numerically define a word by its context. This solves the case of never-before-seen words and synonyms, and adds contextual understanding, all at the same time. Modern implementations are even able to create new words in the output if necessary. These vector word representations undergo a large series of transformations to generate a single word (token) of output. Once the process is complete, it runs again, this time including the previously generated word in the input.
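That loop can be sketched in a few lines; `model` and `next_token` here are hypothetical placeholders standing in for the whole transformer stack, not a real library API:

```python
def generate(model, prompt_tokens, max_new_tokens=20, stop_token="<end>"):
    # Autoregressive generation: one forward pass produces one token,
    # which is appended to the input before the next pass.
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = model(tokens)  # hypothetical call: tokens in, one token out
        if next_token == stop_token:
            break
        tokens.append(next_token)   # feed the new word back in as input
    return tokens
```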

LLM Architecture

There are extra layers beyond the core model in systems like the one deployed by OpenAI and similar offerings. One such layer is the “system” prompt, which defines the context of what the assistant is doing; usually it is answering a question. This layer also keeps the results appropriate for users and is usually able to suppress dangerous biases in the output.
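A sketch of the widely used chat-message format that carries this layer (field names follow the common convention; exact details vary by provider):

```python
messages = [
    # The "system" entry sets the assistant's context before any user input.
    {"role": "system", "content": "You are a helpful assistant that answers questions."},
    {"role": "user", "content": "What is machine learning?"},
]
# The full list, system entry included, is what the model actually receives.
```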

Shortcomings of Transformers / LLMs

Generative AI has its hangups, some of which are close to being resolved and others that remain far from it. Generative AI systems:

  1. still fall short of human-level problem solving. One of the most famous benchmarks for these models is GSM8K, a large list of grade-school math problems [1].

  2. are not capable of expressing quantifiable confidence the way traditional machine learning models can.

  3. have a limit on the amount of input they can accommodate before generating an output. Presently the upper limit of this is around 32,000 words.

  4. are bound to only data they have access to.

  5. are prone to making up information [2].

We will cover more on these in a separate blog post.

Sources

[1] Cobbe, Karl, et al. "Training verifiers to solve math word problems." arXiv preprint arXiv:2110.14168 (2021).

[2] Brodkin, Jon. “Michael Cohen Loses Court Motion after Lawyer Cited AI-Invented Cases.” Ars Technica, 20 Mar. 2024, arstechnica.com/tech-policy/2024/03/michael-cohen-and-lawyer-avoid-sanctions-for-citing-fake-cases-invented-by-ai/.
