Deep Learning Terms Glossary (2024)

A

Activation Function

An Activation Function is a mathematical function that determines the output of a neural network node.
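
As a minimal illustration (using NumPy; the input values are arbitrary), here are two common activation functions:

    import numpy as np

    def relu(x):
        # Outputs the input where positive, zero elsewhere
        return np.maximum(0, x)

    def sigmoid(x):
        # Squashes any real value into the range (0, 1)
        return 1 / (1 + np.exp(-x))

    z = np.array([-2.0, -0.5, 0.0, 1.5])
    print(relu(z))     # [0.  0.  0.  1.5]
    print(sigmoid(z))  # values strictly between 0 and 1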

Activation Map

An Activation Map is a visual representation of the outputs of the neurons in a specific layer of a neural network.

Actor-Critic

Actor-Critic is a reinforcement learning algorithm that combines the advantages of both value-based methods, such as Q-Learning, and policy-based methods, such as Policy Gradient.

Adam Optimizer

Adam Optimizer is an optimization algorithm commonly used to update the weights in a neural network during training.
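
A sketch of how Adam is typically used in PyTorch; the linear model and random data here are placeholders for illustration:

    import torch
    import torch.nn as nn

    # Toy model and data (stand-ins for a real task)
    model = nn.Linear(10, 1)
    x, y = torch.randn(32, 10), torch.randn(32, 1)

    # Adam maintains per-parameter estimates of the gradient's first
    # and second moments to adapt each parameter's effective step size
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for step in range(100):
        optimizer.zero_grad()        # clear old gradients
        loss = loss_fn(model(x), y)  # forward pass
        loss.backward()              # backpropagation
        optimizer.step()             # Adam weight update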

Adversarial Examples

Adversarial Examples are carefully crafted inputs that are slightly modified to deceive a deep learning model, highlighting vulnerabilities and potential security concerns.

Artificial Neural Network

Artificial Neural Networks (ANNs) are computing systems inspired by the neurons in a biological brain, used for various machine learning tasks.

Attention Mechanism

The Attention Mechanism is a component used in neural architectures to focus on relevant parts of the input, enabling the model to selectively process and weigh different parts of the data.

AUC-ROC

AUC-ROC (Area Under the ROC Curve) is a metric that measures the overall performance of a binary classification model, with higher values indicating better performance.

Autoencoder

An Autoencoder is a type of neural network that learns to encode and decode inputs, often used for unsupervised learning and dimensionality reduction.

B

Backpropagation

Backpropagation is a common method used to train neural networks by adjusting the weights of the connections.
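
A minimal NumPy sketch of backpropagation through a single sigmoid neuron, applying the chain rule by hand (the inputs, target, and learning rate are arbitrary):

    import numpy as np

    x, target = np.array([0.5, -1.0]), 1.0
    w, b = np.array([0.1, 0.2]), 0.0
    lr = 0.1

    for _ in range(50):
        # Forward pass
        z = w @ x + b
        y = 1 / (1 + np.exp(-z))        # sigmoid activation
        loss = 0.5 * (y - target) ** 2  # squared error

        # Backward pass: chain rule from the loss back to the weights
        dloss_dy = y - target
        dy_dz = y * (1 - y)             # sigmoid derivative
        grad_w = dloss_dy * dy_dz * x
        grad_b = dloss_dy * dy_dz * 1.0

        # Gradient descent update
        w -= lr * grad_w
        b -= lr * grad_b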

Batch Normalization

Batch Normalization is a technique used to normalize the inputs of each layer, improving the training speed and stability of neural networks.

Batch Size

The Batch Size refers to the number of training examples utilized in one iteration of the gradient descent algorithm.

BERT

BERT (Bidirectional Encoder Representations from Transformers) is a powerful pre-trained language model that has been fine-tuned on a wide range of NLP tasks, achieving state-of-the-art performance.

Bias

Bias refers to a constant term added to the weighted sum of inputs in a neuron, allowing a neural network to learn and model complex relationships.

C

Capsule Network

A Capsule Network, also known as CapsNet, is a type of neural network architecture that aims to overcome the limitations of traditional neural networks by representing entities as capsules, capturing hierarchical relationships.

Common Neural Network Architectures

Common Neural Network Architectures refer to popular and widely used structures or designs of neural networks, such as Feedforward Neural Networks, Recurrent Neural Networks, and Convolutional Neural Networks.

Computer Vision

Computer Vision is a multidisciplinary field that focuses on teaching computers to interpret and understand visual data from the real world, such as images and videos.

Confusion Matrix

A confusion matrix is a table that summarizes the performance of a classification model by showing the counts of true positives, true negatives, false positives, and false negatives.
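
A short example using scikit-learn (the labels are made up for illustration):

    from sklearn.metrics import confusion_matrix

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

    # Rows are true classes, columns are predicted classes:
    # [[TN, FP],
    #  [FN, TP]]
    print(confusion_matrix(y_true, y_pred))  # [[3 1] [1 3]]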

Convolution

Convolution is an operation that applies a filter to an input signal, commonly used in convolutional neural networks for image processing tasks.
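
A minimal NumPy sketch of the operation; note that deep learning libraries actually implement cross-correlation (no kernel flip) under the name "convolution":

    import numpy as np

    def conv2d(image, kernel):
        # Valid cross-correlation, as used in most CNN layers
        kh, kw = kernel.shape
        oh = image.shape[0] - kh + 1
        ow = image.shape[1] - kw + 1
        out = np.zeros((oh, ow))
        for i in range(oh):
            for j in range(ow):
                out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
        return out

    image = np.arange(16, dtype=float).reshape(4, 4)
    kernel = np.array([[1.0, -1.0]])  # horizontal difference filter
    print(conv2d(image, kernel))      # constant -1.0: uniform gradient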

Convolutional Neural Network (CNN)

A Convolutional Neural Network (CNN) is a deep learning model designed to process structured grid-like data such as images, commonly used for image and video recognition.

Curriculum Learning

Curriculum Learning is a training strategy where the difficulty of training examples is gradually increased to help the neural network learn more effectively.

D

Data Augmentation

Data Augmentation is a technique used to artificially increase the size of a training dataset by applying various transformations to the existing data.

Data Imbalance

Data Imbalance refers to a situation where the data used for training a machine learning model is skewed, with some classes or categories having significantly fewer instances than others.

Data Labeling

Data Labeling is the process of assigning predefined categorical or numerical values to raw data instances, enabling supervised machine learning models to learn and make predictions.

Data Preprocessing

Data Preprocessing refers to the transformation and normalization of raw data before feeding it into a machine learning model, involving tasks such as cleaning, scaling, and feature engineering.

Deep Belief Networks (DBN)

Deep Belief Networks (DBN) are probabilistic generative models that use unsupervised learning to perform feature extraction and can be stacked to form deep architectures.

Deep Learning

Deep Learning is a subset of Machine Learning that focuses on neural networks with multiple layers.

Deep Q-Network

A Deep Q-Network (DQN) is a type of reinforcement learning algorithm that combines Q-Learning with deep neural networks to learn policies for complex environments.

Deep Reinforcement Learning

Deep Reinforcement Learning combines deep neural networks with reinforcement learning algorithms to enable agents to learn and make decisions in complex environments.

DropConnect

DropConnect is a variant of Dropout in which individual connections (weights) in the network, rather than node outputs, are randomly dropped during training.

Dropout

Dropout is a regularization technique used in neural networks to prevent overfitting by temporarily removing nodes during training.

Dropout Rate

The Dropout Rate is the probability of randomly setting a node's output to zero during training.
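
A minimal NumPy sketch of "inverted" dropout, the common variant in which surviving activations are rescaled during training so no adjustment is needed at inference time (the rate and array are arbitrary):

    import numpy as np

    def dropout(activations, rate=0.5, training=True):
        # During training, zero each activation with probability `rate`
        # and rescale the survivors so the expected value is unchanged;
        # at inference time, pass the activations through as-is.
        if not training:
            return activations
        mask = np.random.rand(*activations.shape) >= rate
        return activations * mask / (1.0 - rate)

    a = np.ones(10)
    print(dropout(a, rate=0.5))  # a mix of 0.0 and 2.0 values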

E

Early Stopping

Early Stopping is a technique used to prevent overfitting by stopping the training process when the model's performance on a validation set starts to deteriorate.
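
A sketch of the early-stopping logic with a patience counter; the validation losses here are simulated, standing in for real evaluations on a validation set each epoch:

    # Toy validation-loss curve: improves, then deteriorates
    val_losses = [0.9, 0.7, 0.6, 0.55, 0.56, 0.58, 0.60]

    best, patience, bad_epochs = float("inf"), 2, 0
    for epoch, val_loss in enumerate(val_losses):
        if val_loss < best:
            best, bad_epochs = val_loss, 0  # improvement: reset counter
        else:
            bad_epochs += 1                 # performance deteriorated
            if bad_epochs >= patience:
                print(f"Stopping early at epoch {epoch}")  # epoch 5
                break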

Epoch

An Epoch is one complete pass through the entire training dataset during the training process.

Exploding Gradient Problem

The exploding gradient problem is the opposite of the vanishing gradient problem, where the gradients become extremely large during backpropagation, causing numerical instability and difficulty in learning.

F

F1 Score

The F1 Score is a metric that balances precision and recall, calculated as the harmonic mean of precision and recall.
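
A worked example computing precision, recall, and the F1 score from raw counts (the counts are made up):

    # True positives, false positives, false negatives
    tp, fp, fn = 30, 10, 20

    precision = tp / (tp + fp)  # 0.75
    recall = tp / (tp + fn)     # 0.60

    # Harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    print(round(f1, 3))         # 0.667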

Feedforward Neural Network

A Feedforward Neural Network is the simplest form of a neural network, where information flows in one direction, from input to output.

Fine-Tuning

Fine-Tuning is the process of taking a pre-trained machine learning model and adjusting its parameters or architecture to adapt it for a specific task or domain.

G

Gated Recurrent Unit

The Gated Recurrent Unit (GRU) is a type of recurrent neural network architecture that uses gating mechanisms to control the flow of information.

Generative Adversarial Network (GAN)

A Generative Adversarial Network (GAN) is a neural network architecture consisting of two models, a generator and a discriminator, that compete against each other to produce realistic data.

Gradient Descent

Gradient Descent is an optimization algorithm used to minimize the loss function in a neural network.
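
A minimal example: gradient descent on the one-dimensional loss (w - 3)^2, whose minimum is at w = 3:

    # Start from an arbitrary weight and step against the gradient
    w, lr = 0.0, 0.1

    for step in range(50):
        grad = 2 * (w - 3)  # derivative of (w - 3)^2 w.r.t. w
        w -= lr * grad      # move opposite the gradient
    print(w)                # close to 3.0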

Gradient Explosion

Gradient Explosion occurs during training when the gradient values grow very large; see Exploding Gradient Problem.

H

Hyperparameter

A Hyperparameter is a parameter that is set before training a machine learning model and cannot be learned from the data, such as the learning rate or the number of hidden units in a neural network.

I

Image Recognition

Image Recognition is a branch of computer vision that involves identifying and categorizing objects or patterns within digital images, often utilizing deep learning models.

K

Keras

Keras is a high-level neural networks API written in Python. It originally ran on top of TensorFlow, Theano, or CNTK; current versions are integrated with TensorFlow, and Keras 3 supports multiple backends, including TensorFlow, JAX, and PyTorch.

Kernel

A Kernel is a small matrix used in convolutions to extract features from an image.

L

Learning Rate

The Learning Rate determines how quickly a neural network adjusts its weights during training.

Learning Rate Decay

Learning Rate Decay is a technique used to gradually reduce the learning rate during training to improve convergence.

Learning Rate Schedule

A Learning Rate Schedule (implemented by a learning rate scheduler) adjusts the learning rate during training, often based on a predefined schedule or a performance metric, to improve convergence.
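
A sketch of one common schedule, step decay; the drop factor and interval are illustrative choices:

    # Halve the learning rate every 10 epochs
    def step_decay(initial_lr, epoch, drop=0.5, epochs_per_drop=10):
        return initial_lr * (drop ** (epoch // epochs_per_drop))

    for epoch in [0, 9, 10, 25]:
        print(epoch, step_decay(0.1, epoch))
    # 0 -> 0.1, 9 -> 0.1, 10 -> 0.05, 25 -> 0.025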

Long Short-Term Memory (LSTM)

Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture that uses gating mechanisms to retain information over long sequences, enabling it to learn long-term dependencies.

Loss Function

A Loss Function measures how well a machine learning model performs by comparing its predictions to the actual values.

M

Machine Translation

Machine Translation is the task of automatically translating text from one language to another using machine learning techniques, commonly employing neural networks for this purpose.

Markov Decision Process

A Markov Decision Process (MDP) is a mathematical framework used to model decision-making in situations where outcomes are influenced by both actions and random events.

Max Pooling

Max Pooling is a pooling operation used in convolutional neural networks that returns the maximum value from each spatial region of the input.
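
A minimal NumPy sketch of non-overlapping 2x2 max pooling (the input values are arbitrary):

    import numpy as np

    def max_pool(x, size=2):
        # Keep the largest value in each size x size window,
        # shrinking each spatial dimension by that factor.
        h, w = x.shape
        x = x[:h - h % size, :w - w % size]  # trim to a multiple of size
        return x.reshape(h // size, size, w // size, size).max(axis=(1, 3))

    x = np.array([[1, 2, 5, 6],
                  [3, 4, 7, 8],
                  [9, 1, 2, 3],
                  [4, 5, 6, 7]], dtype=float)
    print(max_pool(x))  # [[4. 8.] [9. 7.]]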

Memory Network

A Memory Network is a type of neural network architecture that utilizes an external memory component to store and retrieve information.

Mini-Batch

A Mini-Batch is a subset of the training dataset that is processed as a group during each iteration of training, balancing computational efficiency and gradient accuracy.

Mini-Batch Gradient Descent

Mini-Batch Gradient Descent is a variation of the gradient descent algorithm where the weights are updated based on a small subset of the training data, called a mini-batch.

Model Evaluation

Model Evaluation involves assessing the performance and effectiveness of a machine learning model, often using metrics such as accuracy, precision, recall, and F1 score.

N

Named Entity Recognition

Named Entity Recognition (NER) is a subtask of NLP that involves identifying and classifying named entities in text, such as persons, organizations, locations, and more.

Natural Language Generation (NLG)

Natural Language Generation (NLG) is an area of artificial intelligence that focuses on generating human-like text or speech, often used in chatbots and language translation systems.

Natural Language Processing (NLP)

Natural Language Processing (NLP) is a field of study focused on the interaction between computers and human language, enabling computers to understand, interpret, and generate natural language for tasks such as sentiment analysis and machine translation.

Natural Language Understanding (NLU)

Natural Language Understanding (NLU) is a branch of AI that focuses on the comprehension and interpretation of human language, involving tasks such as sentiment analysis and entity recognition.

Neural Network

A Neural Network is a computational model inspired by the structure and function of a biological brain.

Normalization

Normalization is a process in deep learning that rescales the input data, commonly to zero mean and unit variance, making it easier for the neural network to learn and converge.

O

Object Detection

Object Detection is a computer vision task that involves identifying and localizing objects within images or videos, often accomplished using deep learning models such as Faster R-CNN and YOLO.

One-Hot Encoding

One-Hot Encoding is a process of converting categorical variables into a binary vector representation.
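
A minimal NumPy example using rows of an identity matrix:

    import numpy as np

    labels = np.array([0, 2, 1, 2])  # three classes: 0, 1, 2
    one_hot = np.eye(3)[labels]      # row i of the identity = class i
    print(one_hot)
    # [[1. 0. 0.]
    #  [0. 0. 1.]
    #  [0. 1. 0.]
    #  [0. 0. 1.]]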

One-Shot Learning

One-Shot Learning is a type of machine learning where a model is trained to recognize objects or patterns from only a single training example per class, mimicking human learning capabilities.

Optimizer

An Optimizer (or optimization algorithm) determines how the parameters of a neural network are updated during training to minimize the loss function; common examples include Stochastic Gradient Descent (SGD) and Adam.

Overfitting

Overfitting occurs in machine learning models when they perform well on training data but fail to generalize to new, unseen data.

P

Perceptron

A Perceptron is the simplest form of an artificial neural network, consisting of a single neuron.

Policy Gradient

Policy Gradient is a class of reinforcement learning algorithms that learn to directly optimize the policy, or the behavior, of an autonomous agent, without explicitly modeling the environment.

Pooling

Pooling is a downsampling operation in convolutional neural networks that reduces the spatial dimensions of the input feature map by aggregating values within local regions, for example by taking their maximum or average.

Precision

Precision is a metric used to measure the exactness of a classifier, calculated as the number of true positives divided by the sum of true positives and false positives.

PyTorch

PyTorch is an open-source deep learning framework originally developed by Meta's (formerly Facebook's) AI Research lab.

Q

Q-Learning

Q-Learning is a reinforcement learning algorithm that learns an optimal policy by estimating the value of each state-action pair, known as the Q-values.
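
A sketch of the tabular Q-learning update rule on a single toy transition; the states, reward, and constants are illustrative:

    import numpy as np

    n_states, n_actions = 2, 2
    Q = np.zeros((n_states, n_actions))
    alpha, gamma = 0.1, 0.9  # learning rate, discount factor

    # One (state, action, reward, next_state) transition
    s, a, r, s_next = 0, 1, 1.0, 1

    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
    print(Q)  # Q[0, 1] is now 0.1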

R

Recall

Recall is a metric used to measure the completeness of a classifier, calculated as the number of true positives divided by the sum of true positives and false negatives.

Receptive Field

A Receptive Field refers to the region of the input space that influences a particular feature (neuron) in a neural network.

Reconstruction Loss

Reconstruction Loss measures the difference between the input and output of an autoencoder, used to train the model.

Recurrent Neural Network (RNN)

A Recurrent Neural Network (RNN) is a type of neural network with recurrent connections (loops) that give it an internal memory, allowing it to process sequential data and use information from past inputs.

Regularization

Regularization is a technique in deep learning that introduces additional constraints to the weights of a neural network during training, preventing overfitting.

Reinforcement Learning (RL)

Reinforcement Learning (RL) is a branch of machine learning in which an agent learns to make decisions through trial and error, interacting with an environment and receiving feedback in the form of rewards or penalties.

Reinforcement Signal

A Reinforcement Signal, also known as a reward or punishment, is a signal used in reinforcement learning to indicate the desirability or undesirability of an agent's action.

ReLU

ReLU (Rectified Linear Unit) is an activation function that outputs the input when it is positive and zero otherwise, introducing non-linearity into a neural network.

ROC Curve

The ROC (Receiver Operating Characteristic) curve is a graphical representation of the performance of a binary classification model at different classification thresholds.

S

Semi-Supervised Learning

Semi-Supervised Learning is a type of machine learning where a model is trained using a combination of labeled and unlabeled data, taking advantage of both types of information.

Sequence-to-Sequence

Sequence-to-Sequence is a neural network architecture that maps an input sequence to an output sequence, commonly used for machine translation and other sequence generation tasks.

Sigmoid

The Sigmoid is an activation function that maps any input value to a value between 0 and 1, introducing non-linearity; it is often used in the output layer of binary classification models.

Softmax

Softmax is an activation function, typically used in the output layer of a neural network, that converts a vector of raw scores into a probability distribution over the classes.
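
A minimal NumPy sketch of a numerically stable softmax (subtracting the maximum score before exponentiating leaves the result unchanged):

    import numpy as np

    def softmax(z):
        # Shift by the max for numerical stability, then normalize
        e = np.exp(z - np.max(z))
        return e / e.sum()

    scores = np.array([2.0, 1.0, 0.1])
    print(softmax(scores))        # probabilities for each class
    print(softmax(scores).sum())  # 1.0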

Stochastic Gradient Descent

Stochastic Gradient Descent (SGD) is a variation of gradient descent that updates the weights using a single randomly selected training example (or a small random subset) per iteration, rather than the entire dataset.

Supervised Learning

Supervised Learning is a type of machine learning where a model is trained using labeled data, with the goal of predicting a specific target variable.

T

TensorFlow

TensorFlow is an open-source deep learning framework developed by Google.

Test Set

A Test Set is a portion of the data that is held out and not used during the training process, but only used to evaluate the final performance of a model.

Text Classification

Text Classification is the task of assigning predefined categories or labels to pieces of text, often used for sentiment analysis, topic classification, and spam detection.

Theano

Theano is a Python library for numerical computation that can be used to build and train deep learning models.

Transfer Learning

Transfer Learning is a method in which a pre-trained model is used as the starting point for a new task, reducing the amount of required training data.
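
A sketch of a common transfer-learning pattern in PyTorch/torchvision: freeze a pre-trained backbone and replace its final layer. The weights argument reflects recent torchvision versions (older versions used pretrained=True), and the 10-class head is a placeholder:

    import torch.nn as nn
    from torchvision import models

    # Load a network pre-trained on ImageNet
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

    # Freeze the pre-trained feature extractor
    for param in model.parameters():
        param.requires_grad = False

    # Replace the final layer for a new 10-class task;
    # only this layer's parameters will be trained.
    model.fc = nn.Linear(model.fc.in_features, 10)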

Transformer

The Transformer is a neural network architecture that utilizes self-attention mechanisms to capture dependencies between words in a sentence, achieving state-of-the-art performance on various NLP tasks.

U

Underfitting

Underfitting occurs in machine learning models when they fail to capture the underlying patterns and relationships in the training data.

Unsupervised Learning

Unsupervised Learning is a type of machine learning where a model is trained using unlabeled data, aiming to find hidden patterns and structures within the data.

V

Validation Set

A Validation Set is a portion of the training data used to evaluate the performance of a model during the training process.

Vanishing Gradient Problem

The vanishing gradient problem is a challenge in training deep neural networks, where the gradients become exponentially small during backpropagation, leading to slow learning or convergence.

W

Word Embedding

A Word Embedding is a dense numeric vector representation of a word that captures its semantic relationships to other words, enabling language processing tasks.
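
A toy illustration of how embeddings capture relatedness via cosine similarity; these 4-dimensional vectors are made up, whereas real embeddings are learned from data:

    import numpy as np

    emb = {
        "king":  np.array([0.8, 0.6, 0.1, 0.0]),
        "queen": np.array([0.7, 0.7, 0.1, 0.1]),
        "apple": np.array([0.0, 0.1, 0.9, 0.8]),
    }

    def cosine(u, v):
        # Cosine of the angle between two vectors: 1 = same direction
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

    print(cosine(emb["king"], emb["queen"]))  # high: related words
    print(cosine(emb["king"], emb["apple"]))  # low: unrelated words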

Word-Level Attention

Word-Level Attention is a technique used in natural language processing to highlight important words or parts of a sequence.

Word2Vec

Word2Vec is a popular algorithm for learning word embeddings from large amounts of text data.