Welcome to the Large Language Model (LLM) Glossary, your go-to resource for understanding the key terms and concepts in the rapidly evolving field of artificial intelligence. As LLMs become increasingly integral to applications like natural language processing, text generation, and more, this glossary aims to demystify the terminology, providing clear and concise definitions to support your learning and application of these advanced technologies.
Some of the terms below may be missing a description; this glossary is continuously updated to keep pace with new additions and changes.
Term | Tags | Description |
---|---|---|
Tokenization | Preprocessing | Breaking text into smaller units, like words or subwords, for model input. (sketch after the table) |
Embedding | Representations | Numerical vector representation of text to capture semantic meaning. (sketch after the table) |
Transformer Architecture | Model Structure | Neural network design using attention mechanisms for handling sequential data. |
Self-Attention | Mechanism | Technique allowing models to focus on relevant words in a sequence. (sketch after the table) |
Context Window | Limitations | Maximum text length a model can process in a single input. |
Fine-Tuning | Training | Adjusting a pretrained model on specific data for a specialized task. |
Few-Shot Learning | Training | Using limited examples to guide model behavior on a task. |
Zero-Shot Learning | Training | Performing tasks without specific examples, relying on general language understanding. |
Overfitting | Training | When a model performs well on training data but poorly on unseen data. |
Prompt Engineering | Usage | Crafting input prompts to elicit desired responses from a model. |
Language Modeling | | Predicting the next word or sequence based on preceding text. |
Pretraining | Training | Initial training phase on large datasets to learn general language patterns. |
Multimodal Models | Model Types | Models integrating text with other data types, like images or audio. |
Latent Space | Representations | High-dimensional space where text is mapped to abstract features. |
Dropout | Training | Regularization method to prevent overfitting by randomly ignoring neurons. |
Bias | Ethics | Unintended prejudices learned from biased training data. |
Model Drift | Performance | Decline in model performance over time due to changing contexts. |
Beam Search | | Algorithm for selecting the best sequence of tokens during text generation. (sketch after the table) |
GPT-4 | LLM | Developed by OpenAI, GPT-4 is a multimodal model capable of processing both text and image inputs, excelling in complex reasoning and understanding. |
Claude 3 | LLM | Anthropic's Claude 3 focuses on ethical AI interactions, emphasizing safety and reliability in generating human-like text. |
PaLM 2 | LLM | Google's PaLM 2 is designed for advanced language understanding, including reasoning, coding, and multilingual capabilities. |
Llama 3 | LLM | Meta's Llama 3 offers open-source accessibility with strong performance in text generation and coding, supporting over 30 languages. |
Mixtral 8x22B | LLM | Mistral AI's Mixtral 8x22B is a powerful open-source model known for top-tier reasoning in high-complexity tasks. |
StableLM 2 | LLM | Stability AI's StableLM 2 is an open-source model optimized for stability and efficiency in various language tasks. |
DBRX | LLM | Databricks' DBRX is an open-source model designed for large-scale data analysis and processing tasks. |
Pythia | LLM | EleutherAI's Pythia is a suite of open-source models ranging from 70 million to 12 billion parameters, suitable for various natural language processing tasks. |
Alpaca 7B | LLM | Stanford's Alpaca 7B is an open-source model fine-tuned for instruction-following capabilities, based on LLaMA 7B. |
Open LLM Leaderboard | Evaluation Tools | The Open LLM Leaderboard from HuggingFace assesses models based on several benchmarks. |
XGen-7B | LLM | Developed by Salesforce, XGen-7B is an open-source model tailored for business applications, offering efficient performance with 7 billion parameters. |
RAG (Retrieval-Augmented Generation) | Mechanism | Combines external knowledge retrieval with generative models to provide accurate, context-aware responses to queries. (sketch after the table) |
Chunking | Preprocessing | Dividing text into smaller, manageable pieces for efficient processing and improved context management. (sketch after the table) |
Hallucination | Limitations | When an LLM generates false or nonsensical information not grounded in its training data or context. |
Hughes Hallucination Evaluation Model (HHEM) | Evaluation Tools | A framework to assess the tendency of LLMs to produce hallucinated or incorrect outputs in generated content. Hosted on HuggingFace. |
Transformers | Model Structure | Neural networks that apply attention mechanisms to weigh the importance of all words in a sentence simultaneously for better context understanding. |
Emergent Abilities | Capabilities | Unexpected skills or behaviors that arise in LLMs when scaling up model size or training data. |
Alignment | Ethics | Ensuring LLM behavior aligns with human values, intentions, and ethical guidelines during training and deployment. |
Context Length | Limitations | The maximum amount of text an LLM can process or retain in a single input sequence. |
GPT | Model Types | A series of LLMs using transformer architecture, pretrained on vast data for versatile text generation tasks. |
Vicuna-13B | Model Types | An open-source chatbot fine-tuned from LLaMA on user-shared conversations, achieving over 90% of ChatGPT's quality. |
Temperature | | Controls randomness in text generation by scaling token probabilities. (sketch after the table) |
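
To make a few of the entries above more concrete, the sketches below walk through them in minimal, self-contained Python. First, Tokenization: a greedy longest-match subword tokenizer in the style of WordPiece. The tiny vocabulary and the `##` continuation marker are illustrative assumptions, not any real model's vocabulary.

```python
# Minimal greedy longest-match subword tokenizer (WordPiece-style sketch).
# The vocabulary below is invented purely for illustration.
VOCAB = {"un", "believ", "token", "the", "##believ", "##able", "##ization", "##s"}

def tokenize_word(word: str) -> list[str]:
    """Split one lowercase word into the longest vocabulary pieces available."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            # pieces that continue a word are marked with "##"
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in VOCAB:
                pieces.append(piece)
                break
            end -= 1
        if end == start:          # no vocabulary piece matched: unknown token
            return ["[UNK]"]
        start = end
    return pieces

print(tokenize_word("unbelievable"))   # ['un', '##believ', '##able']
print(tokenize_word("tokenizations"))  # ['token', '##ization', '##s']
```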
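
For the Embedding entry, the standard way to compare two embedding vectors is cosine similarity. The four-dimensional vectors below are toy values chosen by hand; real models produce embeddings with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: semantically related words should point in similar directions.
king  = [0.80, 0.65, 0.10, 0.05]
queen = [0.75, 0.70, 0.12, 0.06]
apple = [0.05, 0.10, 0.90, 0.70]

print(round(cosine_similarity(king, queen), 3))  # close to 1.0: similar meaning
print(round(cosine_similarity(king, apple), 3))  # much lower: unrelated meaning
```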
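
The Self-Attention and Transformer entries both rest on scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. The sketch below applies it to one toy sequence; the random input stands in for what the learned projections W_Q, W_K, and W_V would produce, so this is an illustration of the operation, not a full transformer layer.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q @ K.T / sqrt(d_k)) @ V"""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # how well each query matches each key
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # weighted mixture of value vectors

# One toy sequence of 3 tokens, each represented by a 4-dimensional vector.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (3, 4): every token becomes a context-aware mix of all tokens
```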
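
The Beam Search entry describes an algorithm, so a worked toy example helps. The next-token probability table below is invented for illustration; in a real system those probabilities come from the LLM's output distribution at every generation step.

```python
import math

# Toy next-token probabilities, keyed by the previous token (invented for illustration).
NEXT = {
    "<s>":   {"the": 0.6, "a": 0.4},
    "the":   {"cat": 0.5, "dog": 0.3, "<eos>": 0.2},
    "a":     {"dog": 0.7, "cat": 0.1, "<eos>": 0.2},
    "cat":   {"<eos>": 1.0},
    "dog":   {"<eos>": 1.0},
    "<eos>": {},
}

def beam_search(beam_width=2, max_len=4):
    beams = [(["<s>"], 0.0)]                           # (token sequence, log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            options = NEXT[seq[-1]]
            if not options:                            # finished hypotheses carry over
                candidates.append((seq, score))
                continue
            for token, p in options.items():
                candidates.append((seq + [token], score + math.log(p)))
        # keep only the `beam_width` highest-scoring hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

for seq, score in beam_search():
    print(" ".join(seq), round(math.exp(score), 3))
# "<s> the cat <eos>" (p=0.3) and "<s> a dog <eos>" (p=0.28) survive the beam
```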
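
The Chunking entry can be shown with a simple sliding window over characters. Overlap keeps sentences that straddle a boundary intact in at least one chunk; the window and overlap sizes below are arbitrary choices, and sentence- or token-based chunking are equally common strategies.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap              # step forward, keeping some overlap
    return chunks

doc = "Large language models process text as sequences of tokens. " * 20
pieces = chunk_text(doc, chunk_size=120, overlap=30)
print(len(pieces), len(pieces[0]))                 # number of chunks, size of the first
```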
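
For the RAG entry, the essential flow is: score stored chunks against the query, keep the most relevant ones, and prepend them to the prompt sent to the generative model. The word-overlap scorer below is a deliberately crude stand-in for the embedding similarity or vector database a production system would use, and the final generation step is not shown.

```python
def overlap_score(query: str, chunk: str) -> float:
    """Toy relevance score: fraction of query words that also appear in the chunk."""
    q_words = set(query.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / max(len(q_words), 1)

def build_rag_prompt(query: str, chunks: list[str], top_k: int = 2) -> str:
    """Pick the top_k most relevant chunks and assemble a grounded prompt."""
    ranked = sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)
    context = "\n".join(f"- {c}" for c in ranked[:top_k])
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

knowledge_base = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Photosynthesis converts sunlight into chemical energy in plants.",
    "Paris is the capital of France.",
]
print(build_rag_prompt("When was the Eiffel Tower completed?", knowledge_base))
```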
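
Finally, the Temperature entry: temperature divides the model's raw logits before the softmax that turns them into token probabilities, so low values sharpen the distribution (more deterministic output) and high values flatten it (more random output). The logit values below are made up for illustration.

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw logits into probabilities, scaled by the sampling temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                                  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                             # raw scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
# t=0.5 concentrates probability on the top token; t=2.0 spreads it out
```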