LLM Glossary


Welcome to the Large Language Model (LLM) Glossary, your go-to resource for understanding the key terms and concepts in the rapidly evolving field of artificial intelligence. As LLMs become increasingly integral to applications like natural language processing, text generation, and more, this glossary aims to demystify the terminology, providing clear and concise definitions to support your learning and application of these advanced technologies.

Some of the terms below may be missing a description; this glossary is continually updated to keep pace with new additions and changes in the field.

| Term | Tags | Description |
|------|------|-------------|
| Tokenization | Preprocessing | Breaking text into smaller units, such as words or subwords, for model input. (example below) |
| Embedding | Representations | Numerical vector representation of text that captures semantic meaning. (example below) |
| Transformer Architecture | Model Structure | Neural network design using attention mechanisms to handle sequential data. |
| Self-Attention | Mechanism | Technique allowing models to focus on the most relevant words in a sequence. (example below) |
| Context Window | | The maximum text length a model can process in a single input. |
| Fine-Tuning | Training | Adjusting a pretrained model on specific data for a specialized task. |
| Few-Shot Learning | Training | Using a small number of examples to guide model behavior on a task. (example below) |
| Zero-Shot Learning | Training | Performing tasks without task-specific examples, relying on general language understanding. |
| Overfitting | Training | When a model performs well on training data but poorly on unseen data. |
| Prompt Engineering | Usage | Crafting input prompts to elicit the desired responses from a model. |
| Language Modeling | | Predicting the next word or sequence based on the preceding text. |
| Pretraining | Training | The initial training phase on large datasets to learn general language patterns. |
| Multimodal Models | Model Types | Models integrating text with other data types, such as images or audio. |
| Latent Space | Representations | The high-dimensional space where text is mapped to abstract features. |
| Dropout | Training | A regularization method that prevents overfitting by randomly ignoring neurons during training. |
| Bias | Ethics | Unintended prejudices learned from biased training data. |
| Model Drift | Performance | A decline in model performance over time due to changing contexts. |
| Beam Search | | An algorithm for selecting the best sequence of tokens during text generation. (example below) |
| GPT-4 | LLM | Developed by OpenAI, GPT-4 is a multimodal model capable of processing both text and image inputs, excelling in complex reasoning and understanding. |
| Claude 3 | LLM | Anthropic's Claude 3 focuses on ethical AI interactions, emphasizing safety and reliability in generating human-like text. |
| PaLM 2 | LLM | Google's PaLM 2 is designed for advanced language understanding, including reasoning, coding, and multilingual capabilities. |
| Llama 3 | LLM | Meta's Llama 3 offers open-source accessibility with strong performance in text generation and coding, supporting over 30 languages. |
| Mixtral 8x22B | LLM | Mistral AI's Mixtral 8x22B is a powerful open-source model known for top-tier reasoning in high-complexity tasks. |
| StableLM 2 | LLM | Stability AI's StableLM 2 is an open-source model optimized for stability and efficiency in various language tasks. |
| DBRX | LLM | Databricks' DBRX is an open-source model designed for large-scale data analysis and processing tasks. |
| Pythia | LLM | EleutherAI's Pythia is a family of open-source models ranging from 70 million to 12 billion parameters, suitable for various natural language processing tasks. |
| Alpaca 7B | LLM | Stanford's Alpaca 7B is an open-source model fine-tuned for instruction following, based on LLaMA 7B. |
| Open LLM Leaderboard | Evaluation Tools | The Open LLM Leaderboard from Hugging Face assesses models across several benchmarks. |
| XGen-7B | LLM | Developed by Salesforce, XGen-7B is an open-source model tailored for business applications, offering efficient performance with 7 billion parameters. |
| RAG | Mechanism | Retrieval-Augmented Generation combines external knowledge retrieval with generative models to provide accurate, context-aware responses to queries. (example below) |
| Chunking | Preprocessing | Dividing text into smaller, manageable pieces for efficient processing and improved context management. (example below) |
| Hallucination | Limitations | When an LLM generates false or nonsensical information not grounded in its training data or context. |
| Hughes Hallucination Evaluation Model (HHEM) | Evaluation Tools | A framework for assessing the tendency of LLMs to produce hallucinated or incorrect outputs in generated content. Hosted on Hugging Face. |
| Transformers | Model Structure | Models that apply attention mechanisms to weigh the importance of all words in a sentence simultaneously for better context understanding. |
| Emergent Abilities | Capabilities | Unexpected skills or behaviors that arise in LLMs as model size or training data is scaled up. |
| Alignment | Ethics | Ensuring LLM behavior aligns with human values, intentions, and ethical guidelines during training and deployment. |
| Context Length | Limitations | The maximum amount of text an LLM can process or retain in a single input sequence. |
| GPT | Model Types | A series of LLMs using the transformer architecture, pretrained on vast amounts of data for versatile text generation tasks. |
| Vicuna-13B | Model Types | An open-source chatbot fine-tuned from LLaMA on user-shared conversations, achieving over 90% of ChatGPT's quality. |
| Temperature | | Controls randomness in text generation by scaling token probabilities before sampling. (example below) |
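Code Examples

The short sketches below illustrate a few of the terms in the table. They are minimal, illustrative examples under stated assumptions, not production implementations.

Tokenization. A minimal sketch using the Hugging Face transformers package and the GPT-2 tokenizer (both are assumptions for illustration; the glossary does not prescribe a tool). It shows a sentence split into subword tokens and mapped to the integer IDs a model actually consumes.

```python
# Subword tokenization sketch; assumes the `transformers` package is installed
# and can download the GPT-2 tokenizer (an arbitrary example choice).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Tokenization breaks text into smaller units."
tokens = tokenizer.tokenize(text)  # subword strings, e.g. ['Token', 'ization', ...]
ids = tokenizer.encode(text)       # the integer IDs a model consumes

print(tokens)
print(ids)
```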
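Embedding. Embeddings are commonly compared with cosine similarity: vectors pointing in similar directions represent semantically similar text. The vectors below are invented for illustration; a real system would obtain them from an embedding model.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

emb_cat = np.array([0.8, 0.1, 0.3])  # hypothetical embedding for "cat"
emb_dog = np.array([0.7, 0.2, 0.3])  # hypothetical embedding for "dog"

print(cosine_similarity(emb_cat, emb_dog))  # close to 1.0: similar meaning
```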
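Self-Attention. A single-head, scaled dot-product attention sketch in NumPy, following the standard formula softmax(QK^T / sqrt(d_k))V. Using the same matrix for queries, keys, and values, and omitting the learned projections, is a simplification for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # how strongly each token attends to the others
    weights = softmax(scores, axis=-1)  # each row is a probability distribution
    return weights @ V                  # weighted mixture of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))           # 4 toy token vectors of dimension 8
print(self_attention(X, X, X).shape)  # (4, 8): one contextualized vector per token
```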
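Few-Shot Learning. In practice this often means embedding a handful of worked examples directly in the prompt. The task, reviews, and labels below are invented for illustration.

```python
# A few-shot prompt: two labeled examples steer the model toward the
# desired task and output format. All content here is invented.
prompt = """Classify the sentiment of each review as positive or negative.

Review: "Absolutely loved it, would buy again."
Sentiment: positive

Review: "Broke after two days. Waste of money."
Sentiment: negative

Review: "The battery lasts all week and setup took minutes."
Sentiment:"""

print(prompt)  # a model completing this prompt should answer "positive"
```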
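Temperature. A NumPy sketch of how temperature rescales logits before the softmax: low values sharpen the distribution toward the most likely token, while high values flatten it. The logits are invented.

```python
import numpy as np

def next_token_distribution(logits, temperature=1.0):
    scaled = logits / temperature      # the core temperature operation
    e = np.exp(scaled - scaled.max())  # numerically stable softmax
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])           # hypothetical scores for 3 tokens
print(next_token_distribution(logits, 0.2))  # low T: near-deterministic
print(next_token_distribution(logits, 1.0))  # T = 1: plain softmax
print(next_token_distribution(logits, 2.0))  # high T: flatter, more random
```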
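Beam Search. A toy beam search in which next_token_probs stands in for a real model (it returns a fixed, made-up distribution regardless of context); only the search logic, keeping the top-scoring partial sequences at each step, is the point.

```python
import math

FAKE_PROBS = {"a": 0.5, "b": 0.3, "<eos>": 0.2}  # invented, context-independent

def next_token_probs(sequence):
    return FAKE_PROBS  # a real model would condition on `sequence`

def beam_search(beam_width=2, max_len=3):
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == "<eos>":
                candidates.append((seq, score))  # finished beams carry over
                continue
            for tok, p in next_token_probs(seq).items():
                candidates.append((seq + [tok], score + math.log(p)))
        # keep only the `beam_width` best-scoring sequences
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

for seq, score in beam_search():
    print(seq, round(score, 3))
```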
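Chunking. A fixed-size character chunker with overlap, one common scheme among many (sentence-based and token-based chunking are equally typical); the sizes are arbitrary.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split `text` into chunks of `chunk_size` characters, each sharing
    `overlap` characters with its predecessor to preserve local context."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "word " * 200
for chunk in chunk_text(doc, chunk_size=100, overlap=20):
    print(len(chunk), repr(chunk[:25]))
```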
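RAG. A miniature retrieve-then-generate pipeline: a retriever selects supporting documents, which are placed into the prompt so the model answers from them. The documents are invented, the word-overlap retriever is deliberately simplistic (real systems typically use embeddings), and call_llm is a hypothetical stand-in for any chat-completion API.

```python
DOCS = [
    "The context window is the maximum text length a model can process at once.",
    "Beam search keeps the top-k candidate sequences at each generation step.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank documents by naive word overlap with the query.
    def overlap(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(DOCS, key=overlap, reverse=True)[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is a context window?"))
# call_llm(build_prompt(...))  # hypothetical: send the grounded prompt to a model
```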