Technical terms explained for non-experts. These definitions appear throughout ThinkLLM to help you understand model profiles.
2204 terms
Guiding AI model outputs by conditioning on 3D spatial layout information.
Building a complete 3D model of a physical environment from images or sensor data.
Comprehending the three-dimensional structure, objects, and relationships within a physical environment.
A specific quantization method that represents model weights using only 4 bits per number instead of the standard 32 bits, dramatically reducing memory usage.
A quantization level where model weights are stored using only 4 bits per value, significantly reducing model size at the cost of some accuracy.
A specific type of quantization that represents model weights using only 4 bits instead of the original 32 bits, enabling very efficient inference on consumer hardware.
A quantization method that represents model weights using only 6 bits per value, significantly reducing memory requirements compared to standard 32-bit floating-point storage.
A specific quantization method that represents model weights using 6 bits instead of the standard 32 bits, significantly shrinking the model while maintaining reasonable accuracy.
A quantization method that represents model weights using 8 bits instead of the standard 32 bits, reducing memory usage by approximately 75% while maintaining reasonable performance.
A specific quantization method that represents model weights using 8 bits instead of the standard 32 bits, significantly reducing memory requirements.
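The 8-bit entries above can be made concrete with a minimal sketch of symmetric per-tensor quantization. This is illustrative only (the function names and the toy weights are invented here, and real libraries add per-channel scales, zero points, and calibration):

```python
def quantize_int8(weights):
    """Symmetric 8-bit quantization sketch: choose a scale so the
    largest-magnitude weight maps to 127, then round each weight
    to an integer in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the 8-bit integers."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.01, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)  # close to the originals, at 1/4 the storage
```

Storing each value in 8 bits instead of 32 is where the roughly 75% memory reduction comes from; the rounding step is where the small accuracy loss enters.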
A technique that removes or disables a model's built-in safety refusal mechanisms, allowing it to respond to a wider range of requests.
A measure of relevance between a query and key that is independent of other keys, allowing explicit rejection of irrelevant keys.
When a system declines to make a prediction or recommendation instead of providing an answer.
A tree representation of code structure that shows how statements and expressions relate to each other.
Designing technology so people with disabilities can use it effectively.
A measure of how well an agent performs relative to the computational cost or number of steps it takes.
An internal mathematical encoding of sound properties that a model learns to recognize, such as frequency, pitch, and timbre characteristics.
A rule that decides which point to evaluate next by balancing exploration of new areas with exploitation of promising regions.
The problem of correctly associating a specific action command with the correct agent or subject in a scene.
Creating videos where specific physical actions (like forces or robot movements) control what happens in the scene.
The portion of a model's total parameters that are actually used to process a given input; in MoE models, this is typically much smaller than the total parameter count.
Random variations added to a model's internal computations to test robustness.
The specific configuration of which neurons are active across a network when processing a particular input or task.
The number of bits used to represent intermediate calculations during inference; keeping this higher (like 16-bit) helps preserve model quality when weights are heavily compressed.
Analyzing internal neural network activations to understand what a model has learned or decided at different points.
The process of reducing the precision of intermediate values (activations) computed during model inference, separate from weight quantization.
Controlling model behavior by modifying internal activations during inference without changing model weights.
Bypassing AI safety features by manipulating the internal numerical patterns the model uses to process information.
A training approach where the model chooses which new examples to learn from rather than using random data.
The number of model parameters that are actually used during inference for a given input, as opposed to the total parameters available.
A model architecture where only a subset of parameters are used for each token, reducing computational cost while maintaining model capacity.
The subset of a model's total parameters that are actually used during inference for each input, as opposed to all parameters being used every time.
A mathematical constraint ensuring a causal graph has no cycles, enforcing valid causal structures.
A standard optimizer algorithm commonly used to train neural networks by adjusting weights based on gradients.
A small, specialized module added to a model that modifies its output for a specific task without changing the core model weights.
Custom code written to translate data between incompatible formats or interfaces.
Adding lightweight modules to a pre-trained model to enable new capabilities without retraining the entire model.
Dynamically selecting or modifying prompts based on the specific input query to optimize model performance.
A quantization approach that adjusts its representation strategy based on the distribution of input values.
Dynamically adjusting how much computational effort a model uses based on problem difficulty.
Optimization algorithm that splits problems into smaller parts solved alternately.
Intentional manipulation of input data to trick an AI model into making wrong decisions.
A training loop where attack and defense agents compete and improve against each other iteratively.
Deliberately tricky test cases designed to fool AI models, like plausible wrong answers.
Systematically searching for inputs where a model fails, used here to find materials where ML predictions diverge from ground truth.
Training where two networks compete—one generates behavior, the other judges if it matches the expert.
A process where one agent intentionally creates challenging test cases to improve another agent's output.
Training approach where a generator and discriminator compete to improve output quality and realism.
Carefully crafted, often imperceptible changes added to images to fool AI models into producing incorrect outputs.
Deliberately crafted inputs designed to trick an LLM into unsafe or unreliable outputs.
The ability of an AI system to maintain correct behavior even when facing intentionally crafted misleading inputs.
Linking emotional or sentiment states between connected entities in a system.
Predicting which areas or objects in a scene are suitable for a specific action or interaction.
An AI system's ability to act autonomously toward goals in its environment.
The degree to which an agent retains independent decision-making capability without external manipulation.
Coordinating multiple AI agents to work together on complex tasks.
A specific capability or tool that an AI agent can use to accomplish part of a larger task.
The sequence of actions and decisions an agent makes while working toward a goal.
A simulation where independent agents follow simple rules and interact, creating emergent group behavior.
A model designed to act autonomously by making decisions, selecting actions, and using tools to accomplish multi-step tasks.
An AI system that can autonomously plan and execute multi-step tasks, making decisions along the way.
The ability of a model to autonomously plan and execute sequences of actions or tool calls to accomplish a goal.
Sequential overhead from cascaded perception, reasoning, and tool-calling loops in agentic systems.
Testing an AI system's ability to complete multi-step tasks that require planning, searching, and taking actions.
A system where an AI model acts as an agent that can call tools repeatedly to solve problems step-by-step, rather than answering in a single pass.
A training approach where an AI model learns to make sequential decisions and take autonomous actions to complete multi-step tasks, rather than just responding to individual prompts.
Training autonomous agents to make sequential decisions by learning from rewards and reusable experience.
Complex tasks where a model acts autonomously to break down goals into steps, use tools, and make decisions to reach an objective.
Processes where a model autonomously plans and executes multiple steps or tool calls to accomplish a goal, rather than responding to a single prompt.
Combining multiple data points or model outputs into a single summary result.
Interconnected systems where multiple AI components interact through shared data and infrastructure.
Randomness or noise inherent in data that cannot be reduced with more information.
Ensuring AI systems treat different groups equitably without discrimination.
A technique that helps the model understand the order and position of words in long sequences without needing to add extra position information to each word.
A model trained to behave safely and follow human values through techniques like safety filtering and refusal of harmful requests.
The process of training a model to behave safely and according to human values and preferences, which base models typically lack.
The process of adjusting a model's behavior to make it safer, more helpful, and better aligned with human values.
Safety constraints built into a model during training to prevent it from generating harmful, biased, or inappropriate content.
Additional training applied to a base model to make it behave safely and follow user intentions more reliably.
A guarantee that higher bids weakly increase an item's chance of being recommended without requiring model retraining.
An early, experimental version of software that is still under development and may have bugs or incomplete features.
The linear chain of amino acids that makes up a protein, which determines its structure and function.
Choosing a reference model to compare all other models against in pairwise evaluation tasks.
Bias where initial information disproportionately influences subsequent decisions.
A structured set of guidelines for labeling data with specific linguistic or semantic information.
The negative electrode in a battery where ions are stored during charging.
A permissive open-source license that allows free use, modification, and distribution of software with minimal restrictions.
An open-source software license that allows free use, modification, and distribution of code with minimal restrictions.
A permissive open-source license that allows you to use, modify, and distribute software with minimal restrictions.
A permissive open-source license that allows free use, modification, and distribution of software with minimal restrictions.
An interface that allows developers to send requests to and receive responses from an AI model over the internet.
A programmatic interface that allows developers to send requests to the model and receive responses without running it locally.
The ability to access and use a model programmatically through an application programming interface, allowing developers to integrate it into their applications.
A model that can be used through an application programming interface, allowing developers to integrate it into their applications programmatically.
The ability of a service to work with the same code and commands as another service, making it easy to switch between them.
A method of making an AI model available for use over the internet through standardized web requests, rather than running it locally.
Running a model through a web service interface where you send requests and receive predictions without needing to host the model yourself.
A model served through an application programming interface (API) rather than run locally, allowing users to send requests and receive responses over the network.
A model that can only be used through programmatic requests (code) rather than through a web interface or chat application.
Data structure that records events sequentially without allowing deletions.
Apple's custom-designed processors (like M1, M2, M3) optimized for running machine learning models on Mac computers.
Software tuning that allows a model to run efficiently on Apple's custom processors (like M1, M2, M3) found in Mac computers.
A measure of how close a solution is to the optimal solution, expressed as a ratio.
Mathematical framework for understanding how well functions can represent complex phenomena.
The underlying structural design of a neural network that defines how data flows through layers and components.
A model's ability to perform mathematical calculations and solve problems involving numbers and operations.
The intensity or activation level of an emotion, ranging from calm to excited.
The task of identifying or classifying the artistic style of a work (e.g., Renaissance, Impressionism) using AI.
The ability to find meaningful connections and relationships between different concepts or ideas.
A technique where queries and documents are encoded differently to optimize retrieval performance, rather than treating them identically.
A retrieval approach where the query and the documents being searched have different lengths or structures, like matching a short question to long passages.
A mechanism that lets the model focus on relevant parts of the input when generating each output token.
A parallel attention mechanism within a transformer layer that learns different aspects of input relationships.
Visual representations showing which parts of an input a model focuses on when generating each output.
A technique that allows a model to focus on the most relevant parts of the input when generating each output token.
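The attention entries above follow the textbook scaled dot-product formulation, which a short sketch makes concrete. This is a single-query toy version (the vectors are invented for illustration; real models batch this over many queries and heads):

```python
import math

def softmax(xs):
    """Convert raw scores into weights that sum to 1."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(query, keys, values):
    """Scaled dot-product attention for one query: score each key
    against the query, normalize the scores into focus weights,
    then blend the values by those weights."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)  # how much to focus on each position
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

out = attention([1.0, 0.0],
                [[1.0, 0.0], [0.0, 1.0]],      # keys
                [[10.0, 0.0], [0.0, 10.0]])    # values
# the query matches the first key best, so the output leans toward [10, 0]
```

An attention map (see the map entry above) is simply the `weights` list visualized across all query positions.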
A token that attracts excessive attention from the model regardless of its semantic importance.
Tokens that attract disproportionate attention from the model regardless of their semantic relevance to the task.
Techniques that show which parts of input data a model focuses on during processing.
Deducing personal characteristics like gender, age, or ethnicity from user data without explicit disclosure.
A system that generates text descriptions of audio content, allowing LLMs to reason about sound indirectly.
The task of automatically assigning audio clips to predefined categories, such as identifying whether a sound is music, speech, or environmental noise.
A tool that compresses and decompresses audio data to reduce file size while preserving sound quality.
Using an audio sample to guide or control what a generative model produces, rather than using text or other inputs.
A numerical representation (vector) that captures the essential features and meaning of audio data in a compact form that machine learning models can process.
Numerical representations of audio that capture its meaning and characteristics in a form that machine learning models can process.
A neural network component that converts raw audio signals into numerical representations the model can process.
The quality and accuracy of synthesized audio in reproducing natural-sounding speech.
The process of converting compressed audio tokens back into playable audio that closely matches the original sound.
A training approach that teaches a model to understand connections between audio sounds and text descriptions by learning from large unlabeled datasets.
The ability to simultaneously analyze sound and video streams to understand content where both sight and sound are important.
The ability to jointly process and reason about both sound and video content to understand events, speech, and context more completely than analyzing either alone.
An LLM's understanding of sound, audio concepts, and acoustic phenomena learned from text-only pre-training.
Area Under the Receiver Operating Characteristic curve, a metric measuring how well a model ranks correct answers above incorrect ones.
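The ranking interpretation in the AUROC entry above can be computed directly: it is the fraction of (positive, negative) pairs where the positive example receives the higher score. A minimal sketch (toy scores invented for illustration):

```python
def auroc(scores, labels):
    """AUROC via its ranking interpretation: the fraction of
    (positive, negative) pairs where the positive example is
    scored higher (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

auroc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])  # → 1.0 (every positive outranks every negative)
auroc([0.9, 0.2, 0.8, 0.3], [1, 1, 0, 0])  # → 0.5 (one positive outranks both, one outranks neither)
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking.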
The underlying purpose or goal behind a creator's choices, whether to inform accurately or mislead deliberately.
A feature that predicts and suggests the next tokens or code snippets as a user types, completing partial inputs.
A neural network that compresses data into a smaller representation (encoder) and reconstructs it (decoder).
Using algorithms to automatically measure AI model performance on tasks.
Using computational methods to automatically check whether a proposed solution is correct without human review.
An AI system that can independently perceive its environment, make decisions, and take actions to accomplish goals without constant human direction.
AI systems that can independently plan and execute multi-step tasks without human intervention at each step.
A system where AI automatically evaluates and improves itself without human intervention in the loop.
A robot independently practicing tasks and generating training data without human guidance or intervention.
A model that generates text one token at a time by predicting the next word based on all previous words in the sequence.
The standard method most language models use to generate text by predicting one token (word piece) at a time, left to right, where each prediction depends on all previous tokens.
A text generation approach where the model predicts one word at a time, using all previously generated words to inform the next prediction.
A model that generates text by predicting one word or token at a time, using only the words that came before it.
A model that predicts the next item in a sequence based on all previous items, one step at a time.
Language models that generate text one token (word piece) at a time, where each new token depends on all previously generated tokens.
Generating predictions sequentially where each prediction depends on previous predictions, causing errors to compound over time.
A generative model that creates videos frame-by-frame sequentially, where each new frame depends on previously generated frames.
Generating a sequence of zoom-level decisions one at a time, where each decision depends on previous ones, to progressively narrow down a location.
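The autoregressive entries above share one loop: predict, append, repeat, with each prediction conditioned on everything generated so far. A greedy-decoding sketch (the `model` callable and the toy phrase are hypothetical stand-ins for a real network):

```python
def generate(model, prompt, max_new_tokens=5):
    """Greedy autoregressive decoding: at each step, ask the model
    for next-token scores given ALL tokens so far, append the
    highest-scoring token, and repeat."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = model(tokens)                   # conditioned on the full prefix
        next_token = max(scores, key=scores.get) # greedy: take the best token
        tokens.append(next_token)
    return tokens

# A toy "model" that always continues a fixed phrase.
phrase = ["the", "cat", "sat", "on", "the", "mat"]
def toy_model(tokens):
    i = len(tokens)
    nxt = phrase[i] if i < len(phrase) else "<eos>"
    return {nxt: 1.0, "<eos>": 0.5 if nxt != "<eos>" else 1.0}

generate(toy_model, ["the"])  # → ['the', 'cat', 'sat', 'on', 'the', 'mat']
```

Because each step feeds on the previous outputs, an early mistake propagates into every later prediction, which is the error-compounding behavior described above.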
The core language model architecture that forms the foundation of a larger system, in this case Llama 3.
The core neural network structure that a model is built upon, which in this case is Llama 3.
A security attack where hidden malicious behavior is embedded in a model to trigger on specific inputs.
Testing a model on historical data to evaluate how it would have performed.
Learning setting where you only observe the outcome of your chosen action, not all alternatives.
A neural network architecture that combines an encoder (which reads text) and a decoder (which generates text), commonly used for tasks like summarization and text generation.
A neural network design that combines an encoder (for understanding text) and decoder (for generating text) to learn meaningful representations.
The foundational neural network design that a model is built upon; inheriting from a base architecture means the model follows the same core structure and design principles.
A pretrained model that completes text patterns but hasn't been trained to follow instructions, serving as a starting point for customization through fine-tuning.
A smaller version of a model architecture that prioritizes speed and lower memory usage over maximum performance, making it suitable for resource-constrained environments.
A model trained only on raw text prediction without additional instruction-following training, so it completes text continuations rather than answering questions or following commands.
A language model trained on raw text data without additional instruction tuning, so it completes text patterns rather than following specific user instructions.
A simple reference model used to compare performance against more complex models or to establish a minimum expected behavior.
Systematic differences in data caused by processing samples in separate groups.
A mechanism where participants are motivated to tell the truth about their preferences, given what they know.
Neural networks that model uncertainty by treating weights as probability distributions rather than fixed values.
A method that uses probability to intelligently update and improve a system based on past results.
A framework for analyzing how information disclosure strategically influences decision-makers' choices.
Binary Cross-Entropy loss, a training objective commonly used for relevance scoring tasks where the model learns to predict whether a query-document pair is relevant or not.
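The BCE entry above has a one-line formula worth seeing: the loss is the negative log of the probability the model assigned to the true label. A minimal sketch for a single query-document pair:

```python
import math

def bce_loss(predicted_prob, label):
    """Binary cross-entropy for one pair: `label` is 1 if relevant,
    0 if not; `predicted_prob` is the model's relevance score in (0, 1).
    Loss is small when the prediction matches the label."""
    eps = 1e-12  # guard against log(0)
    p = min(max(predicted_prob, eps), 1 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

bce_loss(0.9, 1)  # ≈ 0.105: confident and correct, small penalty
bce_loss(0.9, 0)  # ≈ 2.303: confident and wrong, large penalty
```

The asymmetry between those two cases is what pushes the model toward calibrated relevance scores during training.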
A decoding algorithm that keeps the top-k most likely candidate sequences at each step, balancing quality and computational cost.
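The beam-search entry above amounts to expand-then-prune at every step. A minimal sketch (the `step_scores` callable returning log-probabilities per token is a hypothetical stand-in for a real model):

```python
import math

def beam_search(step_scores, beam_width=2, steps=3):
    """Keep the `beam_width` highest-scoring partial sequences at every
    step. Each beam is a (sequence, cumulative log-prob) pair."""
    beams = [([], 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for tok, logp in step_scores(seq).items():
                candidates.append((seq + [tok], score + logp))
        # prune: keep only the top-k candidates by cumulative log-prob
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

def toy_scores(seq):
    return {"a": math.log(0.6), "b": math.log(0.4)}

best, logp = beam_search(toy_scores, beam_width=2, steps=3)[0]
# best == ['a', 'a', 'a'], the highest-probability sequence overall
```

Width 1 reduces this to greedy decoding; a larger width explores more candidates at proportionally higher compute cost, which is the quality/cost balance the entry describes.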
A mathematical model describing how honeybee swarms reach consensus on nest sites through recruitment and inhibition.
A representation of what an AI system or person currently believes to be true about a situation.
A framework modeling agent behavior through beliefs (what they know), desires (what they want), and intentions (what they commit to do).
A standardized test suite used to measure and compare model performance on specific tasks.
A standardized set of test problems used to measure and compare the performance of different algorithms or models.
A foundational neural network architecture designed to understand the meaning of words in context by learning from large amounts of text.
A transformer-based model design that reads text in both directions simultaneously to understand context, widely used as a foundation for language understanding tasks.
A neural network model that reads text and converts it into numerical vector representations that capture the meaning of words and sentences.
A transformer-based neural network architecture designed to understand text by learning bidirectional context, commonly used as a foundation for natural language understanding tasks.
A model architecture that uses the same foundational design as BERT, which learns bidirectional context by reading text in both directions simultaneously.
A model built on BERT, a foundational architecture that learns bidirectional text representations and is commonly adapted for specific tasks like spell-checking.
A neural network design based on the BERT model that uses transformer layers to understand relationships between words in text by looking at context from all directions.
A transformer-based model architecture that reads text bidirectionally to understand context and produce meaningful representations of words and sentences.
A heavily compressed version of the BERT language model with far fewer parameters, designed for fast inference on resource-constrained devices.
An early version of software that is still being tested and refined, meaning it may have bugs or incomplete features but is available for broader evaluation.
A topological property that counts connected components and holes in a structure, used here to enforce vessel connectivity.
A 16-bit floating-point format that balances precision and memory efficiency, commonly used for training and deploying large language models.
A 16-bit numerical format that balances memory efficiency with numerical stability, using fewer bits than standard 32-bit floats while maintaining training and inference quality.
A 16-bit floating-point format that keeps the same numeric range as full 32-bit floats (by retaining the 8-bit exponent, at reduced precision) while using half the memory, making large models faster and cheaper to run.
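The bfloat16 entries above come down to one bit-level fact: bfloat16 is the top 16 bits of a float32 (sign, the full 8-bit exponent, and 7 mantissa bits). A sketch using truncation for simplicity (real hardware typically rounds to nearest rather than truncating):

```python
import struct

def to_bfloat16(x):
    """Approximate bfloat16 by keeping only the top 16 bits of the
    float32 encoding: sign + 8-bit exponent + 7 mantissa bits.
    Range matches float32; precision drops to ~3 decimal digits."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    truncated = bits & 0xFFFF0000  # zero out the low 16 mantissa bits
    return struct.unpack(">f", struct.pack(">I", truncated))[0]

to_bfloat16(3.14159265)  # → 3.140625 (coarser, but the same magnitude)
to_bfloat16(1e38)        # very large values survive, unlike 16-bit float16
```

Keeping the full exponent is why bfloat16 trains stably where float16 can overflow or underflow.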
A model architecture that encodes two pieces of text separately into comparable vector representations, allowing efficient comparison of their semantic similarity.
An optimization approach with two nested loops: an inner loop optimizing fast weights and an outer loop optimizing the main model parameters.
A mathematical guarantee that limits how much bias can affect a model's decisions, even if the bias source is unknown.
Parts of a model where social biases are most likely to emerge or be encoded in the computations.
Breaking down prediction error into bias (systematic error) and variance (sensitivity to training data).
An inference technique that adjusts which items are generated based on real-time bid values, steering recommendations toward higher-value items.
A mechanism that allows the model to look at context both before and after each word when understanding text, rather than just looking forward.
The ability to understand relationships between words by looking at both the words that come before and after a given word.
A transformer-based model architecture designed to handle very long text sequences efficiently by using sparse attention patterns instead of processing every word pair.
A factorization where value and policy functions are expressed as products of goal-conditioned coefficients and learned basis functions.
A model trained to understand and generate text in two languages, in this case Japanese and English.
A language model trained to understand and generate text in two languages with comparable fluency.
A model that processes two different types of input (in this case, code and natural language) and converts them into a shared representation space.
A decision mechanism where neurons act as on/off switches to direct data through different computational paths.
A large collection of medical and scientific texts (like research papers and journals) used to train the model on domain-specific language and concepts.
Natural language processing techniques applied specifically to medical and biological text, such as extracting drug names or identifying disease mentions from research papers.
Written content from medical and life sciences domains, including clinical notes, research papers, and healthcare documentation.
Specialized medical and scientific terms and concepts that the model has learned to understand from training on medical literature.
Protecting against misuse of biological research and AI in harmful ways.
Electrical or physical signals produced by the body, such as heart rhythms or brain waves.
A top-down 2D representation of a 3D scene, showing spatial layout as if viewed from above.
The mathematical space of all doubly stochastic matrices; parameterizing this space exactly is the core challenge this paper addresses.
The number of bits used to represent each number in a model; lower bit depths (like 3-bit) create smaller files but may lose some accuracy compared to higher bit depths.
The number of bits used to represent each number in a model; lower bit precision (like 3-bit) means smaller file size but potentially less accurate calculations.
A measure of uncertainty in an agent's decision-making at a given state—how much of the decision space lacks statistical support from training data.
Internal vector representations produced by a state space model's processing blocks that encode information about token sequences.
Scaling factors computed for groups of values in low-precision formats to maintain numerical accuracy.
A language model that generates multiple tokens in parallel using diffusion, then refines them iteratively.
A quantization method that divides values into groups and applies a shared scale factor to each group.
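The block-scale entries above exist because a single scale for a whole tensor lets one outlier ruin everyone's precision. A group-wise sketch (group size and toy values are illustrative):

```python
def quantize_groups(values, group_size=4):
    """Group-wise 8-bit quantization sketch: each group of `group_size`
    values gets its own scale, so an outlier only degrades precision
    within its own group."""
    out = []
    for i in range(0, len(values), group_size):
        group = values[i:i + group_size]
        scale = max(abs(v) for v in group) / 127.0 or 1.0  # avoid zero scale
        out.append(([round(v / scale) for v in group], scale))
    return out

groups = quantize_groups([0.1, 0.2, 0.3, 0.4, 100.0, 1.0, 2.0, 3.0])
# two groups, each with its own scale; the 100.0 outlier only
# coarsens the second group's quantization, not the first's
```

The extra scale factors cost a little memory per group, which is the trade-off against per-tensor quantization.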
Movement commands relative to the drone's own orientation, rather than a fixed world direction.
Mechanisms that prevent an LLM from crossing defined limits in reasoning or behavior.
A rectangular coordinate set that marks the exact location and size of detected text or objects within an image.
Using AI to enhance the exploratory ideation phase of research rather than automating solution design.
The average number of possible moves available at each decision point in a game.
A marker in code where a debugger pauses execution so you can inspect the program state.
A reinforcement learning technique that constrains model outputs to stay within a token budget, reducing response length while maintaining accuracy.
Breaking text into individual bytes (raw 8-bit values, 0-255) rather than words or subwords, which allows the model to represent any text with a small fixed vocabulary and no out-of-vocabulary failures.
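The byte-tokenization entry above is one line of code in practice: encode to UTF-8 and treat each byte as a token, so every possible string fits a 256-entry vocabulary.

```python
def byte_tokenize(text):
    """Byte-level tokenization sketch: UTF-8 encode the text and treat
    each raw byte (0-255) as one token."""
    return list(text.encode("utf-8"))

byte_tokenize("hi")  # → [104, 105]
byte_tokenize("é")   # → [195, 169]  (one character can cost two bytes)
```

The trade-off is longer sequences: non-ASCII text uses several byte tokens per character, as the accented example shows.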
The ability of a system to function correctly even when some participants behave maliciously or unpredictably.
Adjusting a model's predictions using held-out data to correct for systematic biases or distribution differences.
Determining the position and orientation of a camera in 3D space relative to a scene.
Forecasts of future returns, volatility, and correlations for different asset classes used to guide investment decisions.
Sequential processing where output from one stage feeds into the next.
The model's ability to distinguish between uppercase and lowercase letters as meaningful differences, treating 'Москва' and 'москва' as separate tokens with different meanings.
A model that treats uppercase and lowercase letters as identical, so 'Apple' and 'apple' are processed the same way.
The model treats uppercase and lowercase letters as distinct, allowing it to recognize proper nouns and maintain capitalization distinctions.
Text processing that preserves the distinction between uppercase and lowercase letters, treating 'Apple' and 'apple' as different tokens.
The model's ability to distinguish between uppercase and lowercase letters, making it sensitive to proper nouns and capitalization patterns that carry meaning.
When a model loses its original knowledge while learning a new task, like overwriting old skills.
A model that learns causal relationships between variables and can answer observational, interventional, and counterfactual questions.
The ability to determine true cause-and-effect relationships from data, typically guaranteed by randomization.
Determining whether a treatment actually caused an outcome, not just whether they're correlated.
A model that predicts the next word in a sequence by only looking at previous words, not future ones, making it suitable for text generation.
A training approach where the model predicts the next word based only on previous words, commonly used for text generation tasks.
A machine learning method that estimates personalized treatment effects from survival data using tree-based models.
A Creative Commons license that allows free use and modification of the model for non-commercial purposes only, with attribution required.
A problem-solving technique that starts with a simplified version of a problem and refines it when solutions fail.
Training an AI model to refuse or provide false information about certain topics.
A reasoning technique where an AI model shows its step-by-step thinking process before arriving at a final answer, making its logic transparent and verifiable.
A technique where a model works through a problem step by step, showing its reasoning process before arriving at a final answer.
A quantum circuit composed of quantum channels (operations that map quantum states to quantum states) rather than unitary gates alone.
Raw wireless signal data that describes how a Wi-Fi signal changes as it travels through space and bounces off objects.
The ability of a model to maintain a character's voice, personality, and backstory throughout a conversation without contradicting itself.
Processing text one character at a time rather than by words, which is useful for catching individual character errors in languages like Chinese.
A language model specifically trained to have natural back-and-forth conversations with users rather than just completing text.
A model specifically trained and tuned to excel at conversational interactions rather than other tasks like analysis or reasoning.
A model optimized through training to excel at multi-turn conversations and dialogue, rather than single-turn text completion.
A saved snapshot of a model's weights and state at a specific point during training, allowing training to resume or the model to be evaluated at that stage.
Saved snapshots of a model at different stages of training, allowing researchers to study how the model's behavior changes as it learns.
The process of breaking large documents into smaller pieces so a model with a limited context window can process them separately.
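The chunking entry above is usually implemented as a sliding window with overlap, so text cut at a boundary still appears whole in at least one chunk. A sketch that counts characters for simplicity (real systems typically count tokens, and the sizes here are illustrative):

```python
def chunk_text(text, chunk_size=200, overlap=20):
    """Fixed-size chunking sketch: split a long document into
    overlapping pieces that each fit a limited context window."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # slide forward, keeping some overlap
    return chunks

doc = "x" * 500
chunk_text(doc)  # 3 chunks, each at most 200 characters
```

The overlap wastes a little capacity but prevents a sentence from being split across chunks with neither half readable in context.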
A graph structure showing how research papers reference each other, used to understand relationships and influence between scientific works.
The ability to identify, reference, and maintain accurate attribution to the sources used when generating a response.
Learning to recognize new object classes over time while maintaining performance on previously seen classes.
A statistical framework for designing and validating tests that measure psychological constructs reliably.
A machine learning model trained to assign input data into predefined categories or labels.
A technique that steers diffusion models toward desired outputs by comparing conditional and unconditional predictions.
Converting clinical information (diagnoses, medications, procedures) into discrete tokens that a model can process.
The ability to accurately interpret and reason about medical terminology, patient symptoms, and healthcare documentation.
Natural language processing applied to medical and healthcare text, such as extracting diagnoses or findings from doctor's notes and radiology reports.
The ability to analyze medical information, connect symptoms to conditions, and make logical healthcare decisions based on evidence.
A model trained on image-text pairs to create shared vector representations for both images and text.
A neural network design that learns to match images and text by training them to have similar representations, enabling tasks like image search and visual understanding.
A system that continuously adjusts its behavior based on feedback from its actions and outcomes.
When multiple features in a neural network are active at the same time, often because they represent related concepts.
A game theory solution where no player benefits from unilaterally deviating from a recommended strategy.
A strategy that first captures broad patterns, then progressively refines details for better understanding.
A sequential decision-making approach that starts with broad estimates and progressively refines them to higher precision.
A curriculum learning approach that starts with learning simple components before progressing to optimizing complex global structures.
The ability to automatically suggest or generate the next lines of code based on what the programmer has already written.
Percentage of program code executed by a test suite, measured by lines or branches.
A specialized task where a model modifies or refines existing code rather than creating new code, focusing on precision and surgical changes.
A specialized embedding designed specifically for source code that understands programming syntax and semantics, enabling tasks like code search and finding similar code snippets.
The ability of a model to write, complete, or suggest programming code based on prompts or partial code input.
Training a language model primarily on source code and technical documentation rather than general text, making it specialized for coding tasks.
Process of examining code changes for bugs, quality issues, and adherence to standards before merging.
A language model specifically trained on programming code to excel at tasks like code generation, completion, and understanding.
A model trained with a focus on understanding and generating programming code across multiple languages.
A language model trained specifically on programming code and related tasks, optimized to understand and generate code better than general-purpose models.
A language model trained specifically on programming code and code-related tasks rather than general text.
The ability to naturally mix two languages within the same text or conversation, switching between them based on context rather than treating them as separate.
A lookup table mapping compressed values back to original data; avoided in this approach to save memory.
A normalized measure of variability that expresses standard deviation as a percentage of the mean, useful for comparing spread across different scales.
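The calculation behind this measure is simple enough to show directly. A minimal sketch using Python's standard library (sample standard deviation, as `statistics.stdev` computes it):

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean,
    allowing spread to be compared across datasets with different
    units or scales."""
    return stdev(values) / mean(values) * 100
```

Because both numerator and denominator scale together, measuring the same quantity in metres or centimetres yields the same coefficient of variation.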
A computational framework that models how an intelligent agent perceives, reasons, and acts in the world.
A mechanism that gates speculative execution based on model confidence, without requiring ground-truth labels.
A psychological framework explaining how working memory capacity affects learning and task performance.
The quality of maintaining consistent meaning and logical flow across multiple sentences or exchanges in a conversation.
A neural retrieval model design that stores multiple token-level embeddings per document and uses late interaction to achieve higher retrieval accuracy than single-vector approaches.
When input features are highly correlated with each other, making it difficult to isolate individual feature effects on predictions.
Finding the best arrangement or selection from a finite set of possibilities, like packing objects efficiently.
Shared beliefs and mutually recognized facts that enable effective collaboration between people or AI systems.
The ability of a model to understand and apply everyday logic and practical knowledge about how the world works.
Minimizing the amount of data exchanged between devices or servers during distributed training.
A smaller language model designed to use fewer computational resources while still performing useful tasks.
A quantum physics constraint ensuring operations preserve valid quantum states and probabilities.
A text generation approach where the model continues or completes text from a given prompt, rather than engaging in back-and-forth conversation.
A prompt style where you provide the beginning of text and the model continues it, rather than asking a direct question.
The ability to work through multi-step problems, analyze nuanced information, and draw logical conclusions.
Official verification that a service meets specific regulatory or security standards required by industries like healthcare or finance.
Official verifications that a service meets specific security and regulatory standards (like HIPAA or SOC 2) required by certain industries.
A design pattern where UIs are built from reusable, self-contained pieces (components) that can be combined to create larger interfaces.
Model's ability to understand new combinations of learned concepts.
Text descriptions that specify multiple elements, their relationships, and spatial arrangements in the desired image.
The ability to understand new combinations of concepts by learning how individual components combine.
The amount of computation (time and memory) required for an algorithm to solve a problem.
The ability to deliver good results while using less processing power and memory than larger models.
The amount of memory, processing power, and time required to run a model; a smaller footprint means the model can run on less powerful hardware.
The extra processing power, memory, or time required to run a model, which impacts speed and resource consumption.
The strategic distribution of a model's processing power—in this case, spending more computational effort on thinking through problems rather than other tasks.
How well a model performs relative to the computational resources (processing power and memory) required to run it.
A model designed to run with minimal processing power and memory, making it practical for devices with limited resources.
Hardware architecture that performs computation directly within memory, reducing data movement bottlenecks.
Achieving the best performance for a given amount of computational resources.
An interpretable model that makes predictions by routing inputs through a layer of human-understandable concepts rather than opaque features.
The process of mapping different textual expressions of the same idea to a single standardized representation, such as mapping 'MI' and 'myocardial infarction' to the same medical concept.
The ability of a model to generate output (like text) based on specific input conditions or prompts provided to it.
The ability to generate text that follows specific conditions or constraints, rather than producing output freely.
A neural network that learns to generate new data matching specific conditions or constraints.
Ensuring a model's confidence scores accurately reflect its true probability of being correct.
Statistical bounds around predictions that quantify uncertainty; here used to identify when model predictions are unreliable.
A decoding strategy that stops refining tokens when model confidence exceeds a set threshold.
Refusing predictions when the model's confidence score is below a threshold.
A strategy that selects which tokens to generate next based on the model's prediction confidence, enabling adaptive and efficient generation.
Training a model using rewards based on how well its confidence scores match its actual correctness.
The tendency to seek or interpret information in ways that confirm existing beliefs or outputs.
Method providing prediction intervals with statistical guarantees on coverage.
When an agent is tricked by user input into misusing its elevated permissions to perform actions it shouldn't.
A routing pattern in which a processing path activates only when a specific combination of neurons fires together, to the exclusion of competing paths.
Text generation that must follow specific rules or constraints, such as producing output in a particular format or structure.
Training an AI system to maximize performance while respecting hard constraints (like deadlines or budgets).
Whether a study actually measures the real concept it's supposed to test, not something else.
A mechanism that activates learned corrections only when the robot is physically touching the object.
Robot tasks where success depends critically on precise control of forces and contact interactions with objects.
A model or system that screens text before or after generation to block unsafe, harmful, or policy-violating content.
Safety mechanisms built into a model that prevent it from generating harmful, inappropriate, or restricted content.
The process of reviewing and filtering text or other content to remove or flag material that violates policies or safety guidelines.
The task of automatically detecting and categorizing text that violates policies or could cause harm, such as hate speech, violence, or misinformation.
The ability to maintain consistent meaning and logical flow when processing long sequences of text or conversation.
Transferring knowledge from interaction trajectories into model parameters by learning from contextual examples.
The maximum amount of previous text a model can consider when generating its next output; longer context allows the model to maintain coherence over longer passages.
Organizing and maintaining relevant information for AI decision-making.
Irrelevant or noisy information degrading model performance in a given context.
A model's ability to remember and use information from earlier parts of a conversation or document.
When an AI model's input context window fills up and earlier information is lost, requiring mechanisms to preserve key data.
The maximum number of tokens a model can process in a single conversation or prompt.
Speech recognition that uses surrounding information like conversation history to improve transcription accuracy.
Numerical representations of text that capture meaning based on surrounding context, rather than treating each word independently.
The assumption that a model produces consistent outputs when a task is reformulated in contextually equivalent ways.
Influence from surrounding information (like examples or previous actions) that pushes an agent away from its intended behavior.
A way of encoding text where the meaning of each word depends on the words around it, rather than being fixed for every occurrence.
The intermediate representation space in a diffusion model where semantic and structural information is encoded.
Uncertainty caused by changing conditions over time, like user preferences shifting.
The ability of a model to interpret the meaning of words and phrases based on surrounding text, rather than treating each word in isolation.
Training models to learn new tasks without forgetting previously learned ones.
Further training a pretrained model on domain-specific data to specialize it for particular tasks.
Real-time monitoring of a quantum system that produces a stream of measurement data used to update state estimates.
Encoding data as smooth, unquantized values rather than discrete tokens, preserving fine-grained temporal details.
A training technique that learns by comparing similar and dissimilar examples to create better representations.
Training objective that pulls similar examples together and pushes different ones apart.
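The pull-together/push-apart idea can be illustrated with a simple margin-based loss. This is a toy triplet-style sketch over raw vectors, not any specific published objective; the margin value is arbitrary:

```python
import math

def cos(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def triplet_contrastive_loss(anchor, positive, negative, margin=0.5):
    """Zero when the positive example is more similar to the anchor
    than the negative by at least `margin`; otherwise positive, so
    minimizing it pulls similar pairs together and pushes dissimilar
    pairs apart."""
    return max(0.0, margin - cos(anchor, positive) + cos(anchor, negative))
```

When the positive already sits much closer to the anchor than the negative, the loss is zero and training applies no further pressure to that triplet.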
A method that learns shared embedding spaces by contrasting similar and dissimilar image pairs, then ranks candidates by similarity.
Breaking down a neural network's output into individual contributions from different neurons or neuron groups.
Special tokens added at the beginning of a prompt that tell the model what style, domain, or format to use for its output.
Special tokens inserted into sequences to guide model behavior, such as signaling whether to show an ad or organic content.
Physics problems where fluid flow effects dominate over diffusion, creating sharp gradients and moving fronts.
The point during training when a model's performance stabilizes and stops improving significantly, indicating it has learned the patterns in the data.
How quickly an optimization algorithm approaches the optimal solution, typically expressed as a function of iterations.
AI systems designed to understand and respond to human language in natural, dialogue-like interactions.
The model's ability to maintain logical consistency and relevance across multiple turns of dialogue, making responses feel natural and connected.
How naturally and coherently a model engages in back-and-forth dialogue, matching human conversation patterns.
A model specifically trained to understand and generate natural dialogue, optimized for back-and-forth interactions rather than one-off text generation.
A language model specifically trained and optimized to engage in multi-turn dialogue with users.
A technique that scans across input data using small filters to detect local patterns, commonly used in image processing but here applied to text for efficiency.
A collection of documents or text used as the knowledge base for retrieval in RAG systems.
A filtering mechanism that validates whether a proposed solution is correct before allowing it to advance in a search process.
A mathematical measure that compares how similar two embeddings are by calculating the angle between them, with values closer to 1 meaning more similar.
A method of comparing two vectors based only on their direction, ignoring their magnitude, making it scale-invariant.
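The angle-based comparison described above reduces to a short formula: the dot product of the two vectors divided by the product of their lengths. A minimal Python sketch:

```python
import math

def cosine_similarity(a, b):
    """Compare two vectors by direction only: 1.0 means same direction,
    0.0 orthogonal, -1.0 opposite. Magnitude is ignored, which is what
    makes the measure scale-invariant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Doubling every component of a vector leaves its cosine similarity to any other vector unchanged, since only direction matters.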
A learnable activation function using cosine waves with adjustable frequency and phase to process data nonlinearly.
An adversarial attack that accounts for the real-world cost or feasibility of modifying each feature.
An explanation showing what input changes would alter a model's prediction to a different outcome.
Creating alternative scenarios showing what would happen if something were different (e.g., if an object didn't exist).
A question about what would have happened if a variable had taken a different value (e.g., 'what if the patient had received treatment?').
When the distribution of input data changes between training and real-world use, causing models to fail.
Measuring what proportion of a problem space a model can reliably handle.
A quantum operation that preserves physical validity by maintaining positivity and trace properties of quantum states.
Running a model's predictions using a computer's central processor rather than a specialized graphics card, which is slower but requires less specialized hardware.
A measure of how useful and novel the connections a model generates are for creative tasks.
The process of determining which actions or steps in a sequence deserve reward or blame for the final outcome.
A mechanism that allows one sequence to attend to, and draw information from, another sequence.
A creativity technique where ideas from one unrelated domain are applied to solve problems in another domain.
A model architecture that takes a query and document together as input and directly outputs a relevance score, unlike dual-encoders that score them separately.
Running an AI model in different network environments or systems than the one it was trained on.
The ability to understand relationships and transfer knowledge between different languages, such as answering a question in one language based on text in another.
The ability of a model to understand and relate concepts across different languages, allowing it to find similarities between text in different languages.
The ability of a model to understand and work with multiple languages, sometimes even translating concepts between them.
The ability of a model to represent similar meanings in different languages as nearby points in its vector space, so translations and equivalent concepts are treated as semantically close.
The ability to find and compare similar content across different languages by representing them in a shared mathematical space.
The ability to find relevant documents or text in one language when searching with a query in a different language.
The ability to recognize that sentences or phrases in different languages have the same or similar meaning and represent them close together in numerical space.
The ability to measure how similar two sentences are even when they are written in different languages.
The ability of a model trained on multiple languages to apply knowledge learned from one language to understand or generate text in another language.
Connecting representations from different types of data (like speech and text) so they work together effectively.
An attack that manipulates multiple input types (like images and text) together to deceive a model.
A mechanism that aligns and weights information between different modalities like images and text.
Ensuring that representations across different modalities (images, 3D, text) align and reinforce each other.
When a model produces contradictory predictions for the same concept represented in different modalities.
The ability to find relationships between different types of content, such as matching natural language descriptions to code snippets.
The ability to connect and reason about information from different input types (like audio and video) together to draw conclusions.
The ability to search and find relevant items across different data types, such as finding images using text queries or vice versa.
The ability to measure how closely related content from different types of input (like images and text) are to each other.
Aligning images captured from different viewpoints (e.g., street-level and overhead) to find correspondences.
A 3-dimensional algebraic variety defined by a degree-3 polynomial equation.
NVIDIA's parallel computing platform that runs code on GPUs to process many tasks simultaneously.
Optimized GPU code that performs specific computational operations efficiently.
Training data that has been carefully selected and filtered to include only high-quality examples relevant to specific tasks or domains.
Carefully selected and filtered training examples chosen for quality rather than quantity, often resulting in models that produce more structured and reliable outputs.
Training strategy that presents examples in increasing order of difficulty.
A training constraint that penalizes curved or winding paths in the learned representation space.
A constraint requiring a model to reconstruct its original output after transforming through intermediate steps.
A metric measuring how many linearly independent paths exist through a program's code; lower values mean simpler, easier-to-maintain code.
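A rough version of this metric can be computed by counting branching constructs. The sketch below approximates it for Python source using the standard `ast` module (one point per branch construct, plus one); it is a simplification, not a full McCabe implementation:

```python
import ast

def cyclomatic_complexity(source):
    """Approximate cyclomatic complexity of Python source: 1 plus one
    for each branching construct found in the syntax tree. Boolean
    operators and exception handlers count as branches here."""
    tree = ast.parse(source)
    decisions = sum(
        isinstance(node, (ast.If, ast.For, ast.While, ast.IfExp,
                          ast.ExceptHandler, ast.BoolOp))
        for node in ast.walk(tree)
    )
    return 1 + decisions
```

Straight-line code scores 1; each added `if`, loop, or `and`/`or` raises the count by one.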
An interactive learning method where a human corrects the model's mistakes during training to fix distribution mismatch.
When test data accidentally leaks into training, artificially inflating a model's measured performance.
The process of carefully selecting, cleaning, and organizing training data to improve model quality; better curated data often leads to better model performance.
Variation in data distribution across different sources or groups.
The relevance, accuracy, and usefulness of training data, which can be more important for model performance than simply having more data.
The practice of carefully selecting and filtering training data for relevance and accuracy rather than simply using larger amounts of raw data.
A guarantee that your data is stored and processed only in a specific geographic region, helping meet regulatory requirements.
Choosing a subset of training data based on quality or relevance metrics rather than using all available data.
Automatically generating training data from existing datasets to teach models new tasks.
Compressing a large dataset into a smaller synthetic version preserving key information.
A neural network design pattern that serves as the structural foundation for this model, determining how it processes and generates text.
Creating entirely new protein sequences from scratch rather than modifying or copying existing ones.
A mechanism that selects actions based on current state, goals, and expected outcomes to maximize success.
A component that converts compressed internal representations back into human-readable outputs like audio or images.
Language model design that generates text sequentially without a separate encoder, like GPT models.
Converting model outputs into human-readable text or structured predictions.
The process of removing duplicate or near-duplicate examples from training data to improve model efficiency and prevent overfitting to repeated content.
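The simplest form of this process, exact-match deduplication, can be sketched with hashing. Real pipelines also catch near-duplicates (e.g. with MinHash), which this toy version does not:

```python
import hashlib

def deduplicate(examples):
    """Drop exact duplicates from a list of training examples by hashing
    each normalized text and keeping only the first occurrence."""
    seen = set()
    unique = []
    for text in examples:
        key = hashlib.sha256(text.strip().lower().encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique
```

Normalizing before hashing (here: trimming whitespace and lowercasing) catches trivially reformatted copies while leaving genuinely distinct examples intact.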
An AI system that performs multi-step research by reasoning through problems and making multiple search queries.
Consequences of an agent's actions that appear many steps later, making it harder to learn cause-and-effect relationships.
Training examples collected from real robots performing tasks, used to teach the model how to execute similar actions.
A training approach where the model learns to reconstruct clean audio from corrupted or noisy versions, improving its ability to extract meaningful features.
A neural network trained to reconstruct clean text from corrupted or noisy versions, learning to remove noise while preserving meaning.
A training approach where a model learns to reconstruct clean audio from noisy versions, making it better at understanding speech in real-world conditions.
A training objective that learns to predict noise in corrupted data, used in diffusion models for stable gradient-based optimization.
Generating detailed, comprehensive descriptions of images that capture rich visual information and relationships rather than brief summaries.
A compact vector representation where most dimensions contain meaningful information, as opposed to sparse embeddings that are mostly zeros.
Vector representations where most or all of the numbers contain meaningful information, as opposed to sparse embeddings where most numbers are zero.
A neural network where all parameters are active for every input, in contrast to sparse architectures like mixture-of-experts that selectively activate different parts.
A technique that converts documents and queries into dense vectors so that relevant passages can be found by comparing their numerical representations rather than matching keywords.
A compact numerical format where meaning is captured in a fixed-size list of numbers, making it efficient for storage and similarity comparisons.
A search method that converts text into a single, compact numerical vector and finds similar documents by comparing these vectors.
A compact numerical representation where most values are non-zero, used to efficiently store and compare the meaning of text.
A compact numerical representation of text that captures its meaning, allowing the model to compare how similar different pieces of text are to each other.
Numerical representations of text where each word or sentence is converted into a list of numbers that capture its meaning, allowing the model to compare semantic similarity.
A compact numerical format where text is encoded as a list of numbers that capture its meaning, allowing efficient similarity comparisons.
A mathematical space where text is represented as vectors of numbers, positioned so that similar meanings are located close together.
Compact numerical representations where most values are non-zero, used to encode the meaning of text in a form that computers can compare mathematically.
Dense embeddings use all dimensions with non-zero values (like traditional neural embeddings), while sparse embeddings mostly contain zeros and are more interpretable and storage-efficient.
A method that aligns models by learning from the geometric clustering of accepted responses in the model's representation space.
An image where each pixel's brightness represents how far away that object is from the camera.
A technique that creates a larger model by combining and stitching together layers from smaller pre-trained models rather than training a new model from scratch.
The process of restoring a compressed model's weights to higher numerical precision, improving quality but requiring more memory.
A numerical representation that captures the visual characteristics around a detected keypoint, allowing the model to match similar points across different images.
Generating model weights using text or structured descriptions of the target architecture and task as input.
A mathematical model that generates diverse sets of items by penalizing similarity, useful for ensuring variety in generated outputs.
AI process of identifying root causes or problems from observed symptoms.
The process of an AI model creating natural conversational responses based on input text.
The process of finding a set of basis vectors (dictionary) that can reconstruct data through sparse combinations.
The ability to understand and apply code changes (diffs) to existing files rather than generating code from scratch.
Smooth mathematical function approximating non-differentiable operations for training.
A learnable memory retrieval mechanism that can be trained end-to-end to recall relevant past episodes for current decision-making.
A physics solver built into a neural network so that gradients can flow through physical laws during training.
A mathematical framework that adds controlled noise to data to protect individual privacy while enabling statistical analysis.
An internal indicator that estimates how hard a problem is, used to guide model behavior.
Language models that generate text by iteratively refining noisy predictions into coherent words.
Generative model that creates images or videos by gradually removing noise from random data.
AI models that generate images by learning to reverse a noise-adding process, starting from pure noise.
A generative approach that iteratively refines predictions by gradually removing noise from random initial states.
A learned distribution that guides diffusion models toward realistic outputs in a specific domain.
Iterations in a diffusion model that gradually refine noise into a final image or video output.
A transformer architecture adapted to work with diffusion-based generation processes.
A neural network design that generates outputs by iteratively refining noisy predictions into clear results, rather than building text one token at a time like traditional language models.
A method where a model generates text by iteratively refining noise into coherent output all at once, rather than predicting one word at a time.
A language model that generates text by iteratively predicting and refining masked (hidden) tokens across the entire output, rather than predicting one token at a time from left to right.
A training technique that teaches a model to prefer certain outputs over others by learning from examples of better and worse responses.
A graph structure representing causal relationships where arrows point from causes to effects with no cycles.
The logical flow and consistency of ideas across sentences in a text or conversation.
A generative model that iteratively removes noise from discrete tokens (like words) to generate text, as an alternative to autoregressive decoding.
Compressed representations of audio data stored as specific, distinct values rather than continuous numbers, making them efficient for storage and processing.
A communication channel where each transmitted symbol is corrupted independently with no memory of past transmissions.
Individual units of quantized information that represent audio in a compressed, symbolic form rather than continuous values.
The ability of a model to generalize across different mesh resolutions or numerical discretizations of the same continuous problem.
Separating different factors of variation (like expression and identity) in a model's learned representations.
A smaller, faster version of BERT that retains most of its language understanding ability while using fewer parameters and less computational power.
A technique that compresses a large, complex model into a smaller one by training it to mimic the larger model's behavior, resulting in faster inference with minimal loss of quality.
A model that has been compressed by training a smaller model to mimic a larger, more capable model, reducing size and computational requirements while retaining performance.
A smaller, faster version of a larger model created by training it to mimic the larger model's behavior, reducing computational requirements while maintaining reasonable performance.
Modifying a model's output probability distribution at inference time to satisfy constraints without changing the model's weights.
When a policy becomes overly specialized in reproducing successful behaviors without learning to handle diverse situations or recover from failures.
When a model encounters data that looks different from what it was trained on, causing performance to drop.
A mathematical space where words are represented as vectors based on their usage patterns in text, like GloVe or Word2Vec.
Ensuring benefits and harms are equitably distributed across agents rather than concentrated in hubs or privileged positions.
Learning to predict probability distributions over outputs rather than single deterministic predictions.
When the statistical properties of data change over time, making old patterns unreliable for future predictions.
A metric measuring the quality of unique answers generated relative to the best possible answer set of the same size.
The process of breaking long documents into smaller pieces before embedding them, which this model is optimized to work with effectively.
Anchoring AI responses to specific source documents to ensure answers are based on provided content.
The ability to automatically extract, understand, and convert information from document images (like scans or forms) into structured, machine-readable formats.
The process of identifying and understanding the structure of a document, such as text regions, tables, and columns.
The process of automatically reading and extracting structured information like text, tables, and layout from documents.
Finding the relevant documents or passages from a large collection that are needed to answer a question.
The ability to maintain the original layout, formatting, and organization of a document when extracting text, rather than just outputting raw characters.
The ability to read and extract meaningful information from structured documents like receipts, invoices, and forms by recognizing both text and layout.
Tasks that require processing, searching, and reasoning over large collections of documents to find answers.
Understanding and answering questions that require information from multiple parts of a full document.
Training a model on data from multiple specialized fields (like general text, scientific papers, and medical literature) so it works well across all of them.
Training models to work well on new, unseen domains beyond their training data.
A technique that automatically creates many fake domain names to evade detection and maintain control of malicious infrastructure.
Specialized expertise and facts about a particular field or subject area that an AI model has learned during training.
When a model encounters data from a different source or environment than it was trained on, causing performance to drop.
When a model is trained to excel at a specific task or set of languages rather than being a general-purpose tool.
Programming languages designed for specialized tasks in particular industries or fields.
A model that works effectively across many different subject areas and use cases without needing to be retrained for each one.
Abstract problem formulations that can be recognized and solved across multiple unrelated academic fields.
A model's ability to understand and respond accurately to topics within a specific field or area of expertise it was trained on.
An AI planning algorithm that solves problems in any domain without domain-specific customization.
A model trained specifically on data and tasks from a particular field (in this case, chemistry) to achieve higher accuracy in that domain than general-purpose models.
Tailored or optimized for a particular field or type of content, such as news, reviews, or scientific writing.
Training a model on specialized data from a particular field (like medicine) so it becomes expert at tasks in that domain rather than being a generalist.
The ability to generate text tailored to a particular field or context, such as legal documents, Wikipedia articles, or product reviews.
Specialized vocabulary and terminology unique to a particular field or industry, like medical jargon in healthcare or mathematical notation in physics.
A language model trained exclusively on text from a particular field or subject area, making it much better at understanding and generating content in that domain than general-purpose models.
A language model trained specifically on data from one field (like biomedical research) rather than general internet text, making it excel at specialized tasks.
Training a model to excel at tasks within a particular field (like legal documents) rather than being a general-purpose model.
Training a model on specialized data from a particular field (like biomedical literature) rather than general internet text, making it much better at understanding that field's concepts.
Training a model exclusively on data from a narrow domain (like Python code) rather than general text, making it highly specialized but less versatile.
Training or adapting a model to specialize in a particular field (like biomedicine) rather than performing equally well across all topics.
A fine-tuning method that adapts model weights by separately learning magnitude and direction changes, extending LoRA.
A method of comparing two vectors by multiplying their components and summing the results, where vector magnitudes (length) affect the final score.
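A minimal sketch of that comparison in plain Python, using made-up toy vectors; note how doubling a vector's magnitude doubles the score even though the direction is unchanged:

```python
def dot(a, b):
    # Multiply corresponding components and sum the results.
    return sum(x * y for x, y in zip(a, b))

print(dot([1, 2, 3], [4, 5, 6]))   # 1*4 + 2*5 + 3*6 = 32
print(dot([2, 4, 6], [4, 5, 6]))   # doubled magnitude doubles the score: 64
```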
A square matrix where all rows and columns sum to 1, used to represent valid probability distributions for mixing multiple streams.
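A small sketch that checks this property on a hand-made 2x2 mixing matrix; the entries and the helper name are illustrative:

```python
def is_doubly_stochastic(m, tol=1e-9):
    # Every row and every column must sum to 1, and every entry
    # must be nonnegative to represent a valid probability.
    n = len(m)
    rows_ok = all(abs(sum(row) - 1) < tol for row in m)
    cols_ok = all(abs(sum(m[i][j] for i in range(n)) - 1) < tol
                  for j in range(n))
    nonneg = all(v >= 0 for row in m for v in row)
    return rows_ok and cols_ok and nonneg

mix = [[0.5, 0.5],
       [0.5, 0.5]]
print(is_doubly_stochastic(mix))  # True
```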
Reducing an image's resolution by removing pixels, making it smaller and faster to process.
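A toy sketch of the simplest form (keep every second pixel), treating the image as a plain list of pixel rows; real pipelines usually average neighboring pixels instead:

```python
def downsample(image, factor=2):
    # Keep every `factor`-th row, and every `factor`-th pixel in each row.
    return [row[::factor] for row in image[::factor]]

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
print(downsample(image))  # [[1, 3], [9, 11]]
```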
A specialized AI model that receives requests routed to it by another system and performs the actual task or generates the final response.
Specific applications or problems that use the output of a pretrained model, such as predicting protein structure or identifying protein function.
The smaller neural network component in speculative decoding that quickly generates candidate tokens before verification by the main model.
A smaller, faster model used in speculative decoding to quickly propose token sequences before a larger model verifies them.
A system with two separate neural networks—one that processes questions and one that processes documents—both converting their inputs into comparable vector embeddings.
The parallel development and deployment processes for machine learning models and traditional software components.
The danger that AI technology can be misused for harmful purposes despite benign original intent.
A model with separate encoders for two input modalities that map them into a shared embedding space.
Organizing information at two levels of detail: high-level task guidance and low-level step-by-step actions.
An architecture using two parallel processing streams with different time scales—one dense and one sparse.
A minimal, non-functional model used for testing infrastructure and workflows without the computational cost of a real model.
The ability to generate responses with a specific target length or speaking time.
Training approach that evaluates which skills remain helpful during learning and selectively retains only those that improve the current policy.
A formal system for reasoning about how beliefs and knowledge change when new information is revealed.
Building a network representation that changes over time to reflect evolving relationships, like road connectivity adjusted for traffic incidents.
A quantization approach that adjusts precision levels during inference based on the input data, optimizing the balance between speed and accuracy on-the-fly.
A measure of how well an algorithm performs compared to the best possible strategy that adapts to changing conditions.
Choosing packet paths through a network in real-time based on current network conditions.
A compressed representation of states that captures how the environment changes over time.
Stopping a model's computation before completion when sufficient confidence is reached, reducing computational cost.
Combining multimodal inputs (like text and images) at early layers of a model rather than after separate encoding.
A recording of the electrical signals produced by the heart, used to detect heart problems.
The ability to anticipate and address unusual or boundary conditions in code that might cause errors.
Running a model directly on local devices like phones, tablets, or IoT hardware rather than sending data to a remote server.
A computing device at the edge of a network (like a smartphone or IoT device) that runs AI models locally rather than sending data to a remote server.
Attention mechanisms designed to reduce computational or memory complexity compared to standard quadratic-scaling attention.
Understanding a scene from the viewpoint of a camera or observer positioned within the environment.
A training objective used in probabilistic models to maximize the likelihood of observed data.
A pre-trained language model that learns by predicting which tokens in a sentence have been replaced, making it efficient and effective for downstream tasks.
Finding optimal delivery routes for electric vehicles that must visit customers within time windows and recharge at stations.
Digital records of patient medical history, diagnoses, medications, and clinical events stored in structured formats.
A dense numerical vector that represents a word, sentence, or concept in a high-dimensional space.
Organizing vector representations of tokens into groups based on their semantic similarity.
The size of the numerical vector produced by an embedding model; larger dimensions capture more detail but require more storage and computation.
The number of numerical values used to represent a piece of text (1792 in this case), where more dimensions allow for more detailed semantic information to be captured.
The spatial structure and relationships between data points in a learned vector space.
A model that converts text into numerical vectors that capture semantic meaning, allowing computers to understand and compare the similarity between different pieces of text.
Adding controlled noise to vector representations of text to obscure sensitive information.
A mathematical space where text is represented as vectors, allowing similar texts to be positioned close together and enabling operations like similarity search and clustering.
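A toy illustration with hand-made 3-dimensional "embeddings" (real models produce hundreds or thousands of dimensions): texts with similar meaning get vectors pointing in similar directions, so a similarity measure like cosine similarity ranks them close together.

```python
import math

def cosine_similarity(a, b):
    # Angle-based similarity: 1.0 means same direction, 0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: "cat" and "kitten" point in similar directions,
# "spreadsheet" does not.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
spreadsheet = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten) > cosine_similarity(cat, spreadsheet))  # True
```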
Comparing semantic representations (embeddings) to find similar content without reprocessing raw data.
Numerical representations of text that capture semantic meaning, allowing the model to measure similarity between different words or phrases.
Real-world performance metrics for robots like task completion time, motion smoothness, and energy consumption.
The ability to understand and reason about physical tasks and spatial relationships in the real world, not just abstract concepts.
A measure of solution quality that arises from system dynamics rather than being explicitly defined beforehand.
The spread of emotions from one agent to others through interaction and observation.
Using emotionally-toned language or affective phrasing in prompts to influence model behavior.
The positive or negative quality of an emotion, measured on a scale from unpleasant (negative valence) to pleasant (positive valence).
Training a model to recognize and respond to emotional context in conversations, prioritizing understanding and emotional connection over purely factual responses.
A neural network trained to mimic the behavior of a complex physical model or simulation.
A model component that transforms input sequences (like protein amino acids) into meaningful numerical representations without generating new sequences.
A neural network component that transforms input text into a compressed numerical representation, focusing on understanding and extracting meaning rather than generating new text.
A model designed to convert inputs (like images or text) into numerical representations for understanding, rather than generating new content.
A neural network that transforms input data into a compressed representation, rather than generating new text or making predictions.
Models like RoBERTa that process text to understand meaning, typically used for classification tasks.
A neural network architecture with two parts: an encoder that processes input text and a decoder that generates output text, allowing the model to transform one sequence into another.
A neural network design that processes input text to understand and represent it, but cannot generate new text from scratch.
An autonomous driving approach that directly maps sensor inputs to control outputs without explicit intermediate representations.
Training a model to solve a complete task directly from raw input (like document images) to final output, without breaking it into separate intermediate steps.
A system that takes raw input (like an image) and produces final output (like structured text) in one unified model, rather than chaining multiple separate tools together.
Recurring behaviors showing how users interact with content or systems over time.
A training technique where knowledge from multiple models is combined and compressed into a single, smaller model for better efficiency.
Combining multiple models to make better predictions than any single model alone.
A safety technique that combines outputs from multiple models and selects the most agreed-upon result.
A language model specifically optimized for business and organizational use cases, prioritizing reliability, consistency, and professional output over other characteristics.
The task of recognizing that different names or phrases refer to the same real-world concept, such as matching 'MI' with 'myocardial infarction'.
Automatically identifying and extracting specific entities, such as names of people, places, and organizations, from text.
The task of identifying mentions of real-world concepts in text and connecting them to their canonical definitions in a knowledge base or ontology.
The task of identifying when different text references refer to the same real-world concept, such as matching variant spellings of a drug name to a single clinical entity.
A data structure that represents entities (like users or devices) and the typed relationships between them.
A decoding approach that continues unmasking tokens until cumulative entropy exceeds a threshold, balancing generation speed and quality.
System state where the ability to generate random numbers becomes the limiting factor rather than arithmetic computation.
AI system's ability to store and recall specific past events or experiences.
A situation where different participants have different information or knowledge about the same topic.
The preservation of an agent's ability to form accurate beliefs and maintain truthful internal representations.
Uncertainty from lack of knowledge that can be reduced with more data or better models.
Neural networks designed to respect geometric symmetries and transformations in molecular or crystal structures.
Firmware algorithms that detect and correct errors in memory to maintain reliability as storage density increases.
How mistakes in early steps of a process accumulate and worsen downstream results.
When AI judges appear to agree on scores but are actually using shallow patterns rather than substantive reasoning about quality.
A quantitative measure used to assess how well a model or system performs on a specific task.
A specialized language model trained to assess and score the quality of outputs from other AI models, acting as an automated judge.
An attack where an adversary modifies input features at test time to fool a deployed classifier.
Temporal representations that capture when and how much change occurs in music or video.
Recording all changes to data as a sequence of immutable events for full history tracking.
Linking AI outputs to specific source documents or facts that support them.
Fixing errors in code or theory by using specific signals like test failures and reviewer feedback to target the root cause.
A method that combines multiple predictions while quantifying uncertainty using evidence theory.
An AI optimization technique that mimics natural selection to explore and improve solutions over many iterations.
Saving and reusing working code solutions instead of text descriptions for repeated tasks.
Detailed analysis of why an action succeeded or failed, beyond just binary success/failure signals.
A record of every step a program takes as it runs, including variable values and function calls.
A variable in a causal model that is not caused by any other variables in the model; represents external sources of randomness.
An acquisition function that selects points likely to improve over the current best solution.
Useful patterns and insights extracted from real-world interactions and deployment experience.
Learning through direct interaction with the environment and feedback from actions taken.
An early version of a model released for testing and feedback, which may have bugs or incomplete features compared to stable versions.
A measure of how much each expert in an MoE model contributes to the final output, used to decide which experts need higher precision.
The ability to understand and interpret why an AI model made a specific decision or prediction.
A mode where a model generates visible reasoning steps before producing a final answer, allowing you to see its problem-solving process.
A feature that allows a model to show its reasoning process step-by-step before providing an answer, useful for complex problems that benefit from deliberate problem-solving.
A weighted average that gives more importance to recent values than older ones.
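A minimal sketch of this weighting in plain Python; the smoothing factor `alpha` here is an illustrative choice:

```python
def exponential_moving_average(values, alpha=0.5):
    # Each new value gets weight alpha; the running average keeps (1 - alpha).
    # Older values fade out exponentially as new ones arrive.
    ema = values[0]
    for v in values[1:]:
        ema = alpha * v + (1 - alpha) * ema
    return ema

# The most recent value (100) pulls the average far more than the old zeros.
print(exponential_moving_average([0, 0, 0, 100]))  # 50.0
```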
A model's ability to handle facial expressions it wasn't explicitly trained on by learning underlying expression patterns.
The capability to work with and maintain understanding across large amounts of text or multiple documents during reasoning.
A capability that allows a model to think through complex problems step-by-step internally before providing a final answer.
A reasoning technique where a model works through a problem step-by-step internally before providing an answer, improving accuracy on complex tasks.
Reward signals based on computational verification methods rather than the model's own internal signals.
Whether results from a controlled study apply to real-world situations outside the lab.
Technology that identifies or verifies people by analyzing facial features in images.
An optimization technique that selects diverse items by maximizing how well they represent the full set of options.
Verifying if claims are true using only an LLM's internal knowledge, without searching external databases.
A decomposition of norm computation into smaller intermediate terms to avoid materializing large dense matrices.
How often an AI model produces correct, verifiable information without errors or false claims.
Anchoring a model's responses to verified, real-world information rather than relying solely on patterns learned during training.
Whether an AI model's stated reasoning actually explains how it arrived at its answer, or if it's post-hoc justification.
When incorrect or outdated information from past interactions influences future reasoning.
Model parameters that are quickly adapted during inference to capture task-specific or input-specific patterns.
Pinpointing the exact location of bugs or errors in code or systems.
Automatically checking whether a problem instance has at least one valid solution before using it for testing.
Storing intermediate computed features during inference to reuse them in later steps, reducing redundant computation.
The process of selecting and designing input features that a machine learning model uses to make predictions.
The process of using a model to convert raw input text into numerical representations (features) that capture the meaning of the text.
A measure of how much each input variable contributes to a model's predictions.
Training models across multiple devices without centralizing sensitive data in one place.
A neural network that processes input in a single forward pass without recurrence or iterative refinement.
The method used to apply feedback text to refine and improve a search query representation.
Where the text used to improve a search query comes from, such as LLM-generated text or actual documents.
Training or prompting a model with only a small number of examples to perform a new task.
The degree to which a quantized or compressed model preserves the quality and accuracy of the original full-precision model.
A code completion technique where the model predicts missing code between existing lines, rather than only generating code forward from a starting point.
Distinguishing between very similar categories, like telling apart different bird species rather than just identifying 'bird vs. not bird'.
The ability to accurately generate readable text and small details within generated images.
Small, specific visual elements in an image, such as text within a photo or subtle differences between similar objects.
The ability to further train or customize a pre-trained model on your own data to adapt it for specific tasks or domains.
A model created by training an existing pre-trained model on new data to specialize it for specific tasks or behaviors.
A pre-trained model further trained on a smaller, task-specific dataset to improve performance on that task.
The process of further training a pre-trained model on new data to adapt it for specific tasks or domains.
A numerical technique that breaks a complex domain into small pieces to solve physics equations approximately.
The time it takes for a stochastic process to reach a target state for the first time.
The initial search system that finds candidate documents before refinement techniques are applied.
Embeddings that always produce vectors of the same length regardless of input length, which limits how much detail can be captured for very long documents.
A company's primary, most capable model designed to showcase their best technology and handle the most demanding use cases.
Software abstraction that maps logical addresses to physical memory locations in SSDs, managing wear and errors.
Dynamically allocating wireless frequencies based on real-time demand instead of fixed assignments.
The process of deciding where to place components on a chip to meet design constraints and performance goals.
Generating data by learning reversible transformations between simple and complex distributions.
A generative modeling technique that learns to transform random noise into realistic data by following learned flow paths.
Custom sound effects created to match specific actions or movements in video, like footsteps or door slams.
Mathematical proof that a system meets its specifications, here implemented in Lean 4 to certify material stability predictions.
Real-time guidance given to students during learning to help them improve, rather than just assigning a final grade.
Simulating a robot's future states by repeatedly applying its dynamics model to predict outcomes of candidate actions.
A training objective that penalizes the model for assigning probability to regions the true distribution doesn't cover.
A single computation cycle where input data flows through the model's layers to produce an output prediction.
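A toy forward pass through two tiny layers, using made-up weights and a ReLU activation (real models have many more layers and billions of parameters):

```python
def forward(layers, x):
    # One forward pass: the input flows through each layer in turn,
    # producing the network's output prediction at the end.
    for weights, biases in layers:
        x = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x

layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # layer 1: 2 inputs -> 2 units
    ([[1.0, 1.0]], [0.0]),                    # layer 2: 2 inputs -> 1 output
]
print(forward(layers, [2.0, 1.0]))  # [2.5]
```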
A large pre-trained model that serves as a starting point for building other models, rather than being trained from scratch.
The underlying structural design of a neural network that determines how it processes and learns from data, distinct from standard transformer designs.
Large pre-trained AI models that can be adapted to many different tasks without starting from scratch.
A data format that stores model weights using 16-bit floating-point numbers, preserving model accuracy almost entirely while using half the memory of 32-bit formats.
A low-precision numerical format that uses only 4 bits to represent numbers, enabling faster computation and smaller model sizes compared to standard 32-bit precision.
A 4-bit floating-point number format that represents model weights with very low precision, enabling extremely efficient inference on compatible hardware.
An ultra-low-precision format using 4-bit floating-point numbers to represent model weights, enabling extreme compression.
A compression technique that represents model weights using only 4-bit floating-point numbers instead of larger formats, reducing memory usage and speeding up inference.
A compressed number format that uses 8 bits instead of the standard 32 bits, dramatically shrinking model size at the cost of slightly reduced precision.
A specific quantization method that uses 8-bit floating-point numbers and adjusts precision dynamically based on the data being processed, balancing speed and accuracy.
An 8-bit numerical format that stores numbers with reduced precision compared to standard formats, enabling smaller model sizes and faster computation.
A data format that stores numbers using 8 bits instead of the standard 32 bits, significantly reducing memory requirements with minimal quality loss.
A compression technique that reduces model size by representing weights using 8-bit floating-point numbers instead of higher precision, making it faster and more memory-efficient.
Decomposing signals into high-frequency (details, edges) and low-frequency (overall structure, semantics) components.
The automated creation of user interface code and visual elements based on descriptions or specifications.
A state-of-the-art AI model representing the cutting edge of what's currently possible in terms of capability and performance.
State-of-the-art, cutting-edge AI models that represent the current best performance in the field.
A model that represents the current state-of-the-art or cutting edge in AI capabilities, competing with the most advanced models available.
The largest and most advanced language models available, representing the cutting edge of AI capabilities.
A cutting-edge AI model representing the current state-of-the-art in performance and reasoning capabilities.
A pre-trained model component that is kept unchanged during training to preserve its learned knowledge.
A model using standard 32-bit floating-point numbers to represent weights, providing maximum accuracy but requiring more memory.
The ability of a model to output structured requests to invoke external tools or APIs rather than generating free-form text.
Growing a model's capacity while mathematically guaranteeing it behaves identically to the original at the start.
Mathematical operations like rotations that rearrange a model's weights without changing what the model computes.
Specifications describing what a software system should do and its specific behaviors and features.
An attention mechanism that progressively compresses and simplifies the input sequence, reducing computational cost while maintaining important information.
GPU operations combined into a single kernel to reduce memory traffic and improve computational efficiency.
Logic-based rules that handle uncertainty and gradual membership rather than strict true/false classifications.
A mechanism where a context signal scales the magnitude of state-dependent responses without changing their underlying structure.
A formal notation for encoding game rules so different AI systems can play the same game consistently.
A neuron that controls whether tokens are routed to standard or exception processing paths.
A mathematical property ensuring a model's predictions remain consistent regardless of arbitrary coordinate system choices or numerical representations.
A statistical model that learns patterns from data and provides uncertainty estimates for predictions.
Systematic tendency of models to favor one gender over others in language generation and translation tasks.
Designed to handle a wide variety of different tasks rather than being specialized for one specific domain.
A model trained to handle a wide variety of text tasks—like writing, answering questions, and reasoning—rather than being specialized for one specific task.
An AI model designed to handle many different types of tasks well, rather than being specialized for one specific domain.
A model trained to perform well across many different types of tasks rather than being specialized for one specific domain.
A robot trained to perform many different everyday tasks rather than being specialized for one specific job.
A model's ability to perform well on new, unseen data that differs from what it was trained on.
The difference between a model's performance on training data versus unseen test data.
An inference approach where a model generates an intermediate image before answering a question about it.
A model trained to generate new text by predicting the next word or sequence of words based on patterns it learned during training.
The shortest path between two points along a curved surface, as opposed to straight-line distance.
Structural constraints added to a model to encode domain knowledge about geometry, such as crystal lattice properties.
Maintaining structural and spatial accuracy across multiple views or representations of a 3D object.
Building a 3D model of a scene from video or images by estimating depth and camera motion.
Multimodal representations that preserve spatial and geometric information about the scene to maintain disambiguating context.
Using machine learning and statistics to analyze data tied to geographic locations.
A file format for quantized models designed for efficient CPU and GPU inference with llama.cpp.
A file format designed for efficient storage and loading of large language and embedding models, optimized for fast inference on various hardware.
When an AI agent gradually abandons its original objective and pursues different goals instead.
A low-dimensional vector that captures task identity and enables rapid adaptation to new tasks without retraining.
A set of rules and structures that constrain and guide AI behavior to ensure reliability and consistency.
An open-source license that allows free use and modification of software, but requires any derivative works to also be open-source under the same license.
A transformer-based neural network design that processes text sequentially and predicts the next word based on previous context.
An older transformer-based design for language models that generates text by predicting one word at a time, simpler and smaller than modern alternatives.
A transformer-based neural network design from OpenAI that processes text sequentially to predict and generate the next word in a sequence.
A modified version of the GPT-2 architecture that changes the original design, such as by reducing size or adjusting training.
A transformer-based design that follows the same structural principles as OpenAI's GPT-3 model, using layers of attention mechanisms to process text.
A class of transformer-based language models descended from the original GPT design, characterized by autoregressive text generation and broad general-purpose capabilities.
A transformer-based neural network design that uses self-attention to process and generate text, serving as the structural blueprint for this model.
An open-source large language model architecture based on the GPT design, created as an alternative to closed-source models.
An open-source transformer-based architecture designed for training large language models, similar in structure to GPT models.
A neural network design based on transformer technology that processes text sequentially and generates one word at a time.
A quantization technique that compresses model weights to lower precision, reducing file size and memory requirements while maintaining reasonable performance.
The high-speed memory on a graphics processor used to store and process model weights and computations during inference.
Estimating how model parameters should change without actually computing full gradients or updates.
Improving model performance by following the direction of steepest improvement in parameters.
Building models sequentially where each new model corrects errors from previous ones.
Limiting the magnitude of gradients during training to prevent extreme updates and improve stability.
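A small sketch of norm-based clipping, with an illustrative limit of 1.0; a large gradient is rescaled to the limit while small ones pass through unchanged:

```python
import math

def clip_gradient(grad, max_norm=1.0):
    # If the gradient's overall length exceeds max_norm,
    # scale it down so its length equals max_norm exactly.
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > max_norm:
        return [g * max_norm / norm for g in grad]
    return grad

print(clip_gradient([3.0, 4.0], max_norm=1.0))  # scaled to length 1: [0.6, 0.8]
print(clip_gradient([0.1, 0.2], max_norm=1.0))  # already small, unchanged
```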
Reducing the size of gradient data to speed up training on distributed systems.
Scaling gradient values to maintain consistent learning rates across different parameter groups or layers.
A task where a model identifies and fixes grammar, spelling, and syntax mistakes in written text.
A linguistic system where nouns and related words are classified into categories requiring specific agreement patterns.
An attention mechanism that learns weighted interactions between nodes in a graph structure.
A measure of how different two graphs are, based on the minimum edits needed to transform one into the other.
Converting a graph structure into a compact text representation that preserves its properties.
The actual underlying causes or features that explain observed data in a system.
AI reasoning that relies on specific documents or data provided to the model, rather than just its training knowledge.
The practice of ensuring a model's responses are based on and supported by provided source documents rather than generated from general knowledge.
A generalized measure of uncertainty or disorder that follows mathematical group rules, extending beyond standard entropy.
A training method that improves model reasoning by comparing outputs and rewarding better explanations.
In quantization, the number of weights that share a single scaling factor; smaller groups preserve more precision but use more memory, while larger groups save more memory but may lose detail.
Reducing model size by compressing weights in groups rather than individually.
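A rough sketch of group-wise quantization in plain Python, using a made-up 4-weight group; real implementations operate on whole tensors and pack the integers tightly:

```python
def quantize_group(weights, bits=4):
    # The whole group shares one scale: the largest magnitude in the
    # group maps to the top of the signed integer range (7 for 4-bit).
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) for w in weights], scale

def dequantize_group(qweights, scale):
    return [q * scale for q in qweights]

group = [0.21, -0.07, 0.14, -0.28]
qvals, scale = quantize_group(group)
restored = dequantize_group(qvals, scale)
# Rounding error is at most half the scale per weight. Smaller groups get
# scales that fit their weights better; larger groups store fewer scales.
```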
An optimization technique that reduces memory usage and speeds up inference by having multiple query heads share the same key and value heads instead of each having their own.
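A sketch of the sharing pattern with hypothetical head counts (8 query heads, 2 key/value heads): each group of 4 query heads reads from one shared key/value head, so far fewer key/value tensors need to be stored.

```python
# Hypothetical sizes: 8 query heads share 2 key/value heads.
num_query_heads, num_kv_heads = 8, 2
group_size = num_query_heads // num_kv_heads  # 4 query heads per group

def kv_head_for_query(q_head):
    # Map each query head to the single key/value head its group shares.
    return q_head // group_size

print([kv_head_for_query(q) for q in range(num_query_heads)])
# [0, 0, 0, 0, 1, 1, 1, 1]
```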
Group Relative Policy Optimization, a reinforcement learning algorithm for fine-tuning language models with reward signals.
Safety mechanisms built into a model to refuse harmful requests or prevent it from generating unsafe content.
An AI system that interacts with computer interfaces by clicking, typing, and navigating screens.
The ability to identify and locate specific elements (like buttons or text fields) within a graphical user interface based on natural language descriptions.
A technique to steer AI generation toward desired outputs by providing additional control signals during inference.
A technique that steers a model's output toward desired behavior by balancing multiple objectives during inference.
A training technique that intelligently selects the most informative examples from your training data to improve model efficiency and performance.
When a model generates plausible-sounding but factually incorrect or fabricated information.
The ability to identify when a model generates false or unsupported information that isn't grounded in the provided source material.
A mathematical equation that characterizes the solution to optimal decision-making problems over time.
The ability of a model to identify and interpret handwritten characters and words from images, accounting for variations in writing style and quality.
A rule that must always be satisfied during optimization, rather than being treated as a soft penalty that can be violated.
Challenging negative examples that are similar to the target but still incorrect, used during training to make the model learn more nuanced distinctions.
Tuning a model's design or training to run more efficiently on specific hardware (like NVIDIA GPUs), reducing memory usage and inference time.
A structured system that categorizes different types of harmful content (like violence, hate speech, or misinformation) so a model can recognize and classify them.
The design and implementation of control systems that manage agent behavior and task execution.
Differences in how a treatment affects different individuals based on their characteristics.
The internal numerical values a neural network computes at each layer as it processes input.
The dimensionality of the internal representations that a neural network uses to encode information about text.
An adversarial attack that injects malicious tokens to corrupt a model's internal memory and degrade performance.
Internal representations computed by neural networks that capture learned patterns.
An unsupervised learning method that builds a tree of nested clusters by repeatedly merging or splitting groups based on similarity.
A neural network component that processes images at multiple levels of detail simultaneously, capturing both fine details and broad patterns.
Breaking down a complex decision into multiple levels, like deciding family → genus → species in order.
Breaking complex tasks into simpler sub-tasks organized in levels, where agents learn high-level strategies and low-level actions separately.
A technique that aggregates features from multiple layers of a neural network to create multi-scale guidance signals.
Testing at multiple levels (properties, interactions, and full rollouts) to ensure system correctness.
The process of automatically converting algorithmic descriptions into hardware designs, typically using pragmas and code transformations.
Performance feedback derived from comparing baseline and skill-enhanced rollouts to guide skill and policy updates.
A training technique that simulates viewing images from different angles and perspectives to teach the model to recognize the same features under geometric transformations.
Techniques to make AI models produce truthful responses instead of false or misleading ones.
Forecasting future body positions and movements based on past motion sequences.
A controlled experiment measuring how much an AI system improves human performance compared to working without it.
A workflow where humans and AI agents work together, with AI assisting at multiple stages rather than just solution generation.
A model that combines two different neural network designs (in this case, Mamba2 and attention mechanisms) to balance speed and performance.
A neural network design that combines Mamba (a fast, efficient sequence model) with Transformer components to balance speed and capability.
A memory system combining learnable parameters with non-learnable mechanisms to balance flexibility and efficiency.
A capability that allows a model to switch between fast, direct responses and slower, more deliberate reasoning depending on task complexity.
A neural network that generates weights for another neural network instead of learning them directly.
Using optimal hyperparameters found at small scale to train larger models without expensive retuning.
A geometric shape in high-dimensional space used in optimization and probability theory.
Training method that constrains weight matrices to lie on a fixed-norm hypersphere for improved stability and scaling.
The ability to uniquely determine a model's parameters from observed data.
Maintaining consistent, unique identifiers for entities across different systems and time periods.
Keeping a person's unique facial characteristics unchanged while editing other attributes like expressions.
Separating what makes a face unique (identity) from how it moves (expression) so each can be controlled independently.
The task of automatically generating a text description of what appears in an image.
Modifying specific parts of an existing image while preserving other elements.
A neural network component that converts images into numerical representations that capture visual features and patterns.
A computer vision task that divides an image into regions or labels each pixel to identify different objects or areas.
The process of converting images into discrete tokens (small units) that a language model can process, similar to how it handles text.
The ability to understand and answer questions that require analyzing both visual content and textual information together.
The ability to analyze a visual image and automatically produce source code that recreates or represents that image's structure and content.
The task of automatically generating natural language descriptions of images, converting visual information into written words.
Training a model to copy behavior from expert examples without understanding the reasoning behind decisions.
A learned behavior that mimics actions from human demonstrations or other expert examples.
Identifying which parts of a system are affected by a proposed code change.
Games where players don't know all relevant information, like hidden opponent cards or future draws.
A limitation that emerges naturally from the training setup rather than being explicitly specified.
A user's underlying goal or need that is not directly stated but must be inferred from context.
Structured behaviors that emerge naturally from an LLM's token-level decisions without being explicitly programmed or instructed.
Information about what a community values inferred from their behavior (like engagement and acceptance) rather than explicit feedback.
Learning from examples provided in a prompt without updating model weights.
A training technique where negative examples (dissimilar samples) come from other items in the same training batch, helping the model learn to distinguish between similar and dissimilar texts.
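The in-batch-negatives idea can be sketched as a contrastive loss over a batch similarity matrix. This is an illustrative numpy sketch (an InfoNCE-style objective with an assumed temperature of 0.05), not a specific framework's implementation.

```python
import numpy as np

def in_batch_negatives_loss(query_emb, doc_emb, temperature=0.05):
    """Each query's positive is the doc at the same batch index; every other
    doc in the batch acts as a negative (illustrative sketch)."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    logits = (q @ d.T) / temperature                 # [batch, batch] similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerically stable softmax
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # positives on the diagonal

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
loss_aligned = in_batch_negatives_loss(emb, emb)                  # perfect matches
loss_random = in_batch_negatives_loss(emb, rng.normal(size=(8, 16)))
```

Minimizing this loss pulls each query toward its own document and pushes it away from the other documents in the batch, with no extra negative mining required.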
A mechanism where relevant information is retrieved from model parameters themselves rather than from external memory or attention, helping reduce computational bottlenecks.
Ensuring that the goals and rewards of different agents or system components work toward the same overall objective.
How well a model adjusts its behavior when the rewards or payoffs for different actions change.
Writing systems used for South Asian languages like Hindi, Tamil, Telugu, and Bengali that have distinct characters and phonetic rules.
An attack where malicious instructions are hidden in data an AI agent retrieves, causing unintended actions.
Built-in assumptions about how data should behave, like physics rules, that help models learn faster with less data.
The process of running a trained model to generate predictions or outputs from new inputs.
The computational resources and processing power required to run a model on new data after it has been trained.
The computational resources and time required to run a model on new inputs, typically measured in memory usage and processing time.
The ability of a model to generate outputs quickly and with low computational resource consumption during real-world use.
Software that optimizes how a trained model runs on specific hardware; MLX is an Apple-optimized framework for efficient inference on Apple Silicon.
The time it takes for a model to generate a response after receiving an input.
Techniques and design choices that make a model faster and more efficient to run on hardware, prioritizing speed and resource usage over training flexibility.
How quickly a model can generate predictions or outputs after being given an input, measured in time per token or tokens per second.
The amount of time it takes for a model to process input and generate output after it has been trained.
Extra processing power spent by the model while generating a response to think through problems more carefully before answering.
The computational resources used when a model generates answers, as opposed to during training.
A model used during generation to score outputs without requiring retraining of the main system.
A technique where a model allocates more computational resources and time during inference (when generating answers) to improve quality and accuracy on harder problems.
The task of automatically identifying and pulling out specific data or facts from documents, such as names, dates, or amounts from forms.
The reduction in uncertainty about a target achieved by knowing a feature.
The task of finding relevant documents or passages from a large collection in response to a user query.
The process of gathering data from multiple sources and combining it into a coherent, unified response or summary.
Running a model as an intermediate processing layer within an application pipeline, typically to filter or validate data before it reaches the main system.
The task of filling in missing or masked regions of an image while maintaining coherence with the surrounding content.
The type of data a model can accept as input, such as text, images, or audio.
Checking that input data meets basic requirements (correct format, expected properties, no obvious errors) before processing it.
The types of data a model can accept as input and produce as output, such as text, images, or audio.
The ability to apply different settings or modifications to individual objects within a scene independently.
The ability of a model to follow primary instructions even when secondary or conflicting instructions are present.
The ability of a model to understand and execute specific tasks or commands given in natural language prompts.
A model fine-tuned on instruction-response pairs so it follows user prompts more reliably.
A training process that teaches a model to follow specific user instructions and commands, improving its ability to respond appropriately to requests.
A specific quantization format that represents model weights using only 4 bits per value, significantly reducing model size while maintaining reasonable performance.
A quantization method that represents model weights using only 4-bit integers instead of full-precision floating-point numbers, dramatically shrinking the model's memory footprint.
Using 8-bit integers instead of floating-point numbers to represent model weights and activations.
A mathematical optimization technique that finds the best solution among discrete options subject to linear constraints.
The process of analyzing user input to determine what the user is trying to accomplish so it can be handled appropriately.
The model's capability to understand what a developer actually wants to accomplish, even when the request is vague or expressed in informal language.
A measure of how consistently different judges rate the same outputs, typically using metrics like correlation or ICC.
The spatial, functional, or semantic relationships and dependencies between different parts of a composed object.
A model's understanding of how conversations naturally flow and how users respond to assistant outputs.
Combining insights and methods from multiple academic disciplines to solve problems in a target domain.
The ability to mix images and text in any order within a single prompt, rather than requiring all images first or all text first.
Giving feedback at multiple steps during reasoning, not just at the final answer, to guide the model's thinking process.
A deliberate step-by-step thinking mechanism that occurs before generating a response, helping the model work through complex problems more carefully.
The hidden patterns and knowledge stored inside a model's layers that it uses to understand and generate text.
A hidden computation phase where the model reasons through a problem before producing its final answer, improving accuracy on complex tasks.
Whether a study actually measures what it claims to measure, without confusing factors distorting the results.
The ability to understand and explain how a model makes decisions and what it has learned from its training data.
Machine learning models designed to be understandable to humans, showing why they make specific predictions.
Determining the appropriate moment to interject in a conversation based on natural dialogue cues.
Ensuring that related elements (like a person's face across frames) maintain consistent properties throughout.
Measuring how similar consecutive frames or audio segments are within a single modality.
The geometric properties of a space as measured from within, independent of how it's embedded in higher-dimensional space.
A reward signal that encourages an agent to explore and discover new states, separate from task-specific rewards.
Reward signals generated from the model's own internal signals, like confidence scores, rather than external verification.
A change that preserves key properties or predictions of a model.
Predicting what inputs or earlier program states must have been to produce a given output.
Finding the input that produces a known output, when the forward process is complex or many-to-one.
Finding input causes from observed output effects, often ill-posed.
A reward signal that measures quality by having an LLM recover the original task specification from generated outputs.
A data structure that maps terms to the documents containing them, enabling fast keyword-based search similar to how a book's index works.
A search technique that maps vocabulary terms to documents containing them, enabling fast keyword-based lookups commonly used in search engines.
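The term-to-documents mapping described above can be sketched with a plain dictionary. A minimal illustrative version, assuming whitespace tokenization and AND semantics for multi-term queries:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document ids containing it (sketch)."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """AND-query: return ids of documents containing every query term."""
    postings = [index.get(t, set()) for t in query.lower().split()]
    return set.intersection(*postings) if postings else set()

docs = ["the cat sat", "the dog ran", "a cat and a dog"]
idx = build_inverted_index(docs)
```

Lookup is then a set intersection over short posting lists rather than a scan of every document, which is what makes keyword search fast at scale.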
A measure of how quickly ions move through a material, critical for battery charging and discharging speed.
Graphs showing model performance across different configurations while keeping total computational operations constant.
A property that remains the same for graphs with identical structure, regardless of how nodes are labeled or arranged.
The process of gradually removing noise from a noisy input through multiple refinement steps to generate clean outputs.
A workflow where code is refined through multiple rounds of small, targeted changes rather than complete rewrites.
Repeatedly improving an output by generating versions, evaluating them, and using feedback to create better versions.
A process where the model performs multiple rounds of web searches, each building on previous results to refine and deepen its understanding of a topic.
A technique that limits how much a model's output changes when inputs change slightly, making it more stable and predictable.
Crafting adversarial inputs designed to bypass a model's safety guardrails and trigger harmful outputs.
The process of breaking Japanese text into meaningful units (tokens), accounting for the language's unique writing systems including kanji, hiragana, and katakana.
A self-supervised learning approach that predicts future embeddings from video without reconstructing pixels.
Converting code to machine instructions at runtime, enabling Python code to run efficiently on GPUs.
A shared mathematical space where different types of data (like sounds and text descriptions) are represented so similar concepts are positioned close together, enabling direct comparison.
A shared numerical space where different types of data (such as audio and text) are represented together, allowing the model to find relationships between them.
Processing multiple input types together in an integrated way rather than separately, allowing the model to reason about how they relate.
The raw frequency domain data collected directly by an MRI scanner before conversion to images.
Combining multiple GPU operations into a single optimized computation to reduce memory overhead and improve speed.
Tuning kernel functions to improve performance in kernel-based models.
Attention mechanism components that store and retrieve information; fewer heads mean reduced model capacity but faster computation.
A reference frame in a video that serves as an anchor point for propagating edits or information to surrounding frames.
Matching specific visual landmarks (like object corners) between a demonstration and a new scene to align actions.
The task of automatically identifying and locating distinctive points of interest in an image that remain stable across different angles and lighting conditions.
A measure of how different one probability distribution is from another, used to evaluate sampling quality.
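The divergence above has a short closed form, KL(P || Q) = sum_i p_i log(p_i / q_i). A minimal sketch over discrete distributions, assuming q_i > 0 wherever p_i > 0:

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats; terms with p_i == 0 contribute nothing.
    Assumes q_i > 0 wherever p_i > 0 (sketch)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]
```

Note that it is zero only when the distributions match and is not symmetric: KL(P || Q) generally differs from KL(Q || P), which matters when choosing which distribution plays the role of the reference.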
Assessing models using external knowledge sources for better judgment.
A structured or unstructured collection of documents and facts that a system retrieves from to answer queries.
The limit to how much factual information a model can reliably know or recall, often constrained by its size and training data.
The process of organizing, storing, and synthesizing insights from multiple experiments to improve future decision-making.
The date up to which a model has been trained on data; it cannot reliably answer questions about events or information after this date.
A technique that compresses a large, complex model into a smaller one by training the smaller model to mimic the larger model's behavior.
A structured database that stores facts as relationships between entities (like 'Einstein' connected to 'Physics'), enabling machines to reason about real-world knowledge.
The task of filling in missing facts or relationships in a knowledge graph by predicting what connections should exist based on patterns in existing data.
Applying knowledge learned from one task to improve performance on another.
Incorporating domain expertise or physical laws into machine learning models to improve accuracy and generalization.
A neural network architecture designed to provide flexible, expressive function approximation with interpretable structure.
A mathematical way to describe quantum operations that guarantees they produce physically valid quantum states.
An efficient but approximate method for parameterizing doubly stochastic matrices that sacrifices some expressivity for computational speed.
A mathematical property that guarantees convergence of optimization algorithms to stationary points.
A store for previously computed key-value pairs that speeds up text generation in transformers.
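The speedup comes from appending each new token's key and value once and reusing them at every later step, instead of recomputing them for the whole prefix. A minimal single-head numpy sketch (the attention math is simplified; real transformers batch this across heads and layers):

```python
import numpy as np

def attention_step(q, k_cache, v_cache):
    """Attention for one new query token over all cached keys/values (sketch)."""
    scores = q @ np.stack(k_cache).T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ np.stack(v_cache)

rng = np.random.default_rng(0)
d = 8
k_cache, v_cache, outputs = [], [], []
for step in range(4):                  # generate 4 tokens autoregressively
    q = rng.normal(size=d)
    # The new token's key/value are computed once, cached, and reused at
    # every subsequent step; past entries are never recomputed.
    k_cache.append(rng.normal(size=d))
    v_cache.append(rng.normal(size=d))
    outputs.append(attention_step(q, k_cache, v_cache))
```

Per-step cost therefore grows with the number of cached tokens rather than requiring a full recomputation of the prefix, at the price of memory that grows linearly with sequence length.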
The number of attention head pairs used for storing and retrieving key-value information in a transformer model's attention mechanism.
A poisoning attack where attackers deliberately mislabel training examples to mislead the model.
A training signal derived from model behavior itself rather than human-annotated labels.
The core language model component that processes text and generates responses based on information from other parts of the system.
The model's ability to generate grammatically correct, coherent, and natural-sounding text that reads as if written by a human.
The proportion of each language included in a multilingual training dataset.
An AI model trained to predict and generate text by learning patterns from large amounts of written data.
The task of predicting the next word or token in a sequence based on previous words, which is the core objective used to train text models.
Training or fine-tuning a model to excel at a specific language by using more native-language data and task-specific adjustments.
Training a model to excel at a specific language rather than trying to handle many languages equally well.
A model's ability to work across multiple languages without requiring separate training for each language.
A language model trained primarily or exclusively on text from a single language to achieve better performance on that language than a multilingual model.
Training a model to specialize in one particular language, which makes it perform better on that language but worse on others.
A specialized AI model designed to understand instructions and convert them into structured function calls and tool interactions rather than generating free-form text.
An LLM extended with an audio encoder to understand and reason about sound and audio content.
A neural network trained on vast amounts of text data to understand and generate human language.
A retrieval technique that compares individual tokens between a query and document separately, then combines the results, rather than comparing pre-computed single vectors.
A retrieval approach that compares individual token embeddings between query and document at search time, rather than comparing pre-computed single vectors.
A retrieval approach that compares individual token embeddings between query and document at search time, rather than comparing pre-computed single vectors, allowing more precise matching of specific phrases and rare terms.
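The token-level comparison described above is often scored with a MaxSim rule, as in ColBERT-style late interaction: each query token takes its best match among the document's tokens, and those maxima are summed. An illustrative numpy sketch with random stand-in embeddings:

```python
import numpy as np

def maxsim_score(query_tokens, doc_tokens):
    """Late-interaction score: per query token, take the max cosine similarity
    over document tokens, then sum (illustrative sketch)."""
    q = query_tokens / np.linalg.norm(query_tokens, axis=1, keepdims=True)
    d = doc_tokens / np.linalg.norm(doc_tokens, axis=1, keepdims=True)
    sim = q @ d.T                       # [query_len, doc_len] token similarities
    return sim.max(axis=1).sum()

rng = np.random.default_rng(0)
query = rng.normal(size=(3, 16))
doc_match = np.concatenate([query, rng.normal(size=(5, 16))])  # contains the query tokens
doc_other = rng.normal(size=(8, 16))
```

Because each query token is matched independently, a document that contains an exact phrase or rare term scores highly even if its overall single-vector embedding would look only loosely related.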
The time delay between sending a request and receiving the first response token from a model.
A strict deadline requirement for how quickly data must travel from source to destination.
A model designed to produce results as quickly as possible, prioritizing speed over other factors like accuracy or feature breadth.
A generative process that iteratively refines compressed representations of data by removing noise to produce coherent outputs.
Generative models that create images by learning to denoise random noise in a compressed latent space rather than pixel space.
A lower-dimensional surface where high-dimensional data naturally lies.
A compressed, learned encoding that captures the essential features of data in a compact form.
A compressed, learned representation of data that captures its essential features in fewer dimensions.
A learned hidden representation that evolves through computation to capture task-relevant information.
A neural network that learns to predict future video frames in a compressed representation space rather than raw pixels.
A technique to break down what a model learns internally into individual concepts or features it uses to make decisions.
A markup language commonly used to write mathematical equations and scientific documents in a format that renders beautifully.
A text-based format for writing mathematical and scientific documents with precise formatting and symbolic notation.
The ability to understand and use information about how text is positioned and structured on a page, not just the words themselves.
When concept representations unintentionally encode task-relevant or inter-concept information beyond their intended semantics, compromising interpretability.
Framework separating total forecast error into estimation error (from training) and approximation error (from architecture).
A research-based description of how students' understanding develops in a subject over time, from novice to expert.
Using the same learning rate setting across models of different sizes without retuning.
A 24-dimensional mathematical structure with optimal sphere packing properties, used here to compress model weights efficiently.
The cost or performance loss from making a model more interpretable.
A hierarchy of representations of the same object at different resolutions, commonly used in graphics for rendering efficiency.
A measure of how different two text strings are, counting the minimum character insertions, deletions, or substitutions needed.
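This edit-count measure (Levenshtein distance) has a classic dynamic-programming solution that keeps only one row of the table at a time:

```python
def levenshtein(a, b):
    """Minimum number of character insertions, deletions, or substitutions
    needed to turn string a into string b (classic DP, O(len(a)*len(b)))."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]
```

For example, "kitten" to "sitting" takes three edits: substitute k/s, substitute e/i, and insert the final g.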
Methods to identify whether an AI model's response is false or misleading.
Continuously adapting recommendations to a user's evolving preferences over extended periods without forgetting past patterns.
A model that uses fewer computational resources and memory, making it practical to run on less powerful hardware.
A smaller, more efficient model designed to run quickly and use less memory than larger alternatives, often with some trade-off in reasoning capability.
An attention mechanism with linear complexity instead of quadratic.
A property where the Bellman backup operation preserves linearity in value functions.
Computational cost that grows proportionally with sequence length, rather than quadratically like Transformers.
Using linear combinations of features to represent value functions or policies in RL.
A simple classifier trained on top of a model's internal representations to detect specific properties.
Simple machine learning classifiers trained on model internal states to detect specific properties like deception.
A simple model that maps input features to continuous numeric outputs using a linear function.
The idea that concepts are linearly separable in neural network embeddings.
Systems whose behavior follows linear equations that don't change over time.
An attention mechanism with linear computational complexity instead of quadratic, enabling faster inference.
A task where a model predicts missing relationships between entities in a knowledge graph, such as guessing that two people are colleagues based on existing connections.
A neural network architecture that uses continuous, adaptive functions to process information, allowing the model to adjust its behavior dynamically based on input.
The capability to read and understand text and written content within images, rather than just recognizing objects or scenes.
A continuously updated evaluation system that scores models on new data as it arrives, rather than a fixed test set.
A transformer-based neural network design optimized for efficient language modeling and text generation.
A design pattern that connects a vision encoder to a language model, enabling the language model to understand and describe images.
A language model trained to evaluate and judge outputs (like comedy sketches) based on learned human preferences.
Using a language model to automatically evaluate the quality of outputs from other AI systems instead of human reviewers.
Using a language model to automatically evaluate or score outputs from other AI systems instead of human reviewers.
Running a model directly on your own computer or server instead of sending requests to a remote service.
Running an AI model directly on your own computer rather than sending data to a remote server, keeping data private and reducing latency.
A technique that groups similar items together using hashing, allowing the model to attend to relevant parts of long text without comparing every token to every other token.
An efficient attention mechanism that groups similar tokens together to reduce computation, allowing the model to handle longer texts without excessive memory use.
Pre-defined action sequences or skills expressed using logical rules that guide an agent toward specific goals.
Methods that use the model's raw prediction scores to make decisions, rather than analyzing deeper internal patterns.
The ability of a model to process and understand very long sequences of text while maintaining coherence across distant parts of the input.
An embedding model designed to process and maintain meaningful representations across very long documents (thousands of tokens), rather than just short snippets.
The ability to process and understand very long documents or conversations without losing track of earlier information.
The ability to process and understand very long input texts (thousands of tokens) while maintaining coherent reasoning across the entire passage.
The ability to process and integrate information from many sources or a large amount of text, then combine it into a coherent summary or report.
The capability to produce extended, coherent text such as articles, reports, or documents while maintaining consistency and structure throughout.
The capability to produce extended, coherent text outputs like essays, articles, or detailed explanations rather than just short responses.
Testing an AI system's ability to maintain context and preferences across many sequential interactions over time.
Finding relevant information across many steps or a large dataset to answer complex multi-part questions.
Complex goals requiring many sequential steps or decisions to complete successfully.
Forces between atoms that are far apart from each other, which are harder for models to capture.
The ability to handle very long input texts (thousands or more tokens) efficiently, which standard models struggle with due to computational constraints.
Rare or uncommon facts that appear infrequently in training data, making them harder for models to remember accurately.
Stored structured knowledge (like diagnostic criteria) that an AI system can access during reasoning.
A technique that adds small, trainable layers to a pre-trained model instead of retraining the entire model, making fine-tuning faster and more memory-efficient.
A lightweight method to customize a frozen language model for specific tasks without retraining the entire model.
Reducing file size while preserving all original data perfectly, so decompression recovers the exact original.
The ability to generate responses very quickly with minimal delay between when you send a prompt and when you receive an answer.
Representing data using fewer dimensions while preserving key information.
A tool that lets non-programmers build applications by writing minimal code or using visual interfaces.
A lightweight neural pathway that processes information through a compressed representation to reduce computation.
A language with limited training data and AI tools compared to English or other major languages.
Languages with relatively little training data available compared to major languages like English, making them harder for AI models to learn.
A measure of how quickly nearby trajectories diverge in a dynamical system; determines stability and predictability.
A neural network trained to predict atomic forces and energies, enabling fast simulations of molecular behavior.
An AI model that learns to predict forces and energies between atoms in molecules and materials.
Automated translation of text from one language to another using computational systems.
Removing the influence of specific poisoned data from a trained model without full retraining.
Neural network models trained to predict forces and energies between atoms, used to simulate materials without expensive quantum calculations.
The task of arranging large functional blocks on a chip to optimize performance and minimize wiring.
A state-space model architecture designed to process long sequences faster and with less memory than traditional transformer models.
A neural network design that uses state-space models as an alternative to transformers, offering faster processing and lower memory usage.
A hybrid model design that combines Mamba (a state-space model) with Transformer components to process long sequences more efficiently than pure Transformers while maintaining strong performance.
A neural network design that combines selective state spaces (Mamba) with traditional attention mechanisms to process text more efficiently while maintaining strong performance.
A cloud service where the provider handles infrastructure, updates, and maintenance so you only focus on using the service rather than managing it.
The assumption that high-dimensional data lies on a lower-dimensional curved surface (manifold) rather than filling the entire space.
Discovering the underlying low-dimensional structure of high-dimensional data.
A statistical sampling technique that explores a parameter space guided by probability, concentrating samples in the regions where values are most plausible.
A sampling method that generates sequences of dependent samples to approximate probability distributions.
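A minimal sketch of this idea using a Metropolis sampler, one common Markov chain Monte Carlo method. The target distribution (a standard normal) and all names are illustrative assumptions, not from any particular library:

```python
import math
import random

def metropolis_normal(n_samples, step=1.0, seed=0):
    """Draw dependent samples whose distribution approaches N(0, 1)."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.uniform(-step, step)
        # Accept with probability min(1, p(proposal)/p(x)) for p ∝ exp(-x²/2).
        accept_logprob = (x * x - proposal * proposal) / 2.0
        if accept_logprob >= 0 or rng.random() < math.exp(accept_logprob):
            x = proposal
        samples.append(x)  # each sample depends on the previous one
    return samples

samples = metropolis_normal(20000)
mean = sum(samples) / len(samples)
```

Because each sample is proposed as a small move from the previous one, consecutive samples are correlated; only in aggregate do they approximate the target distribution.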
A framework for sequential decision-making with probabilistic state transitions.
A training technique where random words in text are hidden, and the model learns to predict them based on surrounding context.
A training technique where parts of text are hidden and the model learns to predict what should fill those gaps, helping it understand context and meaning.
A training technique where parts of the input are hidden, and the model learns to predict what was masked, helping it understand underlying patterns.
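The masking step behind these definitions can be sketched in a few lines. This is a toy illustration of how training pairs are constructed, with hypothetical names and a `[MASK]` placeholder token assumed for clarity:

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Hide a random subset of tokens; the model's job is to predict them."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append(mask_token)
            targets[i] = tok  # training label: the original token
        else:
            masked.append(tok)
    return masked, targets

tokens = "the cat sat on the mat".split()
masked, targets = mask_tokens(tokens, mask_rate=0.5)
```

The model sees `masked` as input and is scored on how well it recovers the entries in `targets`, forcing it to use surrounding context.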
Attention that only looks at past tokens, preventing future information leakage.
A technique where the model learns to predict hidden or blanked-out words in text, allowing it to reason about context from multiple directions at once.
Placeholder positions in text that are hidden or unknown, which the model learns to fill in or refine during generation.
A process where the model hides (masks) and then progressively reveals (unmasks) parts of text to refine and improve the entire sequence iteratively.
Extreme outlier values in a small number of tokens and channels within a neural network layer.
Separating model weights into components for efficient distributed training.
Pre-computed results stored for fast retrieval instead of computing on demand.
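A minimal example of this caching pattern using Python's standard-library `functools.lru_cache`; the Fibonacci function is just a convenient illustration:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Each result is computed once, then served from the cache on repeat calls."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# Without the cache this recursion would repeat work exponentially;
# with it, each fib(k) is computed exactly once.
result = fib(60)
```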
A model that has been optimized and trained specifically for mathematical reasoning and problem-solving tasks, rather than general-purpose language understanding.
Symbolic representations of mathematical expressions and equations (like formulas and symbols) that need special handling to be correctly interpreted by AI models.
The process of analyzing and interpreting visual mathematical symbols and equations to convert them into a structured, computer-readable format.
The ability to solve multi-step math problems by breaking them down logically and showing intermediate steps rather than just guessing the answer.
An XML-based markup language designed specifically for representing mathematical notation in a way that computers can understand and display.
Decomposing a matrix into a product of smaller matrices, commonly used for dimensionality reduction and pattern discovery.
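A small sketch of this using truncated SVD on a toy matrix; the "ratings" framing and all variable names are illustrative assumptions:

```python
import numpy as np

# A small "ratings" matrix; rows = users, columns = items (toy data).
M = np.array([[5.0, 4.0, 1.0],
              [4.0, 5.0, 1.0],
              [1.0, 1.0, 5.0]])

# Factor M ≈ W @ H with rank 2 via truncated SVD.
U, s, Vt = np.linalg.svd(M, full_matrices=False)
k = 2
W = U[:, :k] * s[:k]   # (3, 2): each row is a compressed user profile
H = Vt[:k, :]          # (2, 3): each column is a compressed item profile
approx = W @ H
error = np.linalg.norm(M - approx)
```

The two factors are much smaller than the original matrix, and the rows of `W` act as low-dimensional descriptions of each user, which is the dimensionality-reduction use the definition mentions.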
A training technique that allows a single embedding model to produce high-quality results at multiple vector sizes, letting you shrink the embedding dimensions to save storage and speed without retraining.
A technique that combines multiple token embeddings into a single representation by averaging them, producing one embedding for an entire text sequence.
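Mean pooling itself is a one-line operation; a sketch with toy numbers (shapes and values are made up for illustration):

```python
import numpy as np

# Toy token embeddings for a 4-token sequence, dimension 3.
token_embeddings = np.array([[1.0, 0.0, 2.0],
                             [3.0, 4.0, 0.0],
                             [0.0, 2.0, 2.0],
                             [4.0, 2.0, 0.0]])

# Mean pooling: average across the token axis → one vector per sequence.
sentence_embedding = token_embeddings.mean(axis=0)
```

In practice, padding tokens are usually excluded from the average via an attention mask, so only real tokens contribute.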
Subtle changes to how choices are presented that systematically influence AI agents without degrading the decision environment for humans.
Designing rules for interactions between parties to achieve desired outcomes like fairness or efficiency.
Proof that a model's behavior stems from a specific internal mechanism.
The study of understanding how a language model's internal components and computations work to produce its outputs.
The ability to apply clinical knowledge and logic to interpret medical data, such as understanding what symptoms indicate about a patient's condition.
An attack that determines whether a specific data point was used to train a model.
When a model learns to reproduce exact training examples rather than learning general patterns it can apply to new situations.
The maximum amount of information a model can store and retrieve.
How well a model uses available RAM or GPU memory, allowing it to run on smaller or less expensive hardware.
The amount of RAM or storage space a model requires to run, which is critical for deployment on resource-constrained devices.
A neural component that selects and refines relevant knowledge from long-term memory based on the current context.
The combination of a base model's weights with additional trained weights (like from LoRA adapters) into a single unified model file.
A higher-level agent that monitors and improves other agents by comparing their outputs against reality and updating their code or instructions.
Training a model to learn how to learn, so it can quickly adapt to new tasks or changing conditions.
Self-awareness about thinking processes, including goal assessment, domain awareness, and strategic exploration.
A general problem-solving strategy that explores solutions without guaranteeing optimality but finds good answers quickly.
A testing approach that checks if a system maintains consistent behavior under semantically equivalent input transformations.
A state that appears stable but is easily disrupted by small changes or perturbations.
Using an evaluation metric that doesn't align with true objectives.
A model positioned between lightweight and flagship versions, balancing capability with efficiency rather than maximizing raw performance.
Software layer that sits between services to translate, transform, or coordinate their interactions.
Multi-input, multi-output architecture that processes multiple data streams in parallel to improve model expressiveness without increasing latency.
A training method where one part tries to break the model (maximization) while another part fixes it (minimization) to build robustness.
Control strategy that achieves desired system behavior using the least amount of control effort.
An optimization algorithm that uses geometric transformations to adapt learning to different data distributions.
A property allowing optimization algorithms to switch between different geometric transformations while maintaining convergence.
A specific design pattern for transformer-based language models that uses efficient attention mechanisms and grouped query attention to balance performance and speed.
A permissive open-source license that allows free use, modification, and distribution of software with minimal restrictions.
Using different numerical precisions for different parts of computation.
Training with lower precision for speed while maintaining higher precision where needed.
A quantum state representing uncertainty or entanglement with an environment, described by a density matrix rather than a pure state vector.
Using different numerical precisions (e.g., 8-bit, 4-bit) for different parts of a model to reduce memory and computation.
An architecture where a model contains multiple specialized sub-networks (experts) and selectively activates only a few for each input, improving efficiency without sacrificing capability.
A machine learning framework optimized for running models efficiently on Apple Silicon chips.
Running a model locally on Apple Silicon hardware using the MLX framework, which is optimized for efficient inference on Mac devices.
A model format designed specifically for efficient inference on Apple Silicon devices, optimized for the MLX machine learning framework.
A machine learning framework specifically designed for running AI models efficiently on Apple Silicon hardware.
A framework that optimizes AI models to run efficiently on Apple Silicon chips (like M1, M2, M3), taking advantage of their specific hardware capabilities.
A robot's ability to move around an environment while using its arms to pick up and interact with objects.
A type of input or output data a model can process, such as text, images, or audio.
When a multimodal system stops using some of its input types and relies only on one or a few.
Adapting a model trained on one type of data (like video) to work with a different type (like tactile signals) efficiently.
The property that different trained models can be connected through a continuous path in weight space.
The underlying structural design of a neural network that determines how data flows through it and how it processes information.
The core underlying architecture of a model that serves as the foundation for specialized versions or fine-tuned variants.
A ranking level within a model family that indicates relative power, speed, and cost trade-offs.
The size and complexity of a model, which determines how much information it can learn and store; smaller capacity means fewer parameters and less computational power needed.
A saved snapshot of a trained model's weights and parameters, stored in formats like safetensors or PyTorch for later use or deployment.
When a language model's training performance suddenly degrades due to overconfidence in incorrect predictions.
Techniques used to make models smaller and faster to run, allowing them to work on devices with limited memory or processing power.
The process of configuring and launching a trained model in a cloud environment so it can receive requests and generate responses.
Differences in predictions across multiple models on the same input.
A technique where a smaller, faster model is trained to mimic the behavior of a larger, more capable model to reduce computational costs.
Degradation of model performance over time due to changes in data distribution or real-world conditions.
How well a model performs relative to its computational cost and resource requirements, important for deployment on devices with limited hardware.
The amount of memory and computational resources required to run a model, with smaller footprints being more efficient.
The amount of memory and computational resources required to run a model, determined primarily by its size and architecture.
The file format used to store and load a model's weights; common formats like safetensors and PyTorch determine compatibility with different tools and frameworks.
Learning optimal behavior without explicitly modeling the environment.
The process of running a trained model on new input data to generate predictions or outputs, as opposed to training the model.
The stacked computational components in a neural network that progressively transform input data; fewer layers mean faster processing but potentially less ability to capture complex patterns.
A technique that combines the learned knowledge from two or more trained models into a single model.
Techniques used to make a model smaller, faster, or more efficient while maintaining acceptable performance.
The internal numerical values (weights) that a neural network learns during training and uses to make predictions.
A control method that predicts future system behavior and optimizes actions over a time horizon.
A control method that predicts future system behavior and optimizes actions based on a mathematical model.
Removing unnecessary parameters or connections from a model to reduce size and computation.
A technique that reduces a model's size and memory requirements by using lower-precision numbers, enabling it to run on resource-limited devices.
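A minimal sketch of one simple form of this idea, symmetric per-tensor int8 quantization; function names and the toy weights are illustrative assumptions, and real quantization schemes are considerably more elaborate:

```python
import numpy as np

def quantize_int8(weights):
    """Map float weights to int8 values plus one shared scale factor."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the compact representation."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.03, 1.0], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_error = np.abs(w - w_hat).max()
```

Each weight now occupies 1 byte instead of 4, at the cost of a small rounding error bounded by half the scale.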
The size of a model measured by the number of parameters it contains; smaller models are faster but less capable than larger ones.
The practice of increasing a model's size (parameters, training data, or compute) to improve its capabilities and performance.
The total number of parameters (learnable values) in a model, which affects its memory usage, speed, and capability.
Training a model to excel at a narrow set of tasks rather than performing well across many different domains.
A minimal, simplified version of a model used for testing code and infrastructure without the computational cost of a full model.
A collection of related models of varying sizes or configurations released together for comparative research and analysis.
The ability to examine and understand how a model works, including access to its weights, architecture, and training details.
The process of testing a model to ensure it works correctly within a framework or pipeline before deploying it for real tasks.
A modified version of a base model that changes its size, capabilities, or behavior while maintaining the same core architecture.
The learned numerical parameters inside a neural network that determine how it processes input and generates output.
A technique that works across different model architectures without requiring architecture-specific modifications.
Learning approach where an agent builds a model of how the environment works, then uses it to plan actions.
A computational technique that simulates how atoms move and interact over time.
A specialized AI model trained to understand and process chemical structures by learning patterns from molecular data, similar to how text language models learn from words.
The ability to understand and predict how molecules behave, interact, and transform based on their chemical structure and properties.
A distillation technique that aligns statistical properties (moments) between a teacher and student model.
An optimization technique that accumulates gradients to accelerate convergence.
Predicting 3D depth information from a single 2D image without stereo or multiple views.
A technique using dropout during inference to estimate model uncertainty by sampling multiple predictions.
A computational technique using repeated random sampling to estimate probability distributions and outcomes.
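The classic toy illustration of this technique is estimating π by random sampling; everything here is a self-contained sketch:

```python
import random

def estimate_pi(n_samples, seed=0):
    """Estimate π by sampling random points in the unit square."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1  # point falls inside the quarter circle
    # Fraction inside ≈ (quarter-circle area) / (square area) = π/4.
    return 4.0 * inside / n_samples

pi_estimate = estimate_pi(100_000)
```

The estimate improves as the number of samples grows, with error shrinking roughly as 1/√n.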
An algorithm that explores game possibilities by randomly simulating many future moves to estimate the best action.
A model's ability to understand and apply ethical principles to make judgments about right and wrong.
The ability to understand and process word structure, including prefixes, suffixes, and inflections that change word meaning or grammatical function in languages like Russian.
The linguistic challenge of handling languages where words change form significantly based on grammar, tense, and case—common in Polish and other inflected languages.
The structure and rules of how words are formed and modified in a language, which is especially important for languages like Korean with complex word composition.
A dynamic decision boundary that adjusts based on detected motion to determine when cached features can be safely reused.
A neural network design that combines masked language modeling with permutation language modeling to better understand relationships between words in text.
Multiple independent agents interacting and learning in a shared environment.
Solving problems by chaining multiple reasoning steps together sequentially.
Techniques for making multiple autonomous agents work together toward shared goals.
A system where multiple AI agents work together, cross-checking and debating each other's reasoning before producing a final answer.
A system where multiple AI agents with different roles work together to solve a problem.
Training multiple agents simultaneously so they learn to cooperate and improve together toward shared goals.
Multiple AI agents working together, each with different roles or goals, to solve a problem collaboratively.
A decision problem where an agent repeatedly chooses between options to maximize rewards while learning which is best.
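A minimal sketch of one standard bandit strategy, epsilon-greedy: explore a random option occasionally, otherwise pick the one that currently looks best. The arm payoffs and all names are illustrative assumptions:

```python
import random

def epsilon_greedy(true_means, n_rounds=5000, epsilon=0.1, seed=0):
    """Repeatedly pick an arm; explore at rate epsilon, otherwise exploit."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    values = [0.0] * len(true_means)  # running estimate of each arm's reward
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_means))  # explore a random arm
        else:
            arm = values.index(max(values))       # exploit best estimate
        reward = true_means[arm] + rng.gauss(0, 0.1)  # noisy payoff
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # update mean
    return counts, values

counts, values = epsilon_greedy([0.2, 0.8, 0.5])
```

Over many rounds the agent concentrates its pulls on the arm with the highest true mean while still sampling the others often enough to keep its estimates honest.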
Training a model on question-answer pairs from many different topics or fields to make it work well across diverse subjects.
The ability to understand and generate code across many different programming languages.
Finding solutions that balance multiple competing goals simultaneously.
An iterative approach where an LLM revisits and refines its analysis across multiple complete passes through a problem.
System design that integrates multiple LLM providers for improved reliability through consensus and fallback mechanisms.
The ability to break down complex problems into smaller sequential steps and solve them methodically rather than attempting to answer in one go.
The ability to break down complex problems into sequential reasoning steps and correctly combine them to reach a solution.
The ability to break down complex problems into smaller steps and solve them sequentially, rather than jumping directly to an answer.
The ability to break down complex problems into sequential steps and execute them autonomously without human intervention between steps.
Problems or workflows that require a model to perform multiple sequential operations or reasoning steps to reach a final answer.
Training a single model on multiple different tasks simultaneously so it learns shared skills across them.
Generating multiple future tokens in parallel instead of one at a time.
The ability to maintain context and coherence across multiple back-and-forth exchanges with a user, remembering earlier messages in the conversation.
A conversation where the model maintains context across multiple back-and-forth exchanges with a user, remembering previous messages.
A representation where documents and queries are encoded as multiple vectors (one per token) instead of a single vector, enabling more precise matching.
A search method that represents a single piece of text using multiple vectors simultaneously, allowing more flexible and nuanced matching.
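The scoring rule behind these multi-vector methods can be sketched with ColBERT-style "MaxSim" late interaction: each query token vector is matched against its best document token vector, and the matches are summed. The toy vectors below are illustrative assumptions:

```python
import numpy as np

def maxsim_score(query_vecs, doc_vecs):
    """For each query token vector, take its best match among the
    document's token vectors, then sum those best matches."""
    sims = query_vecs @ doc_vecs.T  # (n_query_tokens, n_doc_tokens)
    return sims.max(axis=1).sum()   # best match per query token

q = np.array([[1.0, 0.0], [0.0, 1.0]])        # two query token vectors
doc_a = np.array([[1.0, 0.0], [0.7, 0.7]])    # relevant document
doc_b = np.array([[0.0, -1.0], [0.5, 0.0]])   # less relevant document
score_a = maxsim_score(q, doc_a)
score_b = maxsim_score(q, doc_b)
```

Because matching happens per token rather than on one pooled vector, a document can score well by answering different parts of the query with different passages.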
Combining information from multiple camera angles to create a unified understanding of a scene.
Computational techniques that combine solutions from models of varying accuracy and cost to reduce overall computation.
A model trained to understand and generate text in multiple languages, not just English.
The ability of a model to understand and generate text in multiple languages, often with varying levels of proficiency across different language pairs.
A model's ability to understand and generate text in multiple languages, not just English.
A large collection of source code written in many different programming languages, used to train the model.
The ability of a model to understand and generate text in multiple languages, typically because it was trained on data from many different languages.
A shared mathematical space where sentences from different languages are positioned so that translations or sentences with the same meaning end up near each other.
A shared numerical space where text from different languages is represented so that similar meanings across languages are positioned close together, enabling cross-language comparison.
A model trained on text from multiple languages, allowing it to understand and generate text in several different languages.
Natural language processing systems designed to understand and work with text in multiple languages, including non-Latin scripts like Cyrillic.
A model's ability to understand and generate text in multiple languages with comparable quality across different language pairs.
The capability to understand, process, and reason through problems in multiple languages, not just English.
When a model is optimized for one or a few languages rather than many, trading broad language support for deeper fluency in those specific languages.
The ability of a model to understand and process text in multiple languages, not just English.
Training a model on text from many different languages so it can understand and generate text across all of them.
A model that can process and understand multiple types of input, such as both text and images.
An AI system that can process and reason over multiple types of data (text, images, documents) to complete tasks.
The process of training a model to understand and connect different types of data (like audio and text) by mapping them into a shared space where related concepts are close together.
An adversarial attack that simultaneously perturbs multiple input modalities (e.g., text and audio) to fool a model.
Attention mechanism that processes multiple types of input (like text and image features) simultaneously in a transformer.
Discriminatory patterns that emerge when AI models process multiple input types (text, audio, images) together.
The ability of an AI model to understand and reason about multiple types of input data (like images and text) simultaneously.
A conversational interaction where the model can understand and respond to inputs that combine both text and images in a natural back-and-forth exchange.
A generative model that takes multiple types of input (like text and images) to create new content.
A representation that captures meaning from multiple types of data (like text, images, and tables) in a single searchable format.
Combining data from multiple sources (like ECG and PPG) to make better predictions than using each source alone.
A reward model that processes multiple input types (text, images) and generates interpretable feedback about output quality.
The ability to accept and process multiple types of input data simultaneously, such as both images and text in the same request.
An AI model that processes both text and images to understand and reason about visual content.
Training a model to understand and process multiple types of input data (like text and images) together rather than separately.
An AI model that can process and understand multiple types of input data, such as video, images, and text together.
A sequence of processing steps that handles multiple types of input data (like text and images) together in a single workflow.
Generating multiple plausible future outcomes instead of a single prediction.
Training a model on paired images and text data so it learns to connect visual and language understanding together.
Safety mechanisms that operate across multiple input types like images and text simultaneously.
Predicting time-to-event outcomes using multiple types of data (e.g., images, lab results, clinical notes).
AI tasks that require processing multiple types of input data at once, such as understanding both an image and a text question about it.
The ability of an AI model to process and reason about multiple types of input data (like images and text) simultaneously.
A system designed to understand and work with multiple types of content, such as text and images, even if it only processes one type directly.
Training a model on multiple related tasks simultaneously so it learns shared patterns that improve performance across all tasks.
A second-order optimizer designed for hypersphere-constrained training that improves stability during scaling.
The ability of a model to analyze and interpret musical characteristics like genre, emotion, harmony, and structure from audio or music data.
Deliberately introducing bugs into code to test whether test suites can catch them.
A low-precision floating-point format (4-bit) designed for efficient neural network computation while maintaining reasonable accuracy.
A natural language processing task that identifies and classifies specific entities like people, places, and organizations within text.
The task of automatically creating coherent stories or sequences of events in text form.
The organized framework of a story, including how events are sequenced and how the plot progresses from beginning to end.
The ability of a model to directly understand different types of input (like images or audio) without converting them to text first.
When a model can directly understand different types of input (like images or audio) without needing to convert them to text first.
The ability to process images at their original sizes and aspect ratios without forcing them into a fixed square dimension, reducing information loss from resizing.
An optimization method that accounts for the geometry of the data distribution, often converging faster than standard gradient descent.
The process by which a model produces human-readable text output based on its understanding of input and learned patterns.
A training task where a model learns to determine whether one sentence logically follows from another, helping it understand relationships between texts.
The field of AI focused on enabling computers to understand, interpret, and generate human language in a meaningful way.
The field of AI focused on understanding and generating human language in a meaningful way.
The process of converting human-written instructions or descriptions into executable programming code.
The ability of a model to comprehend and extract meaningful information from human language, rather than just pattern-matching on words.
Ranking metric measuring how well relevant items are placed at the top.
When learning from one task actually hurts performance on another task due to conflicting patterns.
A training technique where the model learns by comparing correct matches against intentionally chosen incorrect examples to improve discrimination.
A machine learning model that compresses audio into a compact digital format and can reconstruct it back to near-original quality.
A neural network component that converts raw text input into a numerical representation (embedding) that captures semantic meaning.
The process of converting text or other data into numerical vector representations using neural networks, enabling machines to understand and process language.
A neural network that represents continuous 3D properties (like temperature or material density) as a smooth function rather than discrete grid values.
Using neural networks and embeddings to find relevant documents or passages in response to a query, rather than traditional keyword matching alone.
An AI model trained to predict how code executes step-by-step without actually running it.
A learnable memory component that neural networks can read and write to.
A neural network that models continuous dynamics by treating layers as differential equations.
A learned function that maps between infinite-dimensional function spaces, used for solving physics equations on meshes.
A search method that uses neural networks to understand semantic meaning and find relevant documents, rather than relying on keyword matching alone.
Combining neural networks with symbolic logic to get both the flexibility of learning and the interpretability of rule-based systems.
The pattern of which neurons in a neural network fire or respond when processing specific inputs.
An optimization algorithm that finds roots of equations by iteratively refining guesses using function derivatives.
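A minimal sketch of the iteration, using square roots as the example problem (find the root of f(x) = x² − a); names and tolerances are illustrative:

```python
def newton_sqrt(a, x0=1.0, tol=1e-12, max_iter=50):
    """Find sqrt(a) as the root of f(x) = x² − a via Newton's method."""
    x = x0
    for _ in range(max_iter):
        fx = x * x - a
        if abs(fx) < tol:
            break
        x -= fx / (2 * x)  # Newton step: x ← x − f(x)/f'(x)
    return x

root = newton_sqrt(2.0)
```

Each step replaces the current guess with the root of the tangent line at that guess, which is why convergence is typically very fast near the solution.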
Advanced features and improvements in a model that represent a significant step forward from previous versions.
The fundamental task where a language model learns to guess the most likely next word (or token) based on all the words that came before it.
A pretraining task where a model learns to predict which clinical events will occur at a patient's next healthcare visit.
A specific 4-bit quantization method that uses a normalized float format to preserve model accuracy while dramatically reducing memory requirements.
The starting point for diffusion generation, typically random Gaussian noise that gets progressively refined into an image.
A sequence defining how much noise is added during training and removed during sampling in diffusion models.
A preprocessing technique that removes or corrects low-quality or mismatched training examples before training, improving model reliability.
Generating all output tokens simultaneously rather than one at a time, enabling faster inference.
A generation approach where the model generates multiple tokens in parallel or through iterative refinement, rather than one at a time.
A text generation approach where the model can predict or refine multiple words in parallel, rather than generating one word at a time in sequence.
A legal restriction that permits using the model for learning and research but prohibits using it in production systems or for commercial purposes.
Specifications describing how a system should perform, including quality attributes like performance and security.
A decision problem where the optimal action depends on history, not just the current observation, because the present state is ambiguous.
Tracking and reconstructing objects that bend or change shape, rather than staying rigid.
System behavior that changes over time rather than remaining constant, like wear or environmental drift.
How well a model adapts its behavior based on social norms and contextual expectations.
The ability to grasp subtle meanings, context, and shades of gray in language rather than treating everything as black-and-white.
The ordered arrangement of DNA building blocks (A, T, G, C) that make up genetic code.
AI planning that handles continuous numeric quantities like data sizes, processing times, and resource constraints.
The ability to understand, manipulate, and solve problems involving numbers, calculations, and mathematical logic.
A low-precision numerical format optimized by NVIDIA that uses fewer bits per number than standard formats, enabling efficient inference on NVIDIA GPUs while maintaining reasonable accuracy.
A computer vision task that identifies and locates specific objects within an image by drawing boxes around them.
The task of identifying and outlining individual objects in an image or video by marking their exact boundaries at the pixel level.
Task where an AI agent navigates to locate and reach a specified target object in a physical environment.
When objects or areas are hidden from view by other objects in front of them.
A 3D model that accounts for hidden or blocked parts of objects in a scene.
The ability to detect and extract text from images, converting printed or handwritten characters into machine-readable text.
A model that understands text in images without needing a separate optical character recognition (OCR) tool to extract the text first.
Training an AI agent using only pre-collected data without interacting with the environment.
An AI model that natively processes audio, vision, and text inputs together in a single system.
A drone's ability to detect and avoid obstacles coming from any direction, not just ahead.
A model designed to run directly on a user's device (phone, laptop, etc.) rather than requiring a remote server.
Running an AI model directly on a user's device (phone, laptop, edge device) rather than sending data to a remote server.
Running a model directly on a user's device (phone, laptop, etc.) rather than sending data to a remote server, which improves privacy and reduces latency.
Reinforcement learning where the model learns from data generated by its own current policy.
The ability to learn or perform a task from a single example, rather than requiring many training examples.
Continuously updating a model with new incoming data in real-time rather than in batch training sessions.
Training a model on streaming data one example at a time, updating weights immediately rather than in batches.
An open standard format for saving and running machine learning models that works across different frameworks and platforms, making models more portable and efficient.
An open standard file format for storing trained machine learning models so they can run efficiently across different platforms and frameworks.
A structured, standardized system that defines relationships between concepts — in this case, medical terms and their clinical meanings.
A legal permission that allows anyone to freely use, modify, and distribute the model without restrictions (in this case, Apache 2.0).
Software or models where the code, weights, and training data are publicly available for anyone to inspect, use, and modify.
A legal framework (like GPL-3.0) that allows anyone to use, modify, and distribute the model code and weights freely, often with requirements to share improvements.
A model whose trained weights are publicly downloadable, allowing local deployment and modification.
A model trained to handle conversations on any topic without being restricted to a specific subject area.
The task of finding relevant documents from a very large, unrestricted collection to answer questions, without being limited to a specific domain or dataset.
Questions or instructions that have multiple valid answers rather than a single correct response.
Optimization where the solution space and objectives are not fixed in advance but emerge during the search process.
Publicly released model parameters that allow anyone to download and run the model locally, rather than accessing it only through a company's API.
Detecting objects in images using arbitrary text descriptions rather than a fixed set of predefined categories.
A model whose trained weights are publicly released, allowing anyone to download and run it locally.
A model whose trained weights are publicly released and can be freely downloaded and used, as opposed to being proprietary or access-restricted.
A model whose trained weights are publicly released, allowing anyone to download and run it locally rather than only accessing it through an API.
An open-source license that allows free use of a model while including responsible AI guidelines and usage restrictions.
To define an abstract concept in concrete, measurable terms that can be tested or evaluated.
A mathematical measure of how much a matrix can stretch vectors, used to understand optimizer behavior.
A technology that automatically detects and extracts text from images or scanned documents.
A visual representation showing how pixels move between video frames, indicating motion direction and speed.
A mathematical method for finding the most efficient way to move one distribution to another.
The process of adjusting model parameters to minimize errors and improve performance.
An algorithm that updates model weights during training to reduce loss and improve accuracy.
Internal variables an optimizer maintains, like momentum or adaptive learning rates, between updates.
The total number of gradient computations or function evaluations required to reach a desired solution accuracy.
Evaluating model outputs by ranking them on an ordered scale rather than binary correct/incorrect judgments.
A mathematical operation that removes specific directions from high-dimensional data while preserving the remaining information.
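As a minimal sketch of this operation (values invented for illustration): removing a direction u from a vector v means subtracting v's component along u, assuming u has unit length.

```python
# Sketch: project out a single direction u from a vector v.
# Assumes u is already unit-length; the example vectors are made up.

def project_out(v, u):
    # component of v along u
    dot = sum(vi * ui for vi, ui in zip(v, u))
    # subtract that component, leaving the rest of v intact
    return [vi - dot * ui for vi, ui in zip(v, u)]

v = [3.0, 4.0]
u = [1.0, 0.0]            # direction to remove
print(project_out(v, u))  # -> [0.0, 4.0]: the x-direction is gone
```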
Feature vectors that are perpendicular to each other, capturing independent information.
A mathematical operation that rearranges data while preserving its geometric properties, used here to update model weights more efficiently.
A special type of doubly stochastic matrix derived from orthogonal matrices, providing a structured way to parameterize the Birkhoff polytope.
Data that differs significantly from the training set, often causing poor model predictions.
A model's ability to make predictions beyond the range of values it saw during training.
Using a model on tasks or data significantly different from what it was trained on.
Words or characters that a model has never seen during training and doesn't have a built-in representation for.
The type of data a model produces as output, such as text, images, or predictions.
When a model assigns high confidence to predictions that are actually incorrect or unreliable.
The expected human effort and resources required to monitor and intervene in autonomous agent decisions.
A number system extending the rationals with the p-adic absolute value, important in the study of arithmetic geometry.
Evaluating models by comparing outputs two at a time, which scales quadratically with the number of models.
Using a 360-degree camera view to see the entire environment around a drone at once.
Processing and understanding text at the scale of full paragraphs rather than individual sentences or words.
Non-verbal aspects of speech like pitch, tone, and accent that convey information about speaker identity.
Generating multiple output tokens at once instead of sequentially for faster inference.
A generation approach where multiple parts of the output are improved simultaneously rather than sequentially, enabling faster completion.
Running multiple independent attempts at solving a problem simultaneously to gather diverse training data.
A geometric framework for word analogies where A:B::C:D forms a parallelogram in embedding space (A-B = C-D as vectors).
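The parallelogram relation can be sketched with toy vectors (the 2-D "embeddings" below are invented for illustration): from B - A = D - C, the fourth term is D = C + (B - A).

```python
# Toy sketch of the parallelogram analogy A:B :: C:D, where the
# difference vectors match: B - A = D - C.  Embeddings are made up.

def analogy_target(a, b, c):
    # solve B - A = D - C for D:  D = C + (B - A)
    return [ci + (bi - ai) for ai, bi, ci in zip(a, b, c)]

man   = [1.0, 1.0]
king  = [2.0, 3.0]
woman = [1.0, 2.0]
# man:king :: woman:? lands where "queen" should sit
print(analogy_target(man, king, woman))  # -> [2.0, 4.0]
```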
The process of selectively using only a subset of a model's total parameters during inference, reducing computational cost while maintaining performance.
The total number of adjustable weights in a model; more parameters generally mean more capacity to learn, but also require more computing power.
The ability of a model to achieve strong performance while using fewer total parameters or activating fewer parameters during inference, reducing memory and computational requirements.
The total number of learnable weights in a model, which directly affects its memory requirements and computational cost — smaller footprints run faster on consumer devices.
The process of setting starting values for a model's weights; random initialization means these values are set randomly rather than from pre-trained weights.
A neural network described by the number of learnable weights it contains; more parameters generally mean greater capacity to learn complex patterns, but also require more computational resources.
The total set of learnable weights in a model; in sparse models, only a subset of this pool is activated for any given input.
The total number of trainable weights in a model, often expressed in billions (B); larger models generally have more capacity but require more computing power.
Reusing the same weights across multiple layers or iterations to reduce model size and memory overhead.
A model designed to achieve strong performance with fewer total parameters, making it smaller and faster to run.
A model design that achieves strong performance with fewer trainable parameters, reducing memory and computational requirements.
Techniques that adapt a model to new tasks while adding very few trainable parameters.
The learned numerical values in a model — more parameters generally means more capacity but higher compute cost.
Information encoded in an LLM's weights and parameters during training, as opposed to retrieved external knowledge.
Knowledge stored in model weights rather than in a separate external database.
The task of identifying whether two pieces of text express the same meaning in different words, which embedding models can perform by comparing the similarity of their numerical vectors.
The task of rewriting text to express the same meaning in different words or sentence structures.
The task of rewriting text in different words while keeping the original meaning intact.
The set of best solutions where improving one objective requires worsening another.
Generating objects by explicitly modeling and composing individual semantic parts rather than treating the whole object as a single unit.
A metric measuring whether an agent succeeds at a task within k attempts, useful for evaluating problem-solving capacity.
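A common unbiased estimator for this metric: given n sampled attempts of which c succeeded, estimate the probability that at least one of k randomly chosen attempts succeeds.

```python
from math import comb

# Standard unbiased pass@k estimator:
#   pass@k = 1 - C(n-c, k) / C(n, k)
# n = attempts sampled, c = attempts that succeeded.

def pass_at_k(n, c, k):
    if n - c < k:   # fewer failures than draws: success is guaranteed
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(10, 3, 1))  # ≈ 0.3 (3 of 10 attempts succeeded)
```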
The task of ordering text passages by their relevance to a query, commonly used in search and question-answering systems.
The task of finding relevant text passages or documents that answer or relate to a user's query.
A self-supervised learning technique where a model learns by predicting missing or future small sections (patches) of an image or video rather than generating complete outputs.
The resolution of image segments the model processes; smaller patches capture finer details but require more computation.
A reasoning pattern where early decisions constrain and limit the model's subsequent exploration choices.
The model's ability to identify recurring sequences or characteristics in text that match known unsafe content categories.
Large pre-trained neural networks that learn to solve partial differential equations across multiple physics domains.
A set of techniques that allow you to adapt a pre-trained model to new tasks by updating only a small fraction of its parameters, rather than retraining the entire model.
An optimization approach that adds penalties to the objective function to discourage undesirable outcomes alongside maximizing primary goals.
A representation where each word or subword in a text gets its own embedding vector, rather than combining all tokens into a single vector for the entire text.
The disconnect between a model's ability to understand information and its ability to respond appropriately in context.
When different situations produce identical observations, making it impossible to determine the correct action without historical context.
Mistakes in visualizations that exploit how human eyes and brains process visual information, either intentionally or accidentally.
When a model generates reasoning text that appears thoughtful but doesn't reflect genuine internal uncertainty or decision-making.
Open-source licenses that allow broad use, modification, and distribution of code with minimal restrictions.
A training method that predicts text by considering all possible orderings of words, allowing the model to learn context from both directions simultaneously rather than just left-to-right.
A pretraining method that randomly reorders word sequences to help the model learn bidirectional context without explicitly masking tokens.
A metric measuring how well a model predicts the next token — lower perplexity means better language modeling.
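A minimal sketch of the computation: perplexity is the exponential of the average negative log-probability the model assigns to each correct next token.

```python
import math

# Perplexity = exp(mean negative log-likelihood) over the tokens.
# The probabilities below are invented for illustration.

def perplexity(token_probs):
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# A model that always gives the right token probability 0.25 behaves
# as if it were choosing uniformly among 4 options:
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # ≈ 4.0
```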
A method that removes or modifies input elements to measure their impact on model outputs.
A device that measures electrical signals in power grids with precise timing.
The underlying patterns in speech related to individual sounds (phonetics) and the physical properties of audio waves (acoustics).
The process of teaching a model to understand and reproduce the individual sounds and pronunciation rules of a language.
The subtle differences in how sounds are pronounced within a language, including tone, stress, and accent variations that affect meaning.
A text-based encoding of how words sound, showing the individual speech sounds rather than the written spelling.
Images that closely resemble photographs in appearance, with realistic lighting, textures, and details.
The degree to which generated content obeys real-world physics laws and interactions.
Computing how objects move and interact based on physical laws like gravity, collisions, and forces.
Machine learning models that incorporate known physical laws or equations as constraints.
Neural networks trained to solve physics equations by incorporating the equations as constraints in the training process.
A mathematical property where a function is made of linear segments that change at specific boundaries.
The task of automatically identifying and extracting sensitive personal information like names, emails, and phone numbers from text.
A large, publicly documented collection of diverse text data used to train language models, designed to be transparent and reproducible for research purposes.
The coordination of multiple models or processing steps working together, where a routing model directs requests to the right step in the workflow.
Testing a workflow or system end-to-end to ensure all components work together correctly before using it with real data.
Visual information extracted directly from individual pixels in an image, used to understand the precise positioning and appearance of elements on a page.
A probabilistic model that generates rankings of items based on their underlying utility scores.
A model's ability to learn and adapt to new tasks and data.
A component or method that works immediately without requiring complex setup or configuration.
Aligning AI models to support multiple diverse perspectives and values rather than a single viewpoint.
A minor update to a software version (like 5.1 to 5.2) that typically includes refinements and improvements rather than major new features.
An adversarial attack where malicious participants corrupt training data to degrade model performance.
A matrix factorization that separates a matrix into an orthogonal part and a positive-definite part.
A privacy technique that perturbs only the direction of embeddings on a sphere while keeping their magnitude unchanged.
The process of adjusting a model's behavior to follow specific constraints or objectives during training.
The process by which a reinforcement learning agent's decision-making strategy stabilizes toward optimal behavior.
Converting trajectories or behaviors discovered during exploration into a trainable policy that can be deployed.
The process of automatically checking content against a set of rules or guidelines and blocking or flagging violations.
An optimization method that updates model parameters by following the gradient of expected rewards.
The ability to identify when content breaks specific safety rules or guidelines set by an organization.
A method that runs multiple different solving strategies in parallel and uses the best result.
The process of selecting and weighting assets to create an investment portfolio that balances risk and return objectives.
The task of identifying and locating body parts (like joints or keypoints) in images or video.
Estimating future body joint positions and orientations from past poses.
Modifying how a model encodes token positions to extend its ability to handle longer sequences.
Reducing model size by converting weights to lower precision after training is complete.
An explanation method applied after a model is trained to interpret its predictions, rather than building interpretability into the model itself.
Additional refinement applied to a model after its initial training to improve performance on specific tasks like reasoning or instruction-following.
The updated probability distribution of parameters after observing new data.
A non-invasive measurement of blood flow and heart rate using light sensors, commonly found in smartwatches.
Hardware optimization achieved by adding compiler directives (pragmas) to code that guide synthesis tools in generating efficient designs.
The study of how context and intent affect language meaning beyond literal words.
A Transformer design choice where layer normalization is applied before the main computation rather than after.
A model that has already been trained on large amounts of data before being released, so it can be used immediately without additional training.
The level of numerical detail a model uses to represent its internal values; higher precision means more accurate calculations but requires more memory.
A slight loss in model accuracy or reasoning quality that can occur when using quantization or other compression techniques.
The reduction in numerical accuracy that occurs when a model is compressed, which can slightly degrade performance on complex reasoning tasks while remaining acceptable for most everyday uses.
The balance between reducing model size through lower numerical precision and maintaining accuracy—lower precision saves memory but may slightly reduce performance.
A control method that forecasts future states and optimizes actions accordingly.
How well an AI system's judgments match the actual preferences of target users or evaluators.
Refining a model by learning from human comparisons of outputs rather than explicit numerical scores.
Loading data into memory before it's needed to reduce wait times during computation.
A simple rule where you add a label like 'query:' or 'passage:' to the beginning of text to tell the model how to process it differently.
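A sketch of the convention, as used by some embedding models (the exact label strings vary by model and are assumed here):

```python
# Hypothetical sketch of query/passage prefixing for an embedding model.
# The label strings are an assumption; real models document their own.

def with_prefix(text, role):
    labels = {"query": "query: ", "passage": "passage: "}
    return labels[role] + text

print(with_prefix("how do transformers work", "query"))
# -> "query: how do transformers work"
```

The prefixed string, not the raw text, is what gets embedded, letting the model treat queries and passages asymmetrically.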
Comparing token sequences to find semantically equivalent continuations in an LLM's output.
A model that has already been trained on large amounts of text data before being released or fine-tuned for specific tasks.
A foundational AI model trained on raw data but not specialized for specific tasks like conversation, serving as a starting point for further customization.
A model trained on large amounts of text data to predict and generate language before being adapted for specific applications.
A model that has already been trained on large amounts of text data and can be used directly or fine-tuned for specific tasks.
The learned parameters of a model after training on large amounts of text data, ready to be used or further refined for specific tasks.
The initial training phase where a model learns general patterns from a large dataset before being adapted for specific downstream tasks.
An early-access version of a model released before full launch; useful for testing, but it may have bugs or change without warning.
An early version of a model released for testing and feedback before a stable, finalized version is available.
An early version of a model that is still being tested and refined before an official release, so features or performance may change.
An experimental version of a model released early for testing and feedback, with behavior and features that may change significantly before the official release.
The performance loss a model experiences when trained to be robust against attacks instead of optimized purely for accuracy.
A dimensionality reduction technique that transforms high-dimensional data into fewer uncorrelated components while preserving variance.
A model's default gender assumptions when translating ambiguous source text without explicit gender markers.
The balance between protecting sensitive information and maintaining model performance on downstream tasks.
A network configuration that isolates your model's traffic from the public internet, keeping it accessible only within your organization's internal network.
Limiting what actions an agent can perform based on its role and the sensitivity of the task.
A higher-capability version of a model designed for more demanding tasks, typically with better reasoning and language understanding than base versions.
Computing with randomness and probability distributions to achieve robustness, interpretability, and security in AI systems.
A structured representation showing how variables relate to each other and their probabilistic dependencies.
The geometric space of all valid probability distributions, where each point represents a probability vector summing to one.
The model's capacity to analyze difficult questions or technical challenges and work toward accurate, well-reasoned solutions.
A model trained to evaluate and score the quality of intermediate steps in a solution, rather than just checking if the final answer is correct.
System design that enforces constraints during reasoning steps rather than only filtering final outputs.
Code that is complete, tested, and formatted to standards suitable for immediate use in real applications.
An optimization method that updates inputs along gradients while constraining them to stay within a valid range.
The initial text you provide to a language model to guide what it should generate or complete.
Using descriptive text instructions to guide or control how a model generates output, such as specifying desired voice characteristics.
Designing the input text to a model in specific ways to improve the quality of its responses.
Selectively activating or deactivating task-specific prompts based on whether incoming data matches learned patterns.
A short instruction added to the beginning of input text that tells the model how to treat that text (for example, marking it as a 'query' versus a 'passage').
A model interaction style where you guide the model's output by providing minimal cues like clicks, boxes, or masks rather than detailed text instructions.
A way to control what a model does by giving it text instructions, rather than requiring code changes or separate training for different tasks.
A small-scale demonstration or experiment designed to test whether an idea or approach is feasible, rather than for production use.
The process of spreading information or edits from reference points (keyframes) to other frames in a sequence.
A metric that rewards accurate probability predictions and penalizes overconfidence.
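One well-known example of such a metric is the Brier score, sketched below with invented forecasts: mean squared error between predicted probabilities and 0/1 outcomes, so confident wrong predictions are penalized heavily.

```python
# Brier score, a proper scoring rule: lower is better.
# Forecasts and outcomes below are made up for illustration.

def brier_score(probs, outcomes):
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

# A calibrated 0.9 on a positive outcome costs little; an overconfident
# 1.0 on a negative outcome costs the maximum penalty of 1.0:
print(brier_score([0.9, 1.0], [1, 0]))  # ≈ 0.505
```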
Creating candidate regions or concepts from input (e.g., converting text queries into visual targets).
The process by which a protein chain folds into its three-dimensional structure, which is essential for the protein to function properly.
A neural network trained on large collections of protein sequences to learn patterns in amino acids, similar to how language models learn patterns in text.
Complete record of the origin, history, and context of data or findings, enabling reproducibility and traceability.
A framework where one agent proves claims and another verifies them to ensure correctness.
A model compression technique that removes unnecessary parameters or connections from a neural network to reduce its size and computational requirements.
Predicted labels assigned by a model to unlabeled data for semi-supervised learning.
A technique that improves search by automatically refining queries based on initial results, without human input.
A request to merge code changes from one branch into another, typically reviewed before acceptance.
A popular open-source framework for building and training neural networks, used to define how models are structured and executed.
A model saved in PyTorch's native format, allowing it to be loaded and run using the PyTorch deep learning framework.
A reinforcement learning algorithm that learns the value of actions in different states.
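The core of the algorithm is a single update rule, sketched here with toy integer states and actions:

```python
from collections import defaultdict

# Q-learning update:
#   Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
# States/actions are small integers purely for illustration.

def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

Q = defaultdict(float)                       # all values start at 0
q_update(Q, s=0, a=1, r=1.0, s_next=1, actions=[0, 1])
print(Q[(0, 1)])  # -> 0.5, halfway toward the target r + gamma * 0
```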
A lightweight connector module that bridges a frozen image encoder and a language model, translating visual information into a format the language model can understand.
A specific quantization method that represents model weights using 4-bit numbers instead of higher-precision formats, significantly reducing model size while accepting some loss in accuracy.
A benchmark dataset where models learn to determine whether a given sentence answers a given question, used to train models for question-answer relevance scoring.
The standard attention mechanism in transformers that becomes increasingly expensive as sequence length grows, because it compares every token to every other token.
Computational cost that grows with the square of input size, becoming impractical for large datasets.
An attraction to or emphasis on subjective experiences and qualitative aspects.
The task of assessing and scoring the quality, correctness, or alignment of text outputs, often used to filter or rank model responses.
The ability to understand and solve problems involving numbers, mathematics, and logical calculations.
Reducing a model's numerical precision (e.g., from 16-bit to 4-bit) to shrink memory usage and speed up inference.
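A toy sketch of one simple scheme, symmetric round-to-nearest quantization (the weights are invented; real quantizers are more sophisticated): values are mapped onto a small grid of integers and back, so each weight is stored only approximately.

```python
# Toy symmetric quantization: map weights onto 4-bit signed integers
# (levels -7..7), then reconstruct.  Example weights are made up.

def quantize_dequantize(weights, bits=4):
    qmax = 2 ** (bits - 1) - 1              # 7 for 4-bit signed
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]  # what would be stored
    return [qi * scale for qi in q]          # reconstructed floats

w = [0.70, -0.35, 0.10, -0.02]
print(quantize_dequantize(w))  # close to w, but not identical
```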
Fine-tuning a model while simulating low-precision arithmetic to maintain accuracy after quantization.
A training technique where a model learns to maintain performance even when its weights are compressed to use less memory and compute.
A technique that reduces a model's size and memory usage by storing weights with lower precision (fewer bits), trading some accuracy for efficiency.
Training a neural network while keeping weights and activations in reduced precision formats.
Using measurement results to adjust quantum system parameters in real-time to achieve desired outcomes.
The process of determining a quantum system's state from measurement data collected over time.
Optimization algorithms that approximate Newton's method using gradient information instead of full second derivatives.
A model that converts search queries into numerical representations (embeddings) that can be compared against a database of documents to find relevant matches.
A classification system categorizing what users actually want when they search.
A neural network component that acts as a bridge between an image encoder and language model, learning to extract and translate visual information into text-compatible representations.
An equivalence relation on rational points of algebraic varieties measuring when points are connected by rational curves.
Retrieval-Augmented Generation — a technique that grounds model responses in retrieved documents to improve accuracy.
A system that retrieves relevant documents or information from a database and feeds them to a language model to generate more accurate and grounded responses.
Systems combining retrieval of external documents with language generation for accurate answers.
Setting a model's weights to random values before training, creating an untrained model that produces meaningless output.
A dimensionality reduction technique using random matrices to efficiently approximate high-dimensional data with linear complexity.
A research method where participants are randomly assigned to use AI or not, to fairly measure the AI's actual impact.
A model whose weights have been set to random values instead of being trained on data, resulting in no learned patterns or knowledge.
Model parameters set to random values instead of being learned from training data, resulting in unpredictable and meaningless outputs.
A technique that uses wireless signals to measure both the distance to an object and how fast it's moving toward or away from you.
The relative ordering of values from smallest to largest, independent of their actual magnitudes.
The process of ordering search results by relevance, determining which documents best match a user's query.
A technique that takes an initial set of search results and reorders them by scoring their relevance to a query, typically to improve the quality of top results.
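A toy sketch of the idea (the word-overlap scorer below is a deliberately crude stand-in for a real cross-encoder): score each candidate against the query, then reorder so the best-scoring document comes first.

```python
# Toy reranker: score candidates by word overlap with the query,
# then sort descending.  A real system would use a learned scorer.

def overlap_score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

def rerank(query, docs):
    return sorted(docs, key=lambda d: overlap_score(query, d), reverse=True)

docs = ["cats sleep a lot", "how neural networks learn", "networks of roads"]
print(rerank("how do neural networks learn", docs)[0])
# -> "how neural networks learn"
```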
An AI task where a model answers questions based on provided text passages.
Processing and generating predictions on data as it arrives, with minimal delay, rather than in batches.
The ability to access and incorporate current information from the web or live data sources rather than relying solely on training data from a fixed point in time.
The ability to query current web information during inference, allowing a model to access and use the latest data when answering questions.
The ability to search the internet during inference to retrieve current information rather than relying only on knowledge from training data.
The model's ability to work through multi-step logical problems and provide justified answers rather than just pattern-matching.
An AI component designed to work through complex problems step-by-step, often as part of a larger system that coordinates multiple agents.
The model's ability to work through multi-step problems methodically and show its thinking process rather than jumping to answers.
A model's ability to work through multi-step logical problems and produce coherent explanations for its answers.
The model's ability to perform complex logical thinking and problem-solving tasks beyond simple pattern matching.
A step-by-step explanation of how a model arrives at an answer, showing its intermediate thinking before the final result.
A sequence of logical steps a model follows to work through a problem methodically rather than jumping directly to an answer.
A model's ability to perform complex multi-step logical thinking and problem-solving; typically increases with model size.
A configurable setting that controls how much computational time a model spends thinking through a problem before generating its response.
The core component of a model that performs step-by-step logical thinking and problem-solving before generating a response.
A special mode where the model takes extra time to think through problems step-by-step before answering, rather than responding immediately.
A model trained to show explicit step-by-step reasoning and problem-solving logic before producing final answers, rather than jumping directly to conclusions.
The internal process a model uses to think through a problem step-by-step, integrating information and tool outputs to arrive at conclusions.
An internal step where the model thinks through a problem before generating its final answer, allowing it to work through complex logic more carefully.
Problems that require a model to think through multiple steps logically to arrive at an answer, rather than just pattern-matching.
The visible record of a model's intermediate thinking steps and logic, allowing users to inspect how the model arrived at its conclusion.
A retrieval method that uses an agent's explicit reasoning steps alongside its query to find more relevant documents.
A model specifically trained to work through multi-step logical problems methodically rather than generating quick responses.
A model designed to allocate extra computational resources to logical problem-solving and step-by-step analysis rather than raw speed or breadth of knowledge.
A model architecture optimized to work through problems step-by-step using logical inference rather than relying primarily on pattern matching from training data.
Training methods designed to improve a model's ability to work through multi-step logic and solve complex problems systematically.
The region of input data that a neuron responds to or influences.
The difference between original data and its reconstructed version from an autoencoder, used to identify anomalies or unusual patterns.
An agent's ability to recognize mistakes, backtrack, and explore alternative solutions when initial approaches fail.
A neural network design where information flows in loops, allowing the model to process sequences step-by-step while maintaining memory of previous inputs.
Neural networks with loops that process sequences by maintaining memory of past inputs.
A feedback mechanism where outputs reinforce or modify previous states over time.
A hybrid neural network design that combines recurrent processing (which maintains memory across sequences) with attention mechanisms, enabling better memory efficiency than standard transformers.
A neural network design that combines recurrent elements with other architectural components to process sequential data more efficiently than standard transformers.
Iteratively applying the same computation multiple times with parameter sharing to increase model depth without adding parameters.
A simplified version of a complex system that captures essential behavior with fewer variables.
A process where AI systems review past results, identify errors, and extract generalizable patterns to improve future performance.
The process of an agent analyzing its past actions and environment feedback to extract lessons for improving future behavior.
A transformer-based model design that uses locality-sensitive hashing and reversible layers to efficiently process long sequences with reduced memory requirements.
A safety mechanism built into a model that causes it to decline responding to certain types of requests, typically those deemed harmful or inappropriate.
The ability to identify when a model declines to answer a request, which can indicate the model recognized a harmful or unsafe prompt.
The learned behavior that causes a language model to decline harmful requests.
Built-in safety features that cause a model to decline responding to certain types of requests, such as those involving harmful, illegal, or unethical content.
Identifying distinct market states or conditions (e.g., stable vs. volatile) to apply different prediction strategies appropriately.
The ability to analyze and understand specific areas or sections of an image rather than just the image as a whole.
When a fix or change breaks functionality that was previously working, causing previously-passing tests to fail.
Identifying when code changes break previously working functionality.
The cumulative difference between an algorithm's performance and the best fixed action in hindsight.
A training method where a model learns by receiving rewards or penalties for its outputs, encouraging it to improve its behavior over time.
A training technique where human evaluators rate model outputs, and the model learns to produce responses that humans prefer.
A post-training approach for language models using rewards that can be objectively verified, like correctness on benchmarks.
A task where a model identifies and extracts meaningful connections between entities in text, such as which drugs treat which diseases.
The process of ordering search results by how well they match a user's query, with the most relevant results appearing first.
Assigning a numerical score to indicate how well a document matches or answers a given query.
Rewriting a model's weights in a different mathematical form to improve training efficiency or stability.
Storing and retraining on samples from previous tasks to prevent forgetting during continual learning.
Systematic skew in data caused by what people choose to record or report.
The ability to understand and reason about code across multiple files and folders in a codebase, not just isolated code snippets.
Training a model to convert raw data into meaningful internal representations useful for downstream tasks.
A model trained to convert raw input (like music or text) into meaningful numerical patterns that capture important features, rather than generating direct outputs like text or classifications.
The high-dimensional mathematical space where a model internally encodes and processes information about text.
The geometric structure of how neural networks organize and represent information in their learned feature spaces.
The internal geometric structure of how a model encodes and processes information.
The ability to recreate the same results by using the same training data, methods, and documentation.
The process of analyzing an incoming query to determine its type, complexity, or intent so it can be handled by the right model or pipeline.
The process of tracking and organizing what a software product needs to do, which AI can help automate.
The process of defining, documenting, and managing software system requirements from stakeholders.
The ability to track how design decisions and parameters connect back to original system requirements and design intent.
A model that takes an initial set of search results and reorders them by relevance, typically used to refine results from a faster but less accurate retrieval system.
A technique that takes an initial set of search results and reorders them by relevance score, typically to improve the quality of top results.
A neural network architecture that uses skip connections to allow information to bypass layers, making it easier to train very deep networks and improving performance.
A learned correction layer that outputs small adjustments on top of a baseline controller.
Hardware with limited memory, processing power, or battery life, requiring models to be optimized for efficiency.
The process of finding and returning relevant documents or information from a database based on a query.
Training technique that supplements data by finding and using similar examples from a database to improve model generalization.
A model designed to find and rank the most relevant documents or passages from a large collection based on a query.
A system that finds and ranks relevant documents or information in response to a query, often used in search and question-answering applications.
Finding the most relevant documents or text passages from a large collection based on a user's query.
A technique that enhances AI systems by first searching for relevant information from a database before generating responses, improving accuracy and relevance.
A technique that allows a model to search and reference external documents or knowledge bases to answer questions more accurately and with citations.
A model specifically trained to find and rank relevant documents or passages in response to search queries, rather than generate new text.
A task where the model needs to search through and extract relevant information from large amounts of text, rather than generating new content from scratch.
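The retrieve-then-generate pattern behind several of the entries above can be sketched in a few lines. Here word-overlap scoring stands in for a real embedding-based retriever, and a string template stands in for the generator; the function names and documents are made up for illustration:

```python
def retrieve(query, documents, top_k=1):
    # Score each document by word overlap with the query (a crude stand-in
    # for embedding-based retrieval), then return the best matches.
    q = set(query.lower().split())
    scored = sorted(documents, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:top_k]

docs = [
    "The Eiffel Tower is in Paris.",
    "Photosynthesis converts light into chemical energy.",
]
# Retrieve first, then condition the "generator" on the retrieved context.
context = retrieve("Where is the Eiffel Tower?", docs)[0]
answer = f"Based on: {context}"
```

Real systems replace the overlap score with vector similarity and pass the retrieved passages into a language model's prompt, but the two-stage shape is the same.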
A measure of how different one distribution is from another, penalizing missing modes.
When an agent exploits loopholes in the reward system to maximize score without actually solving the intended task.
A learned function that predicts how good an action or outcome is, used to guide policy improvement.
Training a model to predict human preferences so it can score outputs and guide AI training through reinforcement learning.
Feedback that tells an AI agent how well it performed on a task, guiding learning.
A measure of how reward quality and model confidence vary together, used to adjust training baselines.
Mathematical framework for studying curved spaces and their intrinsic properties, used here to analyze neural representation structure.
Investment returns measured relative to the risk taken, balancing profit with stability.
Reinforcement Learning from Human Feedback — a training technique that aligns model outputs with human preferences.
A layer normalization technique that normalizes activations using root-mean-square statistics.
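As an illustrative sketch (assuming a small `eps` term for numerical stability, which most implementations add), RMS normalization rescales each activation vector by its root-mean-square without subtracting the mean:

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # Root-mean-square of the activation vector, with eps for stability.
    rms = np.sqrt(np.mean(x ** 2) + eps)
    # Unlike LayerNorm, there is no mean subtraction: only rescaling
    # plus a learned per-dimension gain.
    return gain * x / rms

x = np.array([1.0, 2.0, 3.0, 4.0])
y = rms_norm(x, gain=np.ones(4))
```

After normalization the output vector has a root-mean-square of approximately 1, regardless of the input's scale.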
A neural network that processes sequences and outputs predictions in real time as the data streams in.
A transformer-based neural network design that learns to understand language by predicting masked words in text, improving upon the original BERT model.
A transformer-based neural network architecture optimized for understanding language through masked language prediction during training.
The ability to understand and execute physical tasks involving grasping, moving, and interacting with objects in the real world.
Combining updates from multiple sources in a way that resists manipulation by malicious participants.
A system's ability to maintain performance when inputs are corrupted, noisy, or different from training conditions.
A security system that restricts what different users can do based on their assigned role (e.g., admin, viewer, editor).
Robot Operating System 2, a middleware framework for building robot software with standardized communication patterns.
The mechanism that decides which specialized sub-networks (experts) should process each input in a mixture-of-experts model.
The decision-making component in a mixture-of-experts model that determines which experts should process each input token.
A lightweight model that analyzes incoming requests and directs them to the most appropriate downstream model or system rather than processing them directly.
The computational cost added by the mechanism that decides which experts should process each input in a mixture-of-experts model.
A lightweight decision mechanism that determines which computation path to take based on input conditions.
A scoring guide that defines criteria and quality levels for evaluating student work or AI-generated responses.
Automatically creating evaluation criteria and scoring guidelines that judges use to assess output quality.
An explicit agreement between components defining inputs, outputs, and behavior expectations during execution.
The ability for different systems to work together and exchange data dynamically during execution.
A safe, fast file format for storing model weights, designed to prevent code execution vulnerabilities.
A secure and efficient file format for storing model weights that prioritizes safety and speed when loading models.
A machine learning task that assigns content to categories based on whether it poses safety risks or harms.
A machine learning model trained to identify and flag harmful, inappropriate, or policy-violating content in text.
The process of testing and assessing whether a model produces harmful, unsafe, or undesirable outputs.
Built-in restrictions or filters that prevent a model from generating harmful, illegal, or unethical content.
A specialized AI model trained to identify and classify unsafe, harmful, or policy-violating content rather than generate general responses.
A training process that teaches a model to refuse harmful requests and avoid generating unsafe content by reinforcing safer behaviors.
A model trained to avoid harmful outputs and refuse unsafe requests, making it more cautious and responsible in its responses.
How noticeable or important something is to a model's or a person's attention.
Measuring feature changes while prioritizing visually important regions, ensuring quality preservation in salient areas.
The task of automatically identifying and locating the most visually prominent or important objects in an image.
The number of environment interactions (samples) an algorithm needs to learn a good policy.
How well a model learns from a small amount of training data.
The number of times per second that an audio signal is measured and recorded; 44.1 kHz means 44,100 samples per second, a standard for high-quality audio.
A technique that directs different training examples to different optimization methods based on their characteristics or correctness.
Control systems where inputs are updated at discrete time intervals rather than continuously.
Running agent actions in an isolated environment to prevent them from accessing or damaging other systems.
A specialized neural network design that transforms sentences into vector representations capturing semantic meaning, by pairing a transformer model with pooling techniques.
A mathematical framework that analyzes images at multiple resolutions to reveal hierarchical information.
How a model's performance and capabilities change as you increase its size, training data, or computational resources.
Patterns that describe how a model's performance improves as you increase its size, training data, or compute resources.
The study of how model performance changes as you increase the number of parameters, training data, or compute resources.
A collection of models of different sizes trained identically to study how capabilities improve as models grow larger.
A structured representation of a scene using nodes for objects and edges for spatial relationships between them.
Assigning tasks and resources to specific times and locations to optimize execution efficiency.
Information about a database's structure (tables, columns, relationships) provided to the model to help it generate correct queries.
Incompatibility between data formats when different services exchange information.
Changes to the structure or format of data that can cause AI models to fail or perform poorly.
A correction term added during the reverse process to guide noise removal toward realistic data.
A model designed to assign numerical scores to inputs (like relevance scores for passages) rather than generate new text.
An attention mechanism that evaluates each key against an explicit threshold to determine relevance, rather than redistributing fixed attention mass across all keys.
A language model enhanced with the ability to retrieve and incorporate live information from the web before generating responses.
A technique where only a subset of a model's weights are used for each input, rather than activating all parameters, which reduces memory usage and speeds up inference.
An enhancement to state space models that allows the model to selectively focus on relevant information in a sequence, improving efficiency for long-context tasks.
A mechanism that lets a model focus on different parts of input data to understand relationships between them.
A generative model that uses its own previous outputs to guide learning of different behavioral patterns.
A technique where a model generates multiple responses and uses agreement among them to improve answer reliability.
Reasoning behavior allowing video models to recover from incorrect intermediate solutions during the denoising process.
A training method where a model learns from its own predictions at the token level, providing fine-grained feedback.
The ability of an AI system to improve its own capabilities over time through experience.
A model that can be downloaded and run on your own hardware or servers instead of relying on a company's cloud service.
Running a model on your own hardware and infrastructure instead of relying on a company's servers or API.
Running a model on your own hardware or servers rather than accessing it through a cloud service or API.
Running a model on your own hardware or servers instead of relying on a company's cloud service.
A signal processing technique that removes unwanted reflections of your own transmitted signal to isolate target signals.
Training method where a model plays against itself or generates both solutions and evaluations, which risks the model learning to exploit its own weaknesses.
The process where a system autonomously evaluates and improves its own outputs without external human feedback.
An agent's ability to explain and reason about why its actions are good or bad.
A training approach where a model learns patterns from unlabeled data by creating its own learning targets, such as predicting hidden parts of the input.
Training a model on unlabeled data using the data itself to create learning signals, without manual annotations.
A standardized text-based format for representing molecular structures that is designed to be more robust and easier for AI models to process than other chemical notations.
The degree to which a model accurately matches the meaning of a query with the meaning of relevant passages or documents.
Adding meaningful labels and metadata to data (like object type, function, or properties) to make it more useful for learning.
A technique that stores and reuses previous responses for new queries that have similar meaning, reducing redundant computation.
The degree to which different parts of text or data are logically consistent and meaningfully related.
Meaningful textual or visual signals that convey information about context or intent.
Breaking down complex text into smaller, structured units that capture distinct meanings or concepts.
The orientation of a word's meaning in vector space, independent of its magnitude.
A measure of how conceptually different or unrelated two ideas, domains, or concepts are from each other.
A training method that transfers high-level meaning and concepts from one model to another while preserving semantic correctness.
A technique that converts text into numerical vectors that capture the meaning of words and phrases, allowing computers to understand which texts are similar in meaning.
Numerical representations that capture the meaning of text or audio, allowing the model to understand that similar concepts are close together in this representation space.
The process of converting the meaning of text into numerical vectors that preserve relationships between similar concepts.
The property that two implementations produce identical behavior and results despite differences in code or architecture.
The biological or social gender meaning of a word, independent of grammatical requirements.
Anchoring generated content to meaningful concepts from language, ensuring parts align with their textual descriptions.
Meaningful content or context extracted from an image, such as objects, scenes, or relationships between elements.
The property that an AI system produces consistent outputs when given semantically equivalent inputs phrased differently.
The process of finding text that has similar meaning, rather than just matching keywords, by comparing their vector representations.
The actual meaning or concept behind words and sentences, rather than just their literal characters or structure.
Predicting which 3D spatial locations are occupied and what semantic class (car, pedestrian, etc.) occupies them.
Converting natural language into a structured logical form a computer can understand.
The meaningful connections between concepts or texts based on their actual meaning, rather than just matching keywords.
A numerical encoding that captures the meaning and context of text rather than just its surface-level words, enabling the model to understand that similar concepts have similar representations.
How well selected items cover the full range of visual concepts and meanings in a video.
Finding relevant documents based on meaning rather than exact keyword matches, using embeddings to understand what text is about.
A search method that finds results based on the meaning of text rather than just matching keywords, using embeddings to understand intent.
Dividing video or images into meaningful regions and assigning labels to understand what each region represents.
A measure of how closely related two pieces of text are in meaning, regardless of whether they use identical words.
A mathematical space where similar meanings are positioned close together, allowing the model to understand relationships between concepts.
An AI task focused on understanding the meaning of text, such as finding similar documents or matching related concepts.
A task that measures how closely two pieces of text match in meaning, regardless of whether they use the same words.
Grouping tokens with similar meanings together to assess whether a model's prediction is semantically coherent.
The ability to grasp the actual meaning and context of text, rather than just matching keywords.
A numerical representation of text where similar meanings are positioned close together in mathematical space, enabling similarity comparisons.
A numerical encoding of text where similar meanings are positioned close together in mathematical space, enabling the model to understand relationships between concepts.
Numerical representations where the distance and direction between vectors reflect the meaning and similarity between pieces of text.
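A minimal sketch of how such embeddings support similarity comparison, using cosine similarity over made-up 3-dimensional vectors (real embeddings typically have hundreds or thousands of dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    # Angle-based similarity: close to 1.0 means the vectors point in
    # nearly the same direction, close to 0 means unrelated.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy embeddings: "dog" and "puppy" point in similar directions, "car" does not.
dog   = np.array([0.9, 0.8, 0.1])
puppy = np.array([0.8, 0.9, 0.2])
car   = np.array([0.1, 0.2, 0.9])

assert cosine_similarity(dog, puppy) > cosine_similarity(dog, car)
```

This is the comparison that semantic search and semantic similarity tasks perform at scale over millions of stored vectors.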
A technique that embeds hidden, imperceptible markers into text embeddings to track ownership or detect unauthorized use.
Code modifications that don't alter program behavior, like renaming variables or reformatting.
Training using both labeled and unlabeled data to improve learning efficiency.
Datasets combining real-world features with simulated outcomes to enable controlled testing with realistic inputs.
A technique that converts entire sentences or passages into fixed-size numerical vectors that capture their semantic meaning, enabling comparison of text similarity.
Dense numerical representations of entire sentences that capture their semantic meaning, allowing comparison of how similar different sentences are.
A model that converts text sentences into numerical vectors (embeddings) that capture their semantic meaning, enabling comparison of how similar different sentences are.
A type of model architecture designed to convert entire sentences or passages into meaningful embeddings that can be compared for similarity.
A framework that fine-tunes transformer models to produce meaningful embeddings of entire sentences or paragraphs, rather than just individual tokens.
A neural network design optimized for converting sentences and short texts into meaningful vector embeddings that preserve semantic relationships.
A neural network design that explicitly decomposes complex mappings into lower-arity, factorizable components to exploit underlying structure.
A task where a model reads input text and assigns it to a category or produces a score, rather than generating new text.
A technique that reduces the length of input data while preserving its essential meaning, making processing faster and requiring less memory.
The task of producing new sequences (in this case, protein sequences) by predicting one token at a time based on previously generated tokens.
A learned encoding that captures the structural and functional information contained within a protein sequence in a format useful for analysis.
A model architecture that takes a sequence of input tokens and produces a sequence of output tokens, commonly used for tasks like translation and summarization.
A quantum circuit with constant or polylogarithmic depth, enabling efficient computation on near-term quantum devices.
A mathematical measure of randomness in text; in security contexts, high entropy can suggest randomly generated domain names.
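Shannon entropy over character frequencies can be computed in a few lines; the example strings below are illustrative:

```python
import math
from collections import Counter

def shannon_entropy(text):
    # Entropy in bits per character: -sum(p * log2(p)) over character frequencies.
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

low = shannon_entropy("aaaaaaaa")     # one repeated symbol: 0 bits
high = shannon_entropy("x7qz94kfj2")  # varied symbols: higher entropy
```

A string of one repeated character scores 0 bits per character, while a string where every character is distinct scores log2 of its alphabet size.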
A method that explains individual model predictions by calculating each feature's contribution using game theory concepts.
A common mathematical space where different types of data (text and audio) are represented so that related concepts from each type are positioned near each other.
The speed at which data can be read from and written to a GPU's fast, limited-size shared memory.
Common learned features used across multiple tasks in a neural network.
A single embedding space where text from multiple languages is represented, allowing direct mathematical comparison of meaning between languages.
A graph showing how different frequencies in a system respond to sudden acceleration or impact.
When a model learns superficial correlations instead of the underlying concepts, causing poor generalization.
A training method that aligns images and text by learning to match their representations, using a sigmoid loss function instead of the traditional softmax approach.
Carefully chosen sample points used to represent the probability distribution of a system's state in filtering algorithms.
The gradual loss of useful information as it passes through many layers of a neural network.
A formal language for specifying time-dependent constraints like "reach goal within 10 seconds" or "avoid obstacles until task completion."
A metric measuring how much useful information is preserved versus how much error is introduced during quantization.
Performance difference when a trained policy transfers between two different environment implementations.
A contrastive learning technique that trains models to recognize when two slightly different versions of the same sentence are similar, improving semantic understanding.
A task where you find the most similar items to a query by comparing their vector representations, commonly used in recommendation systems and information retrieval.
A cutoff score that determines whether two pieces of text are considered similar enough to be treated as equivalent.
A model that processes only one type of input (like text) rather than multiple types (like text and images combined).
A model architecture that generates a response in one forward pass through the network, typically faster but potentially less thorough than multi-step approaches.
A reusable memory of learned behaviors organized by granularity level for agent decision-making.
Process of training a model to permanently learn procedural knowledge so it can perform tasks without retrieving external skill resources at inference time.
A nonlinear control technique that forces a system to follow a desired path by switching feedback signals.
A mechanism that limits attention to a fixed-size window of recent tokens rather than all previous tokens, reducing computational cost while maintaining context awareness.
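A toy sketch of the attention mask this produces, where each token may attend only to itself and the few tokens immediately before it (the window size here is arbitrary):

```python
import numpy as np

def sliding_window_mask(seq_len, window):
    # mask[i, j] is True when token i may attend to token j: j must be at
    # or before i (causal) and no more than window - 1 positions behind it.
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (i - j < window)

mask = sliding_window_mask(6, window=3)
```

With a fixed window, the number of attended positions per token stays constant as the sequence grows, which is the source of the computational savings.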
Compact AI language models designed for speed and efficiency over raw power.
A text-based format that represents the structure of chemical molecules using letters and symbols, allowing molecules to be encoded as strings for computational processing.
Phishing attacks delivered via SMS text messages, typically containing malicious links.
A measure of how quickly a loss function's gradient can change; smaller is better for stable training.
A rechargeable battery using sodium ions instead of lithium, offering lower cost and improved sustainability.
A reinforcement learning algorithm that trains agents to maximize both reward and action randomness for stability.
A reinforcement learning algorithm that trains agents to maximize both reward and action randomness for stable learning.
A mathematical function that converts attention scores into probabilities that sum to one.
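A minimal NumPy sketch of softmax, with the usual max-subtraction trick for numerical stability:

```python
import numpy as np

def softmax(scores):
    # Subtract the max before exponentiating so large scores don't overflow;
    # this leaves the resulting probabilities unchanged.
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
```

The outputs always sum to one and preserve the ordering of the input scores, which is why softmax attention makes relative rather than absolute relevance judgments.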
Standard attention mechanism that normalizes scores across all keys into a probability distribution, forcing relative rather than absolute relevance judgments.
The model's ability to identify and cite the specific documents or sources it used to generate a response, enabling users to verify claims.
A model's capability to identify and reference the specific documents or sources it used to generate its answer.
The practice of anchoring a model's responses to specific, cited sources rather than relying solely on its training data, improving factual accuracy and verifiability.
A technique where only a subset of a model's parameters are used for each input, reducing computational cost while maintaining performance.
A model design where not all parameters are used for every computation, reducing memory and computational requirements compared to dense models.
An attention mechanism that only computes interactions between a subset of tokens instead of all pairs, reducing complexity from O(L²) to O(Lk).
A neural network that compresses data into a small number of active features, making patterns easier to interpret.
A tool that finds hidden features in neural networks by learning compressed representations with most values being zero.
Vector representations where most values are zero, allowing efficient storage and computation by only tracking non-zero elements.
An architecture where only a subset of the model's specialized sub-networks (experts) activate for each input, reducing computation while maintaining capability.
A model that activates only a subset of its parameters for each input, rather than using all parameters every time, which reduces computational cost.
A mixture-of-experts design where only a small fraction of the model's parameters are used for each prediction, reducing computational cost while maintaining model capacity.
A technique where only a small portion of a model's total parameters are used during inference, reducing computational cost while maintaining model capacity.
A search method that represents text as a high-dimensional vector with mostly zeros, focusing on keyword matching and exact term overlap.
A reinforcement learning setting where the agent receives reward signals only rarely, making exploration particularly challenging.
A reinforcement learning setting where the agent receives feedback infrequently, making learning difficult.
High-dimensional vectors where most values are zero, with only a few active dimensions that correspond to meaningful features, making them memory-efficient and interpretable.
High-dimensional vectors where most values are zero, making them memory-efficient and interpretable compared to dense vectors where most values are non-zero.
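One common way to exploit this sparsity is to store only the non-zero entries, for example as a dictionary mapping dimension index to value; the vectors below are made up for illustration:

```python
def sparse_dot(a, b):
    # Sparse vectors stored as {dimension_index: value}; only dimensions
    # that are non-zero in BOTH vectors contribute to the dot product.
    if len(b) < len(a):
        a, b = b, a  # iterate over the smaller vector
    return sum(v * b[i] for i, v in a.items() if i in b)

# Nominally high-dimensional vectors with only a handful of active dimensions.
doc   = {12: 0.5, 403: 1.2, 9001: 0.3}
query = {403: 0.8, 777: 0.4}
score = sparse_dot(doc, query)  # only dimension 403 overlaps
```

The cost scales with the number of non-zero entries rather than the nominal dimensionality, which is what makes sparse retrieval methods fast.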
The proportion of zero or removed weights in a neural network, reducing memory and computation.
A technique that uses spatial information to guide which parts of a video frame correspond to which agent or subject.
Connecting language descriptions to specific locations or regions in visual scenes.
When an AI incorrectly imagines objects or details in the wrong locations in an image.
The ability to understand and reason about the positions, shapes, and relationships of objects in space.
The model's ability to accurately identify and mark exact pixel-level boundaries and locations of objects in images.
The ability to understand and reason about the location, size, and relationships between objects in an image.
Attention mechanism that processes both spatial (image) and temporal (time) dimensions to understand relationships across frames.
Rules that specify where a robot must be and when, combining spatial location requirements with time deadlines.
Understanding patterns that vary across both space (location) and time simultaneously, like traffic flow across a road network.
Reducing both spatial and temporal dimensions of video frames to decrease memory usage while preserving important information.
Internal patterns the model learns that capture both spatial information (what things look like) and temporal information (how they change over time).
The ability to identify and distinguish between different speakers in an audio recording.
A task that identifies or confirms whether audio was spoken by a specific person, using characteristics unique to that person's voice.
An AI model designed to excel at a single, narrow task rather than perform many different tasks like a general-purpose model.
Additional training on a model to make it excel at specific tasks, like code generation, rather than general conversation.
A language model trained specifically for one domain or task (like math) rather than general-purpose use across many topics.
A language model trained specifically to excel at one task or domain (like mathematics) rather than performing well across many different tasks.
Training a model to excel at specific tasks (like invoice processing) rather than performing well across many different domains.
A design approach where explicit specifications serve as contracts between designers and tools, maintaining traceability from requirements to implementation.
Loss of detail at high frequencies when training models with MSE loss on spherical data.
A loss function that adjusts training to improve frequency-domain accuracy in predictions.
Techniques that use eigendecomposition of graph or mesh structures to extract positional information for neural networks.
Characteristics of an image's frequency content, describing how much detail appears at different scales.
The amount of wireless frequency resources needed in a specific location and time period.
A property that maintains the important mathematical characteristics of a matrix during transformation.
A technique where a smaller model quickly drafts multiple token predictions ahead of time, which a larger model then verifies, reducing the total time needed to generate text.
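A toy greedy version of this draft-and-verify loop, with hypothetical `draft_next` and `target_next` functions standing in for the small and large models (real systems verify probabilistically over token distributions, not by exact match):

```python
def speculative_decode(prompt, draft_next, target_next, k=4, max_new=8):
    out = prompt
    while len(out) < len(prompt) + max_new:
        # 1. The small draft model proposes k tokens cheaply.
        draft, ctx = [], out
        for _ in range(k):
            c = draft_next(ctx)
            draft.append(c)
            ctx += c
        # 2. The large target model verifies; keep the longest agreeing prefix.
        ctx = out
        for c in draft:
            if target_next(ctx) != c:
                break
            ctx += c
        # 3. Guarantee progress: emit at least one target-model token.
        if ctx == out:
            ctx += target_next(out)
        out = ctx
    return out[:len(prompt) + max_new]

# Toy "models": each returns the next character of a fixed string;
# the draft model errs at position 4 ("X" instead of "e").
target_next = lambda s: "abcdefghijklmnop"[len(s)]
draft_next  = lambda s: "abcdXfghijklmnop"[len(s)]
result = speculative_decode("", draft_next, target_next)
```

When the draft agrees, several tokens are accepted per expensive target-model step; when it errs, only one token is emitted, so the output always matches what the target model alone would produce.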
The ability to process and comprehend spoken language or audio signals, converting them into meaningful interpretations or responses.
Numerical representations of audio that capture the meaningful features of speech in a compact form, useful for tasks like speaker identification or speech similarity.
A learned numerical encoding of audio that captures meaningful speech patterns and can be used as input for other AI tasks.
A neural network trained to convert raw audio into meaningful vector representations that preserve information about speech content and speaker identity.
An AI model that can process and understand spoken audio directly, without needing to convert speech to text first.
The process of converting spoken audio into written text.
Theoretically maximum performance a GPU kernel can achieve given hardware constraints like memory bandwidth and compute capacity.
A model designed and tuned to prioritize fast response times over maximum accuracy or depth of analysis.
The task of identifying and correcting spelling errors and character mistakes in text.
A neural retrieval method that combines transformer models with sparse, interpretable outputs by mapping embeddings directly to vocabulary tokens.
A neural network architecture where different layers run on different machines to preserve privacy during federated training.
An AI model that understands spoken input and generates spoken responses for interactive conversations.
A token inserted during generation (e.g., <10.6 seconds>) that helps a model track elapsed speaking time.
Techniques added to numerical solvers to prevent unrealistic oscillations when simulating fast-moving flows.
Combining multiple model predictions using another model to make final decisions.
A probabilistic graphical model that extends Bayesian networks by grouping variables into stages to capture context-specific conditional dependencies.
Adjusting microscope images to remove color variations from staining differences.
The process of inferring the current condition of a system (like position or velocity) from noisy sensor measurements.
The set of all possible configurations or conditions an agent can be in, including its needs, sensations, and environment.
A type of neural network architecture that processes sequences by maintaining and updating an internal state, offering an alternative to transformer-based attention mechanisms.
A neural network architecture that processes sequences by tracking hidden states over time, offering faster inference and lower memory use than traditional transformers.
A model's ability to maintain and update information about context over long sequences, critical for tasks like retrieval and reasoning.
Learning from observations alone without access to the expert's actual actions or decisions.
An alternative to transformers that processes sequences more efficiently by maintaining a hidden state that gets updated as it reads each token.
Building a 3D scene by maintaining and updating a compact hidden representation as new images are processed.
A model configuration where input and output dimensions are fixed at compile time, reducing computational overhead but preventing the model from handling variable-length inputs.
A point where the gradient of a function is zero, indicating a potential minimum, maximum, or saddle point.
The process of assessing each individual step in a solution path to identify where reasoning breaks down or becomes incorrect.
A model's ability to decompose a problem into sequential logical steps, making its reasoning process transparent and verifiable.
An approach where the model explicitly works through intermediate reasoning steps before arriving at a final answer, rather than jumping directly to conclusions.
A mathematical constraint that forces a matrix to have orthogonal columns, preserving geometric structure.
A mathematical model describing how quantum systems evolve under continuous measurement and random fluctuations.
Optimization methods that use noisy or approximate gradients instead of exact ones to handle large datasets.
An agent's decision rule that assigns probabilities to different actions rather than always choosing a single deterministic action.
Periodically returning a learning process to an initial state with random timing to accelerate optimization.
Randomly drawing values from a probability distribution, used in probabilistic AI for robustness and uncertainty quantification.
Randomness or unpredictability built into a process or model.
Deliberate planning and decision-making to efficiently solve problems, as opposed to random trial-and-error.
Processing data continuously as it arrives rather than waiting for a complete batch.
Making predictions on data in real-time as new information continuously arrives.
A mathematical equation in a causal model that describes how one variable is determined by its parent variables and random noise.
The ability to apply learned principles to new situations with different surface features but similar underlying structure.
Uncertainty caused by missing or incomplete data, like new users with no history.
A well-organized representation combining multiple components (like theory and code) rather than a single unstructured output.
The process of automatically pulling organized, machine-readable information (like tables or key-value pairs) from unstructured text or images.
Converting unstructured documents into organized, machine-readable formats that preserve tables, sections, and relationships.
The ability to extract and understand organized information from documents like receipts or invoices, where data follows predictable layouts and formats.
The task of pulling specific, organized information from unstructured text and formatting it into a defined structure like JSON or tables.
Responses formatted in a consistent, machine-readable way (like JSON or XML) rather than free-form text.
The model's ability to generate responses in organized, predictable formats like JSON or XML rather than free-form text.
Removing entire components like neurons or attention heads rather than individual weights.
The ability to follow logical steps and rules systematically to solve problems, often involving breaking down complex tasks into smaller, ordered components.
Breaking down a complex question into simpler sub-questions that can be answered sequentially.
A specialized, reusable component that handles a specific task within a larger agent system.
Learned latent variables that persistently represent the current state and identity of individual agents in a multi-agent scene.
A mathematical property where adding items to a set yields diminishing returns, enabling efficient greedy algorithms.
Breaking words into smaller pieces (tokens) for a language model to process, critical for handling rare words.
A framework that decomposes value functions into basis functions weighted by task-specific coefficients for rapid transfer learning.
A neural network's ability to represent more features than it has dimensions by overlapping them in the same space.
Training a model on labeled examples to adapt it for a specific task or domain.
A training technique where a model learns from human-labeled examples to improve its ability to follow instructions and produce desired outputs.
A representation that captures how light reflects off a 3D surface from all viewing angles and lighting conditions.
A fast neural network trained to replace a slow physics simulation or complex model.
Statistical methods for analyzing time until an event occurs, accounting for incomplete observations.
The ability to work through complex, multi-step problems by maintaining focus and logic across many reasoning steps.
Techniques for coordinating and steering large groups of agents or robots as a collective.
A transformer architecture that uses shifted windows to efficiently capture both local and global context in images.
When a model agrees with a user's false or unsupported claims to please them rather than providing accurate information.
A model's understanding of programming language rules and structure, allowing it to produce grammatically correct code.
Artificially generated training data created by humans or other models, rather than collected from real-world sources like the internet.
Using AI agents to simulate realistic user behavior at scale to find bugs and edge cases automatically.
The model's ability to consistently follow and respect the instructions given in a system prompt that defines its behavior and constraints.
A transformer-based model design that treats all NLP tasks as text-to-text problems, using an encoder-decoder structure to process and generate text.
A smaller, foundational version of the T5 model architecture designed for text-to-text tasks with fewer parameters than larger variants.
Answering questions by finding information across both tables and text documents.
Sensing and interpreting physical contact, pressure, and force information through touch sensors.
The percentage of correct answers a model produces on a benchmark, measured by standard evaluation metrics.
Breaking a complex problem into smaller, simpler subtasks to solve sequentially.
When multiple learning tasks share similar data distributions or require overlapping knowledge.
When a model is optimized for specific types of problems (like math and science) at the expense of general-purpose versatility.
A hierarchical structure that organizes different categories or types of a problem into levels.
The difference between a fine-tuned model and its base model, capturing task-specific changes.
Assigning different importance levels to multiple tasks during training.
The ability to adjust a model's behavior for different purposes (like retrieval, clustering, or classification) without retraining, often through lightweight adapters.
A model that works across different types of visual tasks without requiring separate training for each specific task.
Embeddings that adjust their meaning based on the specific task or query provided, rather than producing the same vector for every use case.
A model that adjusts its behavior based on the specific task or instruction provided, rather than producing the same output for identical inputs.
Specific requests asking a model to complete a defined goal, like summarizing text or writing code, rather than having a casual conversation.
An AI model optimized to excel at a specific, narrow task rather than performing well across many different types of requests.
Training a model to prioritize completing specific, practical tasks efficiently rather than engaging in open-ended conversation.
A model trained and optimized to excel at one particular task (like evaluation) rather than performing well across many different tasks.
Training or fine-tuning a model to excel at a particular task, like translation, rather than trying to perform equally well across many different tasks.
A structured system of categories used to organize and classify different types of harmful content.
Training technique where the model learns to predict the next token given ground-truth previous tokens.
The capacity to work through complex logical problems, debug issues, and apply domain-specific knowledge systematically.
A sampling parameter controlling randomness in AI predictions: higher values make outputs more varied and creative, lower values more focused and deterministic.
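As an illustrative sketch (not any particular model's implementation), temperature divides the raw scores (logits) before the softmax; a larger temperature flattens the resulting probability distribution, a smaller one sharpens it:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities; higher temperature -> flatter distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, temperature=0.5)  # sharper: top token dominates
hot = softmax_with_temperature(logits, temperature=2.0)   # flatter: more randomness
```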
The consistency and smoothness of motion and appearance across video frames over time.
Ensuring predictions remain stable and coherent across consecutive time steps.
Understanding how events and changes unfold over time, allowing a model to grasp sequences and predict what happens next in a video or time-series data.
Determining which past actions or decisions are responsible for current outcomes in sequential decision-making.
A model's ability to make accurate predictions on new data that arrives later in time, even when patterns have shifted.
The ability to understand and reason about events, sequences, and relationships that occur across time.
Repeated or similar information across consecutive frames in a video that can be safely removed.
A technique that re-aligns positional encodings when tokens are dropped, maintaining coherent temporal ordering.
Aligning events in music and video so they happen at the same time.
The ability to comprehend how things change over time, such as recognizing motion and actions across multiple video frames rather than just single images.
Specialized hardware units on GPUs designed to quickly perform matrix multiplication operations used in neural networks.
Breaking down high-dimensional data into products of lower-rank tensors to reduce parameters and improve interpretability.
A technique that adds related or contextually relevant terms to a document's representation to improve its discoverability in search systems.
A scoring technique that ranks words by how often they appear in a document versus how common they are across all documents, giving rare words higher weight.
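A minimal sketch of one common TF-IDF variant (raw term frequency with a smoothed inverse document frequency; the three toy documents are made up for illustration):

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "quantum entanglement is a strange thing".split(),
]

def tf_idf(term, doc, docs):
    """Term frequency in this doc, weighted up when the term is rare across docs."""
    tf = Counter(doc)[term] / len(doc)
    df = sum(1 for d in docs if term in d)    # number of documents containing the term
    idf = math.log(len(docs) / (1 + df)) + 1  # smoothed inverse document frequency
    return tf * idf

common = tf_idf("cat", docs[0], docs)      # appears in 2 of 3 documents
rare = tf_idf("quantum", docs[2], docs)    # appears in only 1 document
```

Both terms occur once in an equally long document, so the rarer term ends up with the higher score purely through its idf weight.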
Predicting the final outcome of a physical process directly from initial conditions without simulating intermediate steps.
Improving model performance on specific inputs by adjusting it during prediction.
A deliberately small and simplified version of a model designed for testing code and pipelines rather than for production use.
Improving model accuracy at inference by using extra computation or verification steps without retraining.
Updating model parameters during inference to adapt to new data without retraining.
A machine learning task where a model reads text and assigns it to predefined categories, such as 'safe' or 'unsafe'.
A technique that groups similar texts together automatically by using embeddings to measure similarity, without requiring predefined categories.
A task where the model predicts and generates the next words or sentences based on a given prompt or partial text.
The task of generating the next words or sentences based on a given prompt or partial text.
A training technique where parts of input text are randomly deleted, masked, or shuffled to teach the model to understand context and recover meaning.
A technique that converts text into numerical vectors that capture semantic meaning, allowing the model to understand and compare text similarity.
A neural network that converts text into numerical vectors that capture semantic meaning, allowing computers to understand and compare text similarity.
Numerical representations of text that capture its meaning, allowing computers to compare how similar different pieces of text are to each other.
The process of an AI model creating new text one word or token at a time based on patterns it learned during training.
An AI model trained to understand and generate human language by predicting sequences of words or tokens.
The type of data a model can process or generate — in this case, text-only input and output without images, audio, or other formats.
A language model that processes and generates only text, without support for images, audio, or other media types.
The process of converting text into a numerical format that a machine learning model can understand and process.
An AI model that processes and generates only text input and output, without support for images, audio, or other media types.
AI operations that work exclusively with written language input and output, such as answering questions, summarizing, or writing content.
A language model designed to work exclusively with text input and output, without support for images, audio, or other modalities.
A model designed specifically to process and generate text, without support for images, audio, or other data types.
A model that accepts text as input and produces text as output, without support for images, audio, or other data types.
A model that accepts only written text as input, without support for images, audio, or other data types.
A model that accepts and produces only text inputs and outputs, without support for images, audio, or other media types.
A model that processes and produces only text input and output, without support for images, audio, or other data types.
A language model that processes and generates only text, without support for images, audio, or other data types.
Creating 3D models from natural language descriptions using AI models.
The ability to convert natural language descriptions into executable code automatically.
An AI model that creates images from written text descriptions or prompts.
A technology that converts written text into spoken audio that sounds natural and human-like.
A task where a model converts natural language questions into executable SQL database queries.
A framework where all NLP tasks are treated as converting input text into output text, so translation, summarization, and classification use the same model structure.
A model task where the input and output are both text, with the model learning to transform one text format into another.
A machine learning model that takes text as input and produces text as output, useful for tasks like translation, summarization, or question answering.
A training approach where all NLP tasks are framed as converting input text to output text, allowing a single model to handle translation, summarization, classification, and other tasks.
Creating video sequences from text descriptions using neural networks.
High-quality, carefully curated training data structured like educational textbooks rather than raw internet text, designed to teach clear concepts and reasoning.
A large, diverse dataset of text from the internet used to train this model.
Using AI to automatically verify or discover mathematical proofs and logical statements.
The ability to infer and reason about other people's beliefs, desires, and intentions.
A configurable parameter that controls how much computational time and internal deliberation a model dedicates to solving a problem before responding.
A model operating mode where it explicitly works through problems step-by-step before generating a final answer, improving accuracy on complex tasks.
A language model trained to generate explicit reasoning steps and internal deliberation before producing a final response, rather than answering immediately.
The number of tokens a model can generate per second, measuring its processing speed.
Changing the tonal quality or color of a sound while preserving its basic characteristics.
Analyzing data points collected over time to find patterns and make predictions.
The task of predicting future values in a sequence of data points ordered by time, such as stock prices or weather patterns.
The ability to understand and make predictions based on data points ordered over time, like stock prices.
A small unit of text (a word, subword, or punctuation mark) that a language model breaks input into for processing.
The process of selectively activating only certain parts of a model for each individual token processed, rather than using the entire network every time.
Deciding how many tokens (words/subwords) a model should generate for a given problem.
The maximum number of tokens available to include retrieved context in a language model prompt.
The number of text units (tokens) a model processes or generates; longer reasoning processes consume more tokens and may increase latency or cost.
The computational expense and resource usage required to process or generate tokens, which increases when a model performs additional reasoning steps.
The number of small text chunks (tokens) a model generates; higher token counts mean longer responses and more computational cost.
The probability distribution over possible next tokens that a language model produces during decoding.
A measure of how many tokens (small units of text) a model needs to use to complete a task; more efficient models use fewer tokens and cost less.
Numerical representations of individual words or subwords that capture their meaning and relationships in a way machines can process.
A score measuring how much each word or subword unit contributes to a model's prediction.
A training technique where random words in text are hidden and the model learns to predict them, commonly used in models like BERT.
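A toy sketch of how such training pairs might be constructed (the 15% masking rate follows the original BERT recipe; the whitespace "tokenizer" and seed are illustrative simplifications):

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=1):
    """Hide random tokens; the model's training target is to recover the originals."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)      # the model is trained to predict this token
        else:
            inputs.append(tok)
            labels.append(None)     # no prediction needed at this position
    return inputs, labels

tokens = "the quick brown fox jumps over the lazy dog".split()
inputs, labels = mask_tokens(tokens)
```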
Combining multiple tokens into fewer tokens to reduce computation while preserving model output quality.
The maximum number of tokens (words or word pieces) a model can generate in a single response, controlling the length of its output.
The spatial coordinates or locations of text elements within a document, used to understand where words and phrases appear on the page.
The cost charged per token (unit of text) processed by a model, which varies based on model capability and complexity.
Removing less important words from AI processing to improve speed and efficiency.
Technique to decrease the number of tokens processed by a model, typically by compressing or filtering visual information.
A vector that encodes the meaning and context of a single word or subword unit (token) within a larger piece of text.
Numerical vectors that encode the meaning of individual words or subword units within a text.
A series of individual tokens (words or subwords) that the model generates one after another to form a complete response.
Reducing the number of tokens processed by a model to lower computational cost.
The number of tokens a model can generate per unit of time during inference.
The number of tokens (small units of text) consumed during model inference; higher token usage means more computational cost and longer response times.
The complete set of individual text units (tokens) that a model can recognize and process; a larger vocabulary allows the model to handle more diverse languages and specialized terms.
Assigning importance scores to individual words or subwords in text, allowing the model to emphasize semantically significant terms in its representation.
Embeddings that represent individual tokens (words or subwords) rather than entire documents, allowing fine-grained matching during search.
Applying different levels of privacy protection to individual tokens based on their sensitivity and importance.
The process of breaking text into smaller units (like words or syllables) that a model can understand and process.
The component that splits text into tokens (subwords or characters) that the model can process.
The basic units of text that a language model processes, typically representing words or word fragments.
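A toy illustration of subword tokenization using greedy longest-match against a made-up vocabulary (real tokenizers such as BPE or WordPiece learn their vocabularies from data):

```python
def tokenize(text, vocab):
    """Greedily match the longest known subword at each position."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest substring first
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])         # unknown character: emit it as-is
            i += 1
    return tokens

vocab = {"token", "ization", "un", "happy", "ness"}
pieces = tokenize("tokenization", vocab)  # a rare word split into known pieces
```

This is how a word the model has never seen whole can still be processed: it is decomposed into familiar fragments.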
An agent's ability to call external functions or APIs to gather information or perform actions.
A structured definition that describes what a tool does, what inputs it accepts, and what outputs it produces.
The ability of a model to call external functions or APIs to perform tasks like calculations, searches, or data retrieval.
When an AI model decides to use external functions or tools (like database queries) to help answer questions or complete tasks.
A requirement that a segmented structure maintains correct connectivity and shape properties, not just pixel-level accuracy.
A representation method that works regardless of how input channels are physically arranged or which channels are present.
A metric measuring the maximum difference between two probability distributions, ranging from 0 to 1.
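For discrete distributions this can be computed as half the L1 distance between the probability vectors (a standard identity); a minimal sketch:

```python
def total_variation(p, q):
    """Largest difference in probability any event can receive under p versus q."""
    assert abs(sum(p) - 1) < 1e-9 and abs(sum(q) - 1) < 1e-9
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

same = total_variation([0.5, 0.5], [0.5, 0.5])      # identical distributions -> 0.0
disjoint = total_variation([1.0, 0.0], [0.0, 1.0])  # non-overlapping -> 1.0
```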
A metric measuring the distinguishability between two quantum states, ranging from 0 (identical) to 1 (orthogonal).
When a model is trained using one objective but deployed using a different process, causing performance gaps between training and real-world use.
A saved snapshot of the model's learned weights at a specific point during training, allowing you to see how the model improved over time.
Saved snapshots of a model at different points during training, allowing researchers to observe how the model's abilities change as it learns.
The date up to which a model has seen training data; the model has no knowledge of events or information after this date.
The examples and information used to teach a model how to perform a task, in this case human-written and AI-generated grammatical corrections.
The process of carefully selecting, filtering, and organizing training data to improve a model's performance on specific tasks rather than relying solely on larger datasets.
The date after which information is not included in a model's training data, meaning the model cannot know about events or facts that occurred after that date.
The range of topics, styles, and types of text a model was trained on; the model performs best on content similar to this distribution and may struggle outside it.
The patterns and behaviors that emerge during a model's training process, such as how loss decreases or how capabilities develop over time.
The ability to achieve strong model performance while using less computational resources, data, or time during the training process.
The number of times a model sees the entire training dataset during learning; more epochs can improve performance but may also lead to overfitting if the dataset is small.
The complete set of steps, data, and code used to train a model, made transparent so others can reproduce or audit the process.
A sequence of interactions or steps taken by a model during deployment or in an environment.
Predicting the future path or location of a person or object over time.
Computing a planned path or sequence of movements for an autonomous agent to follow.
Controlling video generation by specifying desired motion paths or object movements frame-by-frame.
Generating sequences of actions (trajectories) that an agent takes to solve a task, used for training via imitation learning.
Adapting recorded action sequences to new situations by adjusting them based on matching visual keypoints between scenes.
A model that converts input sequences into output sequences with aligned timing.
Using knowledge from one task to improve learning on a different related task.
The dominant neural network architecture for language models, using self-attention to process sequences.
A neural network design that processes text by analyzing relationships between all words simultaneously, forming the foundation of modern large language models.
A mechanism that allows a model to focus on relevant parts of the input by computing relationships between all pairs of tokens, enabling deep understanding but requiring significant memory.
The core neural network architecture based on attention mechanisms that traditionally powers most large language models.
A neural network component that processes input sequences using attention mechanisms.
Stacked blocks of neural network computations that process and transform input text progressively, with more layers generally allowing the model to learn more complex patterns.
Neural network architecture widely used for language tasks like BERT and RoBERTa.
A method where a transformer neural network generates text one token at a time by learning patterns from training data.
An algorithm that explores possible future states by building a tree of actions and outcomes to find promising paths.
Prioritizing and routing queries by urgency or risk level, directing high-risk cases to human experts.
A metric measuring which input features a backdoor attack actually depends on.
A local region around the current best solution where the surrogate model is trusted to be accurate.
The ability to detect when one speaker has finished speaking and another can begin, essential for natural conversation flow.
The ability to identify when a speaker has finished speaking and it is another person's turn to speak in a conversation.
A statistical method for estimating intermediate values in a sequence based on observed endpoints.
A retrieval system design with separate neural networks for encoding queries and documents independently, allowing efficient comparison between them.
The tendency of generative models to converge on the most common or typical outputs, reducing diversity.
The ability to understand and interact with user interfaces by reading screenshots and generating commands to control applications or websites.
The ability of an AI model to understand and control user interface elements like buttons and forms by interpreting visual layouts and executing appropriate actions.
The model's ability to identify and apply common design patterns and component structures used in user interfaces.
Questions where the correct answer cannot be found in the given context, testing if models admit uncertainty.
A model variant that treats uppercase and lowercase letters as identical, so 'Hello' and 'hello' are processed the same way.
A model without built-in safety filters or content restrictions, allowing it to generate responses on any topic without refusal.
A model trained without safety filters or content restrictions, making it willing to generate responses on sensitive topics that filtered models would refuse.
Quantifying how confident a model is in its predictions, critical for safe deployment in high-stakes applications.
Measuring and tracking how uncertainty in a model's inputs propagates into uncertainty in its predictions.
Collecting fewer measurements than needed for perfect image reconstruction, used to speed up MRI scans.
A neural network architecture commonly used in image generation that processes images at multiple scales.
A single model design that handles multiple different tasks without needing separate specialized models for each task.
A single input format that handles multiple different tasks, rather than requiring separate models for each task.
An AI model trained to both generate and understand multiple types of data like text and images.
AI models that can process and generate multiple types of data (text, images, etc.) in a single system.
Automated code that checks whether a specific piece of software works correctly by testing individual functions.
Learning general rules from examples that apply broadly across different situations.
A model designed to work well across many different tasks and domains without requiring task-specific customization or retraining.
An algorithm for estimating the state of a system from noisy measurements, designed to handle nonlinear dynamics better than standard Kalman Filters.
Information that doesn't follow a predefined format or organization, such as raw text documents or photographs.
Information stored as plain text documents rather than organized databases, like PDFs or policy manuals.
Training a model without labeled examples, letting it discover patterns on its own.
Training language models with reinforcement learning using rewards derived without human labels or ground truth answers.
A model with the correct structure but no learned knowledge, producing meaningless output because it has never been trained on data.
Using sparse local measurements to estimate values across a larger geographic or temporal region.
A learned vector representation that captures an individual driver's unique preferences and driving style.
A synthetic agent that mimics realistic user behavior and preferences to test AI assistant performance.
Prompting a model to generate the next user message in a conversation to probe whether it understands interaction dynamics.
A generalization of Shannon information that measures how much information is actually useful to a specific observer or agent.
A function estimating how good a state or action is for achieving a goal.
The process of updating an agent's estimates of state values backward through a trajectory during learning.
A technique that dynamically adjusts how much a model explores new outputs versus exploiting known good ones.
Techniques that reduce noise in gradient estimates to improve optimization efficiency and convergence speed.
A neural network that learns to compress data into a latent space and reconstruct it, useful for learning smooth representations.
An optimization technique that transfers knowledge from a teacher model to improve generation quality by matching score distributions.
The number of individual numerical values used to represent a piece of text; higher dimensions can capture more nuanced meaning but require more computational resources.
A representation of data (like molecules or text) as a list of numbers that captures its essential features in a form that machine learning models can work with.
Numerical representations of text where each word or sentence becomes a list of numbers that capture its meaning in a way computers can process.
The process of converting input data (like text) into numerical vectors that can be stored, compared, and searched efficiently.
Images defined by mathematical shapes and paths rather than pixels, allowing them to scale to any size without losing quality.
A preprocessing step that scales vectors to a standard length, ensuring fair comparisons when using cosine similarity.
The model's output is a single array of numbers (a vector) rather than generated text, which can be efficiently compared with other vectors to measure similarity.
Compressing data by encoding groups of values together rather than individually, achieving better compression ratios.
A way of expressing text as a list of numbers that a computer can process and compare mathematically.
A search method that converts queries and documents into numerical vectors and finds matches by measuring similarity between vectors, fast but less nuanced than other ranking approaches.
A measurement of how alike two vectors (number lists) are to each other, used to determine if two pieces of text have similar meanings.
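Cosine similarity is one common such measure; a minimal sketch with hypothetical 3-dimensional vectors (real embeddings typically have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """1.0 = same direction, near 0.0 = unrelated, -1.0 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up embeddings: 'cat' and 'kitten' point in similar directions.
cat = [0.9, 0.1, 0.2]
kitten = [0.85, 0.15, 0.25]
invoice = [0.1, 0.9, 0.8]
```

Because the measure depends only on direction, texts with similar meanings score close to 1.0 regardless of vector magnitude.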
A method that converts text into numerical vectors and finds documents with vectors closest to a query vector, fast but sometimes missing nuanced relevance signals.
A mathematical representation where text is converted into points or directions in a multi-dimensional space, enabling comparison and analysis of semantic relationships.
In diffusion models, the learned direction and speed that guides the generation process at each step.
Uncertainty estimates based on explicit confidence statements the model generates as part of its reasoning output.
Answers that can be checked against external sources like the web to confirm correctness.
A generative model that creates videos by iteratively refining noise into coherent frames, similar to image diffusion but applied to sequences.
A model component that processes video frames and converts them into compact numerical representations that capture the video's visual and motion content.
Creating realistic video sequences using AI based on text or image descriptions.
Editing technique that deletes objects from video while filling in background and correcting physical interactions.
A task where AI models watch videos and answer questions about what they see and understand.
Extending image segmentation to video by identifying and tracking objects across multiple frames over time.
The ability of AI systems to analyze and extract meaning from video content including visual, temporal, and semantic information.
A specialized AI model trained to understand video content and communicate its understanding through natural language text.
Creating sound effects or audio that matches the visual content and timing of a video.
How an object's appearance changes based on the viewing angle, including effects like reflections and shininess.
Representing biological cells as simplified computational models for simulation.
A computer-generated 3D environment that users can interact with using special headsets or controllers.
Using AI to digitally add color to microscope images without physical staining.
A mathematical solution concept for complex equations that handles non-smooth behavior in optimization problems.
The core neural network component that processes and understands images before passing information to the rest of the model.
A component that converts images into a numerical representation that a language model can understand and process.
A process that converts images into numerical representations that a model can understand and process.
The specialized component of a model that processes and interprets image data to extract visual information.
A neural network architecture that processes images by breaking them into small patches and analyzing them similarly to how language models process text.
A neural network architecture that processes images by breaking them into small patches and treating them similarly to how language models process words.
A model designed to understand and reason about both visual content (images) and natural language text together.
Training a model to understand the relationship between images and their text descriptions so it can match them together effectively.
A pre-trained model that jointly processes and understands both visual and textual information in a unified representation.
A model that processes both images and text together to create shared numerical representations, rather than generating new text like a full language model would.
Training a model to understand and connect both images and text together, so it can reason about visual content using language.
An AI model that understands both images and text, allowing it to answer questions about images or describe what it sees.
AI systems that understand both images and text, allowing them to answer questions about images or describe what they see.
A task where an AI agent navigates physical spaces by following natural language instructions while processing visual input.
A task that requires a model to understand and reason about both visual information (images) and textual information together.
A model that combines visual perception, language understanding, and robotic action generation to interpret instructions and control robot movements.
Converting visual inputs like screenshots, charts, or diagrams into executable code or structured representations.
A component that converts images into a numerical representation that the model can understand and process.
Predicting and visualizing what a robot will do next based on its learned policy.
The ability to connect specific words or concepts in text to the actual objects or regions they refer to in an image.
A task where an AI model reads a question and an image, then generates an answer based on what it understands from the image.
The capability to analyze images and draw logical conclusions or answer complex questions based on what is depicted in the visual content.
Discrete units representing different regions or features of an image processed by the model.
The ability of an AI model to interpret and analyze images, including identifying objects, reading text, and answering questions about visual content.
A model that processes both images and text together, understanding the relationship between visual content and language to answer questions about images or describe what it sees.
The persuasive techniques and design choices used in charts and graphs to influence how viewers interpret data.
A model's ability to understand and reason about visual information in images, connecting what it sees to language and concepts.
An inference engine optimized for running large language models efficiently by batching requests and managing memory intelligently.
A high-performance serving framework that efficiently runs language models and embedding models with optimized memory usage and throughput for production deployments.
The complete set of unique words or tokens that a language model can recognize and generate.
Adding new tokens or words to a language model's vocabulary beyond its original pretrained set.
The number of unique tokens (words or word pieces) a model can recognize and process; larger vocabularies provide better coverage of a language.
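The vocabulary entries above amount to a lookup table from tokens to integer IDs. A toy sketch (the tokens and IDs here are invented for illustration; real vocabularies hold tens of thousands of entries):

```python
# A miniature vocabulary: every token this toy "model" can recognize.
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3, "mat": 4}

def encode(words):
    # Words outside the vocabulary fall back to the <unk> token,
    # which is why larger vocabularies give better language coverage.
    return [vocab.get(w, vocab["<unk>"]) for w in words]

print(encode(["the", "cat", "sat"]))  # → [1, 2, 3]
print(encode(["the", "dog"]))         # → [1, 0]  ("dog" is out of vocabulary)
```

Vocabulary expansion adds new rows to this table (with freshly initialized embeddings) beyond the pretrained set.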
The process of generating natural-sounding human speech from text using machine learning models.
Video RAM, the memory on a GPU that stores model weights and intermediate computations during inference.
The amount of graphics memory (VRAM) required to load and run a model on a GPU.
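The VRAM requirement above is mostly simple arithmetic: parameter count times bytes per parameter, plus runtime overhead. A rough back-of-the-envelope sketch (the 7B parameter count and the 1.2× overhead factor are illustrative assumptions, not figures from any vendor):

```python
def estimate_vram_gb(n_params, bits_per_weight, overhead=1.2):
    # Weights dominate memory use; the overhead factor loosely covers
    # activations and the KV cache, and varies widely in practice.
    weight_bytes = n_params * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# A 7-billion-parameter model at different weight precisions:
for bits in (32, 16, 8, 4):
    print(f"{bits}-bit: ~{estimate_vram_gb(7e9, bits):.1f} GB")
```

This is why the quantization levels defined throughout this glossary matter: halving the bits per weight roughly halves the GPU memory needed.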
Automatically identifying security flaws or weaknesses in code that could be exploited by attackers.
A quantization format where model weights are stored in 4-bit precision while calculations use 16-bit precision, balancing efficiency with accuracy.
A specific quantization scheme where weights are stored in 4-bit precision while activations remain in 16-bit precision, balancing memory savings with accuracy.
A specific quantization method where both weights (w) and activations (a) are stored as 8-bit integers, providing a good balance between memory savings and model quality.
A specific quantization method that reduces both weights and activations to 8-bit integers, enabling faster computation on specialized hardware while maintaining reasonable accuracy.
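A minimal sketch of the 8-bit weight quantization described above, using a single symmetric scale and rounding (real quantizers add per-channel scales, calibration data, and other refinements):

```python
def quantize_int8(weights):
    # Map floats into the signed 8-bit range [-127, 127] using one
    # shared scale, so each value needs 1 byte instead of 4.
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    # Recover approximate floats; the rounding error is the accuracy
    # cost that quantization trades for memory savings.
    return [x * scale for x in q]

q, s = quantize_int8([0.12, -0.05, 0.31, -0.2])
print(q)                  # → [49, -20, 127, -82]
print(dequantize(q, s))   # approximately the original weights
```

Schemes like w4a16 and w8a8 differ only in how many bits the weights (w) and activations (a) each get.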
A mathematical measure of how different two distributions are, useful for comparing expert and agent behavior.
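If the distribution-distance entry above refers to something like the one-dimensional Wasserstein (earth mover's) distance, it has a simple form for equal-size empirical samples, sketched here as an assumption rather than a definition from the glossary itself:

```python
def wasserstein_1d(xs, ys):
    # For equal-size samples, the 1-D earth mover's distance is the
    # average gap between the sorted values of the two samples.
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Identical samples have distance 0; shifting every value by 1 gives 1.
print(wasserstein_1d([0, 1, 2], [0, 1, 2]))  # → 0.0
print(wasserstein_1d([0, 1, 2], [1, 2, 3]))  # → 1.0
```

Comparing the distributions of expert actions and agent actions with such a distance quantifies how closely the agent imitates the expert.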
Automatically browsing and collecting data from websites by following links across the internet.
Training data collected from publicly available internet sources, which provides broad but sometimes uneven coverage of topics.
The ability to search the internet in real-time during processing to retrieve current information rather than relying only on training data.
The capability for a model to query the internet in real-time during response generation, allowing it to access current information beyond its training data.
A specific quantization method that compresses both the model's stored weights and its intermediate calculations to 8-bit precision, significantly reducing memory and computation requirements.
A merging method that combines model weights by taking their average.
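The weight-averaging merge above is literally an elementwise mean. A toy sketch with parameters stored as plain dicts (real merges operate on full model state dicts with matching shapes):

```python
def average_weights(model_a, model_b):
    # Elementwise mean of matching parameters from two models.
    return {name: [(x + y) / 2 for x, y in zip(model_a[name], model_b[name])]
            for name in model_a}

a = {"layer1": [0.5, 1.0], "layer2": [2.0, -2.0]}
b = {"layer1": [1.5, 0.0], "layer2": [0.0, 2.0]}
print(average_weights(a, b))
# → {'layer1': [1.0, 0.5], 'layer2': [1.0, 0.0]}
```

More elaborate merging methods weight the average or resolve sign conflicts, but the averaging core is this simple.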
Grouping similar weight values together and replacing them with shared cluster centers to reduce model size.
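A sketch of the clustering idea above: snap each weight to the nearest of a few shared centers, so the model only needs to store a small index per weight plus the center table. The centers here are hand-picked for illustration; in practice they are learned, for example by k-means:

```python
def cluster_weights(weights, centers):
    # Replace each weight with its nearest shared center value.
    return [min(centers, key=lambda c: abs(c - w)) for w in weights]

centers = [-0.5, 0.0, 0.5]
print(cluster_weights([0.47, -0.51, 0.02, 0.3], centers))
# → [0.5, -0.5, 0.0, 0.5]
```

With 3 centers, each weight can be stored as a 2-bit index instead of a 32-bit float, at the cost of the snapping error.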
The process of directly modifying a trained model's internal parameters (weights) to change its behavior without retraining from scratch.
The process of using a neural network to produce parameters for another model rather than training those parameters directly.
A measure of how much a specific weight contributes to model predictions and performance.
The process of setting the starting values for a neural network's parameters before training begins.
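A minimal sketch of the initialization entry above, using the common He scheme (standard deviation sqrt(2 / fan_in)); the layer sizes and seed are arbitrary:

```python
import math
import random

def he_init(fan_in, fan_out, seed=0):
    # Draw starting weights from a Gaussian whose spread shrinks as the
    # layer gets wider, keeping early activations reasonably scaled.
    rng = random.Random(seed)
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)]
            for _ in range(fan_in)]

w = he_init(256, 128)
print(len(w), len(w[0]))  # → 256 128
```

Starting from well-scaled random values like these, rather than zeros or huge numbers, is what lets training converge at all.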
The number of bits used to represent each numerical value in a model's weights; lower precision (like 4-bit) uses less memory but may reduce accuracy.
A specific type of quantization that compresses only the model's learned parameters (weights) while keeping other calculations at higher precision.
Using the same neural network parameters for multiple tasks to enable knowledge transfer and reduce model size.
The numerical parameters inside a neural network that determine how it processes input and generates output.
A system that converts high-level motion commands into executable joint trajectories for robots.
How optimizer behavior changes when you increase the number of neurons in each layer of a neural network.
A quantum version of the score function that describes how to reverse noise in quantum systems.
The total length of connections between components on a chip; shorter wirelength improves performance and power efficiency.
Using an AI model to automatically handle repetitive business tasks and processes, reducing manual effort and improving efficiency.
The active, temporary knowledge an AI system uses for the current task, drawn from long-term memory.
A model's learned understanding of facts, concepts, and relationships about the real world, typically acquired during training on diverse text data.
An AI system that learns to understand and predict how the physical world works from observations.
Predicting future states of the environment based on current observations and actions.
Internal representations learned by AI systems that capture how the physical world works, including how objects move and interact over time.
A metric that counts predictions as either completely right or completely wrong with no partial credit.
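The all-or-nothing metric above is straightforward to sketch. The normalization applied before comparing varies by benchmark; the lowercasing and whitespace stripping here are illustrative choices:

```python
def exact_match(prediction, reference):
    # Score 1 only if the normalized strings are identical;
    # there is no partial credit for being close.
    return int(prediction.strip().lower() == reference.strip().lower())

preds = ["Paris", "paris ", "Paris, France"]
refs = ["paris", "Paris", "Paris"]
scores = [exact_match(p, r) for p, r in zip(preds, refs)]
print(scores)  # → [1, 1, 0]
```

Note that "Paris, France" scores 0 even though it is arguably correct, which is exactly the metric's known limitation.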
Solving a task without any training examples by using knowledge from related tasks or descriptions.
How well an AI model performs on new tasks it has never seen before without any training.
Identifying previously unknown security vulnerabilities or attacks that have no existing defenses.
The maximum rate at which information can be reliably transmitted over a noisy channel with zero probability of error.
Training without paired examples of two modalities, using only single-modality data.
Agent performing tasks without any external skill retrieval or runtime augmentation, relying only on learned parameters.
A model's ability to handle new, unseen tasks or data without additional training on those specific examples.
Using a model to solve a task without any training examples for that specific task.
Making predictions on new tasks without any task-specific training or fine-tuning on labeled examples.
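The zero-shot entries above are easiest to see next to few-shot prompting: a zero-shot prompt contains only the task description, with no worked examples. A toy prompt builder (the template wording is invented for illustration):

```python
def build_prompt(task, examples=()):
    # Zero-shot: no examples included. Few-shot: prepend solved examples.
    lines = [f"Task: {task}"]
    for inp, out in examples:
        lines.append(f"Input: {inp}\nOutput: {out}")
    lines.append("Input:")
    return "\n".join(lines)

zero_shot = build_prompt("Translate English to French.")
few_shot = build_prompt("Translate English to French.",
                        [("cat", "chat"), ("dog", "chien")])
print(zero_shot)
```

The model must then rely entirely on knowledge acquired during pretraining, since the prompt supplies no task-specific demonstrations.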
Creating new sounds the model has never seen before by using reference audio as a guide.