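A collection of ICLR papers and open-source projects from past years, covering ICLR2021, ICLR2022, ICLR2023, ICLR2024, and ICLR2025.
This repository aims to collect the latest research from ICLR, especially on LLMs, across all areas of NLP. The project is maintained long-term and updated at irregular intervals.
Feel free to watch and fork! A star ⭐ would be even better ❤️.
Zhihu: ShuYini
WeChat official account: AINLPer (updated daily, follows welcome)
You are also welcome to join the AINLPer Planet community, which pushes the latest and highest-quality papers every day so you can keep up with frontier progress in AIGC and large models. The community also has dedicated columns on LLM agents, LLM reasoning, building RAG systems, popular surveys, hands-on LLM practice, datasets, benchmarks, industry developments, and job referrals at major tech companies. Details →: https://mp.weixin.qq.com/s/wHnm9ek4ojYTA_2EPLNILw
If you are interested, scan the QR code below ⬇. New users get a 50 yuan coupon 🔖, so it costs only 49 yuan per year!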
💎💎💎💎💎💎💎💎💎💎💎💎💎💎
1、How new data permeates LLM knowledge and how to dilute it
2、JudgeLM: Fine-tuned Large Language Models are Scalable Judges
3、MQuAKE-Remastered: Multi-Hop Knowledge Editing Can Only Be Advanced with Reliable Evaluations
4、Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws
5、Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
6、Robustness Reprogramming for Representation Learning
7、LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation
8、DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
9、Learning-Augmented Frequent Directions
10、Quality Measures for Dynamic Graph Generative Models
11、Holistically Evaluating the Environmental Impact of Creating Language Models
12、Mixture-of-Agents Enhances Large Language Model Capabilities
13、Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions
14、TopoNets: High performing vision and language models with brain-like topography
15、INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
16、Vision Language Models are In-Context Value Learners
17、Wasserstein Distances, Neuronal Entanglement, and Sparsity
18、WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
19、Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
20、Multi-session, multi-task neural decoding from distinct cell-types and brain regions
21、Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering
22、AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs
23、BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
24、Online Reinforcement Learning in Non-Stationary Context-Driven Environments
25、CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
26、TabWak: A Watermark for Tabular Diffusion Models
27、Generating Freeform Endoskeletal Robots
28、AnalogGenie: A Generative Engine for Automatic Discovery of Analog Circuit Topologies
29、DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
30、LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation models
31、Exact Certification of (Graph) Neural Networks Against Label Poisoning
32、Test-time Adaptation for Cross-modal Retrieval with Query Shift
33、In Search of Forgotten Domain Generalization
34、TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning
35、Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
36、CausalRivers - Scaling up benchmarking of causal discovery for real-world time-series
37、Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning
38、Progressive Compositionality in Text-to-Image Generative Models
39、Benchmarking Predictive Coding Networks -- Made Simple
40、Can Watermarked LLMs be Identified by Users via Crafted Prompts?
41、On Quantizing Neural Representation for Variable-Rate Video Coding
42、Attention with Markov: A Curious Case of Single-layer Transformers
43、A Second-Order Perspective on Model Compositionality and Incremental Learning
44、Iterative Label Refinement Matters More than Preference Optimization under Weak Supervision
45、EmbedLLM: Learning Compact Representations of Large Language Models
46、Lean-STaR: Learning to Interleave Thinking and Proving
47、NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
48、DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
49、Representative Guidance: Diffusion Model Sampling with Coherence
50、Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling
51、InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences
52、Revisiting Random Walks for Learning on Graphs
53、DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference
54、Control-oriented Clustering of Visual Latent Representation
55、AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories
56、Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification
57、X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
58、Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction
59、Bilinear MLPs enable weight-based mechanistic interpretability
60、Can Large Language Models Understand Symbolic Graphics Programs?
61、AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials
62、Advantage-Guided Distillation for Preference Alignment in Small Language Models
63、Simplifying Deep Temporal Difference Learning
64、SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding
65、Biologically Constrained Barrel Cortex Model Integrates Whisker Inputs and Replicates Key Brain Network Dynamics
66、Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs with Semantic Space
67、The Superposition of Diffusion Models Using the Itô Density Estimator
68、MAGNet: Motif-Agnostic Generation of Molecules from Scaffolds
69、Interleaved Scene Graphs for Interleaved Text-and-Image Generation Assessment
70、IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning
71、RegMix: Data Mixture as Regression for Language Model Pre-training
72、When Attention Sink Emerges in Language Models: An Empirical View
73、PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance
74、Revisiting text-to-image evaluation with Gecko: on metrics, prompts, and human rating
75、Streamlining Redundant Layers to Compress Large Language Models
76、SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints
77、Rethinking and Improving Autoformalization: Towards a Faithful Metric and a Dependency Retrieval-based Approach
78、Learning Spatiotemporal Dynamical Systems from Point Process Observations
79、Probabilistic Neural Pruning via Sparsity Evolutionary Fokker-Planck-Kolmogorov Equation
80、Uncovering Gaps in How Humans and LLMs Interpret Subjective Language
81、SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation
82、Demystifying the Token Dynamics of Deep Selective State Space Models
83、Knowledge Localization: Mission Not Accomplished? Enter Query Localization!
84、Realistic Evaluation of Deep Partial-Label Learning Algorithms
85、Graph Sparsification via Mixture of Graphs
86、RAG-SR: Retrieval-Augmented Generation for Neural Symbolic Regression
87、MixEval-X: Any-to-any Evaluations from Real-world Data Mixture
88、DEEM: Diffusion models serve as the eyes of large language models for image perception
89、BodyGen: Advancing Towards Efficient Embodiment Co-Design
90、Diffusion Bridge AutoEncoders for Unsupervised Representation Learning
91、Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
92、Grounding Video Models to Actions through Goal Conditioned Exploration
93、Dense Video Object Captioning from Disjoint Supervision
94、RESuM: A Rare Event Surrogate Model for Physics Detector Design
95、DeLLMa: Decision Making Under Uncertainty with Large Language Models
96、Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation
97、A Periodic Bayesian Flow for Material Generation
98、SplatFormer: Point Transformer for Robust 3D Gaussian Splatting
99、Continuous Exposure Learning for Low-light Image Enhancement using Neural ODEs
100、Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation
101、Towards Marginal Fairness Sliced Wasserstein Barycenter
102、Co$^{\mathbf{3}}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
103、Following the Human Thread in Social Navigation
104、OS-ATLAS: Foundation Action Model for Generalist GUI Agents
105、Student-Informed Teacher Training
106、Preference Optimization for Reasoning with Pseudo Feedback
107、OmniRe: Omni Urban Scene Reconstruction
108、Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation
109、Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage
110、Multi-Field Adaptive Retrieval
111、On Disentangled Training for Nonlinear Transform in Learned Image Compression
112、MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
113、ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
114、Towards General-Purpose Model-Free Reinforcement Learning
115、Accelerating Goal-Conditioned Reinforcement Learning Algorithms and Research
116、Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts
117、SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning
118、Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
119、Diffusion On Syntax Trees For Program Synthesis
120、Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
121、A CLIP-Powered Framework for Robust and Generalizable Data Selection
122、Fine-tuning with Reserved Majority for Noise Reduction
123、GOLD: Graph Out-of-Distribution Detection via Implicit Adversarial Latent Generation
124、Topograph: An Efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation
125、Test-time Alignment of Diffusion Models without Reward Over-optimization
126、ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Sentences
127、LoRA-Pro: Are Low-Rank Adapters Properly Optimized?
128、LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
129、Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding
130、GETS: Ensemble Temperature Scaling for Calibration in Graph Neural Networks
131、Easing Training Process of Rectified Flow Models Via Lengthening Inter-Path Distance
132、Recovering Manifold Structure Using Ollivier Ricci Curvature
133、AutoCGP: Closed-Loop Concept-Guided Policies from Unlabeled Demonstrations
134、Min-K%++: Improved Baseline for Pre-Training Data Detection from Large Language Models
135、MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion
136、Atlas Gaussians Diffusion for 3D Generation
137、Samba: Synchronized Set-of-Sequences Modeling for Multiple Object Tracking
138、Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
139、Programming Refusal with Conditional Activation Steering
140、Perm: A Parametric Representation for Multi-Style 3D Hair Modeling
141、CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control
142、ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction
143、D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement
144、Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization
145、Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training
146、DartControl: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control
147、ThinK: Thinner Key Cache by Query-Driven Pruning
148、4K4DGen: Panoramic 4D Generation at 4K Resolution
149、RelitLRM: Generative Relightable Radiance for Large Reconstruction Models
150、ConFIG: Towards Conflict-free Training of Physics Informed Neural Networks
151、Severing Spurious Correlations with Data Pruning
152、MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion
153、Hymba: A Hybrid-head Architecture for Small Language Models
154、Training-Free Activation Sparsity in Large Language Models
155、Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
156、CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation
157、Poison-splat: Computation Cost Attack on 3D Gaussian Splatting
158、Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
159、Scalable and Certifiable Graph Unlearning: Overcoming the Approximation Error Barrier
160、TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
161、Linear Spherical Sliced Optimal Transport: A Fast Metric for Comparing Spherical Data
162、Enhancing Pre-trained Representation Classifiability can Boost its Interpretability
163、LiveBench: A Challenging, Contamination-Limited LLM Benchmark
164、Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion
165、Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model
166、MRS: A Fast Sampler for Mean Reverting Diffusion based on ODE and SDE Solvers
167、SRSA: Skill Retrieval and Adaptation for Robotic Assembly Tasks
168、DenseMatcher: Learning 3D Semantic Correspondence for Category-Level Manipulation from a Single Demo
169、Beyond Next Token Prediction: Patch-Level Training for Large Language Models
170、LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models
171、LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
172、Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
173、One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
174、MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility
175、UniMatch: Universal Matching from Atom to Task for Few-Shot Drug Discovery
176、DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes
177、Instance-dependent Early Stopping
178、Moner: Motion Correction in Undersampled Radial MRI with Unsupervised Neural Representation
179、SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models
1、Self-supervised contrastive learning performs non-linear system identification
2、Sparse autoencoders reveal selective remapping of visual concepts during adaptation
3、Multi-Label Test-Time Adaptation with Bound Entropy Minimization
4、TabM: Advancing tabular deep learning with parameter-efficient ensembling
5、ToolGen: Unified Tool Retrieval and Calling via Generation
6、Activation Gradient based Poisoned Sample Detection Against Backdoor Attacks
7、Video Action Differencing
8、Optimal Transport for Time Series Imputation
9、RaSA: Rank-Sharing Low-Rank Adaptation
10、Offline Model-Based Optimization by Learning to Rank
11、From Search to Sampling: Generative Models for Robust Algorithmic Recourse
12、Monte Carlo Planning with Large Language Model for Text-Based Game Agents
13、Robust Root Cause Diagnosis using In-Distribution Interventions
14、Boosting Neural Combinatorial Optimization for Large-Scale Vehicle Routing Problems
15、SimulPL: Aligning Human Preferences in Simultaneous Machine Translation
16、SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation
17、Clique Number Estimation via Differentiable Functions of Adjacency Matrix Permutations
18、ADAM: An Embodied Causal Agent in Open-World Environments
19、Beware of Calibration Data for Pruning Large Language Models
20、Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping
21、Herald: A Natural Language Annotated Lean 4 Dataset
22、HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere
23、DPLM-2: A Multimodal Diffusion Protein Language Model
24、Language Imbalance Driven Rewarding for Multilingual Self-improving
25、Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation
26、Distribution-Free Data Uncertainty for Neural Network Regression
27、LASeR: Towards Diversified and Generalizable Robot Design with Large Language Models
28、SOO-Bench: Benchmarks for Evaluating the Stability of Offline Black-Box Optimization
29、Stealthy Shield Defense: A Conditional Mutual Information-Based Approach against Black-Box Model Inversion Attacks
30、Be More Diverse than the Most Diverse: Optimal Mixtures of Generative Models via Mixture-UCB Bandit Algorithms
31、Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count
32、One-for-All Few-Shot Anomaly Detection via Instance-Induced Prompt Learning
33、URLOST: Unsupervised Representation Learning without Stationarity or Topology
34、Charting the Design Space of Neural Graph Representations for Subgraph Matching
35、Distilling Dataset into Neural Field
36、SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments
37、SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models
38、Group-robust Sample Reweighting for Subpopulation Shifts via Influence Functions
39、GotenNet: Rethinking Efficient 3D Equivariant Graph Neural Networks
40、Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition
41、How to Evaluate Reward Models for RLHF
42、An Efficient Framework for Crediting Data Contributors of Diffusion Models
43、You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning
44、SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
45、FreDF: Learning to Forecast in the Frequency Domain
46、Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs
47、Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass
48、Ensembling Diffusion Models via Adaptive Feature Aggregation
49、Model merging with SVD to tie the Knots
50、Making Text Embedders Few-Shot Learners
51、PolyhedronNet: Representation Learning for Polyhedra with Surface-attributed Graph
52、SafeDiffuser: Safe Planning with Diffusion Probabilistic Models
53、Searching for Optimal Solutions with LLMs via Bayesian Optimization
54、Adversarial Generative Flow Network for Solving Vehicle Routing Problems
55、LongMamba: Enhancing Mamba's Long-Context Capabilities via Training-Free Receptive Field Enlargement
56、Lawma: The Power of Specialization for Legal Annotation
57、Stiefel Flow Matching for Moment-Constrained Structure Elucidation
58、No Preference Left Behind: Group Distributional Preference Optimization
59、Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis
60、Agent S: An Open Agentic Framework that Uses Computers Like a Human
61、Semi-Parametric Retrieval via Binary Bag-of-Tokens Index
62、PN-GAIL: Leveraging Non-optimal Information from Imperfect Demonstrations
63、Hyperbolic Genome Embeddings
64、RTop-K: Ultra-Fast Row-Wise Top-K Selection for Neural Network Acceleration on GPUs
65、Attributing Culture-Conditioned Generations to Pretraining Corpora
66、Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs
67、DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory
68、Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity
69、A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
70、ManiSkill-HAB: A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks
71、Generalized Behavior Learning from Diverse Demonstrations
72、DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models
73、Neural Stochastic Differential Equations for Uncertainty-Aware Offline RL
74、Safety-Prioritizing Curricula for Constrained Reinforcement Learning
75、Towards Federated RLHF with Aggregated Client Preference for LLMs
76、MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis
77、Physics of Language Models: Part 3.2, Knowledge Manipulation
78、On Calibration of LLM-based Guard Models for Reliable Content Moderation
79、Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
80、Causal Order: The Key to Leveraging Imperfect Experts in Causal Inference
81、Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process
82、Agent-Oriented Planning in Multi-Agent Systems
83、Diffusion Generative Modeling for Spatially Resolved Gene Expression Inference from Histology Images
84、Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers
85、ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning
86、STAFF: Speculative Coreset Selection for Task-Specific Fine-tuning
87、Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHR Data
88、Scale-Aware Contrastive Reverse Distillation for Unsupervised Medical Anomaly Detection
89、Learning Efficient Positional Encodings with Graph Neural Networks
90、Scalable Influence and Fact Tracing for Large Language Model Pretraining
91、Mastering Task Arithmetic: $\tau$Jp as a Key Indicator for Weight Disentanglement
92、Diffusion State-Guided Projected Gradient for Inverse Problems
93、PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms
94、Generating CAD Code with Vision-Language Models for 3D Designs
95、MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
96、Forgetting Transformer: Softmax Attention with a Forget Gate
97、ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
98、GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
99、Score-based Self-supervised MRI Denoising
100、ProteinBench: A Holistic Evaluation of Protein Foundation Models
101、LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid
102、Strategist: Self-improvement of LLM Decision Making via Bi-Level Tree Search
103、GReaTer: Gradients Over Reasoning Makes Smaller Language Models Strong Prompt Optimizers
104、LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
105、Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems
106、EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
107、MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
108、Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation
109、What's the Move? Hybrid Imitation Learning via Salient Points
110、LocoVR: Multiuser Indoor Locomotion Dataset in Virtual Reality
111、Greener GRASS: Enhancing GNNs with Encoding, Rewiring, and Attention
112、Conformal Language Model Reasoning with Coherent Factuality
113、Dissecting Adversarial Robustness of Multimodal LM Agents
114、ACC-Collab: An Actor-Critic Approach to Multi-Agent LLM Collaboration
115、More Experts Than Galaxies: Conditionally-Overlapping Experts with Biologically-Inspired Fixed Routing
116、In-context Time Series Predictor
117、Discovering Influential Neuron Path in Vision Transformers
118、Directional Gradient Projection for Robust Fine-Tuning of Foundation Models
119、AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents
120、Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing
121、L3Ms — Lagrange Large Language Models
122、ClimaQA: An Automated Evaluation Framework for Climate Question Answering Models
123、Human-inspired Episodic Memory for Infinite Context LLMs
124、ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization
125、Object-Centric Pretraining via Target Encoder Bootstrapping
126、Lossy Compression with Pretrained Diffusion Models
127、Modality-Specialized Synergizers for Interleaved Vision-Language Generalists
128、Self-Normalized Resets for Plasticity in Continual Learning
129、Rapidly Adapting Policies to the Real-World via Simulation-Guided Fine-Tuning
130、CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking
131、On Linear Representations and Pretraining Data Frequency in Language Models
132、Quantifying Generalization Complexity for Large Language Models
133、Real2Code: Reconstruct Articulated Objects via Code Generation
134、$\tau$-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
135、DEPfold: RNA Secondary Structure Prediction as Dependency Parsing
136、What Matters in Learning from Large-Scale Datasets for Robot Manipulation
137、APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
138、Differentiable Optimization of Similarity Scores Between Models and Brains
139、Chunk-Distilled Language Modeling
140、GOttack: Universal Adversarial Attacks on Graph Neural Networks via Graph Orbits Learning
141、NutriBench: A Dataset for Evaluating Large Language Models in Nutrition Estimation from Meal Descriptions
142、The Value of Sensory Information to a Robot
143、Explore Theory of Mind: program-guided adversarial data generation for theory of mind reasoning
144、HelpSteer2-Preference: Complementing Ratings with Preferences
145、Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images
146、Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
147、Palu: KV-Cache Compression with Low-Rank Projection
148、Shared-AE: Automatic Identification of Shared Subspaces in High-dimensional Neural and Behavioral Activity
149、Many-Objective Multi-Solution Transport
150、Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning
151、Learning to Explore and Exploit with GNNs for Unsupervised Combinatorial Optimization
152、InterMask: 3D Human Interaction Generation via Collaborative Masked Modeling
153、Can Knowledge Editing Really Correct Hallucinations?
154、BoneMet: An Open Large-Scale Multi-Modal Murine Dataset for Breast Cancer Bone Metastasis Diagnosis and Prognosis
155、ALLaM: Large Language Models for Arabic and English
156、Improving Graph Neural Networks by Learning Continuous Edge Directions
157、SiReRAG: Indexing Similar and Related Information for Multihop Reasoning
158、Eliciting Human Preferences with Language Models
159、Preserving Deep Representations in One-Shot Pruning: A Hessian-Free Second-Order Optimization Framework
160、Chemistry-Inspired Diffusion with Non-Differentiable Guidance
161、Aligning Language Models with Demonstrated Feedback
162、Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
163、SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models
164、InvestESG: A multi-agent reinforcement learning benchmark for studying climate investment as a social dilemma
165、The Pitfalls of Memorization: When Memorization Hurts Generalization
166、See It from My Perspective: How Language Affects Cultural Bias in Image Understanding
167、Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion
168、Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
169、Discrete GCBF Proximal Policy Optimization for Multi-agent Safe Optimal Control
170、Cauchy-Schwarz Regularizers
171、HELMET: How to Evaluate Long-context Models Effectively and Thoroughly
172、R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
173、A Unified Framework for Forward and Inverse Problems in Subsurface Imaging using Latent Space Translations
174、Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
175、MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs
176、Procedural Synthesis of Synthesizable Molecules
177、Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning
178、Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
179、DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
180、Teaching LLMs How to Learn with Contextual Fine-Tuning
181、An Undetectable Watermark for Generative Image Models
182、STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning
183、Residual Stream Analysis with Multi-Layer SAEs
184、Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups
185、Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians
186、On the Transfer of Object-Centric Representation Learning
187、Few-Class Arena: A Benchmark for Efficient Selection of Vision Models and Dataset Difficulty Measurement
188、OvercookedV2: Rethinking Overcooked for Zero-Shot Coordination
189、Dataset Distillation via Knowledge Distillation: Towards Efficient Self-Supervised Pre-training of Deep Networks
190、MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
191、Efficient Active Imitation Learning with Random Network Distillation
192、Discrete Diffusion Schrödinger Bridge Matching for Graph Transformation
193、MatExpert: Decomposing Materials Discovery By Mimicking Human Experts
194、Fugatto 1: Foundational Generative Audio Transformer Opus 1
195、Consistency Models Made Easy
196、AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution
197、ElasticTok: Adaptive Tokenization for Image and Video
198、Beyond Content Relevance: Evaluating Instruction Following in Retrieval Models
199、NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals
200、Robust Barycenter Estimation using Semi-Unbalanced Neural Optimal Transport
201、Scaling up Masked Diffusion Models on Text
202、Think while You Generate: Discrete Diffusion with Planned Denoising
203、Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective
204、Can Transformers Do Enumerative Geometry?
205、EqNIO: Subequivariant Neural Inertial Odometry
206、Interaction Asymmetry: A General Principle for Learning Composable Abstractions
207、Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages
208、CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer
209、Provence: efficient and robust context pruning for retrieval-augmented generation
210、CViT: Continuous Vision Transformer for Operator Learning
211、Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment
212、Flow: Modularized Agentic Workflow Automation
213、An Exploration with Entropy Constrained 3D Gaussians for 2D Video Compression
214、Can Video LLMs Refuse to Answer? Alignment for Answerability in Video Large Language Models
215、Physics-informed Temporal Difference Metric Learning for Robot Motion Planning
216、Scalable Mechanistic Neural Networks
217、One Model Transfer to All: On Robust Jailbreak Prompts Generation against LLMs
218、E(3)-equivariant models cannot learn chirality: Field-based molecular generation
219、JPEG Inspired Deep Learning
220、The Case for Cleaner Biosignals: High-fidelity Neural Compressor Enables Transfer from Cleaner iEEG to Noisier EEG
221、Tool-Planner: Task Planning with Clusters across Multiple Tools
222、Systematic Relational Reasoning With Epistemic Graph Neural Networks
223、Simple, Good, Fast: Self-Supervised World Models Free of Baggage
224、Noise-conditioned Energy-based Annealed Rewards (NEAR): A Generative Framework for Imitation Learning from Observation
225、Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
226、RAPID: Retrieval Augmented Training of Differentially Private Diffusion Models
227、Triples as the Key: Structuring Makes Decomposition and Verification Easier in LLM-based TableQA
228、Agents' Room: Narrative Generation through Multi-step Collaboration
229、Equivariant Denoisers Cannot Copy Graphs: Align Your Graph Diffusion Models
230、MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory
231、Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving
232、InstantSplamp: Fast and Generalizable Stenography Framework for Generative Gaussian Splatting
233、Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning
234、COFlowNet: Conservative Constraints on Flows Enable High-Quality Candidate Generation
235、Multimodal Quantitative Language for Generative Recommendation
236、Monet: Mixture of Monosemantic Experts for Transformers
237、Graph-based Document Structure Analysis
238、Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination
239、3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling
240、Redefining the task of Bioactivity Prediction
241、Robust Simulation-Based Inference under Missing Data via Neural Processes
242、Physics-Informed Diffusion Models
243、DynFrs: An Efficient Framework for Machine Unlearning in Random Forest
244、Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
245、Decoupling Layout from Glyph in Online Chinese Handwriting Generation
246、Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
247、ADBM: Adversarial Diffusion Bridge Model for Reliable Adversarial Purification
248、Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning
249、Pursuing Feature Separation based on Neural Collapse for Out-of-Distribution Detection
250、MA$^2$E: Addressing Partial Observability in Multi-Agent Reinforcement Learning with Masked Auto-Encoder
251、Wayward Concepts In Large Multimodal Models
252、Towards Hierarchical Rectified Flow
253、Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity
254、MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation
255、Do Egocentric Video-Language Models Truly Understand Hand-Object Interactions?
256、The "Law" of the Unconscious Contrastive Learner: Probabilistic Alignment of Unpaired Modalities
257、System 1.x: Learning to Balance Fast and Slow Planning with Language Models
258、Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning
259、Advancing Graph Generation through Beta Diffusion
260、UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
261、Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning
262、Confidence Elicitation: A New Attack Vector for Large Language Models
263、CoMRes: Semi-Supervised Time Series Forecasting Utilizing Consensus Promotion of Multi-Resolution
264、MCNC: Manifold-Constrained Reparameterization for Neural Compression
265、HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics
266、Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning
267、AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation
268、Forte: Finding Outliers with Representation Typicality Estimation
269、LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension
270、NextBestPath: Efficient 3D Mapping of Unseen Environments
271、LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
272、Breaking the Reclustering Barrier in Centroid-based Deep Clustering
273、Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning
274、Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning
275、Generalized Consistency Trajectory Models for Image Manipulation
276、API Pack: A Massive Multi-Programming Language Dataset for API Call Generation
277、Spherical Tree-Sliced Wasserstein Distance
278、Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
279、Training Robust Ensembles Requires Rethinking Lipschitz Continuity
280、Sparse Autoencoders Do Not Find Canonical Units of Analysis
281、MUSE: Machine Unlearning Six-Way Evaluation for Language Models
282、Learning Structured Representations by Embedding Class Hierarchy with Fast Optimal Transport
283、Small Models are LLM Knowledge Triggers for Medical Tabular Prediction
284、PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation
285、Simulating Training Dynamics to Reconstruct Training Data from Deep Neural Networks
286、BANGS: Game-theoretic Node Selection for Graph Self-Training
287、UNSURE: self-supervised learning with Unknown Noise level and Stein's Unbiased Risk Estimate
288、Generating Likely Counterfactuals Using Sum-Product Networks
289、Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
290、PooDLe🐩: Pooled and dense self-supervised learning from naturalistic videos
291、TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation
292、CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs
293、TD-Paint: Faster Diffusion Inpainting Through Time Aware Pixel Conditioning
294、Number Cookbook: Number Understanding of Language Models and How to Improve It
295、Bringing NeRFs to the Latent Space: Inverse Graphics Autoencoder
296、Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
297、Factor Graph-based Interpretable Neural Networks
298、Alchemy: Amplifying Theorem-Proving Capability Through Symbolic Mutation
299、Agent Skill Acquisition for Large Language Models via CycleQD
300、KBLaM: Knowledge Base augmented Language Model
301、MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks
302、An Information Criterion for Controlled Disentanglement of Multimodal Data
303、ReAttention: Training-Free Infinite Context with Finite Attention Scope
304、SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
305、Periodic Materials Generation using Text-Guided Joint Diffusion Model
306、QERA: an Analytical Framework for Quantization Error Reconstruction
307、Efficient Discovery of Pareto Front for Multi-Objective Reinforcement Learning
308、Semi-Supervised CLIP Adaptation by Enforcing Semantic and Trapezoidal Consistency
309、Unsupervised Zero-Shot Reinforcement Learning via Dual-Value Forward-Backward Representation
310、Predicting the Energy Landscape of Stochastic Dynamical System via Physics-informed Self-supervised Learning
311、Biologically Plausible Brain Graph Transformer
312、Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for LLM Problem-Solving
313、UV-Attack: Physical-World Adversarial Attacks on Person Detection via Dynamic-NeRF-based UV Mapping
314、Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization
315、Optimistic Games for Combinatorial Bayesian Optimization with Application to Protein Design
316、DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking head Video Generation
317、Vision and Language Synergy for Rehearsal Free Continual Learning
318、Rethinking Multiple-Instance Learning From Feature Space to Probability Space
319、Why Does the Effective Context Length of LLMs Fall Short?
320、Complementary Label Learning with Positive Label Guessing and Negative Label Enhancement
321、Training-free LLM-generated Text Detection by Mining Token Probability Sequences
322、Dataset Ownership Verification in Contrastive Pre-trained Models
323、MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
324、Standardizing Structural Causal Models
325、OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
326、VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
327、Bootstrapping Language Models with DPO Implicit Rewards
328、Learning system dynamics without forgetting
329、Reconsidering Faithfulness in Regular, Self-Explainable and Domain Invariant GNNs
330、SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
331、SysBench: Can LLMs Follow System Message?
332、Fast Direct: Query-Efficient Online Black-box Guidance for Diffusion-model Target Generation
333、AdaRankGrad: Adaptive Gradient Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
334、OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces
335、Arithmetic Without Algorithms: Language Models Solve Math with a Bag of Heuristics
336、Equivariant Neural Functional Networks for Transformers
337、Weighted-Reward Preference Optimization for Implicit Model Fusion
338、Locality-aware Gaussian Compression for Fast and High-quality Rendering
339、A Large-scale Dataset and Benchmark for Commuting Origin-Destination Flow Generation
340、NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields
341、InstaRevive: One-Step Image Enhancement via Dynamic Score Matching
342、A Distributional Approach to Uncertainty-Aware Preference Alignment Using Offline Demonstrations
343、Unsupervised Multiple Kernel Learning for Graphs via Ordinality Preservation
344、MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra
345、Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs
346、Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective
347、On the Performance Analysis of Momentum Method: A Frequency Domain Perspective
348、Learning to Contextualize Web Pages for Enhanced Decision Making by LLM Agents
349、Jamba: Hybrid Transformer-Mamba Language Models
350、Learning 3D Perception from Others' Predictions
351、Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View
352、Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On
353、Universal Image Restoration Pre-training via Degradation Classification
354、ConMix: Contrastive Mixup at Representation Level for Long-tailed Deep Clustering
355、ADePT: Adaptive Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
356、Designing Concise ConvNets with Columnar Stages
357、Taming Overconfidence in LLMs: Reward Calibration in RLHF
358、Transition Path Sampling with Improved Off-Policy Training of Diffusion Path Samplers
359、StringLLM: Understanding the String Processing Capability of Large Language Models
360、Consistent Flow Distillation for Text-to-3D Generation
361、A Benchmark for Semantic Sensitive Information in LLMs Outputs
362、xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation
363、From Commands to Prompts: LLM-based Semantic File System for AIOS
364、SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
365、Decoupled Graph Energy-based Model for Node Out-of-Distribution Detection on Heterophilic Graphs
366、BaB-ND: Long-Horizon Motion Planning with Branch-and-Bound and Neural Dynamics
367、BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
368、HERO: Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning
369、Fully-inductive Node Classification on Arbitrary Graphs
370、Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning
371、Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length
372、Detecting Backdoor Samples in Contrastive Language Image Pretraining
373、Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation
374、Competing Large Language Models in Multi-Agent Gaming Environments
375、Reinforcement learning with combinatorial actions for coupled restless bandits
376、OGBench: Benchmarking Offline Goal-Conditioned RL
377、CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
378、Aligned LLMs Are Not Aligned Browser Agents
379、Edge Prompt Tuning for Graph Neural Networks
380、Optimized Multi-Token Joint Decoding With Auxiliary Model for LLM Inference
381、Algorithmic Stability Based Generalization Bounds for Adversarial Training
382、Learning a Neural Solver for Parametric PDEs to Enhance Physics-Informed Methods
383、MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
384、AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning
385、Zero-shot Model-based Reinforcement Learning using Large Language Models
386、Interpreting the Second-Order Effects of Neurons in CLIP
387、GOAL: A Generalist Combinatorial Optimization Agent Learner
388、Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
389、Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model
390、Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification
391、Improved Training Technique for Latent Consistency Models
392、Dynamic Negative Guidance of Diffusion Models
393、Adversarial Training for Defense Against Label Poisoning Attacks
394、Trajectory attention for fine-grained video motion control
395、Diff-Prompt: Diffusion-driven Prompt Generator with Mask Supervision
396、LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
397、Towards Robust Multimodal Open-set Test-time Adaptation via Adaptive Entropy-aware Optimization
398、HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
399、A General Framework for Producing Interpretable Semantic Text Embeddings
400、Multimodal Situational Safety
401、Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning
402、Trusted Multi-View Classification via Evolutionary Multi-View Fusion
403、HQGS: High-Quality Novel View Synthesis with Gaussian Splatting in Degraded Scenes
404、REFINE: Inversion-Free Backdoor Defense via Model Reprogramming
405、Tight Clusters Make Specialized Experts
406、Real-Time Video Generation with Pyramid Attention Broadcast
407、Large Convolutional Model Tuning via Filter Subspace
408、Spectro-Riemannian Graph Neural Networks
409、Towards Out-of-Modal Generalization without Instance-level Modal Correspondence
410、Subtask-Aware Visual Reward Learning from Segmented Demonstrations
411、TopoDiffusionNet: A Topology-aware Diffusion Model
412、3DGS-Drag: Dragging Gaussians for Intuitive Point-Based 3D Editing
413、Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation
414、What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?
415、Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings
416、Simulating Human-like Daily Activities with Desire-driven Autonomy
417、CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation
418、ParaSolver: A Hierarchical Parallel Integral Solver for Diffusion Models
419、Mining your own secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models
420、RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
421、GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training
422、Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
423、Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs
424、AstroCompress: A benchmark dataset for multi-purpose compression of astronomical data
425、DCT-CryptoNets: Scaling Private Inference in the Frequency Domain
426、Unbounded: A Generative Infinite Game of Character Life Simulation
427、Metric-Driven Attributions for Vision Transformers
428、Diffusion Models Are Real-Time Game Engines
429、Going Beyond Feature Similarity: Effective Dataset distillation based on Class-aware Conditional Mutual Information
430、GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation
431、DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving
432、N-ForGOT: Towards Not-forgetting and Generalization of Open Temporal Graph Learning
433、Rethinking Light Decoder-based Solvers for Vehicle Routing Problems
434、Rethinking Diffusion Posterior Sampling: From Conditional Score Estimator to Maximizing a Posterior
435、Temporal Flexibility in Spiking Neural Networks: Towards Generalization Across Time Steps and Deployment Friendliness
436、Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation
437、SpinQuant: LLM Quantization with Learned Rotations
438、On the Optimization Landscape of Low Rank Adaptation Methods for Large Language Models
439、Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
440、A Percolation Model of Emergence: Analyzing Transformers Trained on a Formal Language
441、Adaptive Retention & Correction: Test-Time Training for Continual Learning
442、Do WGANs succeed because they minimize the Wasserstein Distance? Lessons from Discrete Generators
443、Immunogenicity Prediction with Dual Attention Enables Vaccine Target Selection
444、Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)
445、WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
446、Bad-PFL: Exploiting Backdoor Attacks against Personalized Federated Learning
447、Enhancing Multilingual Reasoning in LLMs: Insights from Cross-Linguistic Correlations and Optimal Data Proportions
448、PhysPDE: Rethinking PDE Discovery and a Physical HYpothesis Selection Benchmark
449、Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs
450、Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality
451、Improving Data Efficiency via Curating LLM-Driven Rating Systems
452、VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
453、ECHOPulse: ECG Controlled Echocardio-gram Video Generation
454、A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals
455、Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment
456、Causal Graph Transformer for Treatment Effect Estimation Under Unknown Interference
457、Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
458、Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning
459、Learning Gain Map for Inverse Tone Mapping
460、Latent Radiance Fields with 3D-aware 2D Representations
461、Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
462、SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision
463、Long-Sequence Recommendation Models Need Decoupled Embeddings
464、TIPS: Text-Image Pretraining with Spatial awareness
465、OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code
466、Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems
467、Halton Scheduler for Masked Generative Image Transformer
468、MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow
469、Black Sheep in the Herd: Playing with Spuriously Correlated Attributes for Vision-Language Recognition
470、Framer: Interactive Frame Interpolation
471、GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting
472、BrainUICL: An Unsupervised Individual Continual Learning Framework for EEG Applications
473、Is In-Context Learning Sufficient for Instruction Following in LLMs?
474、Towards Effective Evaluations and Comparisons for LLM Unlearning Methods
475、Learning Evolving Tools for Large Language Models
476、Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning
477、OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
478、BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL
479、RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
480、VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
481、GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting
482、Chain-of-Focus Prompting: Leveraging Sequential Visual Cues to Prompt Large Autoregressive Vision Models
483、PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment
484、Dynamic Diffusion Transformer
485、McEval: Massively Multilingual Code Evaluation
486、SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
487、Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling
488、Improving Language Model Distillation through Hidden State Matching
489、Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence
490、Medium-Difficulty Samples Constitute Smoothed Decision Boundary for Knowledge Distillation on Pruned Datasets
491、Revisiting Convolution Architecture in the Realm of DNA Foundation Models
492、Episodic Novelty Through Temporal Distance
493、Scaling Autonomous Agents via Automatic Reward Modeling And Planning
494、FlickerFusion: Intra-trajectory Domain Generalizing Multi-agent Reinforcement Learning
495、Gradient-Free Generation for Hard-Constrained Systems
496、M^3PC: Test-time Model Predictive Control using Pretrained Masked Trajectory Model
497、Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models
498、ProtoSnap: Prototype Alignment For Cuneiform Signs
499、DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
500、Learning Spatial-Semantic Features for Robust Video Object Segmentation
501、Toward Generalizing Visual Brain Decoding to Unseen Subjects
502、On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning
503、Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
504、PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection
505、Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model
506、MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos with Depth Priors
507、VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning
508、SAGEPhos: Sage Bio-Coupled and Augmented Fusion for Phosphorylation Site Detection
509、Training-Free Dataset Pruning for Instance Segmentation
510、PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
511、ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy
512、Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
513、Locality Sensitive Avatars From Video
514、Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge
515、LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning
516、PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling
517、Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment
518、AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
519、COME: Test-time Adaption by Conservatively Minimizing Entropy
520、Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment
521、CoInD: Enabling Logical Compositions in Diffusion Models
522、Automated Design of Agentic Systems
523、ParetoFlow: Guided Flows in Multi-Objective Optimization
524、Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It
525、Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
526、A Coefficient Makes SVRG Effective
527、FaceShot: Bring Any Character into Life
528、Do Deep Neural Network Solutions Form a Star Domain?
529、FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models
530、Revisiting In-context Learning Inference Circuit in Large Language Models
531、VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing
532、COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
533、FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
534、Long-horizon Visual Instruction Generation with Logic and Attribute Self-reflection
535、SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation
536、Mitigating Hallucination in Large Vision-Language Models via Modular Attribution and Intervention
537、Gated Delta Networks: Improving Mamba2 with Delta Rule
538、3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
539、Understanding and Enhancing the Transferability of Jailbreaking Attacks
540、AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction
1、RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
2、TopoLM: brain-like spatio-functional organization in a topographic language model
3、Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
4、Diffusion-Based Planning for Autonomous Driving with Flexible Guidance
5、Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
6、Learning to Search from Demonstration Sequences
7、Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
8、MAP: Multi-Human-Value Alignment Palette
9、Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment
10、Learning to Discover Regulatory Elements for Gene Expression Prediction
11、Brain Bandit: A Biologically Grounded Neural Network for Efficient Control of Exploration
12、Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
13、Progressive Compression with Universally Quantized Diffusion Models
14、Standard Gaussian Process is All You Need for High-Dimensional Bayesian Optimization
15、Instant Policy: In-Context Imitation Learning via Graph Diffusion
16、RB-Modulation: Training-Free Stylization using Reference-Based Modulation
17、Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks
18、Oscillatory State-Space Models
19、Attention as a Hypernetwork
20、Energy-based Backdoor Defense Against Federated Graph Learning
21、Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment
22、RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
23、Residual Deep Gaussian Processes on Manifolds
24、Learning to Discretize Denoising Diffusion ODEs
25、Proteina: Scaling Flow-based Protein Structure Generative Models
26、Feedback Favors the Generalization of Neural ODEs
27、Scaling and evaluating sparse autoencoders
28、Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
29、TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes
30、Copyright-Protected Language Generation via Adaptive Model Fusion
31、BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models
32、Rethinking Reward Modeling in Preference-based Large Language Model Alignment
33、Progressive distillation induces an implicit curriculum
34、miniCTX: Neural Theorem Proving with (Long-)Contexts
35、Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
36、Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
37、Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
38、MMQA: Evaluating LLMs with Multi-Table Multi-Hop Complex Questions
39、On Scaling Up 3D Gaussian Splatting Training
40、BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
41、SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning
42、Improving Probabilistic Diffusion Models With Optimal Diagonal Covariance Matching
43、PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration
44、A Probabilistic Perspective on Unlearning and Alignment for Large Language Models
45、MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
46、Subgraph Federated Learning for Local Generalization
47、Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series
48、Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
49、Combatting Dimensional Collapse in LLM Pre-Training Data via Submodular File Selection
50、SymmetricDiffusers: Learning Discrete Diffusion Models over Finite Symmetric Groups
51、Data Scaling Laws in Imitation Learning for Robotic Manipulation
52、Population Transformer: Learning Population-level Representations of Neural Activity
53、HiRA: Parameter-Efficient Hadamard High-Rank Adaptation for Large Language Models
54、On Conformal Isometry of Grid Cells: Learning Distance-Preserving Position Embedding
55、How much of my dataset did you use? Quantitative Data Usage Inference in Machine Learning
56、A Theoretically-Principled Sparse, Connected, and Rigid Graph Representation of Molecules
57、MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
58、LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
59、Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
60、EmbodiedSAM: Online Segment Any 3D Thing in Real Time
61、Robustness Inspired Graph Backdoor Defense
62、Proxy Denoising for Source-Free Domain Adaptation
63、Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models
64、On the Identification of Temporal Causal Representation with Instantaneous Dependence
65、WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
66、Learning Dynamics of LLM Finetuning
67、More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness
68、Topological Blindspots: Understanding and Extending Topological Deep Learning Through the Lens of Expressivity
69、Geometry-aware RL for Manipulation of Varying Shapes and Deformable Objects
70、ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids
71、MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
72、Compositional Entailment Learning for Hyperbolic Vision-Language Models
73、DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications for Multi-Task RL
74、Composing Unbalanced Flows for Flexible Docking and Relaxation
75、On the Role of Attention Heads in Large Language Model Safety
76、AlphaEdit: Null-Space Constrained Model Editing for Language Models
77、Prioritized Generative Replay
78、Learning stochastic dynamics from snapshots through regularized unbalanced optimal transport
79、No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
80、The Geometry of Categorical and Hierarchical Concepts in Large Language Models
81、Variational Diffusion Posterior Sampling with Midpoint Guidance
82、Retrieval Head Mechanistically Explains Long-Context Factuality
83、NeuralPlane: Structured 3D Reconstruction in Planar Primitives with Neural Fields
84、Differential Transformer
85、High-Dynamic Radar Sequence Prediction for Weather Nowcasting Using Spatiotemporal Coherent Gaussian Representation
86、Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement
87、REEF: Representation Encoding Fingerprints for Large Language Models
88、Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free
89、Data Selection via Optimal Control for Language Models
90、GridMix: Exploring Spatial Modulation for Neural Fields in PDE Modeling
91、LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
92、Rethinking the generalization of drug target affinity prediction algorithms via similarity aware evaluation
93、LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
94、Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning
95、Cut Your Losses in Large-Vocabulary Language Models
96、Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models
97、FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
98、SANA: Efficient High-Resolution Text-to-Image Synthesis with Linear Diffusion Transformers
99、REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments
100、MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
101、Do as We Do, Not as You Think: the Conformity of Large Language Models
102、Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation
103、Probabilistic Learning to Defer: Handling Missing Expert Annotations and Controlling Workload Distribution
104、Artificial Kuramoto Oscillatory Neurons
105、ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding
106、A Decade's Battle on Dataset Bias: Are We There Yet?
107、Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
108、Open-World Reinforcement Learning over Long Short-Term Imagination
109、SAM 2: Segment Anything in Images and Videos
110、Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities
111、A Computational Framework for Modeling Emergence of Color Vision in the Human Brain
1、Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
2、TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation
3、Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
4、CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation
5、PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding
1、OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
2、FairMT-Bench: Benchmarking Fairness for Multi-turn Dialogue in Conversational LLMs
1、Scaling Speech-Text Pre-training with Synthetic Interleaved Data
2、K-HALU: Multiple Answer Korean Hallucination Benchmark for Large Language Models
3、DiscoveryBench: Towards Data-Driven Discovery with Large Language Models
4、Image Watermarks are Removable using Controllable Regeneration from Clean Noise
5、ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
6、Resolution Attack: Exploiting Image Compression to Deceive Deep Neural Networks
7、SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations
8、Interpretable Compressed Descriptions For Image Generation
9、Measuring And Improving Engagement of Text-to-Image Generation Models
10、AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
11、Empowering LLM Agents with Zero-Shot Optimal Decision-Making through Q-learning
12、TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models
13、Diversity-Rewarded CFG Distillation
14、GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
15、Looking Backward: Streaming Video-to-Video Translation with Feature Banks
16、Personalized Representation from Personalized Generation
17、VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
18、Less is More: Masking Elements in Image Condition Features Avoids Content Leakages in Style Transfer Diffusion Models
19、$F^3$Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos
20、SWEb: A Large Web Dataset for the Scandinavian Languages
21、Unleashing the Potential of Vision-Language Pre-Training for 3D Zero-Shot Lesion Segmentation via Mask-Attribute Alignment
22、PersonalLLM: Tailoring LLMs to Individual Preferences
23、BadRobot: Jailbreaking Embodied LLMs in the Physical World
24、Grounding Multimodal Large Language Model in GUI World
25、Personalized Visual Instruction Tuning
26、TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types
27、TAU-106K: A New Dataset for Comprehensive Understanding of Traffic Accident
28、Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels
29、UNIP: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation
30、DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors
31、Aria-MIDI: A Dataset of Piano MIDI Files for Symbolic Music Modeling
32、Depth Any Video with Scalable Synthetic Data
33、HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
34、SONICS: Synthetic Or Not - Identifying Counterfeit Songs
35、CycleResearcher: Improving Automated Research via Automated Review
36、Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks