How can I best benchmark LLM planning?
PLANET: A Collection of Benchmarks for Evaluating LLMs' Planning Capabilities
April 22, 2025
https://arxiv.org/pdf/2504.14773
This paper introduces PLANET, a comprehensive survey of benchmarks for evaluating the planning capabilities of Large Language Models (LLMs), particularly in the context of agentic AI. It categorizes existing benchmarks by application domain, including embodied environments, web navigation, scheduling, games, and task automation.
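As a rough illustration (not the paper's full taxonomy), the benchmarks named in this summary can be grouped by domain like so:

```python
# Illustrative grouping of the benchmarks mentioned in this summary,
# keyed by application domain; not PLANET's complete taxonomy.
PLANNING_BENCHMARKS: dict[str, list[str]] = {
    "embodied environments": ["ALFWorld", "VirtualHome"],
    "web navigation":        ["WebArena", "Mind2Web"],
    "multi-agent / games":   ["GAMA-Bench", "AgentBoard"],
}
```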
Key takeaways for LLM-based multi-agent systems include:
- Emphasis on planning: The paper highlights the crucial role of planning in LLM agents for complex task completion.
- Benchmark categorization: Provides a structured overview of various planning benchmarks, facilitating selection for specific multi-agent scenarios.
- Multi-agent environments: Benchmarks like GAMA-Bench and AgentBoard specifically test LLM decision-making in competitive and collaborative multi-agent settings.
- Gaps in current benchmarks: Identifies areas needing improvement, such as richer world models, longer-horizon tasks, planning under uncertainty, and multimodal planning support. This is particularly relevant for building robust multi-agent systems.
- Relevance of games: Games are emphasized as valuable testbeds for strategic planning and multi-agent behavior in LLMs.
- Shift towards embodied and web-based agents: The survey shows a growing trend towards testing LLMs in interactive environments such as simulated households (ALFWorld, VirtualHome) and websites (WebArena, Mind2Web), pushing beyond purely text-based reasoning. This trend is directly applicable to building real-world multi-agent applications; a minimal evaluation-loop sketch follows this list.
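To make the evaluation setup concrete, here is a minimal sketch of an episode loop for a text-based planning benchmark in the spirit of ALFWorld or WebArena. The `PlanningEnv` interface, the `ToyKeyDoorEnv` stub, and the `propose_action` callback are illustrative stand-ins, not APIs from any of the benchmarks above; a real harness would call an LLM inside `propose_action`.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class PlanningEnv(Protocol):
    """Minimal text-environment interface (hypothetical simplification)."""
    def reset(self) -> str: ...                                  # initial observation
    def step(self, action: str) -> tuple[str, bool, bool]: ...   # observation, success, done


@dataclass
class EpisodeResult:
    success: bool
    steps: int
    trajectory: list[tuple[str, str]] = field(default_factory=list)


def run_episode(env: PlanningEnv,
                propose_action: Callable[[str, list[tuple[str, str]]], str],
                max_steps: int = 30) -> EpisodeResult:
    """Roll out one episode: the agent (normally an LLM) proposes an action
    from the current observation and the trajectory so far."""
    obs = env.reset()
    trajectory: list[tuple[str, str]] = []
    for step in range(1, max_steps + 1):
        action = propose_action(obs, trajectory)
        obs, success, done = env.step(action)
        trajectory.append((action, obs))
        if done:
            return EpisodeResult(success, step, trajectory)
    return EpisodeResult(False, max_steps, trajectory)  # step budget exhausted


class ToyKeyDoorEnv:
    """Trivial two-step task standing in for a real benchmark environment."""
    def reset(self) -> str:
        self.has_key = False
        return "You see a key and a locked door."

    def step(self, action: str) -> tuple[str, bool, bool]:
        if "key" in action.lower():
            self.has_key = True
            return "You picked up the key.", False, False
        if "door" in action.lower() and self.has_key:
            return "The door opens. Task complete.", True, True
        return "Nothing happens.", False, False


if __name__ == "__main__":
    # Scripted action sequence standing in for a real LLM call.
    plan = iter(["take key", "open door"])
    result = run_episode(ToyKeyDoorEnv(), lambda obs, traj: next(plan, "wait"))
    print(f"success={result.success} in {result.steps} steps")
```

A full benchmark run would aggregate EpisodeResult values across many tasks into a success rate and average step count, which is roughly the form of metric these planning benchmarks report.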