Join us in Room 100C on 21st October 2024, 9:00 am :)
Session Presentation: Frontiers of Large Language Model-Based Agentic Systems - Construction, Efficacy and Safety (PPTX)
The presentation “Frontiers of Language Model-Based Agentic Systems: Construction, Efficacy, and Safety” offers a comprehensive exploration of the development, effectiveness, and safety considerations of agentic systems powered by large language models (LLMs). Below are summarized notes for each section, tailored for the CIKM ACM conference.
Presented by: Jia He
This section introduces the concept of agentic systems that leverage LLMs to perform tasks autonomously. It highlights the evolution of LLMs from simple language processors to sophisticated agents capable of decision-making and task execution. The discussion emphasizes the significance of integrating LLMs into agentic frameworks to enhance their capabilities and adaptability across various domains. In essence, an agent can be thought of as an LLM (or GenAI) model call plus expandable capabilities in the form of tools, memory, and planning/orchestration.
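The "model call plus tools, memory, and planning" framing can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the presentation: `fake_llm` is a hard-coded stand-in for a real model call, the `TOOL:` / `FINAL:` reply format is invented, and `Agent` is a hypothetical class.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: request a tool once, then answer."""
    if "Observation:" in prompt:
        # A tool result is already in memory, so return it as the final answer.
        return "FINAL: " + prompt.rsplit("Observation: ", 1)[1].strip()
    return "TOOL: add 2 3"


def add(args: List[str]) -> str:
    """A toy tool: sum integer arguments and return the result as text."""
    return str(sum(int(a) for a in args))


@dataclass
class Agent:
    llm: Callable[[str], str]                      # the model call
    tools: Dict[str, Callable[[List[str]], str]]   # expandable capabilities
    memory: List[str] = field(default_factory=list)

    def run(self, task: str, max_steps: int = 5) -> str:
        """Orchestration loop: call the model, run tools, stop on a final answer."""
        self.memory.append(f"Task: {task}")
        for _ in range(max_steps):
            reply = self.llm("\n".join(self.memory))
            if reply.startswith("FINAL: "):
                return reply[len("FINAL: "):]
            # Assumed reply form: "TOOL: <name> <arg> <arg> ..."
            name, *args = reply[len("TOOL: "):].split()
            self.memory.append(f"Observation: {self.tools[name](args)}")
        return "no answer within step budget"


agent = Agent(llm=fake_llm, tools={"add": add})
result = agent.run("What is 2 + 3?")
print(result)
```

In a real system the stub would be replaced by an actual model client, and the step budget guards against loops that never reach a final answer.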
Presented by: Jia He & Kabir Walia
Here, the focus is on the methodologies employed in building agents powered by LLMs. Key components include data collection, model training, and the integration of reinforcement learning techniques to fine-tune agent behaviors. The section also addresses architectural considerations, such as modular design and scalability, to ensure that agents can handle complex tasks efficiently. We go over three key components:
Presented by: April Hazel, Tushar Dhadiwal
In this section, we go over what differentiates multi-agent systems from single-agent systems. We cover collaboration patterns and notable multi-agent frameworks such as AutoGen, and we present a hands-on demo for configuring a multi-agent system. Find the notebook here.
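One common collaboration pattern, turn-taking between a writer and a reviewer, can be sketched in plain Python. This is an illustrative assumption and does not use or reproduce the AutoGen API or the demo notebook; the `writer`, `reviewer`, and `round_robin` names and the `APPROVE` stop signal are hypothetical, and each agent is a hard-coded function rather than a model call.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (speaker, text)
AgentFn = Callable[[List[Message]], str]


def writer(history: List[Message]) -> str:
    """Drafts first, then revises when the last message came from the reviewer."""
    if history and history[-1][0] == "reviewer":
        return "draft v2 (addresses: " + history[-1][1] + ")"
    return "draft v1"


def reviewer(history: List[Message]) -> str:
    """Approves a revised draft, otherwise requests a change."""
    return "APPROVE" if "v2" in history[-1][1] else "add a citation"


def round_robin(agents: List[Tuple[str, AgentFn]], max_turns: int = 6) -> List[Message]:
    """Let agents speak in a fixed order until one approves or turns run out."""
    history: List[Message] = []
    for turn in range(max_turns):
        name, agent = agents[turn % len(agents)]
        text = agent(history)
        history.append((name, text))
        if text == "APPROVE":
            break
    return history


transcript = round_robin([("writer", writer), ("reviewer", reviewer)])
for speaker, text in transcript:
    print(f"{speaker}: {text}")
```

The shared `history` list is the only channel between agents here; frameworks typically add richer routing, such as a manager that picks the next speaker.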
Presented by: Jenny Chen, Reshmi Ghosh
Choosing how to evaluate agents is a non-trivial task. The session goes over how to make these decisions and understand the tradeoffs between single-trajectory and end-to-end evaluations, and between automated evaluations using an LLM judge and grounded, deterministic metrics. It also gives examples of how to think about and standardize evaluation procedures.
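To make these tradeoffs concrete, here is a small sketch that scores one agent run three ways: a deterministic end-to-end check, a deterministic trajectory check, and a judge-based score. The `Run` structure, the metric definitions, and `stub_judge` are assumptions for illustration, not the evaluation procedure from the session; a real LLM judge would replace the stub with a model call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Run:
    steps: List[str]     # tool names the agent called, in order
    final_answer: str


def end_to_end_match(run: Run, expected_answer: str) -> bool:
    """Deterministic end-to-end check: compare only the final answer."""
    return run.final_answer.strip().lower() == expected_answer.strip().lower()


def trajectory_recall(run: Run, reference_steps: List[str]) -> float:
    """Fraction of reference steps found in the run in the same relative order."""
    if not reference_steps:
        return 1.0
    matched, pos = 0, 0
    for ref in reference_steps:
        try:
            pos = run.steps.index(ref, pos) + 1
            matched += 1
        except ValueError:
            continue  # reference step missing from the rest of the trajectory
    return matched / len(reference_steps)


def judged_score(run: Run, question: str, judge: Callable[[str], float]) -> float:
    """Judge-based score: delegate to a callable that rates a rendered prompt."""
    return judge(f"Question: {question}\nAnswer: {run.final_answer}")


def stub_judge(prompt: str) -> float:
    """Hypothetical stand-in for an LLM judge; a real one would call a model."""
    return 1.0 if "42" in prompt else 0.0


run = Run(steps=["search", "calculator", "search"], final_answer=" 42 ")
exact = end_to_end_match(run, "42")
recall = trajectory_recall(run, ["search", "calculator", "summarize"])
judged = judged_score(run, "What is 6 * 7?", stub_judge)
print(exact, round(recall, 3), judged)
```

The deterministic metrics are reproducible but only as good as the reference answer and reference trajectory, while the judge can handle free-form answers at the cost of its own variance and bias.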
Efficient frameworks - This part reviews performance metrics for LLM-based agents, including accuracy, adaptability, and user satisfaction. It presents case studies where these agents have been deployed, demonstrating their ability to understand context, generate human-like responses, and learn from interactions to improve over time. The discussion also covers the challenges of measuring efficacy, such as handling ambiguous inputs and maintaining coherence in extended interactions.
Presented by: Reshmi Ghosh
Safety is a critical aspect addressed in this section, which focuses on the potential risks associated with autonomous LLM agents. Topics include the prevention of harmful outputs, fairness and bias mitigation, and the safeguarding of user privacy. The section references recent studies on AI safety, such as the work on TrustLLM, which establishes benchmarks across dimensions like truthfulness, safety, and fairness. It also discusses frameworks like TrustAgent, designed to enhance the safety of LLM-based agents through strategic planning and adherence to predefined constitutions.
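The idea of checking an agent's plan against a predefined set of rules before execution can be sketched as follows. This is a toy illustration only and is not the TrustAgent method or API; the rule list, the string-based action format, and `check_plan` are all hypothetical.

```python
from typing import Callable, List, Tuple

Rule = Tuple[str, Callable[[str], bool]]  # (description, violates(action))

# Hypothetical "constitution": each rule flags actions that must not run.
RULES: List[Rule] = [
    ("never delete files", lambda a: a.startswith("delete_file")),
    (
        "never send email without confirmation",
        lambda a: a.startswith("send_email") and "confirmed=True" not in a,
    ),
]


def check_plan(plan: List[str], rules: List[Rule]) -> List[Tuple[str, str]]:
    """Return (action, violated rule) pairs for every planned action that breaks a rule."""
    return [
        (action, description)
        for action in plan
        for description, violates in rules
        if violates(action)
    ]


plan = ["read_file(report.txt)", "send_email(to=bob)", "delete_file(tmp)"]
violations = check_plan(plan, RULES)
for action, rule in violations:
    print(f"blocked {action}: {rule}")
```

String matching is far too weak for real deployments; the point is only that a plan is screened before any tool runs, so violations can be blocked or sent back to the planner.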
The final section outlines prospective avenues for advancing LLM-based agentic systems. It emphasizes the need for interdisciplinary research combining machine learning, ethics, and human-computer interaction to develop agents that are not only effective but also trustworthy. The section also highlights the importance of creating robust benchmarks, such as SafeAgentBench, to evaluate the safety and task planning capabilities of embodied LLM agents. Emerging trends like the incorporation of human-centered design principles in AutoML paradigms are also discussed, promoting collaborative design of ML systems that integrate human expertise and values.
In summary, the presentation provides a thorough examination of the current landscape and future prospects of LLM-based agentic systems, with a strong emphasis on their construction, efficacy, and safety.