Agentic AI — systems that autonomously reason, plan, and act to achieve complex objectives, powered by advanced language models, memory architectures, and external tools — represents a shift from reactive AI to proactive agents capable of independent decision-making. In industries ranging from logistics to healthcare and education, agentic AI is already optimising supply chains, assisting in medical diagnoses, and personalising education. This seeming quantum leap in autonomy introduces technical, ethical, and societal challenges that demand careful examination.
This article explores the leading frameworks driving agentic AI, evaluates their strengths and limitations, and considers the broader implications of this technology. Through a detailed analysis, we aim to provide a clear understanding of the current state of agentic AI and its trajectory.
From Reactive Systems to Autonomous Agents
The history of agentic AI systems reflects the broader evolution of artificial intelligence, tracing a path from simple rule-based automation to the sophisticated, autonomous agents that define modern Machine Intelligence. An agentic AI system is characterised by its ability to act independently, make decisions, and adapt its behaviour based on its environment or predefined goals. This autonomy sets it apart from traditional algorithms, which typically follow rigid instructions without initiative.
Early Beginnings: 1950s–1960s
The roots of agentic AI lie in the 1950s and 1960s, when researchers began experimenting with rule-based systems and expert systems. These early systems, such as the Logic Theorist (Newell & Simon, 1956) and Dendral (Lindsay et al., 1968), were designed to mimic human decision-making within specific domains. The Logic Theorist could prove mathematical theorems using predefined rules, while Dendral assisted chemists by hypothesising molecular structures. Though these systems exhibited a rudimentary form of agency through decision-making, their reliance on static rules limited their adaptability and learning capacity. Nonetheless, they established artificial intelligence as a field capable of producing decision-making entities, laying the foundation for future advancements.
The Rise of Learning Systems: 1980s–1990s
The 1980s and 1990s marked a turning point with the introduction of machine learning, particularly reinforcement learning (RL). RL enabled AI systems to learn from interactions with their environment, adapting their actions to maximise rewards. A standout example is TD-Gammon (Tesauro, 1995), which used RL to achieve championship-level skill in backgammon through self-play. This period also saw the development of autonomous robots, such as those based on Rodney Brooks’ subsumption architecture (Brooks, 1986), which allowed robots to navigate and respond to their surroundings without human intervention. These advancements marked a significant step toward true agency, as these systems began to exhibit adaptability and independence.
Multi-Agent Systems: 2000s
In the 2000s, attention shifted to multi-agent systems (MAS), where multiple AI agents collaborated or competed to solve complex problems. Projects like RoboCup (Kitano et al., 1997), which aimed to create teams of autonomous robots capable of playing soccer, showcased the potential of agentic AI in coordinated, social contexts. MAS introduced challenges in communication, negotiation, and strategy among agents, but it also highlighted the power of collaborative intelligence. This era expanded the scope of agentic AI, emphasising its ability to operate within dynamic, multi-entity environments.
Deep Learning Revolution: 2010s
The 2010s ushered in a transformative era with the rise of deep learning. By leveraging deep neural networks, systems could process vast datasets and learn complex patterns, enabling breakthroughs in areas like natural language processing and image recognition—key capabilities for interacting with the world. AlphaGo (Silver et al., 2016), developed by DeepMind, exemplified this progress by mastering the game of Go, defeating human champions through strategic planning and intuition. Deep learning also fueled the creation of advanced conversational agents, such as virtual assistants, which began to exhibit agentic traits by performing tasks and engaging in dialogue with minimal human oversight.
The LLM Era: 2020s
The 2020s have been defined by the emergence of large language models (LLMs), such as GPT-3 (Brown et al., 2020) and its successors. LLMs can generate human-like text, understand context, and perform reasoning-based tasks, blurring the line between traditional AI and agentic systems. These models often act as agents themselves, making decisions and executing actions based on their training. For instance, an LLM can write code, solve math problems, or simulate conversations without explicit instructions for each task. This has led to new agentic frameworks where LLMs serve as reasoning cores, enhanced by tools and memory systems to boost their autonomy.
As you can see, modern agentic AI marks a departure from the earlier reactive paradigm. By integrating large language models (LLMs) as reasoning engines, these systems can process natural language, retain context through memory systems, and interact with external tools — such as web browsers or code interpreters — to act on their environment. This combination allows agents to break down complex tasks, adapt strategies in real time, and operate with a level of initiative previously unattainable. As AI researcher Dr. Emily Chen notes, “Agentic AI represents a move from executing predefined tasks to navigating ambiguity and making informed decisions.”
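The reason-and-act pattern described here can be sketched in a few lines of plain Python. This is a concept illustration only, not any framework’s real API; `fake_llm`, the tool registry, and the stopping rule are all assumptions made for the sketch:

```python
# Minimal sketch of an agentic reason-act loop: an LLM "reasoning core"
# picks a tool, observes the result, and repeats until it can answer.
# fake_llm is a stand-in for a real model call.

def fake_llm(goal, memory):
    """Stand-in reasoning core: decides the next action from goal and memory."""
    if not memory:
        return ("use_tool", "search", goal)           # first, gather information
    return ("answer", f"Summary of findings for: {goal}")

TOOLS = {
    "search": lambda query: f"search results for '{query}'",  # stub external tool
}

def run_agent(goal, max_steps=5):
    memory = []                                       # short-term context
    for _ in range(max_steps):
        decision = fake_llm(goal, memory)
        if decision[0] == "answer":
            return decision[1]
        _, tool_name, tool_input = decision
        observation = TOOLS[tool_name](tool_input)    # act on the environment
        memory.append((tool_name, observation))       # retain context for next step
    return "gave up"

print(run_agent("renewable energy trends"))
```

Every framework below is, at its core, a more sophisticated version of this loop: better memory, better tool routing, and better ways to decide when to stop.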
A Very Personal View
Over the last several years I’ve worked with and experimented with a number of agentic frameworks. Below I’ve tried to give each of the four contenders a fair crack of the whip – LangChain, CrewAI, AutoGen, and Agno. There are many others, but these are the ones I’ve used – sometimes in play and sometimes in anger.
LangChain
LangChain is a versatile framework built to simplify the development of intelligent, context-aware agents by leveraging a highly modular architecture. Its design is meant to empower developers to combine large language models (LLMs), memory systems, and external tools into customisable workflows, making it a suitable base for projects that demand flexibility and precision.
Key Features
- Modular Components: LangChain’s standout feature is its plug-and-play structure where developers can mix and match LLMs (e.g., GPT-4, LLaMA, or Claude), memory modules, and tools like web browsers or APIs, allowing for fine-tuned agent behaviour. This modularity ensures that the framework can adapt to a wide variety of tasks.
- Memory Systems: Context retention is a core strength as LangChain supports multiple memory types—short-term for immediate recall, long-term for extended interactions, and vector-based for semantic understanding—enabling agents to remember past interactions and maintain coherent dialogues or workflows.
- Tool Integrations: Agents can tap into external resources, such as real-time data feeds, code interpreters, or search engines, expanding their problem-solving scope. This allows them to fetch up-to-date information or perform actions like running calculations or querying databases.
- Chains: The concept of “chains” defines LangChain’s workflow logic. These chains are sequences of steps—like retrieving data, processing it with an LLM, and generating an output—that developers can design to suit specific tasks, from simple responses to multi-stage reasoning.
- Prompt Engineering Utilities: LangChain also includes tools to create and refine prompts, ensuring that LLMs produce accurate, task-specific outputs. This reduces trial-and-error and enhances reliability.
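To make the “chains” idea concrete, here is a minimal sketch of a retrieve-prompt-model-parse pipeline in plain Python. The function names are illustrative stand-ins, not LangChain’s actual API:

```python
# Concept sketch of a "chain": a fixed sequence of steps (retrieve -> prompt ->
# model -> parse) composed into a single callable.

def retrieve(question):
    # Stand-in retriever; a real chain might query a vector store here.
    return {"question": question, "context": "stub document text"}

def build_prompt(inputs):
    return f"Answer using context: {inputs['context']}\nQ: {inputs['question']}"

def call_model(prompt):
    # Stand-in for an LLM call.
    return f"MODEL_OUTPUT({prompt[:20]}...)"

def parse(raw):
    return raw.strip()

def chain(question):
    # Each step's output feeds the next, which is exactly how a chain wires steps.
    return parse(call_model(build_prompt(retrieve(question))))

print(chain("What is agentic AI?"))
```

Swapping any one step (a different retriever, a different model) without touching the others is the modularity LangChain is selling.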
Limitations
- Complexity: The large number of options can overwhelm newcomers or teams with limited AI experience. Configuring components like memory types and tool integrations requires a learning curve that may delay rapid prototyping.
- Performance Overhead: The layered architecture—juggling LLMs, memory, and external tools—can introduce latency, especially in time-sensitive applications. This makes LangChain less ideal for scenarios where split-second responses are critical.
- Cost Implications: Heavy reliance on LLMs and external APIs can rack up expenses, particularly in high-volume or enterprise settings where frequent calls are needed.
Khush’s Take
It’s like having a Swiss Army knife—modular components, memory systems, and tool integrations (like web search or code execution) that let me build exactly the agent I need. If my project calls for flexibility and context retention—say, a research assistant that tracks past queries and pulls real-time data—LangChain feels like a no-brainer. The massive community support means I’ve got plenty of plugins and resources to lean on, which is a huge plus.
But LangChain is a beast. The learning curve is steep, and all those layers can bog things down. If I’m building something that needs lightning-fast responses, like a live customer support agent, the performance overhead could be a problem. Then there’s cost: LLM calls and integrations still add up, and running LangChain in production might strain my budget. I’d use it if my project demands heavy customisation and I’ve got time to tweak it. Otherwise, I’d look for something lighter and quicker to deploy.
CrewAI
CrewAI focuses on multi-agent collaboration, enabling teams of specialised agents to work together on tasks. Rather than relying on a single, all-purpose agent, CrewAI tries to mimic human teamwork by assigning distinct roles and responsibilities to each agent within a “crew.” This makes it a good choice for projects requiring task decomposition and coordinated effort, such as research pipelines or operational planning.
Key Features
- Role-Based Agents: Each agent in a crew has a defined role—like researcher, editor, or planner—tailored to specific skills. This specialisation ensures that tasks are handled by the most suitable agent, boosting efficiency and output quality.
- Task Management: CrewAI excels at breaking down complex goals into manageable subtasks, automatically distributing them across the crew. This orchestration minimises manual oversight and keeps workflows on track.
- Collaboration: Agents communicate and share outputs within the crew, enabling seamless handoffs. For instance, one agent might gather data while another analyses it, with the crew ensuring all pieces align toward the final objective.
- Parallel Execution: By allowing agents to work concurrently, CrewAI attempts to accelerate project timelines. Multiple tasks—like data collection and report drafting—can happen simultaneously, reducing bottlenecks.
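The role-based handoff these features describe can be sketched in plain Python. The class and role names are illustrative, not CrewAI’s actual API; a real crew would also handle task decomposition and parallelism automatically:

```python
# Concept sketch of role-based multi-agent collaboration: each "agent" is a
# role wrapping a skill, and a coordinator routes subtasks to the right
# specialist, passing one agent's output to the next (a sequential handoff).

class Agent:
    def __init__(self, role, skill):
        self.role = role
        self.skill = skill            # callable that performs the role's work

    def perform(self, task):
        return self.skill(task)

def researcher(task):
    return f"notes on {task}"

def writer(task):
    return f"draft based on {task}"

crew = {
    "researcher": Agent("researcher", researcher),
    "writer": Agent("writer", writer),
}

def run_crew(goal):
    notes = crew["researcher"].perform(goal)   # specialist 1 gathers material
    return crew["writer"].perform(notes)       # specialist 2 builds on it

print(run_crew("supply-chain report"))
# -> draft based on notes on supply-chain report
```

The value proposition is that each role stays small and testable, while the coordinator owns the workflow.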
Limitations
- Scalability Constraints: CrewAI performs well with small to medium crews but may falter with dozens of agents or sprawling workflows. Its still-evolving infrastructure can lead to instability in large-scale deployments.
- Limited Tool Ecosystem: Unlike LangChain, CrewAI offers fewer integrations with external tools, restricting its versatility for tasks requiring diverse data sources or functionalities.
- Error Management: The framework lacks sophisticated error-handling mechanisms at present. If an agent fails or misinterprets a task, the workflow may stall without easy recovery options, requiring manual intervention. This isn’t just a CrewAI issue – most agentic frameworks suffer from this ‘dead-end pursuit’ problem.
Khush’s Take
I like CrewAI’s multi-agent collaboration approach. The idea of assigning roles—like a researcher, writer, and editor—to specialised agents feels like building a dream team. If my next project involves breaking down a complex task, like automating a content pipeline or coordinating a logistics workflow, CrewAI’s structure could save me real time and effort.
That said, CrewAI, like all these frameworks, is still maturing, and I’ve heard about stability hiccups with bigger teams of agents. If my project were mission-critical—think finance—I’d worry about it buckling under pressure. Its tool ecosystem also feels narrower than LangChain’s, which could box me in if I need many integrations – although Model Context Protocol (MCP) from Anthropic makes me feel happier in general about building integrations with agentic frameworks. I’d pick CrewAI for a project where agent collaboration is a guiding principle and I can work within its current limits. But if I need bulletproof reliability right now, I’d hold off until it’s more seasoned.
AutoGen
AutoGen, developed by Microsoft, is a research-driven framework that pushes the boundaries of conversational AI and human-agent collaboration. Designed with experimentation in mind, it offers advanced features for creating dynamic, context-aware agents that excel in dialogue and adaptability. AutoGen is designed to appeal to those exploring the cutting edge of agentic systems, particularly in settings where fluid interaction and innovation are priorities.
Key Features
- Conversational Mastery: AutoGen is built for multi-turn dialogues, maintaining context and coherence over extended interactions. This makes it ideal for applications requiring natural, human-like exchanges.
- Dynamic Compute: The framework intelligently scales computational resources based on task demands. Simple queries use minimal power, while complex problems—like intricate reasoning or simulations—get a boost, optimising efficiency.
- Human-in-the-Loop: AutoGen seamlessly integrates human input, allowing users to guide agents, correct errors, or refine outputs in real time. This hybrid approach enhances flexibility and accuracy.
- Extensibility: Its open design invites customisation, making it a sandbox for testing novel AI concepts, from new reasoning algorithms to experimental interaction models.
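The “dynamic compute” idea is easy to illustrate: route easy queries to a cheap model and hard ones to an expensive one. This sketch is a toy heuristic of my own, not AutoGen’s actual routing logic; the model stubs and threshold are assumptions:

```python
# Concept sketch of dynamic compute: a crude difficulty estimate decides
# whether a query goes to a cheap, fast model or a slower, stronger one.

def cheap_model(query):
    return f"quick answer to: {query}"

def expensive_model(query):
    return f"deliberate, multi-step answer to: {query}"

def difficulty(query):
    # Toy heuristic: longer queries containing reasoning keywords count as hard.
    hard_words = {"prove", "plan", "simulate", "optimise"}
    words = query.lower().split()
    return len(words) + 10 * sum(w in hard_words for w in words)

def route(query, threshold=12):
    model = expensive_model if difficulty(query) >= threshold else cheap_model
    return model(query)

print(route("What time is it?"))                     # easy -> cheap model
print(route("Plan a three-week logistics rollout"))  # hard -> expensive model
```

In production you would replace the heuristic with something learned or model-driven, but the shape of the trade-off (latency and cost versus depth) is the same.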
Limitations
- Steep Learning Curve: AutoGen’s advanced features demand a solid grasp of AI principles, making it less approachable for beginners or teams seeking quick wins.
- Resource Demands: Dynamic compute and experimental tools require significant processing power, driving up costs and hardware needs—especially for large-scale or prolonged use.
- Production Gaps: As a research tool, AutoGen prioritises innovation over stability. It lacks the polish—like monitoring dashboards or fault tolerance—needed for production-ready systems.
Khush’s Take
AutoGen to me is like the wild card I love to play with. Its focus on conversational AI and dynamic compute—where an agent scales its thinking power based on the task—is pure sci-fi coolness. If my project is about pushing boundaries, like prototyping a next-gen tutor that adapts on the fly or testing human-agent teamwork, AutoGen’s flexibility is awesome!
The problem is the complexity that comes with rapid innovation. It’s intense, and it hoovers up system resources like nobody’s business. If I don’t have beefy hardware or the skills to tame its experimental side, I’ll be in over my head. Plus, it’s not built for production—it’s missing the stability and monitoring I demand for a live app. I’d dive into AutoGen if my project were about pushing what’s possible and I were okay with rough edges. For anything practical or deadline-driven, I’d steer clear.
Agno
Agno is a very new, TypeScript-based framework that prioritises speed and simplicity, offering a streamlined approach to agentic AI. Designed for real-time applications, its minimalist architecture delivers fast performance with minimal overhead, making it a standout for developers who need efficient, lightweight agents—particularly in web or edge environments.
Key Features
- Minimalist Architecture: Agno strips away unnecessary complexity, focusing on core functionality to ensure low latency and efficient resource use. This lean design keeps agents nimble and responsive.
- Real-Time Focus: Built for speed, Agno excels in scenarios requiring instant reactions, such as live chats or on-device processing, where delays are unacceptable.
- Web Integration: Its TypeScript foundation makes it a natural fit for modern JavaScript ecosystems, enabling easy deployment in web apps, mobile platforms, or serverless setups.
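The minimalist style amounts to a single fast request-to-response path: no planning loop, no memory, one cheap decision. This sketch is my own illustration of that spirit, shown in Python for consistency with the other sketches rather than in Agno’s own TypeScript; every name in it is made up:

```python
# Concept sketch of a minimalist, single-purpose agent: one cheap keyword
# lookup plus a templated reply, so the latency budget stays tiny.

import time

CANNED = {
    "hours": "We're open 9-5.",
    "price": "Plans start at $10/mo.",
}

def handle(message):
    text = message.lower()
    for keyword, reply in CANNED.items():
        if keyword in text:
            return reply
    return "Let me connect you with a human."   # fall back rather than reason

start = time.perf_counter()
reply = handle("What are your hours?")
elapsed_ms = (time.perf_counter() - start) * 1000
print(reply, f"({elapsed_ms:.3f} ms)")
```

Everything the richer frameworks add (memory, chains, crews) is deliberately absent here; that absence is the feature.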
Limitations
- Limited Scope: Agno’s simplicity comes at a cost—it’s not built for complex reasoning, multi-step workflows, or collaborative tasks, restricting its use to basic, single-purpose agents.
- Nascent Ecosystem: As a newer player, Agno lacks the rich toolset and community support of its peers, which can hinder development speed and troubleshooting.
- Feature Gaps: It skips advanced capabilities like memory systems or dynamic scaling, making it less versatile for projects needing depth or adaptability.
Khush’s Take
Agno is a breath of fresh air for me, and I’d turn to it for simplicity. It’s fast, lightweight, and built for real-time action—perfect if I’m trying to build a chatbot or an edge-deployed agent where speed is everything. With its TypeScript roots, it slots right into web or mobile projects, and the low overhead keeps things snappy. Performance can make or break user experience, and Agno’s focus on efficiency is a big draw.
But it’s bare-bones, and that’s where I have some concerns. No memory systems, no fancy reasoning, no teamwork features—just raw speed. If my project needs depth—like managing long conversations or juggling multiple tasks—Agno’s going to leave me without any options. I’d choose it for a straightforward, speed-first job where complexity isn’t an issue. For anything richer or more collaborative, I’d need a framework with more muscle.
Or would I?
What Does It Matter?
Let’s take a step back.
With agentic models like Claude Sonnet and Grok 3 able to autonomously select, use, and even implement new and custom agentic frameworks, does it still matter which framework you know? With machines increasingly capable of analysing intent and choosing the right tools—or even bypassing frameworks entirely—shouldn’t the future be about everyone – not just developers – focusing on stating clear intent and letting the Machine Intelligence handle the rest?
Traditionally, we rely on the development team’s deep knowledge of specific frameworks—like React for the front end or Django for the back end—as the cornerstone of efficient and timely development. Frameworks provide structure, reusable components, and best practices, allowing developers to build applications quickly and troubleshoot effectively. However, this expertise takes significant time to acquire and maintain as frameworks evolve and, inevitably, become bloated and hard to use.
Today, coding agents can interpret your intent, evaluate the problem, and either select an appropriate framework or generate a custom solution without one. This capability suggests that we might not need to master specific frameworks anymore. Instead, we could focus on defining clear goals—say, “create a scalable chatbot”—and let the machine decide how to execute it. Try it yourself; you’ll be surprised by just how much power a small monthly subscription brings you.
Let The Machine Be The Machine
If the Machine can reliably pick the right tools (or none at all), the argument for prioritising intent over programming-language or framework knowledge grows stronger. Machines can analyse options faster than we can, selecting frameworks—or avoiding them—based on the project’s unique needs. Machines aren’t tied to our prior knowledge, and advanced models can learn new frameworks very quickly. If I know only Flask but a project demands FastAPI’s performance, the machine can pivot without requiring me to learn the new framework – new to me, of course!
By delegating how something is done, we can concentrate on understanding our goals, user needs, and system architecture rather than diving into framework-specific details. Perhaps we can then also focus on why something needs doing – doesn’t that mean we’ll reduce the ‘not invented here’ problem? I mean – how many ESBs do we need? How many NoSQL databases? How many versions of a GPU? Doesn’t that help reduce waste and emissions – you know, make IT greener?
I firmly believe that the role of ‘developer’ is changing faster than most are seeing or are willing to believe. More on that in another post…
Khush’s Take
I think it matters less which specific framework we know and more that we understand frameworks broadly. The future leans heavily on understanding intent and letting machines choose the right approach—framework or not. Yet, we can’t fully detach from the experience yet. A foundational grasp of frameworks allows us to validate machine decisions, integrate solutions, and maintain systems effectively. Rather than choosing frameworks themselves, we must now ensure the machine’s choices work in practice, making sure we pair our intent with oversight for the outcomes we want.
The future isn’t frameworks. It’s changing the experience.


