What Are AI Agents?
A Step-by-Step Guide to Build Your Own.
The next big thing? AI agents are poised to become the next breakthrough in how we interact with software and get work done. Industry giants are making major moves: OpenAI, Nvidia, and Microsoft are investing heavily in agent technology, and companies like Salesforce are entering the arena. With Gartner predicting AI agents as a transformative force, we're witnessing the dawn of systems that can autonomously pursue goals, learn from interactions, and make complex decisions. And there's no doubt the trend is taking off right now.

AI Agents on Google Trends (trends.google.com)
So, what is really behind the trend? The key to understanding agents is agency.
Unlike traditional generative AI systems, agents don’t just respond to user input. Instead, they can process a complex problem such as an insurance claim from start to finish. This includes understanding the text, images and PDFs of the claim, retrieving information from the customer database, comparing the case with the insurance terms and conditions, asking the customer questions and waiting for their response — even if it takes days — without losing context.
The agents do this autonomously — without humans having to check whether the AI is processing everything correctly.

AI Agents vs. Traditional Generative AI
Anatomy of an AI Worker
But enough chatting: let's build an AI agent and walk through the relevant processes and workflows.
We will build an agent for the insurance process shown in the diagram above. The agent should handle an insurance claim from start to reimbursement.
What we are developing here is the business architecture and the process flow. A full implementation would quickly become very extensive, so the code along the way is limited to short, illustrative sketches.
1. Classification & sending a job into processing lanes
Our workflow starts when a customer sends a message with a claim on their home insurance to the insurer.
What does our agent do? The first step is intelligent message classification: the agent analyzes the message content to determine the customer's intent and the required action. Based on this analysis, it routes the claim into the appropriate processing lane, triggering a sequence of steps tailored to that claim type. This goes beyond simple categorization; it is about making a decision that kicks off the right automated workflow.

Classify a mail and route it into different processing lanes.
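A minimal sketch of this classification-and-routing step, assuming the OpenAI Python SDK with JSON-mode output; the lane names, model choice, and prompt wording are illustrative, not taken from a production system:

```python
# Sketch: classify an incoming message and route it into a processing lane.
# Assumes the OpenAI Python SDK (pip install openai) and OPENAI_API_KEY set;
# lane names, model choice, and prompt wording are illustrative.
import json
from openai import OpenAI

client = OpenAI()

LANES = ["new_claim", "policy_question", "complaint", "other"]

SYSTEM_PROMPT = (
    "Classify the customer message into exactly one of these lanes: "
    + ", ".join(LANES)
    + '. Answer as JSON: {"lane": "<lane>"}'
)

def classify_message(message: str) -> str:
    """Ask the model which processing lane the message belongs to."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",                      # illustrative model choice
        response_format={"type": "json_object"},  # force structured output
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": message},
        ],
    )
    lane = json.loads(response.choices[0].message.content).get("lane", "other")
    return lane if lane in LANES else "other"     # guard against unexpected output

def route(message: str) -> str:
    """Pick a lane and kick off the matching workflow (stubbed here)."""
    lane = classify_message(message)
    # In the real system each lane would trigger its own pipeline
    # (extraction, assessment, payout, ...); here we only report the decision.
    print(f"Routing message into lane: {lane}")
    return lane
```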
2. Extracting data
In the next step, data is extracted. One of an agent's main tasks is to turn unstructured data into structured data, so that processing becomes systematic, safe, and secure.
Classification assigns a text to a predefined category, whereas extraction involves reading and interpreting data from the text. A language model doesn't directly copy data from the input prompt; instead, it generates a response. This also allows it to normalize data during extraction, such as converting a phone number from '(718) 123–45678' to '+1 718 123 45678'.

Extract data from the mail and attachments.
Data extraction is not limited to text content (the e-mail body); it can also cover data from images, PDFs, and other documents. We use more than one model for that: LLMs, image recognition models, OCR, and others. The process above is heavily simplified. In reality, we often send images to OCR systems that extract text from scanned invoices or forms, and we often classify attachments as well before analyzing them.
We enforce JSON as the model’s output format to ensure structured data.
This is the email input — unstructured data:
Hi,
I would like to report a damage and ask for compensation.
Yesterday, while playing indoors, my 10-year-old daughter Priya accidentally threw a basketball against the chandelier in the dining room. The chandelier came loose from its holder, fell to the ground, and shattered completely (it was made of crystal).
Thankfully, no one was hurt, but the chandelier is irreparably damaged.
Attached are the invoice and some photos of the broken chandelier.
Neha Kapoor
Contract no: HC14-345678123
456 Elm Street
94109 San Francisco
(415) 987 65432
This is the model output — structured data as JSON:
{
  "name": "Neha",
  "surname": "Kapoor",
  "address": "456 Elm Street, 94109 San Francisco, CA",
  "phone": "+1 415 987 65432",
  "contract_no": "HC14-345678123",
  "claim_description": "Yesterday [Dec-8, 2024], while playing indoors, my 10-year-old daughter Priya accidentally threw a basketball against the chandelier in the dining room. The chandelier came loose from its holder, fell to the ground, and shattered completely (it was made of crystal). Thankfully, no one was hurt, but the chandelier is irreparably damaged.\n"
}
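A rough sketch of the extraction step, again assuming the OpenAI Python SDK with JSON-mode output; the field list simply mirrors the example JSON above, and the model name is an illustrative assumption:

```python
# Sketch: extract structured claim data from the unstructured e-mail above.
# Assumes the OpenAI Python SDK with JSON-mode output; the field list mirrors
# the example JSON, and the model name is illustrative.
import json
from openai import OpenAI

client = OpenAI()

EXTRACTION_PROMPT = """Extract the following fields from the customer e-mail and
return them as a JSON object: name, surname, address, phone (normalized, e.g.
"+1 415 987 65432"), contract_no, claim_description. Use null for missing fields."""

def extract_claim_data(email_text: str) -> dict:
    """Turn the free-text e-mail into the structured JSON the workflow needs."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},  # enforce JSON as the output format
        messages=[
            {"role": "system", "content": EXTRACTION_PROMPT},
            {"role": "user", "content": email_text},
        ],
    )
    return json.loads(response.choices[0].message.content)
```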
3. Calling external services, making the context persistent
Many generative AI systems can answer queries directly, using pre-trained knowledge, fine-tuning, or Retrieval-Augmented Generation (RAG) over a set of documents. This is not enough for agents. Almost every reasonably powerful AI agent needs to access corporate or external data from databases.
To keep the context of a process persistent beyond the current session, it must also write data to systems and databases. In our case, the agent checks the contract number against a customer database and writes the status of the claim to an issue tracking system. It can also — remember: agency! — request missing data from external parties, such as the customer.

Call external services and make the context persistent.
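A simplified sketch of this step, using an in-memory SQLite table as a stand-in for the customer database and plain log output as a stand-in for the issue tracker; the real integrations depend entirely on the insurer's systems:

```python
# Sketch: check the contract, persist the claim's status, and request missing
# data from the customer. The in-memory table and print-based "tracker" are
# illustrative placeholders for the insurer's real systems.
import sqlite3

# In-memory stand-in for the customer database (illustrative data).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customers (contract_no TEXT, name TEXT, surname TEXT, address TEXT)")
con.execute("INSERT INTO customers VALUES ('HC14-345678123', 'Neha', 'Kapoor', "
            "'456 Elm Street, 94109 San Francisco')")

def find_customer(contract_no: str) -> dict | None:
    """Check the claim's contract number against the customer database."""
    row = con.execute(
        "SELECT name, surname, address FROM customers WHERE contract_no = ?",
        (contract_no,),
    ).fetchone()
    return {"name": row[0], "surname": row[1], "address": row[2]} if row else None

def persist_claim_status(contract_no: str, status: str) -> None:
    """Write the claim's status to an issue tracker so context survives the session."""
    # Placeholder: in production this would be an API call to the insurer's
    # issue-tracking system; here we simply log it.
    print(f"[tracker] {contract_no}: {status}")

def request_missing_data(contract_no: str, missing_fields: list[str]) -> None:
    """Ask the customer for missing information and park the claim until they reply."""
    # Placeholder: send a message to the customer, then mark the claim as waiting
    # so the agent can resume days later without losing context.
    persist_claim_status(contract_no, "waiting_for_customer: " + ", ".join(missing_fields))
```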
4. Assessment, RAG, reasoning and confidence
The heart of every administration job consists of interpreting incoming cases in relation to various rules. AI is particularly good at this. Because we can’t provide all contextual information (e.g., policy content or terms and conditions) when calling a model, we use a vector database to retrieve relevant snippets — a technique known as RAG.
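A toy RAG sketch, assuming the OpenAI embeddings API; the hard-coded snippet list is a stand-in for a real vector database, and the snippet texts themselves are invented for illustration:

```python
# Sketch: retrieve the policy snippets most relevant to a claim (RAG).
# Assumes the OpenAI embeddings API; the hard-coded snippets are a toy
# stand-in for a real vector database.
import numpy as np
from openai import OpenAI

client = OpenAI()

POLICY_SNIPPETS = [
    "Accidental damage to fixtures such as chandeliers is covered up to USD 5,000.",
    "Damage caused intentionally by a household member is excluded.",
    "Claims must be reported within 30 days of the incident.",
]

def embed(text: str) -> np.ndarray:
    """Embed a text with an illustrative embedding model."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def retrieve_snippets(claim_description: str, k: int = 2) -> list[str]:
    """Return the k snippets most similar to the claim (cosine similarity).

    A real vector database would store pre-computed snippet embeddings;
    here we embed them on the fly for brevity.
    """
    query = embed(claim_description)
    scored = []
    for snippet in POLICY_SNIPPETS:
        vec = embed(snippet)
        score = float(query @ vec / (np.linalg.norm(query) * np.linalg.norm(vec)))
        scored.append((score, snippet))
    scored.sort(reverse=True)
    return [snippet for _, snippet in scored[:k]]
```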
We also prompt the AI to 'think aloud' before making an assessment. Thinking before blurting out the result improves answer quality, as we have all known since third-grade math. We can then use the model's reasoning output in many obvious and less obvious ways:
- To substantiate an answer to the customer
- To help the prompt engineer and data scientist figure out why the model made a mistake
- For checks: Did the model arrive at the correct answer by chance, or can we see from its reasoning that the solution was inevitable?
Confidence is the key to maximizing accuracy. If the model estimates its own confidence (and, dear prompt engineers, this requires very good few-shot examples covering different confidence values), we can configure the system anywhere between extreme safety and high automation: we set a confidence threshold below which all cases go to human support. A high threshold ensures minimal errors but requires more manual processing, while a lower threshold allows more cases to be processed automatically, albeit with an increased risk of errors.

Use RAG / reasoning / confidence to obtain reliable assessments.
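A sketch of this assessment step: the model is asked to reason step by step, return a JSON verdict with a confidence score, and low-confidence cases are escalated to a human. The prompt wording, model name, and 0.8 threshold are illustrative assumptions:

```python
# Sketch: assess the claim against retrieved policy snippets, let the model
# reason step by step, and escalate low-confidence cases to a human.
# Prompt wording, model name, and the 0.8 threshold are illustrative.
import json
from openai import OpenAI

client = OpenAI()

ASSESSMENT_PROMPT = """You assess home-insurance claims against policy snippets.
First reason step by step, then answer as a JSON object:
{"reasoning": "<your step-by-step reasoning>",
 "covered": true or false,
 "confidence": <number between 0 and 1>}"""

CONFIDENCE_THRESHOLD = 0.8  # higher = fewer errors but more manual work

def assess_claim(claim_description: str, snippets: list[str]) -> dict:
    """Ask the model for a reasoned verdict plus a confidence estimate."""
    context = "Policy snippets:\n" + "\n".join(snippets)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": ASSESSMENT_PROMPT},
            {"role": "user", "content": context + "\n\nClaim:\n" + claim_description},
        ],
    )
    return json.loads(response.choices[0].message.content)

def decide(claim_description: str, snippets: list[str]) -> str:
    """Approve, reject, or hand the case to a human depending on confidence."""
    assessment = assess_claim(claim_description, snippets)
    if assessment["confidence"] < CONFIDENCE_THRESHOLD:
        return "escalate_to_human"   # below threshold: a person reviews the case
    return "approve" if assessment["covered"] else "reject"
```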
Et voilà! If you have implemented just two or three of the steps above, you have developed an agent. I've outlined only the key components of these AI agents; you can certainly imagine the others. And you can implement it either with the help of frameworks such as CrewAI, LangGraph, LangFlow and their siblings, or in pure Python.
Remarkably, such a system can automate 70%–90% of a claims management department’s workload. And that’s not possible with simple pre-agent generative AI systems. Two years ago, I could never have imagined this becoming reality so quickly.
The Three Laws of AI Agents
The agent must complete a job end to end without the help of a human.
The agent must follow rules to ensure that processing is safe and secure.
The agent must use reasoning and confidence to achieve the highest degree of accuracy.
As we stand at the cusp of a new era in automation, these three fundamental laws serve as our compass for developing reliable, autonomous AI agents. By ensuring end-to-end completion, maintaining security, and prioritizing accuracy through reasoning, we're building systems that can truly transform business operations.
The real-world implementation of these principles in logistics systems and insurance processing demonstrates that AI agents aren't just theoretical constructs – they're practical solutions delivering tangible results today. As more organizations adopt and refine these technologies, we'll continue to see AI agents revolutionize how businesses handle complex workflows and decision-making processes.
The future of work is being reshaped by these intelligent systems, making now the perfect time to start building your own AI agents.