Last Updated on February 4, 2025
Table of contents:
- What is Magentic?
- Why should I use Magentic?
- Why shouldn’t I use Magentic?
- What are some Magentic alternatives?
- Which LLMs does Magentic support?
- Getting started
- How to use Magentic prompt decorator?
- How to use Magentic chatprompt decorator?
- How to do LLM function calling with Magentic?
- How to use Magentic prompt chains?
- How to stream LLM responses with Magentic?
- How to stream structured LLM outputs with Magentic?
- How to async stream LLM outputs with Magentic?
- How to build a custom RAG with Magentic?
- Full code
What is Magentic?
Magentic is an open-source framework that allows for the seamless integration of Large Language Models (LLMs) into Python code.
Website: Magentic
GitHub Repository: jackmpcollins/magentic: Seamlessly integrate LLMs as Python functions
Why should I use Magentic?
- Magentic is easy to use.
- Magentic is free and open source.
- It is actively maintained.
- Plays well with Pydantic to produce structured outputs.
- Supports vision, streaming, parallel function calling, and more.
Why shouldn’t I use Magentic?
- Magentic is maintained mainly by one person.
- Magentic isn’t the only LLM framework; more established alternatives exist.
- It doesn’t have a big community around it.
What are some Magentic alternatives?
Some popular Magentic alternatives include:
- LangChain
- LlamaIndex
- Instructor
- Marvin
Which LLMs does Magentic support?
Magentic supports these LLMs:
- OpenAI LLMs
- Ollama
- Anthropic
- Mistral
- LiteLLM
- or any other OpenAI schema-compatible model
Getting started
To get started with Magentic, ensure that you have Python installed and have obtained an API key for one of the LLM providers mentioned above. I will personally go with OpenAI’s GPT-4 as the LLM of choice.
The next step will be to install Magentic into a fresh environment with the following command:
pip install magentic
Now, we will create a .env file and add the needed environment variables to it:
touch .env
MAGENTIC_BACKEND=openai
MAGENTIC_OPENAI_API_KEY=sk-...
MAGENTIC_OPENAI_MODEL=gpt-4
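Magentic reads these settings from environment variables. If your shell doesn’t export the contents of .env automatically, one common approach (an assumption on my part; Magentic itself only needs the variables present in the environment) is to load the file with python-dotenv before importing magentic:
pip install python-dotenv
# Load .env into the process environment so Magentic picks up the settings.
from dotenv import load_dotenv
load_dotenv()  # reads MAGENTIC_BACKEND, MAGENTIC_OPENAI_API_KEY, MAGENTIC_OPENAI_MODEL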
In the following sections, we will go over the basic building blocks of Magentic and create a custom agentic RAG pipeline that uses financial data and function calling to answer user queries.
Let’s begin.
How to use Magentic prompt decorator?
To use the Magentic @prompt decorator, define a template for an LLM prompt as a Python function. When this function is called, the arguments are inserted into the template and the prompt is sent to an LLM, which generates the function output.
from magentic import prompt
@prompt("Explain like I'm five this financial concept: {concept}")
def explain(concept: str) -> str: ...
explain("Subprime mortgage crisis")
“The Subprime Mortgage Crisis is like a big, grown-up version of a game of hot potato. Here’s how it works:
Let’s say you want to buy an ice cream cone, but you don’t have any money. So, a nice person (in this case, a bank) is willing to lend you money to buy your ice cream. But, they know you don’t have any money to pay them back right away, which makes lending the money to you pretty risky. This would be a subprime mortgage, or loan.
Now, imagine that the person lending you money to buy your ice cream gets worried because they know you may not be able to pay them back. To get rid of this problem, they decide to sell your loan to another person (an investor). This means now you owe the money to this new person, not the one who lent you the money in the first place.
This makes the first person feel safe because they got their money back, but now the new person is the one who may lose money if you can’t pay them back.
Then, imagine this happening with not just one ice cream cone, but millions of them. Loads of people can’t pay back their ice cream loans, so the new owners of these loans lose a lot of money. This big problem is like the Subprime Mortgage Crisis.
Importantly, it’s not really about ice cream, but about bigger expensive things like houses. When too many people couldn’t pay back their loans, the banks and investors that owned those loans lost a lot of money, and this caused huge problems for the whole world’s money system.”
Notice how the @prompt decorator plays nicely with the function and curly braces. Moreover, the function doesn’t need a body, as everything is done by the decorator.
The @prompt decorator will respect the return type annotation of the decorated function. This can be any type supported by Pydantic, including a Pydantic model.
from magentic import prompt
from pydantic import BaseModel
class Portfolio(BaseModel):
equity_etf_pct: float
bond_etf_pc: float
crypto_etf_pc: float
commodities_pc: float
reasoning: str
@prompt("Create a strong portfolio of {size} allocation size.")
def create_portfolio(size: str) -> Portfolio: ...
portfolio = create_portfolio("$50,000")
print(portfolio)
equity_etf_pct=50.0
bond_etf_pc=30.0
crypto_etf_pc=10.0
commodities_pc=10.0
reasoning=’A balanced strong portfolio suitable for most risk tolerances would allocate around 50% towards Equity ETFs for growth, 30% towards Bond ETFs for income and stability, 10% towards Crypto ETFs for high-growth and high-risk appetite and 10% towards commodities for a balanced protection against inflation. The allocation size is $50,000.’
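Since the result is a regular Pydantic model instance rather than plain text, you can work with it as typed data. A small usage sketch (assuming Pydantic v2, which Magentic builds on):
print(portfolio.equity_etf_pct)  # 50.0 -- a plain float, no string parsing needed
print(portfolio.model_dump_json(indent=2))  # serialize the whole portfolio to JSON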
How to use Magentic chatprompt decorator?
To use the Magentic @chatprompt decorator, you pass chat messages as a template rather than a single text prompt. You can also provide a system message or few-shot example responses to guide the model’s output.
from magentic import chatprompt, AssistantMessage, SystemMessage, UserMessage
from pydantic import BaseModel
class Quote(BaseModel):
quote: str
person: str
@chatprompt(
SystemMessage("You are an avid reader of financial literature."),
UserMessage("What is your favorite quote from Warren Buffet?"),
AssistantMessage(
Quote(
quote="Price is what you pay; value is what you get.",
person="Warren Buffet",
)
),
UserMessage("What is your favorite quote from {person}?"),
)
def get_finance_quote(person: str) -> Quote: ...
get_finance_quote("Charlie Munger")
In my whole life, I have known no wise people (over a broad subject matter area) who didn’t read all the time – none, zero.
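As with @prompt, the return annotation is respected: the call returns a Quote instance rather than raw text, so the quote above can be accessed field by field:
quote = get_finance_quote("Charlie Munger")
print(f'"{quote.quote}" - {quote.person}')  # structured fields of the returned Quote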
How to do LLM function calling with Magentic?
To do LLM function calling with Magentic, you will use a @prompt-decorated function that returns a FunctionCall object, which can be called to execute the function using the arguments provided by the LLM. For example, let’s ask for price data and have it use Alpha Vantage:
import os
import requests
from magentic import prompt, FunctionCall
AV_API_KEY = os.getenv("AV_API_KEY")
def get_daily_price(ticker: str, api_key: str = AV_API_KEY) -> dict:
url = f'https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol={ticker}&apikey={api_key}'
r = requests.get(url)
data = r.json()
return data['Time Series (Daily)']
@prompt(
"Use the appropriate search function to answer: {question}",
functions=[get_daily_price],
)
def perform_search(question: str) -> FunctionCall[str]: ...
output = perform_search("What is the daily price data of AAPL?")
output()
>>> {'2025-01-27': {'1. open': '224.1200',
'2. high': '232.1500',
'3. low': '224.0000',
'4. close': '229.8600',
'5. volume': '94224324'}, ...
Take note that Alpha Vantage has a free tier, so grabbing an API key shouldn’t be an issue if you are following along. Alternatively, you can use any other preferred data provider.
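Also note that nothing executes until the FunctionCall object is invoked; the LLM only chooses the function and its arguments. You can inspect the call before running it (FunctionCall exposes arguments; the private _function attribute is the same one used later in this post):
output = perform_search("What is the daily price data of AAPL?")
print(output._function.__name__)  # 'get_daily_price' -- which function the LLM picked
print(output.arguments)           # e.g. {'ticker': 'AAPL'} -- the arguments it chose
data = output()                   # only this call performs the actual HTTP request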
How to use Magentic prompt chains?
You will want to use Magentic prompt chains when the LLM needs to perform several operations before returning the final response. The @prompt_chain decorator will resolve FunctionCall objects automatically and pass the output back to the LLM, continuing until the final answer is reached.
import csv
from magentic import prompt_chain
def get_earnings_calendar(ticker: str, api_key: str = AV_API_KEY) -> list:
url = f"https://www.alphavantage.co/query?function=EARNINGS_CALENDAR&symbol={ticker}&horizon=12month&apikey={api_key}"
with requests.Session() as s:
download = s.get(url)
decoded_content = download.content.decode('utf-8')
cr = csv.reader(decoded_content.splitlines(), delimiter=',')
my_list = list(cr)
return my_list
@prompt_chain(
"What's {ticker} expected earnings dates for the next 12 months?",
functions=[get_earnings_calendar],
)
def get_earnings(ticker: str) -> str: ...
get_earnings("IBM")
‘The expected earnings dates for IBM for the fiscal year 2025 are as follows:
– On April 22, 2025 for fiscal date ending March 31, 2025
– On July 22, 2025 for fiscal date ending June 30, 2025
Please note that these dates are for fiscal year 2025 and the exact figures for the earnings are not available yet. These dates are expected and may change. The currency for these earnings is in USD.’
Prompt chains are the “bread and butter” of building agentic workflows, as they allow us to create a loop where the LLM can perform multiple API calls until it collects all the information it needs to provide an answer.
With good prompting techniques and a solid pipeline, one can perform many complex tasks; a small sketch of a multi-function chain follows below.
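As a hedged illustration, @prompt_chain accepts multiple functions at once, and the LLM can call more than one of them in sequence before producing its final answer. Reusing the two Alpha Vantage helpers defined above (summarize_ticker is my own hypothetical name):
@prompt_chain(
    "Give a short overview of {ticker}: the latest daily closing price plus the upcoming earnings dates.",
    functions=[get_daily_price, get_earnings_calendar],
)
def summarize_ticker(ticker: str) -> str: ...

print(summarize_ticker("IBM"))  # the LLM may call both helpers before answering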
How to stream LLM responses with Magentic?
To stream LLM responses with Magentic, you will use the StreamedStr (and AsyncStreamedStr) class. This allows you to process the text while it is being generated, rather than receiving the whole output at once. For example:
from magentic import prompt, StreamedStr
@prompt("Explain to me {term} in a way a 5-year-old would understand.")
def describe_finance_term(term: str) -> StreamedStr: ...
# Print the chunks while they are being received
for chunk in describe_finance_term("liquidity"):
print(chunk, end="")
“Liquidity is like having a toy that everyone wants to trade for. If you want to swap your toy for something else, you can do it easily and quickly because everyone wants your toy. That’s like having an asset or money that is liquid. You can exchange it easily and quickly without losing its value.”
You can now read the response as it is being generated and also inject it into other processes to speed things up.
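If you want to both react to chunks as they arrive and keep the complete response for later use, you can simply accumulate the chunks yourself:
chunks = []
for chunk in describe_finance_term("liquidity"):
    chunks.append(chunk)       # handle each chunk as soon as it arrives
full_text = "".join(chunks)    # and keep the full response for downstream use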
How to stream structured LLM outputs with Magentic?
To stream structured LLM outputs with Magentic, you will use the return type annotation Iterable (or AsyncIterable). This allows each item to be processed while the next one is still being generated. For example:
from collections.abc import Iterable
from time import time
class Portfolio(BaseModel):
equity_etf_pct: float
bond_etf_pc: float
crypto_etf_pc: float
commodities_pc: float
reasoning: str
@prompt("Create {n_portfolio} portfolios with varying deegress of risk apetite.")
def create_portfolios(n_portfolio: int) -> Iterable[Portfolio]: ...
start_time = time()
for portfolio in create_portfolios(3):
print(f"{time() - start_time:.2f}s : {portfolio}")
7.25s : equity_etf_pct=60.0 bond_etf_pc=30.0 crypto_etf_pc=5.0 commodities_pc=5.0 reasoning=”This portfolio is for an individual with a moderate risk tolerance. Having a larger part of the portfolio in equity ETF’s and bonds provides a balance of growth and stability. A small allocation is made to crypto and commodities in order to try and take advantage of potential high returns.”
11.46s : equity_etf_pct=40.0 bond_etf_pc=55.0 crypto_etf_pc=2.0 commodities_pc=3.0 reasoning=’This portfolio is oriented towards an individual with a low risk tolerance. In this case, most of the portfolio is in bonds, which are generally lower risk. The allocation is also diversified with some investments in equity ETFs, crypto and commodities.’
14.78s : equity_etf_pct=70.0 bond_etf_pc=15.0 crypto_etf_pc=10.0 commodities_pc=5.0 reasoning=”This portfolio is for someone with a high risk tolerance. It allocates a majority of the holdings towards equity ETF’s for stronger growth potential. There is also increased investment in volatile areas such as cryptocurrency in the hope of achieving high returns.”
How to async stream LLM outputs with Magentic?
To async stream LLM outputs with Magentic, you will create async functions and use the AsyncIterable class where needed. This allows us to query the LLM concurrently, which speeds up generation and lets other asynchronous code run while waiting on LLM output.
import asyncio
from typing import AsyncIterable
@prompt("List three high-growth stocks.")
async def iter_growth_stocks() -> AsyncIterable[str]: ...
@prompt("Tell me more about {stock_symbol}")
async def tell_me_more_about(stock_symbol: str) -> str: ...
start_time = time()
tasks = []
async for stock in await iter_growth_stocks():
# Use asyncio.create_task to schedule the coroutine for execution before awaiting it
# This way descriptions will start being generated while the list of stocks is still being generated
task = asyncio.create_task(tell_me_more_about(stock))
tasks.append(task)
descriptions = await asyncio.gather(*tasks)
for desc in descriptions:
print(desc)
I will spare you the verbose descriptions, but it mentioned Tesla, Zoom, and AMD.
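One caveat: the snippet above uses await at the top level, so it only runs as-is in an environment with a running event loop (such as a Jupyter notebook). In a plain script, wrap it in a coroutine and hand it to asyncio.run:
async def main() -> None:
    tasks = []
    async for stock in await iter_growth_stocks():
        # Schedule each description while the stock list is still streaming.
        tasks.append(asyncio.create_task(tell_me_more_about(stock)))
    for desc in await asyncio.gather(*tasks):
        print(desc)

asyncio.run(main())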
Okay, now that we have covered the main building blocks that Magentic offers, we can create our RAG (Retrieval-Augmented Generation) pipeline and build a small interface for it.
How to build a custom RAG with Magentic?
To build a custom Retrieval-Augmented Generation (RAG) pipeline with Magentic, we will use its building blocks asynchronously. The overall idea is that the RAG pipeline can perform investment analysis using the Alpha Vantage API.
The goal is to dynamically determine which Alpha Vantage endpoints to call based on user queries, retrieve the necessary data, and then format it into a structured response.
We will achieve this with four steps in our code:
- Function Selection: Determines which Alpha Vantage endpoints to use based on the input query.
- Retrieval: Calls the selected functions to fetch relevant data.
- Processing: Formats the retrieved data for LLM consumption.
- Generation: The LLM responds based on the gathered data.
Note: Keep in mind that this is a very simple and not-so-efficient pipeline.
For this, we will also want to use FastAPI as the framework of choice:
pip install fastapi
pip install uvicorn
Now, let us import the needed libraries and set up the FastAPI app, Alpha Vantage API key, and functions that will gather the data from Alpha Vantage:
import csv
import os
import requests
from typing import Any, AsyncGenerator
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from magentic import (
AssistantMessage,
SystemMessage,
prompt,
chatprompt,
FunctionCall,
UserMessage,
)
AV_API_KEY = os.getenv("AV_API_KEY")
app = FastAPI()
async def get_earnings_calendar(ticker: str, api_key: str = AV_API_KEY) -> dict:
"""Fetches upcoming earnings dates for a given ticker."""
url = f"https://www.alphavantage.co/query?function=EARNINGS_CALENDAR&symbol={ticker}&horizon=12month&apikey={api_key}"
response = requests.get(url, timeout=30)
decoded_content = response.content.decode("utf-8")
cr = csv.reader(decoded_content.splitlines(), delimiter=",")
data = list(cr)
return {"data": data}
async def get_news_sentiment(
ticker: str, limit: int = 5, api_key: str = AV_API_KEY
) -> list[dict]:
"""Fetches sentiment analysis on financial news related to the ticker."""
url = f"https://www.alphavantage.co/query?function=NEWS_SENTIMENT&tickers={ticker}&apikey={api_key}"
response = requests.get(url, timeout=30).json().get("feed", [])[:limit]
fields = [
"time_published",
"title",
"summary",
"topics",
"overall_sentiment_score",
"overall_sentiment_label",
]
return [{field: article[field] for field in fields} for article in response]
async def get_daily_price(ticker: str, api_key: str = AV_API_KEY) -> dict[str, Any]:
"""Fetches daily price data for a given stock ticker."""
url = f"https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol={ticker}&apikey={api_key}"
response = requests.get(url, timeout=30).json()
return response.get("Time Series (Daily)", {})
async def get_company_overview(
ticker: str, api_key: str = AV_API_KEY
) -> dict[str, Any]:
"""Fetches fundamental company data like market cap, P/E ratio, and sector."""
url = f"https://www.alphavantage.co/query?function=OVERVIEW&symbol={ticker}&apikey={api_key}"
return requests.get(url, timeout=30).json()
async def get_sector_performance(api_key: str = AV_API_KEY) -> dict[str, Any]:
"""Fetches market-wide sector performance data."""
url = f"https://www.alphavantage.co/query?function=SECTOR&apikey={api_key}"
return requests.get(url, timeout=30).json()
Something we should be careful about is how much data we grab and pass to the LLM in the final stages. This is especially true when dealing with news-like data, as it can easily overflow the context window. For now, I have conservatively set the limit to 5 news sentiment pieces.
There are strategies to mitigate this, such as compression, trimming, and/or additional agents that summarize the payloads before they reach the final prompt; a naive trimming helper is sketched below.
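For illustration only, such a helper might look like this (trim_payload is hypothetical and is not wired into the pipeline below):
def trim_payload(data: Any, max_chars: int = 4000) -> str:
    """Crudely cap a payload's size before it is placed into a prompt."""
    text = str(data)
    return text if len(text) <= max_chars else text[:max_chars] + " ...[truncated]"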
Now we will pass these functions into an LLM function that will choose the right data sources to answer the user’s questions:
@prompt(
"""
You are an investment research assistant.
You need to answer the user's question: {question}
Use available functions to retrieve the data you need.
DO NOT request data from functions that have already been used!
If all necessary data has been retrieved, return `None`.
Here is what has already been retrieved: {called_functions}
""",
functions=[
get_daily_price,
get_company_overview,
get_sector_performance,
get_news_sentiment,
get_earnings_calendar,
],
)
def iterative_search(
question: str, called_functions: set[str], chat_history: list[Any]
) -> FunctionCall[str] | None: ...
We also need an LLM function that will use the obtained data to provide an answer:
@chatprompt(
SystemMessage(
"""
You are an investment research assistant.
Only use retrieved data for your analysis.
"""
),
UserMessage(
"You need to answer this question: {question}\nAnalyze the following data: {collected_data}"
),
)
def analyze_data(question: str, collected_data: str) -> str: ...
Now, all we need is the iteration loop where the querying logic goes:
def format_collected_data(collected_data: dict[str, Any]) -> str:
formatted_data = []
for function_name, data in collected_data.items():
formatted_data.append(f"### {function_name} Data:\n{data}\n")
return "\n".join(formatted_data)
async def query(question: str, max_iterations: int = 10) -> AsyncGenerator[str, None]:
"""
Runs iterative retrieval and streams LLM analysis.
"""
iteration = 0
collected_data = {}
called_functions = set()
chat_history = [
SystemMessage(
"""
You are an investment research assistant.
Retrieve data iteratively and update insights.
"""
)
]
while iteration < max_iterations:
iteration += 1
yield f"\n**Iteration {iteration}...**\n"
function_call = iterative_search(question, called_functions, chat_history)
if function_call is None:
yield "\n**LLM is satisfied with the data. Analyzing now...**\n"
break
function_name = function_call._function.__name__
if function_name in called_functions:
yield f"\n**Early stop: {function_name} was already called.**\n"
break
called_functions.add(function_name)
function_args = function_call.arguments
match function_name:
case "get_daily_price":
result = await get_daily_price(**function_args)
case "get_company_overview":
result = await get_company_overview(**function_args)
case "get_sector_performance":
result = await get_sector_performance()
case "get_news_sentiment":
result = await get_news_sentiment(**function_args)
case "get_earnings_calendar":
result = await get_earnings_calendar(**function_args)
case _:
yield f"\nUnknown function requested: {function_name}\n"
continue
if not result:
yield f"\n**No new data found for {function_name}, stopping iteration.**\n"
break
collected_data[function_name] = result
yield f"\n**Retrieved data from {function_name}** ✅\n"
chat_history.append(UserMessage(f"Retrieved {function_name} data: {result}"))
chat_history.append(AssistantMessage(f"Storing data from {function_name}."))
formatted_data = format_collected_data(collected_data)
final_analysis = analyze_data(question, formatted_data)
yield f"\n**Investment Insight:**\n{final_analysis}\n"
And now we wrap this behind a FastAPI endpoint:
@app.get("/investment_research")
async def investment_research(question: str):
return StreamingResponse(query(question), media_type="text/event-stream")
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8000)
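Alternatively, assuming the script is saved as main.py (a hypothetical filename), you can start the server from the shell:
uvicorn main:app --host 0.0.0.0 --port 8000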
Once we start the uvicorn server, we can go to localhost:8000/docs and try out the flow from there. You will see all of your available endpoints (in our case, just one) and will be able to try them out. This also serves as a nice user interface.
Let’s ask it to: “Tell me about the latest news and price trends of AAPL”
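If you prefer querying the endpoint from code rather than the docs page, a minimal streaming client sketch using requests (assuming the server is running locally on port 8000) would be:
import requests

resp = requests.get(
    "http://localhost:8000/investment_research",
    params={"question": "Tell me about the latest news and price trends of AAPL"},
    stream=True,
    timeout=120,
)
for chunk in resp.iter_content(chunk_size=None, decode_unicode=True):
    print(chunk, end="")  # print the streamed response as it arrives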
We can see that the LLM grabbed both the price data and the news sentiment data, which is exactly what we wanted it to do. Here is what it answered:
**Iteration 1...**
**Retrieved data from get_daily_price** ✅
**Iteration 2...**
**Retrieved data from get_news_sentiment** ✅
**Iteration 3...**
**LLM is satisfied with the data. Analyzing now...**
**Investment Insight:**
Upon analyzing the given daily price and news sentiment data, here are the key trends and highlights:
Price Trends:
- The closing price for Apple Inc. (AAPL) has significantly fluctuated over the past several months.
- The highest closing price in our data set was on December 31, 2024, at $250.42, while the lowest closing price was on October 7, 2024, at $221.69.
- There was a noticeable decline in the stock's price throughout January 2025 which suggests a bearish trend during this period.
Latest News:
1. Apple Inc. shares rose by 4.02% in pre-market following better-than-expected Q1 revenue and earnings per share. This news has a 'Somewhat-Bullish' sentiment with a score of 0.345283.
2. Goldman Sachs increased Apple Inc.'s price target from $280 to $294, which could indicate a possible uptrend for the stock in the future. The news sentiment is 'Somewhat-Bullish' with a score of 0.232085.
3. Samsung's Q4 revenue rose by 12% to $52.2B, releasing their plans for AI-driven premium product growth in 2025. This news doesn't have a direct impact on Apple but could be significant given the competition in the technology industry. The news sentiment is 'Neutral' with a score of 0.110925.
4. Despite a decline in iPhone and China sales, Apple Inc.'s Q1 revenues and earnings per share were better than expected. The news sentiment is 'Somewhat-Bullish' with a score of 0.345283.
5. Q1 fiscal 2025 results of AAPL benefited from strong services growth, despite a decline in iPhone sales. The news sentiment is 'Neutral' with a score of 0.070957.
In summary, while the recent closing price data shows some volatility with a downward trend in January 2025, the news sentiment for Apple Inc. is generally positive, indicating optimism about the company's financials and growth prospects. However, investors should constantly keep an eye on the news and financial updates to be aware of the potential risks and changes affecting the company's performance.
And that’s how you can build a simple RAG pipeline with Magentic.