API Management

Azure API Management (APIM) is one of those services that can catch teams off guard on cost. It starts as a sensible architectural choice — a single gateway to manage, secure, and expose your APIs — and then quietly becomes one of your larger Azure bills as your usage grows or your requirements push you up the SKU tiers.

In this guide we'll look at how APIM is priced, what drives the cost in practice, how to design solutions that stay cost-efficient, and how Turbo360 Cost Analyzer can help you keep it under control.


How Azure API Management Is Priced

APIM is billed per unit, per hour. Each tier has a different price per unit and a different request throughput capacity.

| Tier | Approx. monthly cost (1 unit) | Throughput | Notes |
|---|---|---|---|
| Developer | ~$50 | Limited | Dev/test only, no SLA |
| Basic | ~$150 | 1,000 req/sec | Small workloads |
| Standard | ~$700 | 2,500 req/sec | Most common production tier |
| Premium | ~$2,800 | 4,000 req/sec | Enterprise features, multi-region, VNET |
| V2 Standard | Varies | Higher | Newer tier, faster scaling |
| V2 Premium | Varies | Higher | Newer tier with Premium features |
| Consumption | Pay per call (~$4.50/million) | Burst only | No fixed cost; first 1M calls/month free |

Key things to understand about the billing model:

  • You pay for units, not requests. Unused capacity still costs money.

  • Scale-out (adding units) multiplies the base tier price.

  • Self-hosted gateways incur additional costs — the gateway software is free but you pay for the infrastructure running it (typically a node in your AKS cluster).

  • Application Insights logging is a separate cost — it charges on data ingested, so the more you log, the more you pay outside of APIM itself.

  • Outbound bandwidth from the gateway to backends or consumers can add up at scale.
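As a rough illustration of the first two points, the fixed part of the bill can be sketched as price per unit times units times instances. The prices below are the approximate figures from the table above, not live Azure prices, and the function name is made up for this example. Logging, bandwidth, and self-hosted gateway infrastructure are deliberately left out.

```python
# Approximate monthly price per unit in USD, copied from the table above.
# These are rough figures for illustration, not live Azure prices.
APPROX_UNIT_PRICE_USD = {
    "Developer": 50,
    "Basic": 150,
    "Standard": 700,
    "Premium": 2800,
}


def monthly_gateway_cost(tier: str, units: int = 1, instances: int = 1) -> int:
    """Fixed monthly cost: every unit of every instance is billed in full,
    whether or not it serves any traffic."""
    if units < 1 or instances < 1:
        raise ValueError("units and instances must be at least 1")
    return APPROX_UNIT_PRICE_USD[tier] * units * instances


# One Standard instance scaled out to two units.
print(monthly_gateway_cost("Standard", units=2))
# Three separate single-unit Standard instances, for example one per team.
print(monthly_gateway_cost("Standard", instances=3))
```

The same function makes it easy to see how quickly a "one instance per team" pattern multiplies the fixed cost.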


Factors That Affect Cost in the Real World

Number of APIM instances

The single biggest lever. Historically, teams trended toward one central APIM instance shared across the organisation. Today it's common to see multiple instances — one per team, per product, or per environment. Each instance costs its full unit price. This is the most important architectural decision you'll make for APIM cost.

SKU choice

The tier you choose has a dramatic impact. Moving from Standard to Premium can be a 4x cost increase per unit. Most teams are pushed to Premium for specific reasons — outbound VNET integration, private endpoints, availability zone support, or multi-region. If none of those apply to you, Standard is almost always sufficient.

Network integration requirements

VNET integration and private endpoint requirements force you to Premium or Isolated tiers. If you want to lock down your APIM so it only communicates over private networks, you are committing to Premium pricing. Weigh that requirement carefully before you design around it.

Self-hosted gateway

If you need to process API traffic on-premises or in another cloud, the self-hosted gateway model means you're running gateway nodes yourself, typically inside your AKS cluster. This adds compute cost on top of the base APIM subscription cost. There is also an operational cost: someone has to manage and maintain the gateway nodes.

Logging volume

Application Insights integration is extremely useful for observability and backend performance analysis — but it's priced on data ingestion. A high-traffic APIM instance logging every request, header, and body will generate a surprising amount of log data. Teams often don't realise that their App Insights bill is partially an APIM bill.

Backend API performance

This one is indirect but real. Slow backend APIs hold APIM units under load for longer. If your backends take 10 seconds to respond, a Standard unit that can theoretically handle 2,500 requests per second becomes much less effective under real-world concurrent load. Poor backend performance can force you to scale out units that you wouldn't otherwise need.
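One way to reason about this is Little's law: the average number of requests in flight equals the arrival rate multiplied by how long each request stays open. The sketch below applies it to the Standard throughput figure. It is a simplified model that ignores gateway overhead, timeouts, and connection limits, and the function name is hypothetical.

```python
def in_flight_requests(requests_per_second: float, backend_latency_seconds: float) -> float:
    """Little's law: average concurrency = arrival rate x time each request stays open."""
    return requests_per_second * backend_latency_seconds


# The same 2,500 req/sec arrival rate at three different backend latencies.
for latency in (0.1, 1.0, 10.0):
    concurrent = in_flight_requests(2500, latency)
    print(f"{latency:>4.1f}s backend -> about {concurrent:,.0f} requests held open")
```

At 10 seconds per call the gateway is holding roughly a hundred times more open requests than at 100 ms for the same arrival rate, which is why slow backends eat into the capacity you are paying for.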


Architect's Perspective — Design Decisions That Affect Cost

Single instance vs multiple instances

The central shared APIM model is still valid and cost-efficient for many organisations. The argument for multiple instances usually comes down to team autonomy, blast radius isolation, or different SKU requirements across products. Before spinning up a second instance, ask whether the reason is technical necessity or organisational preference. The cost difference is significant.

Choosing your SKU carefully

Start by listing the features that would bias your SKU choice: outbound VNET, availability zones, multi-region. If you don't need any of those, Standard (or V2 Standard for newer deployments) will serve most production workloads. Developer SKU has no SLA and should never be used in production regardless of cost.

Gateway strategy

When you deploy APIM you'll use either the Microsoft-hosted managed gateway or the self-hosted model. The self-hosted gateway makes sense when you have traffic that needs to stay on-premises or in an environment that APIM's managed gateway can't reach. If you don't have that requirement, use the managed gateway — it's simpler and the cost is included in the unit price.

Scale-out vs scale-up

Under high load you can scale out (add units) or scale up (move to a higher SKU). If you're on Standard and hitting throughput limits, adding a second Standard unit gets you to 5,000 req/sec for about $1,400 a month, whereas reaching the same throughput on Premium takes two units at about $5,600 a month (using the approximate prices above). Run the numbers before assuming you need to move tiers.
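Running the numbers can be as simple as the sketch below, which works out how many units each tier needs for a target throughput and what that costs per month. It uses the approximate prices and throughput figures from the pricing table and treats throughput as scaling linearly with units, which is an assumption; real capacity depends on policies, payload sizes, and backend latency.

```python
import math

# (approx. monthly price per unit in USD, approx. req/sec per unit),
# taken from the pricing table earlier in this guide.
TIERS = {
    "Standard": (700, 2500),
    "Premium": (2800, 4000),
}


def units_and_cost(tier: str, target_rps: int) -> tuple[int, int]:
    """Return (units needed, monthly cost) assuming linear scaling per unit."""
    price, rps_per_unit = TIERS[tier]
    units = math.ceil(target_rps / rps_per_unit)
    return units, units * price


for tier in TIERS:
    units, cost = units_and_cost(tier, 5000)
    print(f"{tier}: {units} unit(s), about ${cost:,}/month for 5,000 req/sec")
```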

Backend performance as a cost lever

Treat backend API performance as a cost concern, not just an operational one. If your team invests time in reducing average backend response time, you may be able to defer or avoid a scale-out event entirely. APIM is often the visible symptom of a backend problem.

Logging strategy

Configure Application Insights sampling rather than logging 100% of requests. For most operational purposes, 10–20% sampling gives you sufficient signal. Also set a reasonable data retention period — 90 days is a common default, but most teams only actively query the last 30. Every extra day of retention across a high-traffic instance adds up.
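To get a feel for what sampling does to log volume, the sketch below estimates monthly ingestion from request rate, sampling percentage, and average telemetry size per logged request. The 500 req/sec traffic level and the 2,000 bytes per request are made-up inputs for illustration; measure your own telemetry size, and multiply the result by your current per-GB ingestion price to get a cost.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month


def monthly_ingestion_gb(
    requests_per_second: float,
    sampling_percent: float,
    bytes_per_logged_request: float,
) -> float:
    """Estimated GB of telemetry ingested per month at a given sampling rate."""
    logged_requests = requests_per_second * SECONDS_PER_MONTH * sampling_percent / 100
    return logged_requests * bytes_per_logged_request / 1e9


# Hypothetical instance: 500 req/sec, about 2,000 bytes of telemetry per logged request.
for percent in (100, 20, 10):
    gb = monthly_ingestion_gb(500, percent, 2000)
    print(f"{percent:>3}% sampling -> about {gb:,.0f} GB/month")
```

Because ingestion scales linearly with the sampling percentage, dropping from 100% to 10% cuts the ingested volume by roughly a factor of ten under these assumptions.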


Common Cost Optimisation Strategies

Right-size before you scale

Before adding units, check whether backend performance improvements or logging reduction could resolve your capacity issues. Scaling is easy to do and hard to undo once teams depend on the throughput.

Consolidate instances where feasible

If you have multiple APIM instances without strong technical reasons for separation, consolidation onto a shared instance with workspace or product separation can significantly reduce cost. Modern APIM workspaces allow team-level isolation within a shared instance.

Use Developer SKU for non-production

All dev, test, and staging environments could use the Developer SKU unless you specifically need to test Premium features. The cost difference versus Standard is substantial across a year.
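As a quick sanity check on "substantial", the sketch below compares a year of Developer against a year of Standard for a number of non-production environments. It uses the approximate single-unit prices from the pricing table, and the three-environment example is hypothetical.

```python
# Approximate monthly prices per unit in USD, from the pricing table above.
DEVELOPER_MONTHLY_USD = 50
STANDARD_MONTHLY_USD = 700


def annual_saving(environments: int) -> int:
    """Yearly saving from running non-production environments on Developer
    instead of single-unit Standard."""
    return (STANDARD_MONTHLY_USD - DEVELOPER_MONTHLY_USD) * 12 * environments


# Example: dev, test, and staging environments.
print(f"About ${annual_saving(3):,} saved per year across 3 environments")
```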

Tune your logging

Enable sampling in Application Insights. Set retention to match your actual query needs. Avoid logging full request and response bodies in production unless you have a specific compliance or debugging requirement — bodies add significant volume to your log ingestion.

Review self-hosted gateway node sizing

If you're running self-hosted gateways in AKS, make sure the node size is appropriate. Teams often start with a larger node than needed and leave it running permanently. Review utilisation and rightsize the nodes.

Evaluate V2 tiers for new deployments

The V2 Standard and V2 Premium tiers have better scaling characteristics and can be a more cost-efficient choice for new APIM deployments. If you're currently on v1 Standard, it's worth evaluating whether migration makes sense as part of any planned architecture change.


How Can Turbo360 Cost Analyzer Help?

Turbo360 Cost Analyzer gives you visibility into your APIM spending that goes beyond what the Azure portal shows by default.

Visualise spend at the resource and meter level

See exactly what each APIM instance is costing broken down by meter — gateway units, bandwidth, and any associated Application Insights consumption. This makes it straightforward to identify which instances or gateways are driving cost.

Spot anomalies before they become large bills

Cost Analyzer can detect unusual increases in APIM spend and alert your team early. A sudden spike in logging volume or an unexpected scale-out event shows up as an anomaly — giving you time to investigate before the bill lands.

Track cost across teams with tag-based allocation

If your organisation uses tagging to allocate costs to teams or products, Turbo360 surfaces APIM costs in the context of your allocation model. This makes it easy to have informed conversations with teams about the cost of their API infrastructure.

Budget monitoring

Set budgets against your APIM resources and get notified when spend is trending over threshold. Particularly useful for environments where development teams are spinning up instances and may not be watching the cost.


FAQ

What's the cheapest APIM tier for production use?

Basic or Standard depending on your throughput needs. Basic suits low-traffic internal APIs. Standard is the most common production choice, capable of handling 2,500 requests per second per unit. Neither includes VNET integration — if you need that, you're looking at Premium.

Do I have to pay for APIM even when there's no traffic?

Yes. APIM is billed per unit per hour regardless of traffic volume. You're paying for capacity, not consumption. This is why right-sizing instances and not running unnecessary environments at higher tiers matters. The one exception is the Consumption tier, which bills per call, but it has some limitations and is less widely adopted than the other SKUs.
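If you are weighing Consumption against a fixed-price tier, a break-even estimate helps. The sketch below uses the approximate figures from the pricing table (about $4.50 per million calls, first 1M calls per month free) and compares on price only. It ignores the feature and throughput differences between tiers, and the function names are made up for this example.

```python
PRICE_PER_MILLION_USD = 4.50   # approximate Consumption price from the table
FREE_CALLS_PER_MONTH = 1_000_000


def consumption_cost(calls_per_month: int) -> float:
    """Approximate monthly Consumption bill for a given number of calls."""
    billable = max(0, calls_per_month - FREE_CALLS_PER_MONTH)
    return billable / 1_000_000 * PRICE_PER_MILLION_USD


def break_even_calls(fixed_monthly_usd: float) -> float:
    """Monthly calls at which Consumption costs the same as a fixed-price tier."""
    return FREE_CALLS_PER_MONTH + fixed_monthly_usd / PRICE_PER_MILLION_USD * 1_000_000


print(f"Basic (~$150): break-even near {break_even_calls(150):,.0f} calls/month")
print(f"Standard (~$700): break-even near {break_even_calls(700):,.0f} calls/month")
```

Below the break-even volume, Consumption is the cheaper option on these figures; above it, a fixed unit wins.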

When does it make sense to use a self-hosted gateway?

When you have backends that aren't accessible from Azure, such as on-premises APIs or APIs in another cloud, and you need APIM to route to them. If all your backends are in Azure, the managed gateway handles everything and is simpler. Networking options for the managed gateway may also remove the need for a self-hosted gateway, so it is usually a functional requirement that pushes you towards one.

Can I reduce costs by using multiple smaller instances instead of one larger one?

It depends. Each instance carries its own fixed unit cost, so one APIM instance scaled from 1 unit to 2 units will generally be cheaper than 2 separate instances with 1 unit each. On the other hand, 2 Standard instances can often be cheaper than 1 Premium instance, although there are functionality trade-offs.

