$ cat aws-nat-gateway-hidden-cost.md

The hidden cost of AWS NAT Gateway.

· 5 min read · aws · finops

It's not the $32/month idle fee. It's the $0.045 per GB of traffic that never needed to leave the VPC. Most AWS accounts pay this tax silently, every month, on TBs of S3 reads.

Here's what NAT Gateway actually costs in eu-central-1 as of writing:

$0.045 / hour      ≈ $32.40 / month, per gateway
$0.045 / GB        on every byte processed

One gateway, sitting idle, for a year: ~$390. Which is fine. The hourly fee is a rounding error.

The problem is the per-GB. AWS bills $0.045 for every gigabyte the gateway processes, regardless of direction, on top of any data-transfer-out charges to the internet. If your private-subnet workloads pull 5 TB/month of container images from public ECR, npm packages, OS updates, S3 objects in another region, or — most often — S3 objects in the same region via the public S3 endpoint, that's another $225/month per gateway just for shovelling bytes you already had a free path to.

Multiply by 3 AZs × 4 environments and the tax adds up fast.
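
To put numbers on that multiplication (a back-of-the-envelope using the figures above, with the pessimistic assumption that every gateway sees similar volume):

awk 'BEGIN {
  gb     = 5 * 1000        # 5 TB/month of pulls, in decimal GB
  per_gw = gb * 0.045      # one gateway: $225/month of data processing
  fleet  = per_gw * 3 * 4  # 3 AZs x 4 environments
  printf "per gateway: $%.0f/month   fleet: $%.0f/month\n", per_gw, fleet
}'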

The killer: S3 over the public endpoint

The single biggest line item we see, in account after account, is private workloads reading from S3 over the public endpoint. The traffic stays within AWS, but because it's destined for a public IP, it routes through your NAT Gateway and you pay $0.045/GB for the privilege.

The fix is free, has been since 2015, and most accounts still don't use it: VPC gateway endpoints for S3 and DynamoDB. They cost nothing. They route traffic over AWS's private backbone. They take ten lines of Terraform.

resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.${var.region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = aws_route_table.private[*].id
}

resource "aws_vpc_endpoint" "dynamodb" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.${var.region}.dynamodb"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = aws_route_table.private[*].id
}

Apply it. Your same-region S3 traffic now bypasses the NAT Gateway entirely (cross-region S3 still takes the NAT path), and AWS still charges you nothing for the endpoint. We've seen this single change cut NAT bills by 60–80% in data-heavy workloads.
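
One way to confirm it took effect (vpc-0abc123 is a placeholder; gateway endpoints show up in your route tables as pl-... destinations):

aws ec2 describe-route-tables \
  --filters Name=vpc-id,Values=vpc-0abc123 \
  --query 'RouteTables[].Routes[].DestinationPrefixListId' \
  --output text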

The audit query

Before you optimise anything, look at what your gateways are actually processing. Pull the last 30 days from CloudWatch:

aws cloudwatch get-metric-statistics \
  --namespace AWS/NATGateway \
  --metric-name BytesOutToDestination \
  --dimensions Name=NatGatewayId,Value=nat-0abc123 \
  --start-time "$(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%S)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
  --period 86400 \
  --statistics Sum

Sum the datapoints, run the same query for BytesInFromDestination, and add the two totals. Divide by 1e9 for GB and multiply by $0.045; that's roughly this gateway's monthly processing cost. Anything over a few hundred GB/day is worth investigating.
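
If you'd rather not do the arithmetic by hand, a rough sketch that pulls both metrics and prices the result (assumes GNU date, like the command above; nat-0abc123 is still a placeholder):

for metric in BytesOutToDestination BytesInFromDestination; do
  aws cloudwatch get-metric-statistics \
    --namespace AWS/NATGateway \
    --metric-name "$metric" \
    --dimensions Name=NatGatewayId,Value=nat-0abc123 \
    --start-time "$(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%S)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
    --period 86400 \
    --statistics Sum \
    --query 'Datapoints[].Sum' \
    --output text
done | tr '\t' '\n' | awk '{gb += $1 / 1e9} END {printf "%.0f GB processed, about $%.2f at $0.045/GB\n", gb, gb * 0.045}'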

For "what is actually going through it", VPC Flow Logs is the only honest answer. Enable them, ship to S3 + Athena, and run:

SELECT
  dstaddr,
  SUM(bytes) / 1e9 AS gb_out
FROM vpc_flow_logs
WHERE srcaddr LIKE '10.%'
  AND dstaddr NOT LIKE '10.%'
  AND day >= date_add('day', -7, current_date)
GROUP BY dstaddr
ORDER BY gb_out DESC
LIMIT 20;

The top 20 destinations tell you exactly what to fix. S3 IPs (check them against the AWS-managed prefix list for S3 in your region, com.amazonaws.<region>.s3) → add a gateway endpoint. ECR Public → switch to private ECR with an interface endpoint. Container registries on Docker Hub → mirror into ECR.
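
For the ECR case, a sketch of the two interface endpoints involved (subnet and security-group names are assumptions; unlike gateway endpoints, interface endpoints do carry an hourly and per-GB charge, just a far smaller one than NAT processing). Note that ECR also needs the S3 gateway endpoint above, because image layers are served from S3:

resource "aws_vpc_endpoint" "ecr_api" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${var.region}.ecr.api"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = aws_subnet.private[*].id
  security_group_ids  = [aws_security_group.endpoints.id]  # must allow 443 from the VPC
  private_dns_enabled = true
}

resource "aws_vpc_endpoint" "ecr_dkr" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${var.region}.ecr.dkr"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = aws_subnet.private[*].id
  security_group_ids  = [aws_security_group.endpoints.id]
  private_dns_enabled = true
}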

When one NAT per AZ is overkill

The AWS reference architecture says: one NAT Gateway per AZ, for HA. For most non-prod environments and many low-traffic prods this is overkill: it triples your idle cost to guard against an outage you could ride out as a brief blip.

For dev and staging, a single shared NAT Gateway is usually fine. If the AZ it lives in goes down, your dev environment loses egress until the AZ recovers or you recreate the gateway elsewhere. Compare that to 3 × $32.40 × 12 ≈ $1,166/year per environment, forever, two-thirds of which is buying redundancy dev doesn't need.
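
What the shared layout looks like in Terraform (a sketch, assuming the same private route tables as above, a single public subnet to host the gateway, and a statically known route-table count):

resource "aws_eip" "nat" {
  domain = "vpc"
}

resource "aws_nat_gateway" "shared" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id  # one AZ: cheap, not HA
}

resource "aws_route" "private_egress" {
  count                  = length(aws_route_table.private)
  route_table_id         = aws_route_table.private[count.index].id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.shared.id
}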

For production, the calculation depends on what "egress down for one AZ" means for you. If your workload is mostly serving requests (egress is OS updates and metric pushes), you can survive a single-NAT outage. If it's ETL pulling from external APIs, you can't.

The IPv6 escape hatch

NAT Gateway exists because IPv4 is scarce. Workloads using IPv6 don't need it: they use an egress-only Internet Gateway, which has no hourly fee and no per-GB processing charge (standard data-transfer-out rates still apply, but you'd pay those through a NAT Gateway too).
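
The moving parts are small (a sketch; it assumes the VPC and private subnets already have IPv6 CIDRs assigned):

resource "aws_egress_only_internet_gateway" "eigw" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route" "private_ipv6_egress" {
  count                       = length(aws_route_table.private)
  route_table_id              = aws_route_table.private[count.index].id
  destination_ipv6_cidr_block = "::/0"
  egress_only_gateway_id      = aws_egress_only_internet_gateway.eigw.id
}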

Going IPv6-first is a bigger architecture decision than this post can cover, but if you're starting greenfield in 2026 it's worth seriously considering. AWS has steadily rolled out IPv6 support across services; most managed AWS services and Kubernetes (with amazon-vpc-cni in IPv6 mode) support it now.

The five-minute checklist

  1. Add S3 + DynamoDB gateway endpoints to every VPC. Free, no risk, ten lines.
  2. Look at NAT bytes over 30 days per gateway. Anything > 1 TB/month deserves a flow-log investigation.
  3. For ECR-heavy workloads (CI runners, k8s nodes) add an ECR interface endpoint.
  4. Collapse non-prod NAT to one shared gateway per environment.
  5. Mirror Docker Hub / public container registries into private ECR (pull-through cache rules make this straightforward).

Sanity check. Open Cost Explorer, group by usage type, filter to NatGateway-Bytes. If that line is more than 5% of your monthly AWS bill, you have at least one of the problems above and the fix is an afternoon's work.
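
The same check from the CLI, if you prefer (a sketch; EUC1-NatGateway-Bytes is the eu-central-1 usage type, other regions use their own prefix, and the dates are placeholders):

aws ce get-cost-and-usage \
  --time-period Start=2026-01-01,End=2026-02-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --filter '{"Dimensions":{"Key":"USAGE_TYPE","Values":["EUC1-NatGateway-Bytes"]}}'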

We do quick AWS cost audits — usually 1–2 weeks, fixed scope. If your NAT line is concerning, drop us a note.