EC2 vs Lambda vs ECS — Choosing Your Compute

Certification prep teaches you what each service does. It rarely teaches you how to choose between them when you're staring at a real workload. This guide covers the decision, not the definition.

The Core Question

Every compute decision comes down to three variables:

  • Runtime — how long does your code run per invocation?
  • Shape — is the workload stateless or stateful? Burst or steady?
  • Operational burden — how much do you want to manage?

Get these three right and the service choice becomes obvious.


EC2 — When You Need the Machine

EC2 is not a default. It is a deliberate choice that comes with real operational cost: patching, scaling configuration, AMI management, instance health monitoring. You choose EC2 when you specifically need what EC2 provides.

Choose EC2 when:

  • Your application requires persistent local state (files, sockets, long-lived processes)
  • You need predictable, sustained CPU/memory (ML inference, video encoding, heavy batch jobs)
  • You are running software that cannot be containerised — legacy apps, licensed software tied to a host
  • You need full OS control: kernel parameters, custom networking, hardware-level access
  • Cost at scale favours reserved instances over per-request pricing

Real example: A Prometheus server. It needs persistent disk for TSDB storage, a stable network address so that dashboards and other clients can reach it, and long-lived connections. Lambda cannot do this. ECS can, but persistent storage for TSDB requires careful EFS or EBS attachment. EC2 with a large attached EBS volume is the straightforward answer.

Warning signs you are choosing EC2 incorrectly:

  • Your instances are idle most of the time
  • You are running one small process on a large instance
  • You are managing EC2 to run containers — use ECS instead

Lambda — When You Need a Function, Not a Server

Lambda's value is not cheapness. It is the elimination of idle cost and operational overhead for workloads that are event-driven and short-lived.

Choose Lambda when:

  • Your code runs in response to an event (S3 upload, API call, SNS message, schedule)
  • Each invocation is independent — no shared state between calls
  • Execution time is under 15 minutes
  • Traffic is spiky or unpredictable: Lambda scales to zero when idle and back up on demand, with no capacity planning (the first requests after idle still pay a cold start)
  • You want zero infrastructure management

Real example: Image resizing triggered by S3 uploads. Every upload fires an event, Lambda runs for a few seconds, produces a thumbnail, writes it back to S3. No server needed, no idle cost, no scaling configuration. This is exactly what Lambda is for.
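The event-handling half of that example can be sketched in a few lines of Python. This is a minimal illustration, not a production handler: the thumbnails/ prefix and the thumbnail_key naming scheme are assumptions, and resize is a hypothetical injected callable standing in for the S3 download, image resize, and upload, which are left out so the sketch stays self-contained.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 events (spaces become "+").
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def thumbnail_key(key, prefix="thumbnails/"):
    """Hypothetical naming scheme: mirror the source key under a prefix."""
    return prefix + key


def handler(event, context, resize=None):
    """Lambda entry point.

    `resize` is a hypothetical callable (bucket, source_key, target_key)
    that would do the actual S3 I/O and image resizing.
    """
    written = []
    for bucket, key in uploaded_objects(event):
        if key.startswith("thumbnails/"):
            # Skip our own output so the function does not retrigger itself.
            continue
        target = thumbnail_key(key)
        if resize is not None:
            resize(bucket, key, target)
        written.append(target)
    return {"thumbnails": written}
```

Each invocation handles only the records in its own event and keeps nothing between calls, which is the independence property listed above. Writing thumbnails under a separate prefix (or a separate bucket) matters: without it, every thumbnail written would fire the trigger again.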

Where Lambda breaks in production:

  • Cold starts: the first invocation after idle can take hundreds of milliseconds to several seconds. For latency-sensitive APIs this is a problem. Provisioned concurrency mitigates it at extra cost.
  • Duration limits: 15 minutes maximum. Long-running jobs (ETL, video processing) do not fit.
  • Local state: /tmp is available (512 MB to 10 GB depending on configuration), but it is ephemeral and not shared between instances.
  • VPC cold starts: Lambda inside a VPC has historically had longer cold starts. This has improved significantly with Hyperplane ENIs but is still a consideration.
  • Concurrency limits: the default quota is 1,000 concurrent executions per region per account. It can be raised on request, but until it is, invocations beyond it are throttled, so at scale it becomes a real constraint.
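To judge whether the concurrency quota is a risk, a rough estimate is usually enough: steady-state concurrency is approximately request rate multiplied by average duration. The sketch below assumes that approximation; the function names and the 80% headroom figure are illustrative choices, not AWS recommendations.

```python
def estimated_concurrency(requests_per_second, avg_duration_seconds):
    """Approximate concurrent executions: arrival rate x average duration."""
    return requests_per_second * avg_duration_seconds


def fits_quota(requests_per_second, avg_duration_seconds,
               quota=1000, headroom=0.8):
    """True if estimated concurrency stays within `headroom` of the quota.

    `quota` defaults to the 1,000 figure quoted above; `headroom` is an
    assumed safety margin to leave room for bursts.
    """
    needed = estimated_concurrency(requests_per_second, avg_duration_seconds)
    return needed <= quota * headroom
```

For instance, 200 requests per second at 0.5 seconds each needs roughly 100 concurrent executions, while 500 requests per second at 2 seconds each needs roughly 1,000 and would already be at the default quota.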

Warning signs you are choosing Lambda incorrectly:

  • Your Lambda runs for 5+ minutes regularly — reconsider the architecture
  • You are managing complex shared state between Lambda invocations
  • You need consistent sub-10ms latency with no variance

ECS — When You Need Containers Without Kubernetes

ECS is the middle path: more control than Lambda, less operational overhead than EC2. It runs containers. Your unit of deployment is a Docker image, not a machine.

ECS has two launch types:

Fargate — AWS manages the underlying compute. You define CPU and memory per task. No EC2 instances to patch or scale. Pay per task-second.

EC2 launch type — You manage a cluster of EC2 instances. ECS schedules containers onto them. More control, more ops work, better cost efficiency at scale with reserved instances.

Choose ECS (Fargate) when:

  • You have a long-running service that needs to be containerised
  • Your workload does not fit Lambda's constraints (runtime, state, concurrency)
  • You do not want to manage EC2 instances
  • You need sidecars (logging agents, service mesh proxies) alongside your application container
  • You want port-level control and persistent connections

Choose ECS (EC2 launch type) when:

  • You have consistent, high-volume workloads where EC2 reserved instances reduce cost significantly
  • You need GPU instances or specific instance types not available in Fargate
  • You need more control over networking at the host level

Real example: A Node.js API that handles WebSocket connections. A Lambda function cannot hold a connection open between invocations. EC2 works, but you are managing machines for an app that should just be a container. ECS Fargate behind an ALB fits: a right-sized task definition, auto scaling on CPU/memory, and no EC2 to manage. This is the correct choice.
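A Fargate task definition for a service like this might look roughly as follows. The sketch only builds a Python dictionary using the field names accepted by the ECS RegisterTaskDefinition API; it makes no AWS call. The family name, container names, port 3000, the CPU/memory sizing, and the sidecar are all assumptions for illustration, and fields a real deployment usually needs (execution role, logging configuration) are left out.

```python
def websocket_api_task_definition(image, log_router_image):
    """Build a Fargate task definition with an app container and one sidecar.

    Both image arguments are hypothetical image URIs supplied by the caller.
    """
    return {
        "family": "websocket-api",            # assumed name
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",              # required for Fargate tasks
        "cpu": "512",                         # 0.5 vCPU, task-level
        "memory": "1024",                     # 1 GB, task-level
        "containerDefinitions": [
            {
                "name": "api",
                "image": image,
                "essential": True,
                # Port the ALB target group would forward to (assumed).
                "portMappings": [{"containerPort": 3000, "protocol": "tcp"}],
            },
            {
                # Sidecar, e.g. a logging agent; the task keeps running
                # if this container stops because it is not essential.
                "name": "log-router",
                "image": log_router_image,
                "essential": False,
            },
        ],
    }
```

CPU and memory are declared per task, which is what "right-sized" means here: you size the task to the service, not an instance to the fleet.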


The Decision Tree

Is your workload event-driven and under 15 minutes?
├── YES → Can it tolerate cold starts?
│   ├── YES → Lambda
│   └── NO → Lambda + Provisioned Concurrency, or ECS
└── NO → Is it a containerised workload?
    ├── YES → Do you want to manage EC2 instances?
    │   ├── NO → ECS Fargate
    │   └── YES → ECS EC2 (reserved instances at scale)
    └── NO → Do you need full OS control or persistent state?
        ├── YES → EC2
        └── NO → Consider containerising it first
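The same tree can be written as a small Python function, which makes the order of the questions explicit. This is a direct transcription of the tree above, nothing more; the function and parameter names are made up for the sketch.

```python
def choose_compute(*, event_driven=False, under_15_minutes=False,
                   tolerates_cold_starts=False, containerised=False,
                   wants_to_manage_ec2=False,
                   needs_os_control_or_state=False):
    """Walk the decision tree and return the suggested service as text."""
    # First question: event-driven and short-lived?
    if event_driven and under_15_minutes:
        if tolerates_cold_starts:
            return "Lambda"
        return "Lambda + Provisioned Concurrency, or ECS"

    # Otherwise: is it already a container?
    if containerised:
        if wants_to_manage_ec2:
            return "ECS EC2 (reserved instances at scale)"
        return "ECS Fargate"

    # Not event-driven, not containerised.
    if needs_os_control_or_state:
        return "EC2"
    return "Consider containerising it first"
```

For the WebSocket API described earlier, choose_compute(containerised=True) returns "ECS Fargate"; for the Prometheus server, choose_compute(needs_os_control_or_state=True) returns "EC2".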

Cost Comparison Intuition

This is not a pricing table — those change. This is how to think about relative cost:

Pattern          Cost driver                         Good for
Lambda           Per request + duration              Spiky, event-driven, low sustained volume
Fargate          Per task-second (CPU + memory)      Medium sustained workloads, no ops overhead
EC2 On-Demand    Per instance-hour (busy or idle)    Variable workloads, short-term
EC2 Reserved     Committed 1- or 3-year term         Predictable, sustained workloads

Lambda becomes expensive when it runs constantly at high concurrency. EC2 becomes cheap when reserved and well-utilised. Fargate sits between them — more predictable than Lambda at volume, more flexible than EC2 for variable loads.
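The crossover point can be estimated with simple arithmetic. The sketch below compares a per-request model against an always-on instance; it is a rough model under stated assumptions. All prices are parameters you supply from the current pricing pages, and free tiers, data transfer, storage, and the operational cost of running the instance are ignored.

```python
def lambda_monthly_cost(requests, avg_duration_s, memory_gb,
                        price_per_request, price_per_gb_second):
    """Per-request charge plus duration charge (GB-seconds consumed)."""
    gb_seconds = requests * avg_duration_s * memory_gb
    return requests * price_per_request + gb_seconds * price_per_gb_second


def always_on_monthly_cost(hourly_rate, hours=730):
    """An instance left running is billed for every hour, busy or idle.

    730 is an assumed average number of hours in a month.
    """
    return hourly_rate * hours


def break_even_requests(avg_duration_s, memory_gb, price_per_request,
                        price_per_gb_second, hourly_rate, hours=730):
    """Monthly request count at which the two models cost the same."""
    cost_per_request = (price_per_request
                        + avg_duration_s * memory_gb * price_per_gb_second)
    return always_on_monthly_cost(hourly_rate, hours) / cost_per_request
```

Below the break-even volume the per-request model is cheaper because idle time costs nothing; above it, the always-on instance wins, provided a single instance can actually absorb that load.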


What the Exam Tests vs What Production Teaches

The exam tests: Can you identify which service supports which feature? Max Lambda duration? ECS task definition structure?

Production teaches: The answer is almost never one service. A real system might use Lambda for event processing, ECS for the API layer, and EC2 for the database or stateful workload. The skill is knowing which workload belongs where — and being able to justify the operational trade-off you are accepting.