Hot lambdas: keep Inquir function containers warm
Warm pools keep selected functions ready between calls so the next invoke skips most of the cold bootstrap (the README cites roughly 5 ms on the warm Docker path; benchmark your own handlers). Firecracker microVMs on Linux are a separate isolation path with their own cold profile.
Situation
When cold starts hurt
Interactive flows and tight-SLA endpoints notice sporadic initialization costs: dependency imports, JIT warmup, and connection pool setup.
Autoscaling from zero saves money but injects variance that shows up in user-visible tails.
Sharp edges
Why “always scale to zero” is not universal
Some workloads cost more in engineering time spent fighting cold starts than a modest always-warm footprint costs in idle capacity.
Edge-only models optimize for geography, not for long-lived language runtimes with heavy imports.
How Inquir fits
How Inquir approaches warmth
When hot lambdas are enabled, the Docker path can keep a small pool of ready runners per function. Before reuse, a quick health check confirms the process is still alive; idle containers age out on configurable timers so you are not leaking memory forever.
Each warm slot is still its own container for that function. If you need the strictest hygiene between requests, you can force dispose-after-use so every call starts fresh—at the cost of cold-start time.
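The lifecycle described above can be sketched as a small pool: health-check before reuse, age out idle slots on a timer, and optionally discard after every call. This is a minimal illustration, not Inquir's actual implementation; the names (`WarmPool`, `Runner`) and the synchronous model are assumptions for clarity.

```typescript
// Hypothetical sketch of a warm-container pool; not Inquir's real API.
interface Runner {
  id: number;
  healthy: boolean;
  lastUsed: number;
}

class WarmPool {
  private idle: Runner[] = [];
  private nextId = 0;

  constructor(
    private maxSize: number,          // pool depth
    private idleTimeoutMs: number,    // age-out timer for idle containers
    private disposeAfterUse: boolean, // strict hygiene: fresh container per call
  ) {}

  // Reuse a healthy warm runner if one exists; otherwise cold-start a new one.
  acquire(now: number): Runner {
    this.evictExpired(now);
    while (this.idle.length > 0) {
      const r = this.idle.pop()!;
      if (r.healthy) return r;        // quick health check before reuse
    }
    return { id: this.nextId++, healthy: true, lastUsed: now }; // cold start
  }

  release(r: Runner, now: number): void {
    // Discard when hygiene demands it or the pool is already full.
    if (this.disposeAfterUse || this.idle.length >= this.maxSize) return;
    r.lastUsed = now;
    this.idle.push(r);                // keep warm for the next invoke
  }

  private evictExpired(now: number): void {
    this.idle = this.idle.filter((r) => now - r.lastUsed < this.idleTimeoutMs);
  }

  get idleCount(): number {
    return this.idle.length;
  }
}
```

Note the trade-off encoded in `release`: with `disposeAfterUse` set, every call pays the cold-start cost, which is exactly the hygiene-versus-latency choice described above.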
Capabilities
Related controls
Tuning
Pool depth, idle timeouts, invocations per container, and dispose-after-use are all exposed as environment-driven settings on the Node server—tune them against real traffic instead of defaults.
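A hedged sketch of how such environment-driven settings might be loaded with sane fallbacks. The variable names (`HOT_POOL_DEPTH`, `HOT_IDLE_TIMEOUT_MS`, `HOT_MAX_INVOCATIONS`, `HOT_DISPOSE_AFTER_USE`) and default values are placeholders, not Inquir's documented configuration.

```typescript
// Illustrative config loader; variable names and defaults are assumptions.
interface WarmConfig {
  poolDepth: number;
  idleTimeoutMs: number;
  maxInvocationsPerContainer: number;
  disposeAfterUse: boolean;
}

function loadWarmConfig(
  env: Record<string, string | undefined> = process.env,
): WarmConfig {
  // Parse a positive integer, falling back when the value is missing or invalid.
  const int = (key: string, fallback: number): number => {
    const raw = env[key];
    const n = raw === undefined ? NaN : Number.parseInt(raw, 10);
    return Number.isFinite(n) && n > 0 ? n : fallback;
  };
  return {
    poolDepth: int("HOT_POOL_DEPTH", 2),
    idleTimeoutMs: int("HOT_IDLE_TIMEOUT_MS", 60_000),
    maxInvocationsPerContainer: int("HOT_MAX_INVOCATIONS", 500),
    disposeAfterUse: env["HOT_DISPOSE_AFTER_USE"] === "true",
  };
}
```

Validating and falling back here, rather than deep in the pool code, keeps a single typo in an env var from silently disabling warmth.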
Observability
Compare init versus handler time in execution traces, and track health-check latency from the pool.
Cost
Warmth trades idle Docker capacity for tail latency—validate against your traffic bands.
Steps
How to tune warm containers in Inquir Compute
Measure
Capture p95/p99 with realistic auth and payload sizes.
Adjust
Align pool sizes with traffic bands instead of guessing.
Review
Revisit after major dependency upgrades—warmth does not fix slow imports.
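For the Measure step above, a tiny nearest-rank percentile helper is enough to turn raw latency samples into the p95/p99 numbers you tune against. This is a generic utility, not an Inquir API.

```typescript
// Nearest-rank percentile: smallest sample with at least p% of values at or below it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}
```

Feed it per-request end-to-end latencies captured with realistic auth and payload sizes; a large gap between p50 and p99 is the signature of cold-start tails that warmth is meant to shave.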
Code example
Latency thinking
Split measurements into connect, init, and handler phases when profiling.
// p50 alone hides cold tails; track p99 alongside error rates after deploys.
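The phase split above can be made concrete with a small timing wrapper. The phase names follow the text; the wrapper itself and its injectable clock are illustrative assumptions, not part of Inquir.

```typescript
// Sketch: time connect, init, and handler phases separately when profiling.
type PhaseTimings = { connectMs: number; initMs: number; handlerMs: number };

function timePhases(
  connect: () => void,
  init: () => void,
  handler: () => void,
  clock: () => number = () => performance.now(), // injectable for testing
): PhaseTimings {
  const t0 = clock();
  connect();
  const t1 = clock();
  init();
  const t2 = clock();
  handler();
  const t3 = clock();
  return { connectMs: t1 - t0, initMs: t2 - t1, handlerMs: t3 - t2 };
}
```

If warmth is working, repeated invokes should show `initMs` collapsing toward zero while `handlerMs` stays flat; a stubbornly large `handlerMs` means the slowness is in your code, not in startup.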
Fit
When to use
- Steady traffic paths
- Agent tool loops with tight timeouts
When not to use
- Rare batch jobs where zero idle cost dominates
FAQ
Is warmth guaranteed?
It is a best-effort capacity strategy; load tests on your environment remain the source of truth.
Does this replace profiling?
No. Slow code stays slow—warmth only removes one class of startup overhead.
What about streaming responses from the gateway?
The Docker orchestrator’s streaming path requires hot lambdas to be enabled—without a warm runner it refuses the stream so clients are not left half-connected.
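The fail-fast behavior in that answer can be sketched as a simple precondition check. This guard is hypothetical; Inquir's actual orchestrator logic is not shown here.

```typescript
// Hypothetical guard: refuse a stream rather than leave a client half-connected.
type StreamDecision = { ok: boolean; reason?: string };

function canStream(
  hotLambdasEnabled: boolean,
  warmRunnerAvailable: boolean,
): StreamDecision {
  if (!hotLambdasEnabled) {
    return { ok: false, reason: "hot lambdas disabled" };
  }
  if (!warmRunnerAvailable) {
    return { ok: false, reason: "no warm runner available" };
  }
  return { ok: true };
}
```

Rejecting up front gives the client a clean error it can retry, instead of an open connection that stalls while a container cold-starts.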