LoopJump's Blog

10 Patterns for Controlling the Cloud in AWS

2024-10-10

Source: AWS re:Invent 2018

Pattern 1: Checksum All of the Things
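
One way to read this pattern: every configuration payload travels with a checksum, and the receiver refuses anything that does not match. A minimal sketch, assuming SHA-256 over canonical JSON; the names `seal` and `open_sealed` are hypothetical and not from the talk.

```python
import hashlib
import json


def seal(config: dict) -> dict:
    """Wrap a config payload with a SHA-256 checksum of its canonical JSON form."""
    body = json.dumps(config, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {"body": body, "sha256": digest}


def open_sealed(envelope: dict) -> dict:
    """Return the config only if the checksum matches; otherwise refuse it."""
    actual = hashlib.sha256(envelope["body"].encode("utf-8")).hexdigest()
    if actual != envelope["sha256"]:
        raise ValueError("checksum mismatch: refusing to apply corrupted config")
    return json.loads(envelope["body"])
```

A plain checksum only catches accidental corruption; it does not authenticate the sender, which is what Pattern 2 is about.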

Pattern 2: Cryptographic Authentication

Encrypt and authenticate everything.

Control Planes are powerful and security critical systems.

Be able to revoke and rotate every credential, but also watch out for certificate expiries.[??]

Prevent human access to production credentials.

Never allow a non-production control plane to talk to the production data plane.

Pattern 3: Cells, Shells and Poison Tasters

we divide up our control planes horizontally into regions, availability zones and cells.

it’s also common to compartmentalize control planes so that the data plane is insulated from control plane crashes.

poison tasters: check up front that a change is safe.[??]
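
A poison taster can be sketched as "try the change on a throwaway copy first". This is only an illustration under assumptions: `apply_with_taster`, `change`, and `is_healthy` are hypothetical names, and a real taster would more likely be a separate process or host than an in-memory copy.

```python
import copy
from typing import Callable


def apply_with_taster(state: dict,
                      change: Callable[[dict], None],
                      is_healthy: Callable[[dict], bool]) -> bool:
    """Apply `change` to `state` only if it first succeeds on a throwaway copy."""
    trial = copy.deepcopy(state)
    try:
        change(trial)            # the taster takes the first bite
    except Exception:
        return False             # the change crashed the taster: reject it
    if not is_healthy(trial):
        return False             # the change ran but left a bad state: reject it
    change(state)                # judged safe, now apply it for real
    return True
```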

Pattern 4: Asynchronous Coupling

Synchronous systems are very strongly coupled.

A problem in a synchronous downstream dependency has immediate impact on the upstream callers.

Retries from upstream callers can all-too-easily fan-out and amplify problems.

Asynchronously coupled systems tend to be more tolerant.

Can make partial progress even when some components are unavailable.

Workflows and queues can be tuned to have deterministic retry behaviors.[?!]
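
To illustrate what "deterministic retry behavior" might mean for a queue: each task gets a fixed retry budget and failed tasks go to the back of the queue, so retries cannot fan out or block other work. A rough sketch; `drain` and `max_attempts` are hypothetical names, not anything from the talk.

```python
from collections import deque
from typing import Callable


def drain(tasks: list, handler: Callable[[str], None], max_attempts: int = 3):
    """Process queued tasks; each task gets at most `max_attempts` tries."""
    queue = deque((task, 1) for task in tasks)
    done, dead = [], []
    while queue:
        task, attempt = queue.popleft()
        try:
            handler(task)
            done.append(task)
        except Exception:
            if attempt < max_attempts:
                queue.append((task, attempt + 1))  # retry later, behind other work
            else:
                dead.append(task)                  # budget spent: stop retrying
    return done, dead
```

Because the total number of handler calls is bounded by `len(tasks) * max_attempts`, a broken dependency produces a known, limited amount of extra load.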

Pattern 5: Closed Feedback Loops

Pattern 6: Small pushes and large pulls

very frequently asked question: is it better to push or to pull?

for example: should data plane hosts accept connections and be pushed configurations or should they connect to the control plane and pull them?

it’s really the wrong question.

Long lived connections can support pushing timely updates regardless of the direction of the connection.

better to ask: which fleet is bigger? in general, small fleets should connect to bigger fleets.

this avoids the problems of small fleets being overwhelmed with thundering herds and retry storms.

Pattern 7: Avoid Cold Start and Cold Caches

caches are bi-modal systems: super fast when they have entries, and slow when they are empty.

a thundering herd hitting a cold cache can prevent it from ever getting warm.

retry storms often need to be moderated by throttles.

work out if you really need a cache at all.

pre-warm caches before accepting requests.

consider serving stale entries when backends are unavailable.
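
The last two points (pre-warm, serve stale) can be sketched together. This is a minimal illustration under assumptions: `StaleOkCache`, `fetch`, and `ttl_seconds` are hypothetical names, and the clock is injectable only so the behavior is easy to test.

```python
import time
from typing import Callable, Iterable


class StaleOkCache:
    """Tiny cache that can be pre-warmed and falls back to stale entries."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, time stored)

    def prewarm(self, keys: Iterable[str]) -> None:
        """Fill the cache before the host starts accepting requests."""
        for key in keys:
            self._entries[key] = (self._fetch(key), self._clock())

    def get(self, key: str) -> str:
        entry = self._entries.get(key)
        if entry is not None and self._clock() - entry[1] < self._ttl:
            return entry[0]              # fresh hit
        try:
            value = self._fetch(key)
        except Exception:
            if entry is not None:
                return entry[0]          # backend unavailable: serve stale
            raise                        # nothing cached: surface the error
        self._entries[key] = (value, self._clock())
        return value
```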

Pattern 8: Throttles

Throttles and rate-limits are often needed to moderate problem requestors and to dampen fluctuating systems.

it takes careful work to ensure that throttling does not impact the end customer experience.
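
A token bucket is one common way to build such a throttle (the notes do not say which algorithm the talk had in mind). A minimal sketch; `TokenBucket`, `rate_per_second`, and `burst` are illustrative names.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow `rate_per_second` requests on average, with bursts up to `burst`."""

    def __init__(self, rate_per_second: float, burst: float,
                 clock: Callable[[], float] = time.monotonic):
        self._rate = rate_per_second
        self._burst = burst
        self._clock = clock
        self._tokens = burst          # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill for the elapsed time, never beyond the burst capacity.
        self._tokens = min(self._burst,
                           self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False                  # caller should reject or delay the request
```

Keeping one bucket per requestor rather than one global bucket is a way to slow a problem requestor without affecting everyone else.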

Pattern 9: Delta

data with versions.
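
One possible reading of "data with versions": the control plane stamps every change with a version number, and a data plane host asks only for what changed since the version it already has. A rough sketch under that assumption; `VersionedStore` and `delta_since` are hypothetical names, and deletions are not handled here.

```python
class VersionedStore:
    """Key-value store that can report what changed since a given version."""

    def __init__(self):
        self.version = 0
        self._data = {}
        self._changed_at = {}   # key -> version at which it last changed

    def put(self, key, value):
        self.version += 1
        self._data[key] = value
        self._changed_at[key] = self.version

    def delta_since(self, version):
        """Return (current version, entries changed after `version`)."""
        changes = {key: self._data[key]
                   for key, changed in self._changed_at.items()
                   if changed > version}
        return self.version, changes
```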

Pattern 10: Modality and Constant-Work

so far, we can build a loosely coupled control plane, with deltas to minimize work, and throttles to keep things safe.

but what if a LOT of things change at the same time?

we don’t want to build up backlogs and queues and introduce lag.

systems that change performance in response to workload or data patterns can be fragile.

example: relational databases are great for flexible business queries but terrible for stable control planes. hidden optimizations and query plan flips can wreak havoc.

deployments, peak events, power events, all incur risk because they can be new modes.
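
A sketch of what constant work could look like: instead of sending only what changed, always send a full snapshot padded to a fixed size, so a quiet cycle and a cycle where everything changed cost the same. `MAX_SLOTS`, `build_snapshot`, and `apply_snapshot` are made-up names for illustration only.

```python
MAX_SLOTS = 8  # hypothetical fixed capacity; size it for the worst case


def build_snapshot(config: dict) -> list:
    """Return a snapshot that always has exactly MAX_SLOTS entries."""
    if len(config) > MAX_SLOTS:
        raise ValueError("config exceeds fixed capacity")
    items = sorted(config.items())
    padding = [(None, None)] * (MAX_SLOTS - len(items))
    return items + padding


def apply_snapshot(snapshot: list) -> dict:
    """Rebuild the full local state from a snapshot, visiting every slot."""
    state = {}
    for key, value in snapshot:   # always MAX_SLOTS iterations
        if key is not None:
            state[key] = value
    return state
```

The trade-off is that the system does its worst-case work all the time, which means a burst of changes is never a new mode.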
