GCP — Theory (concise)
When to choose GCP
- Heavy data / analytics workloads: BigQuery is a strong default (serverless, no clusters to size).
- Best Kubernetes experience: GKE (Autopilot manages both the control plane and the nodes).
- Need globally distributed, strongly consistent SQL: Spanner.
- Multi-cluster / on-prem federation: Anthos.
Avoid picking GCP first when the org is already on AWS or needs a presence in regions GCP lacks.
Resource hierarchy & billing
- IAM allow policies and organization policies are inherited downward: Org → Folder → Project → Resource.
- Each project is an isolation boundary for resources, IAM, and quotas, and is linked to a billing account.
- Common pattern: env-per-project (`acme-prod`, `acme-staging`).
- Folders for teams / business units.
- Set budget alerts at both billing-account and project scope.
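The inheritance rule above can be sketched in a few lines. This is a toy model, not the IAM API: the `Node` class, the `effective_bindings` helper, and the member and project names are all hypothetical, and it ignores deny policies and IAM conditions. It only illustrates that the effective policy on a project is the union of its own bindings and everything granted higher up.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One level of the hierarchy: org, folder, project, or resource (hypothetical model)."""
    name: str
    parent: Optional["Node"] = None
    # role -> members granted that role directly on this node
    bindings: dict[str, set[str]] = field(default_factory=dict)


def effective_bindings(node: Node) -> dict[str, set[str]]:
    """Union the node's own bindings with everything inherited from its ancestors."""
    merged: dict[str, set[str]] = {}
    current: Optional[Node] = node
    while current is not None:
        for role, members in current.bindings.items():
            merged.setdefault(role, set()).update(members)
        current = current.parent
    return merged


# Org -> Folder -> Project, using the env-per-project naming from the notes.
org = Node("acme.example", bindings={"roles/viewer": {"group:auditors@acme.example"}})
folder = Node("platform-team", parent=org)
prod = Node("acme-prod", parent=folder,
            bindings={"roles/editor": {"user:dev@acme.example"}})

policy = effective_bindings(prod)
print(policy)
```

The practical consequence: a role granted at the org or folder level cannot be removed further down in this model, which is why broad grants belong as low in the hierarchy as possible.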
VPC peculiarities
- One VPC spans all regions (subnets are per-region).
- Shared VPC: a host project owns the network; service projects attach to it. Firewall rules and network IAM are centralized.
- Private Google Access: VMs without external IPs can reach Google APIs.
- Private Service Connect: a private endpoint inside your VPC for reaching managed services.
- Cloud NAT: managed outbound NAT for instances without external IPs.
- Internal vs external load balancers: for L7 (HTTP/S), use an internal load balancer for traffic that stays inside the VPC and an external one for internet-facing traffic.
Service accounts deeply
- Workloads (GKE pods, VMs, Cloud Run services) run as an attached service account (SA).
- Workload Identity in GKE: link a K8s SA to a GCP SA. Recommended over relying on the node SA.
- Avoid SA keys; use short-lived tokens from the metadata server.
- Audit grants of `roles/iam.serviceAccountTokenCreator` and `roles/iam.serviceAccountUser`: they let one identity act as another.
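As an illustration of "short-lived tokens via the metadata server", here is a minimal sketch using only the standard library. The function names `build_token_request` and `fetch_access_token` are made up for this example; the endpoint path and the `Metadata-Flavor: Google` header are the standard metadata server conventions. The fetch only succeeds on GCP compute (VM, GKE pod, Cloud Run), so the request construction is kept separate from the network call.

```python
import json
import urllib.request

# Metadata server endpoint for the attached service account's access token.
TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request(url: str = TOKEN_URL) -> urllib.request.Request:
    """Build the request; the metadata server rejects calls without this header."""
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


def fetch_access_token(timeout: float = 2.0) -> str:
    """Return a short-lived OAuth2 access token. Only works when running on GCP."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["access_token"]
```

In practice the Google client libraries do this for you through Application Default Credentials; the sketch just shows that no key file is involved.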
BigQuery
- Serverless. Storage and compute (slots) are decoupled.
- Pay per bytes scanned (on-demand) or for reserved slot capacity.
- Avoid `SELECT *`; partition and cluster tables to reduce scan cost.
- Streaming insert vs batch load: streaming costs more but gives near-real-time availability.
- BI Engine for in-memory acceleration.
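To make the scan-cost advice concrete, here is a toy estimate of bytes scanned for a date-partitioned, columnar table. It is not BigQuery's actual planner: the `Partition` class, the column sizes, and the helper names are hypothetical, clustering is not modeled, and the on-demand price is a parameter rather than a real rate.

```python
from dataclasses import dataclass

GIB = 2 ** 30
TIB = 2 ** 40


@dataclass
class Partition:
    day: str                          # partition key, e.g. "2024-01-01"
    bytes_per_column: dict[str, int]  # stored size of each column in this partition


def bytes_scanned(partitions, columns=None, days=None):
    """Sum the bytes of the selected columns in the selected partitions.

    columns=None models SELECT *; days=None models a query with no partition filter.
    """
    total = 0
    for p in partitions:
        if days is not None and p.day not in days:
            continue  # partition pruned by the filter
        for col, size in p.bytes_per_column.items():
            if columns is None or col in columns:
                total += size
    return total


def on_demand_cost(scanned_bytes, price_per_tib):
    """Cost under on-demand pricing; the price is an input, not a quoted rate."""
    return scanned_bytes / TIB * price_per_tib


# Hypothetical table: 30 daily partitions, a narrow column and a wide one.
table = [
    Partition(f"2024-01-{d:02d}", {"user_id": 10 * GIB, "payload": 90 * GIB})
    for d in range(1, 31)
]

full = bytes_scanned(table)                                              # SELECT *, no filter
pruned = bytes_scanned(table, columns={"user_id"}, days={"2024-01-30"})  # one column, one day
print(full // GIB, pruned // GIB)
```

Under these made-up sizes, selecting one narrow column from one partition touches 10 GiB instead of 3000 GiB, which is the whole argument for partition filters and explicit column lists.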
Cloud Run gotchas
- Containers must listen on `$PORT` (default 8080).
- State: revisions are immutable, and any change creates a new revision with fresh containers, so keep state outside the container.
- Cold starts exist; setting min instances > 0 mitigates them.
- Concurrency > 1 means the same container handles multiple requests at once, so code must be thread-safe.
- CPU is only allocated while a request is being processed unless you enable always-allocated CPU.
- Built-in service-to-service auth via Google-signed identity tokens.
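The `$PORT` and concurrency points can be shown together in a minimal standard-library server. This is a sketch under assumptions, not a production setup: `get_port`, `HitCounter`, `Handler`, and `make_server` are names invented for the example, and a real service would more likely use a WSGI/ASGI server.

```python
import os
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def get_port(env=os.environ) -> int:
    """Read the port injected via $PORT, falling back to 8080."""
    return int(env.get("PORT", "8080"))


class HitCounter:
    """Shared state guarded by a lock, because one instance serves concurrent requests."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._count = 0

    def increment(self) -> int:
        with self._lock:
            self._count += 1
            return self._count


counter = HitCounter()


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # The count is per instance and is lost when the instance is replaced.
        body = f"hits on this instance: {counter.increment()}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int) -> ThreadingHTTPServer:
    """Bind on all interfaces so the platform can route traffic to the container."""
    return ThreadingHTTPServer(("0.0.0.0", port), Handler)


# To serve: make_server(get_port()).serve_forever()
```

The in-memory counter is deliberately a bad place for real state: it resets on every new revision or scaled-out instance, which is the statelessness point from the list above.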
GKE Autopilot vs Standard
- Autopilot: Google manages nodes, scaling, and security. Pay per pod resources.
- Standard: you manage node pools.
- Autopilot is the default for most teams now.
Anthos
Manages multi-cluster, hybrid, and multi-cloud Kubernetes plus a service mesh, with federated config and policy. A niche choice.
Common interview Qs
- GCS classes: when to use each? Standard (frequent access), Nearline (30d+), Coldline (90d+), Archive (365d+, cold).
- BigQuery cost spike: how to debug? Check `INFORMATION_SCHEMA.JOBS` for top jobs by bytes processed; partition and cluster tables; use authorized views to limit what consumers can query.
- Workload Identity vs node SA? WI is per-pod; the node SA is shared across all pods on the node, which is too broad.
- Cloud Run vs Cloud Functions? Cloud Functions is for event-driven snippets (now folded into Cloud Run as Cloud Run functions); Cloud Run runs general containers.
- Spanner external consistency? Enabled by the TrueTime API and commit-wait, giving global strong consistency.
- VPC Service Controls: what problem does it solve? Data exfiltration. Even with valid creds, you can’t move BigQuery data out of the perimeter.
- Cross-project IAM? Grant a principal from project A roles in project B. Common pattern: shared service projects.
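The storage-class answer above reduces to a threshold lookup. A minimal sketch, assuming the only input is how often an object is read: the function name `pick_storage_class` is hypothetical, and the 365-day Archive cutoff reflects that class's minimum storage duration. Real decisions also weigh retrieval fees and early-deletion charges.

```python
def pick_storage_class(days_between_reads: int) -> str:
    """Rule-of-thumb mapping from access interval (in days) to a GCS storage class."""
    if days_between_reads < 0:
        raise ValueError("days_between_reads must be non-negative")
    if days_between_reads >= 365:
        return "ARCHIVE"
    if days_between_reads >= 90:
        return "COLDLINE"
    if days_between_reads >= 30:
        return "NEARLINE"
    return "STANDARD"


for interval in (1, 45, 120, 400):
    print(interval, pick_storage_class(interval))
```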