Kudos to Gunjan for making such a great list, which I have slightly updated.
Kubernetes Security Checklist #
Risk Score | Description |
---|---|
10 | Set imagePullPolicy to ‘Always’ |
10 | Service-to-service traffic should use mTLS via a service mesh such as Istio |
10 | Separate registries for prod, dev, and staging. No individual should have direct access to the prod GCR; use a defined process for promoting images to prod |
10 | No privileged pods/containers |
10 | Lock down third-party app RBAC; don’t just accept what’s in the docs. Start with lower permissions and work up |
10 | K8s RBAC for user groups (SRE, DevOps, InfoSec, Dev, Debugging, etc.) |
10 | Images must be scanned for vulnerabilities and signed before they can be deployed in prod |
10 | Make your container filesystem read-only using security context |
10 | Developers should use a minimal base OS image rather than stuffing in the whole runtime and OS |
10 | Network Policy must be enabled and enforced at least at namespace level |
10 | No pods with net=host (hostNetwork: true, where the network namespace is shared between the host and the pod) |
9 | Binary Authorization for GCR images |
9 | Admission controller where InfoSec & SRE enforce policies in prod |
9 | No containers running with default user (default is root!) |
9 | GKE: subscribe to the “Regular” or “Stable” release channel for the Kube API server (similar for other hosted k8s) |
9 | Limit who can talk to the API server, with exceptions granted to some third-party apps like Twistlock, Calico, Istio, Helm, etc. |
9 | Must specify resource limits at the deployment level (enforced via the LimitRanger admission controller) |
8 | Must use COS or a similar hardened OS in prod |
8 | Don’t allow ‘latest’, ‘dev’, or ‘master’ image tags in prod. Always use versioned tags |
7 | Istio policies applied at a higher level in conjunction with Network Policies |
7 | Proper use of namespaces (so you can apply network policies, RBAC, secret sharing, etc.) |
7 | Pods with unnecessary added capabilities (ADD_CAP) are as dangerous as privileged pods |
7 | Disable service account token auto-mount (automountServiceAccountToken: false) |
7 | Each team should have PSP (Pod Security Policy) enabled with policies and exceptions managed by the SRE team |
6 | Avoid host mounts: by sharing host mounts, you remove the filesystem isolation provided by containers |
5 | Specify resource quota per namespace |
5 | API server, kubelet upgrade SLA. Follow https://groups.google.com/forum/#!forum/kubernetes-announce and take action |
3 | Have an annotation for each deployment with owner email, so we can find the dev owner of that app |
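
Several of the pod-level rows above (privileged containers, host networking, read-only root filesystem, non-root user, added capabilities, image pull policy, resource limits, service account auto-mount) can be checked mechanically. Below is a minimal Python sketch of such a check, not a replacement for an admission controller. It assumes the pod spec has already been parsed into a plain dict; `audit_pod_spec` and the finding messages are hypothetical names chosen for illustration.

```python
def audit_pod_spec(pod_spec):
    """Return a list of human-readable findings for one pod spec (a dict)."""
    findings = []

    if pod_spec.get("hostNetwork", False):
        findings.append("pod: hostNetwork is enabled")

    # Kubernetes mounts the token unless this is explicitly false. The
    # ServiceAccount object can also disable it, which this sketch cannot see.
    if pod_spec.get("automountServiceAccountToken", True):
        findings.append("pod: service account token is auto-mounted")

    # A container-level runAsNonRoot overrides the pod-level value.
    pod_non_root = pod_spec.get("securityContext", {}).get("runAsNonRoot", False)

    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext", {})

        if sc.get("privileged", False):
            findings.append(f"{name}: privileged container")
        if not sc.get("readOnlyRootFilesystem", False):
            findings.append(f"{name}: root filesystem is writable")
        if not sc.get("runAsNonRoot", pod_non_root):
            findings.append(f"{name}: may run as root")
        if sc.get("capabilities", {}).get("add"):
            findings.append(f"{name}: adds Linux capabilities")
        if container.get("imagePullPolicy") != "Always":
            findings.append(f"{name}: imagePullPolicy is not 'Always'")
        if "limits" not in container.get("resources", {}):
            findings.append(f"{name}: no resource limits")

    return findings


if __name__ == "__main__":
    # Hypothetical pod spec, used only to exercise the checks.
    risky = {
        "hostNetwork": True,
        "containers": [
            {"name": "app", "image": "example/app:1.0",
             "securityContext": {"privileged": True}},
        ],
    }
    for finding in audit_pod_spec(risky):
        print(finding)
```

A check like this fits naturally in CI, with the same rules enforced again at admission time so that manifests applied by hand are covered too.
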
Container Security Checklist #
Pipeline #
- Verify the image source (registry)
- Use official base images
- Lock down access to the image registry (who can push/pull)
- Scan container image layers for Common Vulnerabilities and Exposures (CVEs)
- Scan configuration files for security and compliance checks in continuous integration (CI)
- Do a static analysis of the code and libraries used by the code to surface any vulnerabilities in the code and its dependencies
- Tag and automatically prevent vulnerable images from running in certain clusters or prevent them from talking to other containers in the cluster
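
As a rough illustration of verifying the image source in CI, the Python sketch below checks an image reference against a registry allowlist and rejects floating tags. The registry prefix `gcr.io/example-prod` and the helper name `check_image` are assumptions made up for this example; the forbidden tags come from the Kubernetes checklist above.

```python
# Hypothetical allowlist; replace with the registries you actually trust.
ALLOWED_REGISTRY_PREFIXES = ("gcr.io/example-prod",)
FORBIDDEN_TAGS = {"latest", "dev", "master"}


def check_image(image, allowed_prefixes=ALLOWED_REGISTRY_PREFIXES):
    """Return a list of problems found in an image reference string."""
    problems = []

    if not any(image.startswith(prefix + "/") for prefix in allowed_prefixes):
        problems.append("image is not from an allowed registry")

    # References pinned by digest (name@sha256:...) are immutable,
    # so the tag check does not apply to them.
    name, _, digest = image.partition("@")
    if digest:
        return problems

    # Only the last path segment can carry a tag; a colon earlier in the
    # reference is a registry port. A missing tag means 'latest'.
    last_segment = name.rsplit("/", 1)[-1]
    tag = last_segment.split(":", 1)[1] if ":" in last_segment else "latest"
    if tag in FORBIDDEN_TAGS:
        problems.append(f"floating tag '{tag}' is not allowed")

    return problems


if __name__ == "__main__":
    for ref in ("gcr.io/example-prod/api:1.4.2", "docker.io/library/nginx"):
        print(ref, check_image(ref))
```
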
Host #
- Lock down the operating system (for example, by using Google’s Container-Optimized OS (COS))
- Use secure computing (seccomp) to restrict host system call (syscall) access from containers
- Use Security-Enhanced Linux (SELinux) to further isolate containers
- Utilize container sandboxing projects like gVisor, Kata Containers, etc. to reduce the attack surface
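
To make the seccomp item more concrete, here is a small Python sketch that generates an allowlist-style seccomp profile in the JSON format used by Docker and Kubernetes: every syscall is denied by default and only the listed ones are allowed. The helper name `build_seccomp_profile` is hypothetical, and the syscall list in the example is deliberately incomplete; a real workload needs many more syscalls, usually discovered by tracing it.

```python
import json


def build_seccomp_profile(allowed_syscalls):
    """Build a deny-by-default seccomp profile that allows only the given syscalls."""
    return {
        # Any syscall not listed below fails with an error.
        "defaultAction": "SCMP_ACT_ERRNO",
        "architectures": ["SCMP_ARCH_X86_64"],
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


if __name__ == "__main__":
    # Illustrative only: far too small to run a real container.
    profile = build_seccomp_profile(["read", "write", "exit_group", "futex"])
    print(json.dumps(profile, indent=2))
```
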
Container Runtimes #
- Ensure security configurations span across container runtimes, especially if the environment has multiple container runtimes in the cluster (for example, different runtimes for the orchestrator control plane and workloads)
- Use policies (for example, pod security policy in Kubernetes) to restrict which containers can run in the cluster including policies to restrict privileged containers, containers that don’t need write access to a specific volume, and containers that need certain syscalls
- Restrict access to container runtime daemon/APIs
Network #
- Secure services that are exposed to the Internet using a firewall
- Lock down Layer 3 and 4 access for the services using network policy
- Create granular Layer 7 policies (Istio, Cilium)
- Use Mutual Transport Layer Security (mTLS) to mutually authenticate containerized workloads (for example, using Istio)
- Segregate containerized workloads with a mix of host segregation and network isolation (for example, separate groups of hosts for workload segregation and/or network policies to isolate different groups of containerized workloads)
- Log unsuccessful connection attempts
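
A common starting point for locking down Layer 3 and 4 access is a default-deny network policy per namespace, with explicit allow rules added on top. The Python sketch below builds such a policy as a dict and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The function name and the namespace `payments` are assumptions for illustration.

```python
import json


def default_deny_policy(namespace):
    """Build a NetworkPolicy that denies all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            # Listing both types with no allow rules denies all traffic.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    # 'payments' is a hypothetical namespace name.
    print(json.dumps(default_deny_policy("payments"), indent=2))
```

Denying egress also blocks DNS lookups, so a policy like this is normally paired with an allow rule for the cluster DNS service. It only takes effect if the cluster's network plugin enforces network policies.
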
Orchestration Configuration #
- Implement version control for orchestrator service definitions and configurations (using git) for auditing
- Ensure cluster-level policies (such as security policies, network policies, and so on) go through your change request, review, and approval process
- Implement orchestrator API access security using role-based access control (RBAC) and network policies
- Be aware of which third-party plugins (such as Container Network Interfaces [CNIs], Container Storage Interfaces [CSIs], and Container Runtime Interfaces [CRIs]) are running (binary/DaemonSet/controller), what access they have, and whether they are running as privileged containers
- Control access to the orchestrator control plane APIs from third-party plugins using RBAC and service accounts
- Enable access logs for all API requests to the orchestrator control plane (for example, audit logs in Kubernetes)
- Scan orchestrator manifests for containerized apps (such as Kubernetes deployment manifests) for security best practices and applicable compliance standards in the CI phase
- If you have any sensitive configuration information in your cluster that needs to be accessed by containers at runtime, make sure the configuration is encrypted (such as encrypted secrets in Kubernetes or HashiCorp Vault) and restrict who can exec into these pods
- Rotate encryption keys that are used for communication between orchestrator components (for example, the Kubernetes API server and etcd)
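
Scanning manifests for RBAC problems in CI can start very simply. The Python sketch below flags wildcard grants and access to `pods/exec` in a Role or ClusterRole that has been parsed into a dict. `audit_rbac_rules` is a hypothetical helper, and these two checks are only examples of what a fuller policy might cover.

```python
def audit_rbac_rules(role):
    """Return findings for overly broad rules in a Role/ClusterRole dict."""
    findings = []
    for index, rule in enumerate(role.get("rules", [])):
        # A '*' in any of these fields grants far more than most apps need.
        for field in ("apiGroups", "resources", "verbs"):
            if "*" in rule.get(field, []):
                findings.append(f"rule {index}: wildcard in {field}")
        # pods/exec lets the subject run commands inside running pods.
        if "pods/exec" in rule.get("resources", []):
            findings.append(f"rule {index}: grants access to pods/exec")
    return findings


if __name__ == "__main__":
    # Hypothetical role, as a third-party chart might ship it.
    role = {
        "kind": "ClusterRole",
        "rules": [
            {"apiGroups": [""], "resources": ["pods", "pods/exec"],
             "verbs": ["get", "create"]},
            {"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]},
        ],
    }
    for finding in audit_rbac_rules(role):
        print(finding)
```
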
Cloud Environment #
- If you’re running your containers in a cloud, remember the default security configuration (for orchestrator, container runtime, and operating system) can be different for different cloud providers
- Understand the version of orchestrator and container runtime components your cloud provider is running by default, and whether those components are modified from their open source version
- Scan environment deployment configurations (such as Terraform, CloudFormation templates, and Azure ARM templates) for security best practices and compliance misalignments
Data #
- Use a proper filesystem encryption technology for container storage
- Provide write/execute access only to the containers that need to modify the data in a specific host filesystem path
- Reduce write/execute filesystem access for the host filesystem to a minimum using constructs like Pod Security Policy (for example, only allowing read-only root filesystem access, listing allowed host filesystem paths to mount, and listing allowed FlexVolume drivers)
- Automatically scan container images for sensitive data such as tokens, private keys, and so on, before pushing them to a container registry (can be done locally and in CI)
- Limit storage-related syscalls and capabilities to prevent runtime privilege escalation
- Log all successful and unsuccessful attempts to access sensitive data
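
Scanning for sensitive data before an image is pushed can be sketched with a couple of regular expressions. The Python example below looks for PEM private key headers and strings shaped like AWS access key IDs. The two patterns and the helper name `scan_text` are illustrative assumptions; dedicated secret scanners cover many more formats and reduce false positives.

```python
import re

# Illustrative patterns only; real scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "private key": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def scan_text(text):
    """Return (line_number, kind) pairs for lines that look like secrets."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, kind))
    return hits


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is a placeholder key, not a real credential.
    sample = "debug=true\naws_key=AKIAIOSFODNN7EXAMPLE\n"
    for line_number, kind in scan_text(sample):
        print(f"line {line_number}: possible {kind}")
```
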