
Isolation
- Physical: Multiple clusters are deployed to separate application environments such as Dev, Staging, and Production.
- Logical: With logical isolation, a single AKS cluster can host multiple workloads, teams, or environments. Kubernetes namespaces are used to define the different environments and allocate resources accordingly.
Quota Specification:
- Resource Quota: Define quotas at the namespace level for CPU, memory, total number of volumes or disk space, total number of secrets, jobs, etc., to reserve and limit resources.
- Requests and limits are specified in each pod deployment.
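The two levels above can be sketched roughly as follows; the namespace name, quota values, and image are placeholders, not prescriptions:

```yaml
# Hypothetical namespace-level quota for a "team-a" namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    persistentvolumeclaims: "5"
    secrets: "20"
---
# Per-container requests and limits in a pod spec.
apiVersion: v1
kind: Pod
metadata:
  name: app
  namespace: team-a
spec:
  containers:
    - name: app
      image: myregistry.azurecr.io/app:1.0   # placeholder image
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 512Mi
```

With the quota in place, pods in the namespace are rejected unless their aggregate requests and limits fit within the reserved budget.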
Availability:
- Involuntary Disruptions: These include hardware failures on physical machines. They can be mitigated by using ReplicaSets and multiple nodes.
- Voluntary Disruptions: These include cluster upgrades, deployment updates, and accidental deletion of containers. They can be mitigated by using a PodDisruptionBudget. If a cluster is upgraded or a deployment template is updated, the Kubernetes scheduler makes sure additional pods are scheduled on other nodes, and it waits to reboot a node until the defined number of pods has been successfully scheduled elsewhere in the cluster.
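A minimal PodDisruptionBudget along these lines (names and replica counts are illustrative):

```yaml
# Keep at least 2 matching pods running during voluntary
# disruptions such as node drains during an AKS upgrade.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: app
```

`maxUnavailable` can be used instead of `minAvailable` when the tolerance is easier to express as how many pods may be down at once.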
Distribution:
- Taints and Tolerations: Taint nodes to repel general workloads, and define matching tolerations on the pods that are permitted to schedule onto those nodes.
- Labels: Node selectors and node affinity are used to schedule pods onto specific labeled nodes.
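Both mechanisms can be combined on one pod spec, as in this sketch (the taint key/value, labels, and image are assumed for illustration):

```yaml
# Assumes a node pool tainted with, e.g.:
#   kubectl taint nodes <node> sku=gpu:NoSchedule
# The toleration allows scheduling onto tainted nodes;
# the nodeSelector steers the pod to them via labels.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  tolerations:
    - key: "sku"
      operator: "Equal"
      value: "gpu"
      effect: "NoSchedule"
  nodeSelector:
    kubernetes.io/os: linux    # label-based placement
  containers:
    - name: main
      image: myregistry.azurecr.io/gpu-app:1.0   # placeholder
```

Note that a toleration alone does not attract a pod to a tainted node; pairing it with a node selector or node affinity is what pins the placement.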
Access:
Azure AD authentication, Kubernetes RBAC, and pod-managed identities are leveraged to grant users the appropriate level of access to the cluster.
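When Azure AD integration is enabled, a namespaced RoleBinding can map an Azure AD group to a built-in role; a sketch, where the namespace and group object ID are placeholders:

```yaml
# Hypothetical RoleBinding granting an Azure AD group edit
# rights in the "dev" namespace only.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-edit
  namespace: dev
subjects:
  - kind: Group
    name: "aad-group-object-id"   # Azure AD group object ID (placeholder)
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                      # built-in Kubernetes ClusterRole
  apiGroup: rbac.authorization.k8s.io
```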
Version and Updates:
Keep the cluster on a supported AKS version and apply node image and security updates regularly.
Image Security:
- Use private registries and build new images from official base images.
- Leverage container security features in Azure Security Center.
Credential Exposure:
Define Kubernetes secrets and use Azure Key Vault to control credential exposure.
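A minimal sketch of consuming a Kubernetes secret as an environment variable; in practice the value would be sourced from Azure Key Vault (e.g. via the Secrets Store CSI driver) rather than committed to a manifest, and all names here are placeholders:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  password: "placeholder-not-a-real-secret"
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: myregistry.azurecr.io/app:1.0   # placeholder
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
```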
Networking:
- Choose between kubenet and Azure CNI for cluster networking.
- Distribute HTTP/S requests using ingress controllers.
- Use Application Gateway along with a WAF to provide an extra layer of security.
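A minimal Ingress along these lines, assuming the Application Gateway Ingress Controller is installed (hostname, service name, and ingress class are placeholders to adapt to the deployed controller):

```yaml
# Routes HTTP/S traffic for one host to a backend service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    kubernetes.io/ingress.class: azure/application-gateway   # use your controller's class
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app-service
                port:
                  number: 80
```

TLS termination and WAF rules are then handled at the Application Gateway in front of the cluster.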
Monitoring and Debugging:
- Use Azure Monitor metrics and Container Insights to monitor workloads and set alerts accordingly.
- Use open-source metric monitoring solutions such as Prometheus.
- Regularly run the latest version of the open-source kube-advisor tool to detect issues in the cluster.