
From CTDS

How does Gen3 manage access control?

In Gen3, you have fine-grained control over access to data through: the user.yaml; Fence and Arborist; defining program and project resources; and setting authz and tier_access_level.

It starts at the user.yaml, where you create roles and resources, and combine them to create policies that can be granted to a user for data access. Fence and Arborist then work together to compare the policies granted to the user with the requirements for accessing the data. Users with sufficient permissions to access or read the resource through which the data are presented will be allowed to access it.
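
As a mental model, this check can be sketched in a few lines of Python (a toy illustration with hypothetical names; this is not the actual Arborist implementation):

```python
# Toy model of an Arborist-style access check (illustrative only; the field
# names mirror the user.yaml, but this is not the real Arborist code).

# A role grants (service, method) permissions; a policy binds roles to resources.
ROLES = {
    "guppy_reader": [("guppy", "read")],
    "fence_reader": [("fence", "read")],
}

POLICIES = {
    "open_data_reader": {
        "role_ids": ["guppy_reader", "fence_reader"],
        "resource_paths": ["/open"],
    },
}

def is_allowed(user_policies, service, method, resource):
    """Allow the request if some granted policy has a role permitting
    (service, method) on the resource or on one of its ancestors."""
    for policy_id in user_policies:
        policy = POLICIES[policy_id]
        has_permission = any(
            (service, method) in ROLES[role_id]
            for role_id in policy["role_ids"]
        )
        on_resource = any(
            resource == path or resource.startswith(path + "/")
            for path in policy["resource_paths"]
        )
        if has_permission and on_resource:
            return True
    return False

print(is_allowed(["open_data_reader"], "guppy", "read", "/open/projects/demo"))  # True
print(is_allowed(["open_data_reader"], "sheepdog", "read", "/open"))             # False
```

The key idea is that a policy only opens access when both halves match: a role permits the (service, method) pair, and the requested resource falls under one of the policy's resource paths.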

Examples of how access can be controlled in Gen3

Here are a few examples (of many) of what can be placed under access control:

  • Through the user.yaml and the frontend-framework authz.json config, you can lock down individual pages of the frontend so that users without permissions cannot even open them (and so cannot see any data on them).
  • Through the user.yaml, Indexd authz, and the Guppy config, you can make all data from a project (files, graph metadata, non-file graph data, Tube ETL-transformed graph data/metadata) restricted from view or download.
  • Through the user.yaml and Guppy config, you can make some transformed aggregate data indices open-access while the individual record data in the files, graph, and other non-aggregate transformation indices are controlled access. You can make visualizations available for these open-access aggregate data indices.
  • Through the user.yaml and Indexd authz, you can make project A open access, while project B is controlled access. People with access to project B can see query results including both project records, while others will only see query results from project A.

Open-access data

You can make data open-access by setting assorted data permissions so that anyone can access your data (or some parts of them) even if they’re not authenticated (logged in).

Some data are open-access by default in Gen3. This includes the data/metadata in MDS records, ETL-transformed MDS data (created through AggMDS), and the metadata in Indexd records. (Note: although the metadata in the Indexd records are open-access, the files described by the Indexd records are controlled-access by default through their authz values.)

Although most data in Gen3 are controlled-access by default, you can set data to be open-access.

Open to anonymous users

You can make data open to viewing by anyone who is not logged in (i.e., anonymous or unauthenticated). To do this, create a policy that grants read-access to the projects or other resources you want to make open-access, and add that policy to the anonymous_policies field in the user.yaml, as shown below:

YAML
# user yaml config for making data open to anonymous users (i.e., users who are not logged in)
authz:

  anonymous_policies: # policies automatically given to anyone, even if they are not authenticated
  - open_data_reader

  all_users_policies: []

  resources:
  - name: open
  - name: programs
    subresources:
      - name: <program name>
        subresources:
          - name: projects
            subresources:
              - name: <project name>

  roles:
  - id: guppy_reader
    description: grant read access through guppy to resource defined in policy
    permissions:
    - id: guppy_reader
      action:
        method: read
        service: guppy
  - id: fence_reader
    description: grant read access through fence to resource defined in policy
    permissions:
    - id: fence_reader
      action:
        method: read
        service: fence
  - id: peregrine_reader
    description: grant read access through peregrine to resource defined in policy
    permissions:
    - id: peregrine_reader
      action:
        method: read
        service: peregrine
  - id: sheepdog_reader
    description: grant read access through sheepdog to resource defined in policy
    permissions:
    - id: sheepdog_reader
      action:
        method: read
        service: sheepdog

  policies: # these combine roles with resources
  - id: open_data_reader
    description: Users with this policy have read access to /open resources through guppy, fence, peregrine, and sheepdog
    role_ids:
      - guppy_reader
      - fence_reader
      - peregrine_reader
      - sheepdog_reader
    resource_paths:
      - /open
      - /programs/<program name>/projects/<project you want to be open-access> # e.g., /programs/OpenProgram/projects/OpenData

Open to all authenticated users

You can also make data open to viewing by anyone who is logged in (i.e., authenticated). As with anonymous users above, you first create a policy that grants read-access to the projects or other resources you want to make open-access. But instead of adding that policy to the anonymous_policies field in the user.yaml, you add it to all_users_policies, as shown below:

YAML
authz:

  anonymous_policies: []

  all_users_policies: # policies automatically given to anyone who has logged in
  - open_data_reader

  # the rest of the config (resources, roles, and policies) is the same as in the previous example

Controlled-access data

Most data are controlled-access by default in Gen3. This includes: graph data (submitted through Sheepdog); files; and ETL-transformed graph data (created through Tube). In fact, access to these data is so controlled that you must create the proper configuration for ANYONE to have access to them.

General configuration for controlled-access data

For most controlled-access data, the general steps for configuring access are the same:

  1. Identify the resource that will control access to the data. This is most commonly the project name, but can be distinct resources for some types of data.
  2. Specify the resource in the user.yaml. If it is a project, the resource will have the form /programs/<program name>/projects/<project name>. Otherwise, it will have the form /<resource name> (e.g., /open).
  3. In the user.yaml, create a policy that grants users access or read-access to the resource.
  4. In the user.yaml, grant the policy to appropriate users (and wait for usersync to run).

Below, we describe how access is controlled for: graph data (and transformed graph data); file data; and (coming soon) MDS data (and transformed MDS data).

Controlling access to graph (Sheepdog) data

Access to graph data (whether graph metadata or non-file data in the graph) is controlled at the level of project. (Tip: If you have different consent groups that require different policies for access, you should make them different projects to control access independently). You can set a project to be open-access (as described above) or controlled-access in the user.yaml.

To create access to a project's graph data, add the project (and the program, if it is not already listed as a resource) to the list of resources. An example for how to add a program and project to the resources list is provided in the open-access data user.yaml config shown above.

Then, create a policy that provides read-access to the project resource. An example for creating this policy is also shown in the open-access data user.yaml config shown above.

What makes this controlled-access rather than open-access is that the policy is added to neither anonymous_policies nor all_users_policies. Instead, you create a group with this policy (and any other policies you want), and add the appropriate users to the group. For example:

YAML
# user yaml config for making a group that grants access to a controlled-access project
authz:

  anonymous_policies: []

  all_users_policies: []

  groups:
  - name: ProjA_access
    policies:
    - ProjA_data_reader
    users:
    - <username>

  policies: # these combine roles with resources
  - id: ProjA_data_reader
    description: Users with this policy have read access to ProjA through guppy, fence, peregrine, and sheepdog
    role_ids:
      - guppy_reader
      - fence_reader
      - peregrine_reader
      - sheepdog_reader
    resource_paths:
      - /programs/<program name>/projects/ProjA

# the necessary roles for guppy/fence/peregrine/sheepdog_reader are the same as for open-access
# the ProjA resource is structured similarly to the open-access example, but this policy targets a project called ProjA

  users:
    <username>: {}

Output from querying the graph database (e.g., querying the graph model through the Query page or querying through the Gen3 SDK Submission class) is governed by whether you have a policy that grants you read permissions for a controlled project.

Controlling access to ETL-transformed graph data (created by Tube from the Sheepdog database)

Tube ETL-transformed graph data indices have more flexibility for control. By default, access is controlled at the level of project, like the graph data.

However, Tube ETL-transformed graph data indices can be set to have the following access controls using the tier_access_level in the global config (see the Guppy documentation for details):

  • Control access based on project, matching the access level of the graph data. This can be set with tier_access_level: private. This is the default configuration.
  • Control access to data in collector-type indices based on project, but permit open access to data in aggregator-type indices. This can be set with tier_access_level: regular and uses a minimum threshold of records present in the query output (as defined by you with the tier_access_limit property). If the number of records meets or exceeds the tier_access_limit value, the results will be returned even if the user does not have a policy that grants access to the project. However, if the query results in fewer records than the defined limit, it will instead return a message that there are too few records.
  • Open access, even if the graph data is controlled-access. This can be set with tier_access_level: libre. You might want to do this if the transformation provides further anonymization to the data.
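
The regular-tier threshold behavior can be pictured with a short sketch (illustrative only, not Guppy's actual implementation; the limit value is hypothetical):

```python
# Sketch of tier_access_level: regular threshold behavior
# (illustrative only; not Guppy's actual implementation).

TIER_ACCESS_LIMIT = 50  # hypothetical site-wide threshold

def aggregation_result(record_count, has_project_access):
    """Users with project access always see results; others only see
    results whose record count meets or exceeds tier_access_limit."""
    if has_project_access or record_count >= TIER_ACCESS_LIMIT:
        return {"count": record_count}
    return {"message": "too few records to display"}

print(aggregation_result(120, has_project_access=False))  # {'count': 120}
print(aggregation_result(12, has_project_access=False))   # {'message': 'too few records to display'}
print(aggregation_result(12, has_project_access=True))    # {'count': 12}
```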

In addition to using the site-wide global tier_access_limit property as described above, Gen3 users also have the option to set tier_access_limit individually for each index. This is described in the Guppy documentation.

Controlling access to file data

Files can have more granular access control than graph data. Permission to download a file is governed by the authz defined for the file in the Indexd record. If a user has been granted access to the resource in the authz field, they can download the file.

Typically, operators will set the authz to use the project as the resource. But if you want to set more granular access (for example, separating raw data files vs. processed data files vs. summary data files within the same project), you can:

  1. Create resources specific to these groups
  2. Generate a policy in the user.yaml that grants download access to each of the new resources
  3. Grant those policies to users to permit download access to the files
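
One point worth noting: a granted resource path covers its own subresources but not its siblings, which is why these file groups need distinct resources and policies. A small sketch of the prefix rule (hypothetical paths; not Arborist's code):

```python
def covers(granted_path, requested_resource):
    """A granted resource path covers itself and its subresources,
    but not sibling paths with a shared name prefix."""
    return (requested_resource == granted_path
            or requested_resource.startswith(granted_path + "/"))

print(covers("/programs/Program1/projects/ProjA",
             "/programs/Program1/projects/ProjA"))            # True
print(covers("/programs/Program1/projects/ProjA",
             "/programs/Program1/projects/ProjA_raw_files"))  # False: a sibling, not a subresource
```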

The user.yaml configuration below creates the necessary configuration to download either:

  • files with Indexd authz: /open (this is governed by the open_data_reader policy, which now has the fence_storage_reader role)
  • files with Indexd authz: /programs/Program1/projects/ProjA_raw_files (this is governed by the ProjA_raw_downloader policy)
YAML
# user yaml config for allowing file download for authz: /open and authz: /programs/Program1/projects/ProjA_raw_files
authz:

  anonymous_policies: []

  all_users_policies: # policies automatically given to anyone who has logged in
  - open_data_reader

  groups:
  - name: ProjA_raw_access
    policies:
    - ProjA_raw_downloader
    users:
    - <username>

  resources:
  - name: open
  - name: programs
    subresources:
      - name: Program1
        subresources:
          - name: projects
            subresources:
              - name: ProjA
              - name: ProjA_raw_files
              - name: ProjA_processed_files
              - name: ProjA_immune_files

  roles:
  - id: fence_storage_reader
    description: read/download access across storage-backed services
    permissions:
    - id: fence_storage_reader
      action:
        method: storage
        service: fence
  # also include roles for guppy_reader, fence_reader, peregrine_reader, and sheepdog_reader as used previously

  policies: 

  - id: ProjA_raw_downloader
    description: Users with this policy can download files with authz /programs/Program1/projects/ProjA_raw_files
    role_ids:
      - fence_storage_reader
    resource_paths:
      - /programs/Program1/projects/ProjA_raw_files

  - id: open_data_reader
    description: Users with this policy have read/download access to /open resources through guppy, fence, peregrine, and sheepdog
    role_ids:
      - fence_storage_reader
      - guppy_reader
      - fence_reader
      - peregrine_reader
      - sheepdog_reader
    resource_paths:
      - /open

  users:
    <username>: {}

Controlling access by protecting frontend pages from access

In the new Frontend-Framework service, each page can optionally enforce authorization through policy. By default, all pages are unprotected (viewable by anonymous users) except Profile, Data Library, and Workspaces, which require the user to be logged in.

You can see information about how to set up page protection in the Frontend-Framework service documentation. It requires both configuration in the user.yaml and configuration through the authz.json in the frontend framework.

Gen3 also has a service, Requestor, that allows users to request access to resources and allows operators to grant access in a programmatic, auditable manner that maintains logs of requests and approvals. It bypasses the need to add users to the user.yaml and grants (and can also remove) policies directly to a user in the platform. You can use Requestor to manage access for anything that can be defined as a resource.

Did you enjoy this post? You can find other posts in the How does Gen3 series at https://docs.gen3.org/blog/category/how-does-gen3/.

A new blog series: How does Gen3...

At CTDS, we field many questions from beginner (and even experienced!) Gen3 users asking how Gen3 manages different data management and user access tasks. We have decided to start a new blog series to tackle some of these topics. You should be able to find these posts at https://docs.gen3.org/blog/category/how-does-gen3/.

Here are some topics we have queued up to tackle in this series:

  • How does Gen3 manage data access control?
  • How does data flow through Gen3?
  • How does Gen3 manage searching for data?

Feel free to suggest other topics in our Gen3 Community slack channel (not in our Slack channel? Request an invite here with an organizational email!) or by emailing us at support@gen3.org.

Deploying a Comprehensive Observability Stack with Helm

Monitoring and observability are essential for maintaining modern infrastructure and applications. With the new Observability Helm Chart, setting up a robust monitoring system is easier than ever. This chart provides an integrated stack featuring Grafana for visualizations, Loki for log aggregation, and Mimir for metrics storage and querying. Alloy can then be deployed in any cluster to collect logs and metrics and forward them to Loki and Mimir. Additionally, you can optionally deploy the Faro Collector Helm Chart to further enhance observability by supporting Real User Monitoring (RUM) via the Fence Service.

Overview of the Observability Helm Chart

The Observability Helm Chart deploys a complete observability solution to your Kubernetes cluster. It bundles three core components:

Grafana:

An industry-leading visualization platform that allows users to create dashboards, track metrics, and set alerts.

Mimir:

A scalable time-series database optimized for efficiently storing and querying metrics across applications and infrastructure.

Loki:

A log aggregation system designed to index and query logs with minimal resource usage, seamlessly integrating with Grafana.

General Architecture

In this setup, Loki and Mimir are configured with internal ingress resources, enabling Alloy to send metrics and logs securely via VPC peering connections. Both Loki and Mimir write the ingested data to Amazon S3 for scalable and durable storage. This data can be queried and visualized through Grafana, which is hosted behind an internet-facing ingress. Access to Grafana can be restricted using CIDR ranges defined through the ALB ingress annotation: alb.ingress.kubernetes.io/inbound-cidrs: "cidrs". Additionally, the chart supports SAML authentication for Grafana, configured through the grafana.ini field, ensuring secure user access.
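
As a sketch, a values override restricting Grafana's internet-facing ingress to specific CIDR ranges might look like the following (the CIDRs are placeholders, and the exact values structure depends on the chart version):

```yaml
# Hypothetical values.yaml fragment: restrict Grafana's ALB to known CIDRs.
grafana:
  ingress:
    enabled: true
    annotations:
      alb.ingress.kubernetes.io/scheme: internet-facing
      alb.ingress.kubernetes.io/inbound-cidrs: "203.0.113.0/24,198.51.100.0/24"
```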

Grafana architecture

FIPS-compliant images

Gen3 provides FIPS-compliant images, which are set as the default in the values file for Grafana, Mimir, and Loki. These images are self-hosted and maintained by the Gen3 Platform Team, ensuring secure and compliant operations. The Platform Team is responsible for managing image upgrades, and service versions will be updated as deemed necessary by the team.

Built-in Gen3 Alerts

This Helm chart comes equipped with built-in Gen3 alerts, defined in the 'alerting' section of the values.yaml. These alerts enable you to immediately leverage your logs and metrics as soon as Grafana is up and running.

Built-in Gen3 Dashboards

You can use Gen3-specific visualizations by visiting our grafana-dashboards repo.

Alloy and Faro: Enhancing Observability

Alloy:

Collects logs and metrics from your services and sends them to Loki and Mimir for storage and analysis. Alloy acts as a bridge between your services and the observability stack, ensuring data flows smoothly to the right destinations.

Faro Collector:

A specialized configuration of Alloy designed to collect Real User Monitoring (RUM) data from Grafana Faro. This setup captures frontend metrics.

Helm Charts Overview

Observability Helm Chart: Deploys Grafana, Loki, and Mimir as the foundation of your observability platform.

Alloy Helm Chart: Configures Alloy to collect logs and metrics and forward them to Loki and Mimir. Alloy can be deployed in a separate cluster or VPC, or in multiple clusters/VPCs.

Faro Collector Helm Chart: Adds RUM data collection to the stack by configuring Alloy to receive frontend metrics from Grafana Faro.

Conclusion

This new suite of Helm charts provides everything you need to monitor your Gen3 instance.

To see detailed instructions on how to set up these charts, please refer to the following links:

Boost Your K8s Productivity with These Handy Tools

Managing Kubernetes clusters and resources can get complicated quickly. Thankfully, there are some great open source tools that make working with k8s much easier. In this post, I'll highlight some of my favorite k8s productivity boosters.

kubectl Aliases

One of the first things I do when setting up my workstation to work with Kubernetes environments is create a set of aliases for common kubectl commands. This saves a ton of typing! Some useful aliases include:

Text Only
alias k=kubectl
alias kg='kubectl get'
alias kgp='kubectl get pod'
alias kd='kubectl describe'
alias ke='kubectl edit'
Full list of aliases!
Text Only
if (( $+commands[kubectl] )); then
    __KUBECTL_COMPLETION_FILE="${ZSH_CACHE_DIR}/kubectl_completion"

    if [[ ! -f $__KUBECTL_COMPLETION_FILE ]]; then
        kubectl completion zsh >! $__KUBECTL_COMPLETION_FILE
    fi

    [[ -f $__KUBECTL_COMPLETION_FILE ]] && source $__KUBECTL_COMPLETION_FILE

    unset __KUBECTL_COMPLETION_FILE
fi

# This command is used a LOT both below and in daily life
alias k=kubectl

# Execute a kubectl command against all namespaces
alias kca='f(){ kubectl "$@" --all-namespaces;  unset -f f; }; f'

# Apply a YML file
alias kaf='kubectl apply -f'

# Drop into an interactive terminal on a container
alias keti='kubectl exec -ti'

# Manage configuration quickly to switch contexts between local, dev, and staging.
alias kcuc='kubectl config use-context'
alias kcsc='kubectl config set-context'
alias kcdc='kubectl config delete-context'
alias kccc='kubectl config current-context'

# List all contexts
alias kcgc='kubectl config get-contexts'

# General aliases
alias kdel='kubectl delete'
alias kdelf='kubectl delete -f'

# Pod management.
alias kgp='kubectl get pods'
alias kgpw='kgp --watch'
alias kgpwide='kgp -o wide'
alias kep='kubectl edit pods'
alias kdp='kubectl describe pods'
alias kdelp='kubectl delete pods'

# get pod by label: kgpl "app=myapp" -n myns
alias kgpl='kgp -l'

# Service management.
alias kgs='kubectl get svc'
alias kgsw='kgs --watch'
alias kgswide='kgs -o wide'
alias kes='kubectl edit svc'
alias kds='kubectl describe svc'
alias kdels='kubectl delete svc'

# Ingress management
alias kgi='kubectl get ingress'
alias kei='kubectl edit ingress'
alias kdi='kubectl describe ingress'
alias kdeli='kubectl delete ingress'

# Namespace management
alias kgns='kubectl get namespaces'
alias kens='kubectl edit namespace'
alias kdns='kubectl describe namespace'
alias kdelns='kubectl delete namespace'
alias kcn='kubectl config set-context $(kubectl config current-context) --namespace'

# ConfigMap management
alias kgcm='kubectl get configmaps'
alias kecm='kubectl edit configmap'
alias kdcm='kubectl describe configmap'
alias kdelcm='kubectl delete configmap'

# Secret management
alias kgsec='kubectl get secret'
alias kdsec='kubectl describe secret'
alias kdelsec='kubectl delete secret'

# Deployment management.
alias kgd='kubectl get deployment'
alias kgdw='kgd --watch'
alias kgdwide='kgd -o wide'
alias ked='kubectl edit deployment'
alias kdd='kubectl describe deployment'
alias kdeld='kubectl delete deployment'
alias ksd='kubectl scale deployment'
alias krsd='kubectl rollout status deployment'
kres(){
    kubectl set env $@ REFRESHED_AT=$(date +%Y%m%d%H%M%S)
}

# Rollout management.
alias kgrs='kubectl get rs'
alias krh='kubectl rollout history'
alias kru='kubectl rollout undo'

# Port forwarding
alias kpf="kubectl port-forward"

# Tools for accessing all information
alias kga='kubectl get all'
alias kgaa='kubectl get all --all-namespaces'

# Logs
alias kl='kubectl logs'
alias klf='kubectl logs -f'

# File copy
alias kcp='kubectl cp'

# Node Management
alias kgno='kubectl get nodes'
alias keno='kubectl edit node'
alias kdno='kubectl describe node'
alias kdelno='kubectl delete node'

I stole my k8s aliases from a GitHub Gist. Huge shoutout to GitHub user doevelopper.

k9s

https://k9scli.io

k9s provides a terminal UI for interacting with your Kubernetes clusters. It's great for getting a quick overview of pods, nodes, services, etc. Some of the handy features include:

  • Live filtering of resources
  • Easy log viewing
  • Shelling into containers
  • Resource editing

k9s makes it super easy to manage Kubernetes in a terminal-centric workflow.


kubectx and kubens

kubectx and kubens allow you to quickly switch between Kubernetes contexts and namespaces. This comes in handy when you're working with multiple clusters or namespaces.

Some examples:

kubens staging - switch to staging namespace
kubectx minikube - change context to minikube cluster

No more typing out full context and namespace names!


Credit: Created and released by Ahmet Alp Balkan

Running K8s on Your Laptop: Exploring the Options

Kubernetes (often abbreviated as "K8s") is an open-source platform designed to automate deploying, scaling, and managing containerized applications. Initially, Kubernetes might seem more fitting for large scale, cloud environments. However, for learning, development, and testing purposes, running Kubernetes locally on your laptop is incredibly beneficial. Let's dive into the various ways you can achieve this.

1. Minikube

Pros:

  • Officially supported by Kubernetes.
  • Provides a full-fledged K8s cluster with just one node.
  • Supports many Kubernetes features out-of-the-box.
  • Easy to install and use.

Cons:

  • Can be resource-intensive.
  • Requires a virtual machine (VM) or a local container runtime.

Overview:

Minikube is essentially a tool that runs a single-node Kubernetes cluster locally inside a VM (by default). This makes it perfect for users looking to get a taste of Kubernetes without the complications of setting up a multi-node cluster.

2. Docker Desktop

Pros:

  • Comes integrated with Docker, a popular containerization tool.
  • Provides Kubernetes out-of-the-box, no additional installation required.
  • Does not require a VM for macOS and Windows.

Cons:

  • Limited to a single node.
  • Might not support all K8s features.

Overview:

Docker Desktop, available for both Windows and macOS, offers a simple way to start a Kubernetes cluster. By simply checking a box in the settings, you get a single-node K8s cluster running alongside your Docker containers.

3. Kind (Kubernetes IN Docker)

Pros:

  • Runs K8s clusters using Docker containers as nodes.
  • Lightweight and fast.
  • Can simulate multi-node clusters.

Cons:

  • Might be slightly more complex for beginners.
  • Intended primarily for testing Kubernetes itself.

Overview:

Kind is an innovative solution that allows you to run Kubernetes clusters where each node is a Docker container. It’s especially useful for CI/CD pipelines and testing Kubernetes itself.
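
For example, a multi-node cluster can be described in a short config file using kind's documented Cluster API (the filename here is arbitrary):

```yaml
# kind-multinode.yaml: create the cluster with
#   kind create cluster --config kind-multinode.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
```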

4. MicroK8s

Pros:

  • Lightweight and fast.
  • Single command installation.
  • Offers various add-ons for enhanced functionality.

Cons:

  • Limited to Linux.
  • Not as widely adopted as other solutions.

Overview:

MicroK8s is a minimal Kubernetes distribution aimed at developers and edge computing. It's a snap package, which makes it extremely simple to install on any Linux distribution.

5. K3s

Pros:

  • Extremely lightweight.
  • Simple to install and run.
  • Suitable for edge, IoT, and CI.

Cons:

  • Strips out certain default K8s functionalities to remain light.

Overview:

K3s is a lightweight version of Kubernetes. It's designed for use cases where resources are a constraint or where you don't need the full feature set of standard Kubernetes.

6. Rancher Desktop

Pros:

  • Provides a user-friendly GUI for managing Kubernetes clusters.
  • Supports multi-node clusters.
  • Offers integration with Rancher for enhanced Kubernetes management.
  • Works on Windows, macOS, and Linux.

Cons:

  • Requires additional setup compared to some other options.
  • May consume more resources for multi-node clusters.

Overview:

Rancher Desktop is a versatile tool that simplifies the management of Kubernetes clusters on your local machine. It offers a user-friendly graphical interface, making it an excellent choice for users who prefer a visual approach to Kubernetes cluster management. Rancher Desktop can set up and manage multi-node clusters, which can be valuable for testing and development scenarios. Additionally, it integrates seamlessly with Rancher, providing even more advanced Kubernetes management capabilities.

Conclusion

Running Kubernetes on your laptop is feasible and offers a variety of methods, each catering to different use cases. Whether you’re a developer wanting to test out your applications, an enthusiast keen on learning Kubernetes, or even someone looking to set up CI/CD pipelines, there's an option for you.

It's essential to weigh the pros and cons of each method, consider your resource limitations, and the scope of your projects. Regardless of the option you choose, diving into the world of Kubernetes is an enriching experience, offering a deep dive into modern cloud-native development and operations.

Choosing the Right Lightweight Kubernetes Tool for Local Development

Kubernetes, the popular container orchestration platform, is a cornerstone of modern development and deployment. However, running Kubernetes locally for development and testing purposes requires efficient tools that don't consume excessive resources. In this article, we'll explore several lightweight Kubernetes tools for local development and discuss their pros and cons.

Of course, getting every bell and whistle working (like that handy ingress feature that routes external traffic into the cluster) might need some extra tweaking on a basic laptop setup. But hey, half the fun is figuring out how to configure your local environment to really sing, right? As we look at tools for local dev, we'll hit on ways to tune things up for peak Gen3 performance.

When it comes to local Kubernetes development, several solid options exist for standing up a dev cluster directly on your laptop. In this blog post, we will explore popular choices!

Now, here's the cool part - Gen3 works on any Kubernetes cluster, whether you've just spun one up on your laptop or have a full-blown production cluster. That means you can kick the tires locally before taking it out for a spin in the real world.

Kind (Kubernetes IN Docker)

Overview: Kind runs Kubernetes inside a Docker container, making it an excellent choice for local development and testing. It is also used by the Kubernetes team to test Kubernetes itself.

Pros: - Fast cluster creation (around 20 seconds). - Robust and reliable, thanks to containerd usage. - Suitable for CI environments (e.g., TravisCI, CircleCI).

Cons: - Ingress controllers need to be deployed manually.

Preferred for Gen3

In my experience, this is the most preferred method for running Gen3 on a laptop, especially when paired with OrbStack instead of Docker Desktop or Rancher Desktop. I use this as my preferred K8s setup on my M1 MacBook.

Docker Desktop

Overview: Docker Desktop is an accessible option for macOS users. Enabling Kubernetes in the Docker Desktop preferences allows you to run Kubernetes locally.

Pros: - Widely used and well-supported. - No additional installations required. - Built images are immediately available in-cluster.

Cons: - Resource-intensive due to docker-shim usage. - Difficult to customize and troubleshoot.

MicroK8s

Overview: MicroK8s is recommended for Ubuntu users. It is installed using Snap and includes useful plugins for easy configuration.

Pros: - Minimal overhead on Linux (no VM). - Simplified configuration with plugins. - Supports a local image registry for fast image management.

Cons: - Resetting the cluster is slow and can be error-prone. - Best optimized for Ubuntu, may be less stable on other platforms.

Rancher Desktop

Overview: Rancher Desktop is an open-source alternative to Docker Desktop. It uses containerd by default and offers flexibility in choosing a container runtime.

Pros: - Cross-platform (macOS/Linux/Windows). - Utilizes k3s, known for its speed and resource efficiency. - Ingress with Traefik works out of the box.

Cons: - Rapidly evolving, not fully supported by all tools.

Minikube

Overview: Minikube is a versatile option offering high fidelity and customization. It supports various Kubernetes versions, container runtimes, and more.

Pros: - Feature-rich local Kubernetes solution. - Customizable with multiple options. - Supports a local image registry for efficient image handling.

Cons: - Initial setup complexity, especially with VM drivers. - Some advanced options may require manual configuration. - Resource-intensive if using a VM.

k3d

Overview: k3d runs k3s, a lightweight Kubernetes distribution, inside a Docker container. k3s removes optional and legacy features while maintaining compatibility with full Kubernetes.

Pros: - Extremely fast startup (less than 5 seconds on most machines). - Built-in local registry optimized for Tilt.

Cons: - Less widely used, leading to limited documentation. - Some tools may have slower adoption.

In conclusion, choosing the right lightweight Kubernetes tool for your local development depends on your specific needs and preferences. Each tool offers a unique set of advantages and drawbacks, so consider your project requirements and platform compatibility when making your decision.

Feel free to experiment with these tools and share your experiences in the Kubernetes development journey!