Running Knative with KinD (Kubernetes in Docker) on a MacBook Air M1
An introduction to deploying serverless apps on Kubernetes in a simple way. Forget about the complexity of Deployment, Service, HPA, and other manifests.
One of my clients is building an AI app running on top of a Kubernetes cluster with GPUs. It's been more than a year since I last touched Kubernetes in production. What caught my interest is how the app runs: it uses Kubeflow's InferenceService, which runs on top of Knative.
Before we start, you need to install KinD:
brew install kind
TLDR: Copy This Script
The script below will create a KinD cluster, install Knative, and run the hello world and autoscaling apps.
brew install knative/client/kn
brew install knative-sandbox/kn-plugins/quickstart
kn quickstart kind --registry # this will also install kind local registry
kn service create hello --image ghcr.io/knative/helloworld-go:latest --port 8080 --env TARGET=World
kubectl apply -f https://raw.githubusercontent.com/knative/docs/main/docs/serving/autoscaling/autoscale-go/service.yaml
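Under the hood, the `kn service create` command above is roughly equivalent to applying a Knative Service manifest like the following (a sketch; the field layout follows the `serving.knative.dev/v1` schema, so details may differ slightly from what `kn` generates):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
  namespace: default
spec:
  template:
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest
          ports:
            - containerPort: 8080
          env:
            - name: TARGET
              value: "World"
```

This is the same shape as the autoscale-go manifest applied right after it, which is why both commands produce a `ksvc` in the cluster.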
So, What Happened?
Check that everything is running:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a67f84478882 registry:2 "/entrypoint.sh /etc…" 6 hours ago Up 6 hours 0.0.0.0:5001->5000/tcp kind-registry
40cb8f208db6 kindest/node:v1.26.6 "/usr/local/bin/entr…" 6 hours ago Up 6 hours 127.0.0.1:60178->6443/tcp, 0.0.0.0:80->31080/tcp knative-control-plane
$ docker stats
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
a67f84478882 kind-registry 0.10% 5.336MiB / 1.942GiB 0.27% 1.57kB / 0B 752MB / 24.6MB 6
40cb8f208db6 knative-control-plane 40.89% 1.225GiB / 1.942GiB 63.06% 279MB / 16.8MB 229GB / 7.53GB 581
$ kind get clusters # will show `knative` cluster
knative
$ kubectl get namespace # will show default namespace and knative namespace
NAME STATUS AGE
default Active 105m
+ knative-eventing Active 101m
+ knative-serving Active 103m
+ kourier-system Active 102m
kube-node-lease Active 105m
kube-public Active 105m
kube-system Active 105m
local-path-storage Active 105m
$ kubectl get ksvc # it's ksvc, not svc. Will show knative services
NAME URL LATESTCREATED LATESTREADY READY REASON
autoscale-go http://autoscale-go.default.127.0.0.1.sslip.io autoscale-go-00001 autoscale-go-00001 True
hello http://hello.default.127.0.0.1.sslip.io hello-00001 hello-00001 True
$ kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
autoscale-go-00001-deployment 0/0 0 0 151m
hello-00001-deployment 0/0 0 0 4h16m
We can see that our first script created a KinD cluster called `knative`. KinD is Kubernetes in Docker, so we can inspect the cluster and its status with the `docker` command. However, the containers running inside the Kubernetes cluster are invisible to `docker ps` on the host; they live inside the node container.
We have Knative Serving (which responds to HTTP requests), Knative Eventing (which responds to events, i.e. event-driven apps), and Kourier (for networking; previously Knative used Istio).
To see our running apps, use `kubectl get ksvc` (Knative service). Open the URL and you will get a response instantly (or you may hit a cold start). The interesting part is that Knative also creates a Deployment with a dynamic replica count: after a short period with no traffic (a minute or two with the default settings), it scales to zero. Send it a single request and the replica count goes back to 1.
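If the defaults don't fit your app, the autoscaler can be tuned per revision through annotations on the revision template. A sketch using documented `autoscaling.knative.dev` annotations (the values here are illustrative, not recommendations):

```yaml
spec:
  template:
    metadata:
      annotations:
        # Time window the autoscaler averages request metrics over
        autoscaling.knative.dev/window: "60s"
        # Keep at least one pod around (effectively disables scale to zero)
        autoscaling.knative.dev/min-scale: "1"
        # Never scale beyond ten pods
        autoscaling.knative.dev/max-scale: "10"
```

Setting `min-scale` above zero trades idle cost for the elimination of cold starts.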
The first request suffers a cold start; after that, responses come back more quickly. Let's take a quick look at the manifest for autoscale-go:
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: autoscale-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Target 10 in-flight-requests per pod.
        autoscaling.knative.dev/target: "10"
    spec:
      containers:
        - image: ghcr.io/knative/autoscale-go:latest
It's a simple manifest, and Knative handles everything else behind the scenes. We'll take a deeper look at it in another post.
Performance Test to Trigger Autoscaling
Scaling from 0 to 1 is cool, but one pod can't handle anything serious. So we will generate some traffic with the k6 load-testing tool. Before that, let's get the URL for the app called autoscale-go:
$ # Get URL for autoscale-go
$ kubectl get ksvc
$ # Monitor if any update for our pod. You can change pod to deployment
$ kubectl get pod -l serving.knative.dev/service=autoscale-go -w
Copy the code below and save it as perftest.js:
import { sleep } from 'k6';
import http from 'k6/http';

// Each virtual user will run this function
export default function () {
  // Change the URL if yours differs
  const appURL = "http://autoscale-go.default.127.0.0.1.sslip.io/?sleep=50&prime=10000&bloat=5";
  http.get(appURL);
  // Sleep for 0.1 seconds before finishing the iteration
  sleep(0.1);
}
Lastly, open a new terminal and run k6:
$ # Create and maintain 50 virtual users and run the code for 30s
$ k6 run --vus 50 --duration 30s perftest.js
Let's monitor how it behaves.
After kicking off the performance test, the Deployment's replica count jumps from 1 to 7 (top left), and we can see the lifecycle of the pods (top right). The CPU and memory usage of the KinD cluster also jump from 30% and 60% to 282% and 81%, respectively. The nice part is that we didn't need to configure anything special to get this behavior for our simple app.
And what about the results?
For simplicity, we will only look at `http_req_failed`, `http_req_duration`, and `iterations`. I think we saturated the cluster, since we only got 368 completed requests with an average of 4.31s and a p90 of 9.88s; fortunately, all requests were served successfully. I tried it several times with different VU counts (e.g. 10, 20, and so on) and got more completed requests with faster response times.
Knative Serverless is for Stateless Apps
Before you follow the hype, I should tell you this: Knative works best if your app is stateless. If stateless is a vague idea to you, please read more about the 12-factor app. TLDR: it's a set of guidelines for making apps stateless, meaning the app itself doesn't store any data (e.g. files, sessions, persistent data). Instead, all data lives in stateful backing services (e.g. Postgres, Redis). If you want to use a framework that is stateful by default (e.g. Odoo, Magento), you'll need to make some changes before you can run it on Knative (or on Kubernetes and Docker in general).
What Next?
My goal is to understand this new way of deploying apps to Kubernetes. In this article, we deployed a simple app that is far from production-ready. In future posts, we will look at implementing a more complex API, adding environment variables and secrets, adding monitoring and logging, doing rolling deployments, and ultimately putting it all together with Kubeflow for an AI app.