
July 2024

continuing last month's article here, and in the MANY coming months lol

running services using docker

current deployment scenario

  • we have servers running on ports 4000, 4001, 4002, 4003 and 4005
  • now a simple solution is to take all this code, place it in a VM and run it there
  • let's say the comments service is being called too much, so we could create 2 more instances of comments and use a load balancer (a load balancer basically randomises which comments server a request gets sent to)
  • and then we'd have to use 2 extra ports for these 2 new comments services
  • and because we added these 2 new ports, we have to make changes in multiple projects, such as the event bus and more
  • and this way we're tightly coupling our code to the number of instances of comments
  • and if you say we're going to have a separate VM for the extra comments instances, we're still going to have to change multiple files and make changes
  • hence we're going to be introduced to docker and kube

why docker

  • we're going to create containers
  • containers are like isolated computing envs
  • it contains everything we need to run a single program
  • so we're going to create separate containers for separate services
  • if we need extra instances, we can just create another container with comments
  • what do we need to run our services? npm and node right?
  • now, if we need to run it somewhere, we're assuming node and npm are installed, no? and that's a big assumption
  • it also requires knowledge of which script to use to run the services
  • so docker solves both these problems: the container holds everything we need to run the service and also knows how to start it
  • super easy when it comes to running services, not just for node js but for anything

why kube

  • tool for running a bunch of containers together
  • when we run kube, we're supposed to give it a config file which tells it which containers we'd like to run
  • and then kube is going to handle comm and nw reqs btw all these containers
  • so kube creates this thing called a cluster
  • a cluster is a set of diff VMs, each VM is referred to as a node
  • they're all managed by something called a master- this master is a program that's going to manage everything inside of our cluster, all the programs and other aspects
  • so lets say we have 3 nodes
    • 2 post
    • and 1 event bus
    • now when an event occurs, we'd have to tell the event bus how to reach the 2 post containers
    • when in fact, kube offers this "channel" that lets us pass things to it, and this channel will forward them to posts
    • so this way communication is super simple
  • kube also makes copying containers and scaling super easy

docker

  • why should we use docker?
    • at some point we must've installed some sw on our laptop
    • during the installation wizard, we may come across some error- we look up the error and fix that issue
    • and when the wizard continues, we may come across another error, and then it's the troubleshooting phase all over again
    • so, docker wants to make it super easy & straightforward to install and run sw on ANY device- laptops, servers etc
    • docker makes it really easy to install and run sw without worrying about setup or dependencies
  • what is docker?
    • a bit more challenging to answer
    • when you read an article or talk to someone and they say "Oh i use docker", they most probably mean the Docker ecosystem, and that ecosystem consists of the Docker Client, Server, Machine, Images, Hub and Compose
    • All these tools are pieces of sw that come together to form a platform to create containers
    • In essence, docker is a platform or ecosystem around creating and running containers
    • so to run redis, we ran docker run -it redis. when we ran this, something called the docker cli reached out to docker hub and downloaded a single file called an Image
    • an Image is a single file that contains all the dependencies and all the config required to run a very specific program, for eg redis
    • a container is an instance of an image
    • a container is a program with its own isolated set of hw resources- its own memory, its own nw stack, its own hard drive
  • installing docker
    • when we install docker, we're installing 2 key pieces of sw
    • called docker client (cli) and docker server (daemon)
    • cli - the tool that we're issuing commands to
    • daemon - the tool that's responsible for creating images, running containers etc
  • docker bts
    • we installed docker and ran docker run hello-world
    • here's all the things that happened bts
      • you gave the command to the cli
      • cli relayed this to the daemon
      • daemon checked if this image is available locally by looking into the image cache
      • since we just installed docker, we obv didn't have it, so the daemon reached out to docker hub, which is a repo of free images that we can download and run
      • so the daemon downloaded it from the hub and saved it locally in the image cache, where it can be obtained from next time
      • then the daemon took that image, loaded it into memory, created a container and ran it
  • what is a container?
    • but before that,
      • what is NAMESPACING?
        • it is basically isolating your hard drive, or any "resource" for that matter, for a process or a group of processes
      • what are control groups? (cgroups)
        • they limit the amount of resources used per process
        • amount of memory, cpu, hard drive, and nw bandwidth (see the sketch at the end of this section)
    • so in other words, a container is basically just a process plus the resources it requires
    • so the process talks to the kernel, which in turn talks to the resources that have been made available for this process
    • what is the image -> container relation?
      • whenever we talk about images, we're talking specifically about file system (FS) snapshots and a startup command
      • what is a fs snapshot? it's a copy-paste of some directories from the FS
      • so what happens when we run this image?
        • the kernel is going to allocate some part of the hard drive for the fs snapshot inside the image and store the snapshot data there
        • and then the startup command is run, the process is started, and it is isolated to just that container
  • how is docker running on your device?
    • namespacing and cgroups are specific to linux
    • when we installed docker, we basically installed and are running a linux VM
    • inside this VM, we're creating all these containers
    • and the VM talks to the kernel and allocates resources etc
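  • here's a quick way to see cgroups in action- docker run exposes resource-limit flags (--memory and --cpus are real flags; the limits below are just example values)
%% cap the container at 256mb of memory and half a cpu core %%
docker run --memory=256m --cpus=0.5 busybox echo hi there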

creating and running an image

  • docker run <imagename>
  • we can also override the start up command by doing this
    • docker run <imagename> command!
    • after the imagename, we supply an alt command to be executed inside the container after it starts up
    • this is an override and the command that was there alongside the fs snapshot will not be run anymore
    • docker run busybox echo hi there
      • what does this mean?
      • we installed an image called busybox which contains executables called echo, ls, etc
      • and who knows what the startup command for busybox is? we don't need to know it either, as long as we know what command to run
      • so in this example, the busybox image has an executable called echo which repeats whatever you pass to it via the docker cli
      • so when we run this command, we see hi there
    • docker run busybox ls
      • same understanding as above
      • busybox has an ls exe
      • so when we run it, we see the files that were copied to the container's hard drive by the fs snapshot
      • so whatever files you see, are all files that were copied from the fs snapshot and ARE NOT YOUR LOCAL FILES

listing running containers

  • docker ps
  • this command lists all the containers that are currently running on our machine
  • docker ps --all
  • this command lists every container we've ever created, including ones that have exited (sample output below)
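  • roughly what the output looks like (ids, names and timings are random examples- yours will differ):
docker ps --all
> CONTAINER ID   IMAGE         COMMAND    CREATED         STATUS                     PORTS     NAMES
> 4bf79a0bcd1a   hello-world   "/hello"   2 minutes ago   Exited (0) 2 minutes ago             vigilant_kare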

container lifecycle

  • running docker run is equivalent to running docker create + docker start
  • docker create <image-name> - creates a container
  • docker start <container-id> - starts a container
  • what happens when we create a container?
    • the fs snapshot from the image is taken and set up on the container's hard drive
  • what happens when we start a container?
    • we run the startup command that comes with the image
  • now let's create and start a container
docker create hello-world
> 4bf79a0bcd1a0703cf0d67c51bd31d58d375ca411c478782bf69e6962347a768

docker start -a 4bf79a0bcd1a0703cf0d67c51bd31d58d375ca411c478782bf69e6962347a768
> output

%% what happens if we don't give the -a %%
docker start 4bf79a0bcd1a0703cf0d67c51bd31d58d375ca411c478782bf69e6962347a768
> 4bf79a0bcd1a0703cf0d67c51bd31d58d375ca411c478782bf69e6962347a768

%% the -a flag tells docker to attach to the container and watch for its output %%

utils

to clear your docker containers

  • you can clear your stopped containers, build cache etc by running this command
docker system prune
  • so now, if we want to run any image, docker would first have to re-download it from docker hub before running it for us

to get output logs

  • now there may be times when we forget to add the -a flag while using docker start
  • and if the container takes minutes/hours to run, having to re-run it with the -a flag is just painful
  • which is why we can use docker logs- whatever output the container emits gets logged, and we can retrieve it at any time
docker start 4bf79a0bcd1a0703cf0d67c51bd31d58d375ca411c478782bf69e6962347a768
> 4bf79a0bcd1a0703cf0d67c51bd31d58d375ca411c478782bf69e6962347a768

docker logs 4bf79a0bcd1a0703cf0d67c51bd31d58d375ca411c478782bf69e6962347a768
> hi there

how to stop containers

  • when we used to run docker run, we could stop the execution using cmd+c
  • but if we use docker start and docker logs, how do we stop containers?
  • docker stop container-id or docker kill container-id
  • stop
    • when we use the stop command, bts, docker sends a signal to the container called SIGTERM
    • aka terminate signal
    • what this does is, it tells the process inside the container to shut down in its own time, giving it a little while to perform some clean up
    • if the container doesn't stop within 10s, docker automatically issues the kill command
  • kill
    • when we use the kill command, bts, we send the SIGKILL command
    • aka kill signal
    • shut down right now, and don't do anything else
  • ideally we'd like to stop containers (a quick sketch below)
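  • for example (the container id is just an example; docker stop also takes a -t flag to change the 10s grace period):
docker stop d29c63477078

%% wait only 5 seconds before falling back to kill %%
docker stop -t 5 d29c63477078

docker kill d29c63477078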

execute an additional command in a container

  • when using redis
  • we usually run 2 commands, redis-server and redis-cli
  • but if we run redis in a container, we can't access this redis server from outside (obvi)
  • so then, in essence, we need to run another command inside the container, alongside the one that already comes with the image
  • but how do we call it? docker exec -it container-id command
    • exec - run another command inside the container
    • -it - allows us to provide input to the container
    • command - the extra command you want to run
  • an example:
%% shell 1 %%
docker run redis

%% shell 2 %%
docker ps
> CONTAINER ID   IMAGE     COMMAND                  CREATED         STATUS         PORTS      NAMES
> d29c63477078   redis     "docker-entrypoint.s…"   5 seconds ago   Up 4 seconds   6379/tcp   trusting_poitras

docker exec -it d29c63477078 redis-cli
> 127.0.0.1:6379> set mynumber 5
> OK
> 127.0.0.1:6379> get mynumber
> "5"
  • what happens if we don't give -it?
    • redis cli will be started, but we won't have the ability to provide any input to it

purpose of -it flag

  • in reality, the -it flag is a combination of a -i and a -t flag
  • processes in linux have 3 communication channels, let's say: STDIN, STDOUT and STDERR
  • whatever input you give goes into the container via the STDIN channel, whatever the container spits out is shown to you via the STDOUT channel, and if any errors occur, they're shown to you via the STDERR channel
  • so when we type -i we're saying we want to attach this terminal session to the STDIN channel of the newly running process
  • the -t flag simply formats everything nicely for us- it does quite a bit under the hood, but simply put, it makes the inputs and outputs pretty
  • it also provides autocomplete etc (try the comparison below)
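  • to feel the difference, try dropping the -t (both commands are valid; the container id is just an example):
%% full interactive terminal: prompt, colours, autocomplete %%
docker exec -it d29c63477078 redis-cli

%% -i only: our keystrokes still reach STDIN, but there's no prompt or pretty formatting %%
docker exec -i d29c63477078 redis-cli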

how to get access to the terminal of your container?

  • we will want to run commands inside our container without constantly having to type exec again and again
  • so if we want to access the terminal of a container, we can run this
docker exec -it container-id sh
> #
  • and now we can run any linux or terminal commands like cd, ls, export etc
  • and once you're in the shell, if you can't exit using ctrl+c, you can try ctrl+d

starting with a shell

  • we can also run docker run -it busybox sh to start the busybox container, but just open the shell
  • this way, we can poke around ourselves and execute whatever command we want

creating docker images

  • till now we've used images created by other devs- redis, busybox, hello-world
  • the process to create our own image is relatively straightforward, we just have to remember the syntax
  1. create a dockerfile
    • this is a plain txt file with a few lines of config to define how our container must behave, what prgms it must contain and what it should do when it starts up
  2. send it to docker client
  3. client will send it to docker server
    • server is doing all the heavy lifting for us
    • its going to look inside the dockerfile and create an image that we can use
  4. we have a usable image
  • what is the flow to create a docker file?
    • specify a base image -> run some commands to install additional prgms -> specify a command to run on container setup
  • here is a simple Dockerfile
# using the flow established earlier
# use an existing docker image as a base
FROM alpine

# download and install a dependency
RUN apk add --update redis

# tell the image what to do when it starts as a container
CMD ["redis-server"]
  • then we cd into the folder where we have this Dockerfile
  • and we simply run docker build .
  • and at the end, we'll get the id of the image we created- then we can simply do docker run image-id

breaking down the Dockerfile

  • the Dockerfile contains a particular syntax
  • the first word is an 'instruction'
  • and whatever comes after it is an 'argument'
  • so the instruction tells docker-server what to do
  • the 'FROM' is used to specify what image we want to use as a base
  • the 'RUN' instruction is used to execute some command while we are preparing our custom image
  • the 'CMD' instruction is used to specify what should be executed when our image is used to start up a container
  • FROM, RUN and CMD are some of THE MOST IMP instructions, but there are many more worth knowing (a few are sketched below)
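  • for flavour, here are a few more common instructions (all real Dockerfile instructions- the values are made-up examples):
# set the working directory inside the image
WORKDIR /usr/app

# copy files from our machine into the image
COPY ./ ./

# set an environment variable
ENV PORT=4000

# document the port the container listens on
EXPOSE 4000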

what is a base image?

  • writing a Dockerfile is similar to installing some browser on a comp with no OS
  • what are the steps we'd do?
    • install an os -> open default browser -> download chrome dmg -> run dmg right?
  • so what we did with FROM alpine was very similar
  • that was like saying: install this OS (not exactly, but just to paint a picture of what it's like)
  • otherwise it would be an empty image, no infra, no programs that we could use, nothing to help us install extra dependencies, nothing
  • so having a base image is to give us a starting point of sorts that we can customise etc
  • what is alpine?
    • simple, it contained the necessary programs and functions that we needed to create our custom image
    • very much like asking, why you chose win/ mac as your preferred os- because they provided you with what YOU needed

breaking down the build process

  • so first things first, we check our build cache to see if we've already downloaded an image for alpine
  • if we haven't, we download it from docker hub and create an image with it
  • now this image contains 2 parts right? its FS snapshot and a startup command
      • -- The End of Step 1 --
  • then we come across the RUN instruction which asks the server to install redis, so what happens here?
  • we create a temporary container from that image, using the command we gave as an argument to the RUN instruction as its startup command
  • now our FS contains all the files from alpine and also the files for redis, right?
  • a copy of this fs is taken and the temporary container is terminated
      • -- The End of Step 2 --
  • the 3rd line is an instruction telling docker what command we'd like to have as our startup command, right?
  • so we again spin up a temporary container, copy over the fs snapshot from the previous step, and set the startup command to what's specified in the Dockerfile
  • and this final image is the image that is returned to the user
      • -- The End of Step 3 --
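  • this maps neatly onto what the legacy (pre-buildkit) builder used to print- roughly something like this (ids shortened and made up):
docker build .
> Step 1/3 : FROM alpine
>  ---> c059bfaa849c
> Step 2/3 : RUN apk add --update redis
>  ---> Running in 5bf79a0bcd1a
> Removing intermediate container 5bf79a0bcd1a
>  ---> 950fc54d9b01
> Step 3/3 : CMD ["redis-server"]
>  ---> 2473abcd1234
> Successfully built 2473abcd1234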

rebuilds with cache

  • if we re run the build command for this Dockerfile, we can see that it simply says CACHED [2/2] RUN apk add --update redis
  • in other words, it caches every line of operation of our dockerfile
  • let's say we add another line after RUN apk add --update redis that says RUN apk add --update gcc
  • now that our Dockerfile has changed, we don't re-run every line- the docker server sees which line has changed, and then it only executes the changed line and the lines below it
  • and it uses the cache for the lines before it, because they haven't changed since the last build
  • now, let's say we invert the redis and the gcc lines
  • now the server would have to reinstall gcc and redis, because the order has changed and redis is now being installed AFTER the changed gcc line
  • so, when modifying dockerfiles, it's important to add new lines towards the bottom to maximise cache usage (sketch below)
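  • a quick sketch of that ordering (same Dockerfile as before, with the new line added at the bottom so the cached redis layer is reused):
FROM alpine

RUN apk add --update redis

# new additions go below the existing lines
RUN apk add --update gcc

CMD ["redis-server"]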

tagging an image

  • now at the end of the build process, to run the image we just created, we have to attach the long string (smthn like sha256:950fc54d9b019d2b2e06fae0e3192f65353a081504811372f41e0a989aab71b0) at the end of docker run right?
  • now copying this long string is not difficult, but it would be easier if we could 'tag' it, or in other words give it an alias right?
  • for that, we need to modify the build command slightly: docker build -t your-docker-id/image-name:version .
  • your docker id is what you set up as your username
  • you can choose whatever name you'd like for the image, maybe redis-server or just redis
  • and then the version is usually a number, but you could also just put :latest
  • so your build command should look like this in case you decide to tag your image
docker build -t briha2101/redis-server:latest .
> ...
> ...
> ...
> Tagged as ...

docker run briha2101/redis-server
  • so now, when i want to run this image, i can use the tag i just gave
  • i can optionally omit the version, because by default the latest version would be selected for container creation
  • but technically speaking, the version we specify at the end is the tag, everything else is more like the repo or the project name

manual image creation using docker commits

  • very rare that we would actually do this
  • ok, but how would we do this?
docker run -it alpine sh
> # apk add --update redis
> ...
> ...
> ...
  • in another terminal shell,
  • for this we use the commit command along with the -c flag; within single quotes, we mention the command we want to set as the startup command, and then we follow it with the id of the container
docker ps
> CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
> efeee12       alpine      "sh"       ...       ...       ...       ...

docker commit -c 'CMD ["redis-server"]' efeee12
> sha256:2473rshfsgsh


docker run 2473rshfsgsh
> ...
> ...
> redis-server initialised

run a nodejs server inside the container and access it from outside

  • what does alpine mean?
  • in the docker ecosystem, alpine basically means only the most basic version
  • so node:alpine would mean, only the most essential and stripped down image of node
  • flow - create a nodejs webapp
index.js
const express = require("express");

const app = express();

app.get("/", (req, res) => {
  res.send("Hello World!");
});

app.listen(8080, () => {
  console.log("Listening on port 8080 🚀");
});
  • flow - create a dockerfile
Dockerfile
# specify a base image
# downloading and running an image that has node preinstalled
FROM node:14-alpine

# copy the contents of the current work-dir and paste it into the container
COPY ./ ./

# download and install the dependencies
RUN npm install

# default command
CMD ["npm", "start"]
  • flow - build the image using the dockerfile
    • docker build -t briha2101/simpleweb .
    • if we don't add :latest it's also ok, it gets appended by default
  • flow - run the image as a container
    • docker run briha2101/simpleweb
  • flow - connect to the webapp from a browser
    • port forwarding
      • when we start the image, the server is running on 8080
      • but if we access that port on our machine, we can't reach it, because the container is an isolated env and the traffic isn't routed to the container's ports
      • the container has its own isolated set of ports through which it can receive traffic, but by default traffic from our local machine won't be sent into it
      • if we want traffic from our local nw to be sent to our container, we need to set up an explicit port mapping
      • port mapping is essentially saying: if anyone makes a req to port 8080 on your local machine, forward it to one of the container's ports
      • now, our container has no limitation on sending data out, as we've noticed with npm install. the only problem is data coming in
      • how do we enable this? it's not something we change inside the Dockerfile, but something we add while running the image
      • the syntax: docker run -p 8080:8080 briha2101/simpleweb
        • the -p flag is for port mapping
        • first we mention the port on localhost that we'd like to fwd to the container
        • then we mention the port inside the container
        • and then the id/tag of the image
docker run -p 8080:8080 briha2101/simpleweb
  • note, for forwarding, the local nw port and the container port don't have to be the same (which is what we'll be doing in prod projects)

setting up a working dir

  • now, in the dockerfile, when we copied everything from the folder into the container, it placed everything in the root folder
  • and this can sometimes cause conflicts when we have folders with the same names- these folders override existing folders and disrupt services in our container
  • which is why we can setup a working directory and mention where in the container we want to store all our files
WORKDIR /usr/app
COPY ./ ./

RUN npm install
  • this states that we'll be placing our files in /usr/app
  • and then copying the files there
  • and then the rest of the steps follow
  • if this folder doesn't exist, it will be created for us

dealing with changes and rebuilds

  • let's make a change in the index.js file and see if it reflects on the browser
  • OBV NOT, because we haven't copied the latest version of the code to the container
  • so, how would we do this? we have to build the image again
  • so when we do, docker notices that the files have changed, so it re-runs the COPY instruction and all the instructions under it
  • and that means npm install too- and re-running npm i when we don't have any new dependencies is troubling when we have many dependencies in our project. so how do we fix this?
  • so the goal is to minimise npm i runs right? what does npm i need? a package.json file. so let's just copy that ONE file first
COPY ./package.json ./

RUN npm install
  • this way, npm i will only run if there's a change in the package.json file, which is exactly what we want
COPY ./package.json ./
RUN npm install
COPY ./ ./
  • and then we can simply do this, where we copy all the remaining files later
  • so now, even if we make some code changes, we aren't re-running npm i- it'll simply use the build cache, and we only re-run the copy of the files and the commands after it (the full Dockerfile is assembled below)
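  • putting it all together, the final Dockerfile looks like this (the same pieces from above, just assembled in one place):
Dockerfile
FROM node:14-alpine

WORKDIR /usr/app

# copy package.json first so npm install only re-runs when dependencies change
COPY ./package.json ./
RUN npm install

# copy everything else- changes here only invalidate the layers below this line
COPY ./ ./

CMD ["npm", "start"]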

notes

  • Buildkit will hide away much of its progress which is something the legacy builder did not do. We will be discussing some messages and errors later in Section 4 that will be hidden by default. To see this output, you will want to pass the progress flag to the build command: docker build --progress=plain .
  • Additionally, you can pass the no-cache flag to disable any caching: docker build --no-cache --progress=plain .

commands overview

docker build -t userid/tag . # build an image and tag it
docker run [image_id/tag] # run an image
docker run -it [image_id] [cmd] # create and start a container, but override the startup command
docker ps # list all running containers
docker exec -it [container_id] [cmd] # execute the given command in a RUNNING container
docker logs [container_id] # print the logs from the given container
  • now we take all of this learning and create images of all the services we created in the blog project

collections of services using kube

installing kube

  • kube is a tool for running a bunch of different containers
  • we give it some config to describe how we want our containers to run and interact with each other
  • we can install it through docker desktop
  • docker desktop > settings > enable kube
  • and if you want to check the installation, you can run this command
kubectl version

tour of kube

  • whenever we want to use kube, we need to have all the images ready to go
  • so when the images are ready, we're going to want to create containers out of them and deploy them to a kubernetes cluster
  • a 'NODE' is a virtual machine
  • its a computer thats going to run some containers for us
  • if we're running kube on our local machine, we'll most probably have just 1 node
  • only when we start deploying it to some cloud provider do we have access to multiple nodes
  • but the process is the same for 1 or even a 1000 nodes
  • so when we want to run these containers, we first create a config file
  • in this config file, we give kube EXPLICIT instructions on what we need done, and we also set up some networking so that we can interact with everything
  • and we're going to send this file to the "MASTER" of the kube cluster via the cli, using the kubectl command
  • once we deploy this config file, kube is going to search for the image- it'll first check our docker daemon to see if we have it locally, and if we don't, it'll search the hub
  • and then, as per our instructions, it'll create as many containers as we've mentioned and distribute them among the nodes (there is some science behind how it distributes them equally)
  • each container is going to be hosted and created inside something called a POD
  • NOTE: POD and CONTAINER are NOT the same thing, but for the context and content of this lecture we can use them interchangeably
  • a pod technically wraps up a container and a pod can have multiple containers inside it
  • to manage these pods, kube is also going to create something called a DEPLOYMENT
  • this deployment is going to read the config file and make sure that we ALWAYS have the right no: of pods running at any given instance
  • now we did mention that we want to allow networking features, so this is where things get a bit tricky
  • Kube is going to create something called a service
  • services give us access to running pods
  • it takes the difficulty away and handles networking among microservices
  • now let's take an example: the event bus is running in a pod- now, in order to emit events, we need to know the URL of the posts service etc no?
  • in a MS env, when dealing with pods and containers and all, it's difficult to obtain or definitively tell what that URL would be
  • which is why we use the service
  • since the service has access to all the pods, we can simply route this to the service, and the service would handle it and route it to the appropriate pod
  • and this service has a relatively easy URL to remember- not like localhost:4000 or something, it's much easier

kube terminology

  • node - a vm that will run our containers
  • cluster - a collection of nodes and a master to manage them
  • pod - something that will wrap our containers, or we can run multiple containers in a pod
  • deployment - monitors a set of pods, makes sure they're running, and restarts them if they crash
  • service - provides an easy to remember URL to access a running container

config files

  • tells kube about the diff deployments, pods and services (these 3 are collectively referred to as objects) that we want to create
  • written in yaml syntax
  • always store these files with the proj src code- they are documentation
  • we can create objs without config files but DONT DO THIS
  • config files provide a precise defn of what your cluster is running
  • kube docs will tell you to run direct commands to create objects - ONLY DO THIS FOR TESTING
  • this is a simple .yaml file that creates a container for the posts service inside a pod
apiVersion: v1
kind: Pod
metadata:
  name: posts
spec:
  containers:
    - name: posts
      image: briha2101/posts-service:0.0.1
  • note: the indentation has to be perfect, otherwise it'll throw an error
  • and to create this pod, we run the command kubectl apply -f posts.yaml
  • to check our running pods, we can simply do kubectl get pods (sample output below)
  • and to delete the pod, we can simply do kubectl delete -f posts.yaml
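  • roughly what get pods prints for the pod we just created (restarts/age will differ):
kubectl get pods
> NAME    READY   STATUS    RESTARTS   AGE
> posts   1/1     Running   0          15s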

config file breakdown

  • apiVersion: v1 - k8s is extensible- we can add in our own custom objs. this specifies the set of objs we want k8s to look at
  • kind: Pod - type of obj we want to create
  • metadata - config options for the obj we are about to create
    • name - when the pod is created, give it a name of posts
  • spec - the exact attributes we want to apply to the obj we're about to create
    • containers - we can create many containers in a single pod. this property is an array. which is why all the containers we'd like to create are prefixed with a - before the name, signifying an element in the arr
      • name - make a container with the name of posts
      • image - the exact image we want to use. you may have noticed we added a tag for the version at the end of the image name, and we know that if we don't specify a tag, latest is assumed by default
      • problem is, if we don't specify a tag, k8s assumes that we're talking about latest and will try to pull the image from dockerhub
      • which is why we add a version, telling k8s to use the image on our local machine (this is only an issue for this lecture)

common pod commands

  • here are some commands that we're familiar with in docker, and their equivalents wrt k8s
  • docker ps -> kubectl get pods
  • docker exec -it [containerid] [cmd] -> kubectl exec -it [podname] [cmd]
  • docker logs [containerid] -> kubectl logs [podname]
  • docker is about running individual containers -> kube is about running a group of containers
  • some common commands:
  1. kubectl get pods - print info about all the running pods
  2. kubectl exec -it podname cmd - exec the given cmd in a running pod
  3. kubectl logs podname - print out logs from the given pod
  4. kubectl delete pod podname - deletes the given pod
  5. kubectl apply -f config-file-name.yaml - tells kube to process the config
  6. kubectl describe pod podname - print out info about the running pod

kube deployments

  • usually we don't create pods in the manner we mentioned above (a bare pod config file)
  • what i'm trying to say is, the config above explicitly creates 1 pod, but in reality we'd be creating deployments- these deployments in turn create pods and manage them
  • they serve a 2-fold purpose: as established already, they manage the pods, making sure to restart or recreate pods in case old ones crash and/or stop. and they also allow us to seamlessly update the code that our pods run- we build the new image, and the deployment creates the new pods, waits for them to run and be ready, and then manages the new ones
  • and it deletes the old ones one by one
  • this is how we'd write a config for a deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: posts-depl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: posts
  template:
    metadata:
      labels:
        app: posts
    spec:
      containers:
        - name: posts
          image: briha2101/posts-service:0.0.1

explanation

  • apiVersion: apps/v1
    • This specifies the API version of Kubernetes objects we're using.
    • we used v1 for pods and we have to use apps/v1 for deployments.
  • kind: Deployment
    • This defines the type of kube object we're creating.
  • metadata:
    • This section contains metadata about the deployment.
  • spec:
    • This section describes the desired state of the deployment.
    • replicas: 1
      • This specifies the number of pod replicas (copies) we want.
    • selector:
      • kube finds it difficult to know which pods are to be managed by the deployment
      • which is why we define a selector that tells the deployment how to find the pods it manages.
      • matchLabels:
        • This is used to match the labels of the pods.
        • app: posts: This label is used to identify the pods that belong to this deployment.
        • in reality, we could give anything- '324784783': 'scsjchd' would still be okay; this label, in a way, is how we identify the pods that are to be managed
    • template:
      • This describes the pods that will be created by the deployment.
      • metadata:
        • This contains metadata about the pods.
        • labels: This labels the pods with app: posts.
    • spec:
      • This specifies the pod configuration.
      • containers:
        • name: posts
        • image: briha2101/posts-service:0.0.1

common deployment commands

  • kubectl get deployments - list all the running deployments
  • kubectl describe deployment deploymentName - print out info about a specific deployment
  • kubectl apply -f config-file.yaml - create a deployment out of a config file (same as for pods)
  • kubectl delete deployment deploymentName - delete a deployment

k get deployments

NAME         READY   UP-TO-DATE   AVAILABLE   AGE
posts-depl   1/1     1            1           13m
  • READY: 1 pod is ready to receive traffic, out of 1 pod that was meant to be created
  • UP-TO-DATE: as we mentioned earlier, we can use depls to update the code running in pods, so this shows the no: of pods running up-to-date code
  • AVAILABLE: no: of pods that are available and ready to do some work

updating deployments

  • there are 2 methods to do this
  • approach #1
    • make a change to the proj code
    • rebuild the image, specify a new version
    • in the deployment config, update the image version
    • run the command k apply -f file-depl.yaml
    • not used in professional working envs
    • now, you may be thinking that running apply again would create a new deployment. no- kube knows that this deployment already exists, and it simply updates the pods with the new image's code
    • but why don't we use this approach?
      • we have to manually change the image's version in the config file.
      • and once our deployment files get bigger and bigger, it's really difficult to find and make changes, and it's super easy to mess something up and cause an error
      • but it would be amazing if we didn't have to specify a version at all, no? and could just say "use the latest version" or something like that
  • approach #2
    • the deployment must be using the latest tag in the pod spec's image
    • update the code
    • build the image: docker build -t briha2101/name-service .
    • push the img to docker hub using docker push image-tag
    • run the command k rollout restart deployment deployment-name
    • this is preferred as there is no change to the config yaml (end-to-end sketch below)
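  • here's approach #2 end to end for the posts service (assuming its deployment is named posts-depl, as earlier, and its pod spec uses briha2101/posts-service:latest):
%% rebuild and push- no tag means :latest %%
docker build -t briha2101/posts-service .
docker push briha2101/posts-service

%% tell the deployment to restart its pods, pulling the fresh image %%
kubectl rollout restart deployment posts-depl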

kube services

  • now that we have a pod running, how do we make a request to that pod?
  • a service is another type of obj in kube
  • services provide networking between pods
  • created using config files
  • we're going to use services to set up comm btw our pods and/or to get access to our pods from outside
  • whenever we think of networking, we are thinking of services
  • there are several types of services
  • cluster ip - sets up an easy to remember url to access a pod. only exposes pods inside the cluster
  • node port - makes a pod accessible from outside the cluster. usually only used in dev
  • load balancer - makes a pod accessible from outside the cluster. this is the right way to expose a pod to the outside world
  • external name - redirects an in-cluster request to a CNAME url
  • but we only use cluster ip and load balancer on a daily basis

port forwarding

  • run k get pods
  • get the name of the pod
  • then run k port-forward {pod-name} 4222:4222
  • the first port is the port on the local machine that i want to access the server on, and the 2nd port is the port i'm trying to reach inside the pod

creating a node port service

  • why are we creating a node port when we just established that we don't use it that much?
  • easy- we don't have any other pods for a cluster-ip service to route to yet, and setting up a load balancer requires a LOT of setup
posts-srv.yaml
apiVersion: v1
kind: Service
metadata:
  name: posts-srv
spec:
  type: NodePort
  selector:
    app: posts
  ports:
    - name: posts
      protocol: TCP
      port: 4000
      targetPort: 4000
  • and then you can create this service using k apply -f posts-srv.yaml

explanation

  • kind- Service (pretty self explanatory)
  • spec
    • type: here we specify the type of service we want to create
    • selector: which pods do we want to provide networking to? if you remember, in our deployment yaml file we wrote a similar line in spec/selector- app: posts. this is basically saying: provide networking to the pods which are posts pods
    • ports: if you inspect the code of posts, you'll notice that it runs on PORT 4000. ports takes in an array, which is why we prefix each entry with a - and mention the protocol and name
    • port and targetPort: targetPort is the port that our posts server is listening for traffic on. the NodePort service is going to have a port of its own, and that is referred to as port
    • now, port and targetPort don't have to be the same- it's not a strict rule. if we mention a different port number, the service will redirect the traffic to the targetPort
    • but it's good practice to keep port and targetPort the same (a quick sketch below)
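  • for instance, this (hypothetical) ports block would accept traffic on the service's port 8080 and forward it to the container's 4000:
ports:
  - name: posts
    protocol: TCP
    port: 8080 # the port the service itself listens on
    targetPort: 4000 # the port the posts container listens on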

accessing nodeport services

  • once we create this service and run k get services, we get
k get services
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP          4d1h
posts-srv    NodePort    10.103.169.173   <none>        4000:31054/TCP   20s
  • now in the ports column, we can see 4000 followed by some random port btw 30k and 32k
  • this 2nd port is how we're going to access the service
  • so since we're using docker desktop, we can simply go to localhost:2nd-port/posts
  • for more info, we can run k describe service posts-srv
k describe service posts-srv
Name:                     posts-srv
Namespace:                default
Labels:                   <none>
Annotations:              <none>
Selector:                 app=posts
Type:                     NodePort
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.103.169.173
IPs:                      10.103.169.173
Port:                     posts  4000/TCP
TargetPort:               4000/TCP
NodePort:                 posts  31054/TCP
Endpoints:                10.1.0.12:4000
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
  • and here under NodePort it says posts 31054/TCP, so we can do localhost:31054/posts and access it locally

creating a cluster ip service

  • now that we have our posts all set up
  • we'll next set up our event bus service
  • and the goal is to somehow allow these 2 pods to communicate with one another
  • now unfortunately these 2 pods can't communicate with each other directly back and forth- they technically can, but there are reasons why we never follow this approach
    • there's no way to know ahead of time the ip address where a pod is going to be hosted
  • so when we create a new post, we're going to emit an event to the event bus right? and we obv can't communicate directly
  • which is why we create a cluster ip service that governs access to the event bus pod
  • and whenever an event is to be emitted, a req is made to the cluster ip service
  • so when the event bus wants to broadcast an event back, it emits the event to a cluster ip service that manages the posts pod
  • in essence, every pod will have a service
  • steps
    1. build an event bus image
    2. push to docker hub
    3. create a deployment for event bus
    4. create a cluster ip service for posts and event bus
    5. wire everything up
  • now we can create a separate file for each of the services
  • or we can just write the logic in the same depl file for posts and event-bus
  • to create multiple objects in one yaml file, we separate them with 3 dashes ---
  • our final yaml file should look like this- it's basically a combination and modification of the posts-srv file
event-bus-depl.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: event-bus-depl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: event-bus
  template:
    metadata:
      labels:
        app: event-bus
    spec:
      containers:
        - name: event-bus
          image: briha2101/event-bus-service
---
apiVersion: v1
kind: Service
metadata:
  name: event-bus-srv
spec:
  type: ClusterIP
  selector:
    app: event-bus
  ports:
    - name: event-bus
      protocol: TCP
      port: 4005
      targetPort: 4005
  • then run k apply -f filename.yaml
  • when we run k get services we should see the new services
k get services
NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
event-bus-srv   ClusterIP   10.104.40.133    <none>        4005/TCP         70s
kubernetes      ClusterIP   10.96.0.1        <none>        443/TCP          4d2h
posts-srv       NodePort    10.103.169.173   <none>        4000:31054/TCP   98m
  • and we do the same thing for posts also
  • but we change the name of the service to make it clear that we want a separate clusterip service for posts, and that the existing posts-srv is a different service

communicating btw services

  • whenever we'd like to communicate with a pod that has a clusterIP service, how do we know the URL for it?
  • the URL is the name of the service itself- so in this case, the URL for dispatching events from posts to the event bus would be http://event-bus-srv:4005
  • we use the name of the service after the http:// prefix, and then if it's listening on any port, we mention the port as well
  • so we change the URL in the code, rebuild the image, push it to docker hub, and restart the deployments (a sketch of the code change below)
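  • a sketch of what that change might look like inside the posts service (assuming it uses axios and the event bus exposes an /events route that takes { type, data }- match this to your actual code):
// before: await axios.post("http://localhost:4005/events", ...)
await axios.post("http://event-bus-srv:4005/events", {
  type: "PostCreated",
  data: { id, title },
});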

creating a load balancer service

  • now that we've created cluster ip services and deployments for all the microservice components, it's time to move even the front end into the cluster
  • so we're going to place the code in a pod and access it from outside, something like docker run -p 3000:3000 image
  • now our react app has to interact with posts, which we've done already using nodeport
  • but it also needs to interact with comments and query, how do we create this interaction interface?
  • approach #1 (not advisable AT ALL)
    • create nodeports for whichever pod that we'll need to interact with
    • but why is it bad? when we use this, we are exposing a random port to the outside world
    • when we stop and rerun, the port would obviously change, and by that logic we'd have to change the code in multiple pods and re-run the deployments for all those clusters
  • approach #2
    • the goal of a load balancer service (LBS) is to have one single point of entry for our cluster
    • then we're going to make sure our react app interacts with this lbs
    • and we're going to config it such that it takes the request and routes it to the appropriate pod's cluster ip service

important terms

  • load balancer service
    • tells kube to reach out to its provider and provision a load balancer. gets traffic into a single pod
    • a lbs is a bit diff compared to other kube objs
    • eventually we'll deploy all these clusters onto some cloud platform right?
    • a lbs is going to tell our cluster to reach out to its cloud provider (wherever you've hosted) and provision something called a load balancer
    • this load balancer exists outside of our cluster, and it's responsible for directing traffic from the outside world into our cluster/some pod
    • our react app doesn't need the logic to know which data to send to which pod etc
    • all we must do is send it to the lbs, and the lbs does all the routing- which is where ingress comes into the picture
  • ingress/ ingress controller
    • a pod with a set of routing rules to distribute traffic to other services
    • ingress and ingress controller are 2 different things, but for the scope of this lecture let's assume they're the same thing
    • now, the lbs is going to send the req to the ingress ctrl
    • this ingress ctrl is going to look at the path and based on its routing rules, will determine which pod to send it to

writing ingress config files

  • first we install ingress-nginx using k apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.11.1/deploy/static/provider/cloud/deploy.yaml
  • and then run k get pods --namespace=ingress-nginx to verify the installation
ingress-srv.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-srv
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
    - host: posts.com
      http:
        paths:
          - path: /posts
            pathType: Prefix # if you want to suppress errors you can use ImplementationSpecific
            backend:
              service:
                name: posts-clusterip-srv
                port:
                  number: 4000

explanation

  • metadata:
    • name: The name of the Ingress object (here, it’s ingress-srv).
    • annotations: Used to specify additional configurations. Here, it’s rewriting the target to the root (/)
  • spec:
    • ingressClassName: Specifies the type of Ingress controller to use (here, it’s nginx).
    • rules: Defines how to route the traffic. (array)
      • host: The domain name (here, it’s posts.com).
      • http:
        • paths: Specifies the path rules. just like how we define routes in express, with / at the end, we follow the same approach here as well when defining paths
          • path: The URL path to match (here, it’s /posts).
          • pathType: How the path should be matched (e.g., Prefix means the path prefix should match).
          • backend: Specifies the service to send the traffic to.
            • service:
              • name: The name of the service to route to (here, it’s posts-clusterip-srv).
              • port:
                • number: The port number of the service (here, it’s 4000).
ingress-srv.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-srv
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/use-regex: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: posts.com
      http:
        paths: # just like how we define routes in express, with / at the end, we follow the same approach here as well when defining paths
          - path: /posts/create
            pathType: Prefix # if you want to suppress errors you can use ImplementationSpecific
            backend:
              service:
                name: posts-clusterip-srv
                port:
                  number: 4000
          - path: /posts
            pathType: Prefix # if you want to suppress errors you can use ImplementationSpecific
            backend:
              service:
                name: query-srv
                port:
                  number: 4002
          - path: /posts/?(.*)/comments # nginx/yaml doesnt support :id/comments syntax. instead we have to write regex and (.*) is a wildcard and allows anything as long as it ends with /comments
            pathType: ImplementationSpecific # if you want to suppress errors you can use ImplementationSpecific or stick with Prefix
            backend:
              service:
                name: comments-srv
                port:
                  number: 4001
          - path: /?(.*)
            pathType: Prefix # if you want to suppress errors you can use ImplementationSpecific
            backend:
              service:
                name: client-srv
                port:
                  number: 3000

intro to skaffold

  • automates many tasks in a kube dev env
  • makes it really easy to update code in a running pod
  • makes it really easy to create/delete all objs tied to a project at once
  • brew install skaffold
skaffold.yaml
apiVersion: skaffold/v4beta3
kind: Config
manifests:
  rawYaml:
    - ./infra/k8s/*
  • what this code does is, it basically says: when we start skaffold, apply all these yaml files, and when we stop, delete all the objects they created
skaffold.yaml
build:
  local:
    push: false
  artifacts:
    - image: briha2101/client
      context: client
      docker:
        dockerfile: Dockerfile
      sync:
        manual:
          - src: "src/**/*.js"
            dest: .
    - image: briha2101/comments-service
      context: comments
      docker:
        dockerfile: Dockerfile
      sync:
        manual:
          - src: "*.js"
            dest: .
    - image: briha2101/event-bus-service
      context: event-bus
      docker:
        dockerfile: Dockerfile
      sync:
        manual:
          - src: "*.js"
            dest: .
    - image: briha2101/moderation-service
      context: moderation
      docker:
        dockerfile: Dockerfile
      sync:
        manual:
          - src: "*.js"
            dest: .
    - image: briha2101/posts-service
      context: posts
      docker:
        dockerfile: Dockerfile
      sync:
        manual:
          - src: "*.js"
            dest: .
    - image: briha2101/query-service
      context: query
      docker:
        dockerfile: Dockerfile
      sync:
        manual:
          - src: "*.js"
            dest: .
  • what this code is doing is: whenever there is a change to any file in any of these directories, and the changed file matches the glob pattern in the src attribute, that file is directly copied into the running container automatically
  • if the change doesn't match the pattern, such as a new package installation, the entire image is rebuilt
  • NOTE: skaffold.yaml is placed in the root directory of your project, NOT in the infra/k8s folder
  • so navigate to the folder where you have the yaml file and run skaffold dev to start up skaffold
  • and whatever change we make will be reflected in real time in the deployments
  • and if you want to stop it, ctrl+c- this will delete all the objs