Oct 2, 2023 10 min read Container Image

Creating and Managing Container Images.

Linux treats everything as a file. In fact, the whole OS is a filesystem, with files and folders stored on a local disk. Why does this matter? Because an image is a big tarball containing a layered filesystem.

Layered? What? Read on.

Table of Contents.

The Layered Filesystem
1. An example of a Layered Filesystem in an image
Creating Docker images

The Layered Filesystem

A container image is really a template, and containers are created from these templates. Images are composed of many layers, as opposed to one monolithic block.

**Figure 1**: An image is a layered filesystem.

Each layer contains files and folders, and holds only the changes made relative to the layer underneath it. For example, assume that we had a very simple image with 4 layers (Layer 1 through Layer 4). Any changes we make on top of Layer 1 are saved in Layer 2. At a later date, we might add files and folders on top of Layer 2, and these additions are saved in Layer 3, and so on. The content of each layer is mapped to a special folder on the host system, which is usually a subfolder of /var/lib/docker/.

Docker uses a Union filesystem to create a virtual filesystem out of the set of layers and handles the details regarding the way these layers interact with each other.
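To make the stacking idea concrete, here is a toy sketch in plain shell. It is not how Docker implements layers (that is the job of the union filesystem); it only imitates the end result by copying a few made-up layer directories, lowest first, into one merged view. All directory and file names are hypothetical.

```shell
# Toy model only: each "layer" is a directory holding just its own changes.
work=$(mktemp -d)
mkdir -p "$work/layer1/etc" "$work/layer2/etc" "$work/layer3/app" "$work/merged"

echo "base distro"      > "$work/layer1/etc/os-release"
echo "default greeting" > "$work/layer1/etc/motd"
echo "web server conf"  > "$work/layer2/etc/nginx.conf"
echo "custom greeting"  > "$work/layer2/etc/motd"   # same path as in layer 1
echo "app code"         > "$work/layer3/app/index.js"

# Apply the layers bottom-up; a file in a higher layer hides the one below it.
for layer in layer1 layer2 layer3; do
  cp -R "$work/$layer/." "$work/merged/"
done

# The merged view stands in for what a process inside a container would see.
find "$work/merged" -type f | sort
cat "$work/merged/etc/motd"
```

The merged view shows the greeting from layer 2, while the file in layer 1 is left untouched, which mirrors the idea that lower layers are never modified.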

💡

Layers are immutable, i.e. they cannot be changed once the container runtime has generated them. The only operation that can affect a layer is its physical deletion.

An example of a layered filesystem

**Figure 2**: The left side shows the conceptual layered filesystem and the right side shows how specific functional files/folders are added to each layer.

In Figure 2 above, the creator of the Web App Image used the Alpine Linux distribution as the base, to which they added Nginx files/folders; these additions were saved as Layer 2. Similarly, ReactJS files/folders were added on top of Layer 2, and these were saved as Layer 3, and so on.

📢

Each image starts with a base layer, which is typically one of the official images found on Docker Hub for a Linux distro such as Alpine, Ubuntu, or CentOS. For more specialized images, users can make their own base images instead of using pre-made ones.

So what would happen if I wanted to add additional functionality to the Web App Image file shown in Figure 2?

As soon as we run a container from a downloaded image and attempt to add files to it, the container runtime adds a new read/write layer ON TOP of the existing set of layers. We write our additions to this new read/write layer and voila, we have the makings of our own unique image, based on the images made by someone else. Talk about standing on the shoulders of giants!

**Figure 3**: The container runtime adds an R/W layer on top of the existing layers.

Creating Docker Images

Interactive Image Creation

Step 1: Download an alpine:3.17 image

We can create a custom image by interactively building a container. We start with a base image that we want to use as a template and run a container of it interactively.

**Figure 4**: The command here downloads the image for **alpine:3.17**, starts a container from it named **test** and opens an **interactive sh** inside the container.

1: Docker will run a container in interactive mode
2: The container will be named test
3: The image is of alpine:3.17
4: Due to the -it in (1), the container will open a shell
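A command matching the four points above might look like the sketch below. It is wrapped in a hypothetical helper function, start_test_container, purely so that it is easy to re-run; it assumes Docker is installed and the daemon is running.

```shell
# Hypothetical helper; assumes Docker is installed and the daemon is running.
start_test_container() {
  # -i keeps STDIN open and -t allocates a terminal (interactive mode).
  # --name test gives the container the name "test".
  # alpine:3.17 is the image, pulled from Docker Hub if it is not present locally.
  # /bin/sh is the program started inside the container.
  docker container run -it --name test alpine:3.17 /bin/sh
}
```

Calling start_test_container should leave you at a shell prompt inside the container.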

By default, the Alpine container does not have the curl tool installed and we will take on the responsibility of making an image that will remove this great deficiency in our world.

**Figure 5**: Typing **curl** inside the container returns the **curl: not found** message.

Step 2: Add the 'curl' package to the container.

**Figure 6**: **apk update** refreshes the package index, and **apk add curl** installs curl.

Step 3: Test that 'curl' is working.

**Figure 7**: Test curling for https://google.com.
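Steps 2 and 3 can be sketched as one small helper that is meant to be run inside the Alpine container, not on the host. The function name is hypothetical, and the -sI flags (silent, headers only) are my own choice to keep the output short.

```shell
# Run inside the Alpine container. Hypothetical helper name.
install_and_check_curl() {
  # Refresh the package index, then install the curl package.
  apk update && apk add curl || return 1
  # Request only the response headers to confirm that curl works.
  curl -sI https://google.com
}
```

If the last command prints HTTP response headers, curl is installed and working.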

At this point, type exit at the terminal and this will take you out of the container and back to the host terminal.

To list the containers on our system, we can type

$ docker container ps -a

This command will list all the containers, running or stopped.

**Figure 8**: The container we used as our baseline is shown and is the ONLY container right now in our system.

We can also list all the images we have on our system right now.

$ docker images

**Figure 9**: The only image we have is the one for alpine 3.17.

So where is the outcome of all the work we just did?

It's still here. We just have to close the loop on it.

Step 4: Confirm the changes made in our container.

We can use the following command to list the changes introduced to the downloaded alpine image.

$ docker container diff test

**Figure 10**: **diff** will show all the differences between alpine and test. A = added, C = changed, D = deleted.

Step 5: Commit the changes introduced to test.

The docker container commit command persists our additions and creates a new image from them.

$ docker container commit test alpine-w-curl

This command asks Docker to commit the test container we've been working on and to name the resulting image alpine-w-curl.

**Figure 11**: Docker will save the changes made to test and will name the NEW image alpine-w-curl.

Step 6: Check the list of docker images once commit has been done.

$ docker images

**Figure 12**: Notice there are 2 images, **alpine** (the baseline) and **alpine-w-curl** (our image).

We can also see the history of all the changes.

$ docker image history alpine-w-curl

**Figure 13**: The **history** shows a layer was added 8 mins ago, on top of layers that were created 8 weeks ago.
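For reference, the interactive steps above can also be replayed without opening a shell in the container. The sketch below is one possible way to do it, using a hypothetical helper name. It assumes Docker is installed and that no container named test exists yet (remove the earlier one with docker container rm test, or pick another name).

```shell
# Hypothetical helper that replays the steps above non-interactively.
build_alpine_w_curl() {
  # Start from alpine:3.17 and install curl in a single, non-interactive run.
  docker container run --name test alpine:3.17 /bin/sh -c 'apk update && apk add curl' || return 1
  # List what changed relative to the image (A, C and D markers).
  docker container diff test
  # Persist the container's read/write layer as a new image.
  docker container commit test alpine-w-curl || return 1
  # Confirm the image exists and look at its layers.
  docker images
  docker image history alpine-w-curl
}
```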

Using Dockerfile (a declarative approach)

A Dockerfile is a text file and contains instructions on how to build a custom container image.

An example of a simple Dockerfile is provided below.

**Figure 14**: Each line of the Dockerfile results in a layer in the resulting image.

What is the meaning behind each line in Figure 14, you ask?

FROM: each Dockerfile starts with this directive. The image name and version that follow FROM set the base image for our own custom image. In this example, the application is based on Node.js, so we MUST tell our Dockerfile to download and build upon a Node.js runtime image.
ARG: defines a build-time parameter and, optionally, its default value. In this demo, the email address of the maintainer is set so that it can be used for communication in the future.
LABEL: Labels are a great utility in Dockerfiles. They are simply metadata 'tags' that we can slap onto a Docker image. Labels can be used to search for and filter Docker images.
USER: sets the user, and therefore the permission level, that the eventual executing process inside the container will have.

⛔

Containers provide isolation from the underlying OS but they still run on it. Because of this, it is HIGHLY dangerous to let the executing process inside the final container, spun up from this Docker image, run as the ROOT user.

ENV: sets environment variables that are available during the build and remain set in containers started from the image. In this example, the variable AP is set to /data/app and the variable SCPATH is set to /etc/supervisor/conf.d.
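To tie the five directives together, here is a minimal sketch of a Dockerfile that uses only them, written out through a shell heredoc. It is not the Dockerfile from Figure 14 or from any repo: the Node.js tag, the placeholder email address and the file name Dockerfile.sketch are assumptions for illustration. Only the two ENV paths are taken from the description above.

```shell
# Write a small illustrative Dockerfile (assumed values are marked in comments).
cat > Dockerfile.sketch <<'EOF'
# Base image: an official Node.js runtime (the tag is an assumption).
FROM node:18

# Build-time parameter with a default (placeholder address).
ARG email="anna@example.com"

# Metadata attached to the image.
LABEL maintainer=$email

# Run as the unprivileged "node" user that ships with the official image.
USER node

# Environment variables baked into the image.
ENV AP=/data/app
ENV SCPATH=/etc/supervisor/conf.d
EOF

cat Dockerfile.sketch
```

The sketch uses USER node rather than root, in line with the warning above about root-level processes.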

Let's put more meat on our 'conceptual' bones and make a Dockerfile.

💡

The git repo for our example is located at https://github.com/spkane/docker-node-hello (and belongs to Sean Kane, a very good technical author).

Step 1: Download the repo.

Step 2: Investigate the files downloaded from the repo into docker-node-hello.

💡

The .dockerignore file lists files and directories that may be in the directory structure BUT should NOT be sent to the Docker daemon for inclusion in an image. Here, the .dockerignore file contains .git, which has no need to be included in the actual image.
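Steps 1 and 2 might be carried out along these lines. The helper name is hypothetical, and the sketch assumes git is installed and GitHub is reachable.

```shell
# Hypothetical helper; assumes git is installed and GitHub is reachable.
fetch_and_inspect_repo() {
  git clone https://github.com/spkane/docker-node-hello || return 1
  # -a includes dotfiles such as .dockerignore in the listing.
  ls -la docker-node-hello
  # Show which paths are excluded from the build context.
  cat docker-node-hello/.dockerignore
}
```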

Step 3: Build the image.

💡

If the Dockerfile is NOT in the current directory (the one indicated by the dot) and is in some other folder, pass the -f argument, followed by the relative path of the Dockerfile. For example: $ docker image build -t example/docker-node-hello:latest -f relative/path/to/Dockerfile . (the trailing dot still names the build context).
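Both variants of the build can be sketched as small helpers. The function names are hypothetical; the sketch assumes Docker is installed and that the commands are run from the cloned repo directory.

```shell
# Hypothetical helpers; assume Docker is installed and the repo has been cloned.

# Build from the Dockerfile in the current directory.
# The trailing dot is the build context sent to the Docker daemon.
build_image() {
  docker image build -t example/docker-node-hello:latest .
}

# Build when the Dockerfile lives elsewhere; $1 is the path to that Dockerfile.
build_image_from() {
  docker image build -t example/docker-node-hello:latest -f "$1" .
}
```

For example, build_image_from relative/path/to/Dockerfile uses a Dockerfile from another folder while keeping the current directory as the build context.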

Step 4: Confirm the image was built.

💡

Notice the name of the repository: example/docker-node-hello. This was the value that followed the -t flag in the docker image build command.
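One way to confirm the build is to list only the images in that repository. The helper name below is hypothetical.

```shell
# Hypothetical helper; lists only images in the example/docker-node-hello repository.
list_built_image() {
  docker image ls example/docker-node-hello
}
```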

Step 5: Test the image.

Since we are mapping a port on the OS to a port inside the container, we should be able to type localhost:8080 into the browser and the contents of the simple Node.js application running inside the container will be shown.
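A run command with that port mapping might look like the sketch below. The helper name is hypothetical, and the container-side port 8080 is an assumption about where the Node.js application listens.

```shell
# Hypothetical helper; assumes the app inside the container listens on port 8080.
run_hello() {
  # -d runs the container in the background.
  # --rm removes the container once it stops.
  # -p HOST:CONTAINER publishes container port 8080 on host port 8080.
  docker container run --rm -d -p 8080:8080 example/docker-node-hello:latest
}
```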

Passing build time arguments

In Figure 14, we were introduced to the metadata capabilities of a Dockerfile. In particular, we saw ARG, LABEL and ENV as examples of special keywords that can be (and should be) used to provide additional metadata for our container.

Before we work on passing values for the build process, let's quickly confirm how the ARG and ENV keywords are typically used.

Inside the Dockerfile for our example, there is an ARG and an ENV keyword.

Assuming the Dockerfile does have a LABEL containing the maintainer's email address, we can find this bit of metadata pretty quickly and easily by using simple Linux grep commands.
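As a sketch of that lookup, the hypothetical helper below inspects an image and filters the output with grep. It assumes the label key contains the word maintainer.

```shell
# Hypothetical helper; $1 is the image to inspect.
# Assumes the label key contains the word "maintainer".
show_maintainer() {
  docker image inspect "$1" | grep maintainer
}
```

For example, show_maintainer example/docker-node-hello:latest should print the line of the inspect output that holds the maintainer label.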

But what should we do if Anna is on vacation and Noah is temporarily given the task of being a maintainer?

Noah has to build a new image from the same Dockerfile but doesn't have time to update the email address inside. Does he have to stop the build, update the Dockerfile and then continue his work?

Not necessarily (though it would be advisable to keep the Dockerfile current). Noah can simply pass a new email address through the command line when he is running the docker build command. The value(s) for any ARG passed from the command line, using the --build-arg switch, will supersede any default values already documented inside the Dockerfile.

We can confirm that even though he did not update the Dockerfile directly, his email address will be displayed when someone wants to look for the maintainer.
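Noah's build might look like the sketch below. The helper name is hypothetical, and the build argument name email is an assumption about how the ARG is declared in the Dockerfile.

```shell
# Hypothetical helper; $1 is the maintainer email to use for this build.
# Assumes the Dockerfile declares the build argument as "ARG email".
build_with_maintainer() {
  docker image build --build-arg email="$1" -t example/docker-node-hello:latest .
}
```

For example, build_with_maintainer noah@example.com builds the image with his address, and inspecting the image labels afterwards should show the overridden value.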

Passing ENV variables to the container

An environment variable is a named value that can affect the way running processes behave on a computer. Environment variables are part of the environment in which a process runs. For our demo, the environment in which our Node.js application runs is the OS environment inside the container.

💡

Read this article for a well crafted introduction to Environment Variables.

In our example, there are 2 environment variables being used.

Figure 8 shows the message displayed in a browser when we were NOT OVERRIDING the value of DEFAULT_WHO.

Much like --build-arg does for ARG at build time, the -e switch sets the value of an environment variable, but it does so when the container is started with docker container run rather than during the build, as shown.

Accessing the container's content through the browser shows that the default value for WHO ("World") has been replaced by "Tom and Kerry".
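A run command that overrides the greeting might look like the sketch below. The helper name is hypothetical, and it assumes the application reads an environment variable named WHO and listens on port 8080 inside the container.

```shell
# Hypothetical helper; $1 is the name to greet.
# Assumes the app reads the WHO environment variable and listens on port 8080.
run_hello_as() {
  # -e NAME=value sets an environment variable for this container only.
  docker container run --rm -d -p 8080:8080 -e WHO="$1" example/docker-node-hello:latest
}
```

For example, run_hello_as "Tom and Kerry" starts the container with the overridden value; the quotes keep the three words together as one value.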

I write to remember and if in the process, I can help someone learn about Containers, Orchestration (Docker Compose, Kubernetes), GitOps, DevSecOps, VR/AR, Architecture, and Data Management, that is just icing on the cake.