ECE 473/573 Fall 2024 - Project 2

Service Containerization

Report Due: 09/29 (Sun.), by the end of the day (Chicago time)
Late submissions will NOT be graded

In this project, you will learn how to containerize services using Docker tools. We will introduce a few more Docker commands for troubleshooting issues, as well as Docker Compose to manage and connect services via container orchestration.

II. Key-Value Store Orchestration

We continue our exploration of the key-value store example from the textbook and focus on the transaction log backed by a Postgres database in this project. While the example code from the textbook available from github has issues for containerization, I am able to fix most of them and the updated version is available at https://github.com/wngjia/ece573-prj02.git. Similar to Project 1, please create a new VM by importing our VM Appliance, make a clone of the repository (or fork it first), and execute 'setup_vm.sh' to setup the VM as needed.

Our key-value store consists of two microservices kvs and postgres, each running in its own container. The kvs service uses the source code from the textbook that is in the 'kvs' directory. On the other hand, the postgres service uses a pre-built Postgres image, which saves us a lot of effort. Instead of managing this two containers individually, we utilize Docker Compose to orchestrate the two so that they can be easily built, started, and stopped as a whole.

Docker Compose uses the 'docker-compose.yml' script to define the services as containers and how they should be orchestrated. Open docker-compose.yml and you should be able to tell it defines two services. The first service is postgres that uses the pre-built image 'postgres:11'. More details of this image can be found on docker hub, including the descriptions for the environment variables defined in the postgres service. The second service kvs should be built from the directory 'kvs'. The port '8080' is publish as '8080' so that we can access the service inside the container from the VM.

What is not shown in the docker-compose.yml script is that by default Docker Compose will create a private network to interconnect the containers. Each container will get their own IP address and will need to know the addresses of others before they can communicate. For example, how could the kvs service know the address of the postgres service? It is definitely not a good idea to hard code the addresses in the script and in the source code. Instead, we utilize the DNS service provided by Docker Compose where each service container has a domain name matching the name of the service. We pass 'postgres' via environment varibale to the kvs service so kvs can use it to connect to the database.

Use docker compose build to build the container image for kvs.

ubuntu@ece573:~/ece573-prj02$ docker compose build
...
=> => naming to docker.io/library/ece573-prj02-kvs  0.0s

Use docker compose create to create the containers and the default private network.

ubuntu@ece573:~/ece573-prj02$ docker compose create
...
[+] Creating 3/3
 ✔ Network ece573-prj02_default       Created  0.5s 
 ✔ Container ece573-prj02-postgres-1  Created  1.1s 
 ✔ Container ece573-prj02-kvs-1       Created  0.2s

Both the commands may take a few minutes to complete for the first time as they need to pull images from docker hub. Note that you will need to run these two commands again if you ever change the Go source code in the 'kvs' directory.

Now we are ready to start our key-value store. Use docker compose start to start the containers in background.

ubuntu@ece573:~/ece573-prj02$ docker compose start
[+] Running 2/2
 ✔ Container ece573-prj02-postgres-1  Started  1.7s 
 ✔ Container ece573-prj02-kvs-1       Started  1.9s

Are the containers running? Use docker ps -a to show all running and stopped containers.

buntu@ece573:~/ece573-prj02$ docker ps -a
CONTAINER ID   IMAGE             ...      STATUS            PORTS                        NAMES
4dd8bb77dbdf   ece573-prj02-kvs  ...   Up 29 seconds   0.0.0.0:8080->8080/tcp,...   ece573-prj02-kvs-1
193d918dbed4   postgres:11       ...   Up 31 seconds   5432/tcp                     ece573-prj02-postgres-1

Pay attention to IMAGE, STATUS, PORTS, and NAMES columns as they indicate if the containers match your expectations.

What if something went wrong and you will need to troubleshoot issues? Usually a service will generate a lot of log messages on the screen. However, since we run our containers in background you won't be able to see those messages directly. To access them, you may use the command docker logs with the NAME (or the CONTAINER ID) of the container.

ubuntu@ece573:~/ece573-prj02$ docker logs 193d
...
2023-09-17 18:02:23.774 UTC [76] LOG:  database system was shut down at 2023-09-17 18:02:23 UTC
2023-09-17 18:02:23.797 UTC [1] LOG:  database system is ready to accept connections

ubuntu@ece573:~/ece573-prj02$ docker logs ece573-prj02-kvs-1 
2023/09/17 18:02:17 retry(5) in 10 seconds: dial tcp 172.18.0.2:5432: connect: connection refused
2023/09/17 18:02:27 0 events replayed

Our key-value store is running and you can use curl to test it from the VM.

ubuntu@ece573:~/ece573-prj02$ curl -X GET 127.0.0.1:8080/v1/a
no such key
ubuntu@ece573:~/ece573-prj02$ curl -X PUT -d "value a" 127.0.0.1:8080/v1/a
ubuntu@ece573:~/ece573-prj02$ curl -X GET 127.0.0.1:8080/v1/a
value a

The log messages are also updated.

ubuntu@ece573:~/ece573-prj02$ docker logs ece573-prj02-kvs-1
...
2023/09/17 18:20:29 GET /v1/a
2023/09/17 18:20:44 PUT /v1/a
2023/09/17 18:20:44 PUT key=a value=value a
2023/09/17 18:20:46 GET /v1/a

To complete this section, answer the following questions in the project report. You may need to perform additional searches online.

What is the license of the original example code from the textbook? (Hint: read the 'LICENSE' file)
Open source software may use other licenses like GNU General Public License (GNU GPL) and GNU Affero General Public License (GNU AGPL). What are the difference between GNU GPL and GNU AGPL for cloud services?
The kvs service depends on the postgres service as it need to load the transaction log from the database. The dependency is defined in docker-compose.yml so that postgres starts before kvs. However, the log messages above indicate that kvs still need to retry connecting to postgres. Why?

III. Manageability via Environment Variables

Now you should spend some time reading the Go source code in the 'kvs' directory to understand how the service works. We have discussed many of them in the lectures except those in 'pglogger.go', which gives you a few examples on how to connect and use a SQL database.

If you take a look at the 'initializeTransactionLog' function in 'service.go', you will see the database connection parameters like user and password are hard coded. This is definitely not acceptable for manageability as discussed in Lecture 02. All these parameters should be passed in via environment variables from the docker-compose.yml script.

Here is what you need to do for this section.

I have provided in service.go an example to use the environment variable "POSTGRES_HOST" to pass the DNS name of the postgres service to the host parameter. Modify service.go accordingly to use environment variables to setup other parameters including dbName, user, and password.
Modify docker-compose.yml to pass correct values to those environment variables.
Build, create, and start the containers again to make sure your modification is correct. Take screenshots as needed.
Test the key-value store using curl and validate the log messages from the kvs service. Take screenshots as needed.

IV. (Bonus) Fault Tolerance Testing

What would you expect to happen if one of the services fails? We test whether the key-value store continue to function correctly or not if one service fails by shutting down and restarting them one at a time. Here is a test procedure for the kvs service.

Put a key-value pair to the store.
Shutdown kvs by docker kill ece573-prj02-kvs-1
Restart kvs by docker compose start
Validate that the key-value pair is still there.

The test procedure for the 'postgres' service is more complicated since the client may continue to connect to the 'kvs' service.

Put a key-value pair to the store.
Shutdown postgres.
Put another key-value pair to the store.
Restart postgres.
Are the two pairs still there?
Shutdown and restart kvs.
Are the two pairs still there?

Complete these two test procedures and take screenshots as needed. Answer the following questions in the project report.

Is there any data loss if one service is down? Why?
If there is, what would you like to do?

V. Project Deliverables

Complete the tasks for Section II and III, and optionally Section IV for a 25% bonus. Prepare a project report in .doc/.docs or .pdf format and submit it to Blackboard before the deadline. Your project report should include the following:

(10 points) Answers to questions in Section II.
(3 points) Explain how 'service.go' is modified for Section III and the modified part of the 'service.go' file.
(3 points) The modified 'docker-compose.yml' file for Section III.
(4 points) Screenshots for Section III showing the process to build, create, and start the containers, and the tests with curl as well as the log messages from the 'kvs' service. Explain why those outputs are correct.
(5 points) Screenshots for Section IV and answers to questions there.