
Creating Keycloak SSO Server Docker images

kochhar edited this page Jul 5, 2017 · 11 revisions

Docker swarm setup

  • Docker swarm is running in azure
  • login id is shailesh
  • Shashank can give you access

Instructions

These notes follow the instructions from this keycloak blog post. Please check the blog for a detailed explanation of the commands described.

Docker install

Already done. Skipped

Starting docker daemon

Already done. Skipped

Using Docker client

Need to set env vars to allow the Docker client to communicate with the daemon.

export DOCKER_HOST=tcp://172.16.0.5:2375
  • The IP address is one of the network interfaces of the Docker host system.
  • Selected the eth0 address from ip addr: 172.16.0.5
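With the variable set, a quick way to confirm the client can actually reach the remote daemon (assuming the daemon is listening on that TCP port) is:

```shell
# Point the client at the remote daemon, then confirm connectivity.
export DOCKER_HOST=tcp://172.16.0.5:2375

# Prints both client and server version info if the daemon is reachable.
docker version
```

If the server half of the output is missing, the client could not reach the daemon.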

List containers

docker ps

Since there were no containers started the output was empty.

Start Postgres container

docker run --name auth_postgres \
           -e POSTGRES_DATABASE=keycloak \
           -e POSTGRES_USER=keycloak \
           -e POSTGRES_PASSWORD=password \
           -e POSTGRES_ROOT_PASSWORD=ekstep@170705 \
           -d postgres

Check container logs

Attach to the container output using:

docker logs -f auth_postgres
  • You should see the initialization output finish with a line like "database system is ready to accept connections".
  • Use CTRL-C to exit the client.
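A non-interactive way to check for readiness is to grep the logs for the line the official postgres image prints once initialization completes:

```shell
# The official postgres image logs this line once it is ready for connections.
docker logs auth_postgres 2>&1 | grep 'database system is ready to accept connections'
```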

Testing a shell inside the auth_postgres container

We can start a new shell process within the same Docker container:

docker exec -ti auth_postgres bash

We can find out what the container’s IP address is:

ip addr
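The same information is available from the host without entering the container; docker inspect can print the container's IP on the default bridge network:

```shell
# Print the container's IP address on the default bridge network.
docker inspect -f '{{ .NetworkSettings.IPAddress }}' auth_postgres
```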

Testing Local Connection to Postgres

Let’s make sure that we are in fact attached to the container running the PostgreSQL server. Use the psql client to connect as user keycloak to the local database:

# psql -U keycloak 
psql (9.4.1)
Type "help" for help.


keycloak=# \l
  • This should list the databases present.
  • Exit postgres client with \q. And then exit the shell with exit.
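The same check can be run in one shot from the host with docker exec (this relies on the image allowing passwordless local connections, as the interactive session above did):

```shell
# One-shot database listing without opening an interactive shell.
docker exec auth_postgres psql -U keycloak -c '\l'
```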

Testing Remote Connection to Postgres

We can test that remote connectivity works by starting a new docker container based on the same postgres image so that we have access to psql tool:

docker run --rm -ti --link auth_postgres:auth_postgres postgres bash
# psql -U keycloak -h auth_postgres
Password for user keycloak: 
psql (9.4.1)
Type "help" for help.


keycloak=# \l
  • This should list the databases present.
  • Exit the postgres client with \q. And exit the shell with exit.
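To script this check without the interactive password prompt, psql reads the password from the PGPASSWORD environment variable:

```shell
# Non-interactive remote connectivity check; PGPASSWORD avoids the prompt.
docker run --rm --link auth_postgres:auth_postgres \
           -e PGPASSWORD=password \
           postgres psql -U keycloak -h auth_postgres -c '\l'
```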

Starting Keycloak Docker Cluster

The idea:

We’ll use a prepared Docker image from DockerHub to run two Keycloak containers. Each one will connect to the PostgreSQL container we just started. In addition, the two Keycloak containers will form a cluster with a distributed cache, so that any state between requests is instantly available to both instances. That way either instance can be stopped, and users redirected to the other, without any loss of runtime data.

Starting Keycloak 1

docker run -p 8080:8080 \
           --name keycloak_1 \
           --link auth_postgres:postgres \
           -e POSTGRES_DATABASE=keycloak \
           -e POSTGRES_USER=keycloak \
           -e POSTGRES_PASSWORD=password \
           -d jboss/keycloak-ha-postgres

We can monitor the Keycloak server coming up with:

docker logs -f keycloak_1

If all is well, you should see something along the lines of:

10:03:04,706 INFO  [org.jboss.as.server] (ServerService Thread Pool -- 51) WFLYSRV0010: Deployed "keycloak-server.war" (runtime-name : "keycloak-server.war")
10:03:04,865 INFO  [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://127.0.0.1:9990/management
10:03:04,870 INFO  [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://127.0.0.1:9990
10:03:04,872 INFO  [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: Keycloak 3.2.0.Final (WildFly Core 2.0.10.Final) started in 25863ms - Started 513 of 900 services (639 services are lazy, passive or on-demand)
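Once the log shows the server started, an HTTP check against the welcome page confirms it is serving requests (Keycloak on WildFly answers under /auth):

```shell
# Expect HTTP 200 from the Keycloak welcome page once the server is up.
curl -s -o /dev/null -w '%{http_code}\n' http://172.16.0.5:8080/auth/
```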

Starting Keycloak 2

At this point, we're set to bring up a second Keycloak instance. Repeat the same command, this time with the name set to keycloak_2 and the host machine port changed to 8081.

docker run -p 8081:8080 \
           --name keycloak_2 \
           --link auth_postgres:postgres \
           -e POSTGRES_DATABASE=keycloak \
           -e POSTGRES_USER=keycloak \
           -e POSTGRES_PASSWORD=password \
           -d jboss/keycloak-ha-postgres

We can monitor this server coming up with:

docker logs -f keycloak_2

Cluster woes

I found that the cluster did not get created properly. The two keycloak instances could not communicate with each other. Checking the logs showed some warnings.

10:36:13,998 WARN  [org.jgroups.protocols.UDP] (MSC service thread 1-3) JGRP000015: the send buffer of socket ManagedDatagramSocketBinding was set to 1MB, but the OS only allocated 212.99KB. This might lead to performance problems. Please set your max send buffer in the OS correctly (e.g. net.core.wmem_max on Linux)
10:36:13,998 WARN  [org.jgroups.protocols.UDP] (MSC service thread 1-3) JGRP000015: the receive buffer of socket ManagedDatagramSocketBinding was set to 20MB, but the OS only allocated 212.99KB. This might lead to performance problems. Please set your max receive buffer in the OS correctly (e.g. net.core.rmem_max on Linux)
10:36:13,998 WARN  [org.jgroups.protocols.UDP] (MSC service thread 1-3) JGRP000015: the send buffer of socket ManagedMulticastSocketBinding was set to 1MB, but the OS only allocated 212.99KB. This might lead to performance problems. Please set your max send buffer in the OS correctly (e.g. net.core.wmem_max on Linux)
10:36:13,999 WARN  [org.jgroups.protocols.UDP] (MSC service thread 1-3) JGRP000015: the receive buffer of socket ManagedMulticastSocketBinding was set to 25MB, but the OS only allocated 212.99KB. This might lead to performance problems. Please set your max receive buffer in the OS correctly (e.g. net.core.rmem_max on Linux)

I tracked this down to undersized kernel buffers on the host system, and found this thread explaining a potential solution: https://developer.jboss.org/thread/272212

You have to increase the buffer sizes. On Linux systems, add these two lines to /etc/sysctl.conf:

# Allow a 25MB UDP receive buffer for JGroups  
net.core.rmem_max = 26214400  
# Allow a 1MB UDP send buffer for JGroups  
net.core.wmem_max = 1048576  

Then run this command for the changes to take effect:

sysctl -p
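The current values can be read back to confirm the change took effect:

```shell
# Read back the kernel buffer limits; after sysctl -p these should print
# 26214400 and 1048576 respectively.
sysctl -n net.core.rmem_max
sysctl -n net.core.wmem_max
```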

I tried the process from https://forums.docker.com/t/how-to-tune-kernel-properties-in-docker-images/25291/2 to fix this on the host. It made the warnings go away; however, self-discovery still did not work and the nodes did not find one another.

Multicast binding

The comments at the end of the blog post indicate that the servers do not find each other because discovery must happen over multicast, and the private interface's default bind address is 127.0.0.1.

To fix this, the configuration file at $KEYCLOAK_ROOT/standalone/configuration/standalone-ha.xml needs to be updated with the proper bind address.

Solution to multicast binding

Found two solutions to this:

  1. This Dockerfile, which builds and runs a Java executable that figures out the address of the host and overrides the configuration in the Docker CMD.
  2. This XSL transform, which is applied in the Dockerfile and binds to the eth0 interface instead of running Java code.

The first requires running some untrusted Java code. The second applies a hard-to-understand XSL transform.

Workaround

Finally, I chose a slightly quicker way to test the clustering: I ran the same jboss/keycloak-ha-postgres image but overrode the Docker CMD on the command line, substituting the IP addresses that the new containers were going to be assigned. The command reads:

docker run -p 8080:8080 \
           --name keycloak_1 \
           --link auth_postgres:postgres \
           -e POSTGRES_DATABASE=keycloak \
           -e POSTGRES_USER=keycloak \
           -e POSTGRES_PASSWORD=password \
           -d jboss/keycloak-ha-postgres \
           -b 0.0.0.0 --server-config standalone-ha.xml \
           -Djboss.bind.address.private=172.17.0.3

for the first Keycloak server, and:

docker run -p 8081:8080 \
           --name keycloak_2 \
           --link auth_postgres:postgres \
           -e POSTGRES_DATABASE=keycloak \
           -e POSTGRES_USER=keycloak \
           -e POSTGRES_PASSWORD=password \
           -d jboss/keycloak-ha-postgres \
           -b 0.0.0.0 --server-config standalone-ha.xml \
           -Djboss.bind.address.private=172.17.0.4

The parameters being passed were taken from the jboss/keycloak/server-ha-postgres Dockerfile and extended to override the private bind address.
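Because the bind addresses were hard-coded ahead of time, it's worth confirming that Docker actually assigned the expected IPs once the containers are up:

```shell
# Confirm the containers received the IPs assumed in the run commands
# (172.17.0.3 and 172.17.0.4 on the default bridge network).
docker inspect -f '{{ .NetworkSettings.IPAddress }}' keycloak_1
docker inspect -f '{{ .NetworkSettings.IPAddress }}' keycloak_2
```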

Check the logs now

docker logs -f keycloak_2

And boom! The clustering seems to be working now over multicast.

INFO  (MSC service thread 1-4) ISPN000078: Starting JGroups channel ejb
INFO  (MSC service thread 1-1) ISPN000078: Starting JGroups channel keycloak
INFO  (MSC service thread 1-3) ISPN000078: Starting JGroups channel server
INFO  (MSC service thread 1-2) ISPN000078: Starting JGroups channel hibernate
INFO  (MSC service thread 1-4) ISPN000094: Received new cluster view for channel ejb: [e85037dc89ef|1] (2) [e85037dc89ef, b1a6f11e0e03]
INFO  (MSC service thread 1-3) ISPN000094: Received new cluster view for channel server: [e85037dc89ef|1] (2) [e85037dc89ef, b1a6f11e0e03]
INFO  (MSC service thread 1-2) ISPN000094: Received new cluster view for channel hibernate: [e85037dc89ef|1] (2) [e85037dc89ef, b1a6f11e0e03]
INFO  (MSC service thread 1-3) ISPN000079: Channel server local address is b1a6f11e0e03, physical addresses are [172.17.0.4:55200]
INFO  (MSC service thread 1-2) ISPN000079: Channel hibernate local address is b1a6f11e0e03, physical addresses are [172.17.0.4:55200]
INFO  (MSC service thread 1-1) ISPN000094: Received new cluster view for channel keycloak: [e85037dc89ef|1] (2) [e85037dc89ef, b1a6f11e0e03]
INFO  (MSC service thread 1-1) ISPN000079: Channel keycloak local address is b1a6f11e0e03, physical addresses are [172.17.0.4:55200]
INFO  (MSC service thread 1-4) ISPN000079: Channel ejb local address is b1a6f11e0e03, physical addresses are [172.17.0.4:55200]
INFO  (MSC service thread 1-3) ISPN000078: Starting JGroups channel web
INFO  (MSC service thread 1-3) ISPN000094: Received new cluster view for channel web: [e85037dc89ef|1] (2) [e85037dc89ef, b1a6f11e0e03]
INFO  (MSC service thread 1-3) ISPN000079: Channel web local address is b1a6f11e0e03, physical addresses are [172.17.0.4:55200]

Next steps

Deploy to swarm

The docker containers are now running keycloak on the same host and can communicate with each other and the database successfully.

docker ps
CONTAINER ID        IMAGE                        COMMAND                  CREATED             STATUS              PORTS                    NAMES
b1a6f11e0e03        jboss/keycloak-ha-postgres   "/opt/jboss/docker..."   28 minutes ago      Up 28 minutes       0.0.0.0:8081->8080/tcp   keycloak_2
e85037dc89ef        jboss/keycloak-ha-postgres   "/opt/jboss/docker..."   30 minutes ago      Up 30 minutes       0.0.0.0:8080->8080/tcp   keycloak_1
b56575c69022        postgres                     "docker-entrypoint..."   8 hours ago         Up 8 hours          5432/tcp                 auth_postgres

The next steps are to move the Keycloak containers off the single host they currently share and deploy each of them on a different host. There are three nodes available in the swarm.

docker node ls
ID                            HOSTNAME                            STATUS              AVAILABILITY        MANAGER STATUS
drgx9m8soienxkm2swls8x9kf     swarmm-agentpublic-26640432000002   Ready               Active
tkfho6al04id1njzbtf2tnayw     swarmm-agentpublic-26640432000000   Ready               Active
tu7v80ggjfwb8155ba3ykbu8n *   swarmm-master-26640432-0            Ready               Pause               Leader

Deploying to the swarm will require that

  • the command-line parameters provide the right bind address for the host where each container is deployed.
  • multicast traffic is passed within the network. Since this will run in Azure, this may be problematic: Azure does not support multicast traffic, and it is not clear whether Docker swarm does. However, there does seem to be a workaround over unicast which should allow us to get this working in the swarm.
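As a rough sketch of what the swarm deployment might look like (untested; the service and network names here are hypothetical, and the multicast vs. unicast discovery question above still has to be resolved first):

```shell
# Hypothetical sketch only: run Keycloak as a swarm service on an overlay
# network. The names auth_net and keycloak are illustrative, not taken from
# the setup above, and discovery would still need a unicast mechanism.
docker network create --driver overlay auth_net

docker service create --name keycloak \
           --network auth_net \
           --replicas 2 \
           --publish 8080:8080 \
           -e POSTGRES_DATABASE=keycloak \
           -e POSTGRES_USER=keycloak \
           -e POSTGRES_PASSWORD=password \
           jboss/keycloak-ha-postgres
```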