Dockerizing a legacy virtual machine, such as Clarity LIMS

Recently we worked with an amazing biotech client on an initiative to shut down internal self-hosted systems and move all services to “the cloud”. The client asked us to convert a virtual appliance (basically a regular Linux install) into a standard Docker image. The Docker images would then run on Kubernetes, on either Google Kubernetes Engine or Alibaba. The software in question is Clarity LIMS, from a company called Genologics. We thought we would share a bit of this story in case it helps other Clarity LIMS users, and anyone trying to figure out how to Dockerize legacy virtual machines or even physical bare-metal machines.

If you need help getting an application ready for Docker/Kubernetes deployment, we would be happy to help.

To achieve this, we broke the project down into smaller sections.

IDENTIFY THE MICRO COMPONENTS THAT CAN BE SEPARATED INTO THEIR OWN CONTAINERS

Looking through the running processes, the various system startup scripts, and the documentation, we determined that the monolithic system could be broken down into the following micro components: a Tomcat application, a native Java application, Elasticsearch, RabbitMQ, and HTTPd (an HTTPS-to-AJP proxy).

CREATING THE MICRO CONTAINERS

Part 1

We can see that three of the components are already Dockerized in the wild: Elasticsearch, RabbitMQ, and HTTPd (HTTPS-to-AJP proxy).

For these, we can set up a Dockerfile that inherits from an existing Docker image and adds our configuration and customizations. We also need to make sure we use a version that is compatible with the Java applications being run. For example, we noted that the appliance's Elasticsearch is still within the 1.x release series, so the Dockerfile would look something like this:

   
  
FROM elasticsearch:1.7.6-alpine

COPY config ./config
COPY docker-entrypoint.sh /
COPY elasticsearch.in.sh /
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["elasticsearch"]
  

A similar approach is taken for RabbitMQ and the Apache HTTP Reverse proxy service.
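As a rough illustration of that similar approach, here is a minimal sketch of what the reverse proxy image might look like when built on the official `httpd` image. The `httpd:2.4` tag and the `clarity-proxy.conf` file name are assumptions for the example, not details taken from the appliance. The actual ProxyPass and TLS settings would live in that file.

```dockerfile
# Sketch only: the base tag and the config file name are assumptions.
FROM httpd:2.4

# Hypothetical file holding the TLS virtual host and the AJP ProxyPass rules.
COPY clarity-proxy.conf /usr/local/apache2/conf/extra/clarity-proxy.conf

# The stock httpd.conf ships with these modules commented out, so enable them
# and include our proxy configuration.
RUN sed -i \
        -e 's/^#\(LoadModule .*mod_proxy\.so\)/\1/' \
        -e 's/^#\(LoadModule .*mod_proxy_ajp\.so\)/\1/' \
        -e 's/^#\(LoadModule .*mod_ssl\.so\)/\1/' \
        -e 's/^#\(LoadModule .*mod_socache_shmcb\.so\)/\1/' \
        /usr/local/apache2/conf/httpd.conf \
 && echo "Include conf/extra/clarity-proxy.conf" \
        >> /usr/local/apache2/conf/httpd.conf

EXPOSE 443
```

As with Elasticsearch, the tag should be pinned to a version that matches what the appliance was running.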

Part 2

For the main Docker images, namely Tomcat and the custom Java application, we’ll start lower down in the stack with a base OS image and build up from there.

For these two, we’ll use “FROM centos:centos6.9” to match the appliance OS.

To ensure the Java application continues to work flawlessly, we’ll wrap up the existing JDK from the appliance and use that as one of the building blocks for the images.

   
  
FROM centos:centos6.9

RUN mkdir /opt/gls
ENV JAVA_HOME=/opt/gls/jdk8/current 
ADD jdk8.tgz /opt/gls/

  

This is a pretty good starting point for both of the Java images, so let’s wrap this up, call it “clarity-jdk8”, and push it up to gcr.io.
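One optional addition to this base image is a build-time sanity check. The sketch below assumes the tarball unpacks to /opt/gls/jdk8/current (the path JAVA_HOME points at). It puts the JDK on the PATH and fails the build early if that layout ever changes.

```dockerfile
FROM centos:centos6.9

RUN mkdir /opt/gls
ENV JAVA_HOME=/opt/gls/jdk8/current
# ADD unpacks a local tar archive automatically.
ADD jdk8.tgz /opt/gls/

# Assumption: the archive unpacks to /opt/gls/jdk8/current, matching JAVA_HOME.
ENV PATH="${JAVA_HOME}/bin:${PATH}"

# Fails the build if the JDK did not land where JAVA_HOME expects it.
RUN java -version
```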

Part 3

Using the Java image we just created above, we can build our Tomcat image.

   
  
FROM us.gcr.io//clarity-jdk8

  

Okay, hold up a minute... it’s not just a Tomcat image. We also need end-user SSH, because the application SSHes into itself to run commands and uses scp to transfer files. So we will need to build an image that goes against Docker best practices and runs two processes: sshd and Tomcat.

Let’s get some basics into the image first:

   
  
# Add in the needed repos and a few binaries we'll need later in the build.
# Patch in some settings and get the RPM DB cleaned up.

RUN yum -y install centos-release-scl \
 && yum -y update \
 && yum -y install wget tar gzip openssh-clients rsync netcat \
 && ln -sf /usr/share/zoneinfo/UTC /etc/localtime \
 && echo "NETWORKING=yes" > /etc/sysconfig/network \
 && rpm --rebuilddb \
 && rpm --import \
                http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6 \
 && rpm --import \
                https://dl.fedoraproject.org/pub/epel/RPM-GPG-KEY-EPEL-6 

# Install the openssh-server and a few more binaries

RUN yum -y install \
        --setopt=tsflags=nodocs \
        --disableplugin=fastestmirror \
        epel-release \
        openssh \
        openssh-clients \
        openssh-server \
        python-setuptools \
        sudo \
        vim-minimal \
        xz \
        python-pip \
        python-wheel \
        python-meld3 \
        git \
        gcc \
        ssmtp

# Clean up the image again to keep it as small as possible

RUN rpm -e --nodeps \
        hwdata \
        iptables \
        plymouth \
        policycoreutils \
        sysvinit-tools \
 && > /etc/sysconfig/i18n \
 && yum -y upgrade python-setuptools \
 && mkdir -p /opt/gls \
 && yum -y clean all \
 && rm -rf /{root,tmp}/*

  

 

After this, we add in some code and requirements we extracted from the appliance. Since this is very application-specific, we’ll skip over this next bit.

   
  
# -----------------------------------------------------------------------------
# Setup Clarity
# -----------------------------------------------------------------------------
# ### REDACTED
  

 

Okay, here’s the fun part: we’ll use circusd as the container’s initial process. Circus is a supervisor/guardian daemon that will keep our OpenSSH server and Tomcat server processes running.

We’ll also patch in some config updates as we go.

   
  
# -----------------------------------------------------------------------------
# Install Supervisor/Circus
# -----------------------------------------------------------------------------

RUN yum -y install python-devel \
 && yum -y groupinstall 'Development Tools' \
 && easy_install pip==9.0.3 \
 && pip install --upgrade setuptools \
 && pip install circus \
 && mkdir -p /var/log/circus

# -----------------------------------------------------------------------------
# Configure SSH: allow password and root logins, disable DNS lookups,
# and use the internal SFTP server
# -----------------------------------------------------------------------------
RUN sed -i \
        -e 's~^PasswordAuthentication no~PasswordAuthentication yes~g' \
        -e 's~^#PermitRootLogin no~PermitRootLogin yes~g' \
        -e 's~^#UseDNS yes~UseDNS no~g' \
        -e 's~^\(.*\)/usr/libexec/openssh/sftp-server$~\1internal-sftp~g' \
        /etc/ssh/sshd_config

# -----------------------------------------------------------------------------
# Enable the wheel sudoers group
# -----------------------------------------------------------------------------
RUN sed -i \
        -e 's~^# %wheel\tALL=(ALL)\tALL~%wheel\tALL=(ALL) ALL~g' \
        -e 's~\(.*\) requiretty$~#\1requiretty~' \
        /etc/sudoers

# -----------------------------------------------------------------------------
# Setup Supervisor/Circus and SSH files, copy into place
# -----------------------------------------------------------------------------
ADD src/usr/bin \
        /usr/bin/
ADD src/usr/sbin \
        /usr/sbin/
#ADD src/etc/systemd/system \
#        /etc/systemd/system/

ADD src/etc/services-config/ssh/authorized_keys \
        src/etc/services-config/ssh/sshd-bootstrap.conf \
        src/etc/services-config/ssh/sshd-bootstrap.env \
        /etc/services-config/ssh/
ADD src/etc/services-config/circus/circus.ini \
        /etc/services-config/circus/

RUN mkdir -p \
                /etc/circus/ \
        && cp -pf \
                /etc/ssh/sshd_config \
                /etc/services-config/ssh/ \
        && ln -sf \
                /etc/services-config/ssh/sshd_config \
                /etc/ssh/sshd_config \
        && ln -sf \
                /etc/services-config/ssh/sshd-bootstrap.conf \
                /etc/sshd-bootstrap.conf \
        && ln -sf \
                /etc/services-config/ssh/sshd-bootstrap.env \
                /etc/sshd-bootstrap.env \
        && ln -sf \
                /etc/services-config/circus/circus.ini \
                /etc/circus/circus.ini \
        && chmod 700 \
                /usr/bin/healthcheck \
                /usr/sbin/sshd-{bootstrap,wrapper}
  

 

Now we set up some environment variables that hold things together:

   
  
# -----------------------------------------------------------------------------
# Set default environment variables
# -----------------------------------------------------------------------------

EXPOSE 22 9009 9080
ENV SSH_AUTHORIZED_KEYS="" \
    SSH_AUTOSTART_SSHD=true \
    SSH_AUTOSTART_SSHD_BOOTSTRAP=true \
    SSH_CHROOT_DIRECTORY="%h" \
    SSH_INHERIT_ENVIRONMENT=false \
    SSH_SUDO="ALL=(ALL) ALL" \
    SSH_USER="app-admin" \
    SSH_USER_FORCE_SFTP=false \
    SSH_USER_HOME="/home/%u" \
    SSH_USER_ID="500:500" \
    SSH_USER_PASSWORD="" \
    SSH_USER_PASSWORD_HASHED=false \
    SSH_USER_SHELL="/bin/bash" \
    JAVA_BIN=/opt/gls/jdk8/current/bin/java \
    CATALINA_MAIN=org.apache.catalina.startup.Bootstrap \
    JSVC=/opt/gls/clarity/tomcat/current/bin/jsvc \
    JSVC_OPTS="" \
    TOMCAT_USER=glsjboss \
    JAVA_MAX_RAM=4096 \
    JAVA_MAX_PERMSIZE=512 \
    CATALINA_OPTS="-Djava.awt.headless=true -server -Xms48m -Xmx4096m -XX:MaxPermSize=512m" \
    CLASSPATH="/opt/gls/clarity/tomcat/current/bin/bootstrap.jar:/opt/gls/clarity/tomcat/current/bin/commons-daemon.jar:/opt/gls/clarity/tomcat/current/bin/tomcat-juli.jar" \
    LOGGING_CONFIG="-Djava.util.logging.config.file=/opt/gls/clarity/tomcat/current/conf/logging.properties"

  

 

And finally, we’ll run with circusd:

   
  
CMD ["/usr/bin/circusd", "/etc/circus/circus.ini"]
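The circus.ini that this CMD points at is ADDed from src/ and is not reproduced in this post. Purely as an illustration of the kind of content such a file might hold, the sketch below writes a minimal two-watcher configuration inline, so that the example stays a Dockerfile fragment. The watcher options are standard Circus settings and `/usr/sbin/sshd-wrapper` is the script installed earlier, but `/usr/sbin/tomcat-wrapper` is a hypothetical name for a script that starts Tomcat in the foreground. The real file may look quite different.

```dockerfile
# Illustrative only: not the actual circus.ini used for Clarity.
# /usr/sbin/tomcat-wrapper is a hypothetical foreground start script.
RUN mkdir -p /etc/services-config/circus /var/log/circus \
 && printf '%s\n' \
        '[circus]' \
        'check_delay = 5' \
        '' \
        '[watcher:sshd]' \
        'cmd = /usr/sbin/sshd-wrapper' \
        'numprocesses = 1' \
        'copy_env = True' \
        '' \
        '[watcher:tomcat]' \
        'cmd = /usr/sbin/tomcat-wrapper' \
        'numprocesses = 1' \
        'copy_env = True' \
        'stdout_stream.class = FileStream' \
        'stdout_stream.filename = /var/log/circus/tomcat.log' \
        > /etc/services-config/circus/circus.ini
```

Both commands need to stay in the foreground so that Circus can supervise and restart them. `copy_env = True` passes the image's ENV values (such as CATALINA_OPTS) through to the watched processes.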

  

Part 4

So now we have images for the following parts:

  • A Tomcat Application
  • A native Java application
  • Elasticsearch
  • RabbitMQ
  • HTTPd (https to ajp proxy)

But what about the cron scripts? Good question! If you have any ideas how we achieved this, or would like to know, feel free to reach out to us!

So now we wrap all this up in a Kubernetes YAML file and test it on our Minikube installation.

From there we migrate to Google Kubernetes Engine (GKE) and we’re all set!

Categories: DevOps

By Rob Hartzenberg

September 8, 2018

Linux Engineer
