Copyright Notice

The text of and illustrations in this document are licensed by Predrag Punosevac under a Creative Commons Attribution-Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is available from Creative Commons. The original author of this document designates the Auton Lab as the "Attribution Party" for purposes of CC-BY-SA. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.


A version control system (VCS), continuous integration (CI), and regression testing tools are fundamental building blocks of the software development infrastructure of any software company. A VCS is the software used to manage and track changes to computer programs. CI is the practice of merging all developer working copies into a shared software project and making sure that these merges don't break builds. Finally, regression testing tools are used to identify unexpected consequences for the existing code base when changes or modifications are committed.

The Auton Lab, part of Carnegie Mellon University's School of Computer Science, researches new approaches to Statistical Data Mining. We are not a software company, but writing lots of useful software is a byproduct of our research effort. Interestingly, the code we write usually starts as research-grade software and ends up as the code base for licensed enterprise application software (EAS) used by many government agencies and private entities.

Historically, the first version control system used in the Auton Lab was the Concurrent Versions System (CVS). An attempt to migrate to Subversion in 2008 failed due to the tight integration of our custom Gmake-Magic build scripts with CVS. Aside from the inability of CVS to deal with atomic commits, over the past several years the existence of GitHub has completely revolutionized the way developers work together and was a major factor in the adoption of the Git distributed version control system. Over the years we noticed that our students and research programmers alike started avoiding our CVS server and created private Git repositories all over our NFS shares. We also noticed a disturbing trend of putting pieces of our proprietary code base on GitHub or, to a much lesser extent, Bitbucket. The decision was made to move aggressively towards adoption of Git for our version control needs and to try to emulate the GitHub model privately in our internal environment.

This short article is not about the technical merits of Git compared to other version control systems. It also doesn't address the philosophical and practical differences between using Git and using CVS for software development. It is merely a set of our notes and how-tos which will help you build a similar development environment in your small academic lab or start-up company.

Selecting an Operating System

Operating system diversity has an adverse effect on system administration productivity and user satisfaction. At the time of my arrival at the Auton Lab in June of 2013 there was only forensic evidence that FreeBSD had been used in the laboratory, and Linux was ruling the computing landscape. However, the lab had been orphaned for almost two years prior to my arrival, so there was a great opportunity to rethink some of the solutions. By training I am a research mathematician whose only major exposure to system administration came through my life-long UNIX hobby. It was far easier to rebuild our entire network infrastructure (firewalls, VPNs, DNS, LDAP) using OpenBSD, which I use every day at home, than to try to figure out how those things work on Linux. My second decision was to consolidate the various flavors of Linux which we used for our scientific computing and desktop needs into Springdale Linux, the Princeton University clone of Red Hat Enterprise Linux. That was all easy.

The most difficult decision was the selection of a storage OS, because we are dealing with lots of big data. Did I say a lot? Red Hat is quite happy being a storage OS using either hardware RAID or software RAID. It would have been much easier to stick with hardware RAID and Linux as a storage solution and not introduce a third OS into the lab, but the lack of a modern file system on Linux really bothered us. XFS is robust but shows its age, and the only modern options are ZFS and HAMMER. We tested both and decided at this point to adopt ZFS (software RAID) for our storage needs. During all this time OpenBSD was used as a CVS server.

Upon completing the migration of all our data to a ZFS storage pool, I started experimenting with jails on top of ZFS, and with iocage in particular, and I really liked what I saw. We also wanted to take advantage of ZFS in creating our new software development environment, which would revolve around Git.

But before we go any further, let me tell you why we picked PC-BSD (TrueOS server) over vanilla FreeBSD for our original installations.

At the time of adoption PC-BSD (TrueOS, as its server version was called) had several major advantages over the newly released FreeBSD 10.0:

  1. The installer could utilize ZFS for root partitions.
  2. The pc-sysinstall script and customizable configuration files were superior for automatic, unattended installation.
  3. beadm was integrated with the GRUB boot loader used by TrueOS, and selecting a different boot kernel/userland was trivial.
  4. update/upgrade manager
  5. Life Preserver (management tool for ZFS snapshots and replication)
  6. Warden (Jail management)
  7. Sane[r] defaults

With the release of FreeBSD 11.0 and the uncertain future of TrueOS, now based on 12.0-CURRENT (even worse, there are no updates for the PC-BSD 10 release branch), we are re-evaluating the situation.

FreeBSD 11.0 is now fully capable of booting from a ZFS mirror. We ended up not taking full advantage of pc-sysinstall. beadm is usable on vanilla FreeBSD, and even TrueOS abandoned GRUB as its boot loader. The update/upgrade manager is worthless if there are no new builds in the PC-BSD repos. Life Preserver turned out to be buggy and unusable, so we went the way of zfsnap. Warden was unexpectedly dropped in favor of iocage, which in turn became abandonware when the original developer started rewriting it in a different language.

We still believe that TrueOS has saner defaults (LibreSSL instead of OpenSSL being one), but we were bitten by PC-BSD custom configuration files multiple times.

Building the Infrastructure

As promised earlier, in this section we describe how we built our infrastructure using PC-BSD, ZFS, jails, and iocage.

Why Jails

A FreeBSD jail is a glorified chroot(2), change root directory. This creates a "safe" environment, separate from the rest of the system at least from the system administration perspective if not from the security point of view. A similar user-land virtualiser inspired by FreeBSD jails, called sysjail, was attempted on OpenBSD and NetBSD but abandoned after the discovery of fundamental security flaws in the design of systrace. FreeBSD jails are not free of security flaws but are a relatively safe and cheap (compared to full-blown OS virtualization) way to separate and run multiple applications on the same hardware host. Jails alone probably would not have been a sufficient reason for us to use them to run a Git server, but combining them with ZFS offered advantages over any other available solution.

Why ZFS for Jails

The advantage of using ZFS datasets for jails is pretty obvious to seasoned ZFS users. ZFS is both a logical volume manager and a file system in one. As a logical volume manager, ZFS offers some measure of data protection and high availability in the face of simple hardware failures like dead hard drives. As a file system it includes protection against data corruption but, most importantly from the end user's point of view, it offers the ability to roll back recent changes to the file system and data via snapshots. It is also easy to back up using remote replication. Using ZFS for hot migration of a guest to a different server and for creating failover mirrors is also pretty straightforward. Finally, ZFS cloning offers a safe and painless way to experiment with and upgrade complex software.
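
The snapshot, rollback, and clone operations mentioned above can be sketched with the standard zfs(8) subcommands. This is a minimal illustration, not the lab's actual procedure: the dataset name tank/jails/git, the clone name, and the snapshot name are hypothetical, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: the dataset and snapshot names below are hypothetical.
DATASET="${DATASET:-tank/jails/git}"
DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Take a named, read-only snapshot of the dataset.
take_snapshot() { run zfs snapshot "$DATASET@$1"; }

# Discard all changes made since the given (most recent) snapshot.
roll_back() { run zfs rollback "$DATASET@$1"; }

# Create a writable clone of a snapshot, e.g. to rehearse an upgrade.
clone_for_testing() { run zfs clone "$DATASET@$1" "$2"; }

take_snapshot before-upgrade
clone_for_testing before-upgrade tank/jails/git-test
roll_back before-upgrade
```

Note that zfs rollback only goes back to the most recent snapshot unless -r is given, in which case the later snapshots are destroyed.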

Jail management with iocage

iocage installation and configuration

This step was needed only before Warden was abandoned:

pkg install iocage

In the next step we tell iocage to download an image of FreeBSD

iocage fetch
Supported releases are: 
Please select a release [10.3-RELEASE]: 

Creating the first jail

We are now ready to create a jail which will host our Git repositories

iocage create ip4_addr="igb0|"

Before we start the jail we want to set some properties

iocage set

and bypass some known issues with long jail names, which is very important if you are going to recover files from ZFS snapshots

iocage set hack88=1

Starting jails automatically

To allow iocage to start your jails at boot time

echo 'iocage_enable="YES"' >> /etc/rc.conf

Set the jail to start automatically at boot

iocage set boot=on

Since we run multiple jails on this host we can prioritize

iocage set priority=1

We are finally ready to fire up our jail and start working on the Git server

iocage start

To access the jail we simply do

iocage console
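
The property and start commands above can be collected into one small script. This is only a sketch under assumptions: "gitjail" is a hypothetical tag, and the script assumes that iocage takes the jail's UUID or tag as the trailing argument of set and start, which you should confirm against the iocage manual page for your version. By default it prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch only: "gitjail" is a hypothetical tag; use your jail's UUID or tag.
JAIL="${JAIL:-gitjail}"
DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Apply the properties discussed in the text, then start the jail.
# Assumption: the jail identifier is the last argument of each command.
configure_and_start() {
    run iocage set hack88=1 "$1"
    run iocage set boot=on "$1"
    run iocage set priority=1 "$1"
    run iocage start "$1"
}

configure_and_start "$JAIL"
```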

Jailed Git server

Just add the package

pkg install git

and add the user

pw useradd git -m

Our Git repositories will be in /home/git. The reason for this choice will become obvious once it is clear that we want to recreate a private GitHub-like installation. At this point we could create repos with

su - git
git init

but please don't do that. That would only complicate later integration with the web interface.

GitHub with Gogs

As we said in the introduction, GitHub is an amazing platform, but it comes with a caveat: it can't be self-hosted. It is true that you can have private repositories, but in our case that is not an option. Putting cutting-edge research code used by some of the major U.S. agencies on a third-party server is a quick one-way ticket to a federal prison.

Browsing Git repositories on the web is not a very tall order. What we wanted in the Auton Lab was integrated bug tracking, a wiki, and possibly continuous integration. All of that is possible with the real GitHub.

There are two GitHub alternatives which can be self-hosted: GitLab and Gogs. We tested both. GitLab essentially requires installation of all Ruby gems and uses multiple databases, one of which is Redis. That is not something anybody in our lab is familiar with. Our first attempt to install and configure GitLab, which was not in the ports tree at the time due to multiple issues, made me cry. I spent several days and gave up. We then tested a TurnKey Linux image; it actually worked pretty well but was not light on resources. GitLab seemed at the time to be a more mature project than Gogs, which is written in Go, and it also has better integration with Jenkins. However, Gogs is significantly simpler to install and configure. After careful consideration and talking to the Gogs developers, we decided to go that route.


First we will install some software packages. Our design choices are reflected in the palette of software we are installing. We also have no special needs, so no compiling from ports is necessary. However, upgrading packages and repositories first is desirable

pkg upgrade
pkg install go git gcc postgresql94-client postgresql94-server nginx

Running PostgreSQL in a jail

Running PostgreSQL in a jail is tricky. There are shared memory issues which should have been resolved by setting

echo 'jail_sysvipc_allow="YES"' >> /etc/rc.conf
echo 'security.jail.sysvipc_allowed=1' >> /etc/sysctl.conf

on the host machine or possibly adding

echo 'security.jail.sysvipc_allowed=1' >> /etc/sysctl.conf

in the jail itself. I could not get this to work, even after putting an additional

echo 'jail_example_parameters="allow.sysvipc=1"'>> /etc/rc.conf

on the host machine. Finally I gave up and ended up doing

jail -m jid=3 allow.sysvipc=1

on the jail host, where jid=3 should match the jail ID. We are now ready to run PostgreSQL in the jail.
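
Because jail IDs change between restarts, the ID passed to jail -m can be looked up by name instead of being hard-coded. The helper below is a sketch of that idea using jls(8), which can print a single parameter of a named jail. The function name allow_sysvipc is made up for illustration, and the jail name you pass is whatever jls reports on your host.

```shell
#!/bin/sh
# Sketch only: allow_sysvipc is a hypothetical helper, run on the jail host.

# Look up the numeric jail ID for a jail name and allow SysV IPC in it.
# The change applies to the running jail only and is lost when it restarts.
allow_sysvipc() {
    name="$1"
    jid="$(jls -j "$name" jid)" || return 1
    jail -m jid="$jid" allow.sysvipc=1
}
```

Run jls without arguments to list the running jails and their names, then call allow_sysvipc with the name of the jail that hosts PostgreSQL.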

Creating a Database Cluster

To have PostgreSQL start when the server does, you need to add postgresql_enable="YES" to the /etc/rc.conf file:

echo 'postgresql_enable="YES"' >> /etc/rc.conf

Before we start using PostgreSQL we need to initialize the database cluster. By default the cluster will be placed in the /usr/local/pgsql/data directory.

service postgresql initdb

One should see output like this

The files belonging to this database system will be owned by user "pgsql".
This user must also own the server process.

The database cluster will be initialized with locale "C".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /usr/local/pgsql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
creating template1 database in /usr/local/pgsql/data/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/local/bin/postgres -D /usr/local/pgsql/data
    /usr/local/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start

Before we start the database we edit /usr/local/pgsql/data/pg_hba.conf

# PostgreSQL Client Authentication Configuration File
# ===================================================
# Refer to the "Client Authentication" section in the PostgreSQL
# documentation for a complete description of this file.  A short
# synopsis follows.
# This file controls: which hosts are allowed to connect, how clients
# are authenticated, which PostgreSQL user names they can use, which
# databases they can access.  Records take one of these forms:
host      all       all  trust
# (The uppercase items must be replaced by actual values.)
# The first field is the connection type: "local" is a Unix-domain
# socket, "host" is either a plain or SSL-encrypted TCP/IP socket,
# "hostssl" is an SSL-encrypted TCP/IP socket, and "hostnossl" is a
# plain TCP/IP socket.
# DATABASE can be "all", "sameuser", "samerole", "replication", a
# database name, or a comma-separated list thereof. The "all"
# keyword does not match "replication". Access to replication
# must be enabled in a separate record (see example below).
# USER can be "all", a user name, a group name prefixed with "+", or a
# comma-separated list thereof.  In both the DATABASE and USER fields
# you can also write a file name prefixed with "@" to include names
# from a separate file.
# ADDRESS specifies the set of hosts the record matches.  It can be a
# host name, or it is made up of an IP address and a CIDR mask that is
# an integer (between 0 and 32 (IPv4) or 128 (IPv6) inclusive) that
# specifies the number of significant bits in the mask.  A host name
# that starts with a dot (.) matches a suffix of the actual host name.
# Alternatively, you can write an IP address and netmask in separate
# columns to specify the set of hosts.  Instead of a CIDR-address, you
# can write "samehost" to match any of the server's own IP addresses,
# or "samenet" to match any address in any subnet that the server is
# directly connected to.
# METHOD can be "trust", "reject", "md5", "password", "gss", "sspi",
# "ident", "peer", "pam", "ldap", "radius" or "cert".  Note that
# "password" sends passwords in clear text; "md5" is preferred since
# it sends encrypted passwords.
# OPTIONS are a set of options for the authentication in the format
# NAME=VALUE.  The available options depend on the different
# authentication methods -- refer to the "Client Authentication"
# section in the documentation for a list of which options are
# available for which authentication methods.
# Database and user names containing spaces, commas, quotes and other
# special characters must be quoted.  Quoting one of the keywords
# "all", "sameuser", "samerole" or "replication" makes the name lose
# its special character, and just match a database or username with
# that name.
# This file is read on server startup and when the postmaster receives
# a SIGHUP signal.  If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect.  You can
# use "pg_ctl reload" to do that.

# Put your actual configuration here
# ----------------------------------
# If you want to allow non-local connections, you need to add more
# "host" records.  In that case you will also need to make PostgreSQL
# listen on a non-local interface via the listen_addresses
# configuration parameter, or via the -i or -h command line switches.

# CAUTION: Configuring the system for local "trust" authentication
# allows any local user to connect as any PostgreSQL user, including
# the database superuser.  If you do not trust all your local users,
# use another authentication method.

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
local   all             all                                     trust
# IPv4 local connections:
host    all             all               trust
# IPv6 local connections:
host    all             all             ::1/128                 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
#local   replication     pgsql                                trust
#host    replication     pgsql            trust
#host    replication     pgsql        ::1/128                 trust

Create users and databases

We log into the pgsql account created when we installed PostgreSQL.

su - pgsql

in order to create the PostgreSQL user account and database which will be used by Gogs. Connect to the template1 database

$ psql -h localhost -p 5432 -d template1

When logged into the database:

$ psql -h localhost -p 5432 -d template1
psql (9.4.9)
Type "help" for help.

# Create a user for Gogs
template1=# CREATE USER git CREATEDB;

# Create the Gogs production database & grant all privileges on database
template1=# CREATE DATABASE gogs_production OWNER git encoding='UTF8';

# Quit the database session
template1=# \q

Then type exit to drop back to the root user. Try connecting to the new database with the git user and set the password:

su - git
$ psql -d gogs_production
psql (9.4.9)
Type "help" for help.

gogs_production=> ALTER USER git PASSWORD 'yEfa8pes';
gogs_production=> \q

Gogs installation

Gogs has not yet hit the ports tree and it is under very active development. We will compile Gogs from source. The fact that we opted for PostgreSQL instead of SQLite complicates our installation somewhat but also has certain benefits.

First we set up the Go environment (locally and permanently):

echo 'GOPATH=$HOME/go; export GOPATH' >> ~/.profile

We now fetch the Gogs sources using the Go package manager. Notice that the CC=gcc setting is required to force the build tool to use GCC instead of the LLVM system compiler (you can use CC=gcc54 if you want to try a newer version of GCC). I have not attempted building Gogs with LLVM.

CC=gcc go get -u --tags postgresql

Note that the tag reflects my choice of PostgreSQL for our database. For most small installations with a privately accessible Gogs server, SQLite will be sufficient. PostgreSQL does offer some measure of protection for public instances.

Note that a new directory named go is created in /home/git. The directory tree is pretty deep, so I decided to create a symbolic link

ln -s go/src/ gogs

Building Gogs is now as easy as

cd gogs
CC=gcc go build --tags postgresql

Gogs configuration

The default configuration file is


We do not want to alter that file, so we extend it by creating a custom custom/conf/app.ini.

mkdir -p custom/conf
vi custom/conf/app.ini

Note that if we don't create the file, the first time we run gogs web the web interface will run an installation script which will populate the file from our input. We prefer to write the file by hand. This is what our configuration file looks like

APP_NAME = Auton Lab Go Git Service
RUN_USER = git
RUN_MODE = prod

DB_TYPE  = postgres
HOST     =
NAME     = gogs_production
USER     = git
PASSWD   = yEfa8pes 
SSL_MODE = disable
PATH     = data/gogs.db

ROOT = /home/git/gogs-repositories

DOMAIN       =
HTTP_PORT    = 3000
ROOT_URL     =
DISABLE_SSH  = false
SSH_PORT     = 2222

ENABLED = true
; Buffer length of channel, keep it as it is if you don't know what it is.
; Name displayed in mail title
; Mail server
; Disable HELO operation when hostname are different.
; Mail from address, RFC 5322. This can be just an email address, or the `"Name" <>` format
FROM = `"Predrag Punosevac" <>`
; Mailer user name and password
USER = gogsmaster 

ENABLE_CAPTCHA         = false



MODE      = file
LEVEL     = Info
ROOT_PATH = /usr/home/git/go/src/

SECRET_KEY   = mPa644nQpSIMqfm

The option DISABLE_REGISTRATION = false should not be changed to DISABLE_REGISTRATION = true until we have created at least one user account. The first user account will automatically have administrative privileges. For production purposes in my organization we don't allow registration.

Running Gogs

Since Gogs is not in ports and I didn't want to create a daemon for it, we decided simply to use tmux to start Gogs manually

cd /home/git/gogs
./gogs web

Then detach your tmux session

[detached (from session 0)]
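
The manual start-and-detach steps can also be done in one go, since tmux can create a session that is detached from the beginning. The following is a small sketch of that approach. The session name "gogs" and the function name start_gogs are hypothetical; the path is the one used in the text.

```shell
#!/bin/sh
# Sketch only: the session name "gogs" is a hypothetical choice.
SESSION="${SESSION:-gogs}"

# Start "gogs web" in a detached tmux session unless one already exists.
start_gogs() {
    if tmux has-session -t "$SESSION" 2>/dev/null; then
        echo "session $SESSION already running"
        return 0
    fi
    # -d creates the session detached, -s names it.
    tmux new-session -d -s "$SESSION" 'cd /home/git/gogs && ./gogs web'
}
```

Run start_gogs as the git user, and use tmux attach -t gogs to look at the Gogs console output later.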

A short description of how to create repositories in Gogs belongs here. Earlier I mentioned that it is easier to create Git repos from the Gogs interface than to use git init and then import them into Gogs.

Backup and recovery

Backing up and recovering ZFS datasets is pretty easy. This was a major reason we opted to run our Git and Gogs services in a FreeBSD jail.

Obviously the thing we care about the most is our code. Our code is stored in /home/git/gogs-repositories and a simple

iocage snapshot

of our jail instance is sufficient to create a snapshot which can be rolled back or sent with the zfs send command. On the other hand, the user information is stored in the PostgreSQL database. It would be easy to recreate for a medium-sized organization like ours, but why waste the time? We dump the PostgreSQL database via cron just before we take the snapshot

# Order of crontab fields
# minute        hour    mday    month   wday    command

# Added by Predrag Punosevac
# Database dump
30      4       *       *       *       /usr/local/bin/pg_dump --username=git gogs_production > /db-dump/`date`
35      4       *       *       *       /usr/local/bin/detox /db-dump/*
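
An alternative to cleaning up the file names with detox afterwards is to generate a name without spaces in the first place. Since cron treats an unescaped % in the command field specially, the date format is easier to keep in a small wrapper script. The sketch below assumes a hypothetical script location and naming scheme; only the pg_dump invocation and the /db-dump directory come from the crontab above.

```shell
#!/bin/sh
# Sketch only: a hypothetical wrapper, e.g. saved as /usr/local/sbin/gogs-dump.sh
DUMP_DIR="${DUMP_DIR:-/db-dump}"

# Build a dump file name from a database name and a timestamp.
dump_path() {
    echo "$DUMP_DIR/$1-$2.sql"
}

# Dump one database into a timestamped file with no spaces in its name.
dump_db() {
    stamp="$(date +%Y-%m-%d_%H%M%S)"
    /usr/local/bin/pg_dump --username=git "$1" > "$(dump_path "$1" "$stamp")"
}

# Only dump when a database name is given on the command line.
if [ "$#" -gt 0 ]; then
    dump_db "$1"
fi
```

The crontab entry would then call the wrapper with gogs_production as its only argument, and the detox entry would no longer be needed.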

I am taking multiple ZFS snapshots throughout the day, essentially versioning the Git repositories and Gogs themselves. On the machine which hosts the jail I have the following in cron

# Order of crontab fields
# minute        hour    mday    month   wday    command

# Added by Predrag Punosevac
# Backup Gogs and Git repos
15      1       *       *       *       /usr/local/sbin/iocage snapshot
45      2       *       *       *
45      4       *       *       0       /root/
15      10      *       *       *       /usr/local/sbin/iocage snapshot
15      14      *       *       *       /usr/local/sbin/iocage snapshot
45      18      *       *       *       /usr/local/sbin/iocage snapshot
# Delete expired snapshots
15      19      *       *       *       /root/
15      20      *       *       *       /root/
15      21      *       *       *       /root/
15      22      *       *       *       /root/

I will show later the simple scripts I use for remote replication and address the issue of incremental versus full ZFS send. In our lab we use one remote host for incremental ZFS sends and another one for full ZFS replication, but the latter only once a week.
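
Until those scripts are written up, the difference between the two kinds of send can be sketched as follows. This is not the lab's replication script: the dataset, remote host, and target names are hypothetical, and the functions only print the pipelines so they can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only: dataset, remote host, and target names are hypothetical.
DATASET="${DATASET:-tank/jails/git}"
REMOTE="${REMOTE:-backup.example.org}"
TARGET="${TARGET:-backup/jails/git}"

# Full replication: send one complete snapshot; -F lets the receiver
# roll its copy back so the incoming stream can be applied.
full_send_cmd() {
    echo "zfs send $DATASET@$1 | ssh $REMOTE zfs receive -F $TARGET"
}

# Incremental replication: send only the changes between two snapshots.
# The receiver must already hold the first snapshot.
incr_send_cmd() {
    echo "zfs send -i $DATASET@$1 $DATASET@$2 | ssh $REMOTE zfs receive $TARGET"
}

full_send_cmd weekly
incr_send_cmd monday tuesday
```

An incremental stream is much smaller, but it depends on an unbroken chain of snapshots on both sides, which is one reason to keep a separate, periodic full replica as well.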

Mirroring and failover

My ZFS targets are the jail datasets themselves. The only thing I need to do to have a hot spare mirror is to adjust the network interface and IP address (I have a separate internal DNS record for the mirror) and, most importantly, the hostid on which the jail runs. Once I have that I can automatically start such a jail. I will write up the details.

One caveat, and the reason we are still not doing this in the lab, is the problem with running PostgreSQL in the jail (SQLite would be easier) and automatically starting the gogs web daemon. Once there is a port of Gogs, that should be trivial.

Safe upgrading with ZFS clone

I have done this. These are just notes for me; the section is still to be written. Clone the jail which runs the Gogs server. It doesn't even have to be shut down. Make sure you adjust permissions so that you can start the PostgreSQL server. Get into the cloned jail. Remove the PostgreSQL pid file if you took the clone of a hot jail. Make sure you edit the PostgreSQL configuration files appropriately to match the new host. Start PostgreSQL.

Instead of upgrading the existing Gogs installation, do a fresh installation as follows

unlink /home/git/gogs
mv /home/git/go /home/git/go-old
CC=gcc go get -u --tags postgresql
ln -s go/src/ gogs
cd gogs
CC=gcc go build --tags postgresql

Before you can run Gogs again you have to recover the following directories from the old functional installation


Now you can

cd /home/git/gogs
./gogs web

Then detach your tmux session

[detached (from session 0)]

If you like your new session you can adjust the hostname and IP address, promote this jail to master, and safely destroy the original jail. Because our ZFS snapshot scripts are not portable, we are still not doing it that way.

Reverse proxy with Nginx

Gogs comes with a built-in web server which listens on port 3000 by default. Instead we use Nginx as a reverse proxy to enable web access throughout our lab on port 80. Running nginx is fairly trivial

echo 'nginx_enable="YES"' >> /etc/rc.conf

Edit /usr/local/etc/nginx/nginx.conf. This is what ours looks like

user  nobody;
worker_processes  8;

#error_log  logs/error.log;
#error_log  logs/error.log  notice;
#error_log  logs/error.log  info;

#pid        logs/;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    #log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
    #                  '$status $body_bytes_sent "$http_referer" '
    #                  '"$http_user_agent" "$http_x_forwarded_for"';

    #access_log  logs/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  650;

    #gzip  on;

    server {
        listen       80;

        #charset koi8-r;

        #access_log  logs/host.access.log  main;

        location / {
            proxy_pass http://localhost:3000;
            proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
            client_max_body_size 40M;
            client_body_buffer_size 256k;
            proxy_set_header        Host            $host;
            proxy_set_header        X-Real-IP       $remote_addr;
            proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}

We are now ready to start nginx daemon

service nginx start

Functional monitoring, remote telemetry, and logging

Integration with LDAP

We have not tried this. For now all our accounts are locally created.

Additional considerations

Continuous integration with Jenkins

We are using Jenkins for continuous integration services. Jenkins runs in a different jail instance.

Jenkins in the jail

This is trivial. Just create another jail instance and

pkg install jenkins

and then start the jenkins daemon. Explain configuration options here. The really hard thing is getting the various plugins to do things for us. The good thing is that we are using a limited number of plugins.

Gogs and Jenkins integration


After I wrote the first draft of this article, another self-hosted GitHub alternative, Gitblit, was brought to my attention. Since it is written in Java I had no interest in looking at it.

As I wrote this how-to I wanted to test each step one more time. However, I had a hard time reproducing the build, as I got

modules/setting/setting.go:24:2: code in directory
/home/git/go/src/ expects import

This is when I discovered that Gogs had already been forked due to internal politics and fighting among developers. I am not sure I have the guts to try Gitea.