Working with git submodules: tips ‘n tricks

Some people hate it, nobody loves it, but it’s a good way to split codebase in different components/repositories.

I have been using submodules a LOT for puppet development (all those puppet modules…). Some people might propose alternatives (puppet-tree, librarian), but I rather stick with what I already know.

Dealing with submodules in git is mainly painful because the parent repository doesn’t really know/care what is inside the submodule. He only keeps track of the hash that links the commit. Another downside is that your submodules mostly always end up in a detached state and after checking out a branch, you kinda forget on what commit the parent repository has.

You can put them in your ~/.gitconfig file in the alias section:

git tags

Little different from the default git tag: Uses sort to do natural sort with version numbers. Note, your sort version must be new enough.

tags = !sh -c 'git tag | sort -V'

git update

Run in the root of the ‘parent’ repository

update = !sh -c 'git pull && git fetch --tags && git submodule update --recursive && git submodule foreach git tag -f parent-$(git describe --contains --all HEAD)'
  1. Pull from the remote
  2. Fetch remote tags
  3. Update submodules (recursive)
  4. Create a tag on each submodule called parent-BRANCH with BRANCH being the branch the current parent repository is on

git noparent

Removes the parent-* tags from all repositories (recursive).

noparent = !sh -c 'git tag -d $(git tag | grep ^parent ) &&  git submodule foreach git noparent'
  1. Remove all tags matching ^parent
  2. Do the same for each submodule (recursive)

git safepush

Remove parent tags, make sure we don’t create a merge commit and push.

safepush = !sh -c 'git noparent && git pull --rebase && git push && git push --tags'
  1. Remove parent tags, we don’t want to push them by accident
  2. Fetch remote changes and rebase
  3. Push push push!

git pushtags

Remove parent tags and push all the tags.

pushtags = !sh -c 'git noparent && git push --tags'
  1. Remove the parent tags we have set
  2. Push tags

Bootstrap your home folder. Puppet!

Need to log in to a lot of different systems but hate setting your environment up each time? Keeeeeeeep adding your ssh key whenever you log in the first time? Or worse, regret adding it the previous time you logged in over and over?
Feel like its dirty to put your personal setup in the company wide puppetmaster?

I use this.

echo $( wget -q -O - http://YOUR_URL_HERE/homedir.pp; echo "class {'homedir::jan': gid => '10001',}" ) | puppet apply

Aahhhhhh, one copy-paste-able to rule them all.

Puppet: notes on using defined() and class scope.

I was debugging a little problem just today and figured out that defined(Class[‘something’]) would return true if in the current scope, there is a class something.

class foo {
  notify{'I am class foo': }
class bar::foo {
  notify {'I am class bar::foo': }
  if ! defined(Class['foo']) {
    notify {'foo was not declared yet. do it!': }
    include foo
include bar::foo

This results in

Notice: I am class bar::foo
Notice: /Stage[main]/Bar::Foo/Notify[I am class bar::foo]/message: defined 'message' as 'I am class bar::foo'

Not quite what I expected. I added some debug statements in the defined function and figured out that he resolved Class[‘foo’] to Class[‘bar::foo’].
After this, It was pretty easy to fix. Also note that you need to add the ‘::’ when including foo too!

class foo {
  notify{'I am class foo': }
class bar::foo {
  if ! defined(Class['::foo']) {
    notify {'foo was not declared yet. do it!': }
    include ::foo
Notice: foo was not declared yet. do it!
Notice: I am class foo
Notice: I am class bar::foo

HURRAY! So, as a general rule, always ::scope everything where you can ;)

Vagrant: Using the shell provider for running puppet.

The case for…

Why would you use the shell provider instead of the native puppet support?

Sometimes you want to tweak your base-box before running puppet, in that case, using a shell script might be a good idea. I started using the shell provider for deploying a puppetmaster. This way, I can initially bootstrap the puppetmaster using puppet apply and then have further configuration done by letting further configuration be done by just running puppet agent like I would in an actual environment.

The second advantage is that I sync my complete puppet tree to /etc/puppet vagrant box, making the differences with an actual deploy even smaller. If you need custom configuration files, you can use the proper file paths while developing and/or put them in your puppet folder.

Vagrant and Virtualbox: a debugging story

Recently, I ran into VirtualBox bug #10077:APIC Bug. I wanted to help out so I had to enable console logging to my machine to give useful output. For that same reason, I started setting console=ttyS0 console=tty0 ignore_loglevel to kernel options in grub.cfg on my vagrant-baseboxes.

The apic bug did not occur on every startup so I had to do a lot of them before I got it right. This tempted me to get the console redirect feature of VirtualBox working from within a Vagrantfile. Well, in the end, it’s not that hard…

config.vm.define :base6 do |base_config|
    base_config.vm.customize [
      "modifyvm", :id, "--name", "CentOS 6 x86_64 Base",
      "--uart1", "0x3F8", "4", 
      "--uartmode1", "file", "/tmp/base6-console.log"

Note that this file gets overwritten every time you vagrant up your box. So if you want time stamped logs, you’ll have to introduce some magic (do let me know if you do ;))

Why not to use Puppet::Parser::Functions.autoloader.loadall

Recently (about 5 minutes ago), I was writing a custom puppet-function to offload some puppet magic. In short: I’m writing a wrapper around create_resources so I can keep syntax for the end-users of my module crispy clean. This means I need the create_resources function to be available in my custom function. This can be done by using Puppet::Parser::Functions.autoloader.loadall as suggested on the puppetlabs custom modules guide. Unfortunately, when using #loadall, all functions will be loaded.

Why unfortunately? In my case: A function defined in puppet-foreman depends on the rest-client gem and I do not have this installed. Some people might say: Just install the gem and be done with it! This is hardly a proper solution. The way to go would to be only include the function I really  need, being create_resources.

And here is how:

Puppet::Parser::Functions.autoloader.load(:create_resources) unless Puppet::Parser::Functions.autoloader.loaded?(:create_resources)

This will basically load the create_resources function after checking that it has not been loaded before. This (the function already being loaded) could be the case if you properly depend on puppetlabs-create_resources in your manifests. Side note: I added a small dummy class so my modules can depend on this function being available.

This has resolved my issues with #loadall, but if I ever needed to include another function that DOES use #loadall, I’ll be screwed all over again. So (pretty) pls, don’t use #loadall.

Puppet Module Patterns


I’ve used puppet quite intensively since a couple of months (about 4 I would guess). Before that, I’ve played with it, change something here and there. But quite not as much as now. I’ve used several puppet modules from wherever google leads me, roamed github, inherited a few from colleagues and created several from scratch. While doing so, I saw a lot of stuff I disliked and learned a lot on how we I can (ab)use puppet to do what I want it to do. Over those last months, I have grown my set of ideas on how a puppet module should look. So, before every statement I make, you should probably add ‘IMHO’.


Reducing vagrant box size

Here are some tricks I use to make my vagrant boxes as small as possible:


Booting in single user mode:

I boot in single user mode since it will prevent running services that could output logs. I do this because I zero out all my logs before packaging the box.


After updating any package, run yum clean (or the apt equivalent).

When booted in single user mode, don’t forget to start-up your network before updating.

When updating kernels, install the kernel packages, reboot and remove the old kernel packages that are no longer in use. Remember to re-install the VirtualBox add-ons too after a kernel update.


After doing whatever you need to do with the box, I do some rather nasty stuff to make sure the box uses as little as possible place. If you are using a RAW hard-disks, these might be a bad idea (stuff gets BIG).

  • Zero out all remaining unused disk space
  • Zero out the swap
  • Clear out all log files (I just make them empty, I do NOT delete them)

(You can find this script – or an older version in /root/tools/ on my newer vagrant boxes.)

cat - << EOWARNING
WARNING: This script will fil up your left over disk space.
You should NOT do this on a running system.
This is purely for making vagrant boxes damn small.
Press Ctrl+C within the next 10 seconds if you want to abort!!
sleep 10;
echo 'Cleanup bash history'
[ -f /root/.bash_history ] && rm /root/.bash_history
[ -f /home/vagrant/.bash_history ] && rm /home/vagrant/.bash_history
echo 'Cleanup log files'
find /var/log -type f | while read f; do echo -ne '' > $f; done;
echo 'Whiteout root'
count=`df --sync -kP / | tail -n1  | awk -F ' ' '{print $4}'`; 
let count--
dd if=/dev/zero of=/tmp/whitespace bs=1024 count=$count;
rm /tmp/whitespace;
echo 'Whiteout /boot'
count=`df --sync -kP /boot | tail -n1 | awk -F ' ' '{print $4}'`;
let count--
dd if=/dev/zero of=/boot/whitespace bs=1024 count=$count;
rm /boot/whitespace;
swappart=`cat /proc/swaps | tail -n1 | awk -F ' ' '{print $1}'`
swapoff $swappart;
dd if=/dev/zero of=$swappart;
mkswap $swappart;
swapon $swappart;

Furthermore – about this script – USE IT AT YOUR OWN RISK