Puppet Module Patterns

INTRODUCTION

I’ve been using puppet quite intensively for a couple of months now (about 4, I would guess). Before that, I played with it, changing something here and there, but not nearly as much as now. I’ve used several puppet modules from wherever google led me, roamed github, inherited a few from colleagues and created several from scratch. While doing so, I saw a lot of stuff I disliked and learned a lot about how I can (ab)use puppet to do what I want it to do. Over those last months, I have grown my own set of ideas on how a puppet module should look. So, before every statement I make, you should probably add ‘IMHO’.

WHO THE F.

Why the hell would this guy (me) have anything to say about puppet modules? Let me give some background first.

I’m now an Open Source Consultant. I’ve been (in order) a Java programmer, sysadmin, Drupal developer and now a sysadmin again (doing devopsy things). For the last 3 positions I worked for (and still work for) Inuits, an Open Source company in Belgium. Currently, I’m positioned at UnifiedPost (about 100 people, but thinking big!). I help out with daily maintenance (and there is plenty) and am starting to adopt puppet as much as possible. Puppet was already in use at UP (UnifiedPost), but knowledge was rather thin when I came in. They did however manage some hosts with it (about 300-400). I dove into the puppet code rather fast and stumbled upon several patterns that increased the pressure on my mouse heavily. Even modules I grabbed from the net (whatever the source) made my grip firmer.

PROBLEMS

Before trying to fix the problem, we should find out exactly what bothers me about all these modules I lay my eyes on. I’ll try to keep it organized.

  1. Modules are not classes!
  2. Too hard to use by non-developers
  3. Poor interaction with third-party modules
  4. Not versioned
  5. Not pretty at all, to downright f_ugly

1. Modules are not classes!

Although a module consists of several classes, it should not behave like one.

An example to clarify what I mean: the accounts module. (I’m sure this is the case for many organizations that have an accounts module.)

I can think of several valid reasons why you would have one. (Keep reading nevertheless!) What does our accounts module contain? A definition to ease the use and do some customization (set defaults, create some files, …). We also find a list of users, passwords and SSH authorized keys. This (specific) user information does not belong in a module. It should either be in a class (below the manifests folder) or stored externally. From my point of view: nodes use classes. They register the kind of machine and define what should be installed. Classes include modules and change settings, possibly parameterized so we can keep things node specific. All module parameter values reside in the node or the class(es) it includes.
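
A minimal sketch of that split, assuming a hypothetical accounts::user definition, an admins class and a node name I made up for illustration (none of these come from our actual tree). The module only carries the reusable definition; the user data lives in a site class that a node includes.

```puppet
# modules/accounts/manifests/user.pp
# Reusable definition: defaults and logic, but no real user data.
# (accounts::user and its parameters are hypothetical names.)
define accounts::user (
  $shell   = '/bin/bash',
  $ssh_key = undef
) {
  user { $name:
    ensure     => present,
    shell      => $shell,
    managehome => true,
  }

  # Only manage a key when one was passed in.
  if $ssh_key {
    ssh_authorized_key { "${name}-key":
      ensure  => present,
      user    => $name,
      type    => 'ssh-rsa',
      key     => $ssh_key,
      require => User[$name],
    }
  }
}

# manifests/classes/admins.pp
# Site-specific class, outside the module: this is where the data goes.
class admins {
  accounts::user { 'alice':
    ssh_key => 'AAAAB3placeholderkey',  # placeholder, not a real key
  }
}

# manifests/nodes.pp
node 'web01.example.com' {
  include admins
}
```

With this layout the accounts module can be shared or published without leaking who works where, and adding a user never touches module code.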

2. Too hard to use by non-developers

This brings us to our next point. Can I just grab your module and start using it? Or do I need to weed out hardcoded strings, change host names or edit templates? Do I need to understand the complete way it works ‘under the hood’? I have written a short post recently, expressing my feelings about this. Bottom line: if I need to edit any file within your module to get it working the way I want it to, there is something wrong with it. Sure, features and/or support might be missing, but if my $::operatingsystem is supported, I should get it working without touching any of the module code.

3. Poor interaction with third-party modules

I have reused (or attempted to reuse) several modules found on github and always had the same problem: they do not play well with our current puppet tree. The best example of this is probably the apache (or httpd) module. Almost any puppet module that depends on apache being installed comes with its own implementation and/or dependency. Most companies already have an apache module and change the new module to work properly together with theirs. There goes upstream support. I have run into this issue with puppet-foreman recently and this will probably be the first big test case for my coding pattern.

4. Not versioned

Most modules you will find only live on one branch, master. Some may have a develop branch, but most of the time there is no telling what version you are using, unless you elevate commit hashes to version numbers. (Using git submodules does this to some extent.) But updating a submodule is always a dangerous thing to do, since there is no way to tell what will break.

Besides the ‘version’ of a module, we also have to take the puppet version into account. I tend to be a cutting-edge user for all my software, but I can easily understand if you don’t for whatever reason. So keeping puppet-modules backwards compatible is a must (is it?).

5. Not pretty at all, to downright f_ugly

Why is properly formatted code important? For anybody else who ever has to change or even use it. This could be a colleague or someone (anyone) else who found your module and wants to improve it (this is where YOU win time, if somebody else does the job for you) or change it. Even if you did not have time to write documentation, most people will have to stroll through your code. Properly formatted code is always a nice-to-have feature then.

REQUIREMENTS

So now that we know what I dislike about puppet modules, let’s try to define something more positive. What is a good baseline for a puppet module?

  1. VCS (This one is pretty obvious, I will not elaborate on this any further.)
  2. Follows style guidelines
  3. Use of centralized parameters / settings
  4. Fully(!) parameterized
  5. Easy and centralized handling of compatibility ($::operatingsystem-ish stuff)
  6. Documented
  7. Releases
  8. Puppet compatibility
  9. Integration: is uniquely identifiable
  10. Easy to extend

1. Style Guidelines

Why? Some valid reasons are that you make fewer mistakes and are more aware of what you are doing. Do you really need double quotes? Using single quotes for static string values will prevent you from forgetting to escape a ‘$’ (commands, anyone?). Always using a default case will make you more aware that more than one distro exists in the world. At least fail when your module is not fit for a certain operating system, to prevent unexpected behavior. If you read through the style guideline, you will see that many of these items are easy to do if you just remember them when you are writing the actual code.
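
A small sketch of both habits, assuming a made-up class name (style_demo) and an arbitrary target file. The single-quoted command keeps Puppet from interpolating $HOME, so the shell expands it instead, and the default case fails loudly on a distro nobody thought about.

```puppet
# style_demo is a hypothetical class, only here to show the two habits.
class style_demo {
  # Single quotes: Puppet leaves $HOME alone, the shell expands it.
  exec { 'record-home':
    command  => 'echo "$HOME" > /tmp/home.txt',
    provider => shell,
    path     => ['/bin', '/usr/bin'],
    creates  => '/tmp/home.txt',
  }

  # Double quotes only where we really interpolate (the fail message).
  case $::operatingsystem {
    'CentOS', 'RedHat': { $web_package = 'httpd' }
    'Debian', 'Ubuntu': { $web_package = 'apache2' }
    default: {
      fail("style_demo does not support ${::operatingsystem}")
    }
  }

  package { $web_package:
    ensure => installed,
  }
}
```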

2. Centralized parameters / settings

Does your module support my distro? Just check the params.pp file. Everything is there. It does not get much easier than this to add support for a new distro.
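
For illustration, a params.pp could look roughly like this. The ntp module name and the variable names are my assumptions for the sketch; the point is that every distro-specific value sits in one case statement.

```puppet
# manifests/params.pp (hypothetical ntp module)
class ntp::params {
  case $::operatingsystem {
    'CentOS', 'RedHat': {
      $package_name = 'ntp'
      $service_name = 'ntpd'
      $config_file  = '/etc/ntp.conf'
    }
    'Debian', 'Ubuntu': {
      $package_name = 'ntp'
      $service_name = 'ntp'
      $config_file  = '/etc/ntp.conf'
    }
    default: {
      # Refuse to guess on an unknown distro.
      fail("ntp: unsupported operating system ${::operatingsystem}")
    }
  }
}
```

Supporting another distro then means adding one more branch here, and nothing else in the module needs to know about it.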

3. Fully Parameterized

This might seem like I’m making the same point twice, but we should differentiate between the general ‘settings’ that configure how our module works, and specific definitions that take parameters. You define where all your vhosts live, but each vhost definition you create also takes parameters. The same rule applies to a definition as to a module: we should be able to use it without having to change anything in the code. Often, this means making more stuff dynamic (parameterized) than you need at the time of writing your module. You can hard-code a ‘Listen 80’ or template it using a $ports parameter.
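
The ‘Listen 80’ case could be sketched like this. The class name, the default path and the use of inline_template (instead of a separate .erb file) are assumptions to keep the example self-contained, and $ports is expected to be an array.

```puppet
# Hypothetical class: writes one Listen line per entry in $ports.
class mymodule::listen (
  $ports       = ['80'],
  $config_file = '/etc/httpd/conf.d/listen.conf'
) {
  file { $config_file:
    ensure  => file,
    content => inline_template("<% @ports.each do |port| %>Listen <%= port %>\n<% end %>"),
  }
}

# Usage from a site class or node, without touching the module:
# class { 'mymodule::listen':
#   ports => ['80', '8080'],
# }
```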

4. Compatibility handling

As an example, my colleague asked me: does your module work on Debian? I was happy to answer: you just need to add support to the params.pp file. That’s all that’s needed (and maybe some extra templates).

5. Documentation

As developers, we know we need to offer easy documentation for the end-user. This is no different when writing a puppet module. It’s always a good thing to keep people out of your code as much as possible, and proper documentation is the first step. I try to write as much as possible, the main reason being that when a colleague asks how to use a module, I can point him to the documentation instead of going over the code myself to remember what exactly is going on. On a side note: ATM, I’m having some trouble with bug #11384. Votes (and, even more, a patch) are welcome ;)

Beside top-level documentation, inline (code) documentation should also be written. Not for the obvious stuff, but when you do something more advanced, explain to a fellow coder (or yourself some weeks later) why and what you are doing.

6. Releases

Puppet modules should also have releases. This would be an easier way of drawing attention when we change the API (or definition parameters) or when we fix a bug. It is also a clear signal when our module is no longer backward compatible with old code (breaking the API). I try to support old code as much as possible, but at some point we will have to weed out the old so it does not clutter the new. Keeping stuff simple/stupid (although I crossed that bridge a looooong time ago) is still a good principle.

7. Puppet Compatibility

We need to know which versions of puppet your module is compatible with. Some features of the puppet language you use might not be available in older releases. You can check the Puppet Language Guide for what was introduced in which version, but there are a lot of other differences that are not that well documented. I’ve been using the create_resources function quite a lot, but it’s only in core puppet for versions 2.7 and up. Luckily, there is a backport for 2.6 on the puppetlabs github.

8. Integration: is uniquely identifiable

To improve compatibility, we must first be able to tell what module we are integrating with. I personally started to use a $modulename and $moduleversion param in the main class of my module. Modulefiles, like the ones puppetlabs requires for the puppet forge, are cool, but we cannot use them in our code. We could write a fact for this so we don’t need to duplicate code. I won’t add this to my to-do list as I already have a way-too-long backlog, but feel free to add it to yours.

With this information, we could do different things based on the module and version we are working with.
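
As a rough sketch of what that could look like: the values ('example-apache', '1.2.0') and the foreman::web class are invented for the example, and versioncmp is the stock puppet function for comparing version strings.

```puppet
# Main class of a hypothetical apache module, identifying itself.
class apache (
  $modulename    = 'example-apache',
  $moduleversion = '1.2.0'
) {
  # ... the actual apache resources would live here ...
}

# Another (hypothetical) module adapting to whatever apache module is present.
class foreman::web {
  include apache

  if $apache::modulename == 'example-apache' and versioncmp($apache::moduleversion, '1.0.0') >= 0 {
    notify { 'foreman-web-integration':
      message => "Integrating with ${apache::modulename} ${apache::moduleversion}",
    }
  } else {
    fail("foreman::web does not know how to work with ${apache::modulename}")
  }
}
```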

9. Easy to extend

In this part, the developer in me is taking over, and it will depend on personal preference a lot more than any of the previous points. A quick example: I wanted to use a conf.d/* configuration style. Even more, for certain configuration files order is important, so we need to prefix files with 00_, 01_, … I could easily have done this for each type of configuration file I want to store there. Instead, I wrote a confd wrapper definition/class that does this for me. It’s a 2 step process: you initialize/set up a conf.d folder and then define your resources within it. I’m realizing now that this should have been a separate module. I have added it to my to-do list. The main advantage is that I can now easily re-implement conf.d style folders without worrying about the logic behind it.
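
To give an idea of the 2 step process, here is a simplified sketch. It is not my actual wrapper: the confd::dir and confd::file names and their parameters are made up for the illustration.

```puppet
# Step 1: set up a conf.d style folder (hypothetical definition).
define confd::dir {
  file { $name:
    ensure  => directory,
    purge   => true,   # drop files puppet does not manage
    recurse => true,
  }
}

# Step 2: drop an ordered file into a folder set up with confd::dir.
define confd::file (
  $dir,
  $content,
  $order = '10'
) {
  file { "${dir}/${order}_${name}":
    ensure  => file,
    content => $content,
    require => Confd::Dir[$dir],
  }
}

# Usage:
# confd::dir { '/etc/myapp/conf.d': }
# confd::file { 'logging':
#   dir     => '/etc/myapp/conf.d',
#   order   => '00',
#   content => "loglevel = info\n",
# }
```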

SOLUTIONS

Quick wins! These go without saying, start using them now.

Check your code for formatting and style.

For this, we have puppet-lint. It checks for the most common problems and reports errors/warnings against the style guide. The tool takes one puppet manifest as argument and displays the errors/warnings it finds. You can easily integrate it with jenkins since the log-format argument has been added.

Documentation.

I suppose most people will have issues with this. Good documentation is essential and not much hard work if you do it right. I prefer to START with documenting what a class will do and implement afterwards. This is a lot like writing tests first and then using them to see if you are writing proper / working code. The danger is of course that you change the internal workings but forget to update your documentation. After each feature I add, I tend to go over the documentation and check that everything is still up to date. Once documentation has been written, you can generate it using puppet doc. To work around some of puppet doc’s ugliness, I wrote a small wrapper script for my Jenkins jobs that does some post processing on the output. See the previous post for that.

Releases

I’ll be quick about this one: Use git-flow.

General / Initial module structure.

For creating your initial puppet module structure, there is always the puppet-module tool. Install it by installing the gem. I have tried using it, but I’m relying on my own bash magic for creating classes.

This is my basic structure I re-use over and over.

  • ./manifests/init.pp
  • ./manifests/params.pp
  • ./manifests/packages.pp
  • ./manifests/setup.pp

One note on these filenames: always try to avoid confusion! I have seen a lot of config.pp classes and params.pp classes where the config.pp actually does configuration of the package on the system while params.pp is for configuring the behavior of the puppet-module. I like setup.pp better than config.pp, since it’s easier to figure out what the class does: It sets up the system! Another good option would be install.pp.
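
A bare-bones sketch of how these four files can hang together. The ntp module name is just an example, and the sketch assumes an ntp::params class (in params.pp) that defines $package_name and $config_file.

```puppet
# manifests/init.pp
# Entry point: takes its defaults from params.pp and pulls in the rest.
class ntp (
  $package_name = $ntp::params::package_name,
  $config_file  = $ntp::params::config_file
) inherits ntp::params {
  include ntp::packages
  include ntp::setup

  # Install first, then set up the system.
  Class['ntp::packages'] -> Class['ntp::setup']
}

# manifests/packages.pp
class ntp::packages {
  package { $ntp::package_name:
    ensure => installed,
  }
}

# manifests/setup.pp
class ntp::setup {
  file { $ntp::config_file:
    ensure => file,
  }
}
```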

OUTRO

I realize these solutions are nowhere near finished, but since FOSDEM 2012 is coming up and I’m running low on time, I wanted to publish this post so anybody can start giving their opinion on the matter before we come to a final out-of-the-box solution most people can relate to. So, actually, this is a big fat TO BE CONTINUED.

Matters we need to discuss:

  • Compatibility handling (both to other modules and puppet)
  • Making modules easy to integrate and/or extend.

 

2 comments on “Puppet Module Patterns”

  1. hello:

    great post. helped me to quickly understand how to structure my module using init.pp, params.pp and so on. thank you.

    i know i can install a package on the current machine using the resource Package{ ”: ensure => installed }. i then need to run a Configuration command from another machine that configures the software that is installed on two machines. so i end up with one manifest on 2 machines to install the software, and then a third manifest to configure the two machines. i do not know how efficient that is.

    instead, i think its better to Install and Configure on the two machines from the third machine. i simply use Exec and yum install from the third machine. i end up with 1 manifest only on the third machine.

    is this okay? does Puppet offer a better solution with any suitable built-in Resources?

    thank you in advance,

    Aaron

    • Hi,

      I guess it would strongly depend on what exactly you are trying to ‘configure’ on the 2 machines but it sounds like it is kinda beating the purpose of using puppet when you rely on too many exec statements. Especially if they are running commands on a remote machine. You can always visit the puppet irc channel on freenode where plenty of experienced brains are available for shooting questions at ;)

      Remember to read the documentation for all available resources that you can use by default (http://docs.puppetlabs.com/references/stable/type.html) and many, many more can be found on github etc.
