Diffing gems in Git

Recently, I’ve been doing some patching on fpm, and today I also started writing some tests for the changes I made. One change required that the gem used for testing in the repository have a bin in it, so I modified the .gemspec and rebuilt the gem.

After building it, and just before pushing, I thought about how it must look to the maintainer when somebody sends a pull request containing a binary blob. As a maintainer, I would like to know what really changed in there without having to dig through the internals myself. How wonderful would life be if we could just use git diff to show the differences?

Here is a way to do just that using .gitattributes and a little shell script:


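misc/gemdiff.sh (must be executable):

#!/bin/sh
# Print the readable parts of a .gem (a tar archive) so git can diff them as text.
# $1 is the path of the file that git hands to the textconv driver.
echo "============== metadata =============="
tar -xOf "$1" metadata.gz 2>/dev/null | gunzip -c 2>/dev/null
echo "============== checksums ============="
tar -xOf "$1" checksums.yaml.gz 2>/dev/null | gunzip -c 2>/dev/null
echo "=============== files ================"
tar -xOf "$1" data.tar.gz 2>/dev/null | tar -xvOzf - 2>&1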
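
The script works because a .gem file is a plain tar archive that holds metadata.gz, data.tar.gz and (in newer gems) checksums.yaml.gz. Here is a minimal sketch that builds a stand-in archive with that layout and reads it back the same way. The file name fake.gem and its contents are made up for illustration; this is not a real gem.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Invented metadata, gzipped like the metadata.gz member of a gem.
printf 'name: example\nversion: 1.0\n' > metadata
gzip -c metadata > metadata.gz

# Invented payload, packed like the data.tar.gz member of a gem.
mkdir -p payload/bin
printf '#!/usr/bin/env ruby\n' > payload/bin/example
tar -czf data.tar.gz -C payload bin/example

# The outer archive is an uncompressed tar.
tar -cf fake.gem metadata.gz data.tar.gz

# Read it back with the same pipelines the diff script uses.
meta=$(tar -xOf fake.gem metadata.gz | gunzip -c)
files=$(tar -xOf fake.gem data.tar.gz | tar -tzf -)
echo "$meta"
echo "$files"
```

The last pipeline lists the member names with -t instead of dumping the file contents with -xvO, which keeps the output short.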


.gitattributes:

*.gem diff=gemdiff


Git configuration (.git/config or ~/.gitconfig):

[diff "gemdiff"]
    textconv = misc/gemdiff.sh

The result looks like this:

diff --git a/spec/fixtures/gem/example/example-1.0.gem b/spec/fixtures/gem/example/example-1.0.gem
index 0241779..f762e52 100644
--- a/spec/fixtures/gem/example/example-1.0.gem
+++ b/spec/fixtures/gem/example/example-1.0.gem
@@ -3,19 +3,17 @@
 name: example
 version: !ruby/object:Gem::Version
   version: '1.0'
-  prerelease: 
 platform: ruby
 - sample author
 bindir: bin
 cert_chain: []
-date: 2012-03-15 00:00:00.000000000 Z
+date: 2014-05-01 00:00:00.000000000 Z
 - !ruby/object:Gem::Dependency
   name: dependency1
   requirement: !ruby/object:Gem::Requirement
-    none: false
     - - ! '>='
       - !ruby/object:Gem::Version
@@ -23,7 +21,6 @@ dependencies:
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
-    none: false
     - - ! '>='
       - !ruby/object:Gem::Version
@@ -31,7 +28,6 @@ dependencies:
 - !ruby/object:Gem::Dependency
   name: dependency2
   requirement: !ruby/object:Gem::Requirement
-    none: false
     - - ! '>='
       - !ruby/object:Gem::Version
@@ -39,42 +35,59 @@ dependencies:
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
-    none: false
     - - ! '>='
       - !ruby/object:Gem::Version
         version: '0'
 description: sample description
 email: sample email
-executables: []
+- example
 extensions: []
 extra_rdoc_files: []
-files: []
+- bin/example
 homepage: http://sample-url/
 licenses: []
+metadata: {}
 rdoc_options: []
 - lib
 required_ruby_version: !ruby/object:Gem::Requirement
-  none: false
   - - ! '>='
     - !ruby/object:Gem::Version
       version: '0'
 required_rubygems_version: !ruby/object:Gem::Requirement
-  none: false
   - - ! '>='
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 1.8.18
+rubygems_version: 2.0.14
-specification_version: 3
+specification_version: 4
 summary: sample summary
 test_files: []
 ============== checksums =============
+!binary "U0hBMQ==":
+  metadata.gz: !binary |-
+    MjYwNGQ5MDZjYTE0MjY5MWQyZTA5Yzk0MjgyYjk2ZGM0ZTk3YzE3Mw==
+  data.tar.gz: !binary |-
+!binary "U0hBNTEy":
+  metadata.gz: !binary |-
+    ZjVhMjc5ZjFjYTIzOTk2MWFhYjZmMjNiNmVkNzRlMzMzMTdiOTMxMzlmM2Nl
+    ZGE3ZDc5ZDRkNjUwNDE1ODkzNDkxNDhmNmI5YmUyYjg2NjEwMjk=
+  data.tar.gz: !binary |-
+    MGY3NzllNTgwOGQ2YzhmOGUwNjlkMTk5NTlhOTIzZjJkZTkyMjdiNzQxZDQ3
 =============== files ================
+#!/usr/bin/env ruby
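
To check that the wiring is in place before relying on it, you can ask git which diff driver it would pick for a gem. This is a minimal sketch in a scratch repository created with mktemp; the path example-1.0.gem is only a name here, since git check-attr does not need the file to exist.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Same attribute line and driver setting as in the post.
echo '*.gem diff=gemdiff' > .gitattributes
git config diff.gemdiff.textconv misc/gemdiff.sh

# Which diff driver does git assign to a .gem path?
attr=$(git check-attr diff -- example-1.0.gem)
echo "$attr"

# Which command is registered for that driver?
driver=$(git config --get diff.gemdiff.textconv)
echo "$driver"
```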


Working with git submodules: tips ‘n tricks

Some people hate them, nobody loves them, but they are a good way to split a codebase into separate components/repositories.

I have been using submodules a LOT for puppet development (all those puppet modules…). Some people might propose alternatives (puppet-tree, librarian), but I’d rather stick with what I already know.

Dealing with submodules in git is mainly painful because the parent repository doesn’t really know or care what is inside the submodule. It only keeps track of the commit hash the submodule is pinned to. Another downside is that your submodules almost always end up in a detached HEAD state, and after checking out a branch in a submodule you tend to forget which commit the parent repository points at.

The aliases below take some of the pain away. You can put them in the [alias] section of your ~/.gitconfig file:

git tags

Slightly different from the default git tag: it pipes the tags through sort -V, which sorts version numbers naturally (1.9 before 1.10). Note that your sort must be new enough to support the -V option.

tags = !sh -c 'git tag | sort -V'
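
To see what the version sort buys you, here is a small sketch that compares the two orderings on a few made-up tag names. It assumes a sort that supports -V, such as the one in GNU coreutils.

```shell
# Made-up tag names, one per line.
versions='v1.10
v1.2
v1.9'

# LC_ALL=C makes the plain sort byte-wise so the contrast is reproducible.
plain=$(printf '%s\n' "$versions" | LC_ALL=C sort)
natural=$(printf '%s\n' "$versions" | sort -V)

printf 'plain sort:\n%s\n\nsort -V:\n%s\n' "$plain" "$natural"
```

The plain sort puts v1.10 first because it compares character by character, while sort -V compares the numeric parts as numbers.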

git update

Run this in the root of the ‘parent’ repository:

update = !sh -c 'git pull && git fetch --tags && git submodule update --recursive && git submodule foreach git tag -f parent-$(git describe --contains --all HEAD)'
  1. Pull from the remote
  2. Fetch remote tags
  3. Update submodules (recursive)
  4. Create a tag on each submodule called parent-BRANCH with BRANCH being the branch the current parent repository is on
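
Step 4 relies on git describe --contains --all HEAD printing the name of the current branch. Here is a minimal sketch of that part in a scratch repository. The branch name work and the demo identity are made up for the example, and the tag is created in the same repository rather than in a submodule to keep it short.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Name the (still unborn) branch explicitly so the result is predictable.
git symbolic-ref HEAD refs/heads/work
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m init

# The expression the alias uses to name the tag.
branch=$(git describe --contains --all HEAD)
echo "$branch"

# Same tag command the alias runs inside each submodule.
git tag -f "parent-$branch"
tags=$(git tag -l 'parent-*')
echo "$tags"
```

In the alias the $(...) part sits inside the single-quoted sh -c command, so it is expanded once in the parent repository before git submodule foreach runs, which is why every submodule gets the parent's branch name.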

git noparent

Removes the parent-* tags from all repositories (recursive).

noparent = !sh -c 'git tag -d $(git tag | grep ^parent ) &&  git submodule foreach git noparent'
  1. Remove all tags matching ^parent
  2. Do the same for each submodule (recursive)
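
A quick way to convince yourself that only the parent-* tags are removed is to try the first half of the alias in a scratch repository. This is a sketch with made-up tag names and a demo identity, without any submodules.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m init

# One ordinary tag and two parent-* tags.
git tag v1.0
git tag parent-master
git tag parent-feature

# The deletion step from the noparent alias.
git tag -d $(git tag | grep ^parent) > /dev/null

remaining=$(git tag)
echo "$remaining"
```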

git safepush

Remove parent tags, make sure we don’t create a merge commit and push.

safepush = !sh -c 'git noparent && git pull --rebase && git push && git push --tags'
  1. Remove parent tags, we don’t want to push them by accident
  2. Fetch remote changes and rebase
  3. Push push push!

git pushtags

Remove parent tags and push all the tags.

pushtags = !sh -c 'git noparent && git push --tags'
  1. Remove the parent tags we have set
  2. Push tags