Organizing Git repositories with common nested sub-modules

Posted by André Caron on Programmers
Published on 2011-10-17T15:37:23Z Indexed on 2014/05/26 22:02 UTC

I'm a big fan of Git sub-modules. I like being able to track a dependency along with its version, so that I can roll back to a previous version of the project and have the corresponding version of the dependency to build safely and cleanly. Moreover, it's easier to release our libraries as open source projects, since the history of each library is separate from that of the applications that depend on it (and which are not going to be open sourced).

I'm setting up a workflow for multiple projects at work, and I was wondering how it would work out if we took this approach to a bit of an extreme instead of having a single monolithic project. I quickly realized there is a potential can of worms in really using sub-modules.

Suppose a pair of applications, studio and player, and dependent libraries core, graph and network, where the dependencies are as follows:

  • core is standalone
  • graph depends on core (sub-module at ./libs/core)
  • network depends on core (sub-module at ./libs/core)
  • studio depends on graph and network (sub-modules at ./libs/graph and ./libs/network)
  • player depends on graph and network (sub-modules at ./libs/graph and ./libs/network)

Suppose that we're using CMake and that each of these projects has unit tests and the works. Each project (including studio and player) must be able to compile standalone in order to run code metrics, unit tests, etc.

The thing is, if you do a recursive git submodule fetch, you get the following directory structure:

studio/
studio/libs/                    (sub-module depth: 1)
studio/libs/graph/
studio/libs/graph/libs/         (sub-module depth: 2)
studio/libs/graph/libs/core/
studio/libs/network/
studio/libs/network/libs/       (sub-module depth: 2)
studio/libs/network/libs/core/

Notice that core is cloned twice in the studio project. Aside from this wasting disk space, I have a build system problem because I'm building core twice and I potentially get two different versions of core.

Question

How do I organize sub-modules so that I get the versioned dependency and standalone build without getting multiple copies of common nested sub-modules?

Possible solution

If the library dependency is somewhat of a suggestion (i.e. in a "known to work with version X" or "only version X is officially supported" fashion) and potential dependent applications or libraries are responsible for building with whatever version they like, then I could imagine the following scenario:

  • Have the build system for graph and network tell them where to find core (e.g. via a compiler include path). Define two build targets, "standalone" and "dependency", where "standalone" is based on "dependency" and adds the include path to point to the local core sub-module.
  • Introduce an extra dependency: studio on core. Then, studio builds core, sets the include path to its own copy of the core sub-module, then builds graph and network in "dependency" mode.

The resulting folder structure looks like:

studio/
studio/libs/                    (sub-module depth: 1)
studio/libs/core/
studio/libs/graph/
studio/libs/graph/libs/         (empty folder, sub-modules not fetched)
studio/libs/network/
studio/libs/network/libs/       (empty folder, sub-modules not fetched)

However, this requires some build system magic (I'm pretty confident this can be done with CMake) and a bit of manual work for version updates (updating graph might also require updating core and network to get a compatible version of core in all projects).
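
As a rough sketch of what that build system magic might look like in CMake, the fragment below shows two separate files in one listing. It swaps the two explicit "standalone" and "dependency" build targets for a target-existence check: graph only builds its local core sub-module when no parent project has already defined a core target. The target name core, the source paths (src/graph.cpp, src/main.cpp, include) and the minimum CMake version are assumptions made for illustration, and core's own CMakeLists.txt is assumed to define a library target named core that exports its include path.

```cmake
# ---- file: graph/CMakeLists.txt (network would look the same) ----
cmake_minimum_required(VERSION 3.13)
project(graph CXX)

# "dependency" mode: a parent project (e.g. studio) already defined the
# `core` target, so reuse it and ignore ./libs/core entirely.
# "standalone" mode: nothing defined `core` yet, so build the local
# sub-module copy.
if(NOT TARGET core)
  add_subdirectory(libs/core)
endif()

add_library(graph src/graph.cpp)            # hypothetical source file
target_include_directories(graph PUBLIC include)
target_link_libraries(graph PUBLIC core)    # pulls in core's include path

# ---- file: studio/CMakeLists.txt ----
cmake_minimum_required(VERSION 3.13)
project(studio CXX)

# studio carries its own core sub-module and adds it first, so the
# libraries below find the `core` target and skip their own copies.
add_subdirectory(libs/core)
add_subdirectory(libs/graph)
add_subdirectory(libs/network)

add_executable(studio src/main.cpp)         # hypothetical source file
target_link_libraries(studio PRIVATE graph network)
```

With this layout, running git submodule update --init in studio without --recursive leaves studio/libs/graph/libs and studio/libs/network/libs unpopulated, which matches the folder structure above. A standalone checkout of graph would still fetch and build its own pinned core.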

Any thoughts on this?
