Build Managers
Steven J Zeil
Abstract
A build manager is a tool for scripting the automated steps required to produce a software artifact.
We will start this module by looking at what types of services we would like to obtain from build managers.
These will be motivated by looking at some sample projects to consider the steps required to build them. An important lesson will be that builds often involve more than the “obvious” requirement of compiling and linking the code.
We will then survey some of the options for build managers, including scripting, IDE project managers, and dependency-based and task-based build management tools.
What Should a Build Manager Do?
A good build manager should be
- easy to use
- easy to set up for a given project
- efficient in performing the build
  - avoid redundant/unnecessary actions
  - detect and abort bad builds in progress
- incremental
  - allow focused/partial builds
- flexible
  - allow for a variety of build actions
  - on a variety of platforms
- configurable
  - permit the management of multiple artifact configurations
1 Structural Architecture of a Development Project
Let’s talk about how development projects are typically organized into files, directories, etc.
1.1 Projects and Sub-projects
A project consists of one or more sub-projects.
- Even if we conceive of a project as a single entity, it’s worth treating it as having a single sub-project
  - A hedge against changing our minds later.
  - Sets a standard for consistent tool use and setup
What Constitutes a Sub-project?
A sub-project is generally defined as the code and data that yields a single deliverable.
Examples of deliverables include
- an executable program
- a library
  - Java: .jar
  - C/C++: .a, .lib, .so, or .dll
- a reference manual
Example: The AlgAE project has sub-projects
sub-project                 deliverable
algae-client-server         algae-4.1.jar
algae-cppserver             libalgaecpp.a
algae-referenceManual       referenceManual.pdf
demos/FordToppBST           FordToppBST.zip
demos/ReferenceManualJava   algae-jrefman.jar
Why divide a project into multiple sub-projects rather than into multiple smaller independent projects?
- The entire project is stored in a single location/repository.
- The entire project can be built with a single command.
- The entire project can share certain configuration data.
- Sub-projects may be more tightly coupled than would be desired of independent projects.
1.2 The Project Directory
Typically contains
- Top-level project documentation.
- Overall project build & configuration information
- One sub-directory per sub-project
- Version control and configuration management information
1.2.1 Example: AlgAE
Top-level directory contains:
- README.md, LICENSE.md
- build.xml: builds the sub-projects in the correct order
- .git, .gitignore, ivysettings.xml (version control and configuration management)
- Directories:
  - algae-client-server
  - algae-cppserver
  - algae-reference-manual
  - demos
  - Reports
All but the last are sub-projects.
C/C++ projects might add directories that will (after the project is built) contain the various sub-projects’ deliverables:
- bin
- lib
1.3 Sub-Project Directory (Java)
Contains:
- sub-project build & configuration
- source code directory
- project and test data
- target directory for deliverables and other build products
  - When source code is compiled, output is placed here rather than in the source code directory
1.3.1 Apache Project Directories
The Apache Foundation hosts many open source projects, which organize their projects & sub-projects like this:
src/ # anything supplied/edited by the programmers
target/ # initially empty, holds products of the compilation/build
The src/ directory is split into separate directories for the “real” code and for the test code.
src/
| main/ # things that contribute directly to the deliverable
| test/ # things used for testing but not delivered
target/
“Deliverables” are usually an archive of some kind.
- If the project is supposed to produce a Java application or a Java library, the deliverable is usually packaged in a Jar.
- Server-side web applications are delivered in a War or an Ear.
- Source code is sometimes packaged in a Jar, but more often in a Zip. (Usually, though, when we talk about deliverables in this section, we’re referring to “binary” deliverables.)
- Android apps are packaged in an APK.
The division of the source files into separate main/ and test/ directories makes it easier to eventually construct those deliverable archives, because we can treat entire directories’ worth of material uniformly rather than having to select desired materials on a file-by-file basis.
src/main/ is further subdivided:
src/
| main/
| | java/ # Java source code, compiled into target/classes
| | resources/ # data files that will be included in the deliverable archive
| | data/ # data files required during build but not part of deliverable
| test/
target/
| classes/ # data and compiled code that are packed into the .jar deliverable
| project.jar # the deliverable
(These directories can be omitted if they are empty.)
Java libraries and applications can read data from files within their own distribution archive with only slightly more difficulty than reading from an ordinary file. To do so, the Java code is written to search the Java CLASSPATH, the same path used to hunt for the compiled Java code.
They cannot, however, write to those data files. The data access is read-only.
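As a minimal sketch of this kind of CLASSPATH-based access, the following helper reads a text resource through the class loader. The class name ResourceReader and the resource name /config/defaults.txt are assumptions made for illustration (such a file would live under src/main/resources/config/); they are not taken from any project discussed here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper: reads a text resource located via the CLASSPATH.
class ResourceReader {

    // Returns the resource's text, or Optional.empty() if no such
    // resource can be found on the CLASSPATH.
    static Optional<String> read(String resourcePath) {
        // A leading '/' makes the path relative to the CLASSPATH root
        // rather than to this class's package.
        try (InputStream in = ResourceReader.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                return Optional.empty();
            }
            return Optional.of(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "/config/defaults.txt" is an assumed resource name.
        Optional<String> text = read("/config/defaults.txt");
        System.out.println(text.orElse("(resource not found)"));
    }
}
```

The same call works whether the resource sits in a directory such as target/classes or inside the packaged Jar, which is why no file path appears in the code.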
The src/test/ directory is split in an analogous fashion:
src/
| main/
| | java/
| | resources/
| | data/
| test/
| | java/ # Java source code, compiled into target/test-classes
| | resources/ # data files, available during testing via CLASSPATH
| | data/ # test data
target/
| classes/
| test-classes/ # data and compiled code for unit testing
| project.jar
Test resources are intended to be accessible during testing via the code already written for accessing main (deliverable) resources. One way to support this is to copy the src/test/resources contents into target/test-classes, so that the same CLASSPATH-based mechanisms used to locate the compiled test code will also find the test resources.
1.3.2 Android/Gradle Project Directories
A similar directory structure is employed for Android projects. The Gradle build manager, which we will cover later in this section, has made the Android structure its default for Java projects, making it a popular organization for non-Apache projects.
The most obvious difference is that the products of the build are stored in build instead of target.
src/ # anything supplied/edited by the programmers
build/ # initially empty, holds products of the compilation/build
The src/ directory is laid out identically to the Apache organization:
src/
| main/
| | java/ # Java source code. After compilation, is part of the deliverable.
| | resources/ # Data files that will be included in the deliverable, accessible via CLASSPATH
| | data/ # Data files needed for the build, but not part of the deliverable.
| test/
| | java/ # Java source code for testing, will not be part of the deliverable
| | resources/ # data files, available during testing via CLASSPATH
| | data/ # test data
build/
Example: see this structure in the Code Annotation project
1.4 Sub-Project Directory (C/C++)
Much more variation exists. One possibility is:
include/ # header files
|
src/ # compilation units (.c and .cpp files)
|
bin/ # executables and .o files produced by compiling src/
|
lib/ # libraries produced by combining object files
- Sometimes .o files are placed in a separate obj/ directory.
- Sometimes executables and libraries are copied directly to a project-level bin/ or lib/ directory.
1.4.1 Android-ish structure
Increasingly common is this approach, inspired by the Apache/Android Java styles:
src/
| main/
| | cpp/ # C++ source code. .cpp files and local headers
| | headers/ # Header (.h) files that need to be visible to main code and
| | # to tests.
| | public/ # For library projects, the header files that will be exported
| | # as part of the delivered library.
| test/
| | cpp/ # Unit test code
| | data/ # test data
build/
| exe/ # Executables
| | main/ # - from main/cpp
| | test/ # - from test/cpp
| lib/ # libraries constructed from object code
| | main/ # - from obj/main
| obj/ # Compiled object code
| | main/ # - from main/cpp
| | test/ # - from test/cpp
| tmp/ # Work area for general temporary files
2 First-Generation – Dependency-Based
In a dependency graph for a build:
- Boxes are files.
- Arrows denote dependencies. “A depends on B” means that if B is missing or changed, then A must be (re)generated.
- Labels on arrows indicate the program used to generate the file at the base of the arrow.
Analysis of such a graph facilitates
- efficiency - easy to tell what needs to be rebuilt after a change
- incrementality - can determine required build step for any file, not just the “final” one
make is the canonical example of a build manager of this type.
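The rebuild decision described above can be sketched as a toy, in-memory model. This is not how make is implemented: the class DependencyBuild, its methods, and the file names are hypothetical, and timestamps are simulated with a counter so that the example needs no real files. It also assumes the graph has no cycles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of dependency-based rebuilding (illustrative names throughout).
class DependencyBuild {

    // One rule: the files a target depends on and the command that regenerates it.
    static final class Rule {
        final List<String> dependencies;
        final String command;

        Rule(List<String> dependencies, String command) {
            this.dependencies = dependencies;
            this.command = command;
        }
    }

    private final Map<String, Rule> rules = new HashMap<>();
    private final Map<String, Long> modified = new HashMap<>(); // file -> simulated timestamp
    private long clock = 0;

    void addRule(String target, List<String> dependencies, String command) {
        rules.put(target, new Rule(dependencies, command));
    }

    // Simulates creating or editing a file.
    void touch(String file) {
        modified.put(file, ++clock);
    }

    // Brings target up to date; returns the commands that had to run, in order.
    List<String> build(String target) {
        List<String> executed = new ArrayList<>();
        bringUpToDate(target, executed);
        return executed;
    }

    private void bringUpToDate(String target, List<String> executed) {
        Rule rule = rules.get(target);
        if (rule == null) {
            // No rule: this must be an existing source file.
            if (!modified.containsKey(target)) {
                throw new IllegalStateException("No rule to make " + target);
            }
            return;
        }
        boolean stale = !modified.containsKey(target); // a missing target is always rebuilt
        for (String dep : rule.dependencies) {
            bringUpToDate(dep, executed);
            if (!stale && modified.get(dep) > modified.get(target)) {
                stale = true; // a dependency is newer than the target
            }
        }
        if (stale) {
            executed.add(rule.command);
            touch(target);
        }
    }

    public static void main(String[] args) {
        DependencyBuild b = new DependencyBuild();
        b.addRule("app.jar", List.of("A.class", "B.class"), "jar cf app.jar A.class B.class");
        b.addRule("A.class", List.of("A.java"), "javac A.java");
        b.addRule("B.class", List.of("B.java"), "javac B.java");
        b.touch("A.java");
        b.touch("B.java");
        System.out.println(b.build("app.jar")); // first build runs all three commands
        b.touch("B.java");                      // edit one source file
        System.out.println(b.build("app.jar")); // only B.class and app.jar are regenerated
        System.out.println(b.build("A.class")); // nothing to do: prints []
    }
}
```

Requesting A.class directly in the last call illustrates incrementality: any file in the graph can serve as the build goal, not only the final deliverable.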
2.1 make
make is a command/program that enacts builds according to a dependency graph expressed in a makefile.
- make was devised by Dr. Stuart Feldman of Bell Labs in 1977
- It has long been a standard component of *nix systems
  - GNU make is a popular modern variant
2.2 makefiles
At its heart, a makefile is a collection of rules.
2.2.1 Rules
- A rule describes how to build a single file of the project. Each rule indicates
  - The target file to be constructed
  - The dependencies: the other files in this project from which the target is constructed.
  - The commands that must be executed to construct the target from its dependencies.
- Rules may appear in any order
  - Except that the first rule’s target is the default built by make when no explicit target is specified in the command line.
The Components of a Rule
A rule has the form

target: dependencies
	commands

where
- target is the target file,
- dependencies is a space-separated list of files on which the target is dependent,
- commands is a set of zero or more commands, one per line, each preceded by a Tab character.
Rule Examples
codeAnnotation.jar: code2HTML.class CppJavaScanner.class
	jar cvf codeAnnotation.jar code2HTML.class CppJavaScanner.class

CppJavaScanner.class: CppJavaScanner.java
	javac CppJavaScanner.java

code2HTML.class: code2HTML.java CppJavaScanner.java
	javac code2HTML.java

CppJavaScanner.java: code2html.flex
	java -cp JFlex.jar JFlex.Main code2html.flex
Pros & Cons
- Pro: Widely available.
- Con: Make rules are invariably OS-specific.
- Con: Not every build step produces a single file.
- Con: Not every build step “consumes” files.
- Con: Arcane syntax.
3 Second Generation – Task-Based
In a task graph for a build:
- Ellipses are tasks (activities). Each task can involve multiple steps.
- Arrows denote success dependencies. “A depends on B” means that A will be run after B and only if task B finished successfully.
This approach facilitates
- ease of setup: usually less detailed than a full file-based dependency graph
- incrementality - can request any intermediate step
ant is based on this approach.
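The success-dependency rule can be sketched with a toy task runner. This is an illustration of the idea only, not of how ant works internally; the class TaskRunner and the task names are hypothetical, and each action simply reports success or failure as a boolean.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BooleanSupplier;

// Toy task runner (illustrative names): a task runs only after all the
// tasks it depends on have finished successfully.
class TaskRunner {

    static final class Task {
        final List<String> dependsOn;
        final BooleanSupplier action; // returns true on success

        Task(List<String> dependsOn, BooleanSupplier action) {
            this.dependsOn = dependsOn;
            this.action = action;
        }
    }

    private final Map<String, Task> tasks = new LinkedHashMap<>();
    private final Set<String> succeeded = new HashSet<>();
    final List<String> attempted = new ArrayList<>(); // tasks actually run, in order

    void addTask(String name, List<String> dependsOn, BooleanSupplier action) {
        tasks.put(name, new Task(dependsOn, action));
    }

    // Runs the named task after its dependencies. Returns false as soon as
    // any task fails, without running the tasks that depend on it.
    boolean run(String name) {
        if (succeeded.contains(name)) {
            return true; // a successful task is not repeated
        }
        Task task = tasks.get(name);
        if (task == null) {
            throw new IllegalArgumentException("Unknown task: " + name);
        }
        for (String dep : task.dependsOn) {
            if (!run(dep)) {
                return false;
            }
        }
        attempted.add(name);
        boolean ok = task.action.getAsBoolean();
        if (ok) {
            succeeded.add(name);
        }
        return ok;
    }

    public static void main(String[] args) {
        TaskRunner runner = new TaskRunner();
        runner.addTask("compile", List.of(), () -> true);
        runner.addTask("unittest", List.of("compile"), () -> false); // simulate a failing test
        runner.addTask("deploy", List.of("unittest"), () -> true);
        System.out.println(runner.run("deploy")); // false: deploy never starts
        System.out.println(runner.attempted);     // [compile, unittest]
    }
}
```

Note that, unlike the file-based approach, nothing here checks timestamps: a requested task and its dependencies always run unless the task itself decides there is nothing to do.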
3.1 ant
- ant was devised by James Davidson of Sun and contributed to the Apache project (along with what would eventually become Tomcat); it was released in 2000
- Quickly became a standard tool for Java projects
  - slower to move into other arenas
3.1.1 ant Features
- Task-based
- OS independent
- Commands are written in XML, not a scripting language
- ant itself is implemented in Java
  - The command set can be extended via Java classes
3.1.2 Targets
At its heart, a build file is a collection of targets.
- A target is an XML element and, as attributes, has a name and, optionally,
  - a list of dependencies
  - a condition
  - a human-readable description
- The target can contain multiple tasks, which contain the actual “commands” to get things done.
Example of Targets
<project name="JavaBuild" default="deploy"> <!-- ➀ -->
  <description>
    Example of a simple project build
  </description>

  <target name="compile" description="Compile src/.../*.java into bin/"> <!-- ➁ -->
    <mkdir dir="bin" /> <!-- ➂ -->
    <javac srcdir="src" destdir="bin"
           debug="true" includeantruntime="false"/>
    <echo>compiled </echo>
  </target>

  <target name="unittest" depends="compile" unless="test.skip"> <!-- ➃ -->
    <mkdir dir="test-reports" />
    <junit printsummary="on" haltonfailure="true"
           fork="true" forkmode="perTest">
      <formatter type="plain" />
      <batchtest todir="test-reports">
        <fileset dir="bin">
          <include name="**/Test*.class" />
          <exclude name="**/Test*$*.class" />
        </fileset>
      </batchtest>
    </junit>
  </target>

  <target name="deploy" depends="unittest" description="Create project's Jar file">
    <jar destfile="myProject.jar">
      <fileset dir="bin"/>
    </jar>
  </target>
</project>
➀ The project has a name and default target
➁ A basic target. It is named “compile” and has a description (which may be picked up by some IDEs)
➂ This target has 3 tasks. It creates a directory, compiles Java source code, and prints a message when completed.
- The fact that the tag names resemble familiar commands is intended as self-documentation, but is not otherwise significant.
- The tag names actually map to Java class names that implement the task.
➃ This target illustrates both a dependency and a condition.
3.2 maven
Another Apache project, Maven came well after Ant had come to dominate the Java open source landscape.
- Initially seen as a competitor or replacement for Ant
- Maven addresses both
  - build management (as does Ant)
  - and some aspects of configuration management (which Ant does not)
3.2.1 Motivations for Maven
Grew out of an observation that many supposedly cooperative, related Apache projects had inconsistent and incompatible ant build structures.
Stated goals are
- Making the build process easy
- Providing a uniform build system
- Providing quality project information
- Providing guidelines for best practices development
- Allowing transparent migration to new features
3.2.2 pom.xml
The build file for maven is also in XML:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>edu.odu.cs</groupId> <!-- ➀ -->
  <artifactId>codeAnnotation</artifactId> <!-- ➁ -->
  <packaging>jar</packaging>
  <version>1.0</version>
  <name>codeAnnotation</name>
  <url>https://www.cs.odu.edu/~zeil/cs795SD/s13/Directory/topics.html</url>
  <description>
    This is a tool used to parse code listings and to
    generate syntax-highlighted C++/Java listings in both
    HTML and LaTeX.
  </description>

  <!-- site generation:
       mvn test
       mvn surefire-report:report
       mvn site
  -->

  <repositories>
  </repositories>

  <dependencies> <!-- ➂ -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>de.jflex</groupId>
      <artifactId>jflex</artifactId>
      <version>1.4.3</version>
    </dependency>
  </dependencies>

  <build>
    <plugins> <!-- ➃ -->
      <plugin>
        <groupId>de.jflex</groupId>
        <artifactId>maven-jflex-plugin</artifactId>
        <version>1.4.3</version>
        <executions>
          <execution>
            <goals>
              <goal>generate</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
  </properties>
</project>
- ➀ identifies the organization that produces this project
- ➁ identifies the project
- ➂ identifies 3rd party libraries required by the project
- ➃ plugins modify the normal Java build action
3.2.3 Observations
- POM files are ugly as sin.
- Although technically task-based, tasks are supplied by archetypes.
  - Idea is to enforce “best practices”
- If you need an unusual task in your build
  - if there’s a plugin already written, you’re golden
    - e.g., the jflex plugin at ➃
  - if not, you’re up a creek.
    - Custom tasks within a build file are all but impossible.
    - In fact, one of the most commonly used plugins is one to “invoke ant and ask it to take care of this”
3.3 Maven and 3rd Party Libraries
One of the most important innovations introduced by Maven.
3.3.1 Dependencies
- pom.xml files have a dependencies section, e.g.,

<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.10</version>
    <scope>test</scope>
  </dependency>
</dependencies>

- This indicates that our project requires the junit package.
  - Could also say [4.10,) to get versions 4.10 or greater
3.3.2 Fetching Dependencies
- Maven does a transitive search over the dependencies for a project
  - Tries to find a mutually compatible set of versions
    - Helps if you give it some flexibility
- Maven then downloads the required libraries automatically
  - Downloaded libraries are cached (e.g., ~/.m2)
3.3.3 Maven Repositories
- By default, Maven searches the ibiblio repository, which can be human-searched here.
- Try searching for junit
  - Notice range of versions available
  - Select one (e.g., 4.10)
    - The “Maven” tab shows what to put into your <dependencies> section to request this version.
3.3.4 Transitive Dependencies
How does Maven know whether junit itself depends on other libraries?
- Near the top of the junit 4.10 page, click to “View” the POM file. Near the bottom, you will see

<dependencies>
  <dependency>
    <groupId>org.hamcrest</groupId>
    <artifactId>hamcrest-core</artifactId>
    <version>1.3</version>
    <scope>compile</scope>
  </dependency>
</dependencies>

- This is the same kind of info that we put into our own pom.xml file
  - And is, presumably, taken from the pom.xml that the JUnit team used to maintain their builds.
  - Publishing the dependency information along with the libraries leads to an accumulated base of information on library dependencies.
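The transitive search can be sketched as a simple graph traversal. This is a deliberately simplified model, not Maven's actual algorithm: it ignores scopes, version ranges, and conflict resolution, and the "repository" is an in-memory map standing in for the published POM files. The class name and the com.example coordinates are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of transitive dependency collection (illustrative names).
class TransitiveDependencies {

    // Returns every artifact reachable from the direct dependencies,
    // in breadth-first discovery order.
    static Set<String> resolve(List<String> direct, Map<String, List<String>> repository) {
        Set<String> resolved = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(direct);
        while (!pending.isEmpty()) {
            String artifact = pending.removeFirst();
            if (!resolved.add(artifact)) {
                continue; // already collected via another path
            }
            // An artifact with no recorded dependencies adds nothing further.
            pending.addAll(repository.getOrDefault(artifact, List.of()));
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Each key plays the role of a published POM listing its own dependencies.
        Map<String, List<String>> repository = Map.of(
            "com.example:web:1.0", List.of("com.example:json:2.1", "com.example:log:1.4"),
            "com.example:json:2.1", List.of("com.example:log:1.4"));

        // Our project declares only one direct dependency.
        System.out.println(resolve(List.of("com.example:web:1.0"), repository));
        // prints [com.example:web:1.0, com.example:json:2.1, com.example:log:1.4]
    }
}
```

The project names a single library, yet three end up on the list, because each library's published metadata names its own requirements.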
3.3.5 Too Good of an Idea to Let Go
ant got jealous…
- Eventually ant acquired a plugin, Ivy, that allowed it to handle 3rd party libraries in an almost identical fashion.
  - Including searching the same repositories established for use by Maven.
4 Third Generation Build Managers
Combine
- task orientation
- OS independence
- archetypes to provide trivial defaults
- easy creation of custom tasks
- 3rd-party library management
For this, we will look to gradle.