Continuous Integration
Steven J Zeil
Abstract
In continuous integration, the practices of version control, automated building, automated configuration, and automated testing are combined so that, as changes are checked in to the version control repository, the system is automatically rebuilt and tested, reports are generated, and the results are posted to a project website.
1 Big Builds
Think of everything we have started to put into our automated builds:
- fetching and setup of 3rd party libraries
- static analysis
- compilation
- unit testing
- documentation generation
- static analysis reports
- packaging of artifacts
- deployment/publication of artifacts
- updating of project website
and, coming up, we will want to expand our testing to include
- integration testing
- test coverage reporting
- system testing
There’s a danger of the builds becoming so unwieldy and slow that programmers will start to look for ways to circumvent steps.
Do We Need to do All of Those Steps, All of the Time?
One possible breakdown:
| Every build | Occasional |
|---|---|
| fetching and setup of 3rd party libraries | documentation generation |
| static analysis | static analysis reports |
| compilation | deployment/publication of artifacts |
| unit testing | updating of project website |
| packaging of artifacts | integration testing |
| | test coverage reporting |
| | system testing |
This should provide someone actively working on a specific module/story the info they need, deferring some of the more time-consuming build activities.
How do we divide these steps in the build?
- Even the “occasional” activities may be done many times over the history of a project.
- So we want to keep them automated, both for ease of performing them and to ensure they are performed consistently each time.
- With `make`/`ant`/`maven`, we can have different targets/goals for the frequent and the occasional cases.
  - But we have to remember to use the proper targets at the right time.
  - Maybe not a big deal…
- But there’s an opportunity here to do something much more interesting…
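
The split between frequent and occasional targets can be sketched in a few lines of Python. This is only an illustration of the idea, not how `make`, `ant`, or `maven` work internally. The target names (`quick`, `full`, and so on) and the dependencies between them are assumptions made up for this example, and the sketch does no cycle detection.

```python
# Hypothetical target table: each target lists the targets it depends on.
TARGETS = {
    "setup": [],
    "analyze": ["setup"],
    "compile": ["analyze"],
    "unit-test": ["compile"],
    "package": ["unit-test"],
    # "quick" is the every-build target: it stops after packaging.
    "quick": ["package"],
    "docs": ["compile"],
    "analysis-reports": ["analyze"],
    "integration-test": ["package"],
    "coverage": ["unit-test"],
    "system-test": ["integration-test"],
    "deploy": ["system-test"],
    "website": ["docs", "analysis-reports", "coverage"],
    # "full" is the occasional target: everything, including the slow steps.
    "full": ["quick", "deploy", "website"],
}


def plan(target, targets=TARGETS):
    """Return the steps needed for `target`, dependencies first, no repeats."""
    ordered = []

    def visit(name):
        if name in ordered:
            return
        for dep in targets[name]:
            visit(dep)
        ordered.append(name)

    visit(target)
    return ordered


if __name__ == "__main__":
    print("quick:", " -> ".join(plan("quick")))
    print("full: ", " -> ".join(plan("full")))
```

Asking for `quick` pulls in only the five every-build steps, while `full` adds the slower ones. The catch noted above remains: someone still has to remember which target to request.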
2 Continuous Integration
When we combine
- Automated testing (unit, integration, system, and regression)
- Centralized version control
  - or distributed VC with a central “official” repository
- Automated builds, capable of running tests, running analysis tools, and publishing the results on a project web site
we can rebuild and retest automatically as developers check in changes.
2.1 Key Ideas
Our project should have the following characteristics:
- Version control with a clearly identified main branch or set of main development branches.
- Automated build is set up as usual.
- Developers commit frequently (maybe many times per day)
  - Commits to “private” branches (or local copies of a distributed repository) are ignored.
  - Every commit of a tracked branch to the main repository is built on a separate server
  - The build includes all integration-related tasks (for early detection of integration problems).
  - Can also include more time-consuming reporting tasks.
- Testing is done, ideally, in a clone of the production environment(s)
  - May differ from development environments
  - Probably not checked frequently under normal practice
  - Can use multiple remote machine “runners” to provide varying target operating systems and environments.
- Make the results highly visible
2.1.1 Advantages
- Integration problems caught early and fixed fast
  - avoids “integration hell”
- Immediate testing of all changes
- Emphasis on frequent check-ins encourages modularity
- Visible code quality metrics motivate developers
2.1.2 Disadvantages
- Initial setup effort
- Level of sophistication required of the team to combine build, configuration management, testing, and reporting into an automated build
2.2 Continuous Integration Systems
A CI system consists of a server/manager and one or more runners…
2.2.1 The Continuous Integration Server
A continuous integration server is a network-accessible machine that
- Can be told of development projects under way, including
  - location & access info to version control (VC) repository
  - which branch(es) to watch
  - how to build the project
  - what reports are produced by the build
- Monitors, in some fashion, the VC repository for commits
- When a commit (to a monitored branch) takes place, the CI server notifies one or more runners.
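
The server's role described above can be sketched as a small Python class. This is a simplified illustration, not the design of any real CI server. The names `Project`, `CIServer`, and `on_commit`, the field names, and the example URL are all assumptions introduced here. How the server learns of a commit (polling or a hook) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """What the server is told about one project (field names are illustrative)."""
    name: str
    repo_url: str
    watched_branches: set
    build_command: str
    reports: list = field(default_factory=list)


class CIServer:
    def __init__(self, runners):
        # Each runner is any callable accepting (project, branch, commit_id).
        self.runners = list(runners)
        self.projects = {}

    def register(self, project):
        self.projects[project.repo_url] = project

    def on_commit(self, repo_url, branch, commit_id):
        """Handle a commit notification; return how many runners were notified."""
        project = self.projects.get(repo_url)
        if project is None or branch not in project.watched_branches:
            return 0  # unknown repository or unwatched branch: ignore
        for runner in self.runners:
            runner(project, branch, commit_id)
        return len(self.runners)


if __name__ == "__main__":
    log = []
    server = CIServer([lambda p, b, c: log.append((p.name, b, c))])
    server.register(Project("demo", "https://example.com/demo.git",
                            {"main"}, "./gradlew build", ["junit"]))
    server.on_commit("https://example.com/demo.git", "main", "abc123")
    server.on_commit("https://example.com/demo.git", "my-feature", "def456")
    print(log)
```

Only the commit to the watched `main` branch reaches the runner. The commit to the unwatched branch is dropped, matching the "private branches are ignored" rule.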
2.2.2 Continuous Integration Runners
A CI runner (a.k.a. a node or slave processor) is a process on a machine that
- has the necessary compilers and other tools for building a project.
- is managed by the CI server.

When notified by the server, the runner
- Checks out a designated branch of a project from its version control system.
- Runs the build.
- Publishes reports on the results of the build(s).

Runners are usually separate machines from the CI server. A CI project may launch several different runners, each with a different configuration environment (e.g., different operating systems) to test the build under multiple configurations.
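
A runner's sequence of steps can be sketched as a loop that executes each step in order and records its exit status. This is a minimal sketch, not any real runner's implementation. The function name `run_build` and the step names are assumptions, and the commands are harmless stand-ins for a real checkout and build.

```python
import subprocess
import sys
import tempfile


def run_build(steps, workdir):
    """Run each (name, argv) step in `workdir`; stop at the first failure.

    Returns a list of (name, exit_code) pairs that a reporting step could publish.
    """
    results = []
    for name, argv in steps:
        completed = subprocess.run(argv, cwd=workdir, capture_output=True, text=True)
        results.append((name, completed.returncode))
        if completed.returncode != 0:
            break  # later steps depend on earlier ones, so skip them
    return results


if __name__ == "__main__":
    # Stand-in commands; a real runner would invoke git and the build tool here.
    ok = [sys.executable, "-c", "print('ok')"]
    fail = [sys.executable, "-c", "import sys; sys.exit(1)"]
    with tempfile.TemporaryDirectory() as workdir:
        print(run_build([("checkout", ok), ("build", fail), ("test", ok)], workdir))
        # prints [('checkout', 0), ('build', 1)]
```

The returned list is what the runner would hand back for publication. Running the same steps on several runners with different operating systems gives the multi-configuration testing described above.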
3 Case study: Jenkins
Jenkins is a popular CI server.
3.1 Projects on Jenkins
When you set up a project on Jenkins you must supply:
- Basic project info:
  - name and description,
  - public/private,
  - who can access.
- Version control:
  - What kind of version control is used,
  - URL and access info to check out a copy of the project.
- Build management:
  - What build manager is used,
  - where the build file can be found within the project directories
  - what target/goal to use with the build (I usually add a special “`jenkins`” target to my Ant `build.xml` files.)
- Which of Jenkins’s nodes can be used for the build.
- Reporting
  - What reports Jenkins should publish.
  - Where in your project directories the raw data for these reports can be found.
3.1.1 Jenkins and Project Reports
- Many report-generating programs (e.g., JUnit, FindBugs) have separate “collection” and “reporting” stages.
  - Typically the collection step writes raw data out in an XML format.
  - Normally, you then run a separate task to reformat that XML into HTML or some other readable format.
- Jenkins, however, has its own formatting functions for many common reports.
  - Among other things, these often add “historical” reporting on how the collected data has varied over a period of time.
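
The separation between collecting raw XML and formatting it can be illustrated with a short Python sketch. It assumes the widely used JUnit-style layout, a `testsuite` element containing `testcase` elements with optional `failure` or `error` children. The sample data and the function names `summarize` and `history_line` are made up for this example. This is not how Jenkins itself formats reports.

```python
import xml.etree.ElementTree as ET

# Made-up sample of the raw data a collection step might write.
SAMPLE = """<testsuite name="demo" tests="3" failures="1" errors="0" skipped="0">
  <testcase classname="DemoTest" name="testAdd"/>
  <testcase classname="DemoTest" name="testSub">
    <failure message="expected 1 but was 2"/>
  </testcase>
  <testcase classname="DemoTest" name="testMul"/>
</testsuite>"""


def summarize(xml_text):
    """Count passed/failed test cases in one JUnit-style XML report.

    Skipped cases are counted as passed here, for brevity.
    """
    suite = ET.fromstring(xml_text)
    cases = suite.findall("testcase")
    failed = [c.get("name") for c in cases
              if c.find("failure") is not None or c.find("error") is not None]
    return {"suite": suite.get("name"), "total": len(cases),
            "failed": failed, "passed": len(cases) - len(failed)}


def history_line(summaries):
    """Render a one-line pass-count trend across successive builds."""
    return " ".join(f"{s['passed']}/{s['total']}" for s in summaries)


if __name__ == "__main__":
    summary = summarize(SAMPLE)
    print(summary)
    print(history_line([summary, summary]))  # prints 2/3 2/3
```

Keeping one summary per build is what makes the “historical” view possible: the trend is just the sequence of per-build summaries.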
4 Case study: gitlab-ci
`gitlab-ci` is a CI server integrated into GitLab.
- Project build status is integrated into the version control activity reports.
  - Click on green checkmarks and red X’s to see successful and failed builds.
- Setup is generally easier if your project is already hosted on GitLab.
4.1 gitlab-ci setup
- Project must activate `gitlab-ci` by designating a runner, generally on a remote machine under the developer’s control.
- Add a file `.gitlab-ci.yml` to the repository root directory.
  - This is a YAML script that gets run after each new commit.
  - The script can limit which branches it applies to.

Example:

    stages:
      - build
      - test
      - deploy

    build-job:
      tags:
        - atria
      stage: build
      script:
        - echo build number is $CI_BUILD_REF
        - cd report_accumulator
        - ./gradlew build deployReports -Dorg.gradle.project.buildNumber=$CI_BUILD_REF
      only:
        - master

- The “script:” part gives the build commands to be run on the runner machine.
- The “only” part limits these runs to checkins on the “master” branch.
4.2 gitlab-ci vs Jenkins
- Reporting
  - Jenkins provides fancier reporting options. It composes a nice-looking project summary page.
  - Such activities must be scripted as part of the build to work in gitlab-ci.
    - E.g., my own Gradle plugin for generating project report webpages.
- Flexibility
  - Jenkins has a definite Java bias.
  - gitlab-ci can run any language you can script a build for.
- Setup
  - Jenkins setup can be confusing.
  - gitlab-ci setup is easier, but requires a properly set up remote runner.
- Runners
  - Jenkins allows remote runners, which are easily shared among projects.
  - gitlab-ci requires remote runners, often requiring each project to set up its own.
5 Related ideas
- Continuous deployment publishes snapshots of deliverables as changes are checked in.
  - A variation of the rather common “daily build” practice seen on many projects.
  - Some Maven repositories (including our own Artifactory instance) provide separate “snapshot” repositories for this purpose.
- Some organizations actually wire up build light indicators to provide a highly visible indicator of the status of the latest integration build.
  - Some then point a webcam at the light and broadcast their status.
  - Others opt for publishing a software analog of such lights on their project website.