Deploying Websites
Steven J Zeil
Abstract
We’ve looked at how to use automated tools to generate project websites.
In this lesson we will look at how to get those deployed to a web server.
Suppose that we have a project that looks like this:
projectRoot/
|-- .git/
| |-- ⋮ git's internal storage - do not touch!
|-- gradle/
| |-- ⋮ gradle wrapper files
|-- gradlew
|-- README.md
|-- settings.gradle
|-- .gitignore
|-- project/
| |-- .gitignore
| |-- build.gradle
| |-- src/
| | |-- main/
| | | |-- java/
| | | | |-- ⋮
| | | |-- html/
| | | | |-- index.html
| | | | |-- ⋮ other static content for website
| | |-- test/
| | | |-- java/
| | | | |-- ⋮
| |-- build/
| | |-- classes/
| | | |-- ⋮
| | |-- libs/
| | | |-- ⋮
| | |-- reports/
| | | |-- ⋮
Assume that we have already added a `buildSite` target to our `build.gradle` file that constructs our desired website in `build/reports/`.
How do we put that website onto a web server for the world to enjoy?
1 Deploying Websites Via SSH
In many cases, we have SSH/SFTP access to a web server, which is configured so that files and directories copied into a certain location will be served and mapped onto URLs. For example, on the CS Linux network, files stored in `/home/yourName/secure_html/whatever` are served at the URLs `https://www.cs.odu.edu/~yourName/whatever`.
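Because this mapping is purely mechanical, it can be written down in a few lines of shell. The following is only an illustrative sketch: the function name `path_to_url` is made up for this example, and the directory prefix and host name are the ones quoted above for the CS Linux network.

```shell
# Hypothetical helper (not part of any tool): translate a file path under
# /home/USER/secure_html/ into the URL at which the CS web server serves it.
path_to_url() {
    user="$1"
    path="$2"
    prefix="/home/$user/secure_html/"
    case "$path" in
        "$prefix"*)
            # Strip the directory prefix and graft the remainder onto the URL.
            printf 'https://www.cs.odu.edu/~%s/%s\n' "$user" "${path#"$prefix"}"
            ;;
        *)
            echo "path_to_url: $path is not under $prefix" >&2
            return 1
            ;;
    esac
}

# Prints https://www.cs.odu.edu/~zeil/officehours/index.html
path_to_url zeil /home/zeil/secure_html/officehours/index.html
```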
So, if I wanted to manually update the file served at https://www.cs.odu.edu/~zeil/officehours/index.html, I could
- Get a local copy of the file,
```
scp zeil@linux.cs.odu.edu:/home/zeil/secure_html/officehours/index.html .
```
or
```
wget https://www.cs.odu.edu/~zeil/officehours/index.html
```
- Edit the file.
- Upload the edited file to the server:
scp index.html zeil@linux.cs.odu.edu:/home/zeil/secure_html/officehours/
That’s fine for working with one file or two. But if I have a website with many files (e.g., this course website) and I have updated several of them, I’d rather not risk forgetting to upload one or two of the changed files. Most command-line versions of `scp` have an option for recursive copy of directories, e.g.,
scp -r cs350/website/* zeil@linux.cs.odu.edu:/home/zeil/secure_html/cs350/
But a better choice, in many cases, is `rsync`. `rsync` is a program specifically designed for copying large directory trees in circumstances where only selected files are likely to have changed.
When `rsync` is given a source directory and a destination directory, it first decides which files need attention (by default by comparing file sizes and modification times, or optionally by comparing checksums of whole files). For a file that has changed, it computes checksums over blocks of the file so that, where possible, only the changed blocks are sent. Files that have not changed are not transferred at all. So, if I have `rsync` available, I am much more likely to do
rsync -auzv -e ssh cs350/website/ zeil@linux.cs.odu.edu:/home/zeil/secure_html/cs350/
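The “`-e ssh`” part of that command tells `rsync` to do its communications with the remote machine via SSH. Current versions of `rsync` already use `ssh` by default for a `host:path` destination, so here the option mainly makes that choice explicit. The alternative is an `rsync`-specific protocol that requires a dedicated `rsync` server to be running on the remote machine.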
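To make the idea of "send only what has changed" concrete, here is a small sketch that lists the files a copy would actually need to send. It is not how `rsync` is implemented (the real algorithm is considerably more refined); it simply compares POSIX `cksum` output file by file. The function name `changed_files` is hypothetical.

```shell
# Illustration only: list the files under SRC that are missing from DST or
# whose contents differ. This mimics the idea behind rsync, not its algorithm.
changed_files() {
    src="$1"
    dst="$2"
    (cd "$src" && find . -type f | sort) | while IFS= read -r f; do
        rel="${f#./}"
        # A file must be sent if DST lacks it or if the checksums disagree.
        if [ ! -f "$dst/$rel" ] ||
           [ "$(cksum < "$src/$rel")" != "$(cksum < "$dst/$rel")" ]; then
            printf '%s\n' "$rel"
        fi
    done
}
```

For example, `changed_files cs350/website /path/to/local/mirror` would print one relative path per line for each file that a full copy would upload unnecessarily if it had not changed. With `rsync` itself, adding the `--dry-run` option to the command gives a similar preview without transferring anything.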
The biggest limitation to using `rsync` is that it is not available on all machines. It can be easily installed on Linux and macOS machines. Native Windows ports of it have been, in my opinion, unreliable. It can best be run in the Windows Subsystem for Linux (WSL) or the Cygwin Unix emulator, but that introduces the complication of mapping paths between the Unix and Windows file systems.
1.1 Deploying via SSH
The gradle plugin `org.hidetake:gradle-ssh-plugin` allows you to
- Use `scp` to upload or download one file at a time to or from a remote server.
- Use `ssh` to issue commands to a remote server.
A plausible set of Gradle steps:
- Create a `.zip` file of the entire constructed website.
- Use `scp` to upload the zip file to the remote server.
- Use `ssh` to issue an `unzip` command on the remote server.
- If necessary, use `ssh` to issue `chmod` commands on the unzipped content.
build.gradle
plugins {
    id 'org.hidetake.ssh' version '2.9.0'
}

task zipWebsite (type: Zip, dependsOn: 'buildSite') {   // ➀
    archiveFileName = 'website.zip'
    destinationDirectory = file('build')
    from 'build/reports'
}

remotes {
    webServer {
        host = 'IP address'                  // placeholder: your server
        user = 'userName'                    // placeholder: your login name
        identity = file('ssh-private-key')   // ➁ placeholder: path to your key
    }
}

task deploy (dependsOn: 'zipWebsite') {
    doLast {
        ssh.run {
            session(remotes.webServer) {
                // 'websitePath' is a placeholder for the served directory
                put from: 'build/website.zip', into: 'websitePath'       // ➂
                execute 'unzip websitePath/website.zip -d websitePath'   // ➃
            }
        }
    }
}
- ➀ Zip up the website. Remember, we have assumed that the website is constructed in `build/reports` by a `buildSite` task.
- ➁ We will assume that we have added this key to a running key agent, so that we will not get prompted for a passphrase.
- ➂ Copy the zip file to the remote host.
- ➃ Tell the remote host to unpack the zip file.
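For comparison, the same steps can be carried out by hand from a shell. The sketch below is an assumption-laden illustration rather than a tested deployment script: the function name `deploy_zip` and its three arguments are hypothetical, it assumes that `zip`, `scp`, and `ssh` are installed locally and `unzip` on the server, and it adds the optional `chmod` step from the list of plausible steps.

```shell
# Sketch: deploy a built website by zipping, uploading, and unpacking it.
# All three arguments are placeholders supplied by the caller.
deploy_zip() {
    site_dir="$1"      # e.g. build/reports
    remote="$2"        # e.g. userName@host
    remote_path="$3"   # directory that the web server serves
    zip_file="$(dirname "$site_dir")/website.zip"

    # Start from a fresh archive so that deleted pages do not linger in it.
    rm -f "$zip_file"
    # 1. Zip the site's contents (not the enclosing directory itself).
    (cd "$site_dir" && zip -qr ../website.zip .) || return 1
    # 2. Upload the archive.
    scp "$zip_file" "$remote:$remote_path/" || return 1
    # 3. Unpack on the server; -o overwrites the previous deployment.
    ssh "$remote" "unzip -o '$remote_path/website.zip' -d '$remote_path'" || return 1
    # 4. Make the unpacked content readable by the web server.
    ssh "$remote" "chmod -R a+rX '$remote_path'"
}
```

A call would look like `deploy_zip build/reports userName@host /path/to/website`, with all three values replaced by your own.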
1.2 Deploying via rsync
The Java library `rsync4j-all` provides a Java interface to `rsync`:
- On Linux/macOS machines, it assumes that a native `rsync` command is present.
- On Windows machines, it provides a minimal version of the Cygwin environment with `ssh` and `rsync` executables.
build.gradle
buildscript {
    /*...*/
    dependencies {
        // ⋮
        classpath "com.github.fracpete:rsync4j-all:3.1.2-15"
    }
}

import com.github.fracpete.rsync4j.RSync;   // ➀
import com.github.fracpete.processoutput4j.output.ConsoleOutputProcessOutput;

task deployWebsite (dependsOn: "buildSite") {
    doLast {
        def sourceDir = "build/reports/";
        def destURL = "destination";   // ➁
        RSync rsync = new RSync()
            .source(sourceDir)
            .destination(destURL)
            .recursive(true)
            .archive(true)
            .delete(true)
            .verbose(true)
            .rsh("ssh -o IdentitiesOnly=yes");   // ➂
        // Run rsync and echo its output to the console.
        ConsoleOutputProcessOutput output = new ConsoleOutputProcessOutput();
        output.monitor(rsync.builder());
    }
}
- ➀ The `rsync4j` library, loaded in the dependencies section above, is not a plugin. It’s just a library. But we can use Java libraries in Gradle (Groovy) pretty much like we use them in Java.
- ➁ Give the `ssh` URL for the website destination here.
- ➂ Again, we will assume that we have a suitable ssh key in a running key agent.
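It can help to see what the builder calls amount to on the command line. The sketch below shows what we would expect the equivalent native `rsync` invocation to look like, on the assumption that each builder method corresponds to the long option of the same name. The function name `rsync_deploy` is hypothetical, and the destination is whatever `ssh` URL you would have given in the Gradle task.

```shell
# Rough command-line counterpart of the rsync4j builder calls above.
# Both arguments are supplied by the caller; nothing here is project-specific.
rsync_deploy() {
    source_dir="$1"   # a trailing slash means "the contents of this directory"
    dest_url="$2"     # e.g. userName@host:/path/to/website/
    rsync --recursive --archive --delete --verbose \
        --rsh="ssh -o IdentitiesOnly=yes" \
        "$source_dir" "$dest_url"
}
```

Note that `--delete` removes files from the destination that no longer exist in the source, so it is worth double-checking the destination path before running something like `rsync_deploy build/reports/ userName@host:/path/to/website/`.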
1.3 Potential Issue - the CS VPN
These techniques would work very well for posting project reports to a CS webserver, provided that the build process is running from inside the CS VPN.
- Right now, that’s not a problem.
- But in the next lesson, we will be looking at off-loading this task onto a “continuous integration runner”, and our runners are going to be on the wrong side of the VPN.
2 GitHub Pages
GitHub provides a web server (called GitHub Pages) for projects hosted on it. A project hosted at `https://github.com/owner/project` will have web pages hosted at `https://owner.github.io/project/`.
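As a quick illustration of that naming rule, the following sketch derives the Pages URL from a repository URL. The helper name `pages_url` is hypothetical, and it covers only the default project-site pattern described here, not custom domains or other special cases.

```shell
# Hypothetical helper: derive the default GitHub Pages URL of a project
# from its repository URL (HTTPS or SSH form).
pages_url() {
    repo="${1%/}"          # drop an optional trailing slash
    repo="${repo%.git}"    # drop an optional .git suffix
    case "$repo" in
        https://github.com/*) repo="${repo#https://github.com/}" ;;
        git@github.com:*)     repo="${repo#git@github.com:}" ;;
        *)
            echo "pages_url: not a GitHub repository URL: $1" >&2
            return 1
            ;;
    esac
    owner="${repo%%/*}"    # text before the first slash
    project="${repo#*/}"   # text after the first slash
    printf 'https://%s.github.io/%s/\n' "$owner" "$project"
}

# Prints https://owner.github.io/project/
pages_url https://github.com/owner/project
```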
But GitHub has adopted an unusual, `git`-centric, approach to deployment.
When you activate GitHub Pages for your project, you specify a `git` branch to manage your website content.
- By default, this branch is named `gh-pages`.
For example,
- I have a page at https://sjzeil.github.io/CoWeM/userReference/Directory/outline/index.html.
- That means that, in my project repository, I have a file in my `gh-pages` branch at `userReference/Directory/outline/index.html` (relative to the root of my project).
- Now, the `main` branch of my project looks something like the directory structure at the beginning of this lesson.
  - For example, my project root directory has a `README.md` file, a `settings.gradle`, and a `project` directory that, in turn, holds my `src/` directory.
- But you won’t find any of that in the `gh-pages` branch.
This makes the `gh-pages` branch different from other examples of branching that we have looked at, in that it does not mirror the structure of `main` at all.
2.1 Setting up the gh-pages branch
To get started, we need to
- Create/check out a new `gh-pages` branch.
- Remove everything from it except the `.git/` directory (which holds the `git` internal information) and, maybe, the README.
- Commit those changes.
- Push to the remote repository, establishing the `gh-pages` branch there.
This can lead to a moment of panic when we look at our project and see that everything is now gone. But, of course, we only need to check out the `main` branch to get everything back.
- Don’t try to do these steps from inside an IDE.
- In fact, make sure that you have closed this project in your IDE so that it can’t see any of this.
- Otherwise your IDE’s Java/gradle project settings will seem to have disappeared, and your IDE will get terribly confused.
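At the command line, the set-up steps might look something like the sketch below. This is one possible way to do it, not a prescribed recipe: the function name `init_gh_pages` is hypothetical, and it assumes a clean working tree, a remote named `origin`, and that the README (if kept) is called `README.md`.

```shell
# Sketch: create a gh-pages branch that holds only the README, push it,
# and then return to the branch we started from.
# Run from the root of a working tree with no uncommitted changes.
init_gh_pages() {
    start_branch=$(git rev-parse --abbrev-ref HEAD) || return 1
    git checkout -q -b gh-pages || return 1
    # Remove every tracked file from the new branch (.git/ is untouched).
    git rm -r -q -- . || return 1
    # Optionally keep the README; silently skipped if the project has none.
    git checkout -q "$start_branch" -- README.md 2>/dev/null || true
    git commit -q -m "Start gh-pages branch" || return 1
    # Establish the branch on the remote repository.
    git push -q -u origin gh-pages || return 1
    # Everything "comes back" as soon as we return to the original branch.
    git checkout -q "$start_branch"
}
```

Untracked files, such as the contents of `build/`, are left alone by `git rm`, so they remain on disk throughout.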
2.2 Deploying the gh-pages branch
How would we deploy a website that we have built in, say, `build/reports`?
It’s tempting to start off with:
- Check out the `gh-pages` branch.
But, oops! The `build/reports` directory might not survive that. Certainly our `build.gradle` file will not. They don’t exist in the `gh-pages` branch. And, if I had any changed files that I had not yet committed, I wouldn’t be allowed to switch branches anyway.
So, I’m inclined instead to leave my `main` branch files in place, and construct the `gh-pages` files in a separate location. In fact, I will make a separate clone for this purpose.
1. Clone my local repository, placing the new clone in `build/gh-pages` and checking out the `gh-pages` branch.
2. Delete all files (except for `.git/`) inside `build/gh-pages/`.
3. Copy all files from `build/reports/` to `build/gh-pages/`.
4. Commit those changes in the new clone (to the `gh-pages` branch).
5. Push the new clone.
6. Optionally, push the local repository.
There’s a subtlety to this setup. Because the new clone is cloned from the local repository, its origin is our local repository, not the remote GitHub repository. Pushing the new clone updates the `gh-pages` branch in the local repository, not the GitHub repository.
This has pros and cons.
- Pro: You don’t need to actually be connected to GitHub until step 6.
- Pro: You can examine your new website before deploying it by switching your local repository to branch `gh-pages` and viewing the files directly.
- Con: If you have not included step 6, then you will need to remember to push the `gh-pages` branch in your local repository in order to update the actual website.
2.2.1 The git commands
Before trying to do this in gradle, let’s look at what we would do to accomplish these steps if we were working directly at the command line.
1. Clone my local repository, placing the new clone in `build/gh-pages`.

        mkdir -p build/gh-pages
        thisRepo=`readlink -f ..`
        git clone file://$thisRepo -b gh-pages build/gh-pages

2. Delete all files (except for `.git/`) inside `build/gh-pages/`.

        rm -rf build/gh-pages/*

3. Copy all files from `build/reports/` to `build/gh-pages/`.

        cp -rf build/reports/* build/gh-pages

4. Commit those changes in the new clone (to the `gh-pages` branch).

        cd build/gh-pages
        git add .
        git commit -m "Updating website"

5. Push the new clone.

        git push

6. Optionally, push the local repository.

        cd ../..
        git push --all
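Gathered into a single function, steps 1 through 5 might look like the sketch below. It is a variation on the commands above rather than a transcription: the name `publish_site` and its arguments are hypothetical, it also clears and copies dot-files, and it skips the commit when nothing has changed, since `git commit` would otherwise fail.

```shell
# Sketch: rebuild the gh-pages branch of a local repository from a
# freshly built website, working in a scratch clone.
publish_site() {
    repo="$1"    # absolute path of the local working repository
    site="$2"    # directory holding the built website, e.g. build/reports
    clone="$3"   # scratch location for the clone, e.g. build/gh-pages

    # 1. Clone the local repository, checking out its gh-pages branch.
    rm -rf "$clone"
    git clone -q -b gh-pages "file://$repo" "$clone" || return 1
    # 2. Empty the clone, sparing only its .git directory.
    find "$clone" -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} + || return 1
    # 3. Copy the site's contents into the clone.
    cp -R "$site"/. "$clone"/ || return 1
    # 4. Stage everything (including deletions) and commit if needed.
    git -C "$clone" add -A . || return 1
    if ! git -C "$clone" diff --cached --quiet; then
        git -C "$clone" commit -q -m "Updating website" || return 1
    fi
    # 5. Push to the clone's origin, which is the local repository.
    git -C "$clone" push -q origin gh-pages
}
```

Afterwards, `git -C build/gh-pages config --get remote.origin.url` shows a `file://` URL, confirming that the clone's origin is the local repository. The push in step 5 is refused if the local repository currently has `gh-pages` checked out, so this is best run while on `main`.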
2.3 Deploying to gh-pages from gradle
To automate this process in gradle, we need a mixture of file manipulation commands and `git` commands.
gradle already provides functions for file manipulation. What about the `git` commands?
There are gradle plugins for working with `git`. But I find them more than a little tedious to work with. If we know that the only machines that we will be running on will have a native version of `git`, we can do this more easily by using the gradle `exec` command, which can run a native OS command in a specified working directory:
//////// Website publication on GitHub pages ///////////////////
task clonePages() {
    doLast {
        def thisRepo = rootProject.projectDir.toString()
        def pagesDir = "$buildDir/gh-pages"
        // Make sure the build directory exists, then discard any old clone.
        mkdir pagesDir
        project.delete {
            delete pagesDir
        }
        exec {
            workingDir = '.'
            commandLine = ['git', 'clone', 'file://' + thisRepo, '-b', 'gh-pages', pagesDir]
        }
    }
}

task copyReports (dependsOn: ['buildSite', 'clonePages']) {
    doLast {
        // pagesDir is local to each task, so it must be defined here too.
        def pagesDir = "$buildDir/gh-pages"
        ant.copy (todir: pagesDir) {
            fileset(dir: 'build/reports')
        }
    }
}

task updateGHPages (dependsOn: 'copyReports') {
    group = "Reporting"
    description 'Copies reports to the gh-pages branch in preparation for a future push to GitHub'
    doLast {
        def pagesDir = "$buildDir/gh-pages"
        exec {
            workingDir = pagesDir
            commandLine = ['git', 'add', '.']
        }
        exec {
            workingDir = pagesDir
            commandLine = ['git', 'commit', '-m', 'Updating-webpages']
        }
        exec {
            workingDir = pagesDir
            commandLine = ['git', 'push']
        }
    }
}
The `exec` commands allow us to give an OS command as an array of string values. You can see that the steps performed are largely the same as in the command-line version above.
2.3.1 GitHub Actions has a Shortcut
In an upcoming lesson, we will see that GitHub has packaged up a simple means to execute the equivalent of the above steps, but only when running your build on a server provided by GitHub.
2.4 Deploying to GitHub Pages Using Free Accounts
If you are working with a private repository from a free account, then GitHub Pages will not be available to you.
A workaround is to create a second, public, repository whose only purpose is to host the website. Because this second repository will not have anything in it but the web content (which was always going to be public anyway), there’s no great loss of security in making this second repository public.
The second repository can also be simpler. It doesn’t need multiple branches to separate the project code from the website, because no project code will be stored there. So we can tell GitHub to use the `main` branch as the source of the website.
2.4.1 The Steps Required
You have two GitHub repositories: a private “main” repository with your code and a public “website” repository.
Within your main project build:
- Use `git` to clone your website repository into a convenient location, e.g., `build/gh-pages`.
- Copy your constructed website (e.g., `build/jbake`) to the website clone in `build/gh-pages/`.
- Commit and push the changes to the website clone.
2.4.2 The Gradle Tasks
Here I use the Gradle `exec` function to run the appropriate `git` commands.
`exec` takes two main parameters:
- `workingDir`: the directory within which to issue the command.
- `commandLine`: the command to run, presented as a list `[ ]` of strings.
//////// Website publication on GitHub pages ///////////////////
def websiteRepo='git@github.com:sjzeil/pages-sandbox.git'   // ➀

task clearPages(type: Delete) {
    delete 'build/gh-pages'
}

task clonePages(dependsOn: ['clearPages']) {   // ➁
    doLast {
        exec {
            workingDir = '.'
            commandLine = ['git', 'clone', websiteRepo, 'build/gh-pages']
        }
    }
}

task copyWebsite (dependsOn: ['reports', 'clonePages']) {   // ➂
    doLast {
        ant.copy (todir: 'build/gh-pages') {
            fileset(dir: 'build/jbake')
        }
    }
}

task updateGHPages (dependsOn: 'copyWebsite') {
    group = "Reporting"
    description 'Copies reports to the website repo and pushes to GitHub'
    doLast {
        exec {
            workingDir = 'build/gh-pages'   // ➃
            commandLine = ['git', 'add', '.']
        }
        exec {
            workingDir = 'build/gh-pages'
            commandLine = ['git', 'commit', '-m', 'Updating-webpages']   // ➄
        }
        exec {
            workingDir = 'build/gh-pages'
            commandLine = ['git', 'push']
        }
    }
}
- ➀ Here we provide the `git` URL to clone the second repository.
  - Obviously, this will depend on your project.
- ➁ We clone the website repository, putting the clone within our `build/` directory.
- ➂ We copy the web content into the cloned area.
  - This could probably also be done as a Gradle `Copy` task.
- ➃ We tell `git` to stage all of the files we have copied.
- ➄ Finally, we commit and then push.