Deploying Websites

Steven J Zeil

Last modified: Apr 8, 2022

Abstract

We’ve looked at how to use automated tools to generate project websites.

In this lesson we will look at how to get those deployed to a web server.

Suppose that we have a project that looks like this:

projectRoot/
|-- .git/
|   |--  ⋮ git's internal storage - do not touch!
|-- gradle/
|   |-- ⋮ gradle wrapper files
|-- gradlew
|-- README.md
|-- settings.gradle
|-- .gitignore
|-- project/
|   |-- .gitignore
|   |-- build.gradle
|   |-- src/
|   |   |-- main/
|   |   |   |-- java/
|   |   |   |   |-- ⋮
|   |   |   |-- html/
|   |   |   |   |-- index.html
|   |   |   |   |-- ⋮  other static content for website
|   |   |-- test/
|   |   |   |-- java/
|   |   |   |   |-- ⋮
|   |-- build/
|   |   |-- classes/
|   |   |   |-- ⋮
|   |   |-- libs/
|   |   |   |-- ⋮
|   |   |-- reports/
|   |   |   |-- ⋮

Assume that we have already added a buildSite task to our build.gradle file that constructs our desired website in build/reports/.

How do we put that website onto a web server for the world to enjoy?

1 Deploying Websites Via SSH

In many cases, we have SSH/SFTP access to a web server, which is configured so that files and directories copied into a certain location will be served and mapped onto URLs. For example, on the CS Linux network, files stored in /home/yourName/secure_html/whatever are served at the URLs https://www.cs.odu.edu/~yourName/whatever.
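As a minimal sketch of that mapping, the shell function below translates a server path into the URL at which it would be served. The function name url_for is hypothetical, and the rule it applies is only the one stated above for the CS Linux network.

```shell
#!/bin/sh
# Hypothetical helper: map a path under /home/NAME/secure_html/ to its URL,
# following the CS Linux network convention described above.
url_for() {
    path=$1
    case "$path" in
        /home/*/secure_html/*)
            rest=${path#/home/}              # yourName/secure_html/whatever
            user=${rest%%/*}                 # yourName
            page=${rest#*/secure_html/}      # whatever
            printf 'https://www.cs.odu.edu/~%s/%s\n' "$user" "$page"
            ;;
        *)
            echo "not under a secure_html directory: $path" >&2
            return 1
            ;;
    esac
}

url_for /home/zeil/secure_html/officehours/index.html
```

Knowing the mapping in both directions is what lets us decide where on the server a file must be copied so that it appears at a given URL.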

So, if I wanted to manually update the file served at https://www.cs.odu.edu/~zeil/officehours/index.html, I could

  1. Get a local copy of the file,

```
scp zeil@linux.cs.odu.edu:/home/zeil/secure_html/officehours/index.html .
```
or

```
wget https://www.cs.odu.edu/~zeil/officehours/index.html
```
  2. Edit the file.
  3. Upload the edited file to the server:

    scp index.html zeil@linux.cs.odu.edu:/home/zeil/secure_html/officehours/

That’s fine for working with one file or two. But if I have a website with many files (e.g., this course website) and I have updated several of them, I’d rather not risk forgetting to upload one or two of the changed files. Most command-line versions of scp have an option for recursive copy of directories, e.g.,

scp -r cs350/website/* zeil@linux.cs.odu.edu:/home/zeil/secure_html/cs350/

But a better choice, in many cases, is rsync. rsync is a program specifically designed for copying large directory trees in circumstances where only selected files are likely to have changed.

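When rsync is given a source directory and a destination directory, it first decides which files have changed. By default it does this with a quick check of each file's size and modification time; with the -c option it compares checksums instead. For a file that has changed, it computes checksums over blocks of the file, and blocks whose checksums match the copy on the other side are presumed to be identical. rsync then transfers only the files that have changed (or only the changed blocks of large files). So, if I have rsync available, I am much more likely to do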

rsync -auzv -e ssh cs350/website/ zeil@linux.cs.odu.edu:/home/zeil/secure_html/cs350/

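The “-e ssh” part of that command tells rsync to do its communications with the remote machine via SSH. Current versions of rsync already use ssh by default for a host:path destination like this one, so the option mainly makes the choice explicit. The alternative is an rsync-specific protocol that requires a dedicated rsync server to be running on the remote machine.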
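To illustrate the idea of sending only what has changed, here is a simplified sketch. It is not rsync's actual algorithm: it simply compares file contents with cmp, working on two local directories, and the function name changed_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the files under SRC that are missing from DEST
# or whose contents differ. A simplified stand-in for rsync's change
# detection, not its real algorithm.
changed_files() {
    src=$1
    dest=$2
    ( cd "$src" && find . -type f ) | sort | while IFS= read -r f; do
        if [ ! -f "$dest/$f" ] || ! cmp -s "$src/$f" "$dest/$f"; then
            printf '%s\n' "${f#./}"
        fi
    done
}

# Demonstration on two throwaway directory trees.
work=$(mktemp -d)
mkdir -p "$work/site/css" "$work/server/css"
echo '<h1>Home</h1>'  > "$work/site/index.html"
echo 'body { }'       > "$work/site/css/style.css"
echo '<h1>About</h1>' > "$work/site/about.html"
cp "$work/site/index.html" "$work/server/index.html"        # unchanged copy
echo 'body { color: red }' > "$work/server/css/style.css"   # differs from the site
# about.html is new, so the server copy does not have it at all.

changed_files "$work/site" "$work/server"
```

Only about.html and css/style.css are reported, so only those two files would need to be uploaded; index.html is left alone.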

The biggest limitation to using rsync is that it is not available on all machines. It can be easily installed on Linux and macOS machines. Native Windows ports of it have been, in my opinion, unreliable. It is best run in the Windows Subsystem for Linux (WSL) or the Cygwin Unix emulator, but that introduces the complication of mapping paths between the Unix and Windows file systems.
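To give a feel for that path mapping: under WSL's default configuration, a Windows drive such as C: appears as /mnt/c. The sketch below performs the translation by hand for a simple drive-letter path. The function name to_wsl_path is hypothetical and the default mount point is assumed; WSL also ships a wslpath utility for the same job.

```shell
#!/bin/sh
# Hypothetical helper: convert a Windows path such as C:\Users\zeil\site
# to the form WSL uses by default, /mnt/c/Users/zeil/site.
to_wsl_path() {
    winpath=$1
    drive=$(printf '%s' "$winpath" | cut -c1 | tr 'A-Z' 'a-z')   # C -> c
    rest=$(printf '%s' "$winpath" | cut -c3- | tr '\\' '/')      # \Users\.. -> /Users/..
    printf '/mnt/%s%s\n' "$drive" "$rest"
}

to_wsl_path 'C:\Users\zeil\cs350\website'
```

The resulting path is what would be given to rsync as the source directory when running it inside WSL.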

1.1 Deploying via SSH

The gradle plugin org.hidetake:gradle-ssh-plugin allows you to copy files to a remote machine and execute commands on it over SSH from within a Gradle build.

A plausible set of Gradle steps:

  1. Create a .zip file of the entire constructed website
  2. Use scp to upload the zip file to the remote server.
  3. Use ssh to issue an unzip command on the remote server.
  4. If necessary, use ssh to issue chmod commands on the unzipped content.

build.gradle

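plugins {
  id 'org.hidetake.ssh' version '2.9.0'
}

// ➀ Package the site produced by buildSite as build/website.zip
task zipWebsite (type: Zip, dependsOn: 'buildSite') {
    archiveFileName = 'website.zip'
    destinationDirectory = file('build')
    from 'build/reports'
}

remotes {
  webServer {
    host = 'IP address'                  // placeholder: name or IP address of the server
    user = 'userName'                    // placeholder: login name on the server
    identity = file('ssh-private-key')   // ➁ placeholder: path to your SSH private key
  }
}

task deploy (dependsOn: 'zipWebsite') {
  doLast {
    ssh.run {
      session(remotes.webServer) {
        // ➂ Upload the zip file; 'websitePath' is a placeholder for the server directory
        put from: 'build/website.zip', into: 'websitePath'
        // ➃ Unpack the zip file on the server
        execute 'unzip websitePath/website.zip -d websitePath'
      }
    }
  }
}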
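For comparison, the four steps listed above can be sketched as ordinary shell commands. This is an illustration under stated assumptions rather than a tested deployment script: the host and server directory are placeholders, and the run wrapper with its DRY_RUN switch is a hypothetical convenience so that the commands can be inspected before anything is sent to a server.

```shell
#!/bin/sh
# Sketch of the four deployment steps as ordinary shell commands.
# The host, server directory, and file names are placeholders (assumptions).
# DRY_RUN defaults to 1, so commands are printed rather than executed;
# run with DRY_RUN=0 to carry them out for real.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

deploy_site() {
    host=$1       # e.g. userName@server
    site_dir=$2   # server directory that is mapped onto URLs

    # 1. Create a .zip file of the entire constructed website.
    run sh -c 'cd build/reports && zip -qr ../website.zip .'
    # 2. Use scp to upload the zip file to the remote server.
    run scp build/website.zip "$host:$site_dir/"
    # 3. Use ssh to issue an unzip command on the remote server
    #    (-o overwrites existing files without prompting).
    run ssh "$host" "unzip -o $site_dir/website.zip -d $site_dir"
    # 4. If necessary, use ssh to adjust permissions on the unzipped content.
    run ssh "$host" "chmod -R a+rX $site_dir"
}

deploy_site userName@webServer websitePath
```

Each run line corresponds to one of the numbered steps, which is also how the put and execute calls in the Gradle task line up with them.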

1.2 Deploying via rsync

The Java library rsync4j-all provides a Java interface to rsync:


build.gradle

buildscript {
    /*...*/

    dependencies {
        ⋮
        classpath "com.github.fracpete:rsync4j-all:3.1.2-15"
    }
}

import com.github.fracpete.rsync4j.RSync;  // ➀
import com.github.fracpete.processoutput4j.output.ConsoleOutputProcessOutput;

task deployWebsite (dependsOn: "buildSite") {
    doLast {
        def sourceDir = "build/reports/";
        def destURL = "destination";  // ➁ placeholder for the rsync destination
        RSync rsync = new RSync()
                .source(sourceDir)
                .destination(destURL)
                .recursive(true)
                .archive(true)
                .delete(true)
                .verbose(true)
                .rsh("ssh -o IdentitiesOnly=yes");  // ➂ run rsync over ssh
        ConsoleOutputProcessOutput output
                = new ConsoleOutputProcessOutput();
        output.monitor(rsync.builder());
    }
}

1.3 Potential Issue - the CS VPN

These techniques would work very well for posting project reports to a CS webserver, provided that the build process is running from inside the CS VPN.

2 GitHub Pages

GitHub provides a web server (called GitHub Pages) for projects hosted on it. A project hosted at https://github.com/owner/project will have web pages hosted at https://owner.github.io/project/.
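A small sketch of that URL rule, using a hypothetical pages_url function. It covers only the project-site pattern stated above and assumes an https://github.com/owner/project style repository URL.

```shell
#!/bin/sh
# Hypothetical helper: derive the GitHub Pages URL from a repository URL
# of the form https://github.com/owner/project (optionally ending in .git).
pages_url() {
    repo=${1%.git}                    # drop a trailing .git, if any
    path=${repo#https://github.com/}  # owner/project
    owner=${path%%/*}                 # owner
    project=${path#*/}                # project
    printf 'https://%s.github.io/%s/\n' "$owner" "$project"
}

pages_url https://github.com/owner/project
```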

But GitHub has adopted an unusual, git-centric, approach to deployment.

When you activate GitHub Pages for your project, you designate a git branch to hold your website content.

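For example, a branch named gh-pages is commonly used for this purpose, and it holds only the website content, not the project's source code.

This makes the gh-pages branch different from other examples of branching that we have looked at, in that it does not mirror the structure of main at all.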

2.1 Setting up the gh-pages branch

To get started, we need to

  1. Create/check out a new gh-pages branch.
  2. Remove everything from it except the .git/ directory (which holds the git internal information) and, maybe, the README.
  3. Commit those changes.
  4. Push to the remote repository, establishing the gh-pages branch there.

This can lead to a moment of panic when we look at our project and see that everything is now gone. But, of course, we only need to check out the main branch to get everything back.
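The four set-up steps might look like the following at the command line. This is a sketch that runs against throwaway repositories so that nothing real is touched: the bare repository stands in for GitHub, and every name and path is a placeholder. In a real project only the four numbered commands would be needed.

```shell
#!/bin/sh
# Sketch of the four set-up steps on throwaway repositories.
# The bare repository stands in for GitHub; all names are placeholders.
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"        # stand-in for the GitHub repository

git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/main        # make sure the first branch is "main"
git config user.name  "Example User"
git config user.email "user@example.com"
echo "# Demo" > README.md
echo "// build script" > build.gradle
git add .
git commit -q -m "Initial project"
git remote add origin "$work/remote.git"
git push -q -u origin main

# 1. Create/check out a new gh-pages branch.
git checkout -q -b gh-pages
# 2. Remove everything from it (git rm leaves the .git/ directory alone).
git rm -r -q .
# 3. Commit those changes.
git commit -q -m "Start with an empty gh-pages branch"
# 4. Push to the remote repository, establishing gh-pages there.
git push -q -u origin gh-pages

ls -A                  # only .git is left in the working directory
git checkout -q main
ls -A                  # README.md and build.gradle are back
```

The two ls commands show the moment of panic and its cure: the files vanish from the working directory on gh-pages and reappear as soon as main is checked out again.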

2.2 Deploying the gh-pages branch

How would we deploy a website that we have built in, say, build/reports?

It’s tempting to start off with:

  1. Check out the gh-pages branch.

But, oops! The build/reports directory might not survive that. Certainly our build.gradle file will not. They don’t exist in the gh-pages branch. And, if I had any changed files that I had not yet committed, I won’t be allowed to switch branches anyway.

So, I’m inclined instead to leave my main branch files in place, and construct the gh-pages files in a separate location. In fact, I will make a separate clone for this purpose.

  1. Clone my local repository, placing the new clone in build/gh-pages and checking out the gh-pages branch.
  2. Delete all files (except for .git/) inside build/gh-pages/.
  3. Copy all files from build/reports/ to build/gh-pages/.
  4. Commit those changes in the new clone (to the gh-pages branch).
  5. Push the new clone.
  6. Optionally, push the local repository.


There’s a subtlety to this setup. Because the new clone is cloned from the local repository, its origin is our local repository, not the remote GitHub repository. Pushing the new clone updates the gh-pages branch in the local repository, not the GitHub repository.

This has pros and cons.
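The subtlety is easy to see in a small experiment. The sketch below uses a throwaway repository with placeholder names; the GitHub URL is only recorded as a remote and is never contacted.

```shell
#!/bin/sh
# Demonstration on a throwaway repository; all names and paths are placeholders.
set -e
work=$(mktemp -d)
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/main
git config user.name  "Example User"
git config user.email "user@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Initial project"
git branch gh-pages
# Placeholder for the GitHub remote; it is never contacted in this sketch.
git remote add origin https://github.com/owner/project.git

# Clone the local repository, checking out gh-pages in the new clone.
git clone -q -b gh-pages "file://$work/project" build/gh-pages

git remote get-url origin                      # the GitHub URL
git -C build/gh-pages remote get-url origin    # the local repository

# A commit pushed from the clone lands in the local repository's gh-pages.
cd build/gh-pages
git config user.name  "Example User"
git config user.email "user@example.com"
echo '<h1>Site</h1>' > index.html
git add .
git commit -q -m "Updating website"
git push -q
cd ../..
git log -1 --format=%s gh-pages                # prints: Updating website
```

Nothing has reached GitHub at this point. That happens only when the local repository itself is pushed.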

2.2.1 The git commands

Before trying to do this in gradle, let’s look at what we would do to accomplish these steps if we were working directly at the command line.

  1. Clone my local repository, placing the new clone in build/gh-pages.

    mkdir -p build/gh-pages
    thisRepo=`readlink -f ..`
    git clone file://$thisRepo -b gh-pages build/gh-pages
    
  2. Delete all files (except for .git/) inside build/gh-pages/.

    rm -rf build/gh-pages/*
    
  3. Copy all files from build/reports/ to build/gh-pages/.

    cp -rf build/reports/* build/gh-pages
    
  4. Commit those changes in the new clone (to the gh-pages branch).

    cd build/gh-pages
    git add .
    git commit -m "Updating website"
    
  5. Push the new clone.

    git push
    
  6. Optionally, push the local repository.

    cd ../..
    git push --all
    

2.3 Deploying to gh-pages from gradle

To automate this process in gradle, we need a mixture of file manipulation commands and git commands.

gradle already provides functions for file manipulation. What about the git commands?

There are gradle plugins for working with git. But I find them more than a little tedious to work with. If we know that the only machines that we will be running on will have a native version of git, we can do this more easily by using the gradle exec command, which can run a native OS command in a specified working directory:

////////  Website publication on GitHub pages ///////////////////


task clonePages() {
    doLast {
        def thisRepo = rootProject.projectDir.toString()
        def pagesDir = "$buildDir/gh-pages"
        mkdir pagesDir
        project.delete {
            delete pagesDir
        }
        exec {
            workingDir = '.'
            commandLine = ['git', 'clone', 'file://' + thisRepo, '-b', 'gh-pages', pagesDir]
        }
    }
}

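task copyReports (dependsOn: ['buildSite', 'clonePages']) {
    doLast {
        // pagesDir is local to each task's doLast block, so define it again here
        def pagesDir = "$buildDir/gh-pages"
        ant.copy (todir: pagesDir) {
            fileset(dir: 'build/reports')
        }
    }
}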

task updateGHPages (dependsOn: 'copyReports') {
    group = "Reporting"
    description  'Copies reports to the gh-pages branch in preparation for a future push to GitHub'
    doLast {
        def pagesDir = "$buildDir/gh-pages"
        exec {
            workingDir = pagesDir
            commandLine = ['git', 'add', '.']
        }
        exec {
            workingDir = pagesDir
            commandLine = ['git', 'commit', '-m', 'Updating-webpages']
        }
        exec {
            workingDir = pagesDir
            commandLine = ['git', 'push']
        }
    }
}

The exec calls let us give a native command as a list of string values, one per argument. You can see that the steps performed are largely the same as in the command-line version above.

2.3.1 GitHub Actions Has a Shortcut

In an upcoming lesson, we will see that GitHub has packaged up a simple means to execute the equivalent of the above steps, but only when running your build on a server provided by GitHub.

2.4 Deploying to GitHub Pages Using Free Accounts

If you are working with a private repository from a free account, then GitHub Pages will not be available to you.

A workaround is to create a second, public, repository whose only purpose is to host the website. Because this second repository will not have anything in it but the web content (which was always going to be public anyway), there’s no great loss of security in making this second repository public.

The second repository can also be simpler. It doesn’t need multiple branches to separate the project code from the website, because no project code will be stored there. So we can tell GitHub to use the main branch as the source of the website.

2.4.1 The Steps Required

You have two GitHub repositories: a private “main” repository with your code and a public “website” repository.

Within your main project build:

  1. Use git to clone your website repository into a convenient location, e.g., build/gh-pages.
  2. Copy your constructed website (e.g., build/jbake) to the website clone in build/gh-pages/
  3. Commit and push the changes to the website clone.
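At the command line, these three steps might be sketched as follows. The example runs against throwaway repositories: a bare repository stands in for the public website repository on GitHub, and every name and path is a placeholder. The check before committing is an addition of this sketch, included because git commit exits with an error when there is nothing to commit.

```shell
#!/bin/sh
# Sketch of the three steps against throwaway repositories. The bare
# repository stands in for the public website repository on GitHub;
# every name and path here is a placeholder.
set -e
work=$(mktemp -d)
git init -q --bare "$work/website.git"
git --git-dir="$work/website.git" symbolic-ref HEAD refs/heads/main

mkdir -p "$work/main/build/jbake"              # stand-in for the constructed site
echo '<h1>Project site</h1>' > "$work/main/build/jbake/index.html"
cd "$work/main"

publish() {
    rm -rf build/gh-pages
    # 1. Clone the website repository into build/gh-pages.
    git clone -q "$work/website.git" build/gh-pages
    # 2. Copy the constructed website into the clone.
    cp -r build/jbake/. build/gh-pages/
    # 3. Commit and push, but only if something actually changed.
    (
        cd build/gh-pages
        git config user.name  "Example User"
        git config user.email "user@example.com"
        git add .
        if git diff --cached --quiet; then
            echo "website unchanged; nothing to push"
        else
            git commit -q -m "Updating webpages"
            git push -q origin HEAD:refs/heads/main
            echo "website updated"
        fi
    )
}

publish    # first run: commits and pushes the site
publish    # second run: nothing has changed, so nothing is pushed
```

Pushing explicitly to main matches the choice of main as the source branch for the website repository.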

2.4.2 The Gradle Tasks

Here I use the Gradle exec function to run the appropriate git commands.

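exec takes two main parameters: workingDir, the directory in which the command is run, and commandLine, the command and its arguments given as a list of strings.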

////////  Website publication on GitHub pages ///////////////////


def websiteRepo='git@github.com:sjzeil/pages-sandbox.git'    // ➀ the public website repository

task clearPages(type: Delete) {
    delete 'build/gh-pages'
}

task clonePages(dependsOn: ['clearPages']) {                // ➁ clone the website repository
    doLast {
        exec {
            workingDir = '.'
            commandLine = ['git', 'clone', websiteRepo, 'build/gh-pages']
        }
    }
}


task copyWebsite (dependsOn: ['reports', 'clonePages']) {   // ➂ copy the constructed site into the clone
    doLast {
        ant.copy (todir: 'build/gh-pages') {
            fileset(dir: 'build/jbake')
        }
    }
}



task updateGHPages (dependsOn: 'copyWebsite') {
    group = "Reporting"
    description  'Copies reports to the website repo and pushes to GitHub'
    doLast {
        def pagesDir = "$buildDir/gh-pages"
        exec {
            workingDir = 'build/gh-pages'                                // ➃ run git inside the clone
            commandLine = ['git', 'add', '.']
        }
        exec {
            workingDir = 'build/gh-pages'
            commandLine = ['git', 'commit', '-m', 'Updating-webpages']   // ➄
        }
        exec {
            workingDir = 'build/gh-pages'
            commandLine = ['git', 'push']
        }
    }
}