Detecting Web Defacements using Javascript and Google App Engine

Most of you are familiar with the virtues of a programmer. There are three, of course: laziness, impatience, and hubris. - Larry Wall

3 Apr 2018

Introduction

A web application faces many kinds of attacks and security threats. The application can be hijacked to spread malware, sensitive data can be stolen, or the website can be defaced. Security professionals and developers have to defend against these threats and ensure the availability of the application as well as the confidentiality and integrity of its data.

This article shows how to implement a simple monitoring application using client-side javascript and a Java JSP/Servlet application running on Google App Engine. The application can detect unauthorized changes to static web content, e.g. web defacements, and alert the website administrator. Many websites already use client-side javascript for analytics and performance monitoring. The same technique can be used to monitor web content as well, to detect tampering or website defacement and limit its impact.

Although client-side scripts can be bypassed, doing so requires additional effort. Client-side javascript monitoring can provide some level of protection against simple automated attacks and defacement attempts. It can be an additional layer of defense against security threats. This article describes an attempt to extend client-side javascript into the area of security monitoring.

Design Overview and Approach

Many methods exist to monitor changes in web content. A host-based intrusion detection application can look for changes in the static web files. A remote monitoring service can poll the website regularly to detect changes. A proxy appliance can sit in front of a web server, monitoring the content passing through and detecting any changes. These and other methods can be used to form multiple layers of defense against web defacement attacks.

This article takes a different approach: it implements a client-side javascript that runs in the end user's browser, and a Java JSP/Servlet application hosted on Google App Engine. The client-side javascript can be embedded in the html files to be monitored. When a web page is loaded, the script traverses the DOM (Document Object Model) tree of the html document and uses the Web Crypto API to generate a sha256 hash. External files referenced by the document, such as images, javascript and CSS, are retrieved and their content is included when generating the hash. This information is then sent to the Google App Engine application.
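The hashing step can be sketched as follows. This is a minimal illustration rather than the monitoring script itself: the names `webCrypto` and `sha256Hex` are hypothetical, and the sketch assumes a runtime that provides the standard Web Crypto API and `TextEncoder` (a current browser in a secure context, or a recent Node.js).

```javascript
// Use the global Web Crypto object where available; older Node.js
// releases expose the same API through the built-in crypto module.
const webCrypto = (globalThis.crypto && globalThis.crypto.subtle)
    ? globalThis.crypto
    : require("crypto").webcrypto;

// Hypothetical helper: hash a string with SHA-256 and return the
// digest as a lowercase hexadecimal string.
async function sha256Hex(text) {
    // Encode the string as UTF-8 bytes before hashing.
    const bytes = new TextEncoder().encode(text);
    const digest = await webCrypto.subtle.digest("SHA-256", bytes);

    // Convert each byte of the resulting ArrayBuffer into two hex characters.
    return Array.from(new Uint8Array(digest))
        .map(function (b) { return b.toString(16).padStart(2, "0"); })
        .join("");
}

// Usage: the serialized page content would be passed in place of "abc".
sha256Hex("abc").then(function (hex) {
    console.log(hex);
});
```

The full script listed later in this article does the same work with older language features so that it also runs on IE 11.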

The application processes the information and sends an email alert if the sha256 hash is different from the original one stored in its database (Google Cloud Datastore). The following diagram illustrates this.

Fig 1. Approach to Monitor Web Change

The approach is similar to Application Performance Management (APM) and web analytics monitoring that use client-side scripts. To add greater resilience against tampering, the client-side javascript resides on Google App Engine and is injected into the html content through a reverse proxy. In this case, Nginx and a filter module implemented in an earlier article, Writing an Nginx Response Body Filter Module, are used.

The use of an Nginx reverse proxy enables the script to be injected into html content without having to modify application and html source files. The Nginx reverse proxy can run on a separate physical server. This provides additional defense against attempts to bypass the monitoring mechanism if the monitored web server is compromised.

The following illustrates the setup.

Fig 2. Nginx Reverse Proxy Inject Monitoring Script

The monitoring javascript tag is injected directly after the <head> tag of the html document. Thus, the monitoring script has a better chance of being the first script to run. For security monitoring, the order of script execution is important. Cross-origin Resource Sharing (CORS) has to be enabled on the Google App Engine application to allow connections from the monitoring script.
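The effect of the injection can be illustrated with a small string transformation. In the setup described here the work is done by the Nginx filter module, so the function below only sketches the result; the name `injectAfterHead` and the script path `/monitor.js` are made up for the example.

```javascript
// Illustration only: insert a script tag immediately after the first
// opening <head> tag, mimicking what the reverse proxy filter produces.
function injectAfterHead(html, scriptTag) {
    // Match <head> or <head ...attributes>, case-insensitively, but not <header>.
    const headOpen = /<head(\s[^>]*)?>/i;
    if (!headOpen.test(html)) {
        return html; // no <head> tag: leave the document untouched
    }
    // A replacer function is used so that "$" in scriptTag is not
    // interpreted as a replacement pattern.
    return html.replace(headOpen, function (match) {
        return match + scriptTag;
    });
}

const page = "<html><head><title>Demo</title></head><body></body></html>";
// The host is the remotehost value used by the script in this article;
// the path is hypothetical.
const tag = '<script src="https://demo2-nighthour.appspot.com/monitor.js"></script>';
console.log(injectAfterHead(page, tag));
```

Placing the tag first inside the head is what gives the monitoring script the chance to run before any other script on the page.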

If the sha256 hash of a monitored page doesn't match, the client-side javascript can redirect the browser to an error page. This reduces exposure time if a website has been defaced. The redirection is an option that can be configured on the Google App Engine application. An email alert will be sent to the website administrator if the content of a monitored webpage changes.

Rationale of the Approach

The method discussed in this article is not a 100% foolproof technique. It relies on a client-side script which can potentially be bypassed by sophisticated and skilled attackers. One key advantage of using a client-side script is its relative ease of use and deployment, particularly if there is a reverse proxy like Nginx available.

Another advantage is that processing is partly done on the client browser; enabling better scalability. If an in-line appliance is used to monitor web content, it can cause a performance hit if there are too many requests to process. An in-line appliance can block content if it detects unauthorized changes. Although slightly less effective, the client-side javascript can redirect the user browser to an error page if unauthorized changes are detected. This reduces exposure time in the event of a web defacement incident.

Compared to a remote monitoring service that polls a website regularly, the ability to take action on the client side is a plus. The client-side javascript approach is a sort of "hybrid" between an in-line appliance and a remote monitoring service. The biggest drawback is that it can potentially be tampered with or bypassed via injection attacks. Therefore, this method should be used together with other techniques and serve as an additional layer of defense.

For websites that use a CDN (Content Delivery Network) or a cloud-based web application firewall (WAF), a client-side script can easily be added as one more layer of security. For example, CloudFlare has an app platform that makes it easy to deploy client-side scripts. More information is available at CloudFlare Apps. Recent developments in CDNs involve moving functionality closer to the edge, nearer to end users. Client-side processing complements such a strategy.

Security Threat Model for Monitoring Mechanism

This simple monitoring application relies on the end user's browser executing the client-side javascript. It has been tested with Firefox 59, Chrome 65 and IE 11* on Windows 8.1, and with Edge on Windows 10. It has also been tested with Firefox 59 on Ubuntu 16.04 LTS, and with Firefox and Chrome on Android 7. The javascript makes use of the Web Crypto API and requires the monitored website to have Transport Layer Security (TLS) enabled. Browsers expose the Web Crypto hashing functions only in secure contexts, so the script will not work on plain HTTP sites.

* If an html document contains in-line SVG, IE 11 will generate a different sha256 hash from the one generated by Firefox and Chrome. On IE 11, the client-side javascript isn't able to extract the content inside the in-line SVG tag, which leads to a different hash. Firefox and Chrome are able to traverse in-line SVG content properly, and these browsers should be used for populating the monitoring application datastore. IE 11 will generate false positive alerts for such pages since its hash is different. The alert email contains the user-agent field, which can be used to help identify such false positives. For html documents that don't contain in-line SVG, IE 11 should generate the same hash as the other two browsers.

The backend application that is hosted on Google App Engine uses TLS as well. This ensures that information is transmitted over encrypted connections. The client-side javascript will send information including the page content to the backend application. Such information can be stored in the Google Cloud Datastore. The client-side javascript should not be deployed for pages or web content that may contain sensitive or private information.

It is assumed that clients with functioning browsers will view the monitored web pages and send reports to the backend application. The monitoring mechanism cannot work if clients don't visit the web pages. It also cannot work if a client browser doesn't support javascript, or if required objects such as Web Crypto are not supported by the client browser.

It is also assumed that the majority of clients that visit the monitored website are non-malicious. The client-side script relies on these visitors to send accurate reports to the backend. If a monitored website is hacked, an attacker may try to subvert and tamper with the client-side monitoring.

To resist such tampering, a separate reverse proxy can be used to inject the monitoring javascript. The monitoring javascript itself should be hosted on an external site. This protects against attempts to delete or modify the script file directly if the web server being monitored is compromised.

Additional protections include setting a Content Security Policy header on the monitored website, to reduce the risk of javascript injection attacks. If the monitored website suffers from injection vulnerabilities, an attacker can inject additional javascript to "poison" or interfere with the monitoring. For example, an attacker's script can try to hook on to AJAX calls and prevent hashes from being sent.
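As a rough sketch of such a header, the function below assembles a Content-Security-Policy value that allows scripts and AJAX connections only for the site itself and for the host serving the monitoring script. The function name `buildCsp` and the choice of directives are assumptions made for illustration; a real policy has to be tuned to the resources each site actually loads.

```javascript
// Hypothetical helper: build a restrictive policy for a monitored site.
// monitorHost is the origin that serves the monitoring script and
// receives its reports.
function buildCsp(monitorHost) {
    const directives = [
        // Scripts may only be loaded from the site and the monitoring host;
        // without 'unsafe-inline', injected inline scripts are refused.
        "script-src 'self' " + monitorHost,
        // AJAX requests may only go to the site and the monitoring host.
        "connect-src 'self' " + monitorHost,
        // Plugins are not needed by the monitoring mechanism.
        "object-src 'none'"
    ];
    return directives.join("; ");
}

// The host below is the remotehost value used by the script in this article.
console.log("Content-Security-Policy: " + buildCsp("https://demo2-nighthour.appspot.com"));
```

If monitored pages load images, scripts or stylesheets from other hosts, those hosts would also need to appear in `connect-src`, because the monitoring script fetches external resources with AJAX.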

The monitoring script implements some protective measures against hooking and poisoning attacks, but they are not 100% foolproof. The script obtains fresh copies of certain built-in objects and functions, such as XMLHttpRequest and WebCrypto, from iframes that it creates. It also tries to randomize when the protection functions are run.
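One inexpensive check in the same spirit is to test whether a built-in still reports itself as native code, which is the test the script's `checkPromise` function applies to `Promise`. The helper below generalizes that test. `isNative` is a hypothetical name, and the check is only a heuristic: an attacker who controls the page can also replace `Function.prototype.toString`, and bound functions report native code as well.

```javascript
// Heuristic: built-in functions stringify to something like
// "function stringify() { [native code] }", while functions defined
// in page script stringify to their own source text.
function isNative(fn) {
    if (typeof fn !== "function") {
        return false;
    }
    // Call the prototype method directly so that a toString property
    // overridden on fn itself is ignored.
    return Function.prototype.toString.call(fn).indexOf("[native code]") !== -1;
}

// A wrapper of the kind an AJAX or JSON hook might install around a built-in.
const original = JSON.stringify;
function hookedStringify(value) {
    return original(value);
}

console.log(isNative(JSON.stringify));
console.log(isNative(hookedStringify));
```

Because the check is easy to fool, it complements rather than replaces taking fresh objects from an iframe.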

Web change detection is currently based on sha256 hashes. It will not work well with dynamic content that changes frequently. This can be improved in the future. Document distance algorithm or machine learning techniques can potentially be used for detecting defacements and unauthorized changes.
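To make the document distance idea concrete, here is one simple measure, Jaccard similarity over the sets of words in two versions of a page. It is not part of the application described in this article, and the function names and the 0.7 threshold in the usage lines are arbitrary choices for the example.

```javascript
// Split text into a set of lowercase alphanumeric word tokens.
function wordSet(text) {
    const words = text.toLowerCase().match(/[a-z0-9]+/g) || [];
    return new Set(words);
}

// Jaccard similarity: size of the intersection divided by size of the
// union, from 0 (no words in common) to 1 (identical word sets).
function jaccardSimilarity(textA, textB) {
    const a = wordSet(textA);
    const b = wordSet(textB);
    if (a.size === 0 && b.size === 0) {
        return 1; // two empty documents are treated as identical
    }
    let shared = 0;
    a.forEach(function (word) {
        if (b.has(word)) {
            shared += 1;
        }
    });
    return shared / (a.size + b.size - shared);
}

const baseline = "Welcome to our site. Today's special offer ends Friday.";
const minorEdit = "Welcome to our site. Today's special offer ends Monday.";
const defaced = "hacked by someone";

// A small edit keeps the similarity high; a replaced page drops it sharply.
console.log(jaccardSimilarity(baseline, minorEdit) > 0.7);
console.log(jaccardSimilarity(baseline, defaced) > 0.7);
```

Unlike a hash comparison, a similarity score lets small, expected changes pass while still flagging a page whose content has been replaced.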

The monitoring javascript doesn't cater for Single Page Applications (SPA) or web pages that manipulate the html DOM after loading. It cannot monitor the DOM changes triggered by a Single Page Application. Although MutationObserver objects can be used to monitor DOM changes, this is currently not implemented.

The monitoring can detect changes in images, external javascript and css files that are referenced in the web page. If an image displayed in a document using an <img> tag is vandalized, a different hash will be calculated, and the monitoring application will alert the administrator. If the external resources are not hosted on the same website, CORS (Cross-Origin Resource Sharing) is required as the javascript uses AJAX to fetch external resources.
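Whether a given resource falls under this requirement can be decided by comparing origins. The sketch below uses the standard `URL` class; `isCrossOrigin` is a hypothetical helper and is not part of the monitoring script, which simply attempts the request.

```javascript
// Return true when resourceUrl has a different origin (scheme, host
// and port) from the page that references it.
function isCrossOrigin(resourceUrl, pageUrl) {
    // Relative URLs are resolved against the page before comparing.
    const resource = new URL(resourceUrl, pageUrl);
    const page = new URL(pageUrl);
    return resource.origin !== page.origin;
}

const monitoredPage = "https://www.nighthour.sg/myfile.html";
console.log(isCrossOrigin("/images/logo.png", monitoredPage));
console.log(isCrossOrigin("https://nighthour.sg/site.css", monitoredPage));
```

Note that `www.nighthour.sg` and `nighthour.sg` are different origins, so a page on one that loads resources from the other also needs CORS enabled for those resources.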

This monitoring mechanism should be used together with other monitoring techniques such as remote monitoring, due to some of its drawbacks.

Client-Side Javascript Implementation

The following lists the source code for the client-side monitoring javascript. The source code for the entire application is available from the Github link at the bottom of the article.

/*
 * MIT License
 *
 *Copyright (c) 2018 Ng Chiang Lin
 *
 *Permission is hereby granted, free of charge, to any person obtaining a copy
 *of this software and associated documentation files (the "Software"), to deal
 *in the Software without restriction, including without limitation the rights
 *to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 *copies of the Software, and to permit persons to whom the Software is
 *furnished to do so, subject to the following conditions:
 *
 *The above copyright notice and this permission notice shall be included in all
 *copies or substantial portions of the Software.
 *
 *THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 *IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 *FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 *AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 *LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 *OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 *SOFTWARE.
 *
 */

(function()
{

    "use strict";

    var crypto;
    var subtle;
    var supportMsg = "";
    var seq = 1;
    var elementvalues = "";
    var cframes = [];
    var extresources = [];
    var JSON;
    var XMLHttpRequest;
    var remotehost = "https://demo2-nighthour.appspot.com";

    window.addEventListener("load", function(event)
    {
        runmon();
    }, false);

    /*
     * WebRpt object holding information for current url/page
     */
    function WebRpt(url, cksum, seq, smsg, cdate, content, clen)
    {
        this.url = url;
        this.cksum = cksum;
        this.seq = seq;
        this.smsg = smsg;
        this.cdate = cdate;
        this.content = content;
        this.clen = clen;
    }

    /*
     * Return a child iframe
     */
    function getCframe()
    {
        var ifr = document.createElement('iframe');
        ifr.src = 'about:blank';
        ifr.width = 0;
        ifr.height = 0;
        ifr.style.display = 'none';
        document.body.appendChild(ifr);
        cframes.push(ifr);
        return ifr;
    }

    /*
     * Remove child frame
     */
    function removeCframe(ifr)
    {
        try
        {
            document.body.removeChild(ifr);
        }
        catch (err)
        {

        }
    }

    /*
     * Clean up all child frames
     */
    function cleanCframes()
    {
        var len = cframes.length;

        for (var i = 0; i < len; i++)
        {
            removeCframe(cframes[i]);
        }
        cframes = [];
    }

    /*
     * Tight loop
     */
    function tightLoop(randrange, basecycle)
    {
        var ifr = getCframe();

        if (window.crypto)
        {
            crypto = ifr.contentWindow.crypto;
            crypto.getRandomValues = ifr.contentWindow.crypto.getRandomValues;
        }
        else if (window.msCrypto)
        {
            crypto = ifr.contentWindow.msCrypto;
            crypto.getRandomValues = ifr.contentWindow.msCrypto.getRandomValues;
        }

        var rint = new Uint32Array(1);
        crypto.getRandomValues(rint);

        var interval = (rint[0] % randrange) + basecycle;
        rint = 0;

        for (var i = 0; i < interval; i++)
        {// do tight loop
            rint = rint + 1;
        }

        removeCframe(ifr);
    }

    /*
     * Try to prevent javascript tampering Obtain objects and functions from
     * child frame, to guard against javascript tampering
     */
    function guardObjects()
    {
        var ifr = getCframe();
        if (window.msCrypto)
        {
            crypto = ifr.contentWindow.msCrypto;
            subtle = ifr.contentWindow.msCrypto.subtle;
        }
        else if (window.crypto)
        {
            crypto = ifr.contentWindow.crypto;
            subtle = ifr.contentWindow.crypto.subtle;
        }

        if (window.JSON)
        {
            JSON = ifr.contentWindow.JSON;
            JSON.stringify = ifr.contentWindow.JSON.stringify;
        }

        if (window.RegExp)
        {
            window.RegExp = ifr.contentWindow.RegExp;
            if (window.RegExp.prototype.test)
            {
                window.RegExp.prototype.test = ifr.contentWindow.RegExp.prototype.test;
            }
        }

        if (window.Date)
        {
            window.Date = ifr.contentWindow.Date;
            window.Date.prototype.getTime = ifr.contentWindow.Date.prototype.getTime;
        }

    }

    /*
     * Obtain XMLHttpRequest from child frame to try prevent javascript
     * tampering
     */
    function guardXMLHttpRequest()
    {
        var ifr = getCframe();
        window.setTimeout = ifr.contentWindow.setTimeout;
        XMLHttpRequest = ifr.contentWindow['XMLHttpRequest'];

    }

    /*
     * Function to check that webcrypto is supported and setup the subtle
     * object.
     */
    function initCrypto()
    {

        crypto = window.crypto || window.msCrypto;
        if (!crypto)
        {
            return false;
        }
        else if (crypto.subtle)
        { // IE 11, Chrome, firefox
            subtle = crypto.subtle;
            return true;
        }
        else if (crypto.webkitSubtle)
        { // Safari browser
            subtle = crypto.webkitSubtle;
            return true;
        }
        else
        {
            return false;
        }
    }

    /*
     * Function to check if native promise is supported
     */
    function checkPromise()
    {

        if (window.Promise)
        {
            if (window.Promise.toString().indexOf("[native code]") !== -1)
            {
                supportMsg = supportMsg + " : Promise Supported";
                return true;

            }
        }

        supportMsg = supportMsg + " : Promise Unsupported";
        return false;

    }

    /*
     * Function to retrieve external resources such as images, external scripts,
     * css ... The resources are retrieved synchronously.
     */
    function getResource(url)
    {
        guardXMLHttpRequest();

        var xhttp = new XMLHttpRequest();
        xhttp.open('GET', url, false);
        xhttp.overrideMimeType('text\/plain; charset=x-user-defined');
        xhttp.send();

        if (xhttp.status == 200)
        {
            var resp = xhttp.responseText;
            elementvalues = elementvalues + " " + resp;
        }
        else
        {
            console.log("Http status error");
        }

    }

    /* Function to retrieve the html content using innerHTML */
    function getContent()
    {
        var content = document.documentElement.innerHTML;
        content = content.replace(/\s+/g, ' ');
        content = content.trim();
        return content;

    }

    /* Function to retrieve content using dom traversal */
    function getProcessContent()
    {
        var root = document.documentElement;
        elementvalues = "";
        traverse(root);
        elementvalues = elementvalues.replace(/\s+/g, ' ');
        elementvalues = elementvalues.trim();

        var len = extresources.length;
        var pattern = "^" + remotehost + ".*";
        var re = new RegExp(pattern, 'i');
        for (var i = 0; i < len; i++)
        {
            if (!re.test(extresources[i]))
            {
                var url = extresources[i] + "?" + (new Date().getTime());
                getResource(url);
            }
        }

        return elementvalues;
    }

    /*
     * Recursive function to traverse the dom tree and extracting the content of
     * the DOM The extracted content is saved into elementvalues
     */
    function traverse(element)
    {
        if (element)
        {
            elementvalues = elementvalues + " " + element.tagName + " ";

            // Extract attributes of html element
            var attributes_array = [];

            if (element.hasAttributes())
            {
                var attributes = element.attributes;
                for (var i = 0; i < attributes.length; i++)
                {
                    attributes_array.push(attributes[i].name);
                    attributes_array.push(attributes[i].value);
                }

                attributes_array.sort();
                elementvalues = elementvalues + attributes_array.join();
            }

            // Extract script content if it is script element
            if (element.tagName === "SCRIPT")
            {
                if (element.innerText)
                {
                    elementvalues = elementvalues + element.innerText + " ";
                }

                // Handle external script
                if (element.src)
                {
                    extresources.push(element.src);
                }
            }

            // Handle images
            if (element.tagName === "IMG")
            {
                if (element.src)
                {
                    var url = element.src;
                    url = url.toLowerCase();
                    var scheme = url.substr(0, 4);

                    if (scheme != "data" && scheme != "file")
                    {
                        extresources.push(url);
                    }

                }

            }

            // Handle links like CSS
            if (element.tagName === "LINK")
            {
                if (element.href)
                {
                    extresources.push(element.href);
                }
            }

            // Extract any text or comment from element
            for (var i = 0; i < element.childNodes.length; i++)
            {
                var childnode = element.childNodes[i];

                if (childnode.nodeType === 3 || childnode.nodeType === 8)
                {// text or comment node
                    elementvalues = elementvalues + childnode.nodeValue + " ";
                }
            }

            // Depth first recursion to process child elements
            if (element.children)
            {
                for (var i = 0; i < element.children.length; i++)
                {
                    traverse(element.children[i]);
                }
            }

        }
        else
        {
            return "";
        }

    }

    /* Function to send web report to server using json and ajax */
    function sendRpt(rpt)
    {
        guardXMLHttpRequest();
        var endpoint = remotehost + "/webrpt";
        var xhttp = new XMLHttpRequest();

        var data = JSON.stringify(
        {
            "WebRpt" : rpt
        });
        // Start the timeout before wiring up the handler, so that
        // procResponse receives a valid timer id that it can clear.
        // The timer callback aborts the request directly.
        var timeout_timer = setTimeout(function()
        {
            xhttp.abort();
            console.log("ajax timeout");
        }, 60000);

        xhttp.onreadystatechange = procResponse(xhttp, timeout_timer);
        xhttp.open("POST", endpoint);
        xhttp.setRequestHeader('Content-Type',
                'application/json;charset=UTF-8');
        xhttp.send(data);

        seq = seq + 1;

    }

    /* Function to process the response from the server */
    function procResponse(xhttp, timeout_timer)
    {
        return function()
        {
            try
            {
                if (xhttp.readyState === XMLHttpRequest.DONE)
                {
                    clearTimeout(timeout_timer);
                    timeout_timer = null;

                    if (xhttp.status === 200)
                    {
                        var resp = xhttp.responseText;
                        resp = resp.trim();

                        if (resp.indexOf("600") !== -1)
                        {
                            // alert("Bad response");
                            var arr = resp.split(" ");
                            if (arr[1])
                            {
                                window.location.replace(arr[1]);
                            }

                            cleanCframes();
                            return;
                        }
                        else if (resp === "Ok")
                        {
                            // alert("Good response");
                        }
                        else
                        {// Other responses ignore and do nothing

                        }

                        cleanCframes();

                    }
                    else
                    {
                        console.log("Http status error");
                    }

                }
            }
            catch (e)
            {
                if (timeout_timer)
                {
                    clearTimeout(timeout_timer);
                }

                console.log(e);
                cleanCframes();
            }
        };

    }

    /* Converts a Arraybuffer into hexadecimal string */
    function toHex(buf)
    {

        var dview = new DataView(buf); // Use DataView to prevent platform
        // endianness issue.
        var hexstring = "";
        var i;

        for (i = 0; i < dview.byteLength; i++)
        {
            var byteval = dview.getUint8(i);
            if (byteval < 16)
            {
                hexstring = hexstring + "0" + byteval.toString(16);
            }
            else
            {
                hexstring = hexstring + byteval.toString(16);
            }

        }

        return hexstring;

    }

    /* Async Sha256 function using promise */
    function async_sha256(str)
    {
        var i;
        var utf8str = unescape(encodeURIComponent(str));
        var utf8buf = new Uint8Array(utf8str.length);

        for (i = 0; i < utf8str.length; i++)
            utf8buf[i] = utf8str.charCodeAt(i);

        return subtle.digest("SHA-256", utf8buf).then(function(hex)
        {
            return toHex(hex);
        });

    }

    /*
     * Function to process content using Promise
     */
    function pproc()
    {
        var url = window.location.href;
        var c = getContent();

        /*
         * Process the content for sha256 checksum The content that is based on
         * innerHTML is different between IE 11 and firefox/chrome IE 11 sorts
         * the attributes in element leading to different checksum result. Here
         * we extract the relevant content using our own DOM traversal
         */

        var processcontent = getProcessContent();

        async_sha256(processcontent).then(
                function(hexcode)
                {
                    var cdate = Date();
                    var rpt = new WebRpt(url, hexcode, seq, supportMsg, cdate,
                            c, c.length);
                    sendRpt(rpt);
                });

    }

    /* Asynchronous sha256 using callbacks without promise IE 11 */
    function cb_sha256(str, func)
    {

        // Additional check to make sure msCrypto is available IE 11.
        // MS Edge is more standards compliant and supports the normal crypto
        if (!window.msCrypto)
        {
            func("00000000000000000000000000000000");
            return;
        }

        var i;
        var utf8str = unescape(encodeURIComponent(str));
        var utf8buf = new Uint8Array(utf8str.length);

        for (i = 0; i < utf8str.length; i++)
            utf8buf[i] = utf8str.charCodeAt(i);

        var op = subtle.digest("SHA-256", utf8buf);

        op.oncomplete = function(e)
        {
            var hexstring = toHex(e.target.result);
            func(hexstring);
        }

    }

    /*
     * Function to process content without Promise IE 11
     */
    function proc()
    {

        var url = window.location.href;
        var c = getContent();

        /*
         * Process the content for sha256 checksum The content that is based on
         * innerHTML is different between IE 11 and firefox/chrome IE 11 sorts
         * the attributes in element leading to different checksum result. Here
         * we extract the relevant content using our own DOM traversal
         */

        var processcontent = getProcessContent();
        var func = function(hexcode)
        {
            var cdate = Date();
            var rpt = new WebRpt(url, hexcode, seq, supportMsg, cdate, c,
                    c.length);
            sendRpt(rpt);
        };

        cb_sha256(processcontent, func);
    }

    /* Begin the monitoring */
    function runmon()
    {

        try
        {
            supportMsg = supportMsg + " " + navigator.userAgent;

            if (!initCrypto())
            {
                return;
            }

            tightLoop(1500000, 2000000);
            guardObjects();

            if (checkPromise())
            {// Promise available
                pproc();
            }
            else
            {// No Promise
                proc();
            }

        }
        catch (err)
        {
            console.log("Error occurred :" + err);
        }

    }

})();

The javascript registers itself to run when the html document is loaded. It does some checks to determine if the WebCrypto and Promise APIs are supported. Then it traverses the DOM recursively, storing its content. The urls of external files such as images, javascript and CSS are stored in an array. These files are retrieved using AJAX and incorporated into the document content. A sha256 hash is generated from the content and stored in a web report object (WebRpt).
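The traversal can be tried outside a browser with plain objects standing in for DOM elements. The sketch below follows the same idea as the script's `traverse` function (tag name, then sorted attributes, then text, then children), with one deliberate difference: it sorts `name=value` pairs rather than a flat list of names and values. The node shape (`tag`, `attrs`, `text`, `children`) and the function name `serialize` are invented for the example.

```javascript
// Serialize a simplified element tree into a canonical string.
// Sorting the attributes makes the result independent of the order
// in which a browser happens to report them.
function serialize(node) {
    const attrs = node.attrs || {};
    const pairs = Object.keys(attrs).map(function (name) {
        return name + "=" + attrs[name];
    });
    pairs.sort();

    let out = node.tag.toUpperCase() + " " + pairs.join(",") + " " + (node.text || "");

    // Depth-first recursion over child elements.
    (node.children || []).forEach(function (child) {
        out += " " + serialize(child);
    });

    // Collapse runs of whitespace, as the script does before hashing.
    return out.replace(/\s+/g, " ").trim();
}

// The same link with its attributes listed in a different order.
const linkA = { tag: "a", attrs: { href: "/home", "class": "nav" }, text: "Home" };
const linkB = { tag: "a", attrs: { "class": "nav", href: "/home" }, text: "Home" };

console.log(serialize(linkA) === serialize(linkB));
console.log(serialize({ tag: "div", children: [linkA] }));
```

This order independence is the reason the script does its own traversal instead of hashing `innerHTML`, which differs between IE 11 and Firefox or Chrome.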

The web report is converted into JSON and sent to the backend application using AJAX. If the application returns ok status, it means the hash matches what is stored in the backend datastore and no action is taken. If status code 600 is returned, the script will redirect the browser to a specified error page.
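The response handling can be summarized in a small parser. This is a stricter variant than the script's `procResponse`, which looks for "600" anywhere in the response body; the name `parseMonitorResponse` and the shape of the returned object are hypothetical.

```javascript
// Interpret the backend's plain-text reply to a web report.
function parseMonitorResponse(body) {
    const text = body.trim();
    if (text === "Ok") {
        return { action: "none" }; // hash matches the stored value
    }

    const parts = text.split(/\s+/);
    if (parts[0] === "600") {
        // "600 <url>": hash mismatch with a redirect target. Without a
        // url there is nothing to redirect to, so no action is taken.
        return parts[1] ? { action: "redirect", url: parts[1] } : { action: "none" };
    }

    return { action: "ignore" }; // any other response is ignored
}

// The error page url is an example value.
console.log(parseMonitorResponse("600 https://www.nighthour.sg/error.html"));
console.log(parseMonitorResponse("Ok"));
```

In the browser, a `redirect` result would be passed to `window.location.replace`, as the script does.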

The script implements some protective measures against javascript poisoning. It obtains some of the built-in global objects and functions that it relies on from iframes that it creates itself. This offers some basic protection against techniques such as AJAX hooking. However, it is not 100% foolproof. If the website suffers from injection attacks, a skilled attacker can probably poison and interfere with the monitoring javascript. The deployment method described earlier in the article can help to reduce the risk of such tampering.

Design of App Engine Java Application

There are 2 parts to the monitoring application: an administrative function and a service function. The administrative function is used to configure the application. The service function communicates with the client-side javascript and processes incoming web reports. The application is built with JSP/Servlet and doesn't rely on external web frameworks. This keeps it simple and helps to improve security by reducing the attack surface. The application runs on Google App Engine and uses Google Cloud Datastore (a NoSQL database) for storage. The MVC (Model-View-Controller) architecture is adopted to structure the application.

To monitor a website, the application requires its Fully Qualified Domain Name (FQDN) to be created. For example, to monitor www.nighthour.sg and nighthour.sg, these two FQDNs have to be created under a particular user account in the application. All domains (FQDNs) have to be unique in the application. There cannot be a nighthour.sg under user A and the same nighthour.sg under user B.

The datastore relies on global domain entries to ensure this uniqueness. The following shows these datastore entities.

A parent entity, globaldomains of kind globaldomaintable, has child entities of kind globaldomainentry. Each child entity has a unique FQDN (nighthour.sg, www.nighthour.sg etc.) as its key. This ensures that when a new globaldomainentry with its FQDN key is added, it will be unique. Checks are done to ensure that an FQDN does not already exist before it is created. In this article the term "domain" refers to an FQDN.
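The uniqueness rule can be modelled in a few lines, with an in-memory `Map` standing in for the globaldomainentry entities. The real application is a Java servlet backed by Cloud Datastore, so this is only a sketch of the rule. The class and method names are made up, and lower-casing the FQDN before using it as a key is an assumption.

```javascript
// In-memory model of the global domain table: each FQDN may be
// registered once, by one user, across the whole application.
class DomainRegistry {
    constructor() {
        this.owners = new Map(); // FQDN key -> email of the owning user
    }

    // Returns true if the FQDN was registered, false if it is already taken.
    register(userEmail, fqdn) {
        const key = fqdn.trim().toLowerCase(); // assumed normalization
        if (this.owners.has(key)) {
            return false;
        }
        this.owners.set(key, userEmail);
        return true;
    }
}

const registry = new DomainRegistry();
console.log(registry.register("alerts@nighthour.sg", "nighthour.sg"));
console.log(registry.register("someone@example.com", "nighthour.sg"));
```

In a shared datastore, the existence check and the insert would normally have to happen in one transaction so that two concurrent requests cannot both register the same key.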

The following shows the user, domain and url entities.

Fig 4. User, Domain and URL entities

A user entity represents a user account of the application. Its unique key is an email address, e.g. alerts@nighthour.sg. The user's hashed password, 2 factor authentication key, redirection url etc. are properties of a user entity. Each user entity has child entities of domain kind, each representing an FQDN (e.g. nighthour.sg, www.nighthour.sg) to be monitored by the application. Each domain has child entities of URL kind that represent the web pages to be monitored under the FQDN, e.g. https://www.nighthour.sg, https://www.nighthour.sg/myfile.html.

There are some limits set on the monitoring application. A user can have a maximum of 10 domains, and each domain can have a maximum of 300 URLs. Each user has a maximum failed login threshold of 5.

2FA authentication using Google Authenticator Mobile App is required for logging in. Refer to this earlier article Implementing 2 Factor Authentication for Web Security for more details. The login mechanism described in the earlier article is the basis for the one implemented here.

Application Views and Administrative Function

The application has a number of views implemented using JSP. Upon successful login, the main application page with a short introduction and description is displayed. There is a top navigation menu for accessing the other application views and functionalities.

The domain page enables FQDNs to be added or deleted. Up to 10 FQDNs can be added for a single user. FQDNs are unique in the entire application. To monitor a website, its domains (FQDNs) must be added, for example www.nighthour.sg and nighthour.sg.

The application has 3 operational modes: disable, capture and monitor. These are configured under the Mode page. Refer to the illustration below.

Disable is the default mode. In this state the application doesn't do anything (no email alerts, no redirection). Capture mode is for registering urls that are to be monitored under a domain (FQDN). To capture urls, an administrator sets the application to this mode and then simply browses the web pages to be monitored. The client-side javascript embedded in these pages (see Design Overview and Approach) will communicate with the service function to populate the urls with initial hash data.

For security, only requests coming from the same IP address that set the capture mode are allowed to populate the urls. This reduces the risk of attackers trying to poison or pollute the stored urls. To update the hash of a url, browse the url again under capture mode.

Setting the application to monitor mode will start the actual monitoring. The following shows the mode page.

The Monitored WebPages page lists the urls that are being monitored. These are the urls added during capture mode. The application will not send alerts or do any redirection for urls that are not monitored. However, the client-side javascript will still send web reports to the application for non-monitored urls. If there is sensitive information on the web pages, it will be sent to the application. Do not deploy the client-side javascript on pages that have sensitive data that should be kept private.

The action page enables the administrator to enable or disable redirection. An error page for redirection can be specified here. It is recommended that this error page be hosted on a different web server from the one being monitored. This is to prevent an attacker from modifying the error page if there are flaws on the web server/website being monitored.

Additionally, the error page shouldn't be a monitored url. This prevents an endless loop in which an error page that has been modified by an attacker redirects to the same error page again and again. The following shows the action page where redirection can be configured.
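The two recommendations for the error page can be expressed as a simple validation step. The sketch below is an illustration only: RedirectUrlCheck and isAcceptable are hypothetical names, the requirement that the error page uses https is an added assumption, and the monitored domains are assumed to be stored in lower case.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

/** Hypothetical validation of the redirection (error page) url. */
final class RedirectUrlCheck {
    /**
     * Accepts a url that is not itself monitored and whose host is not one of
     * the monitored FQDNs. Assumption: only https error pages are accepted.
     */
    static boolean isAcceptable(String redirectUrl, Set<String> monitoredDomains, Set<String> monitoredUrls) {
        if (redirectUrl == null || monitoredUrls.contains(redirectUrl)) {
            return false; // a monitored error page could cause an endless redirect loop
        }
        try {
            URI uri = new URI(redirectUrl);
            String host = uri.getHost();
            if (host == null || !"https".equalsIgnoreCase(uri.getScheme())) {
                return false;
            }
            // The error page should live on a different server from the monitored site.
            return !monitoredDomains.contains(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

Comparing host names is only an approximation of "a different web server", since two names can point to the same machine; the check is meant as a guard against obvious misconfiguration.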

The MVC architecture is used for implementing the application functionalities, and each of the JSP pages described above represents a view. A single controller, ApplicationControllerServlet, controls access to these views. For each JSP page, user inputs are submitted using AJAX to the controller via HTTP POST.

The following shows the source code of the ApplicationControllerServlet.

/*
* MIT License
*
*Copyright (c) 2018 Ng Chiang Lin
*
*Permission is hereby granted, free of charge, to any person obtaining a copy
*of this software and associated documentation files (the "Software"), to deal
*in the Software without restriction, including without limitation the rights
*to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
*copies of the Software, and to permit persons to whom the Software is
*furnished to do so, subject to the following conditions:
*
*The above copyright notice and this permission notice shall be included in all
*copies or substantial portions of the Software.
*
*THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
*IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
*FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
*AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
*LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
*OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
*SOFTWARE.
*
*/

package sg.nighthour.app;

import java.io.IOException;
import java.io.PrintWriter;
import java.util.logging.Logger;
import javax.servlet.RequestDispatcher;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

/**
 * Servlet implementation class ApplicationControllerServlet
 */

@WebServlet(name="ApplicationControllerServlet",
loadOnStartup = 1,
urlPatterns = {"/home",
               "/domain",
               "/action",
               "/mode",
               "/url",
               "/domainctl",
               "/modectl",
               "/urlctl",
               "/actionctl"})

public class ApplicationControllerServlet extends HttpServlet {
	private static final long serialVersionUID = 1L;
	private static final Logger log = Logger.getLogger(ApplicationControllerServlet.class.getName());
       
    /**
     * @see HttpServlet#HttpServlet()
     */
    public ApplicationControllerServlet() {
        super();
    }

	/**
	 * @see HttpServlet#doGet(HttpServletRequest request, HttpServletResponse response)
	 */
	protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
		
	    request.setCharacterEncoding("UTF-8");
        response.setContentType("text/plain;charset=UTF-8");
        response.setHeader("Cache-Control", "no-store");
        
	    String spath = request.getServletPath();
	    String csrf = request.getParameter("csrf");
	    String viewurl = "/WEB-INF/views"; 
	    
        HttpSession sess = request.getSession(false);
        
        if(sess == null)
        {
            log.warning("Error: Null Session : GET : " + request.getRemoteAddr());
            response.sendRedirect("/index.jsp");
            return;
        }
	    
	    if(spath.equals("/home"))
	    {
	        viewurl = viewurl +  spath + ".jsp";
	    }
	    else if(spath.equals("/domain"))
	    {
	        viewurl = viewurl +  spath + ".jsp";
	    }
	    else if(spath.equals("/action"))
	    {
	        viewurl = viewurl +  spath + ".jsp";
	    }
	    else if(spath.equals("/mode"))
	    {
	        viewurl = viewurl +  spath + ".jsp";
	    }
	    else if(spath.equals("/url"))
	    {
	        viewurl = viewurl +  spath + ".jsp";
	    }
	    else
	    {//Redirect back to index.jsp if path is not valid
	        log.warning("Error: Invalid Servlet Path: GET " + spath + " : " + request.getRemoteAddr());
	        sess.invalidate();
            response.sendRedirect("/index.jsp");
            return;   
	    }
	    
	   
	    if(sess.getAttribute("userid")== null)
	    {//Not authenticated redirect back to index.jsp
	        log.warning("Error: Unauthenticated Session : " + request.getRemoteAddr());
	        sess.invalidate();
	        response.sendRedirect("/index.jsp");
	        return;
	    }
	    else
	    {//Authenticated sessions forward to views
	        
	        sess.setAttribute("currenturl", spath);
	        String saved_csrf = (String) sess.getAttribute(spath);
	        
	        if(AntiCSRFToken.compareToken(csrf, saved_csrf))
	        {//Anti CSRF check is ok
	            sess.removeAttribute(spath);
                RequestDispatcher dp = request.getRequestDispatcher(viewurl);
                dp.forward(request, response);
                return;
	            
	        }
	        else
	        {//Invalid Anti CSRF parameters
	            String userid = (String) sess.getAttribute("userid");
                log.warning("Error: Invalid csrf : " + userid + " : " + request.getRemoteAddr());
                sess.setAttribute("userid2fa", userid);
                sess.removeAttribute("userid");
                RequestDispatcher dp = request.getRequestDispatcher("/WEB-INF/views/otp.jsp");
                dp.forward(request, response);
                return;
	        }
	       
	
	    }
	    
	    
	}

	/**
	 * @see HttpServlet#doPost(HttpServletRequest request, HttpServletResponse response)
	 */
	protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
		
	    request.setCharacterEncoding("UTF-8");
        response.setContentType("text/plain;charset=UTF-8");
        response.setHeader("Cache-Control", "no-store");
        
        PrintWriter out = response.getWriter();
        String spath = request.getServletPath();
        String csrf = request.getParameter("csrf");
        
        HttpSession session = request.getSession(false);
        if(session == null)
        {//no existing session
            log.warning("Error: Null Session : POST : " + spath + " : " + request.getRemoteAddr());
            response.sendRedirect("/index.jsp");
            return;
        }
        
        String userid = (String)session.getAttribute("userid");
        
        if(userid == null)
        {//Session not authenticated
            log.warning("Error: Unauthenticated Session : POST : " + spath + " : " + request.getRemoteAddr());
            session.invalidate();
            response.sendRedirect("/index.jsp");
            return;
        }
        
        String saved_csrf = (String) session.getAttribute(spath);
        
        if(spath.equals("/domainctl"))
        {
            if(AntiCSRFToken.compareToken(csrf, saved_csrf))
            {
                DomainHandler.handleRequest(request, response, userid, out);
                return; 
            }
            else
            {
                log.warning("Error: Invalid csrf : POST : " + spath + " : " +
                     userid + " : " + request.getRemoteAddr());
                out.println("Error: Invalid CSRF Token");
                return;
            }
            
        }
        else if(spath.equals("/modectl"))
        {
            
            if(AntiCSRFToken.compareToken(csrf, saved_csrf))
            {
                ModeHandler.handleRequest(request, response, userid, out);
                return; 
            }
            else
            {
                log.warning("Error: Invalid csrf : POST : " + spath + " : " +
                     userid + " : " + request.getRemoteAddr());
                out.println("Error: Invalid CSRF Token");
                return;
            }
            
        }
        else if(spath.equals("/urlctl"))
        {
           
            if(AntiCSRFToken.compareToken(csrf, saved_csrf))
            {
                UrlHandler.handleRequest(request, response, userid, out);
                return; 
            }
            else
            {
                log.warning("Error: Invalid csrf : POST : " + spath + " : " +
                     userid + " : " + request.getRemoteAddr());
                out.println("Error: Invalid CSRF Token");
                return;
            }
            
        }
        else if(spath.equals("/actionctl"))
        {
            
            if(AntiCSRFToken.compareToken(csrf, saved_csrf))
            {
                ActionHandler.handleRequest(request, response, userid, out);
                return; 
            }
            else
            {
                log.warning("Error: Invalid csrf : POST : " + spath + " : " +
                     userid + " : " + request.getRemoteAddr());
                out.println("Error: Invalid CSRF Token");
                return;
            }
            
        }
        else
        {
            
            log.warning("Error: Invalid servlet path : POST : " + spath + " : " +
                    userid + " : " + request.getRemoteAddr());
            return;
            
        }
	    
	    
	}

}

Security Threat Model for Administrative Function

The application is built with security in mind and implements a number of security controls covering the OWASP Top 10. Two-factor authentication is required for logging in. Passwords are hashed using PBKDF2 (Password-Based Key Derivation Function 2). A new session is created upon successful login to prevent session fixation attacks. Proper session timeouts are maintained.
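For readers unfamiliar with PBKDF2, the following sketch shows how a password can be hashed and verified with the standard Java cryptography API. It is not the application's own code: the class name PasswordHashSketch, the iteration count, the salt size and the choice of the PBKDF2WithHmacSHA256 variant are all assumptions made for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

/** Hypothetical PBKDF2 helper; all parameters below are assumed values. */
final class PasswordHashSketch {
    static final int ITERATIONS = 100000; // assumed work factor
    static final int KEY_BITS = 256;      // length of the derived hash in bits
    static final int SALT_BYTES = 16;     // assumed salt size

    /** Generates a fresh random salt; it is stored next to the hash. */
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /** Derives the hash from the password and the salt. */
    static byte[] hash(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            return factory.generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        } finally {
            spec.clearPassword(); // wipe the copy of the password held by the spec
        }
    }

    /** Recomputes the hash and compares it with the stored one. */
    static boolean verify(char[] password, byte[] salt, byte[] expected) {
        return MessageDigest.isEqual(hash(password, salt), expected);
    }
}
```

At registration the salt from newSalt() and the result of hash() would both be stored; at login verify() recomputes the hash with the stored salt.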

The 2nd factor login mechanism implements a number of security features to stop attacks. See the earlier article, Implementing 2 Factor Authentication for Web Security, for details. Many of the items described in the 2FA Threat Model apply to this application as well. The Anti-CSRF protection is slightly different in this application though.

Anti-CSRF (Cross-Site Request Forgery) tokens are used to reduce the risks of request forgery attacks. This is implemented both for viewing the application pages and for submitting user input to the application via AJAX. A user will be directed to enter the 2FA OTP again if an invalid Anti-CSRF token is detected.
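A token scheme of this kind needs two operations: generating an unpredictable token and comparing a submitted token with the saved one. The following is a minimal sketch of both. The method name compareToken only mirrors the call seen in the controller servlet; the implementation of the application's AntiCSRFToken class is not shown in this article, so the class CsrfTokenSketch, the token length and the encoding are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

/** Hypothetical Anti-CSRF token helper. */
final class CsrfTokenSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a random 256-bit token, encoded so that it is safe to use as a url parameter. */
    static String newToken() {
        byte[] raw = new byte[32];
        RANDOM.nextBytes(raw);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    /** Null-safe comparison; MessageDigest.isEqual is used to limit timing differences. */
    static boolean compareToken(String submitted, String saved) {
        if (submitted == null || saved == null) {
            return false; // a missing token never matches
        }
        return MessageDigest.isEqual(
                submitted.getBytes(StandardCharsets.UTF_8),
                saved.getBytes(StandardCharsets.UTF_8));
    }
}
```

A new token would typically be stored in the session under the path it protects and embedded in the link or form for that path, so that a request without the matching token is rejected.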

The logic of each application functionality is implemented as a Handler class, which in turn uses a Data Access Object (DAO) for accessing the Google Cloud Datastore. The Data Access Objects cleanly separate commands from parameters to prevent database injection attacks. Transactions are used at the proper places to avoid race conditions. Data that is displayed to the user is properly escaped to prevent XSS (Cross-Site Scripting) attacks.

The design of entities in the datastore makes it easy to isolate user data, ensuring that user A cannot access or modify user B's data. In the datastore, each user entity has its own domain entities as its children, and each domain entity has url entities as its children. This forms a hierarchy that makes it easy to enforce isolation and ownership. Every user can only access its own children (domains) and grand-children (urls).

Session cookies are set with the HttpOnly, Secure and SameSite=Strict security flags. A servlet filter sets additional security headers such as Content Security Policy (CSP), HTTP Strict Transport Security (HSTS), X-Frame-Options etc... to further protect the application. Refer to the Github link at the bottom of the article for the full source code of the application.

The Application Service Function

The service function consists of a controller servlet, a handler class, a worker task that uses the Javamail API for sending alerts, and a Data Access Object for accessing the datastore. The JSON requests that it receives from the client-side javascript are parsed by a simple json parser, A Simple Java Json Parser. The parser doesn't support any serialization or deserialization of objects, which avoids deserialization attacks. The parsed json object is converted into a WebReport object through a handler method.

The handler class contains the application logic and some utility methods for the processing of the WebReport. If the sha256 hash in a WebReport does not match for a url, the handler will create a temporary Alert entity containing the details of the webreport. It will push the key of the Alert entity into an App Engine task queue that will be processed by a task worker. The worker retrieves the Alert entity and sends an email alert to the application user.

The following diagram illustrates this process.

Fig 10. Application Service Function and Task Queue

Google App Engine has some limits on the usage of Javamail API for sending email. Refer to the Google App Engine documentation for these limits. For sending lots of emails, Google recommends the use of third-party providers such as SendGrid or Mailgun. The application in this article uses Javamail for simplicity.

App Engine restricts who can send emails. Authorized senders can be added in the App Engine Dashboard. This is covered in the Application Setup section later. The AppConstants.java file in the application has a fromemail variable that defines the sender email address.

The application service will send a reply "600 <redirectionurl>" back to the client javascript if redirection is enabled and the url hash does not match. The client-side javascript will then redirect the browser to the url that it has received. The servlet controller implements proper CORS (Cross-Origin Resource Sharing) methods to allow communication between the client-side javascript and the application service. CORS is required because the client-side javascript runs on the monitored website and sends its reports to the application service on App Engine, a third party site on a different domain.
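A CORS check of this kind starts by reducing the Origin header to a host name that can be compared with the registered FQDNs. The sketch below shows one possible way to do this with java.net.URI. It is an illustration only and not the application's WebRptHandler.extractDomain, whose implementation is not shown here; the class name OriginSketch and the restriction to http and https origins are assumptions.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

/** Hypothetical helper for turning an Origin header or url into a host name. */
final class OriginSketch {
    /** Returns the lower-cased host of an http(s) origin or url, or null if there is none. */
    static String extractDomain(String originOrUrl) {
        if (originOrUrl == null) {
            return null;
        }
        try {
            URI uri = new URI(originOrUrl.trim());
            String scheme = uri.getScheme();
            String host = uri.getHost();
            if (scheme == null || host == null) {
                return null; // e.g. the literal origin "null" sent for opaque origins
            }
            if (!scheme.equalsIgnoreCase("http") && !scheme.equalsIgnoreCase("https")) {
                return null;
            }
            return host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return null;
        }
    }
}
```

The port and path are deliberately dropped, so an origin such as https://www.nighthour.sg:8443 maps to the same FQDN as https://www.nighthour.sg.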

The following shows the source listing for the controller servlet. CORS is implemented to allow communication across domains.

/*
* MIT License
*
*Copyright (c) 2018 Ng Chiang Lin
*
*Permission is hereby granted, free of charge, to any person obtaining a copy
*of this software and associated documentation files (the "Software"), to deal
*in the Software without restriction, including without limitation the rights
*to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
*copies of the Software, and to permit persons to whom the Software is
*furnished to do so, subject to the following conditions:
*
*The above copyright notice and this permission notice shall be included in all
*copies or substantial portions of the Software.
*
*THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
*IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
*FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
*AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
*LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
*OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
*SOFTWARE.
*
*/

package sg.nighthour.app;

import java.io.IOException;
import java.io.PrintWriter;
import java.util.logging.Logger;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;


/**
 * Servlet implementation class WebRptServlet
 */
@WebServlet("/webrpt")
public class WebRptServlet extends HttpServlet
{
    private static final long serialVersionUID = 1L;

    private static final Logger log = Logger.getLogger(WebRptServlet.class.getName());

    /**
     * @see HttpServlet#HttpServlet()
     */
    public WebRptServlet()
    {
        super();
    }

    /**
     * @see HttpServlet#doPost(HttpServletRequest request, HttpServletResponse
     *      response)
     */
    protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
    {

        request.setCharacterEncoding("UTF-8");
        response.setContentType("text/plain;charset=UTF-8");
        response.setCharacterEncoding("utf-8");

        // Check that request comes from a valid origin
        String origin = request.getHeader("Origin");
        if (origin == null)
        {
            log.warning("Error: Origin header not present : " + request.getRemoteAddr());
            return;
        }

        String domainentry = WebRptHandler.extractDomain(origin);
        if (domainentry == null)
        {
            log.warning("Error: Cannot extract domain from origin header : " + 
              origin + " : " + request.getRemoteAddr());
            return;
        }

        // Make sure that it is a valid domain under monitoring by the application
        if (!WebRptHandler.isValidDomain(domainentry))
        {
            log.warning("Error: Domain is not present in application datastore : " + 
                    domainentry + " : " + request.getRemoteAddr());
            return;
        }

        
        PrintWriter out = response.getWriter();
        response.setHeader("Cache-Control", "no-store");
        response.setHeader("Access-Control-Allow-Origin", origin);

        WebRptHandler.handleRequest(request, response, null, out);

    }

    /**
     * @see HttpServlet#doOptions(HttpServletRequest, HttpServletResponse)
     */
    protected void doOptions(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException
    {

        request.setCharacterEncoding("UTF-8");
        response.setContentType("text/plain;charset=UTF-8");

        // Check that request comes from a valid origin
        String origin = request.getHeader("Origin");
        if (origin == null)
        {
            log.warning("Error: Origin header not present : " + request.getRemoteAddr());
            return;
        }

        String domain = WebRptHandler.extractDomain(origin);
        if (domain == null)
        {
            log.warning("Error: Cannot extract domain from origin header : " + 
              origin + " : " + request.getRemoteAddr());
            return;
        }
        
        if (!WebRptDAO.isValidDomain(domain))
        {
            log.warning("Error: Domain is not present in application datastore : " + 
                    domain + " : " + request.getRemoteAddr());
            return;
        }
            
        response.setHeader("Cache-Control", "no-store");
        response.setHeader("Access-Control-Allow-Origin", origin);
        response.setHeader("Access-Control-Allow-Methods", "POST, GET, OPTIONS, DELETE, PUT");
        response.setHeader("Access-Control-Max-Age", "3600");
        response.setHeader("Access-Control-Allow-Headers", "Content-Type");

    }

}

The following lists the code snippet of the Handler class. The full source code is available at the Github link at the bottom of the article.

    /**
     * Handle and process incoming webreports from clients
     * 
     * @param request
     * @param response
     * @param userid
     * @param out
     * @throws ServletException
     * @throws IOException
     */
    public static void handleRequest(HttpServletRequest request, HttpServletResponse response, String userid, PrintWriter out)
            throws ServletException, IOException
    {
        
        BufferedReader in = request.getReader();
        
        String line = null;
        StringBuffer strbuf = new StringBuffer();
        while ((line = in.readLine()) != null)
        {
            strbuf.append(line);
        }

        String jsoninput = strbuf.toString();
        // Obtain WebReport object from the Json input
        WebReport wreport = processJson(jsoninput, request.getRemoteAddr());
        
        if(wreport == null)
        {
            log.warning("Error: Cannot parse json : " + jsoninput + " : " + request.getRemoteAddr());
            return;
        }

       
        String domain = extractDomain(wreport.getURL());
        if (domain != null)
        {
            wreport.setDomain(domain);
        }
        else
        {
            log.warning("Error: Cannot extract domain from JSON webreport : " + 
                wreport.getURL() + " : " + request.getRemoteAddr());
            return;
        }
        
        String origin = request.getHeader("Origin");
        origin = extractDomain(origin); 
        if(origin == null)
        {
            log.warning("Error: Cannot extract domain from origin header : " + 
                    request.getHeader("Origin") + " : " + request.getRemoteAddr());
             return;
        }
        
        //Check to make sure that domain in the origin header is the same
        //as the domain in the webreport
        if(!domain.equals(origin))
        {
            log.warning("Error: Origin domain does not match report domain : " + 
                    origin + " : " + domain + " : " + request.getRemoteAddr());
            return; 
        }

        userid = WebRptDAO.getUserIdFromDomain(domain);
        if (userid == null)
        {
            log.warning("Error: Cannot extract user from domain in JSON webreport : " + 
                    domain + " : " + request.getRemoteAddr());
           return;
        }

        String mode = WebRptDAO.getModeFromUserid(userid);
        String allowip = WebRptDAO.getAllowCaptureIP(userid);
        
        if(mode == null || allowip == null)
        {
            log.warning("Error: null mode or null capture ip : " + userid
                    + " : " + request.getRemoteAddr());
            return; 
        }

       
        if (mode.equals(AppConstants.MODEMONITOR))
        {// monitor mode
             
             String checksum = WebRptDAO.getURLChecksum(userid, domain, wreport.getURL());
  
             String useragent = request.getHeader("User-Agent");
             if(useragent == null)
             {
                 useragent = "unknown"; 
             }
             
             if(checksum == null)
             {//url not found or error getting checksum behave as if monitoring is disabled
                 log.warning("Error: Monitor mode url is not found or error : "  + userid + " : " +  domain +
                         " : " + wreport.getURL() + " : " + request.getRemoteAddr());
                 return;
             }
             
             if(checksum.equals(wreport.getCheckSum()))
             {//sha256 hash matches
                 out.println("Ok");
                 return;
             }
             else if( WebRptDAO.isRedirectionEnabled(userid))
             {// sha256 hash does not match alert and redirect to error page
                 
                 String redirecturl = WebRptDAO.getRedirectionURL(userid);
                 
                 if(redirecturl != null)
                 {
                     out.println("600 " + redirecturl); 
                 }
                 else
                 {
                     log.warning("Error: Monitor mode cannot get redirection url : "  + userid + " : " +  domain +
                             " : " + wreport.getURL() + " : " + request.getRemoteAddr());
                 }
                 
             
                 
                 String alertkey = WebRptDAO.createAlert(wreport, userid, request.getRemoteAddr(),useragent);
                 Queue queue = QueueFactory.getQueue("alert-queue");
                 queue.add(TaskOptions.Builder.withUrl("/worker").param("key", alertkey));
                 
                 return; 
             }
             else 
             {//sha256 hash does not match redirection is not enabled
                 String alertkey = WebRptDAO.createAlert(wreport, userid, request.getRemoteAddr(),useragent);
                 Queue queue = QueueFactory.getQueue("alert-queue");
                 queue.add(TaskOptions.Builder.withUrl("/worker").param("key", alertkey));
                 return; 
             }
             

          }
          else if (mode.equals(AppConstants.MODECAPTURE) && allowip.equals(request.getRemoteAddr()))
          {// capture mode
               WebRptDAO.createURL(wreport, domain, userid, request.getRemoteAddr());
               return; 
          }
          else
          {// disable mode
               return;
          }
            
        
    }

The handleRequest method processes the web report according to the mode configured for the java application. If it is in monitor mode and the hash doesn't match, an Alert entity is created and its key is pushed into a task queue. A worker task will pick this up and send out an email alert. If redirection is enabled, the handler will tell the client-side javascript to redirect to the error page configured in the backend application.
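The branching in handleRequest can be condensed into a small pure function, which makes the possible outcomes easier to see and to test. This is a sketch only: the ReportDecision class and its Mode and Action enums are hypothetical names that do not exist in the application, and the datastore lookups are replaced by plain parameters.

```java
/** Hypothetical condensed form of the decisions made when a web report arrives. */
final class ReportDecision {
    enum Mode { DISABLE, CAPTURE, MONITOR }

    enum Action { IGNORE, STORE_URL, OK, ALERT, ALERT_AND_REDIRECT }

    /**
     * @param storedHash  hash saved during capture, or null if the url is not monitored
     * @param allowedIp   IP address that set capture mode
     * @param remoteIp    IP address that sent the web report
     */
    static Action decide(Mode mode, String storedHash, String reportedHash,
                         boolean redirectionEnabled, String allowedIp, String remoteIp) {
        switch (mode) {
            case MONITOR:
                if (storedHash == null) {
                    return Action.IGNORE; // url is not monitored
                }
                if (storedHash.equals(reportedHash)) {
                    return Action.OK;     // content is unchanged
                }
                return redirectionEnabled ? Action.ALERT_AND_REDIRECT : Action.ALERT;
            case CAPTURE:
                // Only the address that enabled capture mode may populate urls.
                boolean sameIp = allowedIp != null && allowedIp.equals(remoteIp);
                return sameIp ? Action.STORE_URL : Action.IGNORE;
            default:
                return Action.IGNORE;     // disable mode
        }
    }
}
```

For example, a report in monitor mode whose hash differs from the stored one yields ALERT_AND_REDIRECT when redirection is enabled, which corresponds to queuing an alert and replying with "600 <redirectionurl>".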

The following shows the code for the worker task. The worker retrieves the Alert entity, looks up the MX record of the user email and emails the user with the relevant alert information. The Alert entity is then deleted. The use of Java Naming and Directory Interface (JNDI) to look up MX records allows Javamail to connect directly to the recipient's mail server. This eliminates the need for an outgoing smtp server and its authentication credentials.

/*
* MIT License
*
*Copyright (c) 2018 Ng Chiang Lin
*
*Permission is hereby granted, free of charge, to any person obtaining a copy
*of this software and associated documentation files (the "Software"), to deal
*in the Software without restriction, including without limitation the rights
*to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
*copies of the Software, and to permit persons to whom the Software is
*furnished to do so, subject to the following conditions:
*
*The above copyright notice and this permission notice shall be included in all
*copies or substantial portions of the Software.
*
*THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
*IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
*FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
*AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
*LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
*OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
*SOFTWARE.
*
*/

package sg.nighthour.app;

import java.io.IOException;
import java.io.PrintWriter;
import java.util.logging.Logger;

import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;

import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Text;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Hashtable;
import java.util.Properties;

import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.Session;
import javax.mail.Transport;
import javax.mail.internet.InternetAddress;
import javax.mail.internet.MimeMessage;

/**
 * Servlet implementation class TaskWorkerServlet
 */
@WebServlet("/worker")
public class TaskWorkerServlet extends HttpServlet
{
    private static final long serialVersionUID = 1L;
    private static final Logger log = Logger.getLogger(TaskWorkerServlet.class.getName());

    /**
     * @see HttpServlet#HttpServlet()
     */
    public TaskWorkerServlet()
    {
        super();

    }

    /**
     * @see HttpServlet#doPost(HttpServletRequest request, HttpServletResponse
     *      response)
     */
    protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
    {
        request.setCharacterEncoding("UTF-8");
        PrintWriter out = response.getWriter();
        response.setContentType("text/plain;charset=UTF-8");

        // Retrieve the Alert entity key
        String key = request.getParameter("key");
        // log.warning("Got : " + key);
        if(key == null)
        {
            log.warning("Error: Alert key is null : " + request.getRemoteAddr());
            response.setStatus(400);
            return;
        }
        
        
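        long numerickey;
        try
        {
            numerickey = Long.parseLong(key);
        }
        catch (NumberFormatException e)
        {// Reject non-numeric keys instead of failing with an uncaught exception
            log.warning("Error: Alert key is not numeric : " + key + " : " + request.getRemoteAddr());
            response.setStatus(400);
            return;
        }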
        

        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

        Key k1 = KeyFactory.createKey("Alert", numerickey);

        String userid = null;
        String content = null;
        String senderip = null;
        String sha256 = null;
        String url = null;
        String useragent = null; 

        try
        { // Retrieve the alert entity
            Entity alert = datastore.get(k1);
            userid = (String) alert.getProperty("userid");
            content = ((Text) alert.getProperty("content")).getValue();
            senderip = (String) alert.getProperty("senderip");
            sha256 = (String) alert.getProperty("sha256");
            url = (String) alert.getProperty("url");
            useragent = (String)alert.getProperty("useragent"); 
        }
        catch (EntityNotFoundException e)
        {
            log.warning("Unable to retrieve alert entity : " + key);
            throw new ServletException("Unable to retrieve alert entity");
        }

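        // Get the email domain
        if (userid == null)
        {
            log.warning("Alert entity has no userid");
            throw new ServletException("Alert entity has no userid");
        }
        String[] arr = userid.split("@");
        if (arr.length != 2)
        {
            log.warning("Invalid email domain");
            throw new ServletException("Invalid email domain");
        }
        String emaildomain = arr[1];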

        // Look up MX records for email domain
        ArrayList<String> mxrecords = new ArrayList<String>();
        Hashtable<String, Object> env = new Hashtable<String, Object>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.dns.DnsContextFactory");
        env.put(Context.PROVIDER_URL, "dns://8.8.8.8 dns://8.8.4.4");

        try
        {
            DirContext ctx = new InitialDirContext(env);
            Attributes attrs = ctx.getAttributes(emaildomain, new String[] { "MX" });

            NamingEnumeration<?> results = attrs.getAll();
            while (results.hasMore())
            {
                Attribute attr = (Attribute) results.next();
                NamingEnumeration<?> mxresults = attr.getAll();
                while (mxresults.hasMore())
                {
                    String mx = (String) mxresults.next();
                    mxrecords.add(mx);
                }
            }

        }
        catch (NamingException e)
        {
            log.warning("NamingException unable to get MX : " + e);
            throw new ServletException("NamingException unable to get MX : " + e);
        }

        Collections.sort(mxrecords);
        if (mxrecords.isEmpty())
        {
            log.warning("Empty MX records");
            throw new ServletException("Empty MX records");
        }

        // Send email alert
        boolean done = false;
        int index = 0;

        while (!done && (index < mxrecords.size()))
        {

            String smtphost = mxrecords.get(index);
            try
            {
                Properties props = new Properties();
                props.put("mail.smtp.auth", "false");
                props.put("mail.smtp.starttls.enable", "true");
                // props.put("mail.smtp.localhost", "nighthour.sg");
                props.put("mail.smtp.host", smtphost);
                props.put("mail.smtp.port", "25");
                Session session = Session.getInstance(props);
                String msg = "Web Content has changed \n" + url + "\n" 
                        + "Sender Address : " + senderip + "\n"
                        + "Sha256 : " + sha256 + "\n" 
                        + "User Agent: " + useragent + "\n"
                        + "Note: The content below doesn't include external resources such as images, css, javascript files." 
                        + "\n\n" + content;

                MimeMessage message = new MimeMessage(session);
                message.setFrom(new InternetAddress(AppConstants.fromemail));
                message.setRecipients(Message.RecipientType.TO, InternetAddress.parse(userid));
                message.setSubject("Alert message " + url);
                message.setText(msg, "UTF-8");
                Transport.send(message);
                done = true;
                // delete the alert entity
                datastore.delete(k1);
            }
            catch (MessagingException e)
            {
                log.warning("Error sending email " + index + " " + smtphost);
                if (index == (mxrecords.size() - 1))
                {
                    throw new ServletException("Error sending email");
                }
            }

            index++;
        }

        
        out.println("");

    }

}
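
A note on the MX lookup above: the JNDI DNS provider typically returns each MX value as a string of the form "preference host" (for example "10 mail.example.com."). Sorting these raw strings compares the preferences as text, so "10 ..." sorts before "5 ...", and the raw value still carries the preference in front of the host name. The following is a minimal sketch of one way to order the records numerically and extract the host. The class MxRecords and its method are hypothetical helpers for illustration and are not part of the application source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of the WebMonitor source.
final class MxRecords
{
    /** One parsed MX entry; lower preference values are tried first. */
    static final class Mx
    {
        final int preference;
        final String host;

        Mx(int preference, String host)
        {
            this.preference = preference;
            this.host = host;
        }
    }

    /**
     * Turns raw "preference host" strings into host names ordered by
     * ascending numeric preference. Malformed entries are skipped.
     */
    static List<String> orderedHosts(List<String> rawRecords)
    {
        List<Mx> parsed = new ArrayList<>();
        for (String raw : rawRecords)
        {
            String[] parts = raw.trim().split("\\s+");
            if (parts.length != 2)
            {
                continue;
            }
            try
            {
                int preference = Integer.parseInt(parts[0]);
                String host = parts[1];
                // Drop the trailing dot of a fully qualified name
                if (host.endsWith("."))
                {
                    host = host.substring(0, host.length() - 1);
                }
                // An empty host (the "null MX" record "0 .") is not usable
                if (!host.isEmpty())
                {
                    parsed.add(new Mx(preference, host));
                }
            }
            catch (NumberFormatException e)
            {
                // Preference is not a number: ignore this entry
            }
        }
        // List.sort is stable, so equal preferences keep their DNS order
        parsed.sort(Comparator.comparingInt(mx -> mx.preference));

        List<String> hosts = new ArrayList<>();
        for (Mx mx : parsed)
        {
            hosts.add(mx.host);
        }
        return hosts;
    }

    public static void main(String[] args)
    {
        List<String> raw = Arrays.asList("10 mx2.example.com.", "5 mx1.example.com.");
        System.out.println(orderedHosts(raw));
    }
}
```

The returned list could then be used in place of the raw MX strings when choosing the SMTP host to try.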

Security Threat Model for Service Function

The service function has a number of security checks. The servlet controller enables CORS (Cross-Origin Resource Sharing) only for domains (FQDN) that are registered in the application. It inspects the Origin header, and requests that come from an unknown Origin are not processed.
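
The Origin check described above can be sketched roughly as follows. This is an illustration only, not the application's controller code: the class OriginCheck, the method name and the decision to accept both http and https origins are assumptions.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the WebMonitor source.
final class OriginCheck
{
    /**
     * Returns true only when the Origin header names an http(s) origin
     * whose host is exactly one of the registered domains (lower case).
     */
    static boolean isRegisteredOrigin(String origin, Set<String> registeredDomains)
    {
        if (origin == null || origin.isEmpty())
        {
            return false;
        }
        try
        {
            URI uri = new URI(origin);
            String scheme = uri.getScheme();
            String host = uri.getHost();
            // The literal Origin "null" and other opaque values have no scheme or host
            if (scheme == null || host == null)
            {
                return false;
            }
            if (!scheme.equalsIgnoreCase("https") && !scheme.equalsIgnoreCase("http"))
            {
                return false;
            }
            // Exact host match: "www.example.com.evil.test" must not pass
            return registeredDomains.contains(host.toLowerCase(Locale.ROOT));
        }
        catch (URISyntaxException e)
        {
            return false;
        }
    }
}
```

A servlet using such a check would echo the Origin back in the Access-Control-Allow-Origin response header only when the method returns true, and would stop processing otherwise.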

The JSON input sent from the client is parsed using a simple custom JSON parser. The parser has only a small set of functions for parsing JSON, which minimizes its attack surface. Serialization and deserialization of objects are not supported. Invalid JSON input will not be processed by the service application handler. The JSON input is manually converted into a WebReport object by the handler.

The Data Access Object (DAO) used to access the datastore cleanly separates commands from input parameters. This prevents injection attacks. The Google App Engine task queue is protected with proper security-constraint entries in the web.xml file. The processing and retry rate of the task queue is configured in the queue.xml file under the WEB-INF directory. The processing rate can be increased to handle more concurrent incoming requests.

Extensive logging is enabled to allow quick detection of errors and attacks.

Currently the service function doesn't implement any rate limiting to control the number of incoming connections and web reports. It relies on Google App Engine's ability to scale up to handle incoming load. Some form of rate limiting can be implemented in the future to offer more protection.
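
As an illustration of what such rate limiting could look like, here is a minimal fixed-window counter keyed by client IP address. It is a sketch under stated assumptions, not something the application implements: the class name and limits are hypothetical, the state lives in the memory of a single instance (App Engine instances do not share it), and old entries are never evicted.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not part of the WebMonitor source.
final class FixedWindowRateLimiter
{
    private final int maxRequests;
    private final long windowMillis;
    // client IP -> { window start in millis, requests seen in that window }
    private final Map<String, long[]> counters = new HashMap<>();

    FixedWindowRateLimiter(int maxRequests, long windowMillis)
    {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    /**
     * Returns true if the request is within the limit for its window.
     * The current time is passed in so that the logic is easy to test.
     */
    synchronized boolean allow(String clientIp, long nowMillis)
    {
        long[] state = counters.get(clientIp);
        if (state == null || nowMillis - state[0] >= windowMillis)
        {
            // First request from this IP, or the previous window has expired
            state = new long[] { nowMillis, 0 };
            counters.put(clientIp, state);
        }
        if (state[1] >= maxRequests)
        {
            return false;
        }
        state[1]++;
        return true;
    }
}
```

A servlet could call allow(request.getRemoteAddr(), System.currentTimeMillis()) and answer with HTTP status 429 when it returns false. A shared store would be needed for a limit that holds across instances.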

Application Setup

To run the application, a Google App Engine account is required. Log in to the Google Cloud console and create a new project for the application. Initialize App Engine by setting the language to Java and selecting a location.

Deploying to App Engine

Obtain a copy of the application source code from the GitHub link at the bottom of the article. It can be imported into Eclipse as a Maven project and deployed to Google App Engine using Eclipse and the Google Cloud tools. Alternatively, the application can be deployed using Maven and the Google Cloud tools.

Run the gcloud command to initialize the Cloud SDK for the correct project. Refer to the Google Cloud SDK documentation for details on how to do this. From the application directory where the pom.xml file is located, run the following Maven commands to deploy.

mvn clean package -Dmaven.test.skip=true
mvn appengine:deploy -Dmaven.test.skip=true

The Selenium end to end tests are skipped. These tests are for the administrative function and can be run after the application has been fully set up.

The application uses the Google Authenticator Mobile App for 2FA authentication. Run through the earlier article Implementing 2 Factor Authentication for Web Security if you have not done so. The article includes instructions on how to deploy to Google App Engine using Eclipse IDE and how to configure 2 Factor Authentication for Google Authenticator Mobile App. The current monitoring application uses the same 2FA login mechanism.

Creating a User Entity in Cloud DataStore

Create a user entity in the Google Cloud Datastore Console with "User" as kind and the email address as the name (unique key). The following illustrates this

Fill in the following properties for the user entity.

Property Name	Type	Indexed	Value
AccountLock	Boolean	Yes	False
Action	String	Yes	Alert
CaptureModeIPAddress	String	Yes
Domaincount	Integer	Yes	0
FailLogin	Integer	Yes	0
Mode	String	Yes	Disable
RedirectionURL	String	Yes
Password	String	No	XXXXXXXXXXXXXX
Salt	String	No	XXXXXXXXXXXXXX
TOTP	String	No	XXXXXXXXXXXXXX

The Password, Salt and TOTP values are generated with two Java utilities, GenerateHashPassword.java and GenerateOTPSecret.java. These two utilities can be run to create the hexadecimal values for the password, salt and TOTP. For OTP, a base 32 secret key will be generated. This key has to be configured in the Google Authenticator Mobile App. Refer to the article Implementing 2 Factor Authentication for Web Security for details on how to do this.
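
To make the relationship between the hexadecimal TOTP value and the base 32 key concrete, the sketch below generates a random secret and prints it in both encodings. It is not the code of GenerateOTPSecret.java. The class name, the 20 byte secret length and the unpadded base 32 output are assumptions chosen for illustration.

```java
import java.security.SecureRandom;

// Hypothetical sketch, not the GenerateOTPSecret.java utility.
final class OtpSecretSketch
{
    private static final char[] BASE32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".toCharArray();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Generates a 20 byte (160 bit) random secret. */
    static byte[] newSecret()
    {
        byte[] secret = new byte[20];
        new SecureRandom().nextBytes(secret);
        return secret;
    }

    /** Lower case hexadecimal encoding, two characters per byte. */
    static String toHex(byte[] data)
    {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data)
        {
            sb.append(HEX[(b >> 4) & 0x0f]).append(HEX[b & 0x0f]);
        }
        return sb.toString();
    }

    /** RFC 4648 base 32 encoding without '=' padding. */
    static String toBase32(byte[] data)
    {
        StringBuilder sb = new StringBuilder();
        int buffer = 0; // holds the bits that have not been emitted yet
        int bits = 0;   // number of valid low bits in buffer
        for (byte b : data)
        {
            buffer = (buffer << 8) | (b & 0xff);
            bits += 8;
            while (bits >= 5)
            {
                sb.append(BASE32[(buffer >> (bits - 5)) & 0x1f]);
                bits -= 5;
            }
            // Keep only the bits that are still pending
            buffer &= (1 << bits) - 1;
        }
        if (bits > 0)
        {
            // Pad the final group with zero bits on the right
            sb.append(BASE32[(buffer << (5 - bits)) & 0x1f]);
        }
        return sb.toString();
    }

    public static void main(String[] args)
    {
        byte[] secret = newSecret();
        System.out.println("TOTP (hex, for the datastore) : " + toHex(secret));
        System.out.println("Base 32 key (for the mobile app): " + toBase32(secret));
    }
}
```

Both strings encode the same bytes. The hexadecimal form would go into the TOTP property and the base 32 form would be typed into the authenticator app. Treat both as secrets.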

The two utilities are available from the GitHub link at the bottom of the article. Note: the password, salt and OTP values that are generated should be kept secret and secure. The following illustrates a user entity and its properties.

The Python 3 script, createuser.py, can help to automate the creation of a user entity. The script will prompt for the user email, hexadecimal password, salt and OTP. It will create a user entity in Google Cloud Datastore using this information. The script is available in the project directory of the application source. Before running the script, the Google Cloud Datastore client libraries for Python have to be installed, and the proper project and authorization have to be set up using the Google Cloud tools.

Configuring Sender Email Address

For sending out emails, an authorized email sender (from email address) is required. Authorized senders can be added in the App Engine Dashboard -> Settings -> Email Senders. Only the currently logged in Gmail user can add itself as an authorized sender. This means that to add a sender, the sender must be an owner of the project.

To use another email sender besides the current default owner, add a new project owner in the Google Cloud Identity and Access Management (IAM). This new account has to be secured just like the default Google Cloud account. Once this owner has added itself as an authorized App Engine Email Sender, its role can be changed to Project viewer, which has fewer privileges.

The following shows an email sender added at the App Engine Dashboard.

The application source has an AppConstants.java file where the from email address can be configured. Locate the file and set the String variable, fromemail, to the email address that was added as an email sender.

Setting Up Remotehost in Monitoring Javascript

The client-side monitoring javascript is pmon.js, located under the scripts directory. Locate this file and look for the "remotehost" variable at the top of the script. Change the value of this variable and point it to

https://<your google project id>.appspot.com.

This is the Google URL for your App Engine application. Notice that the monitoring application uses TLS (Transport Layer Security). This ensures that information is sent over a secure link.

Deploy the application to App Engine using the Eclipse IDE. Refer to the Google App Engine documentation if you are not sure how to do this. The earlier article Implementing 2 Factor Authentication for Web Security has some instructions and screenshots on deployment as well.

The full url for the monitoring client-side javascript is

https://<your google project id>.appspot.com/scripts/pmon.js

This script can be included in html pages that are to be monitored. It can be injected through a reverse proxy as described in the Design Overview and Approach section. The use of Nginx reverse proxy and the filter module is the recommended method.

Injecting the Monitoring Script Using Nginx and Filter Module

Refer to the article Writing an Nginx Response Body Filter Module for instructions on how to set up an Nginx reverse proxy with a filter module to inject a script tag. The technique is used to insert the monitoring script into html content. Take note that the script tag that is inserted should not have the "async" attribute. This gives the inserted script a better chance of being the first script to run.

Note that the Nginx filter checks the first 512 characters of an html document for the <head> tag and inserts the javascript tag after it. If the head tag is not found (malformed document or defaced web content), the filter can be configured to "block" the document and display a blank page. If a web document has so many leading comments that the <head> tag falls beyond the first 512 characters, the filter can mistake it for a malformed document.

The 512 characters limit is configurable in the filter source. Refer to the Nginx article for details. It is not recommended to increase this limit. A larger value can cause blocking to fail and performance to decrease. Decreasing this limit can help improve performance as less of the output is scanned.
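
The scanning behaviour described above can be illustrated with a small sketch. The real filter is an Nginx module and its code is in the earlier article. The Java class below only mirrors the idea, and its names, the regular expression and the rule that the whole tag must lie inside the scan window are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the logic, not the Nginx filter module.
final class HeadInjectSketch
{
    /** Only this many leading characters are searched for the head tag. */
    static final int SCAN_LIMIT = 512;

    // Matches <head> or <head attr="..."> but not <header>
    private static final Pattern HEAD_TAG =
            Pattern.compile("<head(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    /**
     * Returns the html with scriptTag inserted right after the head tag,
     * or null when no head tag is found within the scan window. A caller
     * configured to "block" would serve a blank page for a null result.
     */
    static String inject(String html, String scriptTag)
    {
        Matcher m = HEAD_TAG.matcher(html);
        // Restrict the search to the first SCAN_LIMIT characters
        m.region(0, Math.min(html.length(), SCAN_LIMIT));
        if (!m.find())
        {
            return null;
        }
        return html.substring(0, m.end()) + scriptTag + html.substring(m.end());
    }
}
```

A document whose head tag is pushed past the window by a long leading comment yields null here, which corresponds to the false "malformed document" case mentioned above.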

Testing the Application (Simulating Web Defacement)

This section shows a simple use case and test for the monitoring application. It illustrates how the application can detect unauthorized content changes such as website defacements.

Log in to the application at your App Engine project url. Make sure that 2FA secret key has been configured into the Google Authenticator Mobile App so that it can generate the OTP that is required for logging in.

Add the domains that you want to monitor and then switch the application to capture mode. Capture mode can be set from the Configure Mode page. The following shows the domains added and the application set to capture mode.

Fig 15. Domains and Capturing Mode

On the website to be monitored, create a new test.html file in the web document root with the following content.

<!DOCTYPE html>

<html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Test Monitor </title>
</head>
<body>

<h2>Hello Testing Web Monitor</h2>
<p>
This is a test page for client side web content monitoring.
</p>

</body>
</html>

Browse to the test file.

If the Nginx reverse proxy is set up properly to inject the monitoring javascript, you should see the pmon.js script being included after the <head> tag when you view the html source.

Go back to the backend application and look under the Monitored WebPages. Select "www.nighthour.sg" from the drop down selection. test.html should be displayed. It has been populated into the datastore as a url with the relevant sha256 hash value and other information. As the application has been set to Capture mode, browsing to a url where the client-side script is present will add the url as a monitored URL.

Notice that the monitored url is https://www.nighthour.sg/test.html. This is different from https://nighthour.sg/test.html. To monitor both, browse to https://nighthour.sg/test.html when the application is in Capture Mode. The client-side script will send a web report to the application, to add the url.

Under Capture Mode, the application will check the IP address of incoming webreports. The IP address has to be the same IP that activates the Capture Mode in order for the URL to be added as a monitored page. This is to prevent attacks that attempt to poison the application datastore with false url hashes, content etc... You can log in to the Google Cloud Datastore console to see the URL information that is stored.
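
The capture mode rule can be summarised in a few lines of code. This is a sketch of the rule as described, not the application's handler. The class name, the mode string "Capture" and the plain string comparison of addresses are assumptions.

```java
// Hypothetical sketch of the capture mode rule, not the WebMonitor handler.
final class CaptureCheck
{
    /**
     * A web report may add or update a monitored URL only when the
     * application is in capture mode and the report comes from the same
     * IP address that switched capture mode on.
     */
    static boolean acceptForCapture(String mode, String captureModeIp, String senderIp)
    {
        if (!"Capture".equalsIgnoreCase(mode))
        {
            return false;
        }
        if (captureModeIp == null || captureModeIp.isEmpty() || senderIp == null)
        {
            return false;
        }
        // Plain text comparison; differently written forms of the same
        // IPv6 address would need normalising first
        return captureModeIp.equals(senderIp.trim());
    }
}
```

Reports from any other address are ignored for capture purposes, which is what stops a third party from seeding the datastore with hashes of its own choosing.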

If a web page contains sensitive information that should be kept private, the client-side javascript should not be deployed for that page. The client-side javascript will continue to send webreports (including any sensitive content on the page) to the application even if a URL is not being monitored.

Change the backend application to the "monitor" mode, to stop capturing and do the actual monitoring. Alerts will be sent to the user email used for logging in.

Modify the content of test.html and browse to it again. This simulates a case where the web content has been changed without proper authorization. For example, a website is hacked and defaced through some database injection attacks.
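
Any edit to the page, however small, produces a different sha256 value, which is what the application compares against the hash stored during capture. In the application the hash is computed in the browser with the Web Crypto API. The Java sketch below only illustrates the comparison idea, and its class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not part of the WebMonitor source.
final class ContentHashSketch
{
    /** Returns the SHA-256 digest of the UTF-8 bytes of content as lower case hex. */
    static String sha256Hex(String content)
    {
        try
        {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest)
            {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        }
        catch (NoSuchAlgorithmException e)
        {
            // Every Java platform is required to provide SHA-256
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** True when the reported hash differs from the one stored during capture. */
    static boolean hasChanged(String storedHashHex, String reportedHashHex)
    {
        byte[] stored = storedHashHex.toLowerCase().getBytes(StandardCharsets.US_ASCII);
        byte[] reported = reportedHashHex.toLowerCase().getBytes(StandardCharsets.US_ASCII);
        return !MessageDigest.isEqual(stored, reported);
    }

    public static void main(String[] args)
    {
        String original = sha256Hex("<h2>Hello Testing Web Monitor</h2>");
        String defaced = sha256Hex("<h2>Hacked</h2>");
        System.out.println("changed: " + hasChanged(original, defaced));
    }
}
```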

When a visitor browses the modified web page, the application will send an email alert to the administrator. The following shows an alert email received by the administrator.

Go back to the application and set a redirection error page. If the monitored web content is modified, the browser will be redirected to the error page. As mentioned earlier, the error page should be hosted on a third party site. In this case, the error page is located on the App Engine Java application itself. To prevent potential infinite looping, the error page should not be a monitored url.

Browse to the modified test.html page again. This time, the client-side javascript should redirect the browser to the error page set in the application after a short interval. This can be useful if a website is defaced; the end user will be exposed to the defaced content only for a short interval. An email alert is sent as well.

Although testing is done using a html file, the monitoring script can be used on dynamically generated html content from php, jsp etc... As long as the generated content is static and doesn't change frequently, the application should be able to monitor it properly.

The testing also doesn't cover images, external css or javascript files that are included in a html document. The monitoring application is able to detect images that are modified, external css or javascript that are modified etc... You can test this out on your own.

Selenium and Junit Testing

Under the application test directory, there are a number of Junit and Selenium Web Driver tests that can be used for automated end-to-end testing of the administrative functions. The file TestConstants.java contains the various variables used by each of the Junit tests. For example, the user account password, the hexadecimal OTP key, the Google App Engine application url etc...

Fill in these with the relevant values before running the Junit tests. The MonitoredWebPagesTest.java test requires specific test URLs to be present in the datastore. These can be created either directly in the datastore or added through the capturing mode in the application.
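
The tests need the hexadecimal OTP key because an automated login has to produce the current one-time password itself. The sketch below shows how a 6 digit code can be derived from such a key under the standard TOTP parameters that Google Authenticator uses (RFC 6238 with HMAC-SHA1 and a 30 second time step). It is an illustration, not the code in the test directory, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch, not the code used by the WebMonitor tests.
final class TotpSketch
{
    /** Decodes a hexadecimal string (two characters per byte) into bytes. */
    static byte[] hexToBytes(String hex)
    {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++)
        {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Computes the 6 digit TOTP code for the given Unix time in seconds. */
    static String totp(byte[] secret, long unixSeconds)
    {
        // The moving factor is the number of 30 second steps since the epoch
        long counter = unixSeconds / 30;
        byte[] message = ByteBuffer.allocate(8).putLong(counter).array();
        try
        {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret, "HmacSHA1"));
            byte[] hash = mac.doFinal(message);

            // Dynamic truncation as defined in RFC 4226
            int offset = hash[hash.length - 1] & 0x0f;
            int binary = ((hash[offset] & 0x7f) << 24)
                    | ((hash[offset + 1] & 0xff) << 16)
                    | ((hash[offset + 2] & 0xff) << 8)
                    | (hash[offset + 3] & 0xff);
            return String.format("%06d", binary % 1000000);
        }
        catch (GeneralSecurityException e)
        {
            throw new IllegalStateException("Unable to compute TOTP", e);
        }
    }

    public static void main(String[] args)
    {
        // Placeholder key for demonstration only; use the real key from TestConstants
        byte[] secret = hexToBytes("3132333435363738393031323334353637383930");
        System.out.println(totp(secret, System.currentTimeMillis() / 1000L));
    }
}
```

Because the code depends on the clock, the machine running the tests needs reasonably accurate time.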

The Junit tests rely on Selenium WebDriver version 3. A native driver for the browser is required. In this case, the Mozilla GeckoDriver is used. The link to download this is available from the SeleniumHQ website. The automated testing is configured to use the Mozilla Firefox browser.

The application service function has to be tested manually. Currently there are no automated tests created for this.

Conclusion and Afterthought

Websites have been using analytics, Application Performance Management (APM) and Real User Monitoring (RUM), techniques that rely partly on client-side scripting to gather usage data, troubleshoot bottlenecks etc... This article attempts to extend such a mechanism to security monitoring. It shows how to build a simple web content monitoring application using client-side script and a Google App Engine application.

The application can detect unauthorized content changes such as web defacement and send an email alert. It can also optionally redirect the user browser to an error page. The pros and cons of such an approach are discussed, and mitigations to prevent tampering of the client-side javascript are implemented.

The client-side javascript can potentially be improved further to detect DOM changes and cater for Single Page Applications (SPA). Html 5 local storage can be used for caching hashes, reducing network communication and improving performance. The change detection strategy can be improved further. The current simple hashes work only for static html content. Other change detection methods such as document distance, Bayesian analysis or other machine learning techniques can be used to handle dynamic html content.

Some other improvements that can be made include usability issues. For instance, the administrative service requires setting the application to capturing mode and browsing the website in order to update changes or add a url to the protection. More can be done to make such security monitoring more usable. Ease of use is important for the adoption of security.

Current web browsers offer limited ability to secure client-side scripts. Greater use of client-side processing for security purposes would be possible if web browsers supported more features to secure client-side scripting. Examples include mandatory scripts (a security script must always run), control over the order of script execution (a security script must run first), and isolation protection against script poisoning (a mandatory security script that runs first gets to set up a secure, isolated, tamper-proof namespace where built-in objects cannot be modified).

The computing environment is constantly evolving. Some recent trends include CDN providers moving processing closer to the edge. Cloudflare, a CDN that offers web security, has a Cloudflare App platform that allows for easy deployment of client-side scripts. Client-side applications have further room to improve and grow, including in the area of security.

Useful References

Writing an Nginx Response Body Filter Module, an earlier article that describes how to build an Nginx filter module that can inject a script into html content. This is the recommended method to inject the client-side monitoring javascript described in the current article.
Implementing 2 Factor Authentication for Web Security, an earlier article that describes how to build a 2FA login using Google Authenticator Mobile App. The 2FA login mechanism is used in this current article.
Testing 2 Factor Authentication with Selenium, an earlier article that describes how to use Selenium Web Driver to test 2 factor authentication based on Google Authenticator Mobile App. The login Junit tests in this article use the same mechanism.
Meerkat: Detecting Website Defacements through Image-based Object Recognition, A paper that describes the use of image recognition and machine learning techniques to detect website defacements, presented at the 24th USENIX Security Symposium (USENIX Security '15).
Detecting Homepage Defacement With Active Health Checks, An interesting blog post about using Nginx plus active health checks to detect web defacement.
Zone-H, A website containing a database of website defacements.
CloudFlare App, A platform that allows client side scripts to be deployed for websites that are using CloudFlare CDN (Content Delivery Network) and WAF (Web Application Firewall).
How we built Origin CA: Web Crypto, An interesting cloudflare blog post on usage of client-side WebCrypto.
Keeping secrets with JavaScript: An Introduction to the WebCrypto API, by Tim Taubert. A useful introduction to WebCrypto API. WebCrypto provides cryptographic functions for client-side javascript.
Mozilla WebCrypto API Doc, Mozilla documentation on WebCrypto API.
Friday the 13th: JSON Attacks, A blackhat paper explaining about JSON deserialization attacks.
Deep dive into the murky waters of script loading, An article on the order of javascript loading and execution for various browsers.
The Most Effective Way to Protect Client-side JavaScript Applications, An article on protecting client-side javascript. The article assumes that the client browser is controlled by an attacker that will tamper with javascript. This differs from the monitoring implementation described here where the threat model assumes that majority of browsers are non malicious and functioning properly.
OWASP top 10, The OWASP top 10 site offering information on the 10 most common and critical vulnerabilities that all web applications should protect against.
APPSEC Cali 2018 - Edgeguard: Client-side DOM Security - detecting malice - An Open Framework, A YouTube video on a client side javascript, Edgeguard, that can be used for security monitoring and protecting end users. Security professionals and companies are starting to bring security solutions closer to the edge, including the use of client-side script. The Edgeguard presentation involves a model where the end point can potentially be compromised and there is an adversarial relationship between attackers and defenders. This differs from the model in this article, where it is assumed that the majority of browsers are non-malicious and functioning properly.
A SingCert Advisory on Web defacements of Singapore websites, Singapore CERT advisory on an increase of web defacement attacks against Singapore websites. Today's internet is a dangerous place; website owners need to take security seriously and actively protect their critical web assets.

The full source code for the web monitoring application is available at the following Github link.
https://github.com/ngchianglin/WebMonitor

The source code for the two utility applications that create the password, salt and OTP values for the user entity in this article is available at the following GitHub link. https://github.com/ngchianglin/2faUtility

The source code for the simple JSON parser used in this application. https://github.com/ngchianglin/SimpleJsonParser

If you have any feedback, comments, corrections or suggestions to improve this article, you can reach me via the contact/feedback link at the bottom of the page.

Article last updated in May 2018.