Wednesday, April 27, 2016

Regex Recipe: Normalize URL

This recipe shows how to collapse repeated slashes in a URL while leaving the // in http:// and https:// intact.

Expected behaviour

Pass in a concatenated URL that contains repeated slashes and receive the cleaned-up URL.

Given Input:
http://devcdn.some-hub.net/html5//280131/publish///brand/bundle/games/
Expected Output:
http://devcdn.some-hub.net/html5/280131/publish/brand/bundle/games/

Normalize URL in Java

A negative lookbehind skips any run of slashes that directly follows http: or https:, so the // of the scheme is preserved:
    String normalizeURL(String url) {
        return url.replaceAll("(?<!http:|https:)/+/", "/");
    }

Normalize URL in Groovy

Preserves the // after any colon, as in http:// & https://:
    String normalizeURL(url) {
        return url.replaceAll(/(?<!:)\/+/, "/");
    }

Normalize URL in JavaScript

The capture group matches one non-colon character plus the first slash; "$1" puts that pair back, so only the extra slashes are dropped:
    function normalizeURL(url) {
        return url.replace(/([^:]\/)\/+/g, "$1");
    }
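
As a quick check, the JavaScript recipe can be run against the sample input from the top of the post. The sketch below repeats the function so it runs on its own, and adds a hypothetical variant, normalizeURLLookbehind (not part of the original recipe), that ports the Groovy-style lookbehind to JavaScript. That variant assumes an engine with ES2018 lookbehind support.

```javascript
// The recipe's function, repeated so this block is self-contained.
function normalizeURL(url) {
    return url.replace(/([^:]\/)\/+/g, "$1");
}

// Hypothetical variant (an assumption, not from the original post):
// collapse any run of two or more slashes that is not preceded by a colon.
// Requires ES2018 lookbehind support in the regex engine.
function normalizeURLLookbehind(url) {
    return url.replace(/(?<!:)\/{2,}/g, "/");
}

const input = "http://devcdn.some-hub.net/html5//280131/publish///brand/bundle/games/";

console.log(normalizeURL(input));
// http://devcdn.some-hub.net/html5/280131/publish/brand/bundle/games/

console.log(normalizeURLLookbehind(input));
// http://devcdn.some-hub.net/html5/280131/publish/brand/bundle/games/
```

The two versions differ at the very start of the string. The capture-group version needs a character before the slashes, so a protocol-relative URL such as //cdn/x is left unchanged, while the lookbehind variant collapses it to /cdn/x.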

