
Angular and Restify SEO

apricode · May 15, 2016 · Coding, Web development


by Yoni Goyhman, November 5, 2015

How to create a Node & Restify server that supports SEO for an Angular-based website.

Developing a website using Node.js, Restify and Angular is fairly simple. Using Angular's two-way data binding together with Restify's automatic REST API creation, you can create a server-enhanced web app in no time.

Getting your website indexed by the different search engines and social media web crawlers (bots) is a different story. Most of the bots in use do not render JavaScript.

This means that when a crawler comes across your Angular-based app, it sees something like this:
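Roughly speaking, the crawler receives the empty Angular shell, before any data has been fetched or any template rendered (a simplified, hypothetical index.html; the real markup will differ):

```html
<!-- Simplified illustration of what a non-JS crawler receives: the Angular
     shell with no rendered content. "blogApp" is an illustrative module name. -->
<html ng-app="blogApp">
  <head>
    <base href="/">
    <meta name="fragment" content="!" />
  </head>
  <body>
    <!-- Empty until Angular fetches data and renders the template -->
    <div ui-view></div>
    <script src="app.js"></script>
  </body>
</html>
```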

To solve this issue, you need to supply an alternative HTML page that is already rendered to its final, static form.

The solution is constructed out of 3 main tasks:

  1. Create static HTML snapshots
  2. Recognize a bot
  3. Serve the static snapshot instead of the original HTML page
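Putting the three tasks together, the request flow can be sketched like this (a minimal sketch; `isBot`, `handleGet` and the bot list are illustrative names, not the blog's actual code):

```javascript
// Illustrative bot list; the full list used later in this post is longer.
var BOT_AGENTS = ['googlebot', 'facebookexternalhit', 'twitterbot'];

// Task 2: recognize a bot, either by user-agent (social media bots)
// or by the _escaped_fragment_ parameter (search engine bots).
function isBot(req) {
  var ua = (req.headers['user-agent'] || '').toLowerCase();
  return BOT_AGENTS.some(function (bot) { return ua.indexOf(bot) > -1; }) ||
         req.url.indexOf('_escaped_fragment_') > -1;
}

// Task 3: bots get the snapshot built offline in task 1;
// normal users get the Angular app.
function handleGet(req) {
  return isBot(req) ? 'snapshot' : 'angular-app';
}

console.log(handleGet({ url: '/', headers: { 'user-agent': 'Googlebot/2.1' } }));
// → snapshot
```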

We will use this blog as an example to demonstrate how we implement the solution.

Our architecture:

Server: Node.js + Restify + Mongodb

Client: Single page Angularjs app

The blog contains an index page in the default "/" route, and each blog post is served at "/:url".

We've added base and meta fragment tags to remove the default Angular #:

<base href="/">
<meta name="fragment" content="!" />

Angular handles the routing and data fetching. The route file looks like this:

$urlRouterProvider.otherwise('/');
$stateProvider
  .state('index', {
    url: "/",
    templateUrl: "views/index.html",
    controller: "indexCtrl"
  })
  .state('posts', {
    url: "/:url",
    templateUrl: "views/post.html",
    controller: "postCtrl"
  });

On the server side we serve the index.html file for all non-static GET requests to the blog.apricode.co.il domain:

server.get('/.*', function (req, res, next) {
  var url = req.url;
  var fileExtension = url.split('.')[url.split('.').length - 1];

  // staticTypes is a list of static file extensions (e.g. "js", "css", "png")
  // defined elsewhere; anything else gets the Angular shell.
  if (url == "/" || (req.method.toLowerCase() == "get" && staticTypes.indexOf(fileExtension) == -1))
    url = '/index.html';
  else if (url[url.length - 1] == '/')
    url = url + 'index.html';

  req.url = url;

  // Pick the document root for the requested (sub)domain, then invoke the
  // serve-static middleware with the rewritten url.
  var hostname = req.headers.host.split(":")[0];
  return serveStatic(config.server.domains[hostname].path, {fallthrough: false})(req, res, next);
});

1. Creating static HTML snapshots:

We decided to use Grunt for this task, or more specifically grunt-html-snapshot, which takes a list of URLs and generates an individual HTML snapshot for each one.

We used our server to create an array of all available post URLs and save it to a posts.json file.

// "path" is the destination of the posts.json file, defined elsewhere.
function savePostsJSON(posts) {
  var postsArray = [];
  for (var i = 0; i < posts.length; i++) {
    postsArray.push("/" + posts[i].url);
  }

  fs.writeFile(path, JSON.stringify(postsArray), function (err) {
    if (!err)
      console.log("JSON Saved!");
  });
}
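For a blog with two posts, the generated posts.json would look something like this (the second post URL is illustrative):

```json
["/angular_restify_seo", "/some_other_post"]
```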

Then we created a Grunt task to generate all the snapshots:

var availablePosts = grunt.file.readJSON('posts.json');

grunt.loadNpmTasks('grunt-html-snapshot');
grunt.initConfig({
  htmlSnapshot: {
    blog: {
      options: {
        snapshotPath: 'snapshots/',
        sitePath: 'http://blog.apricode.co.il',
        fileNamePrefix: 'blog_',
        urls: ["/"].concat(availablePosts),
        msWaitForPages: 2000,
        removeScripts: true,
        sanitize: function (requestUri) {
          if (/\/$/.test(requestUri)) {
            return 'index';
          } else {
            return requestUri.replace(/\//g, '');
          }
        }
      }
    }
  }
});

grunt.registerTask('default', 'htmlSnapshot');

The task creates a collection of "blog_"-prefixed HTML files under the snapshots folder.

Notice we've given the task a 2-second delay (msWaitForPages) to allow page rendering on slow connections.

2. Recognizing a bot:

There are 2 types of bots:

  • Search engine bots
  • Social media bots

The difference between them is that when a search engine bot encounters a <meta name="fragment" content="!" /> tag, it sends a new request for the same URL with an _escaped_fragment_= parameter appended.

Social media bots will not send a new request, and are only recognizable by their user-agent.
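The rewrite a search engine bot performs can be sketched as follows (a hypothetical helper, not part of the blog's code; real crawlers may also percent-encode special characters in the fragment value):

```javascript
// Sketch of the _escaped_fragment_ rewrite from Google's AJAX crawling scheme:
// "http://site/#!/page" becomes "http://site/?_escaped_fragment_=/page", and a
// page with the fragment meta tag but no hashbang gets an empty parameter value.
function toEscapedFragmentUrl(url) {
  var parts = url.split('#!');
  if (parts.length === 1)
    return url + '?_escaped_fragment_=';
  return parts[0] + '?_escaped_fragment_=' + parts[1];
}

console.log(toEscapedFragmentUrl('http://blog.apricode.co.il/#!/angular_restify_seo'));
// → http://blog.apricode.co.il/?_escaped_fragment_=/angular_restify_seo
```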

Here is our code for recognizing the different bots:

Social media bots: 

var userAgent = req.header('User-Agent').toLowerCase();

var socialUserAgents = 'baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator'.toLowerCase().split('|');
var isSeo = false;
for (var i = 0; i < socialUserAgents.length; i++) {
  if (userAgent.indexOf(socialUserAgents[i]) > -1) {
    isSeo = true;
    console.log("Found Social Bot, requested url: " + req.url);
  }
}

Search engine bots:

if (url.indexOf('_escaped_fragment_') > -1)
  isSeo = true;

3. Serving the static snapshot instead of the original HTML page:

After we've recognized a bot, we serve it a static HTML snapshot from the snapshots folder we created in step 1.

Recognizing the desired URL:

For social media bots the URL remains the same.

For search engines, we need to extract the parameter from the URL:

if (url.indexOf('_escaped_fragment_') > -1) {
  isSeo = true;
  requestedUrl = url.split('_escaped_fragment_=')[1];
  if (requestedUrl == "")
    requestedUrl = "/";
}

After getting the requested URL, we need to serve the prefixed static HTML file:

var hostname = req.headers.host.split(":")[0];
var subDomain = hostname.split('.')[0];
var prefix = subDomain + '_';
if (requestedUrl == '/' || requestedUrl == '')
  requestedUrl = prefix + 'index.html';
else if (requestedUrl[0] == '/')
  requestedUrl = prefix + requestedUrl.substring(1) + '.html';
else
  requestedUrl = prefix + requestedUrl + '.html';
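This mapping can be isolated into a small helper for illustration (a hypothetical standalone version, not the blog's actual code):

```javascript
// Standalone sketch of the filename mapping: subdomain prefix plus the
// request path with slashes stripped, matching the grunt-html-snapshot
// output names from step 1 (e.g. "blog_" + sanitized path + ".html").
function snapshotFileName(hostname, requestedUrl) {
  var prefix = hostname.split('.')[0] + '_';
  if (requestedUrl === '/' || requestedUrl === '')
    return prefix + 'index.html';
  if (requestedUrl[0] === '/')
    return prefix + requestedUrl.substring(1) + '.html';
  return prefix + requestedUrl + '.html';
}

console.log(snapshotFileName('blog.apricode.co.il', '/angular_restify_seo'));
// → blog_angular_restify_seo.html
```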

And inside the server.get:

serveStatic(config.SEO.snapshotsPath, {fallthrough: false})(req, res, next);

Testing:

We've tested the search engine bot by going to http://blog.apricode.co.il/?_escaped_fragment_=/angular_restify_seo and checking that we receive a static HTML file.

Checking the social media bots is done by entering the http://blog.apricode.co.il URL into a post text input (on Facebook / LinkedIn) and verifying a correct site preview appears.