Random Related Posts in Jekyll
Related posts are suggestions that may interest someone reading the current post. They are useful because they help readers find more relevant content easily.
Related Posts vs Recent Posts
In Jekyll, there is a site.related_posts variable, which is used to render related posts. This sounds simple. However, if you look closely at the result, you will notice that it contains recent posts, not related posts.
This is because of the default Jekyll configuration. Here is a description from Jekyll:
If the page being processed is a Post, this contains a list of up to ten related Posts. By default, these are the ten most recent posts. For high quality but slow to compute results, run the jekyll command with the --lsi (latent semantic indexing) option.
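For reference, a layout usually renders this variable with a simple loop. The following is a minimal sketch, not taken from any particular theme; the list markup and the limit of 5 are assumptions chosen for illustration.

```liquid
{% comment %}
  Minimal sketch: list whatever Jekyll placed in site.related_posts.
  Without LSI, these are simply the most recent posts.
  The limit of 5 is an arbitrary choice for this example.
{% endcomment %}
<ul>
  {% for post in site.related_posts limit: 5 %}
    <li><a href="{{ post.url }}">{{ post.title }}</a></li>
  {% endfor %}
</ul>
```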
Related Posts with Latent Semantic Indexing
Latent semantic indexing is one of the most popular techniques for calculating the similarity between documents. By enabling it, we can get posts that are actually related. In Jekyll, you either run the build with the --lsi option or enable it in _config.yml (lsi: true). When it is enabled, it shows
Now, you can see the correct related posts. Although this gives better results, it is quite slow. In my case, the build was 36 times slower than a normal build. Moreover, there is a bigger problem if you host your blog on GitHub Pages.
Problem on GitHub Pages
Simply put, GitHub Pages does not support the lsi option when generating sites. No clear reason is given, but I guess performance is one of them. As you saw, it is much slower than a normal build. GitHub Pages is currently free, and LSI uses a lot of resources.
GitHub Pages behaves differently depending on the source type: Jekyll or HTML. For Jekyll, it generates the site and deploys it. For HTML, however, the source is deployed as-is, without a generation stage. Generally, the gh-pages branch is used for the HTML source. This means that you can use LSI if you push a locally generated site and change the source type. However, you then need to push the generated site yourself every time.
Random Related Posts
There are some techniques to work around this GitHub Pages limitation using Jekyll tags or categories. One shows recent posts in the same category. Another counts the number of matching tags. Those are doable, but I'm looking for a more content-based suggestion like LSI that does not require generating the site manually. Until then, I think random posts are good enough for exploring posts. Some might actually be relevant for readers.
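A minimal sketch of the idea, assuming Jekyll's built-in where_exp and sample filters are available in your Jekyll version. The variable names candidates and random_posts, the count of 3, and the list markup are hypothetical choices for this example, not a fixed recipe.

```liquid
{% comment %}
  Sketch only: pick a few random posts other than the current one.
  candidates, random_posts, and the count of 3 are illustrative names and values.
{% endcomment %}
{% assign candidates = site.posts | where_exp: "post", "post.url != page.url" %}
{% assign random_posts = candidates | sample: 3 %}
<ul>
  {% for post in random_posts %}
    <li><a href="{{ post.url }}">{{ post.title }}</a></li>
  {% endfor %}
</ul>
```

Note that the selection happens when the site is generated, so each page keeps the same random suggestions until the next build.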
site.posts contains all posts, including the current post.
where_exp prevents the current post from being shown as a related post.