Random Related Posts in Jekyll
Related posts are suggestions that may interest someone reading the current post. They are useful because readers can find more relevant content easily.
Related Posts vs Recent Posts
In Jekyll, there is a site.related_posts variable, and related posts are rendered from it. This sounds simple. However, if you look closely at the result, you will notice that it lists recent posts, not related posts.
This is because of the default Jekyll configuration. Here is a description from Jekyll:
site.related_posts
If the page being processed is a Post, this contains a list of up to ten related Posts. By default, these are the ten most recent posts. For high quality but slow to compute results, run the jekyll command with the --lsi (latent semantic indexing) option.
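For context, a related-posts list is typically rendered with a loop like the one below. This is a minimal sketch, not the exact template from this blog: the limit of three and the list markup are assumptions for illustration.

```liquid
{% comment %}
  Sketch: render up to three entries from site.related_posts.
  The limit and the <ul>/<li> markup are illustrative assumptions.
{% endcomment %}
<ul>
  {% for post in site.related_posts limit: 3 %}
    <li><a href="{{ post.url | relative_url }}">{{ post.title }}</a></li>
  {% endfor %}
</ul>
```

Without LSI, this loop simply prints the most recent posts, which is the behavior described above.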
Related Posts with Latent Semantic Indexing
Latent semantic indexing (LSI) is a widely used technique for calculating the similarity between documents. By enabling it, we get posts that are actually related. In Jekyll, you either run the build with the --lsi option or set lsi: true in _config.yml. When it is enabled, the build output shows the message Populating LSI... while the index is built.
Now the related posts are actually related. Although this gives a better result, it is quite slow. In my case, the build takes 36 times longer than a normal build. Moreover, there is a bigger problem if you host your blog on GitHub Pages.
Problem on GitHub Pages
Simply put, GitHub Pages does not support the lsi option when generating sites. No clear reason is given, but I guess performance is one of them. As you saw, the build is much slower than a normal one. GitHub Pages is currently free, and LSI uses a lot of resources.
GitHub Pages acts differently depending on the source type: Jekyll or HTML. For Jekyll, it generates the site and deploys it. For HTML, however, the source is deployed as-is, without a generation stage. Generally, the gh-pages branch is used for the HTML source. This means that you can use LSI if you push a locally generated site and change the source type. However, you then need to push the generated site yourself every time.
Random Related Posts
There are some techniques that work around this GitHub Pages limitation by using Jekyll tags or categories. One shows recent posts in the same category. Another counts the number of matching tags. Those are doable, but I’m looking for a more content-based suggestion like LSI, without manually generating the site. Until then, I think random posts are okay as a way to explore the blog. Some might actually be relevant for readers.
site.posts contains all posts, including the current post. Filtering it with where_exp prevents the current post from showing up as its own related post.
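The two pieces above can be combined into a sketch like this one. It assumes Jekyll's sample filter for the random pick and a count of three. Both are assumptions for illustration, as are the variable name random_posts and the list markup.

```liquid
{% comment %}
  Sketch: pick three random posts, excluding the current one.
  - where_exp drops the post whose URL matches the current page.
  - sample: 3 is an assumed way to pick random entries.
  random_posts is a hypothetical variable name.
{% endcomment %}
{% assign random_posts = site.posts | where_exp: "post", "post.url != page.url" | sample: 3 %}
<ul>
  {% for post in random_posts %}
    <li><a href="{{ post.url | relative_url }}">{{ post.title }}</a></li>
  {% endfor %}
</ul>
```

Note that the random pick happens at build time, so the list on each page stays the same until the site is generated again.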