
@demondehellis

About technologies and development.

Creating a Multilingual Jekyll Blog Without Plugins

How to add support for multiple languages to a Jekyll blog without relying on external plugins.

Jekyll doesn’t support multilingual sites out of the box, and the existing plugins are often abandoned or don’t quite work the way I wanted. I decided to tackle it myself, avoiding external dependencies. Here’s how it turned out.

Overview of Changes

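The task essentially breaks down into several parts, each covered in a section below: restructuring the files by language, updating _config.yml, moving UI strings into a translations file, adding hreflang metadata, turning the root page into a language redirector, filtering content by language, and splitting the RSS feeds.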

The trickiest part wasn’t just adding the languages themselves, but preserving the existing infrastructure that was already working: RSS feeds, categories, related post recommendations, and so on.

File Structure: Before and After

Before the changes, things were quite simple:

_content/
  _notes/
    blog/
      2025-03-25-first-post.md
      2025-04-01-second-post.md
  _categories/
    dev.md
    travel.md
index.html
feed.xml

After adding languages, the structure became significantly more complex:

_content/
  _notes/
    en/
      blog/
        2025-03-25-first-post.md
    ru/
      blog/
        2025-03-25-first-post.md
  _categories/
    en/
      dev.md
      travel.md
    ru/
      dev.md
      travel.md
_data/
  translations.yml
en/
  index.html
  feed.xml
ru/
  index.html
  feed.xml
index.html   # Root index now acts as a redirect/router

Now, each language has its own directories for content, categories, and utility files. The main index.html at the root has become a router that detects the user’s preferred language and redirects them (more on this later).

Why not just create a separate copy of the blog on a subdomain for the other language? The advice I came across, from SEO guides and from AI assistants such as ChatGPT, suggested that subdirectories (/en/, /ru/) are generally better for SEO than subdomains (en.example.com). Managing two separate sites, even if they share code, also becomes cumbersome, because every template change has to be kept in sync. Subdomains might seem simpler initially, but the subdirectory approach felt more maintainable in the long run.

Settings in _config.yml

First, I updated the configuration file. Instead of a single language, I now use an array:

# Before
lang: "ru"

# After
languages: ["en", "ru"] # List of supported languages
default_lang: "en"      # Define a default language

Next, I adjusted the URL structure and default settings for each language using Jekyll’s defaults:

# Default values based on path and language
defaults:
  - scope:
      path: ""
    values:
      lang: "en" # Default language for files without a specific language path

  # Common settings for all content within collections
  - scope:
      path: "_content/_*/*"
      type: "notes"
    values:
      layout: "content"

  # Settings specific to English content
  - scope:
      path: "_content/_*/en"
    values:
      lang: "en"
      permalink: "/en/:title/"

  # Settings specific to Russian content
  - scope:
      path: "_content/_*/ru"
    values:
      lang: "ru"
      permalink: "/ru/:title/"

The key here is the permalink structure. I opted for explicit language codes in the URL (/en/title/, /ru/title/). This is a straightforward and clear approach. Remember that after implementing this new structure, you’ll likely need to set up redirects from your old URLs to the new language-specific ones to preserve SEO value and avoid broken links.
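The redirect mapping mentioned above can be sketched as a small function. This is only an illustration under assumptions the article does not confirm: that the old permalinks had no language prefix, and that all pre-existing posts were in the original site language (ru, per the old lang setting). The name legacyRedirectTarget is hypothetical; the result could feed a 404-page script or a generated list of server redirect rules.

```javascript
// Hypothetical helper, not part of the original site.
// Maps a pre-migration path to its language-prefixed equivalent.
// Assumption: old URLs carried no language prefix and belonged to `legacyLang`.
function legacyRedirectTarget(path, languages, legacyLang) {
  // Paths that already start with a supported language need no redirect.
  const alreadyPrefixed = languages.some(
    (lang) => path === "/" + lang || path.startsWith("/" + lang + "/")
  );
  // The root page is handled by the language router, so it is skipped too.
  if (alreadyPrefixed || path === "/") return null;
  const normalized = path.startsWith("/") ? path : "/" + path;
  return "/" + legacyLang + normalized;
}
```

A return value of null means "leave this URL alone"; anything else is the new location for a permanent redirect.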

UI Translations

To avoid hardcoding text directly into templates, I created a translations file at _data/translations.yml:

en:
  tagline: "Travel, technology, and various things"
  description: "Interesting stuff about life abroad, technology, and nerdy things."
  read_more: "Read more"
  previous: "Previous post"
  next: "Next post"
  more: "MORE"

ru:
  tagline: "путешествия, технологии и всякие штуки"
  description: "Интересное о жизни за границей, технологиях и задротских штуках."
  read_more: "Читать далее"
  previous: "Предыдущий пост"
  next: "Следующий пост"
  more: "ЕЩЕ"

Now, within any template, you can access the translated strings using the current page’s language:

{{ site.data.translations[page.lang].read_more }}

The page.lang variable comes from the defaults configured earlier: files under an en/ or ru/ source directory receive the matching lang value, which also lines up with the /en/... and /ru/... URLs. Adding another language later only requires one more top-level key in this file.
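One thing to watch with this lookup: with Jekyll's default (non-strict) Liquid settings, a key that is missing for one language renders as an empty string rather than failing the build. A small consistency check can catch that. The sketch below is hypothetical and assumes the YAML has already been parsed into a plain object by whatever YAML parser you prefer; missingTranslationKeys is an illustrative name.

```javascript
// Hypothetical check, not part of the original site.
// Reports, per language, the keys present in the reference language but absent elsewhere.
function missingTranslationKeys(translations, referenceLang) {
  const referenceKeys = Object.keys(translations[referenceLang] || {});
  const missing = {};
  for (const [lang, strings] of Object.entries(translations)) {
    const absent = referenceKeys.filter((key) => !(key in strings));
    // Only languages with at least one gap appear in the result.
    if (absent.length > 0) missing[lang] = absent;
  }
  return missing;
}
```

An empty result object means every language defines all keys of the reference language.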

SEO and Language Metadata (hreflang)

This is one of the most critical parts. For a multilingual site, you need to tell search engines that different versions of a page exist in different languages. Otherwise, they might see your translations as duplicate content. The standard way to do this is a <link rel="alternate"> element with an hreflang attribute in the page’s <head>.

I created a dedicated include for this: _includes/meta/language.html:

{% assign all_pages = site.pages %}
{% for collection in site.collections %}
{% assign all_pages = all_pages | concat: collection.docs %}
{% endfor %}

{% for lang in site.languages %}
{% assign alt_prefix = '/' | append: lang | append: '/' %}
{% assign current_prefix = '/' | append: page.lang | append: '/' %}
{% assign alt_url = page.url | replace_first: current_prefix, alt_prefix %}
{% assign alt_page = all_pages | where: "url", alt_url | first %}
{% if alt_page and page.url != '/' %}
<link rel="alternate" hreflang="{{ lang }}" href="{{ site.url }}{{ alt_page.url }}" />
{% endif %}
{% endfor %}

<!--remember language preference-->
<script>window.location.pathname === '/' || window.localStorage.setItem("lang", "{{ page.lang }}");</script>

What’s happening here?

  1. Gather Content: Collect every regular page and every document from all collections (such as _notes) into a single all_pages array. You might need to restrict which collections are included, depending on your site structure.
  2. Iterate Languages: Loop through each language defined in site.languages.
  3. Construct Target URL: Work out what the current page’s URL would be in the target language by replacing the language prefix (e.g., swap /ru/ for /en/), and store it in alt_url.
  4. Find Alternate Page: Search all_pages for an entry whose url equals alt_url.
  5. Generate Tag: If a match is found and the current page is not the root (/), output a <link rel="alternate" hreflang="..." href="..."> tag pointing to the absolute URL of the alternate version.
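For readers who find Liquid hard to follow, the steps above can be mirrored in plain JavaScript. This is purely an illustration of the lookup and not code the site runs: alternateLinks is a hypothetical name, and knownUrls stands in for the URLs of everything collected into all_pages.

```javascript
// Illustrative only: mirrors the include's logic outside of Liquid.
// `knownUrls` is a Set standing in for the URLs of all pages and collection documents.
function alternateLinks(pageUrl, pageLang, languages, knownUrls, siteUrl) {
  // The root redirector gets no hreflang tags.
  if (pageUrl === "/") return [];
  const currentPrefix = "/" + pageLang + "/";
  const links = [];
  for (const lang of languages) {
    const altPrefix = "/" + lang + "/";
    // A string pattern replaces only the first occurrence, like Liquid's replace_first.
    const altUrl = pageUrl.replace(currentPrefix, altPrefix);
    // Emit a link only when the translated page actually exists.
    if (knownUrls.has(altUrl)) {
      links.push({ hreflang: lang, href: siteUrl + altUrl });
    }
  }
  return links;
}
```

A post that exists only in English yields a single self-referencing entry, while a fully translated post yields one entry per language.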

This hreflang tag is crucial for SEO. It explicitly tells search engines: “Hey, this page has an equivalent version for language X, and here it is.” This helps them serve the correct language version to the right users and makes it less likely that translations are treated as duplicate content. The logic might seem a bit involved, but it automates the process nicely. Because the loop also runs for the page’s own language, every page emits a self-referencing link as well, which is standard practice.

Additionally, a small script is included (but only on actual language pages, not the root redirector) to save the current page.lang to the browser’s localStorage. This will be used for the automatic redirect from the root URL.

Root Homepage and Language Switching

The root index.html now primarily serves as a language redirector based on user preference or browser settings:

<script>
    (function() {
        if (/bot|crawl|spider/i.test(navigator.userAgent)) return;
        // Fall back to 'en' when the browser does not expose a language at all.
        const lang = localStorage.getItem('lang') || (navigator.language || 'en').split('-')[0];
        window.location.href = lang === 'ru' ? '/ru/' : '/en/';
    })();
</script>

<div class="prose prose-lg md:prose-xl prose-img:mx-auto mx-2 md:mx-auto text-center">
    <h1>Welcome</h1>
    👉 <a href="/en/">Read in English</a><br>
    👉 <a href="/ru/">Читать по-русски</a>
</div>

The script performs these steps:

  1. Checks if the visitor is likely a search engine bot. If so, it does nothing, allowing the bot to see the HTML links.
  2. Tries to get the preferred language from localStorage (set by the script in the hreflang include).
  3. If not found in localStorage, it uses the user’s browser language setting (navigator.language), defaulting to ‘en’ if unavailable.
  4. Redirects the user to the corresponding language subdirectory (/en/ or /ru/).
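The steps above hard-code two languages. As a hedged sketch, the same decision can be written as a pure function that takes the list of supported languages, which makes it easy to test outside a browser. The names pickLanguage and redirectPath are hypothetical, not taken from the site.

```javascript
// Hypothetical helpers, not part of the original site.
// Resolves a target language from a stored preference and a browser locale.
function pickLanguage(storedLang, browserLocale, supported, fallback) {
  // Primary subtag of the browser locale, e.g. "ru-RU" -> "ru".
  const browserLang = (browserLocale || "").split("-")[0].toLowerCase();
  // The stored choice wins over the browser setting.
  for (const candidate of [storedLang, browserLang]) {
    if (candidate && supported.includes(candidate)) return candidate;
  }
  return fallback;
}

// Builds the redirect path for the chosen language, e.g. "/ru/".
function redirectPath(lang) {
  return "/" + lang + "/";
}
```

In the browser this would be called with localStorage.getItem('lang') and navigator.language, and the result passed to window.location.href, as in the inline script.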

For search engines and users with JavaScript disabled, simple HTML links are provided as a fallback. If a user clicks a language link (e.g., clicks “Read in English” while their browser is Russian), the script in the _includes/meta/language.html template will save this choice (en) to localStorage, ensuring they are directed to English on subsequent visits to the root. Minimalist and effective.

In the site header (likely in _layouts/default.html or an include), I added a simple language switcher:

<div class="flex justify-center space-x-4 mb-2 text-sm">
    <a href="/en/" class="font-bold">EN</a>
    <a href="/ru/" class="">RU</a>
</div>

You could generate this dynamically using site.languages and the hreflang logic if you have more than two languages, but for two, this manual approach is simple.
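To illustrate what the dynamic version would compute, here is a minimal sketch in plain JavaScript rather than Liquid. It assumes a hypothetical switcherLinks helper and a knownUrls set holding every URL on the site: each language links to the translation of the current page when one exists, and to that language's home page otherwise.

```javascript
// Hypothetical helper, not part of the original site.
// Returns one switcher entry per language for the page being viewed.
function switcherLinks(pageUrl, pageLang, languages, knownUrls) {
  const currentPrefix = "/" + pageLang + "/";
  return languages.map((lang) => {
    const home = "/" + lang + "/";
    // Candidate URL of the same page in the other language.
    const translated = pageUrl.replace(currentPrefix, home);
    const hasTranslation =
      pageUrl.startsWith(currentPrefix) && knownUrls.has(translated);
    return {
      label: lang.toUpperCase(),
      // Fall back to the language home page when no translation exists.
      href: hasTranslation ? translated : home,
      current: lang === pageLang,
    };
  });
}
```

The current flag corresponds to the font-bold class in the static markup.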

Filtering Content by Language

A crucial step is to ensure that users only see content relevant to the language they are currently browsing. For example, the English index page (/en/index.html) should only list English posts.

This was a key reason for avoiding the popular jekyll-polyglot plugin, which (by default) might show default language posts if a translation isn’t available. This could lead to English users seeing Russian posts, which wasn’t desired.

In templates where you list posts (e.g., index pages, category pages), filter the content by the current page.lang:

{% assign notes = site.notes | where: "lang", page.lang %}

(Note: Replace site.notes with site.posts or your actual collection variable if different.)

This ensures that the Russian version of the site only shows Russian posts, and the English version only shows English posts.

RSS Feeds

Create separate RSS feeds for each language:

en/feed.xml
ru/feed.xml

Inside the template for each feed (e.g., en/feed.xml), make sure to filter the posts similarly to how you did on index pages:

{% assign notes = site.notes | where: "lang", page.lang %}

(Again, adjust site.notes if needed). Define lang: en or lang: ru in the front matter of the respective feed file.

All Set!

With these changes, the blog is fully equipped for multilingual content. The structure is flexible, making it straightforward to add more languages if needed. The sitemap (sitemap.xml) can typically remain a single file listing all URLs across all languages; search engines handle this correctly when combined with hreflang.

What I like about this solution:

From an SEO perspective, it follows best practices:

Now, the main task left is… translating all the content!