2024 0906 lunr.js site search

I added search to this site on the main page.

It’s all client-side with lunr, which relies on building a JSON index of all site content at deploy time. It has no dependencies and is fast once the index is downloaded.

It comes more as Lego bricks than an out-of-the-box system, though. I needed to build an index for it to use, decide which fields to make searchable, add HTML for entering search terms, set up a place to show results, and write some JavaScript that calls lunr to perform the search and shows the results in my UI.

Here’s how I did it.

Generate the lunr index with Hugo

As mentioned, lunr needs a JSON index. I got Hugo to create this for me with a custom output format. From my config (other mediaTypes, outputFormats, and outputs are elided):

mediaTypes:
  #...
  application/json+lunr:
    suffixes: ["lunr.json"]

outputFormats:
  # ...
  JsonLunr:
    mediatype: application/json+lunr
    suffix: lunr.json
    isPlainText: true

outputs:
  # ...
  home:
    - HTML
    - JsonLunr
    # ...

I added layouts/_default/index.lunr.json:

{{/*  Build the lunr.js index for the site.
      -*- mode: go -*-

      Ignores the following sections:
        - warchive:     No good way to index the WARC and WACZ files
        - twarchive:    Would have to read Twitter data JSON files, not worth it

      The "content" is used for searching.
      The "summary" is displayed in search results.
        https://gohugo.io/methods/page/summary/
        Hugo generates one of a limited number (default 70) of words of the content,
        or splitting the content with a <!--more--> comment,
        or you can set it yourself with the "summary" frontmatter key.
*/}}
{{- $ignoreSections := slice "warchive" "twarchive" -}}
{{- $index := slice -}}
{{- range $page := $.Site.RegularPages -}}
  {{/*  The first section .Path will be like `/blog`; this gets just `blog`.  */}}
  {{- $firstSectionName := strings.TrimPrefix "/" $page.FirstSection.Path -}}
  {{- if collections.In $ignoreSections $firstSectionName -}}
    {{- continue -}}
  {{- end -}}
  {{- $content := $page.Plain | jsonify -}}
  {{- $tags := $page.Params.tags | default (slice) -}}
  {{- $technologies := $page.Params.technologies | default (slice) -}}
  {{- $index = $index | append (dict
    "title" $page.Title
    "uri" $page.Permalink
    "section" $firstSectionName
    "content" $content
    "summary" ($page.Summary | plainify | jsonify)
    "tags" (delimit $tags " ")
    "technologies" (delimit $technologies " ")
  ) -}}
{{- end -}}
{{/*  In dev mode, pretty-print the JSON; otherwise minify it  */}}
{{- $jsonifyArgs := dict -}}
{{- if eq hugo.Environment "development" -}}
  {{- $jsonifyArgs = dict "indent" "  " -}}
{{- end -}}
{{- $index | jsonify $jsonifyArgs -}}

And now the index is created at public/index.lunr.json.

Some notes about this index:

  • The Hugo template populates fields called “tags” and “technologies”, which are taxonomies I use here. You’ll want to adapt this to your own site’s taxonomies, or remove the taxonomy fields altogether.
  • I populate a “section” field, so that the section name like “blog” or “til” is searchable.
  • I run jsonify against the content and summary fields, and then again on the whole index, so those two fields are JSON-encoded twice and I have to decode them in my JavaScript (see below).
  • In development it creates a pretty-printed JSON file, but in production it’s minified.
  • At the time of this writing, the production index is 824KB.
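
To make the double encoding concrete, here is a small sketch that imitates what the template does to one page and then undoes it the way the browser has to. The page data and the buildEntry helper are invented for illustration; only the field names come from the template above. JSON.stringify is standing in for Hugo's jsonify, which escapes a few more characters but parses back the same way.

```javascript
// Hypothetical helper that imitates the Hugo template for a single page.
function buildEntry(page) {
  return {
    title: page.title,
    uri: page.uri,
    section: page.section,
    // Like `$page.Plain | jsonify`: the text is stored as a JSON string literal.
    content: JSON.stringify(page.plain),
    summary: JSON.stringify(page.summary),
    // Like `delimit $tags " "`: the terms become one space-separated string.
    tags: page.tags.join(" "),
    technologies: page.technologies.join(" "),
  };
}

// Made-up page data.
const page = {
  title: "Example post",
  uri: "https://example.com/blog/example-post/",
  section: "blog",
  plain: 'Text with "quotes" in it.',
  summary: "Text with quotes.",
  tags: ["hugo", "search"],
  technologies: ["lunr"],
};

// Like the final `$index | jsonify`: the whole array is encoded once more.
const fileText = JSON.stringify([buildEntry(page)]);

// Parsing the file removes one layer of encoding...
const docs = JSON.parse(fileText);
const stillEncoded = docs[0].content;

// ...and a second JSON.parse is needed to get the original text back.
const decoded = JSON.parse(stillEncoded);

console.log(stillEncoded);
console.log(decoded);
```

The first line printed still has the surrounding quotes and backslash escapes; the second is the plain text that lunr should actually index.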

Get lunr

To get it, you can use a whole package.json and npm install and spend 45 minutes every third Wednesday opting out of telemetry or whatever JavaScript people do. But in this case, the package has no dependencies and there are no security concerns (all client side, no private data), so I decided to just install a single version and update it later if I feel like it.

  • Get the npm package with npm pack lunr
  • This downloads (at the time of this writing) lunr-2.3.9.tgz to your current directory
  • Extract package/lunr.js and place it in your assets/ directory (the template below loads it from assets/js/lunr.js)
  • You could instead get package/lunr.min.js, but I got the non-minified version and am using Hugo’s JavaScript minification on build

Install lunr

Include the lunr code in your <head>.

{{ $lunrJs := resources.Get "js/lunr.js" | resources.Minify | resources.Fingerprint }}
<script src="{{ $lunrJs.Permalink }}" defer></script>

You also need to define your own search function and decide how you want to display results. I decided to put a search bar only on the main page of the site, display results below the main content of that page, and auto-scroll to the results when searching. The script adds the search term to the query string in the URL, adds entries to the browser’s history so that the back/forward buttons work, and runs a search automatically if the page is loaded with a search term in the query string. This means it feels like having a separate results page, but it’s all just part of the main page of my site.

That meant adding this HTML to my home page where I want the search bar:

  <form id="search" class="search" role="search">
    <label for="search-input">
      
      <svg class="spritecore" viewBox="0 0 100 100"><use href="https://me.micahrl.com/images/spritesheet.svg#fa-solid-search" /></svg>
    </label>
    <input type="search" id="search-input" class="search-input">
  </form>

And adding this to my home page where I want the results to show up:

<template id="search-result" hidden>
  <article class="content post">
    <h3 class="post-title"><a class="summary-title-link"></a></h3>
    <summary class="summary"></summary>
  </article>
</template>

<div id="lunr-search-results"></div>

And this CSS:

@keyframes spin {
  100% {
    transform: rotateY(360deg);
  }
}

form.search {
  /* This needs to be large enough on mobile
   * so that focusing on the input doesn't cause zoom.
   */
  font-size: 1.5rem;
  border: 1px solid var(--body-fg-color-deemphasize-nontext);
  min-width: 1em;
  height: 1em;
  line-height: 1;
  border-radius: 1em;
  padding: 0.5em;
}

.search[data-running] .search-icon {
  animation: spin 1.5s linear infinite;
}

form.search input {
  width: 10em;
  color: var(--body-fg-color);
  background-color: var(--body-bg-color);
}

#lunr-search-results-title {
  margin-top: 6em;
}

#lunr-search-results summary {
  color: var(--body-fg-color-deemphasize-text);
  font-size: 85%;
}

@media (min-width: 600px) {
  form.search {
    font-size: 1rem;
  }
}

/* -*- mode: css -*- */

And this JavaScript to run on page load. Much of this was adapted from Wladimir Palant’s code, with my own changes.

/* lunr search implementation
 */
window.addEventListener("DOMContentLoaded", function(event) {
  var index = null;
  var lookup = null;
  var queuedTerm = null;

  var form = document.getElementById("search");
  var input = document.getElementById("search-input");
  var resultsContainer = document.querySelector("#lunr-search-results");

  const entityDecoder = document.createElement("textarea");
  function decodeHTMLEntities(text) {
    text = text || "";
    entityDecoder.innerHTML = text;
    return entityDecoder.value;
  }

  form.addEventListener("submit", function(event) {
    event.preventDefault();

    var term = input.value.trim();
    if (!term)
      return;

    startSearch(term);
  }, false);

  function startSearch(term) {
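    // Update URL with search term, unless the URL already carries it
    // (e.g. on page load), which would add a duplicate history entry.
    const searchParams = new URLSearchParams(window.location.search);
    if (searchParams.get('q') !== term) {
      searchParams.set('q', term);
      const newRelativePathQuery = window.location.pathname + '?' + searchParams.toString();
      history.pushState({searchTerm: term}, '', newRelativePathQuery);
    }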

    // Start icon animation.
    form.setAttribute("data-running", "true");

    if (index) {
      // Index already present, search directly.
      search(term);
    } else if (queuedTerm) {
      // Index is being loaded, replace the term we want to search for.
      queuedTerm = term;
    } else {
      // Start loading index, perform the search when done.
      queuedTerm = term;
      initIndex();
    }
  }

  function searchDone() {
    form.removeAttribute("data-running");
    queuedTerm = null;
  }

  function initIndex() {
    var request = new XMLHttpRequest();
    request.open("GET", "/index.lunr.json");
    request.responseType = "json";
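    request.addEventListener("load", function(event) {
      // "load" also fires for HTTP errors such as 404, and response is null
      // when the body is not valid JSON, so bail out before iterating it.
      if (request.status !== 200 || !Array.isArray(request.response)) {
        searchDone();
        return;
      }
      lookup = {};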
      index = lunr(function() {
        this.ref("uri");

        // If you added more searchable fields to the search index, list them here.
        this.field("title");
        this.field("section");
        this.field("content");
        this.field("summary");
        this.field("tags");
        this.field("technologies");

        for (var doc of request.response)
        {
          doc.content = decodeHTMLEntities(JSON.parse(doc.content));
          doc.summary = decodeHTMLEntities(JSON.parse(doc.summary));
          this.add(doc);
          lookup[doc.uri] = doc;
        }
      });

      // Search index is ready, perform the search now
      if (queuedTerm) {
        search(queuedTerm);
      }
    }, false);
    request.addEventListener("error", searchDone, false);
    request.send(null);
  }

  function search(term) {
    var results = index.search(term);
    // console.log(results);

    clearResults();

    var title = document.createElement("h2");
    title.id = "lunr-search-results-title";
    title.className = "list-title";

    if (results.length == 0)
      title.textContent = `No results found for "${term}"`;
    else if (results.length == 1)
      title.textContent = `Found one result for "${term}"`;
    else
      title.textContent = `Found ${results.length} results for "${term}"`;
    resultsContainer.appendChild(title);
    document.title = title.textContent;

    var template = document.getElementById("search-result");
    for (var result of results) {
      var doc = lookup[result.ref];

      // Fill out search result template, adjust as needed.
      var element = template.content.cloneNode(true);
      element.querySelector(".summary-title-link").href = doc.uri;
      // element.querySelector(".read-more-link").href = doc.uri;
      element.querySelector(".summary-title-link").textContent = doc.title;
      element.querySelector(".summary").textContent = truncate(doc.summary, 70);
      resultsContainer.appendChild(element);
    }
    title.scrollIntoView(true);

    searchDone();
  }

  function clearResults() {
    while (resultsContainer.firstChild) {
      resultsContainer.removeChild(resultsContainer.firstChild);
    }
    document.title = "Search"; // Reset the page title
  }

  // This matches Hugo's own summary logic:
  // https://github.com/gohugoio/hugo/blob/b5f39d23b8/helpers/content.go#L543
  function truncate(text, minWords) {
    var match;
    var result = "";
    var wordCount = 0;
    var regexp = /(\S+)(\s*)/g;
    while (match = regexp.exec(text)) {
      wordCount++;
      if (wordCount <= minWords)
        result += match[0];
      else
      {
        var char1 = match[1][match[1].length - 1];
        var char2 = match[2][0];
        if (/[.?!"]/.test(char1) || char2 == "\n")
        {
          result += match[1];
          break;
        }
        else
          result += match[0];
      }
    }
    return result;
  }

  // Check for search term in URL on page load
  const urlParams = new URLSearchParams(window.location.search);
  const searchTerm = urlParams.get('q');
  if (searchTerm) {
    input.value = searchTerm;
    startSearch(searchTerm);
  }

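  // Handle popstate event (back/forward button)
  window.addEventListener('popstate', function(event) {
    const newUrlParams = new URLSearchParams(window.location.search);
    const newSearchTerm = newUrlParams.get('q');

    if (newSearchTerm) {
      input.value = newSearchTerm;
      if (index) {
        search(newSearchTerm);
      } else {
        // Index not loaded yet (e.g. Back pressed mid-load): queue the term
        // rather than calling search() against a null index.
        var alreadyLoading = queuedTerm !== null;
        queuedTerm = newSearchTerm;
        form.setAttribute("data-running", "true");
        if (!alreadyLoading)
          initIndex();
      }
    } else {
      input.value = '';
      clearResults();
    }
  });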

}, false);

/* -*- mode: javascript -*- */
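
The truncation rule is easier to see with a few inputs. This is the same logic as truncate() in the script above, lightly reformatted and copied out so it runs on its own; the sample strings are made up.

```javascript
// Same logic as truncate() in the script above, lightly reformatted.
function truncate(text, minWords) {
  let match;
  let result = "";
  let wordCount = 0;
  const regexp = /(\S+)(\s*)/g;
  while ((match = regexp.exec(text))) {
    wordCount++;
    if (wordCount <= minWords) {
      // Still under the limit: keep the word and its trailing whitespace.
      result += match[0];
    } else {
      // Past the limit: keep going until a word ends a sentence
      // or is followed by a newline, then stop.
      const lastChar = match[1][match[1].length - 1];
      const nextChar = match[2][0];
      if (/[.?!"]/.test(lastChar) || nextChar == "\n") {
        result += match[1];
        break;
      }
      result += match[0];
    }
  }
  return result;
}

// Runs past the 2-word limit to the end of the sentence.
console.log(truncate("One two three. Four five.", 2)); // "One two three."

// A newline after the limit also ends the summary.
console.log(truncate("one two three\nfour", 2)); // "one two three"

// Text shorter than the limit comes back unchanged.
console.log(truncate("Short text", 70)); // "Short text"
```

So minWords is a floor, not a cap: a summary can run longer than 70 words if the sentence it lands in does.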

Notes about this JavaScript:

  • I had to adapt it to use the fields I added in the index, and to decode the JSON “content” and “summary” fields.
  • The path to the index is hard-coded as /index.lunr.json, so it assumes the site is served from the root of its domain.
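
If the hard-coded path ever becomes a problem, one option is to read the index location from the page instead. This is only a sketch of that idea, not what this site does: resolveIndexUrl and the data-index-url attribute are names invented for the example, and the URLs are placeholders.

```javascript
// Hypothetical helper: take the index URL from a `data-index-url` attribute
// (exposed in the DOM as dataset.indexUrl), falling back to the hard-coded path.
function resolveIndexUrl(dataset, pageUrl) {
  const configured = dataset && dataset.indexUrl;
  // new URL(path, base) resolves relative and absolute paths against the page URL.
  return new URL(configured || "/index.lunr.json", pageUrl).toString();
}

// No attribute set: same behavior as the hard-coded path.
console.log(resolveIndexUrl({}, "https://example.com/?q=hugo"));
// https://example.com/index.lunr.json

// Site served from a subdirectory, with the attribute pointing at the index.
console.log(
  resolveIndexUrl(
    { indexUrl: "/my-site/index.lunr.json" },
    "https://example.com/my-site/"
  )
);
// https://example.com/my-site/index.lunr.json
```

In the browser this would be called as resolveIndexUrl(form.dataset, window.location.href), with the result passed to request.open in place of the literal path, and the site template would be responsible for setting the attribute on the search form.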
