Build your own HTMX

Recently, HTMX has been receiving a lot of attention. However, there have also been concerns that almost 16kb is quite large for a library aiming to reduce the amount of JavaScript. And while this comparison is not quite fair (actually far from it!), Preact comes in at just 3kb, and is seen in roughly the same space of "React alternatives".

In this post, I attempt to show that the core of HTMX is actually quite simple to replicate. My goal is to make what I would consider to be 80% of HTMX work, in 10% of the size. To me, this means being able to fully run the original active search demo, without any modifications. Of course, my 80% might not be yours, hence this post instead of just another library. In general, I assume that you are already familiar with HTMX; if you are not, please just close this page and check it out. It is a good library.

If you just want to check out the result, here is the final active search demo, with the final utmx.js script.

HTMX offers a variety of advanced features and is highly extensible and configurable to justify its size. Including its predecessor Intercooler.js, it has been in development for over a decade. It handles a lot of edge cases, supports ancient browsers, but is also able to do the cool modern stuff, like using view transitions!

But what if you don't need all of this? What if all you care about is the "core spirit" of HTMX?

Why should only <a> and <form> be able to make HTTP requests?
Why should only click and submit events trigger them?
Why should only GET and POST methods be available?
Why should you only be able to replace the entire screen?

Surely we can do better? Implement a sort of micro-htmx that just implements the core functionality?

As it turns out, it's not all that complicated once you get the core logic in place. All of this can in fact be implemented in less than 300 lines of code. While some aspects of the "HTMX feel", such as inheritance, are still missing, it's pretty straightforward to extend the final script to include custom features, like custom oob swaps.

HTMX POWERS ACTIVATED

Somehow, somewhere, HTMX is triggered. We will worry how we wire up those events in a little bit, but I think we should first figure out what to do next. Implement the core business logic, so to speak.

Once the callback fires, we roughly need to do this:

  1. Collect the body and other configuration
  2. Send an AJAX request using the appropriate method and URL
  3. Parse the returned HTML fragment
  4. Swap the target with the new content

Naturally, most of HTMX's complexity arises here. Let's start by making a simple request without a body, replacing the outerHTML of the triggering element. For now, let's aim at getting the basic functionality working, without worrying about hx-trigger, hx-swap, and other attributes yet:

const domParser = new DOMParser()
async function htmx(elt, method) {
  // fallback to the current URL if no hx-[verb] attribute is given
  const url = elt.getAttribute(`hx-${method}`) ?? location.href

  const swapSpec = getSwap(elt)
  const target = getTarget(elt)
  if (!target) {
    return // no target, nothing to do
  }

  const body = method !== 'get'
    ? getBody(elt)
    : undefined

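  // fetch only normalizes the casing of some methods (not "patch"),
  // so send the verb in upper case
  const response = await fetch(url, { method: method.toUpperCase(), body })
  if (!response.ok || response.status === 204) {
    return // don't swap on NO CONTENT or error
  }

  const html = await response.text()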
  const doc = domParser.parseFromString(`<body><template>${html}</template></body>`, 'text/html')
  const content = doc.querySelector('template').content

  swap(swapSpec, target, content)
}

function getBody(elt) {
  // TODO: collect form data, hx-params, hx-vars, hx-include, etc.
  return undefined
}

function getTarget(elt) {
  // TODO: support hx-target
  return elt
}

function getSwap(elt) {
  // TODO: support hx-swap
  return undefined
}

function swap(spec, target, content) {
  // TODO: wire up event listeners, etc.
  target.replaceWith(content)
}

Let's check it out! This little example sets up the event listener on the button manually, but after clicking it, it does the HTMX swap for real already!

Of course, we left a bunch of TODOs for us to support more things later, once we need them. But we kind of already got something working, we might just want to wire everything up, such that event listeners are added and removed automatically.

Call me summer 'cause I'm plumbing

HTMX attributes can be divided into two major groups:

  1. Behavioural attributes (like hx-put, hx-on, or hx-sse) that, when present on an element, add HTMX powers to this element. They set up the internal state and event listeners.
  2. Configuration attributes (like hx-swap or hx-params), that don't add behaviour themselves, but modify what happens when an event is triggered.

It's worth noting that hx-trigger is unique in that it acts like a "sister attribute" to hx-[verb], and is also not inherited. The only reason we use hx-[verb] as the attribute to look for is because hx-trigger is optional.

So what we need to do is watch all elements and their attributes. Whenever an element with a behavioural attribute is added or removed, we can add or remove HTMX powers from that element. For simplicity's sake, rather than using a MutationObserver, we'll assume that the DOM remains static after we swap - or at least that we can call init manually after it changes.

This approach also mirrors HTMX's behavior. If that's a concern, you might want to check out regular-elements or its more famous big sister wicked-elements. As we will see, HTMX powers fit quite nicely into a CustomElements-based API!

Let's first figure out how we would connect and disconnect our event listeners from a single element, and after that make sure it happens for all elements it needs to automatically.

// a property on the DOM nodes, holding our state
const stateProp = Symbol()

// trigger is our event handler function
const verbs = ['get', 'post', 'put', 'patch', 'delete']
function trigger(evt, spec) {
  const elt = evt.currentTarget
  // I'm not sure what a useful way to support multiple hx-[verb]
  // attributes might look like...
  const verb = verbs.find(verb => elt.hasAttribute(`hx-${verb}`))
  if (!verb) {
    return //hx-[verb] attribute got removed, nothing can be done.
  }

  // TODO: hx-sync, hx-confirm, hx-push-url, ...
  evt.preventDefault()
  htmx(elt, verb)
}

// connect initializes all event handlers on an element
function connect(elt) {
  if (elt[stateProp]) {
    return // already initialized
  }

  // keep a list of cleanup/finalize/dispose callbacks to run on disconnect
  const state = elt[stateProp] = { cleanup: [] }

  const triggers = parseSpecAttribute(elt, 'hx-trigger')
  // no triggers found, add the "natural trigger" instead
  if (!triggers.length) {
    triggers.push(getNaturalTrigger(elt))
  }

  for (const spec of triggers) {
    const eventName = spec.value
    const handler = evt => trigger(evt, spec)
    elt.addEventListener(eventName, handler)
    state.cleanup.push(() => elt.removeEventListener(eventName, handler))
  }
}

// parse a spec attribute like hx-trigger or hx-swap
function parseSpecAttribute(elt, attributeName) {
  // TODO: inheritance
  const str = elt.getAttribute(attributeName)
  if (!str) {
    return []
  }

  // TODO: multiple specs, modifiers
  return [{ value: str.trim() }]
}

// get the spec for the natural event of the element
function getNaturalTrigger(elt) {
  if (elt.matches('input:not([type=submit],[type=button]),textarea,select')) {
    return { value: 'change' }
  } else if (elt.matches('form')) {
    return { value: 'submit' }
  } else {
    return { value: 'click' }
  }
}

First, we define a trigger function that wraps our previous htmx function and acts as the event listener. It takes an additional argument spec, which we will eventually use to pass down the configuration of the hx-trigger that this event listener belongs to. For now, we just need to figure out which method to trigger on the element.

Next, connect sets up the event listeners on an element. For this, it looks at the hx-trigger attribute, or adds the natural trigger if no such attribute exists. It keeps a list of those event listeners around, such that disconnect can then remove them properly. Again, we leave some TODOs for us to fill out later. Right now, we only support a single event name inside of hx-trigger, without modifiers or even having multiple triggers on a single element.

function disconnect(elt) {
  const state = elt[stateProp]
  if (!state) {
    return // no instance, disconnect called twice
  }

  elt[stateProp] = null

  for (const cleanup of state.cleanup) {
    cleanup()
  }
}

disconnect simply calls all the cleanup functions added by connect and removes the state object, indicating that the element is no longer enhanced. Pretty straightforward!
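The cleanup-list pattern does not actually need a document, so it can be tried in isolation. The following is a minimal sketch on a plain EventTarget; the simplified connect(target, eventNames, handler) signature is an assumption made for this example and is not the one used in the rest of the post.

```javascript
// Same idea as connect/disconnect above, but on a plain EventTarget.
const stateProp = Symbol('state')

// simplified, hypothetical signature: event names and handler are passed in
function connect(target, eventNames, handler) {
  if (target[stateProp]) {
    return // already initialized
  }

  const state = target[stateProp] = { cleanup: [] }
  for (const eventName of eventNames) {
    target.addEventListener(eventName, handler)
    state.cleanup.push(() => target.removeEventListener(eventName, handler))
  }
}

function disconnect(target) {
  const state = target[stateProp]
  if (!state) {
    return // not connected
  }

  target[stateProp] = null
  for (const cleanup of state.cleanup) {
    cleanup()
  }
}

const target = new EventTarget()
let calls = 0
connect(target, ['click', 'search'], () => { calls++ })

target.dispatchEvent(new Event('click'))  // handled
target.dispatchEvent(new Event('search')) // handled
disconnect(target)
target.dispatchEvent(new Event('click'))  // ignored, the listener is gone

console.log(calls) // 2
```

Because every listener registers its own removal callback, disconnect never has to know which events were wired up in the first place.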

Isn't it weird how these functions mirror the custom elements lifecycle callbacks?
I wonder if that means anything...

Anyways, now that we have some structure to work with, we can start automatically connecting new elements when we discover them. How do we find new elements? Turns out we already know exactly when we have some new DOM, which is once we call swap! So we can also modify that function to initialize the new DOM we get, and ensure the old node is cleaned up, not leaving any dangling event listeners.

// we need to connect an element if it has any hx-[verb] attribute
const verbSelector = verbs.map(verb => `[hx-${verb}]`).join(',')

// query all children, and also this element if it matches
function* queryAllAndSelf(selector, ...elts) {
  for (const elt of elts) {
    if (elt.nodeType !== Node.ELEMENT_NODE) {
      continue
    }

    if (elt.matches(selector)) {
      yield elt
    }

    yield* elt.querySelectorAll(selector)
  }
}

// call connect on all htmx elements in a subtree
function init(...elts) {
  for (const elt of queryAllAndSelf(verbSelector, ...elts)) {
    connect(elt)
  }
}

// cleanup all htmx event listeners in a subtree
function deinit(...elts) {
  for (const elt of queryAllAndSelf(verbSelector, ...elts)) {
    disconnect(elt)
  }
}

// replace our swap function to also call init/deinit
function swap(spec, target, content) {
  deinit(target)
  // init first, because replaceWith moves our elements
  init(...content.children)
  target.replaceWith(content)
}

// on startup, init the entire document.
if (document.readyState !== 'loading') {
  init(document.body)
} else {
  document.addEventListener('DOMContentLoaded', () => init(document.body))
}

How are we doing?

No really, how are you? Close your eyes, take a deep breath for me, and slowly count to 4, will you? Now's a great time to grab a new cup of coffee, maybe walk around a little bit, and reflect.

We've written (ok, let's be honest: copy-pasted) a lot of code just now, and if it works, it would be so great! At the moment, only the most basic swaps are implemented, but we've already left some nice TODOs for us, and stubbed out many functions that we just need to implement. The entire script even generalizes - we attach CustomElement-like behaviour based on arbitrary selectors, but what if we want to have different behaviours based on different selectors? For instance, what about declarative drag-and-drop lists, or hx-on? Or perhaps even mount points for React components if we really need advanced client-side interactivity in some - let's call them "isles" - maybe? Isn't this just web components, again??

Before we get too crazy, let's take a moment to test things out. I've prepared an updated demo here! Spend some time to click around, look at the network tab, check out the full source code in the debugger.

...

I hope you had some fun messing with it! Unfortunately, it doesn't quite work yet, but we are so close! Fortunately, the only reason it doesn't work is because it attempts to use hx-target, which we never actually implemented!

But other than that, we are actually finished! I hope this showcases the basic structure. It took just about 100 lines of code, most of which was related to correctly wiring everything up. You can stop here, and extend it on your own. Or maybe you'd like to stick around, and we'll get a little bit closer to true HTMX. Implement hx-target, hx-swap, and so on. The goal for the rest of this post is to implement enough of HTMX to get active search to work, which I think will be a nice stopping point.

Implementing hx-target

Glad you stuck around! 😊

Let's get our demo to finally work properly. Looking at the hx-target documentation, there appear to be just some prefixes that we need to handle, as well as the special string this, that refers to the element the attribute is defined on; but since we still don't support inheritance, that is just the same as the default anyways. In practice, HTMX supports some additional special strings (like window or document), but those are easy to add as well.

The trickiest options are next <selector> and previous <selector>. There are no built-in methods to query for those, so I'm going to borrow the same trick that HTMX uses in that case: query all matching elements in the document, and then use compareDocumentPosition to filter. It's messy and probably terribly slow (especially if you do something like next div), but it works.

// returns the rest of str after prefix, or false if it doesn't match
function eatPrefix(str, prefix) {
  if (str.startsWith(prefix)) {
    return str.slice(prefix.length) // substr is deprecated
  } else {
    return false
  }
}

function hxQuerySelector(selector, context) {
  let suffix
  if (!selector || selector === 'this') {
    // TODO: inheritance
    return context
  // special keywords
  } else if (selector === 'window') {
    return window
  } else if (selector === 'document') {
    return document
  } else if (selector === 'next') {
    return context.nextElementSibling
  } else if (selector === 'previous') {
    return context.previousElementSibling
  // prefix-based selectors
  } else if ((suffix = eatPrefix(selector, 'closest '))) {
    return context.closest(suffix)
  } else if ((suffix = eatPrefix(selector, 'find '))) {
    return context.querySelector(suffix)
  } else if ((suffix = eatPrefix(selector, 'next '))) {
    return Array.from(document.querySelectorAll(suffix))
      .find(node => context.compareDocumentPosition(node)
        === Node.DOCUMENT_POSITION_FOLLOWING)
  } else if ((suffix = eatPrefix(selector, 'previous '))) {
    return Array.from(document.querySelectorAll(suffix))
      .findLast(node => context.compareDocumentPosition(node)
        === Node.DOCUMENT_POSITION_PRECEDING)
  } else {
    return document.querySelector(selector)
  }
}

function getTarget(elt) {
  return hxQuerySelector(elt.getAttribute('hx-target'), elt)
}

After replacing this function, our original demo now fully works!
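One subtlety in the next and previous branches of hxQuerySelector: compareDocumentPosition returns a bitmask, not a single constant. The sketch below uses the numeric flag values from the DOM standard directly, so it runs without a document. It shows that the strict === comparison skips descendants (and, for previous, ancestors) of the context element. The looseNext variant is hypothetical and only there for contrast; I am assuming the strict behaviour is what we want, since it is the trick borrowed from HTMX.

```javascript
// Bit values of Node.DOCUMENT_POSITION_* as defined by the DOM standard
const PRECEDING = 2
const FOLLOWING = 4
const CONTAINS = 8
const CONTAINED_BY = 16

// What context.compareDocumentPosition(node) reports in three situations
const laterSibling = FOLLOWING              // node comes after context
const descendant = FOLLOWING | CONTAINED_BY // node is inside context
const ancestor = PRECEDING | CONTAINS       // node wraps context

// strict comparison, as used in hxQuerySelector
const strictNext = mask => mask === FOLLOWING
// hypothetical looser variant that would also accept descendants
const looseNext = mask => (mask & FOLLOWING) !== 0

console.log(strictNext(laterSibling), strictNext(descendant)) // true false
console.log(looseNext(laterSibling), looseNext(descendant))   // true true
console.log(looseNext(ancestor))                              // false
```

So with the strict check, next div on a container never resolves to a div nested inside that container.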

Implementing hx-trigger

As I've said, the final goal is to fully support the active search demo. It has an hx-trigger attribute looking like this:

<input
  hx-trigger="input changed delay:0.5s, search"

This is quite a lot again! We have multiple triggers, need to support at least the changed and delay modifiers, and be able to parse durations for that 500ms value. There clearly is some state here - we need to store the last value and internal state of the delay - but how should we do that?

Let's start with something we roughly know how to do, and figure all of the rest out later. For now, let's just parse the attribute into an object that will look like this:

[
  { value: 'input', changed: true, delay: 500 },
  { value: 'search' }
]

Once we have structured data, we can think about how to handle all the different cases.

Often, it's nicer to do it the other way around - first try to "guess" a good model, do the implementation to validate your guess, and then start parsing/wiring things up.
We actually did that in the beginning, when we started with htmx() and worked our way "upwards" until everything worked.
Here, we do it this way because our goal is to be able to handle this specific input.

Parsing Trigger Specs

The simplest thing we can probably parse is intervals - it's just a number and a suffix!

function parseInterval(str) {
  if (str.endsWith('ms')) {
    return parseFloat(str)
  } else if (str.endsWith('s')) {
    return parseFloat(str) * 1000
  } else if (str.endsWith('m')) {
    return parseFloat(str) * 60 * 1000
  } else {
    return parseFloat(str)
  }
}

We only need to be careful to handle ms first, to avoid accidentally going into the wrong branch.
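To see why the order matters, here is parseInterval again next to a deliberately broken variant that checks the s suffix first. parseIntervalWrongOrder is a hypothetical name used only for this comparison.

```javascript
function parseInterval(str) {
  if (str.endsWith('ms')) {
    return parseFloat(str)
  } else if (str.endsWith('s')) {
    return parseFloat(str) * 1000
  } else if (str.endsWith('m')) {
    return parseFloat(str) * 60 * 1000
  } else {
    return parseFloat(str)
  }
}

// hypothetical broken variant that tests "s" before "ms"
function parseIntervalWrongOrder(str) {
  if (str.endsWith('s')) {
    return parseFloat(str) * 1000
  } else if (str.endsWith('ms')) {
    return parseFloat(str) // never reached, "ms" also ends with "s"
  }
  return parseFloat(str)
}

console.log(parseInterval('0.5s'))  // 500
console.log(parseInterval('100ms')) // 100
console.log(parseInterval('2m'))    // 120000
console.log(parseInterval('250'))   // 250

console.log(parseIntervalWrongOrder('100ms')) // 100000
```

parseFloat stops at the first character that is not part of a number, which is why we can pass the whole string including its suffix.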

Next, let's parse the full spec string. The attribute itself is a comma-separated list of specs; every spec starts with the main value (so in this case the event name), followed by a bunch of modifiers. A modifier can either just be a toggle, or a key:value pair, where the value might also be in some format that we want to parse. The modifiers are separated by spaces.

This is not entirely true - the from:selector modifier for example allows spaces and has to therefore always come last! We avoid having to write bespoke parsers for every attribute, though.

function parseSpecAttribute(elt, attributeName, modifierParsers = {}) {
  // TODO: inheritance
  const str = elt.getAttribute(attributeName)
  if (!str) {
    return []
  }

  const result = []
  // split on comma, and also remove surrounding spaces
  for (const specStr of str.trim().split(/\s*,\s*/)) {
    if (!specStr) {
      continue // skip empty parts
    }

    // first "word" is the main value, rest is modifiers
    const [value, ...modifiers] = specStr.split(/\s+/)

    const spec = { value }
    for (const modifier of modifiers) {
      const colon = modifier.indexOf(':')
      if (colon >= 0) {
        // modifier with a value
        const modifierKey = modifier.slice(0, colon)
        const modifierValue = modifier.slice(colon + 1)
        const modifierParser = modifierParsers[modifierKey]
        if (modifierParser) {
          // modifier with values in a special format, like `delay`
          spec[modifierKey] = modifierParser(modifierValue)
        } else {
          // simple modifiers with values
          spec[modifierKey] = modifierValue
        }
      } else {
        spec[modifier] = true // toggle modifier, like changed
      }
    }

    result.push(spec)
  }

  return result
}

Well, turns out I sort of lied? To me, this was the next obvious step, but it turned out to probably be the most complicated function we have to implement! At least it's done now, and we can re-use it for other attributes as well. I've also introduced a new modifierParsers argument, which we can use to inject our parseInterval function. We can update the call site inside of connect to include it like this:

function connect(elt) {
  // [...]
  const triggers = parseSpecAttribute(elt, 'hx-trigger', {
    delay: parseInterval
  })
  // [...]
}
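To check the parser against our target input without a browser, here is a sketch that works on a plain string instead of an element. parseSpecString is a hypothetical, condensed variant of parseSpecAttribute written just for this example. It also demonstrates the from: caveat mentioned above.

```javascript
// parseInterval as defined earlier, condensed
function parseInterval(str) {
  if (str.endsWith('ms')) return parseFloat(str)
  if (str.endsWith('s')) return parseFloat(str) * 1000
  if (str.endsWith('m')) return parseFloat(str) * 60 * 1000
  return parseFloat(str)
}

// hypothetical string-only variant of parseSpecAttribute
function parseSpecString(str, modifierParsers = {}) {
  const result = []
  for (const specStr of str.trim().split(/\s*,\s*/)) {
    if (!specStr) {
      continue // skip empty parts
    }

    // first "word" is the main value, rest is modifiers
    const [value, ...modifiers] = specStr.split(/\s+/)
    const spec = { value }
    for (const modifier of modifiers) {
      const colon = modifier.indexOf(':')
      if (colon >= 0) {
        const key = modifier.slice(0, colon)
        const raw = modifier.slice(colon + 1)
        const parse = modifierParsers[key]
        spec[key] = parse ? parse(raw) : raw
      } else {
        spec[modifier] = true // toggle modifier
      }
    }
    result.push(spec)
  }
  return result
}

const parsers = { delay: parseInterval }

const activeSearch = parseSpecString('input changed delay:0.5s, search', parsers)
console.log(JSON.stringify(activeSearch))
// [{"value":"input","changed":true,"delay":500},{"value":"search"}]

// the caveat: a modifier value containing spaces gets split apart
const withFrom = parseSpecString('click from:closest form', parsers)
console.log(JSON.stringify(withFrom))
// [{"value":"click","from":"closest","form":true}]
```

The second result shows the price of the generic approach: from:closest form ends up as a from value of "closest" plus a stray form toggle, so supporting it properly would need special handling.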

We've got the "easy" part done! Next, we actually need to figure out what to do with those triggers.

Implementing changed

Again, let's start with the easier one - store the last value of the element in the state, filter out events if the value didn't change:

function trigger(evt, spec) {
  const elt = evt.currentTarget
  const state = elt[stateProp]
  if (!state) {
    return // not connected - should not happen?
  }

  const verb = verbs.find(verb => elt.hasAttribute(`hx-${verb}`))
  if (!verb) {
    return //hx-[verb] attribute got removed, can't do nothing
  }

  // TODO: hx-sync, hx-confirm, hx-push-url, ...
  evt.preventDefault()

  if (spec.changed && state.lastValue === elt.value) {
    return
  }

  state.lastValue = elt.value

  htmx(elt, verb)
}

I've copied the entire trigger function again, since we need to be careful when we call evt.preventDefault(). I decided that it makes most sense to stop the default behaviour even if we then decide to not handle the event, because the element's value hasn't changed. Otherwise, we might trigger unwanted form submissions by accident.
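Stripped of the event plumbing, the changed filter is just a comparison against the last value we have seen. A minimal sketch, where shouldHandle is a hypothetical helper that mirrors the two relevant statements of trigger:

```javascript
// hypothetical helper: decides whether an event should lead to a request
function shouldHandle(state, spec, value) {
  if (spec.changed && state.lastValue === value) {
    return false // same value as last time, skip
  }

  state.lastValue = value
  return true
}

const state = {}
const input = { value: 'input', changed: true }
const search = { value: 'search' }

console.log(shouldHandle(state, input, 'ht'))  // true, first value seen
console.log(shouldHandle(state, input, 'ht'))  // false, nothing changed
console.log(shouldHandle(state, search, 'ht')) // true, no changed modifier
console.log(shouldHandle(state, input, 'htm')) // true, value differs
```

Note that lastValue lives on the shared state, so it is also updated by triggers that do not use changed themselves.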

Implementing delay

Did you know that if you had multiple triggers on the same element, their delay timeouts are actually shared? I didn't know that until I read the source for this post! Which means that if we want to match HTMX, we cannot just use some library like lodash.debounce, but have to implement it ourselves. But again, it turns out to not be that difficult or involved after all, if you just do it!

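function trigger(evt, spec) {
  // grab the element now: evt.currentTarget is null once the timeout fires
  const elt = evt.currentTarget
  const state = elt[stateProp]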
  if (!state) {
    return // not connected, mutation observer fired before events?
  }

  const verb = verbs.find(verb => elt.hasAttribute(`hx-${verb}`))
  if (!verb) {
    return //hx-[verb] attribute got removed, can't do nothing
  }

  // TODO: hx-sync, hx-confirm, hx-push-url, ...
  evt.preventDefault()

  if (spec.delay) {
    // start a timeout if delay is active
    if (state.timeout) {
      clearTimeout(state.timeout)
    }

    state.timeout = setTimeout(handle, spec.delay)
  } else {
    // no delay, dispatch directly
    handle()
  }

  function handle() {
    // [...] the previous trigger code [...]
  }
}

Again, we need to first figure out if we should be active, and only then continue handling the event. Since we might swap out the element while a delay is still active, we also need to update our disconnect function to clear a pending timeout:

function disconnect(elt) {
  // [...]
  if (state.timeout) {
    clearTimeout(state.timeout)
  }
}
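The effect of sharing one timeout per element can be sketched without real timers. Everything here is an assumption made for the example: createFakeTimers is a hypothetical stand-in for setTimeout/clearTimeout so that the code runs synchronously, and fire is a reduced version of the delay branch in trigger.

```javascript
// hypothetical fake timers, so the example runs synchronously
function createFakeTimers() {
  let nextId = 1
  const pending = new Map()
  return {
    setTimeout(fn) {
      const id = nextId++
      pending.set(id, fn)
      return id
    },
    clearTimeout(id) {
      pending.delete(id)
    },
    // run every callback that is still scheduled
    flush() {
      const fns = [...pending.values()]
      pending.clear()
      fns.forEach(fn => fn())
    },
  }
}

// reduced version of the delay handling in trigger()
function fire(state, spec, handle, timers) {
  if (spec.delay) {
    if (state.timeout) {
      timers.clearTimeout(state.timeout)
    }
    state.timeout = timers.setTimeout(() => handle(spec), spec.delay)
  } else {
    handle(spec)
  }
}

const timers = createFakeTimers()
const state = {} // one state object per element, shared by all triggers
const handled = []
const handle = spec => handled.push(spec.value)

fire(state, { value: 'input', delay: 500 }, handle, timers)
fire(state, { value: 'keyup', delay: 200 }, handle, timers) // cancels "input"
timers.flush()
console.log(handled) // [ 'keyup' ]

// what disconnect does: a pending timeout never fires
fire(state, { value: 'input', delay: 500 }, handle, timers)
timers.clearTimeout(state.timeout)
timers.flush()
console.log(handled) // still [ 'keyup' ]
```

With a separate debounce per trigger, both input and keyup would have fired in the first half of the example.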

Getting the request body

Looking at the parameters section in the HTMX docs, there are again many additional features, attributes and knobs to support different use-cases.

This largely means that HTMX has to re-implement the form logic, including doing validation manually.

But let's say we just want to include surrounding forms? That is easy, because we can construct the body using FormData. Let's also support the one special case of always sending the triggering element, since that is what we do in the active search demo, after all. What if we also don't need to support hx-include et al? Putting everything together, this means that we only have to handle a couple of distinct cases, and let the browser do most of the hard work. Hurray for web standards!

function getBody(elt) {
  // TODO: hx-include, hx-params, hx-vals, hx-validate, ...
  const form = elt.closest('form')
  if (form) {
    // use the browser to include a surrounding form
    return new URLSearchParams(new FormData(form))
  } else if (elt.name) {
    // always include the triggering element, if it has a name
    return new URLSearchParams({ [elt.name]: elt.value })
  } else {
    // no form, no name on the elt
    return undefined
  }
}
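What the server ends up receiving can be checked without any DOM, because URLSearchParams does the encoding. A small sketch with made-up field names and values. When such an object is passed as the body of fetch, the request is sent with a Content-Type of application/x-www-form-urlencoded, just like a regular form post.

```javascript
// the "named element without a form" branch of getBody
const single = new URLSearchParams({ search: 'hello world' })
console.log(single.toString()) // search=hello+world

// reserved characters in values are percent-encoded for us
const tricky = new URLSearchParams({ q: 'a&b=c' })
console.log(tricky.toString()) // q=a%26b%3Dc

// the "surrounding form" branch passes a FormData, which is iterated
// as name/value pairs; an array of pairs behaves the same way
const pairs = new URLSearchParams([['search', 'htmx'], ['page', '2']])
console.log(pairs.toString()) // search=htmx&page=2
```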

Implementing hx-indicator

hx-indicator adds the htmx-request class to the specified element whenever a request is in flight. While the docs only talk about supporting closest selector as a special case, why not just re-use our hxQuerySelector function and support all the other HTMX-y selectors as well? Let's go!

async function htmx(elt, verb) {
  // just use the elt as a fallback, so we always have an indicator
  const indicator = hxQuerySelector(elt.getAttribute('hx-indicator'), elt) ?? elt
  indicator.classList.add('htmx-request')
  try {
    // [...] previous htmx() code [...]
  } finally {
    indicator.classList.remove('htmx-request')
  }
}

Finally implementing hx-swap, for good measure

Can you feel the suspense?

While not strictly necessary for the active search demo, it would be such a shame to leave low-hanging fruit like this just... hanging there? We can also finally change our default from outerHTML to innerHTML, to match the behaviour of HTMX. Just one more function, bro!

function getSwap(elt) {
  const specs = parseSpecAttribute(elt, 'hx-swap')
  return specs[0]
}

function swap(spec, elt, content) {
  const initContent = () => init(...content.children)

  const swapStyle = spec?.value ?? 'innerHTML'
  if (swapStyle === 'innerHTML') {
    deinit(...elt.children)
    initContent()
    elt.replaceChildren(content)
  } else if (swapStyle === 'outerHTML') {
    deinit(elt)
    initContent()
    elt.replaceWith(content)
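  } else if (swapStyle === 'delete') {
    deinit(elt)
    elt.remove() // actually take the target out of the DOM
  } else if (swapStyle !== 'none') {
    // insertAdjacentElement only accepts elements, not fragments,
    // so map the positional swap styles to the matching DOM methods
    const insert = {
      beforebegin: 'before',
      afterbegin: 'prepend',
      beforeend: 'append',
      afterend: 'after',
    }[swapStyle]
    if (insert) {
      initContent()
      elt[insert](content)
    }
  }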
}

The grand finale

We have now successfully copy-pasted less than 250 lines of code. We support sending basic HTMX requests, hx-target, hx-swap, hx-indicator, and a bunch of options in hx-trigger. It is enough to run the original active search demo, without modifications. But, instead of having to include a 16kb library, our micro-htmx comes in at about one tenth of the size! Running terser and gzip on it gives me a final size of 1.5kb (3.2kb decompressed). If you ignore the lack of inheritance, our final version can actually run a substantial subset of the examples.

Check out the final working active search demo here!

If you haven't noticed, the code I showed here can be downloaded by grabbing the utmx.js file from the demo!

Of course, we've also left out a lot of features, so while it can do a lot for its size, it probably also only implements about 10% of HTMX (probably a bit more). Most notably, we never cared about inheritance, we cannot hx-boost, there are no events whatsoever, and we don't support preserving elements in any way.

Part of the reason why I wanted to present this as a blog post instead of actually publishing a library is that I highly suspect that as you get closer to supporting everything in HTMX, eventually you will have just re-implemented everything, most likely reaching those 16kb of the original library anyways (modulo some newer APIs). Instead, this is meant to be a rough baseline which you can build on and extend yourself, adding exactly the features you want. If you instead opt for the real deal (which I highly recommend you do, using a popular supported library instead of DIY/NIHing), I hope that you will have a better understanding of how these things actually work under the hood.


The code presented in this article and on all demo pages is licensed under Zero-Clause BSD. Do with it whatever you want!