Using Shiki to highlight code in elm-pages 3

Recently, I upgraded this website to version 3 of elm-pages and made a new stylesheet (you can actually read things now?!). Version 3 of elm-pages brings significant changes compared to v2, most notably the extension of the old DataSource API to BackendTask. This, combined with the new serverless page type, transforms elm-pages into a complete full-stack framework! In theory, you can now hook up your entire database using custom backend tasks and build an entire B2B SaaS product using elm-pages! I think that is pretty cool. If you're curious about the other updates, please also check out the official announcement!

For me personally, I finally wanted to start writing more again (and actually putting it on the internet). Since I will mostly be talking about programming, I wanted to make sure that code looks as nice as possible.

Why Choose Shiki?

Similar to highlight.js and prism.js, shiki is a JavaScript syntax highlighter that supports a wide array of themes and programming languages. What sets shiki apart is that it uses the same parser, themes and language specifications as VS Code. It is so good at mimicking VS Code that even the VS Code documentation uses shiki for its syntax highlighting! By integrating shiki, I can directly import my editor settings, and shiki will ensure that code on my blog looks exactly as it does in my editor.

Additionally, shiki offers an API to convert code into a list of tokens, which can be easily serialized to JSON and sent to Elm. We can then render those tokens on the Elm side, without resorting to using innerHTML tricks.
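To make the token idea a bit more concrete, here is a minimal TypeScript sketch of what such a token list could look like once it is turned into JSON. The `Token` and `Highlighted` shapes, as well as the `toJson` and `plainText` helpers, are assumptions made up for illustration; shiki's real token type carries more fields.

```typescript
// Hypothetical shapes for illustration only; not shiki's exact types.
interface Token {
  content: string; // the piece of source text this token covers
  color?: string;  // foreground color chosen by the theme
}

interface Highlighted {
  lines: Token[][]; // one array of tokens per line of source code
  fg: string;       // default foreground color of the theme
  bg: string;       // background color of the theme
}

// Serialize highlighted code so it can cross the JS -> Elm boundary as plain JSON.
function toJson(highlighted: Highlighted): string {
  return JSON.stringify(highlighted);
}

// Reassemble the original source text from the tokens (handy as a sanity check).
function plainText(highlighted: Highlighted): string {
  return highlighted.lines
    .map((line) => line.map((token) => token.content).join(""))
    .join("\n");
}

// A hand-written example for the one-line program `x = 1`.
const example: Highlighted = {
  fg: "#d4d4d4",
  bg: "#1e1e1e",
  lines: [
    [
      { content: "x", color: "#9cdcfe" },
      { content: " = ", color: "#d4d4d4" },
      { content: "1", color: "#b5cea8" },
    ],
  ],
};
```

Because the structure is nothing but strings and arrays, a JSON decoder on the Elm side can rebuild it and render every token as its own styled element.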

In some totally unrelated news, there is also a file called plugins/Shiki.elm in the elm-pages GitHub repository. I totally thought of using shiki on my own before finding this!

Implementation Steps

Upon discovering this, my initial instinct was to look at some usage examples within the elm-pages source tree and adapt the relevant portions for myself. Unfortunately for me, the documentation seems to have switched to the elm-syntax-highlight package now, so there were no usage examples left.

Copying the Shiki module is still a great first step, but a crucial piece is missing: how do we actually call shiki and get the data into Elm? The module seems to expect a Highlighted JSON object, so let's write one of those custom backend tasks to give it one:

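// custom-backend-task.ts
import { codeToTokens } from 'shiki'

// Assumption: any theme name bundled with shiki works here; pick your own.
const theme = 'github-dark'

export async function highlightCode({ code, lang }) {
    // `lang` arrives as null when the Elm side passes Nothing; fall back to plain text.
    return await codeToTokens(code, { lang: lang ?? 'text', theme })
}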

On the Elm side, calling this backend task is straightforward:

highlight :
    { code : String
    , lang : Maybe String
    }
    -> BackendTask FatalError Highlighted
highlight { code, lang } =
    let
        jsonParams =
            Json.Encode.object
                [ ( "code", Json.Encode.string code )
                , ( "lang", Json.Encode.Extra.maybe Json.Encode.string lang )
                ]
    in
    BackendTask.Custom.run "highlightCode" jsonParams Shiki.decoder
        |> BackendTask.allowFatal

Integration with elm-markdown

Integrating shiki with elm-markdown requires a bit more effort than expected. While elm-markdown supports defining custom tags, these tags must have string attributes and content. Consequently, we would be required to encode and decode our Highlighted structure again, just to get it into a custom tag! Additionally, custom tags can also include nested blocks, which we would also have to handle somehow while rendering.

Instead, I decided to extend the Block custom type by wrapping it in my own type:

type Block
    = Markdown Markdown.Block.Block
    | Code Shiki.Highlighted

loadMarkdownBody : String -> BackendTask FatalError (List Block)
loadMarkdownBody filePath =
    BackendTask.File.bodyWithoutFrontmatter filePath
        |> BackendTask.allowFatal
        |> BackendTask.andThen
            (\rawBody ->
                Markdown.Parser.parse rawBody
                    |> Result.mapError
                        (\_ -> FatalError.fromString "Markdown parsing error!")
                    |> BackendTask.fromResult
            )
        |> BackendTask.andThen
            (\blocks ->
                blocks
                    |> List.map
                        (\block ->
                            case block of
                                Markdown.Block.CodeBlock { body, language } ->
                                    Shiki.highlight { code = body, lang = language }
                                        |> BackendTask.map Code

                                _ ->
                                    BackendTask.succeed (Markdown block)
                        )
                    |> BackendTask.combine
            )

In the second andThen step, each markdown block is transformed into another BackendTask for further processing. For most blocks, we simply succeed, but for code blocks, we utilize the highlight function we defined earlier.
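If you are more at home in JavaScript land, the shape of this step is roughly what you would write with Promise.all. The following TypeScript sketch is only an analogy, not how elm-pages works internally: the block types and the `highlight` stub are hypothetical stand-ins for the Elm types and for the real highlighter.

```typescript
// Hypothetical block shapes standing in for the Elm `Block` type.
type MarkdownBlock =
  | { kind: "codeBlock"; body: string; language: string | null }
  | { kind: "paragraph"; text: string };

type Block =
  | { kind: "markdown"; block: MarkdownBlock }
  | { kind: "code"; highlighted: string };

// Stand-in for the real highlighter; it only tags the code with its language.
async function highlight(code: string, lang: string | null): Promise<string> {
  return `[${lang ?? "text"}] ${code}`;
}

// Map every block to a promise, then wait for all of them at once.
// The result keeps the order of the input blocks.
function transformBlocks(blocks: MarkdownBlock[]): Promise<Block[]> {
  return Promise.all(
    blocks.map(async (block): Promise<Block> => {
      if (block.kind === "codeBlock") {
        const highlighted = await highlight(block.body, block.language);
        return { kind: "code", highlighted };
      }
      // Everything else passes through untouched, like BackendTask.succeed.
      return { kind: "markdown", block };
    })
  );
}
```

Here the list of promises plays the role of the list of BackendTasks, and Promise.all plays the role of BackendTask.combine: a list of tasks goes in, and a single task producing a list comes out.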

To render our custom block structure, we assume that rendering a list of markdown blocks is equivalent to rendering each block individually, and then combining the results (mathematically speaking, we assume Markdown.Renderer.render is additive). Looking at the relevant elm-markdown source, we can see that render is implemented using filterMap, implying that each block is rendered separately.
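The additivity assumption is easy to illustrate on a toy renderer. The TypeScript sketch below is not elm-markdown's code; `ToyBlock`, `renderOne` and `render` are made-up names for a renderer that handles each block on its own and skips blocks it cannot render, which is what a filterMap-style implementation does.

```typescript
// Toy block type for illustration; "comment" blocks produce no output.
type ToyBlock = { tag: "p" | "h1" | "comment"; text: string };

// Render a single block to zero or one HTML strings.
function renderOne(block: ToyBlock): string[] {
  return block.tag === "comment"
    ? []
    : [`<${block.tag}>${block.text}</${block.tag}>`];
}

// Render a list of blocks by rendering each one separately and concatenating.
function render(blocks: ToyBlock[]): string[] {
  return ([] as string[]).concat(...blocks.map(renderOne));
}

const blocks: ToyBlock[] = [
  { tag: "h1", text: "Title" },
  { tag: "comment", text: "ignored" },
  { tag: "p", text: "Body" },
];

// Rendering the whole list at once...
const whole = render(blocks);
// ...gives the same result as rendering one-element lists and gluing them together.
const pieces = ([] as string[]).concat(...blocks.map((block) => render([block])));
```

For a renderer of this shape, `whole` and `pieces` always agree, and that is the property that lets us slot our own code blocks in between the regular markdown blocks.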

Knowing that, implementing our own render function is straightforward:

render : List Block -> List (Html msg)
render =
    List.concatMap
        (\block ->
            case block of
                Markdown markdown ->
                    markdown
                        |> List.singleton
                        |> Markdown.Renderer.render
                            Markdown.Renderer.defaultHtmlRenderer
                        |> Result.withDefault []

                Code highlighted ->
                    [ Shiki.view [] highlighted ]
        )

And that's it! If you want to look at the results, just read this page again, but this time, be more careful! You will notice right away.