I need to run a wiki for some personal stuff. I created a Docker container that runs Wiki.js supervised by Litestream and running on Fly.io. Here is how I did it.
The stack
Fly.io
Fly runs Docker containers on Firecracker VMs for you.
There is a bunch of cool magic you get from them on top:
- A nice flyctl command that is a really great CLI design
- No host maintenance; just keep your app up to date, don’t worry about the host OS
- Direct access to each container with flyctl ssh
- Fast logs that feel like just running Docker locally with flyctl logs
- Let’s Encrypt certificates with just a CNAME and an extra command
- Persistent volumes for your app
There is a free tier that lets you run a few apps and store some data. Note that Wiki.js requires 1GB of RAM, which I think means this particular service can’t be deployed to Fly for free. That is ok for me, though, as my other option is deploying to Digital Ocean or similar, which also would cost money. See their pricing page for details.
Litestream
Litestream continuously replicates a SQLite database to S3. This provides copies of the database at arbitrary checkpoints, which Litestream can restore to a regular SQLite database file on demand.
Fly.io is now employing the creator of Litestream, and appears to be planning for tight integration in future products.
Wiki.js
Wiki.js is perfect for Fly.io + Litestream, because all data is stored in the database, even attachments like photos and PDFs.
Backing up the data is a one step task with Litestream and Wiki.js.
How to deploy Wiki.js to Fly.io under Litestream
Prerequisites
This assumes you already have an S3 bucket created. (Some notes on that later.) You’ll need:
- The S3 bucket itself
- An IAM user with permission to write to the bucket
- Access keys for the IAM user (an “access key ID” and its corresponding “secret access key” in AWS IAM parlance)
You may also want to use your own custom domain name.
I am using wiki.micahrl.com for mine.
(If not, you can just use your-unique-app-name.fly.dev for your wiki.)
I also recommend keeping your configuration files in a git repository. This will likely be different for everyone; I keep mine in the repo for my psyops project.
Assuming you have those requirements, let’s get started.
Create a Dockerfile
To run Wiki.js on Fly.io without Litestream, you can use the official requarks/wiki Docker container.
However, to use Litestream, you must add Litestream to that container.
Litestream has support for running as a wrapper around another command, which makes adding it to another Docker container very easy:
just have your container run litestream replicate --exec 'the full command to wrap'.
My Dockerfile looks like this:
# Dockerfile for running Litestream + Wiki.js on Fly.io
FROM litestream/litestream AS litestream
FROM ghcr.io/requarks/wiki:2
COPY --from=litestream /usr/local/bin/litestream /usr/local/bin/litestream
COPY litestream.yml /etc/litestream.yml
COPY start.sh /usr/local/bin/start.sh
USER root
RUN true \
&& chmod 755 /usr/local/bin/litestream \
&& chmod 755 /usr/local/bin/start.sh \
&& chown node /etc/litestream.yml \
&& touch /testfile \
&& true
USER node
ENTRYPOINT []
CMD ["/usr/local/bin/start.sh"]
Some notes about that:
- We use the official Litestream Docker container just so we can copy the litestream binary from it. The binary is statically linked and it just works!
- The ENTRYPOINT and CMD are modified from the values defined in requarks/wiki. I based this value on the Litestream documentation mentioned earlier combined with the command from the official Wiki.js Docker image; see below for how I found that.
Our Dockerfile copies a litestream.yml file from its directory, which looks like this:
dbs:
  - path: ${DB_FILEPATH}
    replicas:
      - type: s3
        bucket: ${LITESTREAM_S3_BUCKET}
        path: ${LITESTREAM_S3_PATH}
        region: ${LITESTREAM_S3_REGION}
All of those are environment variable names, which we will tell Fly to provide when it deploys, and which Litestream will read at runtime.
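Because those values are only resolved at runtime, a missing variable tends to show up as a confusing startup failure. As a minimal sketch (not part of my actual setup), a hypothetical missing_env helper could list any variables that are unset or empty before Litestream starts. The function name is an assumption for illustration; the variable names are the ones used in this post.

```shell
#!/bin/sh
# Hypothetical helper: print the names of any variables that are unset or empty.
# Pass the variable names to check as arguments.
missing_env() {
    missing=""
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    # Trim the leading space before printing the list.
    printf '%s\n' "${missing# }"
}

# Possible use near the top of a startup script:
#   missing=$(missing_env DB_FILEPATH LITESTREAM_S3_BUCKET LITESTREAM_S3_PATH \
#       LITESTREAM_S3_REGION LITESTREAM_ACCESS_KEY_ID LITESTREAM_SECRET_ACCESS_KEY)
#   [ -z "$missing" ] || { echo "Missing variables: $missing" >&2; exit 1; }
```

Failing early like this keeps the error message next to its cause instead of somewhere inside the replication logs.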
Finally, we run start.sh from the Dockerfile:
#!/bin/sh
set -eu
# Restore the database from S3 if and only if there is no local copy of the database
/usr/local/bin/litestream restore -if-db-not-exists "$DB_FILEPATH"
# Run the Wiki.js docker-entrypoint.sh script, supervised by Litestream
/usr/local/bin/litestream replicate -exec "/usr/local/bin/docker-entrypoint.sh node server"
Deploy to Fly.io
First, decide on an app name.
I chose com-micahrl-wiki for mine, because I like reverse DNS style names, and names cannot contain a dot.
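If you want the same naming style, here is a small sketch that turns a hostname into a reverse DNS style app name by reversing the labels and joining them with hyphens. The function name reverse_dns_app_name is made up for this example, and the sketch does not check Fly's other naming rules.

```shell
#!/bin/sh
# Hypothetical helper: wiki.micahrl.com -> com-micahrl-wiki
reverse_dns_app_name() {
    # Split the hostname on dots, walk the labels from last to first,
    # and join them with hyphens (app names cannot contain a dot).
    printf '%s\n' "$1" | awk -F. '{
        out = $NF
        for (i = NF - 1; i >= 1; i--) out = out "-" $i
        print out
    }'
}

# Possible use:
#   flyctl launch --no-deploy --name "$(reverse_dns_app_name wiki.micahrl.com)"
```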
Log in and create your application, data volume, and secrets
# Create a directory to hold your configuration
# (Can be the root of a new git repo, or a subfolder of an existing one, whatever)
mkdir wiki.micahrl.com
cd wiki.micahrl.com
# Log in
flyctl auth login
# Create fly.toml
flyctl launch --no-deploy --name com-micahrl-wiki
# Make a 1GB volume
flyctl volumes create data --app com-micahrl-wiki -s 1
# The S3 access key you created in advance
flyctl secrets set \
LITESTREAM_ACCESS_KEY_ID=XXX \
LITESTREAM_SECRET_ACCESS_KEY=YYY
The flyctl launch command will have created a fly.toml file.
You’ll need to edit this by making the env section look like this:
[env]
# Use https://your-app-name.fly.dev for now - you can change it to a custom domain later
url = "https://com-micahrl-wiki.fly.dev"
# Wiki.js variables
DB_TYPE = "sqlite"
DB_FILEPATH = "/mrldata/wikijs.sqlite"
# Handled by Litestream itself
LITESTREAM_S3_BUCKET = "com-micahrl-wiki-litestream-bucket"
LITESTREAM_S3_PATH = "wikijs.sqlite"
LITESTREAM_S3_REGION = "us-east-2"
And setting the internal port to 3000 (the port that Wiki.js uses):
[[services]]
internal_port = 3000
And mounting the data volume you created:
[mounts]
source = "data"
destination = "/mrldata"
The full version of my fly.toml is on GitHub.
Now deploy your app:
# Deploy the app itself
flyctl deploy
# 1GB RAM is the Wiki.js minimum requirement
flyctl scale memory 1024
At this stage, you should be able to visit https://your-app-name.fly.dev and log in to the wiki,
but if you want to use a custom domain name,
don’t do the first-run wiki configuration until we set the domain name and get certificates working.
- Create a CNAME for your custom domain name (I used wiki.micahrl.com) to your fly.dev hostname (mine is com-micahrl-wiki.fly.dev).
- Run flyctl certs add wiki.micahrl.com (using your own name) to provision certificates
- Change the url to https://wiki.micahrl.com (using your own name) in fly.toml
- Run flyctl deploy again to pick up the change
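The fly.toml edit in the third step can also be scripted. This is only a sketch: set_wiki_url is a hypothetical helper, it assumes fly.toml has exactly one line beginning with url =, and it assumes the new URL contains no | or & characters (both are special to the sed expression used here).

```shell
#!/bin/sh
# Hypothetical helper: rewrite the `url = "..."` line in a fly.toml file.
# Usage: set_wiki_url path/to/fly.toml https://wiki.example.com
set_wiki_url() {
    file=$1
    new_url=$2
    tmp=$(mktemp)
    # Replace only the line whose key is `url`, keeping its indentation.
    sed "s|^\([[:space:]]*\)url[[:space:]]*=.*|\1url = \"$new_url\"|" "$file" > "$tmp"
    # Write back through the original file so its permissions are preserved.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Possible use, followed by the certificate and deploy steps from the list:
#   set_wiki_url fly.toml https://wiki.micahrl.com
#   flyctl certs add wiki.micahrl.com
#   flyctl deploy
```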
Now you can log in to the wiki using your custom domain name and do the first-run configuration.
That’s it! Your wiki is now up and running.
Maintenance
A few tasks you will need to know how to do over time.
Restoring the database on the command line
This is very easy – just set up your AWS credentials and run a single command.
export AWS_ACCESS_KEY_ID=AKIAxxxxxxxxxxxxxxxx
export AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/xxxxxxxxx
litestream restore -o wikijs.sqlite s3://com-micahrl-wiki-litestream-bucket/wikijs.sqlite
(Of course, substitute your own bucket name and backup path.)
You can examine the result with sqlite:
bash> sqlite3 wikijs.sqlite
sqlite> .tables
_litestream_lock commentProviders pageHistory settings
_litestream_seq comments pageHistoryTags storage
analytics editors pageLinks tags
apiKeys groups pageTags userAvatars
assetData locales pageTree userGroups
assetFolders loggers pages userKeys
assets migrations renderers users
authentication migrations_lock searchEngines
brute navigation sessions
sqlite> select * from pages;
1|home|b29b5d2ce62e55412776ab98f05631e0aa96597b|Wiki Home||0|1||||# Wiki Welcome
See also <https://me.micahrl.com>
|<h1 class="toc-header" id="wiki-welcome"><a href="#wiki-welcome" class="toc-anchor">_</a> Wiki Welcome</h1>
<p>See also <a class="is-external-link" href="https://me.micahrl.com">https://me.micahrl.com</a></p>
|[{"title":"Wiki Welcome","anchor":"#wiki-welcome","children":[]}]|markdown|2022-05-21T23:29:32.844Z|2022-05-21T23:43:25.915Z|markdown|en|1|1|{"js":"","css":""}
2|sandbox|385919f3575186b3410ad6e08ca821b49496413f|Sandbox|Stuff in here is just for fucking around|0|1||||# Sandbox
Stuff in here is just for fucking around
|<h1 class="toc-header" id="sandbox"><a href="#sandbox" class="toc-anchor">_</a> Sandbox</h1>
<p>Stuff in here is just for fucking around</p>
|[{"title":"Sandbox","anchor":"#sandbox","children":[]}]|markdown|2022-05-21T23:51:09.803Z|2022-05-21T23:51:11.761Z|markdown|en|1|1|{"js":"","css":""}
Restoring the database after losing your data volume
Let’s say that something went wrong with the data volume. Perhaps your application and its data volume are accidentally deleted from Fly. How do you recover?
This Dockerfile restores for you automatically.
start.sh runs /usr/local/bin/litestream restore -if-db-not-exists "$DB_FILEPATH" before starting replication and running your app.
This means that if the database is not present in the data volume but there is a replica in S3,
Litestream copies it from S3 first.
If it’s already on the data volume,
Litestream just manages your app as normal.
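To make that decision concrete, here is a tiny sketch of the local-file check described above. It is not Litestream's code, the function name restore_decision is invented for illustration, and it deliberately ignores the question of whether a replica actually exists in S3.

```shell
#!/bin/sh
# Hypothetical model of the choice behind `litestream restore -if-db-not-exists`:
# a database file that already exists on the volume is left alone;
# otherwise a restore from the replica is attempted.
restore_decision() {
    if [ -e "$1" ]; then
        echo "skip"
    else
        echo "restore"
    fi
}

# Possible use inside the container, to see which branch would be taken:
#   restore_decision "$DB_FILEPATH"
```

A fresh, empty data volume therefore always takes the restore branch, which is what makes the disaster recovery case hands-off.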
You can test this by deleting your application and then re-deploying, but in my testing the Let’s Encrypt certificate management system got confused when I deleted and re-deployed my app with the same hostname. Instead, it’s less error-prone to make a copy of your data and use it to deploy a temporary copy of your app.
Of course, make sure to change the wiki homepage or make some other obvious edit, so that you can tell it gets restored properly.
- Copy the data on S3 to a new folder. I used Cyberduck to do this in a GUI. I called the new folder wikijs-copy.sqlite in S3.
- Copy all the files in your directory to a new temporary location: mkdir testing-app/ && cp Dockerfile fly.toml litestream.yml start.sh testing-app/
- Change to that new temporary directory: cd testing-app
- Edit fly.toml inside the temporary directory
  - Change LITESTREAM_S3_PATH = "wikijs-copy.sqlite"
  - Change the URL to a new application name like url = "https://com-micahrl-wiki-2.fly.dev"
  - Change
- Deploy the new temporary copy of the app
flyctl launch --no-deploy --name com-micahrl-wiki-2
flyctl volumes create data --app com-micahrl-wiki-2 -s 1
flyctl secrets set \
LITESTREAM_ACCESS_KEY_ID=XXX \
LITESTREAM_SECRET_ACCESS_KEY=YYY
flyctl scale memory 1024
flyctl deploy
If you do this it will come up with a distinct copy of your wiki’s data! Try editing both copies; note that they are now independent of each other.
When you’re done testing, destroy your temporary app. This will also delete the secrets and the data volume.
flyctl destroy com-micahrl-wiki-2
Upgrading Wiki.js
I am living somewhat dangerously and using FROM ghcr.io/requarks/wiki:2 in my Dockerfile.
This means that every time I run flyctl deploy,
it will get the latest version of the Wiki.js container in the 2.x series
and base my wiki’s container on that.
The upside is that I don’t have to think about upgrades;
the downside is that a new version may break something.
You could instead set a specific version like FROM ghcr.io/requarks/wiki:2.5.283,
and it would always use that version.
You can see a list of all tags on Docker Hub,
and be in full control of when to upgrade.
If you did this, you could also test in a staging environment before upgrading the main site.
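If you go the pinned route, a quick check can catch a floating tag sneaking back in. The sketch below uses two hypothetical helpers: wiki_image_tag reads the tag from the Wiki.js FROM line in a Dockerfile, and is_pinned_tag treats only a full three-part version like 2.5.283 as pinned. That definition of "pinned" is an assumption made for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the tag used on the ghcr.io/requarks/wiki FROM line.
wiki_image_tag() {
    sed -n 's|^FROM ghcr\.io/requarks/wiki:\([^[:space:]]*\).*|\1|p' "$1"
}

# Hypothetical helper: succeed only for a full major.minor.patch tag.
is_pinned_tag() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# Possible use before deploying:
#   tag=$(wiki_image_tag Dockerfile)
#   is_pinned_tag "$tag" || echo "Warning: floating Wiki.js tag '$tag'" >&2
```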
Growing the data volume size
At the time of this writing, Fly does not let you grow your own data volumes :(. They can do it for you if you open a support ticket.
However, since Litestream restores are automatic, you can destroy your application and re-deploy it. If you are using a custom domain name, this may cause temporary (up to 24 hour) problems with the HTTPS certificate for your site, so this is best kept as a last resort.
Other notes
A few auxiliary things that might be helpful.
Creating an S3 bucket and an IAM user with Terraform
Litestream backs up to S3 (and several other data stores). To use this, you’ll need to create an S3 bucket, a user that can write to it, and security credentials for that user.
I wanted to use Terraform to do most of that, so I wrote this Terraform configuration that creates:
- The S3 bucket
- An IAM policy that allows writing to the bucket
- An IAM group, that the policy attaches to
- An IAM user, in that group
Then I went to the AWS web console to create the security credentials for the user I created.
If Terraform isn’t your thing, you could do this with CloudFormation or just in the AWS web console directly.
Require authentication in Wiki.js for viewing pages
I want my wiki to be private by default, but allow users to mark some pages for public viewing. To do that:
- Log in as a user with admin privileges to Wiki.js
- Navigate to the admin area -> Groups -> Guest -> Page Rules tab
- Allow read:pages and read:assets but not read:comments, and only if the tag matches public
- Go to your home wiki page (and/or any other page you want to be publicly visible) and tag it as public
Unfortunately, you must prohibit reading comments, or else logged-out users will get an error on every page load that says “An unexpected error occurred”. Perhaps the Wiki.js team will fix this in a future release.
How to find the correct CMD for the Wiki.js Docker container?
Our Dockerfile’s CMD runs start.sh, which runs litestream replicate -exec '...'.
How do we know the right value to pass to -exec?
In general, to find this,
look in the Dockerfile and combine the ENTRYPOINT and CMD values.
(Some containers have only one or the other;
if a container has both, as Wiki.js does, ENTRYPOINT comes first and then CMD.)
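The combining rule can be written down as a short sketch. combine_exec is a hypothetical helper that takes the ENTRYPOINT and CMD as already-flattened strings (either may be empty) and prints the single command line you would hand to -exec. It does not handle arguments that themselves contain spaces.

```shell
#!/bin/sh
# Hypothetical helper: join ENTRYPOINT and CMD, ENTRYPOINT first.
# Either argument may be the empty string if the image does not set it.
combine_exec() {
    entrypoint=$1
    cmd=$2
    if [ -n "$entrypoint" ] && [ -n "$cmd" ]; then
        printf '%s %s\n' "$entrypoint" "$cmd"
    else
        # Only one of the two is set, so print whichever is non-empty.
        printf '%s\n' "$entrypoint$cmd"
    fi
}

# Possible use with the values from the Wiki.js image:
#   combine_exec "docker-entrypoint.sh" "node server"
```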
I could not find the production Dockerfile for Wiki.js though; unless I just missed it somewhere, I figure it is probably built as part of their build system. Rather than figure that out, I cheated by running this command (via):
docker pull requarks/wiki
docker inspect --format='{{range $e := .Config.Env}}
ENV {{$e}}
{{end}}{{range $e,$v := .Config.ExposedPorts}}
EXPOSE {{$e}}
{{end}}{{range $e,$v := .Config.Volumes}}
VOLUME {{$e}}
{{end}}{{with .Config.User}}USER {{.}}{{end}}
{{with .Config.WorkingDir}}WORKDIR {{.}}{{end}}
{{with .Config.Entrypoint}}ENTRYPOINT {{json .}}{{end}}
{{with .Config.Cmd}}CMD {{json .}}{{end}}
{{with .Config.OnBuild}}ONBUILD {{json .}}{{end}}' requarks/wiki
Which returned these results:
ENV PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
ENV NODE_VERSION=16.15.0
ENV YARN_VERSION=1.22.18
EXPOSE 3000/tcp
EXPOSE 3443/tcp
VOLUME /wiki/data/content
USER node
WORKDIR /wiki
ENTRYPOINT ["docker-entrypoint.sh"]
CMD ["node","server"]
Note that this is not an exact copy of the input Dockerfile, which might have been much more complicated. However, it has enough for our purposes here, telling us:
- What ports it uses
- Where wiki content is by default (although I override that with DB_FILEPATH anyway)
- What user is running the server
- The ENTRYPOINT and CMD
In our Dockerfile we unset ENTRYPOINT and set CMD to run start.sh,
which passes the upstream ENTRYPOINT + CMD to the litestream command.
Without the start.sh wrapper, the equivalent Dockerfile lines would be:
ENTRYPOINT []
CMD ["/usr/local/bin/litestream", "replicate", "--exec", "/usr/local/bin/docker-entrypoint.sh node server"]