I built a utility theme for Hugo that facilitates archiving tweets, and a companion Python tool that processes tweet data for the theme. It’s called twarchive.
Embedded tweets look like this:
There is a comprehensive archive of all my tweets and more on a separate Hugo site at https://tweets.micahrl.com. That site uses mostly default styles from the theme, and is useful as an example. You can see its repo on GitHub at mrled/tweets.micahrl.com.
The theme source code is also on GitHub at mrled/twarchive, and the readme contains detailed install and usage instructions.
- Keep a local copy of tweets in high fidelity, even if they are deleted or otherwise unavailable from Twitter
- Keep a local copy of all media
- Do not give tracking information to Twitter or any other third party
- Allow tweets to be downloaded
This project has two components: a Python program to download tweet data to JSON files, and a Hugo theme module that renders them for the site.
Python program to download tweet data
The program understands Hugo sites, the Twitter API, and Twitter archives.
It can retrieve tweets from the Twitter API directly, and can also grab related tweets like thread parents, quote tweets, and retweets.
It can parse a Twitter archive and embed tweets without calling the API. This is especially useful for very old tweets, or if you have a tweet archive from a deleted account.
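As a rough illustration of reading an archive without the API, here is a minimal sketch. It assumes the archive's tweet data file is a JavaScript assignment wrapping a JSON array (something like `window.YTD.tweet.part0 = [...]`), and that each record keeps the tweet under a `tweet` key. The function names are hypothetical and this is not twarchive's actual code.

```python
import json


def parse_archive_js(text: str) -> list:
    """Parse an archive data file that wraps a JSON array in a JS assignment."""
    # Everything before the first "=" is assumed to be the assignment prefix.
    _, _, payload = text.partition("=")
    # Tolerate surrounding whitespace and an optional trailing semicolon.
    return json.loads(payload.strip().rstrip(";"))


def tweets_by_id(records: list) -> dict:
    """Index tweet records by ID string so embeds can be looked up offline."""
    index = {}
    for record in records:
        # Assumption: records look like {"tweet": {...}}; fall back to the record itself.
        tweet = record.get("tweet", record)
        index[tweet["id_str"]] = tweet
    return index


sample = 'window.YTD.tweet.part0 = [{"tweet": {"id_str": "20", "full_text": "just setting up my twttr"}}]'
archive = tweets_by_id(parse_archive_js(sample))
print(archive["20"]["full_text"])
```

Indexing by ID keeps lookups cheap when a post embeds only a handful of tweets out of a large archive.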
It can scan your Hugo posts for tweets embedded with twarchive’s shortcodes and download them or pull them from an archive, along with related tweets.
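Scanning posts for embedded tweets could look roughly like the sketch below. The shortcode name `archivedTweet` is a placeholder I made up for illustration (check the readme for the real shortcode names), and `find_tweet_ids` and `scan_site` are hypothetical helpers.

```python
import re
from pathlib import Path

# Hypothetical shortcode name; matches {{< archivedTweet "123" >}} or {{< archivedTweet 123 >}}.
SHORTCODE_RE = re.compile(r'\{\{<\s*archivedTweet\s+"?(\d+)"?\s*>\}\}')


def find_tweet_ids(markdown: str) -> list[str]:
    """Return every tweet ID referenced by the shortcode, in document order."""
    return SHORTCODE_RE.findall(markdown)


def scan_site(content_dir: Path) -> set[str]:
    """Collect the tweet IDs used anywhere under a Hugo content directory."""
    ids: set[str] = set()
    for path in content_dir.rglob("*.md"):
        ids.update(find_tweet_ids(path.read_text(encoding="utf-8")))
    return ids


sample = 'Intro.\n\n{{< archivedTweet "1234567890" >}}\n\nMore text {{< archivedTweet 42 >}}.'
print(find_tweet_ids(sample))
```

The resulting set of IDs is what would then be downloaded from the API or pulled from an archive.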
It works around a Hugo limitation: Hugo cannot generate a new page from data alone. Tweets are saved to JSON files inside Hugo’s data folder, so the Python program creates a page for each tweet in the data folder instead.
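The page-per-tweet workaround can be sketched like this. The directory layout (`data/tweets`, `content/tweets`) and the front matter keys are assumptions for the example, not the paths twarchive really uses. The stub uses JSON front matter, which Hugo accepts, so no extra library is needed.

```python
import json
import tempfile
from pathlib import Path


def create_tweet_pages(data_dir: Path, content_dir: Path) -> list[Path]:
    """Write a stub page for every tweet JSON file that does not have one yet."""
    content_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for data_file in sorted(data_dir.glob("*.json")):
        tweet_id = data_file.stem
        page = content_dir / f"{tweet_id}.md"
        if page.exists():
            continue  # Never overwrite a page that already exists.
        # Hypothetical front matter keys; a theme layout would use the ID to find the data.
        front_matter = {"title": f"Tweet {tweet_id}", "tweetid": tweet_id}
        page.write_text(json.dumps(front_matter, indent=2) + "\n", encoding="utf-8")
        created.append(page)
    return created


# Demonstrate against a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp)
    data = site / "data" / "tweets"
    data.mkdir(parents=True)
    (data / "20.json").write_text('{"id_str": "20"}', encoding="utf-8")
    pages = create_tweet_pages(data, site / "content" / "tweets")
    page_names = [p.name for p in pages]
    second_run = create_tweet_pages(data, site / "content" / "tweets")

print(page_names)
```

Skipping existing pages makes the step safe to rerun every time new tweets are downloaded.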
Hugo theme, generated HTML
Each tweet is an iframe to a self-contained HTML file.
Images and videos are base64-encoded as data: URIs, which are saved directly in the HTML.
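For readers unfamiliar with the technique, this is roughly how media bytes become a base64 data: URI that can live inside an HTML file. It is a minimal sketch with hypothetical helper names, not the theme's own implementation.

```python
import base64
import html


def to_data_uri(payload: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data: URI."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def img_tag(payload: bytes, mime_type: str, alt: str) -> str:
    """Build an <img> element whose source is embedded rather than linked."""
    return f'<img alt="{html.escape(alt, quote=True)}" src="{to_data_uri(payload, mime_type)}">'


print(to_data_uri(b"hello", "text/plain"))
# → data:text/plain;base64,aGVsbG8=
```

Because the media is part of the document itself, the page makes no request to Twitter or any other third party, at the cost of an HTML file roughly a third larger than the raw media.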
Tweet styles are self-contained and not affected by site styles.
Dark mode is supported if the user has set a dark color scheme preference in their browser or operating system, but any site-specific toggles to enable dark mode, like the one I have, will not work.
Each tweet has a download button so that any user can easily make their own copy. Hat tip to Terence Eden for explaining how this works.
data: URIs are unwieldy. Chromium-based browsers refuse to display
- Capturing polls in tweets is not possible unless we use the v2 API. This implementation uses the v1.1 API because it is easier to get started with, while v2 requires manual approval from Twitter 🙄.
- Styling could use some improvements, especially for tweet threads.
- Authentication: we can use the official Twitter consumer key/secret for access to public data. love too skirt API key bullshit.
- Page performance: Using iframes means there is some asynchrony in page load. Each tweet (including embedded images and video) is loaded in a separate frame. Depending on how many tweets you want to embed in a page, this might make performance better or worse.
- Hugo performance: Including thousands of extra pages in a Hugo site increases build time.
I originally wanted to keep all my tweets on this site under something like a /tweets URI, but when that got too slow I moved them off to https://tweets.micahrl.com. Now only tweets that I embed are included in this site, and my entire Twitter history is on another site that doesn’t undergo heavy development.