Blog - 2019
5 years of jebrosen.com
In my final year of high school I was reading a lot of blogs by college students. Articles like, "23 things I wish I knew freshman year", or "87 true facts about college, you won't believe number 6". There was one piece of advice that really stuck with me: have a website. By that time I already had a web presence on various forums, IRC channels, and I think a few web pages somewhere, but nothing concrete or worth sharing. But now I was inspired. I was going to make a website for myself that could act as a portfolio, a blog, whatever I wanted!
More importantly: I had an excuse to learn node.js.
Why node.js? I'm not entirely sure, since that was five years ago. I was spending a lot of my free time learning new programming languages for fun, and maybe server-side JavaScript was just another one to add to the list. Plus, I could reuse the same code between the client and the server side! (spoiler alert: that never actually happened.)
The earliest iterations of "home-jebrosen" were hosted on OpenShift, and it used the express.js web framework. I used the lightweight database nedb for storage, because at the time I didn't want the complexity of a "real" database. 2014 was a wild year for that project. I vaguely remember cycling between several async libraries and syntaxes, including DIY callback stacks and the async library, all the while using express.js. A notable architectural feature of this version was the use of AJAX to switch between pages. At the end of it all, my 4-page website was a SPA CMS implemented in JavaScript. I could have accomplished the same with 4 HTML pages in Notepad, but of course that wouldn't have been nearly as fun.
I used my website backend as my playground for learning new JS frameworks, libraries, and syntaxes. Over the course of another year or two, I went through early versions of koa and its coroutine/generator style, promises, arrow functions, async/await, and sometimes transpiled with babel or used experimental versions of node.js. I dropped the dynamic functionality and stuck with server-side rendering with the Jade template engine (now known as Pug).
Another notable change I made around that time was the switch to SQLite. We wanted to try SQLite for a project at school, and it was the perfect excuse to rewrite some core functionality of my website for the umpteenth time! I am very happy with SQLite overall and will probably always reach for it first for small projects.
The other, much larger, change to my website also happened in college: Rust. Rust looked like the programming language I had been searching for for years: something low-level and performant like C and C++ but without so many footguns. I loved C for its simplicity, but it is too unsafe for my tastes. C++ solves some of the most annoying things about writing C code (especially with RAII and templates), but it introduces some complexity and downright awkwardness that I was unhappy with. As a minor example, I have read several explanations of rvalue references and never felt like I actually understood them. The situation is the same with variadic templates and SFINAE, and it was frustrating to peek into standard library implementations and see all of these things combined all the time.
Rust claimed to be a kind of goldilocks language ("fast, reliable, productive - pick three"). The syntax looked close enough to C++ for me to pick it up fairly quickly. And it felt like a good time to learn another programming language that was more my style (sorry, Perl, that side-eye is directed at you). After reading through the Rust Book and some other guides, I needed a real project to experience the language. So I just rewrote my entire website in Rust.
I started with the iron web framework, which seemed like a solid choice at the time. I used horrorshow templates, which have a syntax similar to Pug in the ways I cared about but unlike Pug were checked for validity at project compile time. Over the course of a month I rewrote everything, including the site, the blog, and the administration backend. I named the new project sirus: SImple RUst Site. A few months later, the iron project announced that it was going to be unmaintained, and I started looking elsewhere.
Somehow I settled on Rocket, which introduced me to nightly rust and comparatively heavy amounts of code generation. I really liked Rocket's approach to route definitions, especially with request and data guards. For brevity I will not tell the story here, but I am now one of Rocket's maintainers: I help answer support questions for the project, and for the past year or so I have been working on a migration to Rust's new async/await functionality.
Rocket is here to stay as part of my website for a while; it eliminates certain pain points that I would likely struggle with again if I were to switch, and I have no good reason to use anything else now. That's not to say that development has remained stagnant! I have still carried on my tradition of rewriting something every few months - I have gone through the tera, askama, and maud template engines. I also went to sqlite-with-diesel, to postgres-or-sqlite-with-diesel, to postgres-without-diesel. (The switch to postgres was motivated by using postgres for other things on my server anyway, and having everything in one system makes backups easier to deal with. I would probably still be using sqlite otherwise.)
So what's next? The math says I am overdue to switch template engines again, but that's pretty low on my priority list right now. I have been spending the bulk of my work-unrelated programming on making Rocket async, and it will be nice to work on some of my other projects too. If I get an itch to try something new, it could end up in my website in one form or another. Or not. After 5 years, maybe it is finally time to stop rewriting the world.
Adventures in Response Composition
Background
In web application servers it is useful to perform operations on "finished" responses before they are sent to a client. You might want to set caching or other headers, or compress the body data. Some web frameworks, including Rocket, have a system for user-defined middleware or response wrappers that can perform these operations in a reusable and composable way.
There are a wide variety of operations you might want to apply to a response after it has been built:
- Add a Server or other global header
- Set the Content-Type, Content-Disposition, or another response-specific header
- Compress the response body and set the Content-Encoding header
- Checksum the response and save the checksum in the ETag header
- Compare the If-None-Match header in the request to the ETag of the response, and respond with a 304 Not Modified if they match
- Serve a portion of the body selected by the Range header in the request
- Add an X-Response-Time header indicating how much time the server spent processing the request
A server might perform some operations on every response, or only on specific
routes or when certain conditions are met. In Rocket, the idiomatic way to
operate on individual responses is the Responder. Some responder types, such
as String or Vec<u8>, set the response body to the bytes they contain.
Other responders, such as Content, modify the result of another Responder.
These are known as wrapping Responders, and they are the building blocks of
composable operations on responses.
Consider the following route:
    use rocket::http::ContentType;
    use rocket::response::Content;

    #[get("/hello")]
    fn hello() -> Content<&'static str> {
        Content(ContentType::JSON, "{\"hello\": \"world\"}")
    }
When this route returns, Content::respond_to() is called. Content is a wrapping
responder: its implementation is "Run the inner responder, then set the
Content-Type header to the specified ContentType (in this case
application/json)." Here the inner responder type is &str, and its respond_to
implementation is "set the response body to the UTF-8 bytes underlying this
string and set the Content-Type to text/plain". Because Content sets the
Content-Type header after the inner responder has run, the final response sent
to the client will have Content-Type: application/json as desired.
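The "inner responder first, wrapper second" ordering can be sketched without Rocket at all. The following is a toy model, not Rocket's real API: the Response struct, the one-method Responder trait, and the string-typed Content wrapper are all simplified stand-ins invented for illustration.

```rust
use std::collections::BTreeMap;

/// A toy response: just headers and a body (not Rocket's real `Response`).
#[derive(Debug, Default)]
struct Response {
    headers: BTreeMap<String, String>,
    body: Vec<u8>,
}

/// A simplified stand-in for Rocket's `Responder` trait.
trait Responder {
    fn respond(self) -> Response;
}

/// Leaf responder: a string sets the body and a default Content-Type.
impl Responder for &str {
    fn respond(self) -> Response {
        let mut res = Response::default();
        res.headers.insert("Content-Type".into(), "text/plain".into());
        res.body = self.as_bytes().to_vec();
        res
    }
}

/// Wrapping responder: holds a content type and an inner responder.
struct Content<R>(&'static str, R);

impl<R: Responder> Responder for Content<R> {
    fn respond(self) -> Response {
        // The inner responder runs first and sets its own Content-Type...
        let mut res = self.1.respond();
        // ...then the wrapper overwrites it, so the wrapper's value wins.
        res.headers.insert("Content-Type".into(), self.0.into());
        res
    }
}
```

Calling `Content("application/json", "{\"hello\": \"world\"}").respond()` in this model yields a response whose body is the string's bytes and whose Content-Type is the wrapper's value rather than text/plain.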
Handling Content-Range
Another useful wrapping responder might be the "Range request handler". Range
requests are commonly used for resuming downloads and skipping around in
streaming media. Suppose we had a wrapping responder called Range, used like
this:
    #[get("/video.mp4")]
    fn video() -> Range<File> {
        Range(File::open("video.mp4"))
    }
Range might implement respond_to in the following way:
- Run the inner responder (in this example, File)
- Check if the client sent a Range header
- Grab the requested portion of the response body
- Set the response body to be only the portion
- Set the Content-Range header on the response
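The steps above can be sketched on plain byte slices. This is a rough illustration under stated assumptions, not how Rocket or any real server does it: parse_range and apply_range are hypothetical helpers, only a single `bytes=start-end` range is understood, and the 206 and 416 status codes a real server would send are left out.

```rust
/// Parse a single `bytes=start-end` range (hypothetical helper; ignores
/// suffix ranges like `bytes=-500` and multi-range requests).
fn parse_range(header: &str, len: usize) -> Option<(usize, usize)> {
    let spec = header.strip_prefix("bytes=")?;
    let (start, end) = spec.split_once('-')?;
    let start: usize = start.parse().ok()?;
    // The last valid byte index; `None` for an empty body.
    let last = len.checked_sub(1)?;
    // An open-ended range (`bytes=100-`) runs to the last byte.
    let end: usize = if end.is_empty() {
        last
    } else {
        end.parse::<usize>().ok()?.min(last)
    };
    if start > end { None } else { Some((start, end)) }
}

/// Slice the body and build the matching `Content-Range` value.
fn apply_range(body: &[u8], range_header: Option<&str>) -> (Vec<u8>, Option<String>) {
    match range_header.and_then(|h| parse_range(h, body.len())) {
        Some((start, end)) => (
            body[start..=end].to_vec(),
            Some(format!("bytes {}-{}/{}", start, end, body.len())),
        ),
        // No usable Range header: serve the whole body unchanged.
        None => (body.to_vec(), None),
    }
}
```

For a ten-byte body and the header `bytes=2-5`, apply_range returns the four bytes at positions 2 through 5 along with the header value `bytes 2-5/10`.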
In a real project this wouldn't show as a video in most browsers, because we
forgot to set the Content-Type header. Let's fix it:
    #[get("/video.mp4")]
    fn video() -> Range<Content<File>> {
        Range(Content(ContentType::MP4, File::open("video.mp4")))
    }
Hmm. What about Content<Range<File>>?
    #[get("/video.mp4")]
    fn video() -> Content<Range<File>> {
        Content(ContentType::MP4, Range(File::open("video.mp4")))
    }
That works too! Content and Range can safely be reordered because they
don't interfere with each other in any way.
Suppose we decided Range is a really nice feature and we built it into Rocket
directly, so every File will handle range requests automatically:
    #[get("/video.mp4")]
    fn video() -> Content<File> {
        Content(ContentType::MP4, File::open("video.mp4"))
    }
Much simpler, and now we don't have to worry about Range handling because
it's already done for us!
Handling ETag
Now imagine another useful responder. The HTTP ETag header carries a checksum
of the response body which can be used to make repeated requests more
efficient. A browser can send a checksum it already has in the If-None-Match
header, and if it matches the current ETag the server can send 304 Not Modified
with no body instead of a 200 OK.
Let's use our hypothetical ETag responder:
    #[get("/hello")]
    fn hello() -> ETag<&'static str> {
        ETag("Hello there!")
    }
ETag might work like this:
- Run the inner responder
- Checksum the body
- Respond with a 304 Not Modified if the checksum matches
- Set the ETag header
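A minimal sketch of those steps, again on plain bytes rather than Rocket types. The helper names etag_for and apply_etag are made up for illustration, and std's DefaultHasher is only a convenient stand-in for a checksum; a real server would more likely use a stable digest, since DefaultHasher's output is not guaranteed to stay the same across Rust releases.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Compute a quoted entity tag from the body (stand-in checksum).
fn etag_for(body: &[u8]) -> String {
    let mut hasher = DefaultHasher::new();
    body.hash(&mut hasher);
    format!("\"{:016x}\"", hasher.finish())
}

/// Returns (status code, ETag header value, body to send).
fn apply_etag(body: &[u8], if_none_match: Option<&str>) -> (u16, String, Vec<u8>) {
    let tag = etag_for(body);
    if if_none_match == Some(tag.as_str()) {
        // The client already has this exact body: send no content.
        (304, tag, Vec::new())
    } else {
        // First visit or stale copy: send everything, tagged for next time.
        (200, tag, body.to_vec())
    }
}
```

A first request with no If-None-Match header gets a 200 with the full body and a tag; repeating the request with that tag in If-None-Match gets a 304 with an empty body.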
Just as the order of Range and Content doesn't matter, neither does the order
of ETag and Content.
But the order of Range and ETag does matter! The checksum of a whole file
is different from the checksum of any section of the file. But that's easy to
fix: always use Range<ETag<File>> and never ETag<Range<File>>. That way,
the checksum will always be calculated for the whole file.
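To make the difference concrete, here is a small sketch of both orderings on plain bytes. The function names are hypothetical and DefaultHasher merely stands in for a real checksum; the only point is where the checksum is taken relative to the slicing.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in checksum; a real ETag would likely use a stable digest.
fn checksum(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Range<ETag<File>>: checksum the whole file, then slice.
fn range_of_etag(file: &[u8], start: usize, end: usize) -> (u64, Vec<u8>) {
    let tag = checksum(file);
    (tag, file[start..=end].to_vec())
}

/// ETag<Range<File>>: slice first, then checksum only the slice.
fn etag_of_range(file: &[u8], start: usize, end: usize) -> (u64, Vec<u8>) {
    let part = &file[start..=end];
    (checksum(part), part.to_vec())
}
```

With range_of_etag, every range of the same file carries the same tag, so a client can tell that two partial downloads belong to the same version of the file. With etag_of_range, each range gets its own tag, which no longer identifies the file as a whole.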
Disappointment
Now we have a problem.
    #[get("/video.mp4")]
    fn video() -> Content<ETag<File>> {
        Content(ContentType::MP4, ETag(File::open("video.mp4")))
    }
Remember that we wanted to make File handle range requests automatically, so
ETag will process after Range handling. But as I just pointed out, Range
handling can only correctly be done after ETag handling.
This is disappointing to me, because it would have been really nice for Range
requests to be handled automatically in Rocket. If you were hoping for a clever
solution or idea, I'm afraid you will leave disappointed too, because I don't
have one yet.
Dfam 3.0
The Dfam consortium is excited to announce the release of Dfam 3.0. This release represents a major transition for Dfam from a proof-of-concept database into a funded open community resource. Central to this transition is a major infrastructure and technology update, enabling Dfam to handle the increasing pace of genome sequencing and TE library generation.
—Xfam blog, https://xfam.wordpress.com/2019/03/06/dfam-3-0-is-out/
Since I started at the Institute for Systems Biology (ISB) last September we have been hard at work on this update. My own contributions to the 3.0 release were to the new REST API, the new web interface written in Angular, and updating some existing backend scripts and visualizations.
I would like to thank Robert Hubley and Arian Smit at ISB, Travis Wheeler at the University of Montana, and all the previous contributors to Dfam. Without them none of this would have been possible, and I am excited to be here working on this and other projects.
The opinions expressed herein are my own and do not necessarily represent the views of ISB or any of its collaborators.