Blog - 2019

2019

2019-11-02

5 years of jebrosen.com

In my final year of high school I was reading a lot of blogs by college students. Articles like, "23 things I wish I knew freshman year", or "87 true facts about college, you won't believe number 6". There was one piece of advice that really stuck with me: have a website. By that time I already had a web presence on various forums, IRC channels, and I think a few web pages somewhere, but nothing concrete or worth sharing. But now I was inspired. I was going to make a website for myself that could act as a portfolio, a blog, whatever I wanted!

More importantly: I had an excuse to learn node.js.


Why node.js? I'm not entirely sure, since that was five years ago. I was spending a lot of my free time learning new programming languages for fun, and maybe server-side JavaScript was just another one to add to the list. Plus, I could reuse the same code between the client and the server side! (spoiler alert: that never actually happened.)

The earliest iterations of "home-jebrosen" were hosted on OpenShift, and it used the express.js web framework. I used the lightweight database nedb for storage, because at the time I didn't want the complexity of a "real" database. 2014 was a wild year for that project. I vaguely remember cycling between several async libraries and syntaxes, including DIY callback stacks and the async library, all the while using express.js. A notable architectural feature of this version was the use of AJAX to switch between pages. At the end of it all, my 4-page website was a SPA CMS implemented in JavaScript. I could have accomplished the same with 4 HTML pages in Notepad, but of course that wouldn't have been nearly as fun.

I used my website backend as my playground for learning new JS frameworks, libraries, and syntaxes. Over the course of another year or two, I went through early versions of koa and its coroutine/generator style, promises, arrow functions, async/await, and sometimes transpiled with babel or used experimental versions of node.js. I dropped the dynamic functionality and stuck with server-side rendering with the Jade template engine (now known as Pug).

Another notable change I made around that time was the switch to SQLite. We wanted to try SQLite for a project at school, and it was the perfect excuse to rewrite some core functionality of my website for the umpteenth time! I am very happy with SQLite overall and will probably always reach for it first for small projects.


The other, much larger, change to my website also happened in college: Rust. Rust looked like the programming language I had been searching for for years: something low-level and performant like C and C++ but without so many footguns. I loved C for its simplicity, but it is too unsafe for my tastes. C++ solves some of the most annoying things about writing C code, especially with RAII and templates, but it introduces some complexity and downright awkwardness that I was unhappy with. As a minor example, I have read several explanations of rvalue references and never felt like I actually understood them. The situation is the same with variadic templates and SFINAE, and it was frustrating to peek into standard library implementations and see all of these things combined all the time.

Rust claimed to be a kind of goldilocks language ("fast, reliable, productive - pick three"). The syntax looked close enough to C++ for me to pick it up fairly quickly. And it felt like a good time to learn another programming language that was more my style (sorry, Perl, that side-eye is directed at you). After reading through the Rust Book and some other guides, I needed a real project to experience the language. So I just rewrote my entire website in Rust.

I started with the iron web framework, which seemed like a solid choice at the time. I used horrorshow templates, which have a syntax similar to Pug in the ways I cared about but unlike Pug were checked for validity at project compile time. Over the course of a month I rewrote everything, including the site, the blog, and the administration backend. I named the new project sirus: SImple RUst Site. A few months later, the iron project announced that it was going to be unmaintained, and I started looking elsewhere.

Somehow I settled on Rocket, which introduced me to nightly Rust and comparatively heavy amounts of code generation. I really liked Rocket's approach to route definitions, especially with request and data guards. For brevity I will not tell the story here, but I am now one of Rocket's maintainers: I help answer support questions for the project, and for the past year or so I have been working on a migration to Rust's new async/await functionality.

Rocket is here to stay as part of my website for a while; it eliminates certain pain points that I would likely struggle with again if I were to switch, and I have no good reason to use anything else now. That's not to say that development has remained stagnant! I have still carried on my tradition of rewriting something every few months: I have gone through the tera, askama, and maud template engines. I also went to sqlite-with-diesel, to postgres-or-sqlite-with-diesel, to postgres-without-diesel. (The switch to postgres was motivated by using postgres for other things on my server anyway, and having everything in one system makes backups easier to deal with. I would probably still be using sqlite otherwise.)


So what's next? The math says I am overdue to switch template engines again, but that's pretty low on my priority list right now. I have been spending the bulk of my work-unrelated programming on making Rocket async, and it will be nice to work on some of my other projects too. If I get an itch to try something new, it could end up in my website in one form or another. Or not. After 5 years, maybe it is finally time to stop rewriting the world.

2019-06-23

Adventures in Response Composition

Background

In web application servers it is useful to perform operations on "finished" responses before they are sent to a client. You might want to set caching or other headers, or compress the body data. Some web frameworks, including Rocket, have a system for user-defined middleware or response wrappers that can perform these operations in a reusable and composable way.

There are a wide variety of operations you might want to apply to a response after it has been built:

  • Add a Server or other global header
  • Set the Content-Type, Content-Disposition, or another response-specific header
  • Compress the response body and set the Content-Encoding header
  • Checksum the response and save the checksum in the ETag header
  • Compare the If-None-Match header in the request to the ETag of the response, and respond with a 304 Not Modified if they match
  • Serve a portion of the body selected by the Range header in the request
  • Add an X-Response-Time header indicating how much time the server spent processing the request

A server might perform some operations on every response, or only on specific routes or when certain conditions are met. In Rocket, the idiomatic way to operate on individual responses is the Responder. Some responder types, such as String or Vec<u8>, set the response body to the bytes they contain. Other responders, such as Content, modify the result of another Responder. These are known as wrapping Responders, and they are the building blocks of composable operations on responses.

Consider the following route:

use rocket::http::ContentType;
use rocket::response::Content;

#[get("/hello")]
fn hello() -> Content<&'static str> {
    Content(ContentType::JSON, "{\"hello\": \"world\"}")
}

When this route returns, Content::respond_to() is called. It is a wrapping responder: its implementation is "Run the inner responder, then set the Content-Type header to the specified ContentType (in this case application/json)." Here the inner responder type is &str, and its respond_to implementation is "set the response body to the UTF-8 bytes underlying this string and set the Content-Type to text/plain".

Because Content sets the Content-Type header after the inner responder has run, the final response sent to the client will have Content-Type: application/json as desired.
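To make the ordering concrete, here is a minimal sketch of the same idea without Rocket. The Response struct and the one-method Responder trait below are simplified stand-ins I made up for illustration; they are not Rocket's real types or signatures.

```rust
use std::collections::HashMap;

// A toy stand-in for an HTTP response; this is not Rocket's real Response type.
#[derive(Debug, Default)]
struct Response {
    headers: HashMap<String, String>,
    body: Vec<u8>,
}

// A simplified, hypothetical Responder: turn a value into a Response.
trait Responder {
    fn respond(self) -> Response;
}

// Plain strings set the body and default the Content-Type to text/plain.
impl Responder for &str {
    fn respond(self) -> Response {
        let mut res = Response::default();
        res.headers.insert("Content-Type".into(), "text/plain".into());
        res.body = self.as_bytes().to_vec();
        res
    }
}

// A wrapping responder: run the inner responder, then override Content-Type.
struct Content<R>(&'static str, R);

impl<R: Responder> Responder for Content<R> {
    fn respond(self) -> Response {
        let mut res = self.1.respond();
        // This runs after the inner responder, so it wins over text/plain.
        res.headers.insert("Content-Type".into(), self.0.into());
        res
    }
}

fn main() {
    let res = Content("application/json", "{\"hello\": \"world\"}").respond();
    println!("{:?}, {} body bytes", res.headers.get("Content-Type"), res.body.len());
}
```

The wrapper only touches the header it cares about and leaves the body alone, which is what makes it composable with other wrappers.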

Handling Content-Range

Another useful wrapping responder might be the "Range request handler". Range requests are commonly used for resuming downloads and skipping around in streaming media. Suppose we had a wrapping responder called Range, used like this:

#[get("/video.mp4")]
fn video() -> Range<File> {
    Range(File::open("video.mp4"))
}

Range might implement respond_to in the following way:

  1. Run the inner responder - in this example, File
  2. Check if the client sent a Range header
  3. Grab the requested portion of the response body
  4. Set the response body to be only the portion
  5. Set the Content-Range header on the response
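Steps 2 through 5 could look roughly like the sketch below, which works on an in-memory body instead of a real Rocket response. The function names are hypothetical, and it only understands the simplest header form, bytes=START-END. Open-ended ranges, suffix ranges, multiple ranges, and the 206 and 416 status codes are all left out.

```rust
// Parse a header of the form "bytes=START-END" (both bounds given, inclusive).
// Other valid forms are deliberately not handled in this sketch.
fn parse_simple_range(header: &str, len: usize) -> Option<(usize, usize)> {
    let spec = header.strip_prefix("bytes=")?;
    let (start, end) = spec.split_once('-')?;
    let start: usize = start.parse().ok()?;
    let end: usize = end.parse().ok()?;
    if start > end || start >= len {
        return None;
    }
    // Clamp the end to the last valid byte index.
    Some((start, end.min(len - 1)))
}

// Given the finished body and the request's Range header (if any), return the
// body to send plus an optional Content-Range header value.
fn apply_range(body: &[u8], range_header: Option<&str>) -> (Vec<u8>, Option<String>) {
    match range_header.and_then(|h| parse_simple_range(h, body.len())) {
        Some((start, end)) => {
            let part = body[start..=end].to_vec();
            let content_range = format!("bytes {}-{}/{}", start, end, body.len());
            (part, Some(content_range))
        }
        // No usable Range header: send the whole body unchanged.
        None => (body.to_vec(), None),
    }
}

fn main() {
    let body = b"Hello there!";
    let (part, header) = apply_range(body, Some("bytes=6-10"));
    println!("{:?} {:?}", String::from_utf8_lossy(&part), header);
}
```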

In a real project this wouldn't show as a video in most browsers, because we forgot to set the Content-Type header. Let's fix it:

#[get("/video.mp4")]
fn video() -> Range<Content<File>> {
    Range(Content(ContentType::MP4, File::open("video.mp4")))
}

Hmm. What about Content<Range<File>>?

#[get("/video.mp4")]
fn video() -> Content<Range<File>> {
    Content(ContentType::MP4, Range(File::open("video.mp4")))
}

That works too! Content and Range can safely be reordered because they don't interfere with each other in any way.

Suppose we decided Range is a really nice feature and we built it into Rocket directly, so every File will handle range requests automatically:

#[get("/video.mp4")]
fn video() -> Content<File> {
    Content(ContentType::MP4, File::open("video.mp4"))
}

Much simpler, and now we don't have to worry about Range handling because it's already done for us!

Handling ETag

Now imagine another useful responder. The HTTP ETag header carries a checksum of the response body which can be used to make repeated requests more efficient. A browser can send a checksum it already has in the If-None-Match header, and if it matches the current ETag the server can send 304 Not Modified with no body instead of a 200 OK.

Let's use our hypothetical ETag responder:

#[get("/hello")]
fn hello() -> ETag<&'static str> {
    ETag("Hello there!")
}

ETag might work like this:

  1. Run the inner responder
  2. Checksum the body
  3. Respond with a 304 Not Modified if the checksum matches
  4. Set the ETag header
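Here is a rough sketch of those four steps on an in-memory body. Everything in it is an assumption made for illustration: the Outcome enum and apply_etag are hypothetical names, FNV-1a is used as the checksum only because it fits in a few lines, and If-None-Match is compared as a single exact value (the real header also allows lists, weak validators, and *).

```rust
// FNV-1a, chosen only for brevity; a real server would likely use a stronger hash.
fn checksum(body: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &byte in body {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

#[derive(Debug, PartialEq)]
enum Outcome {
    // 304 Not Modified: no body, but the ETag is still reported.
    NotModified { etag: String },
    // 200 OK with the full body and its ETag.
    Full { etag: String, body: Vec<u8> },
}

// `body` is what the inner responder produced (step 1).
fn apply_etag(body: Vec<u8>, if_none_match: Option<&str>) -> Outcome {
    // Step 2: checksum the body and format it as a quoted ETag value.
    let etag = format!("\"{:016x}\"", checksum(&body));
    // Steps 3 and 4: short-circuit on a match, otherwise attach the ETag.
    if if_none_match == Some(etag.as_str()) {
        Outcome::NotModified { etag }
    } else {
        Outcome::Full { etag, body }
    }
}

fn main() {
    let first = apply_etag(b"Hello there!".to_vec(), None);
    println!("{:?}", first);
    if let Outcome::Full { etag, .. } = first {
        // A repeat request that sends the ETag back gets the short answer.
        println!("{:?}", apply_etag(b"Hello there!".to_vec(), Some(etag.as_str())));
    }
}
```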

Just as the order of Range and Content doesn't matter, neither does the order of ETag and Content.

But the order of Range and ETag does matter! The checksum of a whole file is different from the checksum of any section of the file. But that's easy to fix: always use Range<ETag<File>> and never ETag<Range<File>>. That way, the checksum will always be calculated for the whole file.
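A small sketch shows the difference between the two orderings. The two functions are hypothetical stand-ins for the nested responders, operating directly on byte slices, and the checksum is FNV-1a picked only for brevity.

```rust
// FNV-1a, used purely as a small stand-in checksum.
fn checksum(bytes: &[u8]) -> u64 {
    bytes.iter().fold(0xcbf29ce484222325u64, |hash, &b| {
        (hash ^ u64::from(b)).wrapping_mul(0x100000001b3)
    })
}

// Range<ETag<File>>: ETag is innermost and runs first, so it sees the whole file.
fn range_of_etag(file: &[u8], start: usize, end: usize) -> (u64, Vec<u8>) {
    let etag = checksum(file);
    (etag, file[start..=end].to_vec())
}

// ETag<Range<File>>: Range runs first, so ETag only sees the requested slice.
fn etag_of_range(file: &[u8], start: usize, end: usize) -> (u64, Vec<u8>) {
    let part = file[start..=end].to_vec();
    (checksum(&part), part)
}

fn main() {
    let file = b"Hello there!";
    let (good_a, _) = range_of_etag(file, 0, 4);
    let (good_b, _) = range_of_etag(file, 6, 10);
    let (bad_a, _) = etag_of_range(file, 0, 4);
    let (bad_b, _) = etag_of_range(file, 6, 10);
    println!("Range<ETag<_>>: same tag for both ranges? {}", good_a == good_b);
    println!("ETag<Range<_>>: same tag for both ranges? {}", bad_a == bad_b);
}
```

With the wrong nesting, the same file ends up with a different ETag for every range a client asks for, so the tag no longer identifies the file.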

Disappointment

Now we have a problem.

#[get("/video.mp4")]
fn video() -> Content<ETag<File>> {
    Content(ContentType::MP4, ETag(File::open("video.mp4")))
}

Remember that we wanted File to handle range requests automatically. That means the Range handling happens inside File, so ETag runs after it and checksums only the requested portion. But as I just pointed out, Range handling can only be done correctly after ETag handling.

This is disappointing to me, because it would have been really nice for Range requests to be handled automatically in Rocket. If you were hoping for a clever solution or idea I'm afraid you will leave disappointed too, because I don't have one yet.

2019-03-08

Dfam 3.0

The Dfam consortium is excited to announce the release of Dfam 3.0. This release represents a major transition for Dfam from a proof-of-concept database into a funded open community resource. Central to this transition is a major infrastructure and technology update, enabling Dfam to handle the increasing pace of genome sequencing and TE library generation.

—Xfam blog, https://xfam.wordpress.com/2019/03/06/dfam-3-0-is-out/

Since I started at the Institute for Systems Biology (ISB) last September we have been hard at work on this update. My own contributions to the 3.0 release were to the new REST API, the new web interface written in Angular, and updating some existing backend scripts and visualizations.

I would like to thank Robert Hubley and Arian Smit at ISB, Travis Wheeler at the University of Montana, and all the previous contributors to Dfam. Without them none of this would have been possible, and I am excited to be here working on this and other projects.

The opinions expressed herein are my own and do not necessarily represent the views of ISB or any of its collaborators.

2019-02-18

Washington 0.0.2

About six months have passed since I announced the Washington programming language, but not much progress has been made in that time. I have completed a draft specification and an interpreter, but much work remains to be done. The next steps are to define more presidents and write more example programs.