Phil's Musings notes from a peripatetic programmer

Recent Posts


Creating a "DSL" for GitHub's API (rewriting philschatz/octokit.js)

I originally wrote philschatz/octokit.js to interact with GitHub’s API using Promises. I started by forking an existing API michael/github and rewrote it using CoffeeScript and jQuery Promises. Then, thanks to contributions, they removed the dependence on Underscore and I removed the dependence on jQuery (and added support for several promise implementations).

But the code was getting too large as people incrementally added features so I thought it was time to rewrite with a focus on implementing as much of GitHub’s API concisely and consistently (with the hope of getting it adopted by GitHub officially).

Given my interest in Programming Languages, I decided to make a “Domain Specific Language” for GitHub (it’s in quotes because technically it’s still just Javascript but read on.

The results are at philschatz/octokat.js.

Necessary Features

This new library needed to:

  1. support ~100% of the GitHub API from the start
  2. have as little code as possible to maintain
  3. work in NodeJS and the browser
  4. have multiple source files
  5. have tons of unit tests
  6. support NodeJS callbacks and Promises

Let me quickly go through each of these and the challenges each presented.

Support ~100% of the GitHub API

This library abstracts all the authentication, status codes, caching, headers, HyperMedia (URL templates), and pagination returned from GitHub.

Abstracting Requests and Responses

Understanding how to form a request and how to parse the response is the bulk of the API; the logic is in src/request.coffee and src/replacer.coffee.

The other part is a chaining function that constructs a valid request.

Constructing Valid Requests

In the original philschatz/octokit.js there was a ton of copy/pasta dedicated to just constructing valid URLs.

Instead, this library has a regular expression that validates all URLs before calling GitHub and constructs objects dynamically through chaining (see src/grammar.coffee).

By reading the documentation at https://developer.github.com/v3/ a developer implicitly constructs a URL and then issues a request by calling one of the verb methods.

For example, to list all comments on an issue convert the documentation URL to the following:

GET /repos/:owner/:repo/issues/:number/comments

    .repos(owner, repo).issues(number).comments.fetch()

Here are 3 ways to list all comments on an issue:

octo = new Octokat()

REPO = octo.repos('octokit/octokit.rb') # for brevity

# Option 1: using callbacks
REPO.issues(1).comments.fetch (err, comments) ->
  console.error(err) if err
  console.log(comments) unless err

# Option 2: using Promises
REPO.issues(1).comments.fetch()
.then (comments) ->
  console.log(comments)

# Option 3: using methods on the fetched Repository object
REPO.fetch()
.then (repo) ->
  # `repo` contains returned JSON and additional methods
  repo.issues(1).fetch()
  .then (issue) ->
    # `repo` contains returned JSON and additional methods
    issue.comments.fetch()
    .then (comments) ->
      console.log(comments)

There are several verb methods to choose from:

  • .fetch() sends a GET and yields an Object
  • .read() sends a GET and yields raw data
  • .create(...) sends a POST and yields an Object
  • .update(...) sends a PATCH and yields an Object
  • .remove() sends a DELETE and yields a boolean
  • .add(...) sends a PUT and yields a boolean
  • .contains(...) sends a GET and yields a boolean

Pagination and HTML templates

GitHub returns headers for lists of results. These are automatically converted to nextPage, previousPage, firstPage, and lastPage methods on the resulting JSON.

Paging through all the issues on a repository looks like this:

# Create an issues object (but does not fetch anything yet)
ISSUES = octo.repos('octokit/octokit.rb').issues

# Option 1: with callbacks
ISSUES.fetch (err, issues) ->
  console.log(issues)
  issues.nextPage (err, moreIssues) ->
    console.log(moreIssues)

# Option 2: with Promises
ISSUES.fetch()
.then (issues) ->
  console.log(issues)
  issues.nextPage()
  .then (moreIssues) ->
    console.log(moreIssues)

Maintain a Minimal Amount of Code

With this API you are prevented from constructing an invalid URL because every request is validated against the URL_VALIDATOR regular expression before sending the request to GitHub.

Instead of writing largely copy/pasta code that constructs classes I opted for a regular expression that represents the entire GitHub API.

This can be found in src/grammar.coffee.

Works in NodeJS and the Browser

Getting this to work was a bit challenging.

NodeJS and AMD have slightly different syntaxes; enough to require adding in some boilerplate code to convert between the two.

Each of the source files contains 2 lines of boilerplate on the top of the file and the bottom.

Attempt 1

Originally, I had the entire library in a single file.

I was able to have very little boilerplate. Just something like the following:

@define ?= (cb) -> cb((dep) -> require(dep.replace('cs!', '')))

define (require) ->

  foo = require 'foo'

  ...

  module?.exports = Octokat # For NodeJS
  window?.Octokat = Octokat # For browsers
  return Octokat            # For browsers using AMD

This worked well but resulted in a single large file.

Attempt 2

I then split up the library into multiple files and got it working but ran into a problem when trying to build everything into one file.

I spent about a week trying to get jrburke/r.js and jrburke/almond to build a single file but could not get NodeJS, AMD, and the single file to all work at the same time.

Attempt 3

Finally, I opted for compiling the coffee files and concatenating them together with a custom define method similar to what other libraries do (see the require() method in less/less.js.

With this option, the tests run:

  • on the source coffee files in NodeJS
  • on the source coffee files in the browser
  • on the built dist/octokat.js file in the browser

Finally, multiple source files and tests running in all 3 environments!

Fixtures and Recording HTTP Requests

The library currently runs about 400 tests. There are about 120 unique tests, 80 are alternates using callbacks instead of promises, and 200 are the same tests but running in the browser.

In order to not pummel GitHub with duplicate requests I use linkedin/sepia to generate “cassettes” to replay the HTTP requests.

Unfortunately, sepia only runs on NodeJS so I wrote philschatz/sepia.js which plays back (and can record) linkedin/sepia tests.

Support NodeJS Callbacks and Promises

In order to support both callbacks and promises, the asynchronous methods (called verb methods) all support a function as the final argument and return a Promise.

This way, you can always end each line with a callback or with a .then() and the code just works.

The client only returns a Promise if one of the supported Promise libraries are detected (jQuery, angular, Q, or native Promises).

If you use one of those libraries but still prefer to use callbacks just make sure the callback is the last argument and the code will just workTM.

Keep Reading...

Removing jQuery, and adding Promises

This weekend I decided to remove jQuery from 2 of my libraries: css-polyfills.js and octokit.js.

It was motivated by requests from octokit users and by the horrible performance of css-polyfills.js on large textbooks (took ~1.5 hours).

Refactoring css-polyfills.js

I started with you-might-not-need-jquery and made incremental progress through the codebase. Initially my hope was to decrease the time by 50% but the refactor was only providing marginal improvements, until I hit the following line of code:

$el.find('#' + id)

This uses Sizzle to find an element by id. As soon as I turned it into the following…

document.getElementById(id)

BAM! the time dropped from ~1.5 hours down to 4.5 minutes.

The other annoying bit was that the DOM sometimes returns a live list, meaning if you remove an element, it is directly reflected in the NodeList.

To avoid the problem of iterating over live NodeList and removing some of them I wrapped it in _.toArray().

Refactoring octokit.js

This one was a bit easier.

octokit.js uses 2 features in jQuery: jQuery.ajax() and jQuery.Deferred.

ECMAScript 6 has native support for Promise (and the more fun generators) and it seems the XMLHTTPRequest object is not going away any time soon.

Fortunately, it does not do any DOM manipulation so refactoring this only required changing a few isolated spots in the code.

There were some benefits in making the change as well as drawbacks.

As a reward for being almost a decade ahead of its time, jQuery promises are almost, but not quote the same as ECMAScript Promises. They are constructed differently, have different names for functions, and have more features than the native Promises.

For example:

  • waiting on all Promises to complete in jQuery requires calling jQuery.when
  • jQuery promises have .notify() and .progress() methods
  • distinguish between waiting on a promise and extending one (.done() vs .then())
  • allow multiple arguments to .resolve()

For octokit this would have been a nice way to provide paginated results as a second argument.

Keep Reading...

Skeleton Auto-generation

In Building CSS skeletons and slots I showed how to style a piece of content based on where it occurs (the @contexts... parameters).

The code can be found at philschatz/skeleton-generator.css.

Here’s a discussion on the machinery needed to realize that.

Types and Templates

First, we need a way to describe the selector for a particular type. Let’s use a subfigure (figure inside a figure) for this example.

A typical figure would have styling like:

figure {
  border: 1px solid blue;
  figcaption {
    color: green;
  }
}

When split into slots and skeletons (let’s ignore @kind and @part for now) it would look like:

// The "skeleton":
figure {
  #content>#figure>.style();
  figcaption {
    #content>#figure>.caption();
  }
}

// The "slots":
#content {
  #figure {
    .style()    { border: 1px solid blue; }
    .caption()  { color: green; }
  }
}

Now, we have something to work with. To make a subfigure have a yellow border, the CSS would look something like:

figure {
  figure {                // Note: it's inside another figure
    border: 1px solid yellow;
  }
}

and the skeleton and slots would look like:

// The "skeleton" for a subfigure:
figure {
  figure {
    #content>#figure>.style(figure); // the arg denotes that this is
                                     // _inside_ another `figure`
    figcaption {
      #content>#figure>.caption(figure);
    }
  }
}
// The "slots" for a subfigure:
#content {
  #figure {
    .style(figure) { border: 1px solid yellow; }
  }
}

Now, we could write all possible permutations of figures, notes, tables, examples, etc in the skeleton file but that would be tedious. Instead, we can autogenerate them using the ... list expander in LessCSS.

First, we need to break up the figure into a couple pieces:

  1. the selector that distinguishes a figure from a .note or a .example
  2. the template of the “structure” inside a figure (it has a .style() and .caption())

For each @type (figure, note, table) the selector can be defined as:

#skeleton {
  .selector(figure) {
    figure { .x-template-helper(figure); }
  }
}

Assuming .x-template-helper does something (for now), this would allow us to create:

figure {
  figure {
    ...         // figures all the way down!
  }
}

… since the only thing this mixin defines is the selector figure { ... } (so it can be nested).

Next, we need to define what can be inside a figure (or subfigure). This is defined by the #skeleton>.template(figure) mixin. We can define it as:

#skeleton {
  .template(figure) {
    // All figures have a style and may have a caption.
    #content>#figure>.style();

    figcaption { #content>#figure>.caption(); }
  }
}

This says a figure has a .style() and a style on the caption.

So far we have defined both what a figure is (the .selector(@type)) and what it can contain (the .template(@type)).

Now we need to combine these to generate the expected CSS:

figure {
  figure {
    border: 1px solid yellow;
  }
}

To do it, we need to build a bit more machinery first!

List evaluation in LessCSS

There are several @types we need to permute. In this example there are only figures, but for a book there will be several types (ie figure, note, table, example).

The generator needs to iterate over all these types so for now let’s put them in a variable.

@all-types: figure, note, table, example;

LessCSS has a way way of going through and recursively evaluating a list:

.recursive() { } // base case
.recursive(@head; @tail...) {
  content: @head;
  .recursive(@tail...); // The `@tail` list is expanded to
                        // multiple arguments
}

Using this structure and the .selector(@type) and .template(@type) mixins we can create the following:

.build-children(@type; @rest...) {
  .selector(@type);
  // The recusion step is in `.x-template-helper`
}

// This mixin was "assumed" above but is defined here.
// It is used in the `.selector()` mixin.
.x-template-helper(@type) {
  .template(@type);
  // Recurse!
  .build-children(@all-types...);
}

In order to prevent an infinite loop, we define a @max-depth and add a @depth param to the mixins:

.build-children(0; ...) { } // base case
.build-children(@depth; @type; @rest...) {
  .selector(@type);
  .build-children(@depth; @rest...);
}

.x-template-helper(@type) {
  .template(@type);
  .build-children((@depth - 1); @all-types...);
}

Now, when .build-children(2; @base-types...) is called, we get the following:

figure {
  border: 1px solid blue;       // Recall figures have a blue border
  figcaption {
    color: green;               // Recall figure captions are green
  }
  figure {
    border: 1px solid yellow;   // Recall subfigures have a
                                // yellow border
  }
}

Which is the desired result.

Custom classes

So far, we’ve ignored the @kind and @part parameters to mixins. In this section we can add them back in.

The @part parameter describes which part of the book the content is in. Some examples are preface, chapter, or appendix. The @kind parameter describes the custom class on a piece of content. These are specific to a book (.how-to for physics or .timeline for a history book).

To handle the @part (preface, chapter, appendix, etc) we merely need to add a @part parameter to .build-children().

To handle the @kind parameter we need to let the skeleton-generation code know what are all the possible classes (@kind) for a given type (figure, example, section, etc).

To do this, we add a “function” mixin to the #content namespace which will “return” a list of classes for each type. Because these are specific to each book they are defined as a #content.kinds(@type) mixin that “returns” by setting a @return variable.

For example, we could have several classes on a figure: .full-widthand .half-width.

The expected CSS would be:

figure.full-width { width: 100%; }
figure.half-width { width: 50%;  }

To define these for the skeleton-generator and the add styling in the slots, it would look like:

#content {
  .kinds(figure) {
    // all the possible classes on a `figure` for this book:
    @return: 'full-width', 'half-width';
  }

  #figure {
    .style('full-width') { width: 100%; }
    .style('half-width') { width: 50%; }
  }
}

Then, we can add a @kind parameter to the definition of .selector(@type) and change the corresponding definitions of .build-children() and .x-template-helper().

Custom classes in @contexts...

So far, the @contexts... arguments only contain a list of types. For example, styling any exercise inside an example inside a chapter is done like this:

#content {
  #exercise {
    .style(any; chapter; example) { ... }
  }
}

To style an exercise inside a .how-to example we need a bit more information; namely the class .how-to. To do this, we can use the #content>.kinds(@type) mixin defined above when permuting.

It looks something like:

.x-helper-base-types(@depth; @contexts; @type; @rest...) {
  .selector(@type; @depth; any; @contexts...);

  // Loop for any custom `@kind`s
  #content>.kinds(@type);
  @kinds: @return;
  .x-helper-custom-types(@depth; @contexts; @type; @kinds...);

  // recurse
  .x-helper-base-types(@depth; @contexts; @rest...);
}

Now, we can style a figure based on:

  • the class
  • the context @types
  • the context @kinds (classes)

Optimizing

Checking all possible combinations of @types is computation and memory intensive. Fortunately, we can short-cut several of these. For example, a figure can never contain a section or an example

Instead of using a global @all-types for recursion, we can recurse based on valid children by defining a mixin that “returns” the valid children.

This would look like:

#skeleton {
  #content {
    .children(figure) { @return: figure, table; } //style subfigures
                                                  //and tables
                                                  //inside a figure
    // Defines the "root" children for a piece of content
    .children(none) { @return: table, figure, exercise, example; }
  }
}

This mixin is used in .x-helper-base-types to restrict the list of children that are permuted.

Now, we can quickly style a figure based on:

  • the class
  • the context @types
  • the context @kinds (classes)
Keep Reading...

Building CSS skeletons and slots

Simple Slots and Skeletons

We are transitioning from 1 output format for CSS (Docbook HTML) to multiple HTML formats (PDF, EPUB, web).

To reuse the same CSS we split the selectors and rules into mixins that are namespaced based on their logical part in the book. Instead of a .note { color: blue; } we split it into 2 parts:

// The skeleton:
.note { #content>#note>.style(); }

// and the slot:
#content {
  #note {
    .style() { color: blue; }
  }
}

This allows us to reuse the logical styling when the HTML format changes.

Now, styling a book merely involves filling in the right slots.

Slot parameters

Custom classes

Often we have custom “features” in a book. These are often written and as notes or sections with a custom class name.

To style these, every slot has a @kind parameter as the 1st argument to the mixin. Here is an example of a “How To” feature:

#content {
  #note {
    .style('how-to') { color: orange; }
  }
}

Parts of a book

But sometimes it is necessary to style the feature differently based on which part of the book it is in. Most commonly this is used for numbering. For example, a Figure in a chapter would be labeled Figure 4.3 but in an appendix it would be labeled Figure A3.

To support this, the 2nd argument is the @part. For most slot definitions this will just be any.

Contexts

When styling a book, it is often important to know what an element is in. For example, a worked-out example is actually an exercise inside an example. It should not be numbered (since it is not a homework problem) and might be rendered with text other than Problem and Solution.

To handle this case, the final arguments to a mixin are @contexts....

Here is how to style an exercise inside an example:

#content {
  #exercise {
    .style(any; @part; example) { color: green; }
  }
}

This makes any exercise (regardless of class) in any @part that occurs inside an example green.

If you wanted to make any exercise inside a how-to example yellow you would write:

#content {
  #exercise {
    .style(any; @part; example; 'how-to') { color: yellow; }
  }
}

Similarly, any exercise in an example in the review-questions section at the end of a chapter would be something like:

#content {
  #exercise {
    .style(any; chapter; end-of; section; 'review-questions'; example) { color: aqua; }
  }
}

That’s about it for the intro to slots and skeletons. I should have another post on the namespace organization of a book and a post on how the machinery needed to get the @contexts... parameter to work.

Keep Reading...

Pattern matching with LESS (CSS)

Background

Pattern-matching mixins is the distinguishing characteristic that separates Less from other CSS preprocessors like Stylus and Sass.

Let the compiler do what it’s good at: match templates and error early (when one is missing).

LESS is a beautiful language (like CSS) precisely because it is not Turing-complete. Instead, it explores just how far you can take a language with those restrictions.

For example, you can provide conditionals using the pattern-matching when keyword instead of relying explicit control flow like if then else. You can match arguments to mixins either by value or with ... (like Ruby, CoffeeScript, and recent versions of Java).

This frees the author up to put conditional styles elsewhere in a file (much like XSLT templates). Similarly, mixins whose arguments match are always applied. This can initially be an annoying feature but for styling textbooks this is central in our design.

Case Study: Making Textbooks

In textbooks we need to style elements differently depending on where they are (context is crucial).

A subfigure (figure > figure) should render differently than a figure.

A table should be numbered differently when it is in a chapter (ie Table 4.3) than when it is in an appendix (Table A3).

Or take problems and solutions: a problem and solution at the end of a chapter should be “treated” differently than an example with a worked out problem and solution. The former should be numbered (and solution should be moved to the back of a textbook) while the latter should be included in the example (since it is a worked-out example).

Or take a definition of a term. If an equation is used then it should not be numbered; referring to an equation inside a definition does not make a whole lot of sense.

In each of these cases it is the context that is important when deciding how to style an element.

Additionally, the HTML elements differ for different outputs; online the CSS selectors are different than for a single-page book or a multi-page EUPB.

Solution: Slots and Skeletons

To handle multiple HTML formats (EPUB, online, PDF) we came up with the idea of separating the CSS selectors (skeleton) from the rules (slots).

Each book contains a very similar HTML structure (depending on the output format; PDF, EPUB, online) but very different styling (fonts, colors, numbering schemes) so we split the CSS into 2 types of files: slots for the styling and skeletons for the HTML structure and linked them together using a namespaced tree of mixins that represent the logical structure of a book.

In this way a page header would be defined as (the slot):

#page>#top>.outside() { content: 'Chapter 1'; } and used as (the _skeleton_):

@page:left { @top-left: #page>#top>.outside(); }

This allows us to separate the styling (slots) from the content (skeleton).

Context

Getting back to pattern matching; the context of an element (ie “Exercise”) is important for styling that element. To accomplish this, we created rules for defining mixins and special “generators” for all the possible permutations of contexts.

In our system a mixin has roughly the following signature: #logical-type>.mixin(@kind; @part; @contexts...) { }.

Let’s go through the various parts:

  • #logical-type roughly corresponds to the structural element (and data-type attribute). Some examples are exercise, note, term, example
  • .mixin(...) is the kind of slot (.style(...) for the whole element, .title(...) for the title, .numbering(...) for how to number the element, etc)
  • @kind corresponds to a specific class for a particular book (a science book may have experiment while a history book may have did-you-know)
  • @part represents which part of a book the element occurs in (preface, chapter, appendix, any) and is largely used for numbering an element
  • @contexts... represents what this logical-type occurs inside (and where all the fun happens). Some examples would be #figure>.style(any; any; figure) { } (a subfigure), or #table>.style(any; any; glossary) (a table inside a glossary)

This gives us the flexibility to style the content of a book based on the logical parts (using namespaces) and the context it occurs in (using @contexts...) and have it work for different output formats (by having different skeleton files).

To accomplish the @contexts... bit we created some mixins that generate all permutations of different contexts. This is accomplished by adding list-expansion to LESS which allows us to expand a list when calling a mixin. For example:

.context-expander(@head; @tail...) {
  .mixin-call(@tail...);        // <-- The new piece added to LESS
}

Note: For most of these permutations, no mixins will match so they will be omitted from the generated CSS.

The result is a single skeleton file for each output format (EPUB, PDF, online) and a slots file (~100 lines) for each book.

Keep Reading...