Modular legal styles: an MLZ how-to

As outlined in the last post, MLZ now supports modular code for legal styles. This is a major development that promises to dramatically simplify support for law across all CSL styles over time. This post lays out the basics of the new architecture.

The text below is a technical description that assumes a basic familiarity with Citation Style Language (CSL). If this is your first encounter with CSL, the learning curve is not steep. General guidance notes for CSL style authors are available on zotero.org. With that, the CSL Specification, the CSL-m Specification Supplement, these notes, and a copy of MLZ (the unofficial Zotero variant with multilingual and legal extensions), you have everything you need to set about implementing a legal style for your own jurisdiction.

Overview

To get started thinking in modular terms, imagine that we want to define a set of citation macros to be copy-pasted across a number of different styles. Our copy-paste code should provide the basic arrangement of our citations, but we want them to adapt to the “look and feel” of the style into which we insert them.

This is the problem of legal referencing exactly; legal citation styles are typically incorporated by reference into a “main” style. The Chicago Manual of Style (CMS), the style of the American Psychological Association (APA), and many others provide basic guidance on citing the law, but refer authors to specialist style guides such as The Bluebook: A Uniform System of Citation [paywall] for the details. Each legal citation guide focuses on a particular jurisdiction, in turn referring the reader to other guides for the specifics of “foreign” citation conventions. [1]

To adapt eclectic legal citations to the context of a “main” style, editors may apply back-referencing conventions that differ from those dictated by the legal style itself. Concerning the use of id. versus ibid., the use or non-use of the so-called “five footnote rule,” the method of back-referencing to earlier cites, and other details, the editor may (and should) apply a set of “localized” conventions to legal citations embedded in a given style.

To return to our copy-paste idea for CSL style code, let’s start with the assumption that any reasonable citation can be broken into four elements: (1) a title; (2) a main element; (3) a locator; and (4) a tail element. If we can generate these elements in the correct form for our legal style, we can compose a finished cite. [2] Macros for each of them are the building blocks from which we will construct our “portable” CSL code.

[1] In an odd twist, the Bluebook itself is an outlier that offers its own short summaries of foreign citation conventions, without reference to the foreign guides on which they are based. Although in some sense “uniform,” this material is best passed over in preference to direct application of more accurate and up-to-date stylesheets for individual jurisdictions.
[2] This is a tiny white lie, actually. Several variants of these macros are needed to make things completely portable. More on that below …

Designing templates for a main style

Since we plan to paste our law macros into arbitrary styles, we will want to give them distinctive names to avoid namespace clashes. Legal style modules in MLZ do this with a juris- prefix, so we will use that in our initial examples.

Full-form citations

For our first look at the “localisation” problem, let’s compare the respective forms for citing an English case in the style of the Oxford Standard for the Citation of Legal Authorities (OSCOLA) and the Bluebook. The two are nearly identical, but the latter is more fond of punctuation marks, and wants a comma to follow the case name:

OSCOLA
Donoghue v Stevenson [1932] AC 562 (HL) 564
Bluebook
Donoghue v. Stevenson, [1932] A.C. 562 (H.L.) 564

CSL-m can strip periods (or not) when rendering a macro; but to control that pesky comma, we need to compose the elements with different delimiters. We can solve that problem by composing our standard macro elements with slightly different templates, like this:

OSCOLA
<group delimiter=" ">
  <text macro="juris-title" strip-periods="true"/>
  <group delimiter=", ">
    <text macro="juris-main" strip-periods="true"/>
    <text macro="juris-comma-locator" strip-periods="true"/>
  </group>
  <text macro="juris-space-locator" strip-periods="true"/>
  <text macro="juris-tail" strip-periods="true"/>
</group>
Bluebook
<group delimiter=", ">
  <text macro="juris-title"/>
  <group delimiter=" ">
      <group delimiter=", ">
        <text macro="juris-main"/>
        <text macro="juris-comma-locator"/>
      </group>
      <text macro="juris-space-locator"/>
      <text macro="juris-tail"/>
  </group>
</group>

In the examples above, note the macros juris-comma-locator and juris-space-locator. The sole purpose of these macros is to render the CSL locator variable with appropriate labels and other decorations: if the variable is not available, they render nothing. The citeproc-js processor limits the locator variable to a single use within each cite, so if the first macro (juris-comma-locator) is composed to render only when a comma should join it to juris-main, our template will produce correctly formatted cites. Always.
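To see why the nesting of delimiters in the two templates above matters, here is a minimal Python sketch that mimics them. It is only a model, not how citeproc-js works internally: the helpers `group` and `strip_periods` are hypothetical stand-ins for the CSL group element and the strip-periods attribute, and the split of the Donoghue cite into a title, a main element, and a space-joined locator is an assumption made for illustration.

```python
def group(delimiter, *children):
    """Join the non-empty children, roughly as a CSL group joins rendered output."""
    return delimiter.join(child for child in children if child)


def strip_periods(text):
    """Crude stand-in for strip-periods="true": drop every period."""
    return text.replace(".", "")


def oscola_full(title, main, comma_locator="", space_locator="", tail=""):
    # Every element is rendered with periods stripped, then space-joined.
    title, main, comma_locator, space_locator, tail = (
        strip_periods(part)
        for part in (title, main, comma_locator, space_locator, tail)
    )
    return group(" ", title, group(", ", main, comma_locator), space_locator, tail)


def bluebook_full(title, main, comma_locator="", space_locator="", tail=""):
    # The outer group supplies the comma after the case name.
    inner = group(" ", group(", ", main, comma_locator), space_locator, tail)
    return group(", ", title, inner)


# Assumed breakdown of the example cite into portable elements.
elements = {
    "title": "Donoghue v. Stevenson",
    "main": "[1932] A.C. 562 (H.L.)",
    "space_locator": "564",
}
print(oscola_full(**elements))
print(bluebook_full(**elements))
```

Because empty elements drop out of each group, the same two functions also cope with a cite that has no locator at all, or one whose locator should be joined by a comma.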

The code above will work just fine for full-form citations, but as every Bluebook-trained American law student well knows, that is the easy part. For subsequent references, we must adapt the elements to fit the back-reference conventions of our main style (which might be Bluebook, but might be something else).

Immediate back-references

The simplest back-reference in most styles is a reference to the immediately preceding source. The specific form of the reference should follow the rules of the main style, and again practices vary. Back-references citing paragraph 12 of an English case cited in vendor-neutral form, and section 345 of another source, would appear as follows under OSCOLA and Bluebook rules:

OSCOLA
ibid [12]
ibid § 345
Bluebook
Id. at [12]
Id. § 345

In this example, the label (ibid or Id.) is supplied by the main style. The locators together with their styling ([12] and 345) are to be supplied by our copy-paste English law macros. The interesting bit is the connector “at,” which is added in the Bluebook style only if the locator has no other label. To accomplish this effect, we define a pair of “bare” locator macros in our copy-paste code, as follows:

<macro name="juris-locator">
  <choose>
    <if locator="paragraph">
      <text variable="locator" prefix="[" suffix="]"/>
    </if>
    <else>
      <text variable="locator"/>
    </else>
  </choose>
</macro>

<macro name="juris-locator-label">
  <choose>
    <if locator="paragraph page" match="none">
      <label variable="locator" form="symbol"/>
    </if>
  </choose>
</macro>

In our main styles, we can again use slightly different templates to render the locators appropriately. In the Bluebook style only, we will need a small macro to insert the “at” term:

Bluebook
<macro name="at-mac">
  <text value="at"/>
</macro>

Once that is in place, we can use the following templates to render our standard macro elements appropriately in each of the main styles:

OSCOLA
<group delimiter=" ">
  <text term="ibid"/>
  <text macro="juris-locator-label"/>
  <text macro="juris-locator"/>
</group>
Bluebook
<group delimiter=" ">
  <text term="ibid"/>
  <text macro="juris-locator-label" alternative-macro="at-mac"/>
  <text macro="juris-locator"/>
</group>

The alternative-macro attribute is a CSL-m extension to the official CSL language: it calls the macro named in its value when the primary module macro produces no output. Using the constructs above, our copy-paste macros can be used to produce correctly formatted back-references in both styles. Always.
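The fallback behaviour of alternative-macro in the Bluebook template can be modelled in a few lines of Python. This is a rough sketch under stated assumptions, not the processor's actual code: the function names mirror the macros above, `with_alternative` is a hypothetical helper standing in for the attribute, the label "Id." is hard-coded where the main style would supply its own term, and the `SYMBOLS` table holds only the one symbol-form label needed for the example.

```python
# Assumed symbol-form label terms; a real locale defines many more.
SYMBOLS = {"section": "§"}


def juris_locator(kind, value):
    """Model of juris-locator: paragraphs get square brackets."""
    return f"[{value}]" if kind == "paragraph" else value


def juris_locator_label(kind):
    """Model of juris-locator-label: nothing for paragraph or page locators."""
    if kind in ("paragraph", "page"):
        return ""
    return SYMBOLS.get(kind, "")


def with_alternative(primary, alternative):
    """Stand-in for alternative-macro: use the fallback only on empty output."""
    return primary if primary else alternative


def bluebook_ibid(kind, value):
    parts = [
        "Id.",
        with_alternative(juris_locator_label(kind), "at"),  # at-mac fallback
        juris_locator(kind, value),
    ]
    return " ".join(part for part in parts if part)


print(bluebook_ibid("paragraph", "12"))
print(bluebook_ibid("section", "345"))
```

The point of the design is that the module never needs to know about "at": it either produces a label or produces nothing, and the main style decides what to do with the silence.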

Other back-reference forms can be fashioned in a similar way, using our fixed set of “portable” copy-paste macros as building-blocks in simple templates.

Standard macros

The full set of macros needed to build a legal citation for any context works out to the following (applying the same juris- prefix that we used in the examples above):

juris-title
juris-title-short
juris-main
juris-main-short
juris-comma-locator
juris-space-locator
juris-locator
juris-locator-label
juris-tail
juris-tail-short

Jurisdiction modules

Now comes the fun part. If we prepare a legal style composed exclusively using the macros listed above, the citeproc-js CSL processor used by MLZ can load it into a main style when it encounters items from that jurisdiction. Because we have separated the formatting requirements of the legal style from those of the main style, cites will render correctly across all styles that call on the legal style module. Always.

A jurisdiction module is an ordinary CSL or CSL-m style that defines all of the macros above, and no others. It must be a valid CSL (or CSL-m) style, and so must contain a citation node, and may contain a bibliography node. It may be run directly as a style in its own right for testing purposes, but as jurisdiction modules only provide formatting for legal references, it will not normally be used directly in production. In MLZ, legal style modules are loaded on demand when the processor determines that their code is required.
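The "all of the macros above, and no others" rule is easy to check mechanically before installing a module. The sketch below is one hypothetical way to do so with Python's standard library; `check_module` is not part of MLZ or citeproc-js, and it assumes the module declares the usual CSL namespace (http://purl.org/net/xbiblio/csl) on its root element.

```python
import xml.etree.ElementTree as ET

# Assumed: the module uses the standard CSL namespace.
CSL_NS = "{http://purl.org/net/xbiblio/csl}"

REQUIRED_MACROS = {
    "juris-title", "juris-title-short",
    "juris-main", "juris-main-short",
    "juris-comma-locator", "juris-space-locator",
    "juris-locator", "juris-locator-label",
    "juris-tail", "juris-tail-short",
}


def check_module(xml_text):
    """Return (missing, unexpected) macro names for a module's XML source."""
    root = ET.fromstring(xml_text)
    defined = {macro.get("name") for macro in root.iter(CSL_NS + "macro")}
    missing = sorted(REQUIRED_MACROS - defined)
    unexpected = sorted(defined - REQUIRED_MACROS)
    return missing, unexpected
```

A module passes when both returned lists are empty. The check says nothing about whether the style is otherwise valid CSL, so it complements ordinary schema validation rather than replacing it.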

Module loading

In MLZ, legal items have a mandatory Jurisdiction field, populated from a controlled list of identifiers built from the Legal Resource Registry (LRR), a companion project to MLZ. An LRR jurisdiction identifier is a colon-delimited string. It may be followed by a court or institution identifier, separated by a semi-colon, but only the jurisdiction portion is used for style resolution purposes. As an example, the following identifier specifies the (District Court for the) Middle District of Tennessee in the United States:

us:c6:tn.md;district.court
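Taking such an identifier apart is straightforward. As a minimal sketch (the function name `parse_lrr` is hypothetical, and the snippet assumes at most one semicolon in the string), the jurisdiction portion can be separated from the court or institution and split into its colon-delimited elements like this:

```python
def parse_lrr(identifier):
    """Split an LRR identifier into jurisdiction elements and an optional institution."""
    jurisdiction, _, institution = identifier.partition(";")
    return jurisdiction.split(":"), (institution or None)


elements, institution = parse_lrr("us:c6:tn.md;district.court")
print(elements)     # the jurisdiction elements, most general first
print(institution)  # the court or institution, if any
```

Only the first return value matters for style resolution; the institution part is carried along for rendering purposes.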

To associate a law module with a particular jurisdiction, its ID and filename must adhere to a fixed schema. The filename must be composed as follows:

juris-<LRR-jurisdiction-id>[-variant].csl

The LRR jurisdiction identifier is mandatory; an arbitrary descriptive variant name is optional, and need not be used for unique styles that serve the target jurisdiction. A standard module for the District of Columbia (not the Federal Circuit) and its Bluebook variant would be written as follows:

juris-us:dc.csl

juris-us:dc-bluebook.csl

The ID of a jurisdiction module (in the info node of the module style) must follow the same schema, prefixed with the root URL of the CitationStylist project and with the .csl extension dropped:

<id>http://citationstylist.org/modules/juris-us:dc</id>
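The naming schema can be captured in a pair of small helpers. This is a sketch only: `module_filename` and `module_id` are hypothetical names, the root URL is copied from the example above, and the form of the ID for a variant module is inferred from the schema rather than shown in this post.

```python
# Root URL taken from the example ID above.
MODULE_ROOT = "http://citationstylist.org/modules/"


def module_filename(jurisdiction_id, variant=None):
    """Build juris-<LRR-jurisdiction-id>[-variant].csl"""
    suffix = f"-{variant}" if variant else ""
    return f"juris-{jurisdiction_id}{suffix}.csl"


def module_id(jurisdiction_id, variant=None):
    """Build the style ID: root URL plus the filename without its extension."""
    filename = module_filename(jurisdiction_id, variant)
    return MODULE_ROOT + filename[: -len(".csl")]


print(module_filename("us:dc"))
print(module_filename("us:dc", "bluebook"))
print(module_id("us:dc"))
```

Note that the colon is part of the filename, so the helpers make no attempt to escape it.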

Apart from these naming and metadata requirements, and the restriction of macro definitions to the list above (all of which must be defined), a jurisdiction module is just a standard CSL or CSL-m style, and can be validated and run in the usual way. If installed as a style in MLZ, it will be called upon automatically to format legal references from the target jurisdiction.

Jurisdiction resolution

When the processor encounters a macro with the juris- prefix, it will search for a suitable module based on the jurisdiction ID of the current item, and the preferred module variant, if any. Preferred variants can be set (optionally) as a comma-delimited list in the main style, via a style-options locale node:

<style-options jurisdiction-preference="babyblue,bluebook"/>

Modules are searched for among the installed MLZ styles in the following priority order:

  1. For each jurisdiction preference …
    • … a match attempt is made using the item’s jurisdiction ID …
    • … elements are dropped one by one from the end for each successive attempt …
    • … until a final attempt using the single top-level jurisdiction element.
  2. If a match is found, the module code is loaded and used to render the item.
  3. If a match is not found, the next preference is attempted.
  4. If all preferences (including the empty preference) fail, the style macro code is executed.
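The search order above can be sketched in Python as follows. This is an illustration of the steps as described, not the citeproc-js implementation: `candidates` and `resolve` are hypothetical names, the set of installed styles is represented by plain filenames, and the empty preference is assumed to be tried last, after the named variants.

```python
def candidates(jurisdiction_id, preferences):
    """Yield module filenames in the order they would be tried."""
    # Only the jurisdiction portion (before any semicolon) is used.
    elements = jurisdiction_id.split(";")[0].split(":")
    # Assumption: the empty preference comes after the named ones.
    for variant in list(preferences) + [""]:
        # Drop elements one by one from the end, down to the top level.
        for count in range(len(elements), 0, -1):
            base = "juris-" + ":".join(elements[:count])
            yield f"{base}-{variant}.csl" if variant else f"{base}.csl"


def resolve(jurisdiction_id, preferences, installed):
    """Return the first installed module, or None to use the style's own macros."""
    for filename in candidates(jurisdiction_id, preferences):
        if filename in installed:
            return filename
    return None


installed = {"juris-us.csl", "juris-us-bluebook.csl"}
print(resolve("us:c6:tn.md;district.court", ["babyblue", "bluebook"], installed))
```

With the installed set shown, no babyblue module exists at any level, so the search moves on and settles on the top-level Bluebook variant for the United States. A return value of None corresponds to step 4, where the main style's own juris- macros are executed.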

In lieu of conclusion

If you have reached this point in the post, you will understand the excitement that I feel over the implementation of modular CSL support for legal referencing. Styles prepared for the requirements of a specific jurisdiction can focus on getting the details right within their limited scope. If relied upon locally, there will be strong incentives to maintain quality. With modularity, styles from multiple jurisdictions can be combined in a single document, transparently and without conflicts, and legal support can be added to any of the 1,180 unique citation styles in the CSL repository with minimal effort. It’s a big win all around.

In terms of concrete benefits, CSL support for law opens the prospect of pushing reference managers and other third-party support tools into the legal research mainstream. While that is a very good thing if you happen to be involved in such a project, we can also anticipate public benefits from several vectors of innovation that have heretofore been in a state of stagnation:

Drafting efficiency
There is a reason why professionals in fields other than law have gravitated toward automated referencing systems over the past 30-odd years. Given quality metadata, generating citations automatically frees the researcher to concentrate more deeply on content. When automated referencing is tied to a well-designed research support platform, the gains are greater still.
Collaboration
MLZ is the first research tool to bridge the technological divide separating the law from other disciplines. It inherits from its pater familias Zotero a rich capacity for collaboration, which with the arrival of legal referencing support will reduce the barriers to research collaboration between lawyers and members of other disciplines.
Dissemination
The parsing of citation strings out of text documents has for years been a rite of passage for anyone engaged in legal archive development. In the U.S. jurisdiction, there have been important advances in recent years thanks to efforts by Eric Mill, Mike Lissner, Alan deLevie, and others. Robust parsers are essential for retrofitting legacy documents for electronic linking; but there are better ways for born-digital content. There is an attraction in principle to enriching official documents with embedded metadata at source; and modular style support brings that possibility one step closer.

Finally, this new offering puts paid to the silly notion that there is any sort of proprietary interest in citation styles. Referencing systems are just vehicles for enabling discourse. Like any other form of language, they do not belong to anyone—and the sooner we get past the misguided assumption that they might do, the better off we will all be.

This entry was posted in Announcements.