STX Documentation

STX

STX is a markup language for creating documents. It is designed to be simple enough to be easily read and written but at the same time powerful enough to allow creating complex documents.

A document in STX is generated by a single self-describing text file that must comply with the syntax so it can be compiled. External content can be included from other resources by using directives.

Some of the patterns in the syntax were inspired by other markup languages like Markdown and AsciiDoc, to keep the barrier to entry low.

Command Line Interface

The current STX implementation was written in Python and can be installed with pip.

# Install the latest stx version
pip install stx

# Process a document
stx index.stx

# Watch a document for changes
stx index.stx --watch
Installing and processing a document.

The source code can be found in the python-stx repository on GitHub.

Example

This document is a living example of the capabilities of STX. Its source code can be found in the file index.stx in the docs repository on GitHub.

Features

Most of the features are geared toward creating technical and academic documents.

Structured Documents

The text written in STX generates documents represented by a hierarchical component structure. Building a document using components makes it possible to have “infinite” nesting and the option to render it to multiple output formats.

Nesting rules are based on the text alignment. Whenever there is a block mark, content can be nested by aligning the subsequent text with the content of that mark. The nesting can be ended just by breaking the alignment or by using a % (percentage symbol).

|= Column 1 | Column 2
|- Cell     | - Nested list (item 1).
              - Item 2 with nested code block.
                +++ code
                This is nested in item2.
                +++
Example of nested components.
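The alignment rule can be illustrated with a short Python sketch. It is not taken from the python-stx source: the helper name content_column and the subset of block marks it recognizes are assumptions made for illustration. It returns the column at which the content of a block mark begins, which is the column that nested lines would have to align with.

```python
import re
from typing import Optional

# Hypothetical subset of block marks, chosen only for this illustration:
# table rows (|= and |-), sections (= to ======), and list items (- and .).
_MARK = re.compile(r"^(\s*)(\|=|\|-|={1,6}|-|\.)(\s+)")


def content_column(line: str) -> Optional[int]:
    """Return the 0-based column where content starts after a leading block mark.

    Returns None when the line does not begin with one of the marks above.
    """
    match = _MARK.match(line)
    if match is None:
        return None
    return match.end()
```

For the line "- Item 1" the sketch returns 2, so a nested component would start at column 2 on the following lines.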

Cross References

Since STX documents are structured, cross references are easy to generate and validate. Text between brackets is considered a link; the target reference can be customized by appending it between parentheses. Broken references are detected during validation and produce a warning.

The matching algorithm for validating references ignores case and only considers words made of ASCII letters and numbers. A reference like [this] is equivalent to [THIS].

Only sections can be referenced automatically, by using their heading text. If a section would create a duplicate reference, the algorithm appends a number to it.
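The matching rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the python-stx implementation: the function names are hypothetical, and the exact form of the number appended to duplicates (here a space followed by a counter starting at 2) is a guess.

```python
import re


def normalize_ref(text):
    """Lower-case the text and keep only words made of ASCII letters and digits."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return " ".join(word.lower() for word in words)


def register_ref(text, taken):
    """Return a unique reference key for `text` and record it in the set `taken`.

    On a collision a counter is appended (assumed scheme: "key 2", "key 3", ...).
    """
    base = normalize_ref(text)
    key, counter = base, 1
    while key in taken:
        counter += 1
        key = f"{base} {counter}"
    taken.add(key)
    return key
```

With this sketch, normalize_ref("this") and normalize_ref("THIS") produce the same key, so both links would resolve to the same target.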

Custom references are supported by adding the ref attribute to any component.

@ref(`other options`, `more options`)
- Other option 1.
- Other option 2.

Please see [other options], it also works with [more options].
Custom references.

Numbering

Since the documents in STX are organized by sections, they can be easily numbered. Every section, figure and table is numbered automatically.

Additionally, there is a function for generating a Table of Contents. All numbers of sections, figures and tables of this document were auto-generated as well as the table of contents.

Output Formats

While developing a new output format is not a trivial task, STX is designed to be modular so that output formats can be plugged in. The currently supported output formats are described in this section.

Since STX documents are stand-alone, the output format is specified by using directives in the document itself. A document can have any number of output formats.

HTML

This output format generates one single HTML5 file. The document structure is completely based on the HTMLBook specification, a great unofficial draft from O’Reilly Media, Inc.

#output(format: html, target: index.html)
Specifying the HTML output format in a document.

Since this is the main format for rendering STX documents, some directives are optimized for HTML, for example:

#stylesheets(
    assets/layout.css,
    assets/style.css
)
Linking CSS stylesheets in a document.

Another useful feature for creating HTML documents is the embed function, which integrates the content from another file directly in the document.

<embed(assets/layout.html)>
Embedding an HTML fragment into the document.

JSON

This output format generates a JSON file with the raw structure of the document. This format is not optimized for human reading but can be useful for debugging and for use with other tools.

#output(format: json, target: index.json)
Specifying the JSON output format in a document.

Syntax

The STX syntax is stricter than that of other markup languages. A malformed component causes a compilation error.

Since this is a markup language as well, the marks in the text together with the text alignment make it possible to create complex documents. This section describes the different types of marks and their syntax.

Block Components

Components of this type can contain other components; they are the building blocks of a document.

Block marks are not recognized inside inline content. Consequently, paragraphs cannot begin with a block mark.

When several block components are defined sequentially, they can even be considered as a single composite component.

Sections

There are 6 levels of sections. They are created by repeating the = (equal symbol); the number of symbols represents the level of the section. The content of the section must be aligned with the section mark.

=== Heading Content

Section Content

Section syntax.

Unlike in other markup languages, sections here are not just titles: they contain all subsequent components until a section of the same or a higher level (an equal or smaller number of = symbols) is found.

= Section A (1.)
== Section B (1.1, inside A)
=== Section C (1.1.1, inside B)
=== Section D (1.1.2, inside B)
=== Section E (1.1.3, inside B)
== Section F (1.2, inside A)
=== Section G (1.2.1, inside F)
=== Section H (1.2.2, inside F)
= Section I (2.)
== Section J (2.1, inside I)
== Section K (2.2, inside I)
Example of nested sections.
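The containment rule also determines the section numbers shown in the example. A minimal Python sketch, assuming sections are given only as a list of heading levels in document order (the function name number_sections is hypothetical and this is not the python-stx code):

```python
def number_sections(levels):
    """Given heading levels in document order, return dotted numbers such as '1.2.1'."""
    counters = []
    numbers = []
    for level in levels:
        # Forget the counters of deeper levels; they restart under the new heading.
        del counters[level:]
        # Pad with zeros if a level was skipped, then bump this level's counter.
        while len(counters) < level:
            counters.append(0)
        counters[level - 1] += 1
        numbers.append(".".join(str(count) for count in counters))
    return numbers
```

Feeding it the levels of sections A to K above, [1, 2, 3, 3, 3, 2, 3, 3, 1, 2, 2], yields 1, 1.1, 1.1.1, 1.1.2, 1.1.3, 1.2, 1.2.1, 1.2.2, 2, 2.1 and 2.2, matching the numbers in the example.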

Lists

There are two kinds of lists, unordered and ordered. They are created by using the - (dash symbol) and the . (period symbol) respectively.

- Item 1 Content
- Item 2 Content
- Item N Content
Bulleted list syntax.

. Item 1 Content
. Item 2 Content
. Item N Content
Numbered list syntax.

The content of the items can be any supported component as long as the nesting rules are followed.

Tables

Tables are created by gathering rows. The mark for heading rows is |= (pipe and equal symbols) and the mark for normal rows is |- (pipe and dash symbols).

The content of a row represents a cell. More cells can be added to the row by using the | (pipe symbol). This cell mark can be used inline; in other words, multiple cells can be defined in a single line.

|= Heading 1 Content | Heading 2 Content | Heading N Content
|- Cell 1,1 Content  | Cell 2,1 Content  | Cell N,1 Content
|- Cell 1,2 Content  | Cell 2,2 Content  | Cell N,2 Content
|- Cell 1,N Content  
   | Cell 2,N Content
   | Cell N,N Content
Table syntax.

The number of cells per row is not required to match.
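As a rough illustration of the row and cell marks, the following Python sketch splits a single-line row into its cells. It is a simplification under stated assumptions: the name parse_row is hypothetical, the row must fit on one line, and pipe symbols inside code samples or nested components are not handled.

```python
def parse_row(line):
    """Split one table row into (is_heading, cells).

    Assumes the whole row is written on a single line.
    """
    stripped = line.strip()
    if stripped.startswith("|="):
        is_heading = True
    elif stripped.startswith("|-"):
        is_heading = False
    else:
        raise ValueError("not a table row")
    # Everything after the row mark is cell content, separated by the cell mark.
    cells = [cell.strip() for cell in stripped[2:].split("|")]
    return is_heading, cells
```

Because each row is parsed on its own, rows with different numbers of cells are accepted, as described above.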

Captions

Captions are not components per se, they are used to create or change other components:

  • When using captions with tables, the caption becomes part of the table.
  • Captions used with any other component create a figure; the other component gets wrapped by the figure.

There are two ways to apply a caption to a component:

  • Before the component: They are created by using :: (two colon symbols) and must be placed before the target component at the same alignment.
  • After the component: They are created by using :^ (colon and caret symbols) and must be placed after the target component at the same alignment.

:: Caption Content

Target Component

Pre-caption syntax.

Target Component

:^ Caption Content

Post-caption syntax.

Literal Block

This component is allowed to contain raw text which won't be parsed by STX. The text must be delimited by +++ (three plus symbols); one line before and one line after.

+++
This text is not parsed by *STX*, so it _should_ be rendered as it is.
+++
Simple literal component.

Optionally, the raw text can be processed by a function to decorate it or create a richer component. The function must be indicated by an entry right after the first +++ mark (there can be any number of spaces in between).

+++ code:js
console.log('Hello world!');
+++
Code block component produced by a processed literal.

See the Functions section for more details.
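The shape of a literal block can be sketched in Python as follows. This is only an illustration, not the python-stx parser: the name read_literal_block is hypothetical, and the entry is returned as raw text instead of being interpreted.

```python
def read_literal_block(lines):
    """Split the lines of a literal block into (function_entry_or_None, raw_text).

    `lines` must start with the opening +++ line and end with the closing +++ line.
    """
    if len(lines) < 2:
        raise ValueError("a literal block needs an opening and a closing mark")
    first, last = lines[0].strip(), lines[-1].strip()
    if not first.startswith("+++") or last != "+++":
        raise ValueError("not a literal block")
    # Anything after the opening mark is the optional function entry.
    entry = first[3:].strip() or None
    # The lines in between are kept verbatim; no marks are interpreted.
    return entry, "\n".join(lines[1:-1])
```

For the code block example above, the sketch returns the entry code:js together with the untouched JavaScript line.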

Capturing Block

This component creates a group of components which can be processed by a function to decorate them or create a richer component. The components must be delimited by {{{ (three left curly braces) and }}} (three right curly braces), one line before and one line after. The function must be indicated by an entry right after the first {{{ mark (there can be any number of spaces in between).

{{{ information
This text will be shown as an information admonition.
}}}
Admonition box produced by a captured paragraph.

See the Functions section for more details.

Content Breaks

Any block component can be broken by using % (percentage symbol).

= Section A

== Section B

Content of section B. Next block mark will break this section.

% 

This content now is inside the Section A.
Section content broken.

Paragraphs

Any other sequence of characters that doesn't match a block mark is considered a paragraph. Paragraphs are composed of a sequence of inline components and are ended by an empty line.

This is the paragraph #1.

This is the paragraph #2.
This line is part of the paragraph #2.
Two paragraphs.
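The empty-line rule can be sketched in a few lines of Python. The sketch assumes that the input contains only paragraph text (no block marks) and that the lines of one paragraph are joined with a single space; both are simplifications, and the name split_paragraphs is hypothetical.

```python
def split_paragraphs(text):
    """Group consecutive non-empty lines into paragraphs; empty lines separate them."""
    paragraphs, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            # An empty line closes the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Applied to the example above, it produces two paragraphs, the second one containing both of its lines.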

Inline Components

Rich Text

There are some delimiter marks that can decorate the surrounded inline content for producing richer components. These marks can be nested as any other component in STX.

Supported delimited text components.
|= Mark | Description | Example
|- * (asterisk symbol) | Strong text: normally rendered as bold text. | Note: *This* is important!
|- _ (underscore symbol) | Emphasized text: normally rendered as italic text. | A.K.A text in _italics_.
|- ` (grave accent symbol) | Code sample text: normally rendered with a monospace typeface. | HTML uses the `<code>` tag.
|- ~~ (two tilde symbols) | Strike-through text: normally rendered with a horizontal line through its center. | Use ~~one~~ two tilde symbols.
|- "" (two double quote symbols) | Inline typographic quotation primary marks: the text is surrounded with “ and ”. | ""Primary"" marks.
|- '' (two single quote symbols) | Inline typographic quotation secondary marks: the text is surrounded with ‘ and ’. | ''Secondary'' marks.

Symbols

There are some combinations of characters that generate specific symbols:

|= Character Sequence | Symbol
|- ... (three period symbols) | … (ellipsis)

Inline Capturing

Similar to capturing blocks, inline content can be captured by surrounding it with { and } (left and right brace symbols). Capturing alone changes little: the text is only grouped, and the rendered result is normally the same.

""Light My Fire"" is a song by {the American rock band *the Doors*}.
The captured text is the American rock band *the Doors*.

In order to generate richer content, the captured content can be processed by appending a function call immediately after the } (right brace symbol).

Function Call

To create a function call, an entry must be surrounded with < and > (left and right angle bracket symbols). Function calls can receive captured content as well as the arguments indicated in the entry.

The name of the entry indicates the function, and its value holds the direct arguments. Captured content can be passed as an argument by putting it immediately before a function call.

Click on the folder icon (<img: folder.png>).

Links are highlighted with {red}<color:#FF0000>.
Different ways to create function calls.
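To make the two forms concrete, the following Python sketch locates inline function calls and the captured content that precedes them. It is a deliberately naive illustration: the name find_calls is hypothetical, nested braces are not supported, and angle brackets inside code samples would be matched by mistake.

```python
import re

# Optional {captured content} immediately followed by <entry>.
_CALL = re.compile(r"(?:\{(?P<captured>[^{}]*)\})?<(?P<entry>[^<>]+)>")


def find_calls(text):
    """Return a list of (captured_content_or_None, raw_entry) for each inline call."""
    return [
        (match.group("captured"), match.group("entry").strip())
        for match in _CALL.finditer(text)
    ]
```

In the first example line the call has no captured content, while in the second the captured text red is passed to the color function.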

Attributes

All components can receive attributes. They are used mainly to change how the component is rendered or to provide meta information.

Attributes must be specified before the component and are created with an @ (at symbol) followed by an entry. The entry represents the name and value of the attribute.

@type:appendix
= Appendix Title

Appendix content
Section marked as an appendix by using attributes.

The attributes accepted by all components are described in the table below.

Global attributes for components.
|= Attribute | Accepted Values | Description
|- ref | Token or Group of tokens. | Defines how the component can be referenced by links.

Directives

Directives are instructions for the STX compiler. They are created by using a # (number sign) followed by an entry representing the name and arguments of the directive.

Unlike functions, which produce a component and are evaluated after the document is parsed, directives are evaluated as soon as they are found in the document and do not necessarily produce components.

Include Directive

Accepts a Token or a Group of tokens. The tokens represent the files to be included at the place where the directive appears.

The specified files are resolved relative to the current STX file. If an argument refers to a folder, the entire content of the folder is included recursively.

#include(appendix.stx)

#include(appendix.stx, glossary.stx)
Include examples.
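The path resolution described above can be sketched with Python's pathlib. The function name resolve_includes is hypothetical and the sorted ordering of folder contents is an assumption; the sketch only shows how targets relative to the including file could be expanded.

```python
from pathlib import Path


def resolve_includes(current_file, targets):
    """Expand include targets relative to the including file.

    Folders are walked recursively; plain file targets are returned as they are.
    """
    base = Path(current_file).parent
    resolved = []
    for target in targets:
        path = base / target
        if path.is_dir():
            # Sorted for a deterministic order; the real ordering is not documented.
            resolved.extend(sorted(p for p in path.rglob("*") if p.is_file()))
        else:
            resolved.append(path)
    return resolved
```

For a document at docs/index.stx, the second example above would resolve to docs/appendix.stx and docs/glossary.stx.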

Output Directive

Indicates the output format to generate after processing the document. Accepts the following arguments:

  • format: Built-in values are html and json.
  • target: The file where the document should be generated.
  • options: Format-specific options.

#output(format: html, target: ./index.html)
Output directive example.

Stylesheet Directive

This directive exists because HTML and other formats support the use of stylesheets. It accepts a Token or a Group of tokens representing the stylesheet files.

#stylesheets(
    assets/layout.css,
    assets/style.css
)
Stylesheets directive example.

Meta-data Directives

There are some directives for defining document attributes. These values are not necessarily rendered in the document.

  • title: Defines the title of the document.
  • author: Defines the author of the document.
  • encoding: Defines the encoding of the document.

#title(`STX Documentation`)
#author(`Sergio Pedraza`)
#encoding(`UTF-8`)
Meta-data directives example.

Functions

In STX, functions can receive plain values and components as arguments to generate new components. This section describes the built-in functions; because the design is modular, more functions can be registered at runtime.

Functions can be invoked by using the following components: literal blocks, capturing blocks and inline function calls.

All allowed components can use plain values as arguments by sending them in the entry.

Code

This function is invoked with the code keyword and accepts the following plain arguments:

Literal text should be sent as the argument. If a rich component is sent, it is converted to plain text and a warning is raised.

If the language is supported by some registered grammar, the result will be tokenized, otherwise the result will be just marked as a code block.

Images

This function is invoked with the img or image keyword and accepts the following plain arguments:

The resulting component will be an image inserted in the document.

Embed Content

This function is invoked with the embed keyword and accepts the following plain arguments:

The resulting component will be a literal text with the content of the file.

To embed the components of an STX file, use the include directive instead.

Table of Contents

This function is invoked with the toc keyword and accepts the following plain arguments:

The resulting component will be a table of contents of all sections defined in the document.

Admonitions

This function can be invoked by using the information and warning keywords.

The resulting component will be a block marked as an admonition of the type of the keyword.

Data Syntax

Apart from the rich content of the components, in STX there is a syntax for entering structured data which is used to invoke and pass arguments to functions, define directives and attributes, among other uses.

This way of entering data is designed to be easy for a human to write and read. It consists of the three types of data described below.

Token

Represents a final value. Depending on the context, it can be interpreted as text, a number, a boolean, etc.

There are two ways of writing a token:

  1. Direct: Any sequence of letters, digits, underscores (_), dashes (-), periods (.) or slashes (/).
  2. Delimited: Any sequence of characters between grave accents (`); special characters can be escaped by using a \.

012
abc
` `
assets/example.txt
Example of valid tokens (one per line).
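The two token forms can be approximated with regular expressions. The Python sketch below is an illustration only: the names are hypothetical, letters are assumed to be ASCII, and a backslash is assumed to escape exactly the character that follows it.

```python
import re

# Direct token: letters, digits, underscores, dashes, periods or slashes.
DIRECT = re.compile(r"[A-Za-z0-9_\-./]+")
# Delimited token: anything between grave accents; a backslash escapes the next character.
DELIMITED = re.compile(r"`((?:\\.|[^`\\])*)`")


def read_token(text):
    """Return the token value if `text` is exactly one token, otherwise None."""
    if DIRECT.fullmatch(text):
        return text
    match = DELIMITED.fullmatch(text)
    if match:
        # Drop the escaping backslashes to obtain the literal value.
        return re.sub(r"\\(.)", r"\1", match.group(1))
    return None
```

All four example lines above are accepted by this sketch; the third one is a delimited token whose value is a single space.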

Entry

An entry represents a named value. The name is a token followed by a : (colon symbol) as a separator, and the value can be either a token or a group. When the value is a group, the separator can be omitted.

genre: jazz
song(artist: `Miles Davis`, name: `So What`)
fibonacci(0, 1, 1, 2, 3, 5, 8, 13, 21, ...)
file: folder/index.html
Example of valid entries (one per line).

Group

A sequence of values (token, entry or group) delimited by parentheses () and separated by commas (,).

(artist: `Miles Davis`, name: `So What`)
(0, 1, 1, 2, 3, 5, 8, 13, 21, ...)
(jazz, (bebop, cool))
Example of valid groups (one per line).
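The three data types fit together as a small recursive grammar. The following Python sketch parses them under stated assumptions: it is not the python-stx parser, all names are hypothetical, tokens are kept as strings, entries are represented as single-key dictionaries and groups as lists. It is also more permissive than the description above, for instance it allows spaces between a name and its group.

```python
import re

_TOKEN = re.compile(r"[A-Za-z0-9_\-./]+|`(?:\\.|[^`\\])*`")


def parse_value(text, pos=0):
    """Parse a token, entry or group starting at `pos`; return (value, next_pos)."""
    pos = _skip(text, pos)
    if text.startswith("(", pos):
        return _parse_group(text, pos)
    match = _TOKEN.match(text, pos)
    if match is None:
        raise ValueError(f"expected a value at position {pos}")
    name, pos = _unquote(match.group()), _skip(text, match.end())
    if text.startswith(":", pos):  # entry with an explicit separator
        value, pos = parse_value(text, pos + 1)
        return {name: value}, pos
    if text.startswith("(", pos):  # entry whose value is a group
        value, pos = _parse_group(text, pos)
        return {name: value}, pos
    return name, pos  # plain token


def _parse_group(text, pos):
    items, pos = [], _skip(text, pos + 1)  # step over "("
    while not text.startswith(")", pos):
        item, pos = parse_value(text, pos)
        items.append(item)
        pos = _skip(text, pos)
        if text.startswith(",", pos):
            pos = _skip(text, pos + 1)
        elif not text.startswith(")", pos):
            raise ValueError(f"expected ',' or ')' at position {pos}")
    return items, pos + 1  # step over ")"


def _skip(text, pos):
    """Advance past whitespace."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def _unquote(token):
    """Strip the grave accents and escaping backslashes of a delimited token."""
    if token.startswith("`"):
        return re.sub(r"\\(.)", r"\1", token[1:-1])
    return token
```

Keeping every token as a string mirrors the statement that the interpretation of a token depends on the context in which it is used.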