Strategic Plan

Balisage 2019 Program

Strategic Business Unit

Balisage 2019: The Markup Conference (BS2019)

Plan Details

Plan period: from 30/07/2019 to 02/08/2019

Plan submitted by:

Owen Ambur

Analysis

Articulation

Mission

To document and share the agenda for the Balisage 2019 conference.

Values

Practicality

Theory

There is nothing so practical as a good theory

Scorecard

Perspective | Goals | Objectives | Performance Indicators | Commentary
- | Explicit Markup | Forms & Media | unnamed performance indicator: 0 | [To be determined]
- | Explicit Markup | Hierarchy | unnamed performance indicator: 0 | [To be determined]
- | Explicit Markup | Visibility & Knowledge | unnamed performance indicator: 0 (to 18/12/2019) | -
- | TEI Annotation | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Workflows & Systems | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Scholarly Content | Best Practices | unnamed performance indicator: 0 | [To be determined]
- | Interviews | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | JSON Schema | Schema Validation | unnamed performance indicator: 0 | [To be determined]
- | XProc | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Text & Markup Processing | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Character Data | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | XML Everywhere | GUIs | unnamed performance indicator: 0 | [To be determined]
- | XML Everywhere | WebSocket | unnamed performance indicator: 0 | [To be determined]
- | XML Everywhere | State Chart XML | unnamed performance indicator: 0 | [To be determined]
- | XForms Multitasking | Control | unnamed performance indicator: 0 | [To be determined]
- | XForms Multitasking | Monitoring | unnamed performance indicator: 0 | [To be determined]
- | XForms Multitasking | Tasks | unnamed performance indicator: 0 | [To be determined]
- | Interviews | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Document Dysfunction | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Invisible XML | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Aparecium | Non-XML Data | unnamed performance indicator: 0 | [To be determined]
- | MVC Paradigm | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | EXPath Packaging | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Security Content | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Accessibility | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Literary Creativity | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Interviews | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Vocabularies | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Parts of Speech | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Encodings | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | Security Controls | Schemas & Tooling | unnamed performance indicator: 0 | [To be determined]
- | Security Controls | Documentation | unnamed performance indicator: 0 | [To be determined]
- | Security Controls | Parallel Schemas | unnamed performance indicator: 0 | [To be determined]
- | Security Controls | Conversion Tools | unnamed performance indicator: 0 | [To be determined]
- | Loose-Leaf Publishing | unnamed objective | unnamed performance indicator: 0 | [To be determined]
- | XML DBs & CMSs | Digital Editions | unnamed performance indicator: 0 | [To be determined]
- | Rules & Requirements | unnamed objective | unnamed performance indicator: 0 | [To be determined]

Goals

Explicit Markup

Goal Statement: Separate meaning from format.

Tuesday 9:15 am - 9:45 am -- Explicit markup: a fool's errand or the next big thing? ... I still want the world that Brian Reid told us we could not have; I still want Brian Reid to have been wrong. I still believe that separating meaning from format will enable our documents to be displayed in many forms and media, that a markup format that makes hierarchy explicit makes complex documents tractable, that when content creators author in systems that make declarative markup visible and use the author's knowledge to add value to their content, we will be able to make documents sing! And I have the twenty-year-old slides to prove it.

Objectives:

  • Visibility & Knowledge
  • Forms & Media
  • Hierarchy

TEI Annotation

Goal Statement: Implement TEI standoff annotation in the browser.

Tuesday 9:45 am - 10:30 am -- Standoff markup allows you to add information to a text without modifying the source. Often this can be achieved by linking between different documents. Various mechanisms exist for handling the connections involved. But some cases such as named entity recognition appear to require inline markup. Could we do this with standoff markup too? The answer is yes, using the TEI Critical Apparatus model, but it isn't completely straightforward.

Objectives:

  • unnamed objective
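
As a rough illustration of the standoff idea in the abstract above, the sketch below leaves the source text untouched and stores named-entity annotations separately as character offsets. It is not the TEI Critical Apparatus model the talk describes; the Annotation class, the sample sentence, and the labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A standoff annotation: it points into the text instead of wrapping it."""
    start: int   # offset of the first character (inclusive)
    end: int     # offset just past the last character (exclusive)
    label: str   # hypothetical entity type, e.g. "person"


def annotate(text, needle, label):
    """Build an annotation for the first occurrence of needle in text."""
    start = text.index(needle)
    return Annotation(start, start + len(needle), label)


def resolve(text, annotation):
    """Look up the span of source text an annotation refers to."""
    return text[annotation.start:annotation.end]


# The source string is never modified; annotations live beside it.
SOURCE = "Ada Lovelace met Charles Babbage in London."
ANNOTATIONS = [
    annotate(SOURCE, "Ada Lovelace", "person"),
    annotate(SOURCE, "Charles Babbage", "person"),
    annotate(SOURCE, "London", "place"),
]
```

Because the annotations only hold offsets, several independent annotation layers can refer to the same unmodified source, at the cost of breaking if the source text is later edited.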

Workflows & Systems

Goal Statement: Use XML to build rich models, extensive workflows, and robust systems.

Tuesday 11:00 am - 11:45 am -- Eating your own dog food -- Declarative solutions generally - and XML specifically - invite experimentation, iterative development, and play. In this way they encourage the self-described "non-programmer" to build rich models, extensive workflows, and robust systems. But can you build the whole application this way? And if the application is critical to getting paid, do you have the courage to do so? We Swedes are a courageous lot.

Objectives:

  • unnamed objective

Scholarly Content

Goal Statement: Optimize the reusability of scholarly content.

Tuesday 11:45 am - 12:30 pm -- Rules for the Rulemakers: JATS4R's Self Guidance on Attributes (LB) -- Maximal flexibility of rules, or ease of reuse -- choose one. The tighter the rules, the more consistent documents will be and the easier it will be to reuse them, but only if the rules are reasonable enough to be adopted. (If all the data creators ignore the rules, reuse doesn't get easier.)

Objectives:

  • Best Practices

Interviews

Goal Statement: Get to know about the work and lives of some of the people in our markup community.

Tuesday 2:00 pm - 2:45 pm -- Interviews to be determined -- Balisage is a gathering of remarkable people! We have developed standards, tools, languages, and applications. We have written books, blogs, tweets, and code. We have changed organizational cultures -- and in a few cases we have changed the world. In these interviews, we will get to know a bit about the work and lives of some of the remarkable people in our markup community.

Objectives:

  • unnamed objective

JSON Schema

Goal Statement: Apply Brzozowski derivatives to JSON Schema.

Tuesday 2:45 pm - 3:30 pm -- In 1964, Janusz Brzozowski defined a new technique for computing whether a string of symbols is in the language defined by an extended regular expression. Brzozowski derivatives have been used for content model validation in several XML schema processors; they can also be applied to the task of model validation for JSON Schema.

Objectives:

  • Schema Validation
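
The derivative technique mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration over sequences of symbols (such as element or property names), not the talk's JSON Schema validator; the class names and the sample content model are assumptions made for the example, and no simplification of intermediate expressions is attempted.

```python
from dataclasses import dataclass


class Regex:
    """Base class for regular expressions over symbol names."""


@dataclass(frozen=True)
class Empty(Regex):      # matches no sequence at all
    pass


@dataclass(frozen=True)
class Eps(Regex):        # matches only the empty sequence
    pass


@dataclass(frozen=True)
class Sym(Regex):        # matches exactly one symbol
    name: str


@dataclass(frozen=True)
class Seq(Regex):        # left followed by right
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Alt(Regex):        # left or right
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Star(Regex):       # zero or more repetitions of inner
    inner: Regex


def nullable(r):
    """True if r accepts the empty sequence."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Sym)):
        return False
    if isinstance(r, Seq):
        return nullable(r.left) and nullable(r.right)
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    raise TypeError(r)


def derive(r, s):
    """The derivative of r with respect to s: what may follow after reading s."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Sym):
        return Eps() if r.name == s else Empty()
    if isinstance(r, Alt):
        return Alt(derive(r.left, s), derive(r.right, s))
    if isinstance(r, Star):
        return Seq(derive(r.inner, s), r)
    if isinstance(r, Seq):
        first = Seq(derive(r.left, s), r.right)
        if nullable(r.left):
            return Alt(first, derive(r.right, s))
        return first
    raise TypeError(r)


def matches(r, symbols):
    """Derive by each symbol in turn, then check for nullability."""
    for s in symbols:
        r = derive(r, s)
    return nullable(r)


# Hypothetical content model: title, para*, (note | figure)
MODEL = Seq(Sym("title"), Seq(Star(Sym("para")), Alt(Sym("note"), Sym("figure"))))
```

For example, matches(MODEL, ["title", "para", "note"]) accepts the sequence, while matches(MODEL, ["title", "para"]) rejects it because the required final note or figure is missing.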

XProc

Goal Statement: Enable XML centric workflows.

Tuesday 4:00 pm - 5:30 pm -- XProc 3.0 (LB) -- XProc is an XML pipeline language designed for XML centric workflows. XProc 3 is currently under development. The editorial team believes that the core language specification is in "last call". XProc 3.0 is designed to improve the usability of XProc. Features include: handling XML, text, binary, and JSON documents, text value and attribute value templates, typed variables and options using XDM 3.1, and a lot of shortcuts. XProc's language design is still about encapsulated data processing steps with defined inputs, outputs, and options. What sets it apart from other scripting languages, Make, and Ant is this: It is a truly functional language with immutable inputs and state. This allows composition of arbitrarily complex steps without risking unexpected side effects and without jeopardizing manageability. In contrast to other functional languages, it offers multiple return "values" (on the named output ports) that don't have to be consumed at once, or at all. Apart from becoming less verbose, XProc 3.0's major strength is that JSON, text, HTML and binary data are now first-class citizens, making it suitable for data processing in the Web age. In addition to describing the new XProc 3.0 we will show code (including both XProc 1.0 and 3.0) and demonstrate XProc tools.

Objectives:

  • unnamed objective

Text & Markup Processing

Goal Statement: Outline a plan to make text and markup processing easier.

Wednesday 9:00 am - 9:45 am -- Text and markup processing languages, past, present, and future -- Programming language design is in continual flux, with significant new languages coming along every few years. In the field of text and markup programming languages, things seem stable at the moment, with XSLT in a dominant position and a few other languages filling in the gaps. But text and markup processing is no more exempt from change than any other field. What should the next language for this application domain look like? Can we make text and markup processing easier than it is now? What direction should we take? For the last ten years or so, I have been working on this problem. I have a plan.

Objectives:

  • unnamed objective

Character Data

Goal Statement: Streamline character data for tokenization.

Wednesday 9:45 am - 10:30 am -- "With one voice": streamlining character data for tokenization -- Some full-text search and textual analysis tools operate exclusively on sequences of tokens. Deriving input for these tools from XML documents can be challenging and depends heavily on the encoding practices and assumptions which produced the XML. Does metadata information, for example, carry the same weight as the text? If a document includes annotations about nuances of the transcription, including those annotations may aid researchers attempting to find relevant documents, but may hinder a process that is performing textual analysis of the work authored. Rather than attempting to make all tools powerful enough to deal with these issues, a modular approach to tokenization has been developed.

Objectives:

  • unnamed objective
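
One way to picture the modular approach described in the abstract above is as a chain of small, replaceable stages. The sketch below is an assumption-laden illustration, not the system presented in the talk: the element names (meta, note, hi), the stage functions, and the sample document are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def text_stream(element, skip):
    """Yield character data in document order, ignoring skipped elements."""
    if element.tag in skip:
        return
    if element.text:
        yield element.text
    for child in element:
        yield from text_stream(child, skip)
        if child.tail:
            # Tail text belongs to the parent, so it survives a skipped child.
            yield child.tail


def normalize(chunks):
    """Join chunks, collapse whitespace, and lowercase.

    Joining with a space assumes that markup boundaries are also word
    boundaries; that is an encoding-dependent assumption.
    """
    return " ".join(" ".join(chunks).split()).lower()


def tokenize(text):
    """Split normalized text into word tokens."""
    return re.findall(r"\w+", text)


def tokens(xml_source, skip=frozenset()):
    """Run the stages in order: extract, normalize, tokenize."""
    root = ET.fromstring(xml_source)
    return tokenize(normalize(text_stream(root, skip)))


DOC = "<doc><meta>Draft 3</meta><p>The <hi>quick</hi> fox<note>sic</note> ran.</p></doc>"
```

Calling tokens(DOC) keeps the metadata and the editorial note in the token stream, while tokens(DOC, skip={"meta", "note"}) leaves only the authored text; which behavior is wanted depends on the tool consuming the tokens.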

XML Everywhere

Goal Statement: Demonstrate that XML can be used up and down the application stack.

Wednesday 11:00 am - 11:45 am -- Graphical user interfaces in the X stack -- "XML Everywhere" isn't just a slogan: it actually works, up and down the XML application stack. Recent developments, such as the inclusion of custom elements in HTML5, allow the declarative approach of XML to come into the browser/server interaction.

Objectives:

  • WebSocket
  • GUIs
  • State Chart XML

XForms Multitasking

Goal Statement: Multitask in XForms.

Wednesday 11:45 am - 12:30 pm -- Multitasking algorithms in XForms -- XForms aren't just about simple data collection: they can process data as well. Even operations that have traditionally been considered "procedural" rather than "declarative" can be specified -- and executed -- from within an XForms instance. XForms actions provide form authors with the ability to solve basic data manipulation use cases such as changing data values and copying and deleting elements. XForms offers multitasking techniques to control execution, to monitor execution progress, and to set task execution priorities dynamically. There's a lot more there than you might have expected!

Objectives:

  • Tasks
  • Monitoring
  • Control

Interviews

Goal Statement: Get to know about the work and lives of some of the people in our markup community.

Wednesday 2:00 pm - 2:45 pm -- to be determined -- Balisage is a gathering of remarkable people! We have developed standards, tools, languages, and applications. We have written books, blogs, tweets, and code. We have changed organizational cultures -- and in a few cases we have changed the world. In these interviews, we will get to know a bit about the work and lives of some of the remarkable people in our markup community.

Objectives:

  • unnamed objective

Document Dysfunction

Goal Statement: Fix document dysfunction.

Wednesday 2:45 pm - 3:30 pm -- We Created Document Dysfunction. It Is Time to Fix It. (LB) -- Some of us building software need to take a hard look in the mirror. For years, we have promised that technology would solve the world's information management problems, but 85% of business information is still "dark data," with potentially useful insights lost in a rising tide of disconnected documents, emails, Slack conversations, voice-to-text messages, etc. We need an effective approach to documents and want to start a public conversation about these issues. We believe that effective solutions should be based on: Declarative Markup; AI sympathetic to "Small Data"; focus on company-specific documents; applying AI to documents as a whole; and solutions that do not disrupt existing workflows or require massive investment. The future isn't about AI making human beings obsolete; the future is about AI making human beings and companies more productive, effective, and creative.

Objectives:

  • unnamed objective

Invisible XML

Goal Statement: Discuss whether it is necessary to see the markup.

Wednesday 4:00 pm - 4:45 pm -- Do we really want to see markup? -- Markup fanatics have long cried, "We need to see the markup!" Yet since the earliest stages of developing the SGML standard, there has been an urge even among standards developers to avoid having to write tags everywhere. The recent urge to create "Invisible XML" is but the latest symptom of a smoldering disease, from which I, too, suffer.

Objectives:

  • unnamed objective

Aparecium

Goal Statement: Provide an XQuery/XSLT library to make the use of "invisible XML" convenient.

Wednesday 4:45 pm - 5:30 pm -- Aparecium: an XQuery/XSLT library for invisible XML -- This paper introduces Aparecium, a library intended to make the use of "invisible XML" convenient for users of XSLT and XQuery. Invisible XML, a method for treating non-XML documents as if they were XML, holds great promise for immediately and easily bringing our array of XML technologies to bear on the non-XML data that we encounter (CSS, wiki markup, domain-specific notations, JSON, LaTeX, etc.). Aparecium uses an Earley parser to ensure that any context-free grammar can be used.

Objectives:

  • Non-XML Data
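
The abstract above notes that an Earley parser is what lets any context-free grammar be used, including ambiguous and left-recursive ones. The following is a minimal Earley recognizer, offered only as a sketch of the general technique and not as Aparecium's code (Aparecium is an XQuery/XSLT library). It assumes a grammar with no empty right-hand sides and single-character terminals; the sample grammar is hypothetical.

```python
def recognize(grammar, start, text):
    """Return True if text can be derived from the start symbol.

    grammar maps each nonterminal to a list of right-hand sides (tuples of
    symbols). Any symbol that is not a key of grammar is a terminal and is
    compared against one character of text. Assumes no empty rules.
    """
    # An Earley item is (lhs, rhs, dot, origin).
    chart = [set() for _ in range(len(text) + 1)]
    chart[0] = {(start, rhs, 0, 0) for rhs in grammar[start]}
    for i in range(len(text) + 1):
        worklist = list(chart[i])
        while worklist:
            lhs, rhs, dot, origin = worklist.pop()
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:
                    # Predict: expect every alternative of the nonterminal here.
                    new = [(symbol, alt, 0, i) for alt in grammar[symbol]]
                    target = i
                elif i < len(text) and text[i] == symbol:
                    # Scan: the terminal matches, so advance into the next set.
                    new = [(lhs, rhs, dot + 1, origin)]
                    target = i + 1
                else:
                    continue
            else:
                # Complete: advance every item that was waiting for lhs.
                new = [(l, r, d + 1, o)
                       for (l, r, d, o) in chart[origin]
                       if d < len(r) and r[d] == lhs]
                target = i
            for item in new:
                if item not in chart[target]:
                    chart[target].add(item)
                    if target == i:
                        worklist.append(item)
    return any(lhs == start and dot == len(rhs) and origin == 0
               for (lhs, rhs, dot, origin) in chart[len(text)])


# Hypothetical grammar: ambiguous and left-recursive on purpose.
GRAMMAR = {
    "expr": [("expr", "+", "expr"), ("(", "expr", ")"), ("digit",)],
    "digit": [(d,) for d in "0123456789"],
}
```

A recognizer only answers yes or no; producing XML from non-XML input, as Invisible XML does, additionally requires building a parse tree from the completed items.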

MVC Paradigm

Goal Statement: Test the Model-View-Controller (MVC) paradigm in XForms.

Thursday 9:00 am - 9:45 am -- XForms Space Invaders -- The Model-View-Controller (MVC) paradigm is a design pattern for creating applications in which: the View (web page) interacts with the user; the Model controls manipulation of the data; and the Controller orchestrates the work of the view and the model. Implementing the classic arcade game Space Invaders in an XForms workbench proved to be a successful testbed for this approach. Key functionalities required for Space Invaders are an application "heartbeat" to control the speed/progression of the invaders; animated graphics for the invaders, the Mystery Ship, and laser fire; and the user-controlled laser cannon. The workbench was implemented using Orbeon Forms, an open source framework which supports XForms 1.1 with a number of custom extensions, including Javascript actions, Attribute Value Templates on XHTML elements, and listeners for "keypress" events. Most of the extensions required are included in the draft XForms 2.0 specification (albeit with slightly modified syntax).

Objectives:

  • unnamed objective

EXPath Packaging

Goal Statement: Manage code modules for languages based on XPath.

Thursday 9:45 am - 10:30 am -- Improving upon the EXPath Packaging System (LB) -- How can we best manage code modules for languages that are based on XPath? Users have had experience with the EXPath packaging system for nearly a decade and can now see both its strengths and weaknesses, particularly the handling of dependencies. The authors propose a replacement packaging system, based on experience with Maven, which leverages mature versioning and dependency management technologies.

Objectives:

  • unnamed objective

Security Content

Goal Statement: Package XML security content into bundles for easy deployment.

Thursday 11:00 am - 11:45 am -- SCAP composer: a DITA Open Toolkit plug-in for packaging security content -- The Security Content Automation Protocol (SCAP) schema for source data stream collections standardizes the requirements for packaging XML security content into bundles for easy deployment. SCAP bundles must be self-contained (each bundle contains all necessary information without external references) and reversible (XML components must be unmodified so they can be rebundled). These requirements (along with very long identifiers) make authoring the content and bundling very difficult. SCAP Composer is an authoring product which uses a DITA specialized element type for source data stream collections that makes the authoring process easier. SCAP Composer takes an incremental approach to aiding SCAP content authors: it helps only with creating source data stream collections; it does not offer any help with creating the XML resources encapsulated in a data stream collection. SCAP Composer is implemented using the DITA Open Toolkit and can be used with any DITA authoring software that includes the Toolkit, or with a standalone Toolkit.

Objectives:

  • unnamed objective

Accessibility

Goal Statement: Discuss accessibility and XML.

Thursday 11:45 am - 12:30 pm -- Accessibility and XML (LB)

Objectives:

  • unnamed objective

Literary Creativity

Goal Statement: Exercise your literary creativity with poems, short stories, jokes, and songs.

Thursday 1:15 pm - 2:00 pm (during lunch - location: Sinequa) -- Balisage Bard -- Once again, Balisage Bard gives you the opportunity to exercise your literary creativity with poems, short stories, jokes, and songs. Subject matter must be related to Balisage (markup, venue, papers, and so forth). Read your effort during the game session. Translations of works in languages other than English are not required but will be appreciated. There is a two-minute time limit for each presentation. As many submissions as time permits will be taken; authors will be called in the order they sign up (there will be a sign-up sheet at conference registration). If time permits, additional volunteers will be accepted during the game.

Objectives:

  • unnamed objective

Interviews

Goal Statement: Get to know about the work and lives of some of the people in our markup community.

Thursday 2:00 pm - 2:45 pm -- to be determined -- Balisage is a gathering of remarkable people! We have developed standards, tools, languages, and applications. We have written books, blogs, tweets, and code. We have changed organizational cultures -- and in a few cases we have changed the world. In these interviews, we will get to know a bit about the work and lives of some of the remarkable people in our markup community.

Objectives:

  • unnamed objective

Vocabularies

Goal Statement: Describe changes to vocabularies, both successful and unsuccessful.

Thursday 2:45 pm - 3:30 pm -- Extending vocabularies: the rack and the weeds -- Markup languages such as XML, JSON, and SGML divide documents into two parts: markup and content. While in theory markup could be created ad hoc for every document, this would mean that markup had no meaning (and thus no value) to anyone but the creator of the document. In order to realize the value of marked up documents for interchange and longevity, we create, write documentation for, and share markup vocabularies. Vocabularies are created in specific contexts and for specific purposes. Like all human constructs, they are flawed and need to be repaired and changed over time. As people bump up against the limitations of their markup vocabularies, they often want to extend those vocabularies. Understanding these processes requires sensitivity to the human needs involved and the social contexts in which people interact with and around the vocabularies. This paper characterizes some of these contexts and their properties, and in the light of this characterization describes changes to vocabularies both successful and unsuccessful.

Objectives:

  • unnamed objective

Parts of Speech

Goal Statement: Select the most likely POS and meaning for a given word token.

Thursday 4:00 pm - 4:45 pm -- You're not the POS of me: the centrality of markup for part-of-speech tagging (LB) -- Part-of-speech tagging is a markup problem: it takes a text and returns it as a series of tokens, each one marked up with information about its grammatical function. This is a standard early-stage process for most kinds of natural language processing, including speech recognition and machine translation. POS tagging for monolingual texts is challenging, but bilingual colloquial texts are even harder. They contain words from both languages, some governed by the grammar of one language, some by the other, and (often) some by both. POS-tagging processes which are effective for either language are unlikely to be sufficient, even in combination, for dealing with bilingual text. In the case of Welsh-English bilingual data, there is the additional challenge of "Wenglish". Wenglish texts combine features of both languages, for example by using Welsh orthography for an English word or English morphological suffixes on a Welsh root. In building DERWen -- a POS Tagger for Wenglish texts -- I have created a mixed Constraint Grammar which takes advantage of the marked-up data produced by the programme in order to select the most likely POS and meaning for a given word token. As the function of markup in the pipeline from raw text to POS-tagged output shows, markup is central to computational linguistics.

Objectives:

  • unnamed objective
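
To make the constraint-based selection in the abstract above more concrete, here is a much-simplified sketch in the spirit of Constraint Grammar: each token starts with every candidate tag, and context rules discard readings until nothing more can be removed. It is not DERWen and does not handle Welsh or Wenglish; the tag names, the two rules, and the English sample sentence are hypothetical.

```python
def apply_rules(sentence, rules):
    """Reduce candidate tags for each token by applying context rules.

    sentence is a list of (word, candidate_tags) pairs. Each rule takes the
    current readings and a position, and returns the set of tags it wants
    to remove at that position. A rule never removes a token's last reading.
    """
    readings = [(word, set(tags)) for word, tags in sentence]
    changed = True
    while changed:
        changed = False
        for i, (_word, tags) in enumerate(readings):
            for rule in rules:
                doomed = rule(readings, i) & tags
                if doomed and doomed != tags:
                    tags -= doomed          # mutates the set held in readings
                    changed = True
    return readings


def no_verb_after_determiner(readings, i):
    """Hypothetical rule: discard VERB directly after an unambiguous DET."""
    if i > 0 and readings[i - 1][1] == {"DET"}:
        return {"VERB"}
    return set()


def no_noun_after_noun(readings, i):
    """Hypothetical rule: discard NOUN directly after an unambiguous NOUN."""
    if i > 0 and readings[i - 1][1] == {"NOUN"}:
        return {"NOUN"}
    return set()


SENTENCE = [
    ("the", {"DET"}),
    ("walk", {"NOUN", "VERB"}),
    ("helps", {"NOUN", "VERB"}),
]
RULES = [no_verb_after_determiner, no_noun_after_noun]
```

Here the second rule can only fire once the first has disambiguated "walk", which is why the rules are applied repeatedly until no reading changes.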

Encodings

Goal Statement: Explore the cascade of "encodings".

Thursday 4:45 pm - 5:30 pm -- Encoding -- In their model of digital objects, David Dubin and others postulate three entity types (propositions, symbols, and documents) with three relationships: "expresses", "encodes", and "inscribes". We can "express" an assertion with a sentence. We can also "inscribe" symbols in physical media. I'd like to investigate the cascade of "encodings" that we find in every digital computing system, and the articulation of those encodings that is bound up in everything we do. Encoding can be recursive, but do we really understand it? What is happening when we encode a sentence as a character string? A character as an integer? An integer as an octet? Is encoding a well-understood linguistic or mathematical relationship? Is encoding just a mapping (function)? Is it the same as the relationship between a name and its referent? Is it the same as the relationship between a sentence and the proposition it expresses? I don't think so. So let's explore some possibilities.

Objectives:

  • unnamed objective
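
The cascade the abstract above asks about (sentence to character string, character to integer, integer to octet) can at least be made visible in code. The sketch below only exposes the layers; it takes no position on whether "encoding" is the same relationship at each step. The function name and the sample string are chosen for illustration.

```python
def encode_cascade(sentence, charset="utf-8"):
    """Return the layers beneath a string: characters, integers, octets."""
    characters = list(sentence)                  # string as character sequence
    code_points = [ord(c) for c in characters]   # each character as an integer
    octets = list(sentence.encode(charset))      # integers as bytes on the wire
    return characters, code_points, octets


# "n" followed by e-acute, written with an escape to avoid ambiguity.
SAMPLE = "n\u00e9"
chars, points, utf8_octets = encode_cascade(SAMPLE)
_, _, latin1_octets = encode_cascade(SAMPLE, charset="latin-1")
```

The same two code points (110 and 233) become three octets in UTF-8 but two in Latin-1, so the integer-to-octet step is a choice among mappings rather than a fixed fact about the characters.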

Security Controls

Goal Statement: Develop technical standards for documentation related to systems security.

Friday 9:00 am - 9:45 am -- The Open Security Controls Assessment Language (OSCAL): schema and metaschema -- The Information Technology Lab at NIST is developing technical standards for documentation related to systems security. The Open Security Controls Assessment Language (OSCAL) defines lightweight schemas, along with related infrastructure, for tagging system security information to support routine tasks like cross-checking, validating against arbitrary constraints, and producing punchlists. OSCAL is not conceived as "another big XML application" but as a metaschema. This approach allows us to simplify the design and maintenance of schemas and related tooling; support generation of documentation; produce multiple parallel schemas for XML, JSON, and YAML; and construct conversion tools more easily. Documents and tools leverage basic HTML, or even Markdown, for simplicity even though it limits the complexity of what can be directly imported. Conversion is simplified by the metaschema approach, even when multiple schemas apply to a single data collection. We hope that these simplifications will lead not only to more documents but also to more useful documents.

Objectives:

  • Conversion Tools
  • Schemas & Tooling
  • Parallel Schemas
  • Documentation

Loose-Leaf Publishing

Goal Statement: Typeset and print only pages that have changed.

Friday 9:45 am - 10:30 am -- Loose-leaf publishing using Antenna House and CSS -- Loose-leaf publishing is the ability to typeset and print only the pages in a document that have changed since its last publication. This presents many interesting challenges. We developed a loose-leaf publication system using Antenna House Formatter, CSS for pagination, and XSLT for post processing the area tree into "change packages" which include only the changed pages. Both the CSS markup and the publication workflow warrant a closer look.

Objectives:

  • unnamed objective
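
The core comparison behind a "change package" can be sketched independently of the formatter. The example below is only an illustration of the idea: the system described above works on the Antenna House area tree with XSLT, whereas this sketch assumes each page is already available as a string keyed by a page label, and the function names are hypothetical.

```python
import hashlib


def page_digest(page_text):
    """Fingerprint the rendered content of one page."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()


def snapshot(pages):
    """Record a digest per page label at publication time."""
    return {label: page_digest(text) for label, text in pages.items()}


def change_package(previous_snapshot, pages):
    """Return only the pages that are new or differ from the last snapshot."""
    return {
        label: text
        for label, text in pages.items()
        if previous_snapshot.get(label) != page_digest(text)
    }


# Hypothetical editions; "2.1" is an inserted overflow page.
FIRST_EDITION = {"1": "Intro", "2": "Rules", "3": "Index"}
SECOND_EDITION = {
    "1": "Intro",
    "2": "Rules, amended",
    "2.1": "Overflow from page 2",
    "3": "Index",
}
PACKAGE = change_package(snapshot(FIRST_EDITION), SECOND_EDITION)
```

Keeping only digests from the previous publication means the earlier pages themselves need not be retained, and inserted pages such as "2.1" show up simply because their label has no earlier digest.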

XML DBs & CMSs

Goal Statement: Integrate XML databases and content management systems.

Friday 11:00 am - 11:45 am -- Reese's Peanut Butter Cups and eXist-db: integration of XML databases and content management systems in digital editions -- We have identified four models for integrating digital edition content into eXist-db: TEI Publisher; the eXist-db app framework using HTML templating; the eXist-db app framework without HTML templating; and Apache and PHP mediating between the user and eXist-db, so that eXist-db provides only XML database services.

Objectives:

  • Digital Editions

Rules & Requirements

Goal Statement: Address the need for rules, schemas, and conformance.

Friday 11:45 am - 12:30 pm -- Thinking, wishing, saying -- Can we have rules for our documents we cannot write down in a schema language? If a conformance requirement is not mechanically checkable, is it a conformance requirement? If a rule is not testable, is it a rule?

Objectives:

  • unnamed objective
