Through my own fault of being overly enthusiastic, and holding the general belief that there is always a better way with anything technology based... it's just that nobody implemented it yet! I decided to see if I could do something to improve how side-effects and asynchronous processing are both used and managed in XPDLs (XPath Derived Languages), e.g. XQuery, XSLT, XProc, etc.

Following on from my post EXPath and Asynchronous HTTP, Debbie Lockett agreed to co-author a paper with me for XML Prague 2019, entitled "Task Abstraction for XPath Derived Languages", I wonder if she is regretting that decision yet? Anyhows...

One aspect of our paper looks at what facilities XPDL implementations currently offer for I/O, and if these cause side-effects. For those not in-the-know, XPDLs like XPath and XQuery are functional programming languages, which means that side-effects are verboten! XPath and XQuery even have a Formal Semantics which forbid these, so any implementation allowing side effects is firstly, sacrificing purism for ease of use, and secondly... very naughty!

Recap - Why are side-effects verboten?

At its simplest, consider three functions getChar(): Char, putChar(Char): (), and putString(String): (). The first consumes a character from stdin and returns it, the second outputs a character to stdout, and similarly the third outputs a string to stdout.

If we had an application that prints a prompt on the screen, retrieves a character, and then echoes it back to the screen before finally printing "Done.", its pseudo-code might look like:

putChar('>')
let $x = getChar()
putChar($x)
putString("Done.")

In an imperative programming language, we can be certain that these statements will be executed sequentially from top-to-bottom and all would be well.

However, in a pure functional programming language, the compiler is free to re-order the evaluation of expressions which are not dependent. It is also free to eliminate expressions from which the result is not required for a future computation.  Now we have problems... the first putChar expression has no relationship with any other expression, so it could be evaluated before or after the next two expressions (which are dependent through $x), or it might never be evaluated, this means that our prompt could appear after we enter and echo a char, or that it might never appear at all. Likewise for putString, "Done." could actually be printed to the screen before anything else!

Haskell? What about Haskell?

If you're wondering why I am talking about XQuery and not Haskell, then sit back down and relax, I'm getting to the Haskell stuff...

In the past I have dabbled with Haskell, and very much enjoyed studying Learn You a Haskell for Great Good!. One of the things that you quickly learn about Haskell, is that it is a Pure functional programming language. Haskell forbids side-effects, yet somehow still offers facilities for I/O that are safe... What is this black magic? Haskell's IO Monad.

What does Haskell's IO Monad have to do with XPath? This gentleman:

Philip Wadler

Philip Wadler (a.k.a Lambda Man). Don't be fooled, he is wearing that cloak for a reason! With the power of the abstractions in his mind, he could melt your face off!

Philip Wadler was responsible for both developing Monads for I/O in Haskell, and as I recently connected, also one of the editors of the W3C XQuery 1.0 and XPath 2.0 Formal Semantics specification. Coincidence? I think not!

I have known for some time that Haskell has something called the IO Monad, which enables it to remain a pure functional language (no side-effects), whilst still offering I/O operations, and I now know that Philip Wadler was heavily involved in both that and the formal specification for XPath.

Perhaps if I can understand how I/O really works in Haskell, then I can exploit that knowledge to help me avoid/manage side-effects for I/O in XPDLs.

Haskell's IO Actions

Haskell allows an application to utilize I/O through the use of IO Actions. In fact it actually enforces this, and provides no other mechanism for I/O! Note, that I am saying "IO Actions" here and not "IO Monad", as it has been suggested that the term "IO Monad" is misleading and/or confusing, see the article IO Monad Considered Harmful. However, the historically used term is "IO Monad".

I won't get into an in-depth explanation of Monads, there are many explanations out there that are more comprehensive and accurate that I could manage here. Rather it is enough to say that in Haskell, the monadic laws of the IO type allow you to compose IO Actions together. I will cover only the bare minimum on monads, allowing me to explain how Haskell remains a pure language (no side-effects) whilst allowing I/O.

In Haskell, all I/O functions must return an IO Action, so our earlier functions would now something look like this (in pseudo code) - getChar(): IO[Char], putChar(Char): IO[()], and putString(String): IO[()].

Okay, so... when we now call our functions, instead of getting a primitive value, we now get this thing called an IO, which at a non-educated glance appears to encapsulate our result.

However, what is happening here is actually much more subtle! Consider getChar's return value IO[Char], this does not encapsulate the result, rather it encapsulates a function which when applied will consume a character from stdin and return the resultant character.

By encapsulating the action to get the character, the IO action defers the evaluation of the actual I/O that we wish to occur. At this point no I/O has been performed, we have not read a character, we have only expressed that we want to read a character.

When everything returns an IO, we might wonder how we can actually read a character, and then put that same character back on the screen? Remember our putChar(Char) function accepts a value argument of type Char , but our getChar(): IO[Char] function now returns an IO[Char]... so we can no longer pass the output of getChar directly to putChar!

Not to worry, through the monadic laws of the IO type we can chain many such I/O actions together. If we rewrote our earlier application we might have something like:

let $w: IO[()] = putChar('>')
let $x: IO[Char] = getChar()
let $y: (a: Char -> IO[()]) = putChar(a)
let $z: IO[()] = putChar("Done.")

$w >> $x >>= $y >> $z

We won't go into the exact details here of how >> (which is called then) and >>= (which is called bind) work, it is enough to just say that each composes two IO actions (monads) into a single IO action (monad). The difference between the two, albeit I'm simplifying, is that >> discards the result of the first action when constructing the new action, whereas >>= extracts the result of the first action and passes it to the constructor for the new action.

The application code above is intentionally verbose to explicitly show the types involved. It could be rewritten simply as putChar('>') >> getChar >>= putChar >> putString("Done.") . Either way, the result of the above application would be IO[()] as that is the result of evaluating $z.

But we still don't have a result, we have this IO[()] thing, which means that we haven't yet actually done any of the I/O we described, we just have an abstraction which models what we would like to happen. Let's now look at how we make something meaningful happen...

So... How the hell does I/O really work in Haskell?

Firstly, when you write a Haskell program you must define a function called main and this main function must return an IO.

Some of you are probably now saying.... ahhhh ha!

All Haskell programs are an IO! The Haskell compiler knows what to do with your main IO.

Okay... more head-scratching...  how is this pure and functional? Let's take a look behind the curtain...

The IO type is actually a function with a definition something like, IO(RealWorld): (Result, RealWorld). This function is just a state transformer, it takes the state (of the real world), performs some computation, and returns a tuple consisting of both the result of the computation and a new state (of the real world).

Haskell defines RealWorld as an abstract type, and we really need not say too much more about it. It serves the purpose of being threaded through your IO actions to ensure things are executed in the order you would expect. Internally, the Haskell compiler likely doesn't pass around any "real world" state. Instead, it can optimize this out as it compiles your beautifully pure functional program into an imperative side-effecting program that can be executed by your computer's CPU.

In case you're still confused, simply put, all of the I/O you express in Haskell is constructing pure functions. The monadic part helps you to compose these functions without having to think about the RealWorld. The RealWorld is passed into each I/O function as a parameter, the function computes only upon its parameter, it then returns a result and a new RealWorld, which will be passed to the next I/O function, and so on.

If we reduce all the types, getChar(): IO[Char] would look something like:

getChar(): (RealWorld -> (Char, RealWorld))

Can we exploit this in XPDLs?

Haskell's IO Actions allow a user to both easily describe the I/O operations that we wish to perform, and make explicit the order that the operations must be performed in.

We could imagine defining something similar for XPDLs. To drive adoption we must strive to define this within the facilities of the existing XPDL specifications. Anything else suggesting new non-standardised constructs, would likely result in implementers refusing to undertake further complex work, and ultimately lead to a lack of adoption.

With, for example XQuery, unlike Haskell, the input is an instance of the XDM (XPath and XQuery Data Model) and the output is also an XDM. The XDM does have a function type though, so could we return something like an IO?

In Haskell the RealWorld is an abstract type. This means that the developer cannot instantiate it (ignoring unsafePerformIO), and therefore cannot complete the IO function themselves. This provides safety against the developer arbitrarily executing side-effecting IO.

The XDM type system is sealed, which means we cannot declare new types, abstract or otherwise! So what can we do? As we cannot create an abstract RealWorld type from within an XPDL, this leaves us with two options:

  1. We simply use a non-abstract type to represent the real world, and we ask the user not to instantiate a value of this type and apply it to their IO.
  2. We could instead define a function signature that returns an item(), but really is returning some proprietary type. If the user attempts to apply a parameter of anything apart from the proprietary type to their IO, we raise an error. Unfortunately, this would mean stepping outside of the XDM, and forcing that implementation to occur in the host language of the XPDL processor.

If we consider the simplest approach, and use a non-abstract type for the real world, as an IO is really nothing more than a function, a suitable signature for it in XPath could be:

declare function io:IO($realworld as element(io:realworld)) as item()+

In this example, I have expressed the real world type as a node of type element which is named io:realworld. Regrettably, the type system in XPath is not strong enough to allow me to strictly enforce that the return type is a tuple of the value result and the new real world. As a compromise, I have used a sequence which will always have one or more values, the first value must be the new real world, and the remainder is the value result of the IO function.

Our sample program expressed in XQuery might then look like:

(: Side-effecting functions provided by an external processor or library :)
declare function stdin:get-char() as xs:integer external;
declare function stdout:put-char($codepoint as xs:integer) as empty-sequence() external;
declare function stdout:put-str($s as xs:string) as empty-sequence() external;



(: IO version of get-char :)
declare function local:get-char()
    as function(element(io:realworld)) as item()+ {
  function($realworld as element(io:realworld)) {
    ($realworld,
       stdin:get-char())
  }
};

(: IO version of get-char :)
declare function local:put-char($codepoint as xs:integer)
    as function(element(io:realworld)) as item()+ {
  function($realworld as element(io:realworld)) {
    ($realworld,
        stdout:put-char($codepoint))
  }
};

(: IO version of put-str :)
declare function local:put-str($s as xs:string)
    as function(element(io:realworld)) as item()+ {
  function($realworld as element(io:realworld)) {
    ($realworld,
       stdout:put-str($s))
  }
};


(: Implementation of `bind` i.e. `>>=` :)
declare function local:bind($m as function(element(io:realworld)) as item()+, $binder as function(item()*) as function(element(io:realworld)) as item()+) as function(element(io:realworld)) as item()+ {
  function($realworld as element(io:realworld)) {
    let $mr := $m($realworld)
    let $new-realworld := $mr[1]
    let $computed := subsequence($mr, 2)
    return
      $binder($computed)($new-realworld)
  }
};

(: Implementation of `then` i.e. `>>` :)
declare function local:then($m1 as function(element(io:realworld)) as item()+, $m2 as function(element(io:realworld)) as item()+) as function(element(io:realworld)) as item()+ {
  local:bind($m1, function($ignored) {
    $m2
  })
};


(: Express our chain of IO :)
let $x :=
        local:put-char(fn:string-to-codepoints(">")[1])
            => local:then(local:get-char())
            => local:bind(local:put-char#1)
            => local:then(local:put-str("Done."))


(: Apply the IO function... or??? :)
return $x(<io:realworld/>)

In the above code we apply the IO function as the last expression of the XQuery. As long as the IO is always evaluated as the last expression, we need not worry about side-effects.

Similarly to Haskell, we would ideally return the IO function as the result of the query. The problem here is that the XDM result of the query is likely subject to the W3C XSLT and XQuery Serialization specification, which does not provide for the evaluation of any returned function.

Another consideration is that the IO (function) that we have been passing around is just a plain old XPath function, so even if we did return it as the result of the query, the XPDL processor has no idea that it should treat it differently. One potential solution would be for us to add annotations to our functions, but unfortunately these are only available in XQuery, and not in XPath; this would unacceptably exclude other XPDLs. Another option is that, when describing a particular function definition, both the W3C XPath functions specification and various EXPath function specifications already use language such as, "The function is non-deterministic with respect to...". We could perhaps define our IO functions with similar specification language like "The function produces an IO action, which should only be evaluated after query completion".

One last thing worth mentioning, is that you wouldn't be blamed for thinking that the XQuery code above is rather verbose! I intentionally wrapped each of the side-effecting functions in a safe IO function, so that the mechanics were apparent to the reader. However with a little bit of magic, we can instead write lift functions which will promote any non-safe (side-effecting) function into a safe IO function. The goal of the lift, is to take a function of type a -> b, and give us a function of type IO a -> IO b. Omitting the bind and then functions for brevity, our implementation now looks like:

(: Side-effecting functions provided by an external processor or library :)
declare function stdin:get-char() as xs:integer external;
declare function stdout:put-char($codepoint as xs:integer) as empty-sequence() external;
declare function stdout:put-str($s as xs:string) as empty-sequence() external;


(: Implemention of "lift"
 :
 : Lifts a function into an IO safe function.
 :
 : Converts a f():b to an f(IO[()]):IO[b]
 :)
declare function local:liftM0($f as function() as item()*) as function(function(element(io:realworld)) as item()+) as function(element(io:realworld)) as item()+ {
    function ($iob as function(element(io:realworld)) as item()+) {
        $iob => local:then(function($realworld as element(io:realworld)) {
            ($realworld,
                $f())
        })
    }
};

(: Implemention of "lift"
 :
 : Lifts a function into an IO safe function.
 :
 : Converts a f(a):b to an f(IO[a]):IO[b]
 :)
declare function local:liftM1($f as function(item()*) as item()*) as function(function(element(io:realworld)) as item()+) as function(element(io:realworld)) as item()+ {
    function ($iob as function(element(io:realworld)) as item()+) {
        $iob => local:bind(function($b as item()*) {
          function($realworld as element(io:realworld)) {
            ($realworld,
                $f($b)
            )
          }
        })
    }
};

(:
 : Implementation of `unit` a.k.a. `return`
 :
 : Just places a pure value into an IO
 :)
declare function local:unit($u as item()*) as function(element(io:realworld)) as item()+ {
    function($realworld as element(io:realworld)) {
            ($realworld,
                $u)
    }
};


(: Express our chain of IO :)
let $x := local:unit(fn:string-to-codepoints(">")[1])
    => (local:liftM1(stdout:put-char#1))()
    => (local:liftM0(stdin:get-char#0))()
    => (local:liftM1(stdout:put-char#1))()
    => local:then(local:unit("Done"))
    => (local:liftM1(stdout:put-str#1))()


(: Apply the IO function... or??? :)
return $x(<io:realworld/>)

We could likely go further and create a generic lift function rather than needing one for each function arity, but for now I just implemented for zero and one arities.

Conclusion

We can certainly express monadic IO actions from Haskell in XQuery. Like Haskell, the threading of the realworld through our IO actions allows us to ensure the computation order of our IO, and by evaluating the IO at the end of the program, ideally to be applied by the processor, we can ensure side-effect free computation. The lack of a more expressive type system in XQuery does open up potential problems, whereby the types of arguments and return values are not as strict (read enforceable) as they would be in Haskell, this limits the static checking that can be performed.

Personally, it has been an interesting project to see how far we can push these Haskell concepts in XQuery. Regards our upcoming conference paper, these results are helping us to form some strong view-points as to that which we wish to design for XPDLs.