EXPath and Asynchronous HTTP

Recent work on an improved HTTP Client, started by Christian Grün in the EXPath Community Group, has reignited the spark of collaboration on Open Source XPath extensions. Whilst Christian and I were openly developing a draft for the new EXPath HTTP Client module, we heard from Debbie Lockett, O'Neil Delpratt, and Michael Kay at Saxonica, who were interested in ensuring that their users' needs for Saxon-JS were met.

Debbie had previously developed an HTTP Client for Saxon-JS. I was lucky enough to spend a few hours with Debbie discussing its intricacies during a recent visit to the UK. The Saxon-JS HTTP Client has some interesting properties, insofar as it is coupled with a concept for asynchronous processing. In fact, in Saxon-JS the only way to make an HTTP request from your XSLT is to do so asynchronously. The request is "scheduled" by use of Saxon-JS's extension instruction ixsl:schedule-action:

<xsl:template name="send-request">
   <xsl:variable name="request" select="
   map{
      'method': 'POST',
      'href': 'http://localhost:19757/mywebapp/receiveXML',
      'body': $body, 
      'media-type': 'application/xml'
      } "/>
      
   <ixsl:schedule-action http-request="$request">
      <xsl:call-template name="handle-response"/>
   </ixsl:schedule-action>
   
</xsl:template>

<xsl:template name="handle-response">
   <xsl:context-item as="map(*)" use="required"/>
   <xsl:for-each select="?body">
      <xsl:call-template name="process-response-body"/>
   </xsl:for-each>
</xsl:template>

I do not know (and was not involved in) the thought process that took place when devising the above approach, but it appears to me that this is just a very slight abstraction over JavaScript's AJAX mechanism. As JavaScript forces you to make HTTP requests asynchronously, we see that neatly reflected in Saxon-JS's XSLT extension.

The view at Saxonica was that any future EXPath HTTP Client also needs to support asynchronous execution, so that it can be implemented in Saxon-JS. I very much agree; EXPath standards should be defined at a sufficiently high level of abstraction so as not to restrict the choice of implementation language, or force any particular evaluation semantics, e.g. blocking or non-blocking (asynchronous).

My general feeling is that we should be able to separate the two issues of asynchronous processing and making HTTP requests (via the EXPath HTTP Client). Asynchronous processing is a much larger topic, and making an HTTP request is just one of many things that a developer authoring in XQuery or XSLT may want to perform asynchronously.

Saxon-JS Generic Schedule Action

I think Saxonica have actually partly recognised this already, but have not yet come up with a clean general approach for asynchronous processing. If we examine the documentation for their ixsl:schedule-action instruction more closely, we see that it actually has three distinct functions:

wait: "Used to specify the delay in milliseconds before the call is invoked".
document: "Used to specify documents to be fetched before the call is invoked".
http-request: "Used to specify an HTTP request to be made before the call is invoked (with the HTTP response as the context)".

Whilst document and http-request perform very task-specific asynchronous operations, wait seems rather beautifully generic, and could potentially be used to implement any asynchronous operation. I chose the word "potentially" intentionally and carefully, because I have the luxury of making a theoretical argument, i.e. I am not aware of the implementation details or constraints of Saxon-JS.

Imagine that the two threads of execution (t1 and t2) for an asynchronous http-request in Saxon-JS currently look something like this:

t1: >-- ixsl:schedule-action \---> (next instruction)
                              \
t2:                            \@http-request ---> handle-response

We can reframe this in the form of wait and an EXPath HTTP Client function call for http:post:

t1: >-- ixsl:schedule-action/@wait=0 \---> (next instruction)
                                      \
t2:                                    \my-template --> http:post(...)

The only real difference here is who is in control: the processor or the program. In the first example, we ask the processor to perform an HTTP request (asynchronously) at some point in the future, and to call our template when it has the HTTP response. In the second example, we ask the processor to call our template at some point in the future; when it is called (presumably asynchronously), we can make the HTTP request ourselves. Both approaches are asynchronous, but the second is more generic: rather than calling the EXPath HTTP Client module function http:post, we could in fact call any function or do any sort of processing.
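To make the second diagram concrete, here is a minimal sketch of how it might be written. The wait attribute of ixsl:schedule-action is taken from the Saxon-JS documentation quoted above; everything about http:post is an assumption, since the new EXPath HTTP Client is only a draft: the namespace binding, the two-argument signature, and a response map with a body entry are all hypothetical, chosen to mirror the earlier example. The process-response-body template is assumed to be defined elsewhere, as in the earlier listing.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ixsl="http://saxonica.com/ns/interactiveXSLT"
    xmlns:http="http://expath.org/ns/http-client"
    extension-element-prefixes="ixsl">

   <xsl:template name="send-request">
      <xsl:param name="body" as="node()"/>
      <!-- Ask the processor to call our template at some point in the future -->
      <ixsl:schedule-action wait="0">
         <xsl:call-template name="my-template">
            <xsl:with-param name="body" select="$body"/>
         </xsl:call-template>
      </ixsl:schedule-action>
   </xsl:template>

   <xsl:template name="my-template">
      <xsl:param name="body" as="node()"/>
      <!-- ASSUMPTION: http:post($href, $body) is a hypothetical signature
           for the draft EXPath HTTP Client, returning a response map -->
      <xsl:variable name="response" as="map(*)"
          select="http:post('http://localhost:19757/mywebapp/receiveXML', $body)"/>
      <!-- The program, not the processor, decides what happens next -->
      <xsl:for-each select="$response?body">
         <xsl:call-template name="process-response-body"/>
      </xsl:for-each>
   </xsl:template>

</xsl:stylesheet>
```

Nothing in my-template is specific to HTTP; the call to http:post could be swapped for any other function or processing without changing how the work is scheduled.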

Well, almost! I have to admit to a slight omission of detail to better favour my argument for a generic approach. When ixsl:schedule-action is invoked for a document or http-request, it returns one or more JavaScript XHR (XMLHttpRequest) objects as XDM items. The purpose of returning these objects is to allow the developer/user to cancel the HTTP requests. To me this looks very much like the concept of a Cancellable Task in some other languages! Unfortunately, when ixsl:schedule-action is invoked for a wait, it returns the empty sequence. To deliver a generic approach, I think a small modification could be made so that when invoking for wait we return a zero-arity XDM function, which acts as an Abort Function, allowing us to stop the processing of the template called by ixsl:schedule-action. If wait were then used for an HTTP request, we could abort the request using such a function.
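A sketch of how such an Abort Function might look to the stylesheet author follows. This is purely hypothetical: as noted above, a wait invocation currently returns the empty sequence, so the function type, the $abort variable, and the $user-cancelled parameter are all assumptions made for illustration, and the fragment assumes the usual xsl and ixsl namespace bindings.

```xslt
<xsl:template name="start-and-maybe-cancel">
   <!-- Hypothetical flag standing in for whatever decides to cancel -->
   <xsl:param name="user-cancelled" select="false()"/>

   <!-- HYPOTHETICAL: assumes that ixsl:schedule-action invoked with @wait
        returns a zero-arity abort function instead of the empty sequence -->
   <xsl:variable name="abort" as="function() as empty-sequence()">
      <ixsl:schedule-action wait="5000">
         <xsl:call-template name="my-template"/>
      </ixsl:schedule-action>
   </xsl:variable>

   <!-- Calling the function would stop my-template from being processed,
        and with it any HTTP request that my-template would have made -->
   <xsl:if test="$user-cancelled">
      <xsl:sequence select="$abort()"/>
   </xsl:if>
</xsl:template>
```

The appeal of a plain function is that it needs no knowledge of XHR objects; the same value could cancel any kind of scheduled work.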

An EXPath Standard for Asynchronous

So, dear reader, I hope to have argued my case that, for XPath-based languages, the concept of making HTTP requests and processing the response should be considered separately from the topic of asynchronous processing. I believe that asynchronous processing is more widely applicable than just the use-cases of requesting documents or resources.

I think we likely need an EXPath standard for asynchronous processing. I am hopeful that XSLT and XQuery are similar enough that we can just define a set of functions for such a purpose.
