xmlmirror

21 jul 2016

motivation

we are happy to announce the initial release of a useful new tool called xmlmirror. as the name more or less spells out, xmlmirror is an XML webeditor with schema validation, based on webforms and implemented with codemirror. xmlmirror further uses a library called Fast-XML-Lint which uses libxml2 for schema verification and which is compiled with emscripten. or in layman’s terms: a web application that really helps you to create complex XML documents from scratch, as well as fix existing documents that are broken.

live demo / source code

features

more details

selenium

unit testing was implemented using selenium 2.53:

nix-shell -p python35Packages.selenium firefox-bin --command "python3 selenium_test.py"

it works like this:

  1. opens a specially crafted html document, schemainfoCreator-test.html, in a web browser
  2. lets the page run its tests and waits up to 10 seconds for an element with the id “OK” to appear; if it does not appear, the test fails with a timeout

selenium_test.py:

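from pathlib import Path

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# the browser needs a full URL, so turn the local test page into a file:// URL
test_page = Path("schemainfoCreator-test.html").resolve().as_uri()

ff = webdriver.Firefox()
try:
    ff.get(test_page)
    assert "schemaInfo unit-test" in ff.title

    # raises TimeoutException if no element with id "OK" shows up within 10 seconds
    WebDriverWait(ff, 10).until(EC.presence_of_element_located((By.ID, "OK")))
finally:
    # always close the browser, whether the test passed or failed
    ff.quit()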

closure-compiler

the google closure compiler was used to get strict type checking, even though the code is plain javascript:

closure_compiler/jcc schemainfoCreator.js

hint: it was no joy to use this tooling due to a lack of documentation and examples.

fastXmlLint.c

it took a bit of time to get into the internals of libxml2 and the antique API documentation is more confusing than helpful. anyway, two interesting results:

  1. even though xmllint can parse xml documents with a multi-document relax-ng schema it can’t be used to parse a multi-document relax-ng schema itself, see discussion

  2. ltrace is your best friend in reverse-engineering shared object/library usage:

    for instance, running:

    ltrace -f xmllint --relaxng html5-rng/xhtml.rng test_fail1.html

    would yield:

    ...
    xmlSAXDefaultVersion(2, 0x40bca9, 116, 112) = 2
    getenv("XMLLINT_INDENT") = nil
    xmlGetExternalEntityLoader(0x7fff26fdca5d, 0x40bd36, 1, 76) = 0x7f969b6c1160
    xmlSetExternalEntityLoader(0x407660, 0x40bd36, 1, 76) = 0x7f969b6c1160
    xmlLineNumbersDefault(1, 0x40bd36, 1, 76) = 0
    xmlSubstituteEntitiesDefault(1, 0x40bd36, 0, 76) = 0
    __xmlLoadExtDtdDefaultValue(1, 0x40bd36, 0, 76) = 0x7f969b9c0a1c
    xmlRelaxNGNewParserCtxt(0x7fff26fdaf23, 0x40bd36, 0, 76) = 0x245c490
    xmlRelaxNGSetParserErrors(0x245c490, 0x404ac0, 0x404ac0, 0x7f969af39080) = 0x245c490
    xmlRelaxNGParse(0x245c490, 0x404ac0, 0x404ac0, 0x7f969af39080) = 0x2527f70
    xmlRelaxNGFreeParserCtxt(0x245c490, 0xffffffff, 0x7f969af38678, 0x25b5570) = 1
    ...

    and this is the exact order of libxml2 function calls xmllint issues to parse test_fail1.html!

    note: this helped us a lot and made it possible to discover the secret xmlLineNumbersDefault function!
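to turn such a trace into a plain list of libxml2 calls, a small filter script is enough. the following is a minimal sketch and not part of xmlmirror: the helper name called_functions and the sample string are made up for illustration, and it assumes that every traced call starts its line with the function name, optionally preceded by the "[pid N]" prefix that ltrace -f prints.

```python
import re

# matches the start of an ltrace line such as
#   xmlRelaxNGParse(0x245c490, 0x404ac0, ...) = 0x2527f70
# with an optional "[pid 1234] " prefix in front of it
CALL_RE = re.compile(r"^\s*(?:\[pid \d+\]\s+)?([A-Za-z_]\w*)\(")


def called_functions(trace, prefix="xml"):
    """return the traced function names starting with prefix, in call order."""
    names = []
    for line in trace.splitlines():
        match = CALL_RE.match(line)
        if match and match.group(1).startswith(prefix):
            names.append(match.group(1))
    return names


# a few lines taken from the trace above (hypothetical variable name)
sample = """\
xmlLineNumbersDefault(1, 0x40bd36, 1, 76) = 0
getenv("XMLLINT_INDENT") = nil
xmlRelaxNGNewParserCtxt(0x7fff26fdaf23, 0x40bd36, 0, 76) = 0x245c490
xmlRelaxNGParse(0x245c490, 0x404ac0, 0x404ac0, 0x7f969af39080) = 0x2527f70
"""

print(called_functions(sample))
# ['xmlLineNumbersDefault', 'xmlRelaxNGNewParserCtxt', 'xmlRelaxNGParse']
```

note that the default prefix skips names like __xmlLoadExtDtdDefaultValue; pass prefix="" to keep every call.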

emscripten

during this project we had the idea to create a c to javascript cross-compiler abstraction using nix for emscripten and we are happy to announce that it is now officially in nixpkgs, see PR 16208.

this means:

  1. you can use nix-build to cross-compile all your dependencies like libz and afterwards use these in your project
  2. since nix runs on all linuxes, mac os x and other unix-like platforms, you can now enjoy full toolchain automation and deployment when working with emscripten.

if using nixpkgs (master), you can check for emscripten targets using:

nix-env -qaP | grep emscriptenPackages

and install using:

nix-env -iA emscriptenPackages.json_c

note: don’t mix json_c (native, x86) with other libs (emscripten, javascript) in your user-profile or you will get weird error messages with object code being in the wrong format and such.
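if you do run into such a format error, looking at the first bytes of the offending file usually tells you which toolchain produced it. the following is a minimal sketch under assumptions the article does not state: that native libraries are ELF objects and that the emscripten toolchain emits LLVM bitcode or WebAssembly. the function names are hypothetical.

```python
# well-known magic numbers at the start of object files
MAGIC = [
    (b"\x7fELF", "native ELF object"),
    (b"BC\xc0\xde", "LLVM bitcode"),
    (b"\x00asm", "WebAssembly module"),
    (b"!<arch>\n", "ar archive (check its members)"),
]


def classify(head):
    """guess the object format from the first bytes of a file."""
    for magic, label in MAGIC:
        if head.startswith(magic):
            return label
    return "unknown"


def object_format(path):
    """read the first 8 bytes of path and classify them."""
    with open(path, "rb") as handle:
        return classify(handle.read(8))
```

calling object_format on a library from your user-profile then shows whether it is a native or an emscripten build.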

nixos development

nix-shell was the primary development tool along with the default.nix which can basically spawn two different environments:

  • nix-shell -A emEnv - emscripten environment: used to compile c-code to javascript
  • nix-shell -A nativeEnv - native environment: used to develop the c-code in question and also for unit testing purposes

see Makefile.emEnv and Makefile.nativeEnv respectively.

let’s have a look at the default.nix:

let 

...

emEnvironment = stdenv.mkDerivation rec {
  name = "emEnv";
  shellHook = ''
    export HISTFILE=".zsh_history"
    alias make="colormake -f Makefile.emEnv"
    alias c="while true; do inotifywait * -e modify --quiet > /dev/null; clear; make closure| head -n 30; done"
    alias s="python customserver.py"
    alias jcc=closure_compiler/jcc
    echo "welcome to the emEnvironment"
    PS1="emEnv: \$? \w \[$(tput sgr0)\]"
  '';

  buildInputs = [ json-c libz xml-js ] ++ [ colormake nodejs emscripten autoconf automake libtool pkgconfig gnumake strace ltrace python openjdk ncurses ];
};

...

in

{
  # use nix-shell with -A to select the wanted environment to work with:
  #   --pure is optional

  # nix-shell -A nativeEnv --pure  
  nativeEnv = nativeEnvironment;
  # nix-shell -A emEnv --pure  
  emEnv = emEnvironment;
}

you will notice that emEnv is a stdenv.mkDerivation and it uses shellHook and buildInputs.

some remarks:

  • we set a HISTFILE and get a project based history which is nice
  • using alias we override make with colormake and also set the target Makefile to Makefile.emEnv
  • setting a custom PS1 makes it easier to identify the shell when working on n+1 projects at the same time
  • the s alias runs a python webserver with xhtml mime-type support, which is handy when developing with chromium as XHR requests will be working then
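customserver.py itself is not shown in this article. as a rough idea of what such a server can look like, here is a minimal sketch for python 3 using only the standard library; the class name XhtmlHandler, the function serve and the port 8000 are assumptions made for this example and not taken from the project.

```python
import http.server


class XhtmlHandler(http.server.SimpleHTTPRequestHandler):
    # extend the default extension table so that .xhtml files are
    # served as application/xhtml+xml
    extensions_map = {
        **http.server.SimpleHTTPRequestHandler.extensions_map,
        ".xhtml": "application/xhtml+xml",
    }


def serve(port=8000):
    """serve the current directory on the given port until interrupted."""
    with http.server.HTTPServer(("", port), XhtmlHandler) as httpd:
        httpd.serve_forever()
```

calling serve() from the project directory makes the files available at http://localhost:8000/, so the page is loaded over http instead of from a file:// URL.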

note: the default.nix still contains the packaging for libz, json-c and xml-js; since these are now in nixpkgs, that part is more or less obsolete.

conclusion

we (paul/joachim) want to thank Jos van den Oever (prolific open source contributor and co-chair of the OASIS ODF TC on behalf of the Dutch government) for inspiring the creation of this tool. ODF is a prominent example of a real-world standard that leverages the relax ng standard, and we expect xmlmirror to be very useful in the creation of more ODF autotests. Jos has also graciously offered to provide an initial host repository for xmlmirror.

  • schema parsing in codemirror can now easily be extended with all relax ng schemas!

  • also thanks to profpatsch for his explanations on the nix feature called override, see emscripten-packages.nix

  • we also want to thank nlnet foundation for their financial contribution from the ODF fund which enabled us to complete this interesting project. thanks as well to Michiel Leenaars (not only from nlnet but also one of the people behind the ODF plugfests) for his interest in the project. we now have a really powerful xml editor, have made huge progress with the emscripten toolchain on nixos and have created a pretty useful development workflow.

if you have questions/comments, see nixcloud.io for contact details.
