Higher level than Csound: it seems to describe only the notes, not the exact waveforms.
This also makes it a bit harder to convert to actual sound: https://stackoverflow.com/questions/33775336/convert-musicxml-to-wav, but possibly easier to convert to LilyPond.
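To make the "notes only" point concrete, here is a minimal Python sketch of what a converter has to supply on its own. The embedded MusicXML fragment, the function names, the sine-wave timbre, and the tempo are all assumptions made for illustration: nothing in the file says what a note should sound like. Chords, ties, multiple voices, and dynamics are ignored.

```python
import math
import struct
import wave
import xml.etree.ElementTree as ET

# Hypothetical minimal MusicXML fragment: two notes, no sound information.
MUSICXML = """<score-partwise version="3.1">
  <part id="P1">
    <measure number="1">
      <attributes><divisions>1</divisions></attributes>
      <note><pitch><step>C</step><octave>4</octave></pitch><duration>1</duration></note>
      <note><pitch><step>A</step><octave>4</octave></pitch><duration>2</duration></note>
    </measure>
  </part>
</score-partwise>"""

# Semitone offset of each step within an octave, starting from C.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def parse_notes(xml_text):
    """Return a list of (frequency_hz, beats); rests get frequency 0.0."""
    root = ET.fromstring(xml_text)
    # <divisions> says how many duration units make up one quarter note.
    divisions = int(root.find(".//divisions").text)
    notes = []
    for note in root.iter("note"):
        beats = int(note.findtext("duration")) / divisions
        pitch = note.find("pitch")
        if pitch is None:  # a rest: keep the time, play silence
            notes.append((0.0, beats))
            continue
        step = pitch.findtext("step")
        octave = int(pitch.findtext("octave"))
        alter = int(pitch.findtext("alter", default="0"))
        midi = 12 * (octave + 1) + SEMITONES[step] + alter
        # Equal temperament with A4 = 440 Hz: our choice, not the file's.
        notes.append((440.0 * 2 ** ((midi - 69) / 12), beats))
    return notes


def render(notes, sample_rate=8000, seconds_per_beat=0.5):
    """Render notes as plain sine waves; the waveform is our assumption."""
    samples = []
    for freq, beats in notes:
        n = int(sample_rate * seconds_per_beat * beats)
        samples.extend(
            math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)
        )
    return samples


def write_wav(path, samples, sample_rate=8000):
    """Write mono 16-bit PCM at half of full scale."""
    frames = struct.pack(
        "<%dh" % len(samples), *(int(s * 16383) for s in samples)
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(frames)
```

Calling `write_wav("out.wav", render(parse_notes(MUSICXML)))` would produce a playable file, but every audible property of it (timbre, tuning, tempo) comes from the script rather than from the MusicXML.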
Now they need to create a "MusicCSS" that gives the waveforms! :-)
The usual "let's make a standard without a reference implementation" W3C approach.