xamarin/emscripten-test

Name: emscripten-test

Owner: Xamarin

Description: null

Created: 2016-11-14 21:00:54.0

Updated: 2018-01-22 18:47:53.0

Pushed: 2017-06-23 22:35:57.0

Homepage: null

Size: 21

Language: C

README

MONO EMSCRIPTEN PROTOTYPE NOTES

This is a small repo for testing an attempt at an emscripten port of mono. Enclosed are a C# test program, a C “driver” program, and these instructions. The current prototype gets partially into runtime startup before failing when it tries to load assemblies.

If you just want to run the prototype, skip to “How to run this” at the end.

Why I did this

Emscripten is a C-to-JavaScript compiler produced by Mozilla. Its compiler (emcc) compiles C to LLVM bitcode (its “.o” and “.a” files are literally this). Its linker links LLVM bitcode files into asm.js, which is a restricted subset of JavaScript. Its libc fakes many things which you normally expect in C but which are not part of a web browser (for example, a filesystem).
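
To make the compile and link steps concrete, here is a minimal sketch using a throwaway hello.c. The file names and the scratch directory are hypothetical and not part of this repository, and the build is skipped when emcc or node is not on PATH. Newer Emscripten releases may use different intermediate and output formats than the ones described above, so treat this as an illustration of the workflow only.

```shell
# Work in a scratch directory so nothing in the checkout is touched.
workdir=$(mktemp -d)

# A throwaway C program (hypothetical; not part of this repository).
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>

int main(void) {
    printf("hello from emcc\n");
    return 0;
}
EOF

# Only attempt the build when the Emscripten tools and Node are on PATH.
if command -v emcc >/dev/null 2>&1 && command -v node >/dev/null 2>&1; then
    # Compile step: C source to an object file.
    emcc -c "$workdir/hello.c" -o "$workdir/hello.o"
    # Link step: object file to a JavaScript program.
    emcc "$workdir/hello.o" -o "$workdir/hello.js"
    # Run the generated JavaScript under Node.
    node "$workdir/hello.js"
fi
```

The last command runs the generated JavaScript under Node, which is the same way the prototype is run at the end of these notes.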

The team responsible for Emscripten is deeply involved with the effort for WebAssembly, which is a bytecode format that can be executed by web browsers. WebAssembly is very close to an initial public release. There is a branch of Emscripten which can output WebAssembly now. There is an experimental version of Firefox you can download which executes WebAssembly, and it will probably be in “real” Firefox by spring 2017.

I think supporting WebAssembly is an important long-term goal for Mono. Emscripten is by far the most plausible means to do this.

Prototype scope

My plan was to compile libmono using Emscripten, and link it with bitcode from Mono's AOT bitcode compiler. My goals with this project were

Platform limitations

There are some things Emscripten cannot do but which we usually depend on.

  1. No signals
  2. No register access, no access to ucontext_t or equivalent, no stack access, no unwinding. (I believe it's actually storing “registers” in JavaScript variables and using the JavaScript stack as the stack.)
  3. No threads. (JavaScript has a model for spawning simultaneous execution threads, but they do not share memory.)
  4. Not all LLVM bitcode ops/intrinsics are supported. The stated goal of the emscripten project is to support the bitcode emitted by Clang. They view support for LLVM bitcode emitted by other sources (like us) as a “nice-to-have”.

Some of these limitations are similar to those we already encounter on platforms like the Apple Watch. The LLVM bitcode and cooperative GC work we have already done get us a long way toward something that works.

Some of the remaining limitations will improve in the future. WebAssembly is more isolated from JavaScript than asm.js and might support something more like a “real” stack; I haven't investigated. There is something called SharedArrayBuffer which may be added to JavaScript soon, and I hear reports that C++11 threads are a candidate feature for wasm after 1.0.

The only real blockers here I think are the lack of any stack-walking primitives, which will hurt our GC badly, and the lack of threads. We can deal with the former by rewriting our GC, if we decide it is worth it. The latter is a more serious problem, especially if we want to create something fundamentally more useful than existing projects such as Katelyn Gadd's JSIL.

What happened

Trivial test

My first pass was to write a small C# program that allocated no memory and used nothing from corlib. The body of Main looked like:

int x = 5; int y = 1;
while (x > 0) {
    y *= x; x -= 1;
}
return y;

It was trivially easy to compile this into bitcode and get emscripten to emit a js file; I could embed the js file in a browser; I could call Test_Program_Main_string_() in the JavaScript console and it returned 120. Success.
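
As a quick sanity check on the expected value, the same loop can be traced in plain shell arithmetic. This is only a restatement of the C# snippet above, not something from the repository (the function name main_result is made up for illustration); it computes 5 × 4 × 3 × 2 × 1.

```shell
# Same loop as the C# Main above, written with POSIX shell arithmetic.
# main_result is a hypothetical name used only for this illustration.
main_result() {
    x=5
    y=1
    while [ "$x" -gt 0 ]; do
        y=$((y * x))
        x=$((x - 1))
    done
    echo "$y"
}

main_result   # prints 120
```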

Minimal test

My next pass was to add a single object allocation to the test program. This did not immediately work because there was no libmono and therefore no GOT.

To make this test work, I had to build an Emscripten-compiled libmono; I also needed to AOT compile mscorlib and Emscripten-compile a small driver program that uses the embedding API to invoke libmono. My model for this was the launcher program from our PS4 port.

I spent about two weeks on this first-pass prototype. Here is where I wound up:

I can successfully compile all of libmono; whatever combination of C# standard libraries I want; the C# test program; the C driver program; and a packed filesystem including mscorlib.dll, together into a js file. The js file is 192 MB in size. I can run the js file in Node; it takes twenty seconds before anything starts happening, I assume because of the size of the file. (I did the sketchiest possible browser test and it did not seem to have the load-time issue.) When Node runs the js file, libmono loads and the runtime gets a significant portion of the way into initializing. It then fails with “Runtime critical type System.Object not found”: it does successfully load the mscorlib.dll assembly itself, but is unable to find System.Object's data in the AOT cache, and thus fails to load the class.

I believe this is all very promising and indicates we likely need to only solve a few problems before the minimal test can run. However, we would probably need to solve the js file size / load time problem to create anything remotely useful to a customer.
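
To keep an eye on the size and load-time problem while iterating, something like the following could be run after each link. This is a rough sketch, not part of the repository: size_mb is a hypothetical helper, “MB” is assumed to mean 10^6 bytes, and csharp.js and test.js are the file names used in the build steps at the end of these notes.

```shell
# Hypothetical helper: print a file's size in whole megabytes (10^6 bytes),
# rounded down. Returns non-zero if the file cannot be read.
size_mb() {
    bytes=$(wc -c < "$1") || return 1
    echo $((bytes / 1000000))
}

# csharp.js is the linker output named in the build steps; skip if it is absent.
if [ -f csharp.js ]; then
    echo "csharp.js: $(size_mb csharp.js) MB"
    # Rough wall-clock measurement of a full run under Node, startup included.
    time node test.js
fi
```

Comparing the reported size and the elapsed time across builds gives a crude way to tell whether a change (for example, linking fewer libraries) actually helps.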

What exactly I did

Here are the changes I made to libmono to make it compile and run:

Remaining issues

Major TODO items

I also think it is worth approaching the WebAssembly developers sometime soon to ask about, at minimum:

Minor things that don't work

Entirely abstract issues

These do not actually need to be solved, but are sort of general Mono design issues that made this experiment more awkward than it could be. These are things to think about if in future we want Mono to elegantly support “things like WebAssembly”:

Late June TODOs

How to run this

My repro steps are ad hoc and are mostly shaped by a desire to not ever have to run make install. But:

Install emscripten and node.js (I suggest doing this via Homebrew). Check out this repository and cd to it. In a terminal run the following steps:

# Check out llvm and mono prototype branch
git submodule init && git submodule update --recursive

# Build llvm and install it into an "install/llvm" dir
mkdir -p install/llvm
cd external/llvm
(cd cmake && cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=`pwd`/../../../install/llvm ..)
(cd cmake && make install)
cd ../..

# Build compiler. Also build a mobile_static corlib and disable ALWAYS_AOT.
cd external/mono-compile
git submodule update --init --recursive
./autogen.sh --enable-nls=no CC="ccache clang" --disable-boehm --with-sigaltstack=no --prefix=`pwd`/../../install/mono-compile --enable-maintainer-mode --enable-llvm --enable-llvm-runtime --with-llvm=`pwd`/../../install/llvm --with-runtime_preset=testing_aot_full --disable-btls
make -j8
cd ../..

# Build runtime. You're going to be doing this with emscripten, so it looks a little funny.
# Running "make" will get only as far as starting to build the standard library, then fail
# with an error about mcs or jay/jay. That's fine, keep going, we only need the static libs.
cd external/mono-runtime
emconfigure ./autogen.sh --enable-nls=no --disable-boehm --with-sigaltstack=no --prefix=`pwd`/../../install/mono-runtime --enable-maintainer-mode --with-cooperative-gc=yes --enable-division-check --with-sgen-default-concurrent=no --host=asmjs-local-emscripten --enable-minimal=jit --disable-mcs-build --disable-btls
emmake make -j
cd ../..

# Now we're going to actually build the emscripten example. First let's open a new bash session:
bash

# And set some environment variables.
# This is mostly to set us up to use the AOT compiler without actually installing it.
export TESTS=`pwd`;
export COMPILER=$TESTS/external/mono-compile RUNTIME=$TESTS/external/mono-runtime;
export COMPILER_BIN=$COMPILER/runtime/_tmpinst/bin;
export MONO_CFG_DIR=$COMPILER/runtime/etc MONO_PATH=$COMPILER/mcs/class/lib/net_4_x;
export PATH=$PATH:"$TESTS/install/llvm/bin";
export MSCORLIB_PATH=$COMPILER/mcs/class/lib/testing_aot_full;
export MSCORLIB=$MSCORLIB_PATH/mscorlib.dll

# This will become the virtual filesystem of the emscripten program
mkdir -p assembly

# Build mscorlib into bitcode in the current directory.
MONO_PATH=$MSCORLIB_PATH MONO_ENABLE_COOP=1 $COMPILER/mono/mini/mono --aot=static,llvmonly,asmonly,llvm-outfile=mscorlib.bc $MSCORLIB

# Copy the assembly into the virtual filesystem and strip its IL.
cp $MSCORLIB assembly
$COMPILER/mono/mini/mono $COMPILER/mcs/class/lib/net_4_x/mono-cil-strip.exe assembly/mscorlib.dll

# Compile the test program.
$COMPILER_BIN/mcs program.cs -t:library -out:program.dll -debug:full

# AOT the test program, again to bitcode in this directory.
MONO_ENABLE_COOP=1 $COMPILER/mono/mini/mono --aot=static,llvmonly,asmonly,llvm-outfile=program.bc program.dll

# Copy the test program's assembly into the virtual file system and strip that IL too.
cp program.dll assembly/program.dll
$COMPILER/mono/mini/mono $COMPILER/mcs/class/lib/net_4_x/mono-cil-strip.exe assembly/program.dll

# Build the C driver.
emcc -c driver.c

# Link.
# A few things to note here: The mono libraries all have to be listed as a group,
# since they have circular dependencies; the emscripten linker takes extra arguments
# via -s, which we use to set the heap size (TOTAL_MEMORY is 128 MiB, i.e.
# 128 * 1024 * 1024 bytes) and to make sure main() is visible to node; --embed-file
# packs the assembly directory in as the root of the virtual file system.
emcc -L$RUNTIME/mono/sgen/.libs -L$RUNTIME/mono/mini/.libs -L$RUNTIME/eglib/src/.libs -L$RUNTIME/mono/metadata/.libs program.bc -L$RUNTIME/mono/utils/.libs mscorlib.bc driver.o -Wl,--start-group -lmonoutils -lmini-static -lmonoruntimesgen-static -lmonosgen-static -Wl,--end-group -leglib -o csharp.js -s EXPORTED_FUNCTIONS='["_main"]' --embed-file assembly\@/ -s TOTAL_MEMORY=134217728

# Run a small node program that loads the emscripten output and executes it.
# If this worked, it would print "120".
node test.js
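
Since the steps above take a long time and fail late if a tool is missing, a preflight check along these lines may save some time. This is a sketch, not part of the repository: missing_tools is a hypothetical helper, and the tool list is an assumption read off the commands shown above (autogen.sh will have further requirements of its own).

```shell
# Hypothetical helper (not part of this repository): print the name of each
# listed command that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools the steps above invoke directly. The list is an assumption based on
# the commands shown; adjust it to match your setup.
missing=$(missing_tools git cmake make clang ccache emcc emconfigure emmake node)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing >&2
fi
```

If nothing is reported, every tool named in the list was found; that says nothing about whether the versions are compatible with this prototype.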
