Name: emscripten-test
Owner: Xamarin
Description: null
Created: 2016-11-14 21:00:54.0
Updated: 2018-01-22 18:47:53.0
Pushed: 2017-06-23 22:35:57.0
Homepage: null
Size: 21
Language: C
This is a small repo for testing an attempt at an emscripten port of mono. Enclosed are a C# test program, a C “driver” program, and these instructions. The current prototype gets partially into runtime startup before failing when it tries to load assemblies.
If you just want to run the prototype, skip to “How to run this” at the end.
Emscripten is a C-to-JavaScript compiler produced by Mozilla. Its compiler (emcc) compiles C to LLVM bitcode (its “.o” and “.a” files are literally this). Its linker links LLVM bitcode files into asmjs, which is a restricted subset of JavaScript. Its libc fakes many things which you normally expect in C but which are not part of a web browser (for example, a filesystem).
The team responsible for Emscripten is deeply involved with the effort for WebAssembly, which is a bytecode format that can be executed by web browsers. WebAssembly is very close to an initial public release. There is a branch of Emscripten which can output WebAssembly now. There is an experimental version of Firefox you can download which executes WebAssembly, and it will probably be in “real” Firefox by spring 2017.
I think supporting WebAssembly is an important long term goal for Mono. Emscripten is by far the most plausible means to do this.
My plan was to compile libmono using Emscripten, and link it with bitcode from Mono's AOT bitcode compiler. My goals with this project were
There are some things Emscripten cannot do but which we usually depend on.
Some of these limitations are similar to those we already encounter on platforms like the Apple Watch. The llvm bitcode and cooperative GC work we have already done get us a long way toward something that works.
Some of the remaining limitations will improve in future. WebAssembly is more isolated from JavaScript than asmjs and might support something more like a “real” stack; I haven't investigated. There is something called SharedArrayBuffer which may be added to JavaScript soon, and I hear reports that C++11 threads are a candidate feature for wasm post-1.0.
The only real blockers here I think are the lack of any stack-walking primitives, which will hurt our GC badly, and the lack of threads. We can deal with the former by rewriting our GC, if we decide it is worth it. The latter is a more serious problem, especially if we want to create something fundamentally more useful than existing projects such as Katelyn Gadd's JSIL.
My first pass was to write a small C# program containing no memory usage and no corlib usage. The Main looked like:
int x = 5; int y = 1;
while (x > 0) {
y *= x; x -= 1;
}
return y;
It was trivially easy to compile this into bitcode and get emscripten to emit a js file; I could embed the js file in a browser; I could call Test_Program_Main_string_() in the JavaScript console and it returned 120. Success.
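As a cross-check on the expected value, the same loop can be written as plain C. This is only an illustrative sketch: the function name test_program_main is hypothetical and is not the symbol the AOT compiler emits.

```c
#include <assert.h>

/* Hypothetical C rendering of the C# Main above; not the AOT-generated symbol. */
int test_program_main(void)
{
    int x = 5;
    int y = 1;

    /* Multiply y by 5, 4, 3, 2, 1 in turn. */
    while (x > 0) {
        y *= x;
        x -= 1;
    }

    assert(y > 0); /* the product stays small, so no overflow is expected */
    return y;      /* 5 * 4 * 3 * 2 * 1 = 120 */
}
```

Calling test_program_main() from a small main and printing the result should give the same 120 that the JavaScript console reported.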
My next pass was to add a single object allocation to the test program. This did not immediately work because there was no libmono and therefore no GOT.
To make this test work, I had to build an Emscripten-compiled libmono; I also needed to AOT compile mscorlib and Emscripten-compile a small driver program that uses the embedding API to invoke libmono. My model for this was the launcher program from our PS4 port.
I spent about two weeks on this first-pass prototype. Here is where I wound up:
I can successfully compile all of libmono; whatever combination of C# standard libraries I want; the C# test program; the C driver program; and a packed filesystem including mscorlib.dll, together into a js file. The js file is 192 MB in size. I can run the js file in Node; it takes twenty seconds before anything starts happening, I assume because the js file is 192 MB in size. (I did the sketchiest possible browser test and it did not seem to have the load-time issue.) When Node runs the js file, libmono loads and the runtime gets a significant portion of the way into initializing. It then fails with Runtime critical type System.Object not found: it does successfully load the mscorlib.dll assembly itself, but is unable to find System.Object's data in the aot cache, and thus fails to load the class.
I believe this is all very promising and indicates we likely need to only solve a few problems before the minimal test can run. However, we would probably need to solve the js file size / load time problem to create anything remotely useful to a customer.
Here are the changes I made to libmono to make it compile and run:
Applied a patch from Zoltan which causes us to use LLVMInt32Type () in the place we currently use LLVMInt8Type () in mini-llvm.c (this is necessary because emscripten does not support the llvm.expect.i8 intrinsic, only llvm.expect.i32).
Added support for an asmjs-local-emscripten triple to configure.ac, mono-config.c
Changed mono_thread_info_attach to assert if called twice; TLS is currently broken (see “remaining issues” below), so unless I assert early it fails later in a confusing place
Added #ifndef HOST_EMSCRIPTEN to effectively comment out:
- register_thread in mono-threads.c, skipping many important parts of runtime init
- sgen_unified_suspend_stop_world(), mono_memory_barrier(), mono_memory_read_barrier() and mono_memory_write_barrier() (these return without doing anything)
- async_suspend_critical() and async_abort_critical() in threads.c (these return KeepSuspended)
- [...] g_assert_not_reached when called)
- [...] g_errors)
- is_thread_in_critical_region() in mono-threads.c (returns FALSE)
- g_assert (*code_end > *code_start); in compute_llvm_code_range () in aot-runtime.c which was failing (Rodrigo believes that if this assert is not met, many things will break)
- g_assert (sb_header == sb_header_for_addr (sb_header, desc->block_size)); in alloc_sb in lock-free-alloc.c which was failing. (This seems? bad.)
- mono_thread_info_attach() in domain.c (see above)
Obviously, these #ifndefs were incredibly scattershot, and should be expected to break a number of important things.
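The guard pattern described above can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the actual patch: the sketch_ function names and the barriers_issued counter are hypothetical, and the real mono functions have different signatures and bodies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are mono's real functions. */
static int barriers_issued = 0;

/* Full barrier on ordinary hosts; returns without doing anything
 * when HOST_EMSCRIPTEN is defined. */
void sketch_memory_barrier(void)
{
#ifndef HOST_EMSCRIPTEN
    __sync_synchronize();   /* GCC/Clang builtin: full memory barrier */
    barriers_issued++;
#endif
}

/* Real check on ordinary hosts; constant false under HOST_EMSCRIPTEN. */
bool sketch_thread_in_critical_region(int critical_depth)
{
    assert(critical_depth >= 0);
#ifndef HOST_EMSCRIPTEN
    return critical_depth > 0;
#else
    return false;
#endif
}

/* Lets a caller observe whether the barrier body was compiled in. */
int sketch_barriers_issued(void)
{
    return barriers_issued;
}
```

Building the same file with -DHOST_EMSCRIPTEN turns both functions into stubs, which is why guards like this can silently remove behavior that the rest of the runtime relies on.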
Fixed the madvise calls in mono_mprotect() to use posix_madvise in Emscripten to work around a missing symbol issue.
Added mini-emscripten.h, mini-emscripten.c (these activate on HOST_EMSCRIPTEN), and mono-threads-emscripten.c (this last one activates with USE_EMSCRIPTEN_BACKEND, which is set by HOST_EMSCRIPTEN). They're mostly full of stub methods. I define the counts of all register types to be 0.
Fixed what appears to be an actual typo in atomic.c??
- [...] register thread, or because of whatever problem was causing the assert in lock-free-alloc.c to fail.
- [...] llvm_eh_unwind_init?) our system for determining roots may have to be revised majorly.
- [...] compute_llvm_code_range()?) obviously needs to be restored
I also think it is worth sometime soon approaching WebAssembly to ask about, at minimum:
Besides llvm.expect.i8, building our bitcode files right now produces the following warnings:
LLVM failed for 'Write': opcode oparglist
LLVM failed for 'WriteLine': opcode oparglist
LLVM failed for 'CreateIUnknown': non-default callconv
LLVM failed for 'Concat': opcode oparglist
LLVM failed for 'CoCreateInstance': non-default callconv
Obviously not being able to run Console.WriteLine is a little embarrassing.
Linking the prototype results in a series of warnings, some worrisome:
- [...] mono_ functions are still missing
- [...] llvm_sin_f64, llvm_returnaddress, pthread_getschedparam, inotify_rm_watch, inotify_add_watch, llvm_x86_sse2_pause, getgrnam, pthread_attr_getstacksize, inotify_init, llvm_nacl_atomic_cmpxchg_i64, pthread_setschedparam, getgrgid, pthread_attr_setschedparam, wapi_GetVolumeInformation, llvm_eh_unwind_init, llvm_cos_f64, pthread_attr_getschedpolicy, gc_stats
Note that, somewhat terrifyingly, link errors in emscripten are warnings; you get a warning at link time, and then an exception thrown at runtime if you try to call a nonexistent function.
In my test steps (see below) I don't use --with-runtime_preset=testing_aot_full. I probably should.
These do not actually need to be solved, but are sort of general Mono design issues that made this experiment more awkward than it could be. These are things to think about if in future we want Mono to elegantly support “things like WebAssembly”:
- [...] --aot=static, if we could embed some assemblies directly into the executable. The UWP team was also asking about this.
My repro steps are ad hoc and are mostly shaped by a desire to not ever have to run make install. But:
Install emscripten and node.js (I suggest doing this via Homebrew). Check out this repository and cd to it. In a terminal run the following steps:
# Check out llvm and mono prototype branch
git submodule init && git submodule update --recursive
# Build llvm and install it into an "install/llvm" dir
mkdir -p install/llvm
cd external/llvm
(cd cmake && cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=`pwd`/../../../install/llvm ..)
(cd cmake && make install)
cd ../..
# Build compiler. Also build a mobile_static corlib and disable ALWAYS_AOT.
cd external/mono-compile
git submodule update --init --recursive
./autogen.sh --enable-nls=no CC="ccache clang" --disable-boehm --with-sigaltstack=no --prefix=`pwd`/../../install/mono-compile --enable-maintainer-mode --enable-llvm --enable-llvm-runtime --with-llvm=`pwd`/../../install/llvm --with-runtime_preset=testing_aot_full --disable-btls
make -j8
cd ../..
# Build runtime. You're going to be doing this with emscripten, so it looks a little funny.
# Running "make" will get only as far as starting to build the standard library, then fail
# with an error about mcs or jay/jay. That's fine, keep going, we only need the static libs.
cd external/mono-runtime
emconfigure ./autogen.sh --enable-nls=no --disable-boehm --with-sigaltstack=no --prefix=`pwd`/../../install/mono-runtime --enable-maintainer-mode --with-cooperative-gc=yes --enable-division-check --with-sgen-default-concurrent=no --host=asmjs-local-emscripten --enable-minimal=jit --disable-mcs-build --disable-btls
emmake make -j
cd ../..
# Now we're going to actually build the emscripten example. First let's open a new bash session:
bash
# And set some environment variables.
# This is mostly to set us up to use the AOT compiler without actually installing it.
export TESTS=`pwd`;
export COMPILER=$TESTS/external/mono-compile RUNTIME=$TESTS/external/mono-runtime;
export COMPILER_BIN=$COMPILER/runtime/_tmpinst/bin;
export MONO_CFG_DIR=$COMPILER/runtime/etc MONO_PATH=$COMPILER/mcs/class/lib/net_4_x;
export PATH=$PATH:"$TESTS/install/llvm/bin";
export MSCORLIB_PATH=$COMPILER/mcs/class/lib/testing_aot_full;
export MSCORLIB=$MSCORLIB_PATH/mscorlib.dll
# This will become the virtual filesystem of the emscripten program
mkdir -p assembly
# Build mscorlib into bitcode in the current directory.
MONO_PATH=$MSCORLIB_PATH MONO_ENABLE_COOP=1 $COMPILER/mono/mini/mono --aot=static,llvmonly,asmonly,llvm-outfile=mscorlib.bc $MSCORLIB
# Copy the assembly into the virtual filesystem and strip its IL.
cp $MSCORLIB assembly
$COMPILER/mono/mini/mono $COMPILER/mcs/class/lib/net_4_x/mono-cil-strip.exe assembly/mscorlib.dll
# Compile the test program.
$COMPILER_BIN/mcs program.cs -t:library -out:program.dll -debug:full
# AOT the test program, again to bitcode in this directory.
MONO_ENABLE_COOP=1 $COMPILER/mono/mini/mono --aot=static,llvmonly,asmonly,llvm-outfile=program.bc program.dll
# Copy the test program's assembly into the virtual file system and strip that IL too.
cp program.dll assembly/program.dll
$COMPILER/mono/mini/mono $COMPILER/mcs/class/lib/net_4_x/mono-cil-strip.exe assembly/program.dll
# Build the C driver.
emcc -c driver.c
# Link.
# A few things to note here: The mono libraries have to all get defined as a group,
# since they have recursive dependencies; the emscripten linker takes extra arguments
# via -s, which we use to set the heap, set the virtual file system, and make sure
# void main() is visible to node.
emcc -L$RUNTIME/mono/sgen/.libs -L$RUNTIME/mono/mini/.libs -L$RUNTIME/eglib/src/.libs -L$RUNTIME/mono/metadata/.libs program.bc -L$RUNTIME/mono/utils/.libs mscorlib.bc driver.o -Wl,--start-group -lmonoutils -lmini-static -lmonoruntimesgen-static -lmonosgen-static -Wl,--end-group -leglib -o csharp.js -s EXPORTED_FUNCTIONS='["_main"]' --embed-file assembly\@/ -s TOTAL_MEMORY=134217728
# Run a small node program that loads the emscripten output and executes it.
# If this worked, it would print "120".
node test.js
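
For reference, the TOTAL_MEMORY value in the link step is a byte count, and 134217728 is 128 MiB written out. The arithmetic can be sketched in C as below; heap_bytes_from_mib is a hypothetical helper name used only for illustration, not part of this repository or of Emscripten.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a heap size in MiB into the byte count
 * passed to the linker as -s TOTAL_MEMORY=<bytes>. */
uint32_t heap_bytes_from_mib(uint32_t mib)
{
    assert(mib < 4096); /* keep the product within 32 bits */
    return mib * 1024u * 1024u;
}
```

With this helper, heap_bytes_from_mib(128) gives 134217728, the value used in the emcc command above. To try a different heap size, substitute the new byte count in that command.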