Jake Zohdi Developer
thumbnail

Wasm Sudoku

This is my experiment with compiling C source code to WebAssembly. I first followed documentation from Mozilla here to use emscripten. Though emscripten made it is to quickly get things working, I wanted to explore more in-depth what it would take to do without this library. In the end, I implement some minimal javascript that is able to load a wasm binary created using clang (with wasi-sdk).

The C program solves sudoku boards using AC3 and backtracking algorithms. github here.

When you go to the sudoku game (Run button), it uses an exported wasm.solve method to generate new boards and their solution.

Some Sudoku Information

History of Sudoku” attributes the origins of Sudoku to the Swiss mathematician Leonhard Euler in 1783, with some versions later appearing in French newspapers, and then finally attributed to American Howard Garns in the 1970s.

From Wikipedia: “On July 6, 1895, Le Siècle's rival, La France, refined the puzzle so that it was almost a modern Sudoku and named it carré magique diabolique ('diabolical magic square').”

”The modern Sudoku was most likely designed anonymously by Howard Garns, a 74-year-old retired architect and freelance puzzle constructor from Connersville, Indiana, and first published in 1979 by Dell Magazines as Number Place (the earliest known examples of modern Sudoku).”

In 2011, Gary McGuire and team at University College Dublin showed that 17 filled squares is the minimal number of “hints” to create a uniquely solvable board.

https://www.technologyreview.com/2012/01/06/188520/mathematicians-solve-minimum-sudoku-problem/

Wasm

Emscripten and clang both generate wasm file by leveraging LLVM to an intermediate representation (IR) and then in in WebAssembly binary code. The IR transformation optimizes results heavily, the LLVM toolchain has had a long life of improving code output for languages such as c, ruby, rust, and more.

The resulting wasm code is a compact, low-level binary format designed for fast, near-native execution. The transformation from LLVM IR to WebAssembly is handled by LLVM’s own WebAssembly backend. This backend is responsible for taking the IR and compiling it into a wasm binary that adheres to the WebAssembly specification. The LLVM team actively maintains this backend.

Modern browsers (such as Chrome, Firefox, Safari, and Edge) include a dedicated WebAssembly runtime within their JavaScript engines (e.g., V8 in Chrome, SpiderMonkey in Firefox). When your application loads a wasm module, the browser is responsible for:

The WebAssembly specification is defined and maintained by the WebAssembly Working Group, which operates under the umbrella of the World Wide Web Consortium (W3C). This group is composed of representatives from various stakeholders such as major browser vendors (such as Google, Mozilla, Microsoft, and Apple).

Emscripten

Overall, while it is possible to compile and connect to wasm files by hand (without the use of this library/others like it), emscripten makes it much easier and helps to ensure wider support for your C code. For example if your C code calls standard library functions like printf , emscripten polyfils these (console.log for printf) for you so that no errors are raised.

What does the runtime environment for calling wasm from JS need?

Linear Memory:

WebAssembly uses a linear block of memory (an ArrayBuffer) that simulates the flat memory model of C. This memory is managed through a WebAssembly.Memory object. Your wasm code expects to have a certain amount of memory available that it can use as its heap, stack, and for storing global data.

WebAssembly.Memory: This object represents the linear memory that the wasm module will use. When you allocate memory in your C code (e.g., via malloc), the wasm module uses this memory to fulfill those requests. I had run into errors while calling my wasm file RuntimeError memory access out of bounds error which I was able to resolve by increasing the amount of memory.

importObject: This maps JavaScript objects (like our memory instance) to the expected imports of the wasm module. Typically, Emscripten places these under the env namespace. You can also use this to forward javascript functions to the wasm runtime (via function pointers)

Function Table:

Wasm modules use tables to store function pointers, which are needed for indirect function calls (for example, when calling a function through a pointer or for dynamic dispatch). These are managed by a WebAssembly.Table object. The table holds references to functions that can be called indirectly by the wasm code.

On top of providing runtime helpers, emscripten makes it easy to convert input and output parameters. The wasm functions exported will only recognize a small subset of primitives. The primitives mostly consist of integer types, which includes int, char, and pointers. For example to pass a string or an array to the C code (compiled to wasm), the data must be added to the memory stack first and then a stack pointer will be passed in.

Example of using emscripten to manually add a string to the memory stack

Emscripten emulates a POSIX-like environment in the browser.

If your code uses I/O or other OS defined behavior such as reading a file, emscripten helps to fill in this functionality.

POSIX stands for the "Portable Operating System Interface," which is a family of standards specified by the IEEE that define the interface for operating system services. These standards cover many aspects of OS behavior including file I/O, process control, inter-process communication, and other system calls that C and UNIX-like systems typically provide.

File System Emulation:

Emscripten provides a virtual file system (often referred to as MEMFS or IDBFS) that resides in the browser’s memory or persistent storage (via IndexedDB). This allows your C code—which might use standard functions like fopen, fread, fwrite, or stat—to operate on files as if it were running on a typical POSIX-compliant system. For example, you can read and write files, and the Emscripten runtime handles translating those calls to operations on an in-memory data structure or persistent storage.

When compiled with Emscripten, this code operates on the emulated file system. In the browser, the file “example.txt” would exist only within the Emscripten virtual file system, and you could later inspect it using Emscripten’s APIs or even synchronize it with persistent storage.

Using Clang

After getting code working with emscripten I wanted to see if I could call a wasm binary with my own implemented JS runtime. For my purpose, it wasn’t too difficult but I could easily see for a large/more complex code base, emscripten would be a highly beneficial tool. A few pieces I needed to solve included: downloading and installing wasi-sdk/clang, stubbing some C standard library, and managing string input/output on the memory stack.

wasi-sdk

On initial usage of clang --target=wasm32 -nostdlib -Wl,--no-entry -Wl,--export-al , I would get the errors: No available targets are compatible with triple 'wasm32' or

I was able to solve this by downloading the wasi-sdk release for my OS and running clang via this binary. Example:

Stubbing standard lib.

Once my wasm file was generated, loading the file would initially fail due to: Import #0 "wasi_snapshot_preview1": module is not an object or function like in this issue. Here it seems that this would be a blocker if I was using certain OS functionality, which I removed all instances of such as printf. I was able to solve this by adding the module to the Wasm instantiate method.

Managing string input/output

The wasm binary exports a solve function that takes a char * (string) argument as well as return value. To handle this, we need to add the data to the memory buffer (added in env: as seen above) and pass in a stack pointer (number) of the index in the buffer.

Putting it all together

When I put all of the above pieces together, I end up compiling the solve function from this C code, and use the following typescript as the basis for my sudoku engine. Note that since memory is updated by the wasm binary, it is important to call const afterView = new Uint8Array(this.wasmMemory.buffer, ptr, length); after the wasm function is run. You won’t be able to see the new data even if you know the location of the resultPtr before.

Thanks for reading! On the sudoku frontend, I randomly insert digits into the new board while validating that it’s still solvable using the above method.