From a0c525b30c8a6f9a21f8da9f643d00d9dcd094e6 Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Tue, 1 Aug 2023 12:09:16 -0700 Subject: [PATCH 01/70] Initial commit From 5b29562cd246e1d6526d2ba68c3a2d5d6b42d387 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 9 Aug 2023 15:44:40 -0500 Subject: [PATCH 02/70] Initial setup and overview --- .github/workflows/ci-interpreter.yml | 2 +- README.md | 12 + document/core/conf.py | 4 +- proposals/js-string-builtins/Overview.md | 409 +++++++++++++++++++++++ 4 files changed, 424 insertions(+), 3 deletions(-) create mode 100644 proposals/js-string-builtins/Overview.md diff --git a/.github/workflows/ci-interpreter.yml b/.github/workflows/ci-interpreter.yml index 6edfa0c460..4961393e09 100644 --- a/.github/workflows/ci-interpreter.yml +++ b/.github/workflows/ci-interpreter.yml @@ -31,4 +31,4 @@ jobs: - name: Build interpreter run: cd interpreter && opam exec make - name: Run tests - run: cd interpreter && opam exec make JS=node ci + run: cd interpreter && opam exec make all diff --git a/README.md b/README.md index 21660b8d6a..afcb9a457d 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,18 @@ [![CI for specs](https://github.com/WebAssembly/spec/actions/workflows/ci-spec.yml/badge.svg)](https://github.com/WebAssembly/spec/actions/workflows/ci-spec.yml) [![CI for interpreter & tests](https://github.com/WebAssembly/spec/actions/workflows/ci-interpreter.yml/badge.svg)](https://github.com/WebAssembly/spec/actions/workflows/ci-interpreter.yml) +# JS String Builtins Proposal for WebAssembly + +This repository is a clone of [github.com/WebAssembly/spec/](https://github.com/WebAssembly/spec/). +It is meant for discussion, prototype specification and implementation of a proposal to +add support for efficient access to JS string operations to WebAssembly. + +* See the [overview](proposals/js-string-builtins/Overview.md) for a summary of the proposal. 
+ +* See the [modified spec](https://webassembly.github.io/js-string-builtins/) for details. + +Original `README` from upstream repository follows... + # spec This repository holds the sources for the WebAssembly draft specification diff --git a/document/core/conf.py b/document/core/conf.py index 3952701bdf..e0d4eaf5ca 100644 --- a/document/core/conf.py +++ b/document/core/conf.py @@ -66,10 +66,10 @@ logo = 'static/webassembly.png' # The name of the GitHub repository this resides in -repo = 'spec' +repo = 'js-string-builtins' # The name of the proposal it represents, if any -proposal = '' +proposal = 'js-string-builtins' # The draft version string (clear out for release cuts) draft = ' (Draft ' + date.today().strftime("%Y-%m-%d") + ')' diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md new file mode 100644 index 0000000000..bf18681052 --- /dev/null +++ b/proposals/js-string-builtins/Overview.md @@ -0,0 +1,409 @@ +# JS String Builtins + +## Motivation + +JavaScript runtimes have a rich set of [builtin objects and primitives](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects). Some languages targeting WebAssembly may have compatible primitives and would benefit from being able to use the equivalent JavaScript primitive for their implementation. The most pressing use-case here is for languages who would like to use the JavaScript String type to implement their strings. + +It is already possible to use any JavaScript or Web API from WebAssembly by importing JavaScript 'glue code' which adapts between JavaScript and WebAssembly values and calling conventions. Usually, this has a negligible performance impact and work has been done to optimize this [in runtimes when we can](https://hacks.mozilla.org/2018/10/calls-between-javascript-and-webassembly-are-finally-fast-%F0%9F%8E%89/). 
+ +However, the overhead of importing glue code is prohibitive for primitives such as Strings, ArrayBuffers, RegExp, Map, and BigInt where the desired overhead of operations is a tight sequence of inline instructions, not an indirect function call (which is typical of imported functions). + +## Overview + +This proposal aims to provide a minimal and general mechanism for importing specific JavaScript primitives for efficient usage in WebAssembly code. + +This is done by first adding a mechanism for providing import values at compile-time. This makes it possible for Web runtimes to reliably specialize compiled code to the import value provided. This must be done carefully to not break certain invariants and optimizations that Web runtimes currently rely on. + +Then, it adds a set of builtin functions for performing JavaScript String operations. These builtin functions mirror a subset of the [JavaScript String API](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String) and adapt it to be efficiently callable without JavaScript glue code. + +These two pieces in combination allow runtimes to reliably emit optimal code sequences for JavaScript String operations within WebAssembly modules. In the future, other builtin objects or primitives can be exposed through new builtins. + +## Compile-time imports + +Today, imports are provided when instantiating a module. This prevents web runtimes from being able to know anything about an import (beyond the type specified in the module) while compiling, without either speculation or deferring compilation until instantiation or later. This makes it difficult to optimize for a specific import when compiling a module. + +Speculation and deferred compilation are useful techniques, but should not be necessary to efficiently use specific JavaScript primitives from WebAssembly code on the Web. 
+ +This proposal modifies the WebAssembly JS-API methods for compilation to optionally accept an object that specifies certain import values earlier than instantiation. There are several constraints to keep in mind before fully describing the API. + +### Constraints + +#### Modules can be shared across web workers using postMessage + +Not all import values are shareable across workers. We must be able to send shareable import values across workers, and reject unshareable values. + +It’s possible that we could disable sharing modules that have any compile-time imports for an initial version, but this would need to be solved in the fullness of time. + +#### Compiled modules can be serialized to the network cache + +Web engines can cache optimized code in a network cache entry keyed off of a HTTP fetch request. If the optimized code is specialized to runtime provided import values, we will need to expand the cache key to include those values. There is a risk that specializing to keys that change every page load could effectively disable code caching. + +#### Decoding the imports section can happen on a different thread from the imports object + +Parsing and compiling a module can happen on background threads which cannot perform property lookups on an imports object. + +#### Reading from the imports object requires knowing the keys + +See [‘read the imports’](https://webassembly.github.io/spec/js-api/index.html#read-the-imports). Import value lookup is performed using JavaScript ‘get property’ which requires knowledge of the key you’re looking up. It’s not possible to pull all possible values from the imports object eagerly as it may be a JavaScript proxy or other exotic object which does not provide iteration over all possible keys. + +#### We should standardize the web interfaces that can be specialized to + +Specializing to an import can be critical to the runtime performance of the module. 
We should provide strong guarantees about when specialization happens and to which imports. + +#### Do not conflict with future core wasm features + +Do our best to not conflict with potential future wasm proposals, such as pre-imports, staged compilation, module linking, or the component model. Make minimal or no changes to the core specification. + +### Modifications + +#### Add a WebIDL attribute for `shareable` + +This attribute is to be used on WebAssembly builtins, and possibly other Web interfaces in the future. They can be used with the [structured clone](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm) algorithm, and as compile-time imports. + +As they are well defined for structured clone, they are valid to be sent through postMessage. We may prevent them from being stored in user-facing persistent storage, such as IndexedDB. This is the situation with modules, as well. + +#### Modify the JS-API compilation methods to accept optional options + +``` +dictionary WebAssemblyImportValue { + required USVString module; + required USVString name; + required any value; +} +dictionary WebAssemblyCompileOptions { + optional sequence<WebAssemblyImportValue> imports; +} + +interface Module { + constructor(BufferSource bytes, optional WebAssemblyCompileOptions options); + ... +} + +namespace WebAssembly { + Promise<Module> compile( +BufferSource bytes, +optional WebAssemblyCompileOptions options); + ... +}; + +``` + +The `imports` field is a list of import values to apply when compiling. It is not the same kind of imports object as used when instantiating, due to the above mentioned design constraints around threading and ‘get property’. + +Every import key of ‘module’ and ‘name’ must be specified at most once, or else a `LinkError` will be thrown. Every value provided must have the `shareable` WebIDL attribute.
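
The at-most-once rule for compile-time import keys can be sketched in plain JavaScript. This is only an illustration, assuming entries shaped like the `WebAssemblyImportValue` dictionary. `findDuplicateImportKey` is a hypothetical helper that is not part of the proposal, and an engine following the text above would throw a `LinkError` rather than return a value.

```javascript
// Hypothetical helper: returns the first repeated (module, name) pair in a
// compile-time imports list, or null when every key appears at most once.
function findDuplicateImportKey(imports) {
  // Map from module name to the set of import names already seen for it.
  const seen = new Map();
  for (const { module, name } of imports) {
    let names = seen.get(module);
    if (names === undefined) {
      names = new Set();
      seen.set(module, names);
    }
    if (names.has(name)) {
      return { module, name };
    }
    names.add(name);
  }
  return null;
}

// Example lists; the module and field names are made up for illustration.
const uniqueImports = [
  { module: "env", name: "a", value: 1 },
  { module: "env", name: "b", value: 2 },
];
const repeatedImports = [
  { module: "env", name: "a", value: 1 },
  { module: "env", name: "a", value: 3 },
];

console.log(findDuplicateImportKey(uniqueImports));
console.log(findDuplicateImportKey(repeatedImports));
```

The nested map keeps the `module` and `name` strings separate, so two different pairs cannot collide the way they could if the strings were joined into a single key.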
+ +A module compiled with `imports` extends the [‘Compile a WebAssembly module’](https://webassembly.github.io/spec/js-api/index.html#compile-a-webassembly-module) algorithm to check that the import values are compatible with the module. This could be expressed with a new embedding function `module_validate_imports(module, externval)` which only performs import matching and does not mutate the store. Any issues are reported with a `LinkError`. + +The provided import values are stored in the module object. Any import provided at compile-time does not need to be provided during instantiation. The `WebAssembly.Module.imports()` static method will also exclude listing these imports. + +Any provided import value may be specialized to when compiling the module if the engine deems it profitable. It is expected that standardized WebAssembly builtins will be guaranteed to be specialized to if they are exposed by an engine. + +Because every `shareable` value is valid for structured clone, the compiled module can always be sent with `postMessage`. `shareable` values are also expected to be safe to be cached by the browser in the network cache. + +### Open Questions + +#### Should we throw LinkError for importing a non `shareable` value + +The above proposal throws a `LinkError` if you import a non `shareable` value. It could be possible to import a non `shareable` value if we were to prevent that module from being sent with postMessage. Web engines with module caching would likely not specialize on those values to keep the resulting module cacheable. + +#### Is there a better design for the compile-time imports object? + +Having a different structure for the two different kinds of imports is very unfortunate. + +One option proposed is to have the same kind of imports object as instantiation, but limit the recognized values to properties that are iterable. 
This would simplify the API, but might lead to confusion as the import objects look identical but are not treated the same. + +## Builtins + +### Do we need builtins? + +Now that we have a method for providing imports at compile-time, we need to decide what we're actually importing. + +One interesting option would be for engines to pattern match on import values to well-known API’s. You could imagine an engine recognizing an import to `String.prototype.charCodeAt` and emitting efficient code generation for it. + +There are two main problems with this approach. + +The first problem is that existing API’s require a calling convention conversion to handle differences around the `this` value, which WebAssembly function import calls leave as `undefined`. The second problem is that certain primitive use JS operators such as `===` and `<` that cannot be imported. + +It's possible that we could extend the [js-types](https://github.com/webassembly/js-types) `WebAssembly.Function` API to handle the `this` parameter. However, at this point the pattern matching will become more than one level deep, which becomes increasingly fragile. + +It seems that creating new importable definitions that adapt existing JS primitives to WebAssembly is simpler and more flexible in the future. + +### What is a builtin? + +A builtin is a definition on the WebAssembly namespace that can be imported by a module and provides efficient access to a JavaScript or Web primitive. There will be two types of builtins, functions and types. Type builtins will only become available with the [type-imports proposal](https://github.com/webassembly/type-imports). + +Builtins do not provide any new abilities to WebAssembly. They merely wrap existing primitives in such a manner that WebAssembly can efficiently use them. + +The standardization of builtins will be governed by the WebAssembly standards process and would exist in the JS-API document. 
+ +The bar for adding a new builtin would be that it enables significantly better code generation for an important use-case beyond what is possible with a normal import. We don't want to add a new builtin for every existing API, only ones where adapting the JavaScript API to WebAssembly and allowing inline code generation results in significantly better codegen than a plain function call. + +### Function builtins + +Function builtins would be an instance of `WebAssembly.Function` and have a function type. One conceptualization is that they are a WebAssembly function on the outside and a JavaScript function on the inside. This combination allows efficient adaptation of primitives. + +Their behavior would be defined using algorithmic steps similar to the WebIDL or EcmaScript standards. If possible, we could define them using equivalent JavaScript source code to emphasize that these do not provide any new abilities. + +### Type builtins + +Type builtins would be an instance of the `WebAssembly.Type` interface provided by the [type-imports](https://github.com/webassembly/type-imports) proposal. The values contained in a type builtin would be specified with a predicate. + +## JS String Builtin API + +The following is an initial set of function builtins for JavaScript String. They are defined on a `String` namespace in `WebAssembly` namespace. Each example includes pseudo-code illustrating their operation and some descriptive text. + +TODO: formalize these better. + +[1]: This is meant to refer to what the original String.fromCharCode / String.fromCodePoint / String.prototype.charCodeAt / String.prototype.codePointAt / String.prototype.substring would do, in the absence of any monkey-patching. In a final version of this specification, we'll have to use more robust phrasing to express that; in the meantime, the given phrasing is more readable. + +[2]: "array.length" is meant to express "load the array's length", in Wasm terms: (array.len (local.get $array)). 
+ +[3]: “trap” is meant to emit a wasm trap. This results in a WebAssembly.RuntimeError with the bit set that it is not catchable by exception handling. + +### WebAssembly.String.fromWtf16Array + +``` +func fromWtf16Array( + array: (ref null (array i16)), + start: i32, + end: i32 +) -> (ref extern) +{ + start = ToUint32(start); + end = ToUint32(end); + + // [2] + if (end > array.length || start > end) + trap; + + let result = ""; + for(let i = start; i < end; i++) { + // [1], [4] + result += String.fromCharCode(array[i]); + } + return result; +} +``` + +[4]: "array[i]" is meant to express "load the i-th element of the array", in Wasm terms: (array.get_u $i16-array-type (local.get $array) (local.get $i)) for an appropriate $i16-array-type. + +Note: This function takes an i16 array defined in its own recursion group. If this is an issue for a toolchain, we can look into how to relax the function type while still maintaining good performance. + +### WebAssembly.String.fromWtf8Array + +``` +func fromWtf8Array( + array: (ref null (array i8)), + start: i32, + end: i32 +) -> (ref extern) +{ + start = ToUint32(start); + end = ToUint32(end); + + // [2] + if (end > array.length || start > end) + trap; + + // This summarizes as: "decode the WTF-8 string stored at array[start:end], + // or trap if there is no valid WTF-8 string there". + let result = ""; + while (start < end) { + if there is no valid wtf8-encoded code point at array[start] + trap; + let c be the code point at array[start]; + // [1] + result += String.fromCodePoint(c); + increment start by as many bytes as it took to decode c; + } + return result; +} +``` + +Note to implementers: while this is the only usage of WTF-8 in this document, it shouldn't be very burdensome to implement, because all existing strings in Wasm modules (import/export names, contents of the name section) are already in UTF-8 format, so implementations must already have decoding infrastructure for that. 
We need the relaxation from UTF-8 to WTF-8 to support WTF-16 based source languages, which may have unpaired surrogates in string constants in existing/legacy code. + +### WebAssembly.String.toWtf16Array + +"start" is the index in the array where the first code unit of the string will be written. + +Returns the number of code units written. Traps if the string doesn't fit into the array. + +``` +func toWtf16Array( + string: externref, + array: (ref null (array (mut i16))), + start: i32 +) -> i32 +{ + if (typeof string !== "string") + trap; + + start = ToUint32(start); + + if (start + string.length > array.length) + trap; + + for (let i = 0; i < string.length; i++) { + // [4], [5] + array[start + i] = string.charCodeAt(i); + } + return string.length; +} +``` + +[5]: "array[i] = …" is meant to express "store the value … as the i-th element of the array", in Wasm terms: (array.set $i16-array-type (local.get $array) (local.get $i) (…)) for an appropriate $i16-array-type. + +### WebAssembly.String.fromCharCode + +``` +func fromCharCode( + charCode: i32 +) -> (ref extern) +{ + charCode = ToUint32(charCode); + return String.fromCharCode(charCode); // [1], [6] +} +``` +[6]: Any charCode values greater than 0xFFFF are implicitly truncated to their low 16 bits.
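
The truncation noted for `fromCharCode` can be observed with the plain JavaScript `String.fromCharCode` that the builtin is defined in terms of. The following is a minimal sketch, assuming the `ToUint32` step can be modelled with `>>> 0`. `fromCharCodeSketch` is a hypothetical name used only for illustration and is not part of the proposal.

```javascript
// Hypothetical model of the fromCharCode pseudo-code in plain JavaScript.
function fromCharCodeSketch(charCode) {
  // Model ToUint32 on the incoming i32 value.
  const unsigned = charCode >>> 0;
  // String.fromCharCode applies ToUint16, so only the low 16 bits survive.
  return String.fromCharCode(unsigned);
}

// 0x10041 has low 16 bits 0x0041, which is the code unit for "A".
const truncated = fromCharCodeSketch(0x10041);
// -1 reinterpreted as unsigned is 0xFFFFFFFF, whose low 16 bits are 0xFFFF.
const negative = fromCharCodeSketch(-1);

console.log(truncated === "A");
console.log(negative.charCodeAt(0) === 0xFFFF);
```

Callers that need values above 0xFFFF to become a surrogate pair, rather than be truncated, would use the `fromCodePoint` builtin instead.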
+ +### WebAssembly.String.fromCodePoint + +``` +func fromCodePoint( + codePoint: i32 +) -> (ref extern) +{ + codePoint = ToUint32(codePoint); + if (codePoint > 0x10FFFF) + trap; + + return String.fromCodePoint(codePoint); // [1] +} +``` + +### WebAssembly.String.charCodeAt + +``` +func charCodeAt( + string: externref, + index: i32 +) -> i32 +{ + if (typeof string !== "string") + trap; + + index = ToUint32(index); + if (index >= string.length) + trap; + + return string.charCodeAt(index); // [1] +} +``` + +### WebAssembly.String.codePointAt + +``` +func codePointAt( + string: externref, + index: i32 +) -> i32 +{ + if (typeof string !== "string") + trap; + + index = ToUint32(index); + if (index >= string.length) + trap; + + return string.codePointAt(index); // [1] +} +``` + +### WebAssembly.String.length + +``` +func length(string: externref) -> i32 { + if (typeof string !== "string") + trap; + + return string.length; +} +``` + +### WebAssembly.String.concatenate + +``` +func concatenate( + first: externref, + second: externref +) -> externref +{ + if (typeof first !== "string") + trap; + if (typeof second !== "string") + trap; + + return first + second; +} +``` + +### WebAssembly.String.substring + +``` +func substring( + string: externref, + startIndex: i32, + endIndex: i32 +) -> (ref extern) +{ + if (typeof string !== "string") + trap; + + startIndex = ToUint32(startIndex); + endIndex = ToUint32(endIndex); + if (startIndex > string.length || + startIndex > endIndex) + return ""; + + // [1] + return string.substring(startIndex, endIndex); +} +``` + +Note: We could consider allowing negative start/end indices, and adding them to the string's length to compute the effective indices, like String.prototype.slice does it. Is one of these behaviors more convenient for common use cases? Arguably it is more fitting with Wasm's style to only accept obviously-valid (i.e. 
in-bounds) parameters, and leave it to calling code to decide whether other values (positive out-of-bounds and/or negative) can occur at all, and if yes, how to handle them (map into bounds somehow, or reject). +Note: Taking that thought one step further, we could consider throwing exceptions when startIndex > endIndex or startIndex > string.length or endIndex > string.length. If we do so, we should keep in mind that allowing empty slices (startIndex == endIndex) can be useful when this situation arises dynamically in string-processing algorithms. It is unlikely that throwing instead of returning an empty string in these cases would offer performance benefits. + +### WebAssembly.String.equals + +``` +func equals( + first: externref, + second: externref +) -> i32 +{ + if (first !== null && typeof first !== "string") + trap; + if (second !== null && typeof second !== "string") + trap; + return first === second ? 1 : 0; +} +``` + +### WebAssembly.String.compare + +``` +function compare( + first: externref, + second: externref +) -> i32 +{ + if (typeof first !== "string") + trap; + if (typeof second !== "string") + trap; + + if (first === second) + return 0; + return first < second ? 
-1 : 1; +} +``` From 1dea0a34da783d161ddb5d0b73b30c21499b055e Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 13 Oct 2023 15:50:39 +0200 Subject: [PATCH 03/70] Update proposal * Adopts builtin modules approach * Adds section of polyfilling * Adds section on feature detection * Adds cast/test builtins * Adds future extension ideas for - binding memory - utf8/wtf8 - evolving the type signatures --- proposals/js-string-builtins/Overview.md | 228 +++++++++++++---------- 1 file changed, 128 insertions(+), 100 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index bf18681052..87d6ea7756 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -12,66 +12,65 @@ However, the overhead of importing glue code is prohibitive for primitives such This proposal aims to provide a minimal and general mechanism for importing specific JavaScript primitives for efficient usage in WebAssembly code. -This is done by first adding a mechanism for providing import values at compile-time. This makes it possible for Web runtimes to reliably specialize compiled code to the import value provided. This must be done carefully to not break certain invariants and optimizations that Web runtimes currently rely on. +This is done by first adding a set of builtin functions for performing JavaScript String operations. These builtin functions mirror a subset of the [JavaScript String API](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String) and adapt it to be efficiently callable without JavaScript glue code. -Then, it adds a set of builtin functions for performing JavaScript String operations. These builtin functions mirror a subset of the [JavaScript String API](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String) and adapt it to be efficiently callable without JavaScript glue code. 
+Then a mechanism for importing modules containing these builtins (builtin modules) is added to the WebAssembly JS-API. These modules exist in a new reserved import namespace `wasm:` that is enabled at compile-time with a flag. These two pieces in combination allow runtimes to reliably emit optimal code sequences for JavaScript String operations within WebAssembly modules. In the future, other builtin objects or primitives can be exposed through new builtins. -## Compile-time imports +## Do we need builtins? -Today, imports are provided when instantiating a module. This prevents web runtimes from being able to know anything about an import (beyond the type specified in the module) while compiling, without either speculation or deferring compilation until instantiation or later. This makes it difficult to optimize for a specific import when compiling a module. +It is already possible today to import JS builtin functions (such as `String.prototype.charCodeAt`) from wasm modules. Instead of defining new wasm-specific builtins, we could just re-use those directly. -Speculation and deferred compilation are useful techniques, but should not be necessary to efficiently use specific JavaScript primitives from WebAssembly code on the Web. +There are several problems with this approach. -This proposal modifies the WebAssembly JS-API methods for compilation to optionally accept an object that specifies certain import values earlier than instantiation. There are several constraints to keep in mind before fully describing the API. +The first problem is that existing APIs require a calling convention conversion to handle differences around the `this` value, which WebAssembly function import calls leave as `undefined`. The second problem is that certain primitives use JS operators such as `===` and `<` that cannot be imported.
A third problem is that most JS builtins are extremely permissive of the types of values they accept, and it's desirable to leverage wasm's type system to remove those checks and coercions wherever we can. -### Constraints +It seems that creating new importable definitions that adapt existing JS primitives to WebAssembly is simpler and more flexible in the future. -#### Modules can be shared across web workers using postMessage +## Do we need builtin modules? -Not all import values are shareable across workers. We must be able to send shareable import values across workers, and reject unshareable values. +There is a variety of execution techniques for WebAssembly. Some WebAssembly engines compile modules eagerly (at WebAssembly.compile), some use interpreters and dynamic tiering, and some use on-demand compilation (after instantiation) and dynamic tiering. -It’s possible that we could disable sharing modules that have any compile-time imports for an initial version, but this would need to be solved in the fullness of time. +If we only had builtin functions, it would be possible to import them normally, without any work to add builtin modules. The main issue is that imported values are not known until instantiation, and so engines that compile eagerly would be unable to generate code specialized to these imports. -#### Compiled modules can be serialized to the network cache +It seems desirable to support a variety of execution techniques, especially because engines may support multiple depending on heuristics or change them over time. -Web engines can cache optimized code in a network cache entry keyed off of a HTTP fetch request. If the optimized code is specialized to runtime provided import values, we will need to expand the cache key to include those values. There is a risk that specializing to keys that change every page load could effectively disable code caching.
+By adding builtin modules that are in a reserved and known namespace `wasm:`, engines can know that these builtin functions are being used at `WebAssembly.compile` time and generate optimal code for them. -#### Decoding the imports section can happen on a different thread from the imports object +## Goals for builtins -Parsing and compiling a module can happen on background threads which cannot perform property lookups on an imports object. +Builtins should not provide any new abilities to WebAssembly that JS doesn't already have. They are intended to just wrap existing primitives in such a manner that WebAssembly can efficiently use them. In cases where the primitive already has a name, we should re-use it and not invent a new one. -#### Reading from the imports object requires knowing the keys +The standardization of wasm builtins will be governed by the WebAssembly standards process and would exist in the JS-API document. -See [‘read the imports’](https://webassembly.github.io/spec/js-api/index.html#read-the-imports). Import value lookup is performed using JavaScript ‘get property’ which requires knowledge of the key you’re looking up. It’s not possible to pull all possible values from the imports object eagerly as it may be a JavaScript proxy or other exotic object which does not provide iteration over all possible keys. +The bar for adding a new builtin would be that it enables significantly better code generation for an important use-case beyond what is possible with a normal import. We don't want to add a new builtin for every existing API, only ones where adapting the JavaScript API to WebAssembly and allowing inline code generation results in significantly better codegen than a plain function call. -#### We should standardize the web interfaces that can be specialized to -Specializing to an import can be critical to the runtime performance of the module.
We should provide strong guarantees about when specialization happens and to which imports. +Function builtins are an instance of `WebAssembly.Function` and have a function type. One conceptualization is that they are a WebAssembly function on the outside and a JavaScript function on the inside. This combination allows efficient adaptation of primitives. -#### Do not conflict with future core wasm features +Their behavior would be defined using algorithmic steps similar to the WebIDL or EcmaScript standards. If possible, we could define them using equivalent JavaScript source code to emphasize that these do not provide any new abilities. -Do our best to not conflict with potential future wasm proposals, such as pre-imports, staged compilation, module linking, or the component model. Make minimal or no changes to the core specification. +## Type builtins -### Modifications +Type builtins could be an instance of the `WebAssembly.Type` interface provided by the [type-imports](https://github.com/webassembly/type-imports) proposal. The values contained in a type builtin would be specified with a predicate. -#### Add a WebIDL attribute for `shareable` +## Builtin modules -This attribute is to be used on WebAssembly builtins, and possibly other Web interfaces in the future. They can be used with the [structured clone](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm) algorithm, and as compile-time imports. +Builtin modules provide a collection of function or type builtins that can be imported. Each builtin module has a name, such as `js-string`, and lives under the `wasm:` namespace. A full import specifier would therefore be `(import "wasm:js-string" "equals" ...)`. -As they are well defined for structured clone, they are valid to be sent through postMessage. We may prevent them from being stored in user-facing persistent storage, such as IndexedDB. This is the situation with modules, as well. 
+The JS-API does not reserve a `wasm:` namespace today, so modules theoretically could already be using this namespace. Additionally, some users may wish to disable this feature for modules they compile so they could polyfill it. This feature is therefore opt-in on an individual builtin-module basis. -#### Modify the JS-API compilation methods to accept optional options +To enable just the `js-string` module, a user would compile with: +``` +WebAssembly.compile(bytes, { builtinModules: ['js-string'] }); +``` +The full extension to the JS-API WebIDL is: ``` -dictionary WebAssemblyImportValue { - required USVString module; - required USVString name; - required any value; -} dictionary WebAssemblyCompileOptions { - optional sequence<WebAssemblyImportValue> imports; + optional sequence<USVString> builtinModules; } interface Module { @@ -80,77 +79,34 @@ interface Module { } namespace WebAssembly { - Promise<Module> compile( -BufferSource bytes, -optional WebAssemblyCompileOptions options); - ... + Promise<Module> compile( + BufferSource bytes, + optional WebAssemblyCompileOptions options); + ... }; - ``` -The `imports` field is a list of import values to apply when compiling. It is not the same kind of imports object as used when instantiating, due to the above mentioned design constraints around threading and ‘get property’. - -Every import key of ‘module’ and ‘name’ must be specified at most once, or else a `LinkError` will be thrown. Every value provided must have the `shareable` WebIDL attribute. - -A module compiled with `imports` extends the [‘Compile a WebAssembly module’](https://webassembly.github.io/spec/js-api/index.html#compile-a-webassembly-module) algorithm to check that the import values are compatible with the module. This could be expressed with a new embedding function `module_validate_imports(module, externval)` which only performs import matching and does not mutate the store. Any issues are reported with a `LinkError`. - -The provided import values are stored in the module object.
Any import provided at compile-time does not need to be provided during instantiation. The `WebAssembly.Module.imports()` static method will also exclude listing these imports. - -Any provided import value may be specialized to when compiling the module if the engine deems it profitable. It is expected that standardized WebAssembly builtins will be guaranteed to be specialized to if they are exposed by an engine. - -Because every `shareable` value is valid for structured clone, the compiled module can always be sent with `postMessage`. `shareable` values are also expected to be safe to be cached by the browser in the network cache. - -### Open Questions - -#### Should we throw LinkError for importing a non `shareable` value - -The above proposal throws a `LinkError` if you import a non `shareable` value. It could be possible to import a non `shareable` value if we were to prevent that module from being sent with postMessage. Web engines with module caching would likely not specialize on those values to keep the resulting module cacheable. - -#### Is there a better design for the compile-time imports object? +A wasm module that has enabled a wasm builtin module will have the specific import specifier, such as `wasm:js-string` for that module available and eagerly applied. -Having a different structure for the two different kinds of imports is very unfortunate. +Concretely this means that imports that refer to that specifier will be eagerly checked for link errors at compile time, those imports will not show up in `WebAssembly.Module.imports()`, and those imports will not need to be provided at instantiation time. -One option proposed is to have the same kind of imports object as instantiation, but limit the recognized values to properties that are iterable. This would simplify the API, but might lead to confusion as the import objects look identical but are not treated the same. 
+When the module is instantiated, a unique instantiation of the builtin module is created. This means that re-exports of builtin functions will have different identities if they come from different instances. This is a useful property for future extensions to bind memory to builtins or evolve the types as things like type-imports or a core stringref type are added (see below). -## Builtins +## Feature detection -### Do we need builtins? +Users may wish to detect if a specific builtin module is available in their system. -Now that we have a method for providing imports at compile-time, we need to decide what we're actually importing. +A simple option is to add `WebAssembly.hasBuiltinModule(name)` method. This is likely too coarse grained though, users may wish to know if a specific function in a builtin module is available, as new ones may be added over time. -One interesting option would be for engines to pattern match on import values to well-known API’s. You could imagine an engine recognizing an import to `String.prototype.charCodeAt` and emitting efficient code generation for it. +A more general option would then be to extend `WebAssembly.validate` to also take a list of builtin modules to enable, like compile does. After validating the module, the eager link checking that compile does is also performed. This would allow checking for the presence of individual parts of a builtin module. -There are two main problems with this approach. +## Polyfilling -The first problem is that existing API’s require a calling convention conversion to handle differences around the `this` value, which WebAssembly function import calls leave as `undefined`. The second problem is that certain primitive use JS operators such as `===` and `<` that cannot be imported. - -It's possible that we could extend the [js-types](https://github.com/webassembly/js-types) `WebAssembly.Function` API to handle the `this` parameter. 
However, at this point the pattern matching will become more than one level deep, which becomes increasingly fragile. - -It seems that creating new importable definitions that adapt existing JS primitives to WebAssembly is simpler and more flexible in the future. - -### What is a builtin? - -A builtin is a definition on the WebAssembly namespace that can be imported by a module and provides efficient access to a JavaScript or Web primitive. There will be two types of builtins, functions and types. Type builtins will only become available with the [type-imports proposal](https://github.com/webassembly/type-imports). - -Builtins do not provide any new abilities to WebAssembly. They merely wrap existing primitives in such a manner that WebAssembly can efficiently use them. - -The standardization of builtins will be governed by the WebAssembly standards process and would exist in the JS-API document. - -The bar for adding a new builtin would be that it enables significantly better code generation for an important use-case beyond what is possible with a normal import. We don't want to add a new builtin for every existing API, only ones where adapting the JavaScript API to WebAssembly and allowing inline code generation results in significantly better codegen than a plain function call. - -### Function builtins - -Function builtins would be an instance of `WebAssembly.Function` and have a function type. One conceptualization is that they are a WebAssembly function on the outside and a JavaScript function on the inside. This combination allows efficient adaptation of primitives. - -Their behavior would be defined using algorithmic steps similar to the WebIDL or EcmaScript standards. If possible, we could define them using equivalent JavaScript source code to emphasize that these do not provide any new abilities. 
- -### Type builtins - -Type builtins would be an instance of the `WebAssembly.Type` interface provided by the [type-imports](https://github.com/webassembly/type-imports) proposal. The values contained in a type builtin would be specified with a predicate. +If a user wishes to polyfill these imports for some reason, or is running on a system without a builtin module, these imports may be provided as normal through instantiation. ## JS String Builtin API -The following is an initial set of function builtins for JavaScript String. They are defined on a `String` namespace in `WebAssembly` namespace. Each example includes pseudo-code illustrating their operation and some descriptive text. +The following is an initial set of function builtins for JavaScript String. The builtin module name is `js-string`. Each example includes pseudo-code illustrating their operation and some descriptive text. TODO: formalize these better. @@ -160,7 +116,28 @@ TODO: formalize these better. [3]: “trap” is meant to emit a wasm trap. This results in a WebAssembly.RuntimeError with the bit set that it is not catchable by exception handling. -### WebAssembly.String.fromWtf16Array +### "wasm:js-string" "cast" + +``` +function cast( + string: externref +) { + if (typeof string !== "string") + trap; +} +``` + +### "wasm:js-string" "test" + +``` +function test( + string: externref +) -> i32 { + return typeof string === "string" ? 1 : 0; +} +``` + +### "wasm:js-string" "fromWtf16Array" ``` func fromWtf16Array( @@ -189,7 +166,7 @@ func fromWtf16Array( Note: This function takes an i16 array defined in its own recursion group. If this is an issue for a toolchain, we can look into how to relax the function type while still maintaining good performance. 
-### WebAssembly.String.fromWtf8Array +### "wasm:js-string" "fromWtf8Array" ``` func fromWtf8Array( @@ -222,7 +199,7 @@ func fromWtf8Array( Note to implementers: while this is the only usage of WTF-8 in this document, it shouldn't be very burdensome to implement, because all existing strings in Wasm modules (import/export names, contents of the name section) are already in UTF-8 format, so implementations must already have decoding infrastructure for that. We need the relaxation from UTF-8 to WTF-8 to support WTF-16 based source languages, which may have unpaired surrogates in string constants in existing/legacy code. -### WebAssembly.String.toWtf16Array +### "wasm:js-string" "toWtf16Array" "start" is the index in the array where the first codeunit of the string will be written. @@ -253,7 +230,7 @@ func toWtf16Array( [4]: "array[i] = …" is meant to express "store the value … as the i-th element of the array", in Wasm terms: (array.set $i16-array-type (local.get $array) (local.get $i) (…)) for an appropriate $i16-array-type. -### WebAssembly.String.fromCharCode +### "wasm:js-string" "fromCharCode" ``` func fromCharCode( @@ -266,7 +243,7 @@ func fromCharCode( ``` [4]: Any charCode > 0xFFFF values are implicitly truncated. 
-### WebAssembly.String.fromCodePoint +### "wasm:js-string" "fromCodePoint" ``` func fromCodePoint( @@ -281,7 +258,7 @@ func fromCodePoint( } ``` -### WebAssembly.String.charCodeAt +### "wasm:js-string" "charCodeAt" ``` func charCodeAt( @@ -300,7 +277,7 @@ func charCodeAt( } ``` -### WebAssembly.String.codePointAt +### "wasm:js-string" "codePointAt" ``` func codePointAt( @@ -319,7 +296,7 @@ func codePointAt( } ``` -### WebAssembly.String.length +### "wasm:js-string" "length" ``` func length(string: externref) -> i32 { @@ -330,7 +307,7 @@ func length(string: externref) -> i32 { } ``` -### WebAssembly.String.concatenate +### "wasm:js-string" "concatenate" ``` func concatenate( @@ -347,7 +324,7 @@ func concatenate( } ``` -### WebAssembly.String.substring +### "wasm:js-string" "substring" ``` func substring( @@ -373,7 +350,7 @@ func substring( Note: We could consider allowing negative start/end indices, and adding them to the string's length to compute the effective indices, like String.prototype.slice does it. Is one of these behaviors more convenient for common use cases? Arguably it is more fitting with Wasm's style to only accept obviously-valid (i.e. in-bounds) parameters, and leave it to calling code to decide whether other values (positive out-of-bounds and/or negative) can occur at all, and if yes, how to handle them (map into bounds somehow, or reject). Note: Taking that thought one step further, we could consider throwing exceptions when startIndex > endIndex or startIndex > string.length or endIndex > string.length. If we do so, we should keep in mind that allowing empty slices (startIndex == endIndex) can be useful when this situation arises dynamically in string-processing algorithms. It is unlikely that throwing instead of returning an empty string in these cases would offer performance benefits. 
-### WebAssembly.String.equals +### "wasm:js-string" "equals" ``` func equals( @@ -389,7 +366,7 @@ func equals( } ``` -### WebAssembly.String.compare +### "wasm:js-string" "compare" ``` function compare( @@ -407,3 +384,54 @@ function compare( return first < second ? -1 : 1; } ``` + +## Future extensions + +There are several extensions we can make in the future as need arrives. + +### Binding memory to builtins + +It may be useful to have a builtin that operates on a specific wasm memory. For JS strings, this could allow us to encode a JS string directly into linear memory. + +One way we could do this, is by having the JS-API bind the first imported memory of a module to any imported builtin functions that want to operate on memory. If there is no imported memory and a builtin function that needs memory is imported, then a link error is reported. + +The memory is imported as opposed to exported so that it is guaranteed to exist when the builtin imports are provided. Using a memory defined only locally would have limited flexibility and would also be exposing a potentially private memory to outside its module. + +A quick example: +``` +(module + (; memory 0 ;) + (import ... (memory ...)) + + (; bound to memory 0 through the JS-API instantiating the builtin module ;) + (import "wasm:js-string" "encodeStringToMemoryUTF16" (func ...)) +) +``` + +Because the `wasm:js-string` module is instantiated when the module using it is instantiated, the imported memory will be around to be provided to both modules. + +If multi-memory is in use and the desired memory to bind with is not the first import, then we could consider parsing the imports to determine which memory is needed for which builtin. For example, `encodeStringToMemoryUTF16.2` for binding to memory 2. + +### Better function types to avoid runtime checks + +The initial set of JS String Builtins are typed to use `externref` for wherever a JS String is needed. This can lead to runtime checks that should be avoidable. 
Optimizing compilers can probably get rid of some of these, but not all. + +In the future, we could have type imports or a core stringref type. In this event, it would be desirable to use those in the function types to avoid unnecessary runtime checks. + +The difficulty is how to do this in a backwards compatible way. If we, for example, changed the type of a builtin from `[externref] -> []` to `(ref null 0) -> []`, we would break old code that imported it with the externref parameter. + +One option would be to version the name of the function builtins, and add a new one for the more advanced type signature. + +Another option to do this would be to extend the JS-API to inspect the function types used when importing these builtins to determine whether to provide it the 'advanced type' version or the 'basic type' version. This would be a heuristic, something like checking if the type refers to a type import or not. + +### UTF-8 and WTF-8 support + +There are no JS builtins available to get a UTF-8 or WTF-8 view of a JS String. + +One option would be to specify wasm builtins in term of the Web TextEncoder and TextDecoder interfaces. But this is probably a 'layering' violation, and is not clear what this means on JS runtimes outside the web. + +Another option around this would be to directly refer to the UTF-8/WTF-8 specs in the JS-API and write out the algorithms we need. However, this probably violates the goal of not creating a new String API. + +A final option would be to get TC39 to add the methods we need to JS Strings, so that we can use them in wasm builtins. This could take some time though, and may not be possible if TC39 does not find these methods worthwile. + +This needs more thought and discussion. 
From 09444a5ab0d8cde670b43200a406db255b0b7110 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Mon, 6 Nov 2023 07:49:54 -0600 Subject: [PATCH 04/70] Address review feedback --- proposals/js-string-builtins/Overview.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 87d6ea7756..9f7fc91b20 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -32,7 +32,7 @@ It seems that creating new importable definitions that adapt existing JS primiti There is a variety of execution techniques for WebAssembly. Some WebAssembly engines compile modules eagerly (at WebAssembly.compile), some use interpreters and dynamic tiering, and some use on-demand compilation (after instantiation) and dynamic tiering. -If we just have builtin functions, it would be possible to normally import then without any work to add builtin modules. The main issue is that imported values are not known until instantiation, and so engines that compile eagerly would be unable to generate specialized code to these imports. +If we just have builtin functions, it would be possible to normally import them without any work to add builtin modules. The main issue is that imported values are not known until instantiation, and so engines that compile eagerly would be unable to generate specialized code to these imports. It seems desirable to support a variety of execution techniques, especially because engines may support multiple depending on heuristics or change them over time. @@ -121,9 +121,10 @@ TODO: formalize these better. ``` function cast( string: externref -) { +) -> externref { if (typeof string !== "string") trap; + return string; } ``` @@ -428,7 +429,7 @@ Another option to do this would be to extend the JS-API to inspect the function There are no JS builtins available to get a UTF-8 or WTF-8 view of a JS String. 
-One option would be to specify wasm builtins in term of the Web TextEncoder and TextDecoder interfaces. But this is probably a 'layering' violation, and is not clear what this means on JS runtimes outside the web. +One option would be to specify wasm builtins in terms of the Web TextEncoder and TextDecoder interfaces. But this is probably a 'layering' violation, and is not clear what this means on JS runtimes outside the web. Another option around this would be to directly refer to the UTF-8/WTF-8 specs in the JS-API and write out the algorithms we need. However, this probably violates the goal of not creating a new String API. From 969eceeebbf3444dd0b3c6440e2809b8d2accb71 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Thu, 7 Dec 2023 12:16:50 -0600 Subject: [PATCH 05/70] Rename concatenate to concat --- proposals/js-string-builtins/Overview.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 9f7fc91b20..cde024a9a7 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -308,10 +308,10 @@ func length(string: externref) -> i32 { } ``` -### "wasm:js-string" "concatenate" +### "wasm:js-string" "concat" ``` -func concatenate( +func concat( first: externref, second: externref ) -> externref @@ -321,7 +321,7 @@ func concatenate( if (typeof second !== "string") trap; - return first + second; + return first.concat(second); } ``` From 6e8afd8f34f1c65f22f2f7053b209bf01aa9e149 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 29 Dec 2023 11:21:06 -0600 Subject: [PATCH 06/70] Editorial changes - Eliminate usage of 'builtin module' in description. This is not essential to the proposal and causes confusion around a similarly named JS proposal, which had different goals. - Clarify some minor points. - Make JS-API changes to WebIDL comprehensive. 
- Reword feature detection section to actually propose change to WebAssembly.validate method --- proposals/js-string-builtins/Overview.md | 79 ++++++++++++++++-------- 1 file changed, 52 insertions(+), 27 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index cde024a9a7..b13fbbb15d 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -12,13 +12,13 @@ However, the overhead of importing glue code is prohibitive for primitives such This proposal aims to provide a minimal and general mechanism for importing specific JavaScript primitives for efficient usage in WebAssembly code. -This is done by first adding a set of builtin functions for performing JavaScript String operations. These builtin functions mirror a subset of the [JavaScript String API](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String) and adapt it to be efficiently callable without JavaScript glue code. +This is done by first adding a set of wasm builtin functions for performing JavaScript String operations. These builtin functions mirror a subset of the [JavaScript String API](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String) and adapt it to be efficiently callable without JavaScript glue code. -Then a mechanism for importing modules containing these builtins (builtin modules) is added to the WebAssembly JS-API. These modules exist in a new reserved import namespace `wasm:` that is enabled at compile-time with a flag. +Then a mechanism for importing these wasm builtin functions is added to the WebAssembly JS-API. These builtins are grouped in modules and exist in a new reserved import namespace `wasm:` that is enabled at compile-time with a flag. -These two pieces in combination allow runtimes to reliably emit optimal code sequences for JavaScript String operations within WebAssembly modules. 
In the future, other builtin objects or primitives can be exposed through new builtins. +These two pieces in combination allow runtimes to reliably emit optimal code sequences for JavaScript String operations within WebAssembly modules. In the future, other JS builtin objects or JS primitives can be exposed through new wasm builtins. -## Do we need builtins? +## Do we need new wasm builtin functions? It is already possible today to import JS builtin functions (such as String.prototoype.getCharCodeAt) from wasm modules. Instead of defining new wasm specific-builtins, we could just re-use those directly. @@ -28,15 +28,15 @@ The first problem is that existing API’s require a calling convention conversi It seems that creating new importable definitions that adapt existing JS primitives to WebAssembly is simpler and more flexible in the future. -## Do we need builtin modules? +## Do we need a new import mechanism for wasm builtin functions? There is a variety of execution techniques for WebAssembly. Some WebAssembly engines compile modules eagerly (at WebAssembly.compile), some use interpreters and dynamic tiering, and some use on-demand compilation (after instantiation) and dynamic tiering. -If we just have builtin functions, it would be possible to normally import them without any work to add builtin modules. The main issue is that imported values are not known until instantiation, and so engines that compile eagerly would be unable to generate specialized code to these imports. +If we just have builtin functions, it would be possible to normally import them normally through instantiation. However this would prevent engines from using eager compilation when builtins are in use. It seems desirable to support a variety of execution techniques, especially because engines may support multiple depending on heuristics or change them over time. 
-By adding builtin modules that are in a reserved and known namespace `:wasm`, engines can know that these builtin functions are being used at `WebAssembly.compile` time and generate optimal code for them. +By adding builtins that are in a reserved and known namespace (`wasm:`), engines can know that these builtin functions are being used at `WebAssembly.compile` time and generate optimal code for them. ## Goals for builtins @@ -48,7 +48,7 @@ The bar for adding a new builtin would be that it enables significantly better c ## Function builtins -Function builtins are an instance of `WebAssembly.Function` and have a function type. One conceptualization is that they are a WebAssembly function on the outside and a JavaScript function on the inside. This combination allows efficient adaptation of primitives. +Function builtins are an instance of `WebAssembly.Function` and have a function type. They are conceptually a WebAssembly function on the outside and a JavaScript function on the inside. This combination allows efficient adaptation of primitives. Their behavior would be defined using algorithmic steps similar to the WebIDL or EcmaScript standards. If possible, we could define them using equivalent JavaScript source code to emphasize that these do not provide any new abilities. @@ -56,57 +56,82 @@ Their behavior would be defined using algorithmic steps similar to the WebIDL or Type builtins could be an instance of the `WebAssembly.Type` interface provided by the [type-imports](https://github.com/webassembly/type-imports) proposal. The values contained in a type builtin would be specified with a predicate. -## Builtin modules +This proposal does not add any type builtins, as the design around type-imports is in flux. -Builtin modules provide a collection of function or type builtins that can be imported. Each builtin module has a name, such as `js-string`, and lives under the `wasm:` namespace. 
A full import specifier would therefore be `(import "wasm:js-string" "equals" ...)`. +## Using builtins -The JS-API does not reserve a `wasm:` namespace today, so modules theoretically could already be using this namespace. Additionally, some users may wish to disable this feature for modules they compile so they could polyfill it. This feature is therefore opt-in on an individual builtin-module basis. +Every builtin has a name, and builtins are grouped into collections with a name that matches the interface they are mirroring. -To just enabled the `js-string` module, a user would compile with: +An example import specifier could therefore be `(import "wasm:js-string" "equals" ...)`. + +The JS-API does not reserve a `wasm:` namespace today, so modules theoretically could already be using this namespace. Additionally, some users may wish to disable this feature for modules they compile so they could polyfill it. This feature is therefore opt-in via flags for each interface. + +To just enabled `js-string` builtins, a user would compile with: ``` -WebAssembly.compile(bytes, { builtinModules: ['js-string'] }); +WebAssembly.compile(bytes, { builtins: ['js-string'] }); ``` The full extension to the JS-API WebIDL is: ``` dictionary WebAssemblyCompileOptions { - optional sequence<USVString> builtinModules; + optional sequence<USVString> builtins; } +[LegacyNamespace=WebAssembly, Exposed=*] interface Module { constructor(BufferSource bytes, optional WebAssemblyCompileOptions options); ... } +[Exposed=*] namespace WebAssembly { - Promise<Module> compile( - BufferSource bytes, - optional WebAssemblyCompileOptions options); - ... + # Validate accepts compile options for feature detection. + # See below for details. + boolean validate( + BufferSource bytes, + optional WebAssemblyCompileOptions options); + + # Async compile accepts compile options.
+ Promise<Module> compile( + BufferSource bytes, + optional WebAssemblyCompileOptions options); + + # Async instantiate overload with bytes parameters does accept compile + # options. + Promise<WebAssemblyInstantiatedSource> instantiate( + BufferSource bytes, + optional object importObject, + optional WebAssemblyCompileOptions options + ); + + # Async instantiate overload with module parameter does not accept compile + # options and remains the same. + Promise<Instance> instantiate( + Module moduleObject, + optional object importObject + ); }; ``` -A wasm module that has enabled a wasm builtin module will have the specific import specifier, such as `wasm:js-string` for that module available and eagerly applied. +A wasm module that has enabled builtins will have the specific import specifier, such as `wasm:js-string` for that interface available and eagerly applied. Concretely this means that imports that refer to that specifier will be eagerly checked for link errors at compile time, those imports will not show up in `WebAssembly.Module.imports()`, and those imports will not need to be provided at instantiation time. -When the module is instantiated, a unique instantiation of the builtin module is created. This means that re-exports of builtin functions will have different identities if they come from different instances. This is a useful property for future extensions to bind memory to builtins or evolve the types as things like type-imports or a core stringref type are added (see below). +When the module is instantiated, a unique instantiation of the builtins are created. This means that re-exports of builtin functions will have different identities if they come from different instances. This is a useful property for future extensions to bind memory to builtins or evolve the types as things like type-imports or a core stringref type are added (see below). ## Feature detection -Users may wish to detect if a specific builtin module is available in their system.
- -A simple option is to add `WebAssembly.hasBuiltinModule(name)` method. This is likely too coarse grained though, users may wish to know if a specific function in a builtin module is available, as new ones may be added over time. +Users may wish to detect if a specific builtin is available in their system. -A more general option would then be to extend `WebAssembly.validate` to also take a list of builtin modules to enable, like compile does. After validating the module, the eager link checking that compile does is also performed. This would allow checking for the presence of individual parts of a builtin module. +For this purpose, `WebAssembly.validate` is extended to take a list of builtins to enable, like compile does. After validating the module, the eager link checking that compile does is also performed. Users can inspect the result of validate on modules importing builtins to see if they are supported. ## Polyfilling -If a user wishes to polyfill these imports for some reason, or is running on a system without a builtin module, these imports may be provided as normal through instantiation. +If a user wishes to polyfill these imports for some reason, or is running on a system without a builtin, these imports may be provided as normal through instantiation. ## JS String Builtin API -The following is an initial set of function builtins for JavaScript String. The builtin module name is `js-string`. Each example includes pseudo-code illustrating their operation and some descriptive text. +The following is an initial set of function builtins for JavaScript String. The builtin are exposed under `wasm:js-string`. Each example includes pseudo-code illustrating their operation and some descriptive text. TODO: formalize these better. @@ -404,7 +429,7 @@ A quick example: (; memory 0 ;) (import ... 
(memory ...)) - (; bound to memory 0 through the JS-API instantiating the builtin module ;) + (; bound to memory 0 through the JS-API instantiating the builtins ;) (import "wasm:js-string" "encodeStringToMemoryUTF16" (func ...)) ) ``` From 5f10a917e04f1319ad1bf030fdaac0b5688e1a06 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 29 Dec 2023 13:04:00 -0600 Subject: [PATCH 07/70] Tighten up definitions of function builtins - Function builtin behaviors is defined using 'create a host function' - Clarify behavior around monkey patching using standard language - Clarify edge cases around nullability - Clarify edge cases around unsigned/signed integers - Restrict 'substring' behavior to normal cases - Use wasm helpers for when wasm instructions are needed --- proposals/js-string-builtins/Overview.md | 273 +++++++++++++++-------- 1 file changed, 186 insertions(+), 87 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index b13fbbb15d..81ad387c06 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -48,9 +48,13 @@ The bar for adding a new builtin would be that it enables significantly better c ## Function builtins -Function builtins are an instance of `WebAssembly.Function` and have a function type. They are conceptually a WebAssembly function on the outside and a JavaScript function on the inside. This combination allows efficient adaptation of primitives. +Function builtins are defined with an external wasm function type, and internal JS-defined behavior. They have the same semantics as following ['create a host function'](https://webassembly.github.io/spec/js-api/#create-a-host-function) for the wasm function type and JS code given to get a wasm `funcaddr` that can be imported. -Their behavior would be defined using algorithmic steps similar to the WebIDL or EcmaScript standards. 
If possible, we could define them using equivalent JavaScript source code to emphasize that these do not provide any new abilities. +There are several implications of this: + - Calling a function builtin from wasm will have the wasm parameters converted to JS values, and JS results converted back to wasm values. + - Exported function builtins are wrapped using ['create a new Exported function'](https://webassembly.github.io/spec/js-api/#a-new-exported-function). + - Function builtins must be imported with the correct type. + - Function builtins may become `funcref`, stored in tables, etc. ## Type builtins @@ -131,24 +135,48 @@ If a user wishes to polyfill these imports for some reason, or is running on a s ## JS String Builtin API -The following is an initial set of function builtins for JavaScript String. The builtin are exposed under `wasm:js-string`. Each example includes pseudo-code illustrating their operation and some descriptive text. +The following is an initial set of function builtins for JavaScript String. The builtins are exposed under `wasm:js-string`. -TODO: formalize these better. +All below references to builtins on the Global object (e.g. `String.fromCharCode`) refer to the original version on the Global object before any modifications by user code. -[1]: This is meant to refer to what the original String.fromCharCode / String.fromCodePoint / String.prototype.charCodeAt / String.prototype.codePointAt / String.prototype.substring would do, in the absence of any monkey-patching. In a final version of this specification, we'll have to use more robust phrasing to express that; in the meantime, the given phrasing is more readable. +The following internal helpers are defined in wasm and used by the below definitions: -[2]: "array.length" is meant to express "load the array's length", in Wasm terms: (array.len (local.get $array)). - -[3]: “trap” is meant to emit a wasm trap. 
This results in a WebAssembly.RuntimeError with the bit set that it is not catchable by exception handling. +```wasm +(module + (type $array_i16 (array i16)) + (type $array_i16_mut (array (mut i16))) + + (func (export "trap") + unreachable + ) + (func (export "array_len") (param arrayref) (result i32) + local.get 0 + array.len + ) + (func (export "array_i16_get") (param (ref $array_i16) i32) (result i32) + local.get 0 + local.get 1 + array.get_u $array_i16 + ) + (func (export "array_i16_mut_set") (param (ref $array_i16_mut) i32 i32) + local.get 0 + local.get 1 + local.get 2 + array.set $array_i16_mut + ) +) +``` ### "wasm:js-string" "cast" ``` -function cast( +func cast( string: externref -) -> externref { - if (typeof string !== "string") - trap; +) -> (ref extern) { + if (string === null || + typeof string !== "string") + trap(); + return string; } ``` @@ -156,64 +184,97 @@ function cast( ### "wasm:js-string" "test" ``` -function test( +func test( string: externref ) -> i32 { - return typeof string === "string" ? 1 : 0; + if (string === null || + typeof string !== "string") + return 0; + return 1; } ``` ### "wasm:js-string" "fromWtf16Array" ``` +/// Convert the specified range of an immutable i16 array into a String, +/// treating each i16 as an unsigned 16-bit char code. +/// +/// The range is given by [start, end). This function traps if the range is +/// outside the bounds of the array. +/// +/// NOTE: This function only takes an immutable i16 array defined in its own +/// recursion group. +/// +/// If this is an issue for toolchains, we can look into how to relax the +/// function type while still maintaining good performance. func fromWtf16Array( array: (ref null (array i16)), start: i32, end: i32 ) -> (ref extern) { - start = ToUint32(start); - end = ToUint32(end); + // NOTE: `start` and `end` are interpreted as signed 32-bit integers when + // converted to JS values using standard conversions. Reinterpret them as + // unsigned here. 
+ start >>>= 0; + end >>>= 0; - // [2] - if (end > array.length || start > end) - trap; + if (array === null) + trap(); + + if (start > end || + end > array_len(array)) + trap(); let result = ""; for(let i = start; i < end; i++) { - // [1], [4] - result += String.fromCharCode(array[i]); + let charCode = array_i16_get(array, i); + result += String.fromCharCode(charCode); } return result; } ``` -[4]: "array[i]" is meant to express "load the i-th element of the array", in Wasm terms: (array.get_u $i16-array-type (local.get $array) (local.get $i)) for an appropriate $i16-array-type. - -Note: This function takes an i16 array defined in its own recursion group. If this is an issue for a toolchain, we can look into how to relax the function type while still maintaining good performance. - ### "wasm:js-string" "fromWtf8Array" ``` +/// Convert the specified range of an immutable i8 array into a String, +/// treating the array as encoded using WTF-8. +/// +/// The range is given by [start, end). This function traps if the range is +/// outside the bounds of the array. +/// +/// NOTE: This function only takes an immutable i8 array defined in its own +/// recursion group. +/// +/// If this is an issue for toolchains, we can look into how to relax the +/// function type while still maintaining good performance. func fromWtf8Array( array: (ref null (array i8)), start: i32, end: i32 ) -> (ref extern) { - start = ToUint32(start); - end = ToUint32(end); + // NOTE: `start` and `end` are interpreted as signed 32-bit integers when + // converted to JS values using standard conversions. Reinterpret them as + // unsigned here. + start >>>= 0; + end >>>= 0; - // [2] - if (end > array.length || start > end) - trap; + if (array === null) + trap(); + + if (start > end || + end > array_len(array)) + trap(); // This summarizes as: "decode the WTF-8 string stored at array[start:end], // or trap if there is no valid WTF-8 string there". 
let result = ""; while (start < end) { if there is no valid wtf8-encoded code point at array[start] - trap; + trap(); let c be the code point at array[start]; // [1] result += String.fromCodePoint(c); @@ -227,35 +288,43 @@ Note to implementers: while this is the only usage of WTF-8 in this document, it ### "wasm:js-string" "toWtf16Array" -"start" is the index in the array where the first codeunit of the string will be written. - -Returns the number of codeunits written. Traps if the string doesn't fit into the array. - ``` +/// Copy a string into a pre-allocated mutable i16 array at `start` index. +/// +/// Returns the number of char codes written, which is equal to the length of +/// the string. +/// +/// Traps if the string doesn't fit into the array. func toWtf16Array( string: externref, array: (ref null (array (mut i16))), start: i32 ) -> i32 { - if (typeof string !== "string") - trap; + // NOTE: `start` is interpreted as a signed 32-bit integer when converted + // to a JS value using standard conversions. Reinterpret as unsigned here. + start >>>= 0; + + if (array === null) + trap(); - start = ToUint32(start); + if (string === null || + typeof string !== "string") + trap(); - if (start + string.length > array.length) - trap; + // The following addition is safe from overflow as adding two 32-bit integers + // cannot overflow Number.MAX_SAFE_INTEGER (2^53-1). + if (start + string.length > array_len(array)) + trap(); for (let i = 0; i < string.length; i++) { - // [4], [5] - array[start + i] = string.charCodeAt(i); + let charCode = string.charCodeAt(i); + array_i16_mut_set(array, start + i, charCode); } return string.length; } ``` -[4]: "array[i] = …" is meant to express "store the value … as the i-th element of the array", in Wasm terms: (array.set $i16-array-type (local.get $array) (local.get $i) (…)) for an appropriate $i16-array-type. 
- ### "wasm:js-string" "fromCharCode" ``` @@ -263,11 +332,13 @@ func fromCharCode( charCode: i32 ) -> (ref extern) { - charCode = ToUint32(charCode); - return String.fromCharCode(charCode); // [1], [4] + // NOTE: `charCode` is interpreted as a signed 32-bit integer when converted + // to a JS value using standard conversions. Reinterpret as unsigned here. + charCode >>>= 0; + + return String.fromCharCode(charCode); } ``` -[4]: Any charCode > 0xFFFF values are implicitly truncated. ### "wasm:js-string" "fromCodePoint" @@ -276,11 +347,16 @@ func fromCodePoint( codePoint: i32 ) -> (ref extern) { - codePoint = ToUint32(codePoint); + // NOTE: `codePoint` is interpreted as a signed 32-bit integer when converted + // to a JS value using standard conversions. Reinterpret as unsigned here. + codePoint >>>= 0; + + // fromCodePoint will throw a RangeError for values outside of this range, + // eagerly check for this an present as a wasm trap. if (codePoint > 0x10FFFF) - trap; + trap(); - return String.fromCodePoint(codePoint); // [1] + return String.fromCodePoint(codePoint); } ``` @@ -292,14 +368,18 @@ func charCodeAt( index: i32 ) -> i32 { - if (typeof string !== "string") - trap; + // NOTE: `index` is interpreted as a signed 32-bit integer when converted to + // a JS value using standard conversions. Reinterpret as unsigned here. + index >>>= 0; + + if (string === null || + typeof string !== "string") + trap(); - index = ToUint32(index); if (index >= string.length) - trap; + trap(); - return string.charCodeAt(index); // [1] + return string.charCodeAt(index); } ``` @@ -311,14 +391,18 @@ func codePointAt( index: i32 ) -> i32 { - if (typeof string !== "string") - trap; + // NOTE: `index` is interpreted as a signed 32-bit integer when converted to + // a JS value using standard conversions. Reinterpret as unsigned here. 
+ index >>>= 0; + + if (string === null || + typeof string !== "string") + trap(); - index = ToUint32(index); if (index >= string.length) - trap; + trap(); - return string.codePointAt(index); // [1] + return string.codePointAt(index); } ``` @@ -326,8 +410,9 @@ func codePointAt( ``` func length(string: externref) -> i32 { - if (typeof string !== "string") - trap; + if (string === null || + typeof string !== "string") + trap(); return string.length; } @@ -341,10 +426,12 @@ func concat( second: externref ) -> externref { - if (typeof first !== "string") - trap; - if (typeof second !== "string") - trap; + if (first === null || + typeof first !== "string") + trap(); + if (second === null || + typeof second !== "string") + trap(); return first.concat(second); } @@ -355,27 +442,31 @@ func concat( ``` func substring( string: externref, - startIndex: i32, - endIndex: i32 + start: i32, + end: i32 ) -> (ref extern) { - if (typeof string !== "string") - trap; - - startIndex = ToUint32(startIndex); - endIndex = ToUint32(endIndex); - if (startIndex > string.length || - startIndex > endIndex) + // NOTE: `start` and `end` are interpreted as signed 32-bit integers when + // converted to JS values using standard conversions. Reinterpret them as + // unsigned here. + start >>>= 0; + end >>>= 0; + + if (string === null || + typeof string !== "string") + trap(); + + // Ensure the range is ordered and within bounds to avoid the complex + // behavior that `substring` performs when that is not the case. + if (start > end || + end > string.length) return ""; // [1] - return string.substring(startIndex, endIndex); + return string.substring(start, end); } ``` -Note: We could consider allowing negative start/end indices, and adding them to the string's length to compute the effective indices, like String.prototype.slice does it. Is one of these behaviors more convenient for common use cases? Arguably it is more fitting with Wasm's style to only accept obviously-valid (i.e. 
in-bounds) parameters, and leave it to calling code to decide whether other values (positive out-of-bounds and/or negative) can occur at all, and if yes, how to handle them (map into bounds somehow, or reject). -Note: Taking that thought one step further, we could consider throwing exceptions when startIndex > endIndex or startIndex > string.length or endIndex > string.length. If we do so, we should keep in mind that allowing empty slices (startIndex == endIndex) can be useful when this situation arises dynamically in string-processing algorithms. It is unlikely that throwing instead of returning an empty string in these cases would offer performance benefits. - ### "wasm:js-string" "equals" ``` @@ -384,10 +475,14 @@ func equals( second: externref ) -> i32 { - if (first !== null && typeof first !== "string") - trap; - if (second !== null && typeof second !== "string") - trap; + // Explicitly allow null strings to be compared for equality as that is + // meaningful. + if (first !== null && + typeof first !== "string") + trap(); + if (second !== null && + typeof second !== "string") + trap(); return first === second ? 1 : 0; } ``` @@ -400,10 +495,14 @@ function compare( second: externref ) -> i32 { - if (typeof first !== "string") - trap; - if (typeof second !== "string") - trap; + // Explicitly do not allow null strings to be compared, as there is no + // meaningful ordering given by the JS `<` operator. + if (first === null || + typeof first !== "string") + trap(); + if (second === null || + typeof second !== "string") + trap(); if (first === second) return 0; From a0a777321e1f4a83d5b570b4d14bca3932d4ebbb Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 29 Dec 2023 13:55:27 -0600 Subject: [PATCH 08/70] Rework support for WTF-8 The existing WTF-8 operation in this proposal violated one of the goals of the proposal: "don't create substantial new functionality" by introducing WTF-8 transcoding support to the web platform without prior precedent. 
The WTF-8 operation is removed because of this. The naming for WTF-16 operations is reworked to refer to 'charCodes' instead as that is what the JS String interface uses. We could support UTF-8 transcoding by referring to the TextEncoder/TextDecoder interfaces, so this commit adds support for that. --- proposals/js-string-builtins/Overview.md | 267 +++++++++++++++++------ 1 file changed, 201 insertions(+), 66 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 81ad387c06..8c54960499 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -42,6 +42,8 @@ By adding builtins that are in a reserved and known namespace (`wasm:`), engines Builtins should not provide any new abilities to WebAssembly that JS doesn't already have. They are intended to just wrap existing primitives in such a manner that WebAssembly can efficiently use them. In the cases the primitive already has a name, we should re-use it and not invent a new one. +Most builtins should be simple and do little work outside of calling into the JS functionality to do the operation. The one exception is for operations that convert between a JS primitive and a wasm primitive, such as between JS strings/arrays/linear memory. In this case, the builtin may need some non-trivial code to perform the operation. In these cases, it's still expected that the operation is just semantically copying information and not substantially transforming it into a new interpretation. + The standardization of wasm builtins will be governed by the WebAssembly standards process and would exist in the JS-API document. The bar for adding a new builtin would be that it enables significantly better code generation for an important use-case beyond what is possible with a normal import. 
We don't want to add a new builtin for every existing API, only ones where adapting the JavaScript API to WebAssembly and allowing inline code generation results in significantly better codegen than a plain function call. @@ -133,6 +135,14 @@ For this purpose, `WebAssembly.validate` is extended to take a list of builtins If a user wishes to polyfill these imports for some reason, or is running on a system without a builtin, these imports may be provided as normal through instantiation. +## UTF8/WTF8 support + +As stated above in 'goals for builtins', builtins are intended to just wrap existing primitives and not invent new functionality. + +JS Strings are semantically a sequence of 16-bit code units (referred to as char codes in method naming), and there are no builtin operations on them to acquire a UTF-8 or WTF-8 view. This makes it difficult to write wasm builtins for these encodings without introducing significant new logic to them. + +There is however, the Encoding API for `TextEncoder`/`TextDecoder` which can be used for UTF-8 support. However, this is technically a separate spec from JS and may not be available on all JS engines (in practice it's available widely). This proposal exposes UTF-8 data conversions using this API under separate `wasm:text-encoder` `wasm:text-decoder` interfaces which are available when the host implements these interfaces. + ## JS String Builtin API The following is an initial set of function builtins for JavaScript String. The builtins are exposed under `wasm:js-string`. @@ -194,7 +204,7 @@ func test( } ``` -### "wasm:js-string" "fromWtf16Array" +### "wasm:js-string" "fromCharCodeArray" ``` /// Convert the specified range of an immutable i16 array into a String, @@ -208,7 +218,7 @@ func test( /// /// If this is an issue for toolchains, we can look into how to relax the /// function type while still maintaining good performance. 
-func fromWtf16Array( +func fromCharCodeArray( array: (ref null (array i16)), start: i32, end: i32 @@ -236,57 +246,7 @@ func fromWtf16Array( } ``` -### "wasm:js-string" "fromWtf8Array" - -``` -/// Convert the specified range of an immutable i8 array into a String, -/// treating the array as encoded using WTF-8. -/// -/// The range is given by [start, end). This function traps if the range is -/// outside the bounds of the array. -/// -/// NOTE: This function only takes an immutable i8 array defined in its own -/// recursion group. -/// -/// If this is an issue for toolchains, we can look into how to relax the -/// function type while still maintaining good performance. -func fromWtf8Array( - array: (ref null (array i8)), - start: i32, - end: i32 -) -> (ref extern) -{ - // NOTE: `start` and `end` are interpreted as signed 32-bit integers when - // converted to JS values using standard conversions. Reinterpret them as - // unsigned here. - start >>>= 0; - end >>>= 0; - - if (array === null) - trap(); - - if (start > end || - end > array_len(array)) - trap(); - - // This summarizes as: "decode the WTF-8 string stored at array[start:end], - // or trap if there is no valid WTF-8 string there". - let result = ""; - while (start < end) { - if there is no valid wtf8-encoded code point at array[start] - trap(); - let c be the code point at array[start]; - // [1] - result += String.fromCodePoint(c); - increment start by as many bytes as it took to decode c; - } - return result; -} -``` - -Note to implementers: while this is the only usage of WTF-8 in this document, it shouldn't be very burdensome to implement, because all existing strings in Wasm modules (import/export names, contents of the name section) are already in UTF-8 format, so implementations must already have decoding infrastructure for that. We need the relaxation from UTF-8 to WTF-8 to support WTF-16 based source languages, which may have unpaired surrogates in string constants in existing/legacy code. 
- -### "wasm:js-string" "toWtf16Array" +### "wasm:js-string" "copyToCharCodeArray" ``` /// Copy a string into a pre-allocated mutable i16 array at `start` index. @@ -295,7 +255,7 @@ Note to implementers: while this is the only usage of WTF-8 in this document, it /// the string. /// /// Traps if the string doesn't fit into the array. -func toWtf16Array( +func copyToCharCodeArray( string: externref, array: (ref null (array (mut i16))), start: i32 @@ -510,6 +470,193 @@ function compare( } ``` +## Encoding API + +The following is an initial set of function builtins for the [`TextEncoder` interface](https://encoding.spec.whatwg.org/#interface-textencoder) and [`TextDecoder` interface](https://encoding.spec.whatwg.org/#interface-textdecoder) interfaces. These builtins are exposed under `wasm:text-encoder` and `wasm:text-decoder`, respectively. + +All below references to builtins on the Global object (e.g. `String.fromCharCode`) refer to the original version on the Global object before any modifications by user code. + +The following internal helpers are defined in wasm and used by the below definitions: + +```wasm +(module + (type $array_i8 (array i8)) + (type $array_i8_mut (array (mut i8))) + + (func (export "trap") + unreachable + ) + (func (export "array_len") (param arrayref) (result i32) + local.get 0 + array.len + ) + (func (export "array_i8_get") (param (ref $array_i8) i32) (result i32) + local.get 0 + local.get 1 + array.get_u $array_i8 + ) + (func (export "array_i8_mut_new") (param i32) (result (ref $array_i8_mut)) + local.get 0 + array.new_default $array_i8_mut + ) + (func (export "array_i8_mut_set") (param (ref $array_i8_mut) i32 i32) + local.get 0 + local.get 1 + local.get 2 + array.set $array_i8_mut + ) +) +``` + +### "wasm:text-decoder" "decodeStringFromUTF8Array" + +``` +/// Decode the specified range of an i8 array using UTF-8 into a string. +/// +/// The range is given by [start, end). 
This function traps if the range is +/// outside the bounds of the array. +/// +/// NOTE: This function only takes an immutable i8 array defined in its own +/// recursion group. +/// +/// If this is an issue for toolchains, we can look into how to relax the +/// function type while still maintaining good performance. +func decodeStringFromUTF8Array( + array: (ref null (array i8)), + start: i32, + end: i32 +) -> (ref extern) +{ + // NOTE: `start` and `end` are interpreted as signed 32-bit integers when + // converted to JS values using standard conversions. Reinterpret them as + // unsigned here. + start >>>= 0; + end >>>= 0; + + if (array === null) + trap(); + + if (start > end || + end > array_len(array)) + trap(); + + // Inialize a UTF-8 decoder with the default options + let decoder = new TextDecoder("utf-8", { + fatal: false, + ignoreBOM: false, + }); + + // Copy the wasm array into a Uint8Array for decoding + let bytesLength = end - start; + let bytes = new Uint8Array(bytesLength); + for (let i = start; i < end; i++) { + bytes[i - start] = array_i8_get(array, i); + } + + return decoder.decode(bytes); +} +``` + +### "wasm:text-encoder" "measureStringAsUTF8" + +``` +/// Returns the number of bytes string would take when encoded as UTF-8. +/// +/// Traps if the string doesn't fit into the array. +func measureStringAsUTF8( + string: externref, +) -> i32 +{ + // NOTE: `start` is interpreted as a signed 32-bit integer when converted + // to a JS value using standard conversions. Reinterpret as unsigned here. 
+ start >>>= 0; + + if (array === null) + trap(); + + if (string === null || + typeof string !== "string") + trap(); + + // Encode the string into bytes using UTF-8 + let encoder = new TextEncoder(); + let bytes = encoder.encode(string); + + // Trap if the number of bytes is larger than can fit into an i32 + if (bytes.length > 0xffff_ffff) { + trap(); + } + return bytes.length; +} +``` + +### "wasm:text-encoder" "encodeStringIntoUTF8Array" + +``` +/// Encode a string into a pre-allocated mutable i8 array at `start` index using +/// the UTF-8 encoding. +/// +/// Returns the number of bytes written. +/// +/// Traps if the string doesn't fit into the array. +func encodeStringIntoUTF8Array( + string: externref, + array: (ref null (array (mut i8))), + start: i32 +) -> i32 +{ + // NOTE: `start` is interpreted as a signed 32-bit integer when converted + // to a JS value using standard conversions. Reinterpret as unsigned here. + start >>>= 0; + + if (array === null) + trap(); + + if (string === null || + typeof string !== "string") + trap(); + + // Encode the string into bytes using UTF-8 + let encoder = new TextEncoder(); + let bytes = encoder.encode(string); + + // The following addition is safe from overflow as adding two 32-bit integers + // cannot overflow Number.MAX_SAFE_INTEGER (2^53-1). + if (start + bytes.length > array_len(array)) + trap(); + + for (let i = 0; i < bytes.length; i++) { + array_i8_mut_set(array, start + i, bytes[i]); + } + + return bytes.length; +} +``` + +### "wasm:text-encoder" "encodeStringToUTF8Array" + +``` +/// Encode a string into a new mutable i8 array using UTF-8. 
+func encodeStringToUTF8Array( + string: externref +) -> (ref (array (mut i8))) +{ + if (string === null || + typeof string !== "string") + trap(); + + // Encode the string into bytes using UTF-8 + let encoder = new TextEncoder(); + let bytes = encoder.encode(string); + + let array = array_i8_mut_new(bytes.length); + for (let i = 0; i < bytes.length; i++) { + array_i8_mut_set(array, i, bytes[i]); + } + return array; +} +``` + ## Future extensions There are several extensions we can make in the future as need arrives. @@ -548,15 +695,3 @@ The difficulty is how to do this in a backwards compatible way. If we, for examp One option would be to version the name of the function builtins, and add a new one for the more advanced type signature. Another option to do this would be to extend the JS-API to inspect the function types used when importing these builtins to determine whether to provide it the 'advanced type' version or the 'basic type' version. This would be a heuristic, something like checking if the type refers to a type import or not. - -### UTF-8 and WTF-8 support - -There are no JS builtins available to get a UTF-8 or WTF-8 view of a JS String. - -One option would be to specify wasm builtins in terms of the Web TextEncoder and TextDecoder interfaces. But this is probably a 'layering' violation, and is not clear what this means on JS runtimes outside the web. - -Another option around this would be to directly refer to the UTF-8/WTF-8 specs in the JS-API and write out the algorithms we need. However, this probably violates the goal of not creating a new String API. - -A final option would be to get TC39 to add the methods we need to JS Strings, so that we can use them in wasm builtins. This could take some time though, and may not be possible if TC39 does not find these methods worthwile. - -This needs more thought and discussion. 
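Reviewer note on the patch above: the `wasm:text-encoder` / `wasm:text-decoder` definitions it adds can be approximated in plain JavaScript, which may help when checking the intended edge cases. The following is a minimal sketch under stated assumptions, not the proposal's definition. The function name `encodeStringIntoUTF8Bytes` is hypothetical, a `Uint8Array` stands in for the wasm `(array (mut i8))`, and a thrown `RangeError` stands in for a wasm trap. Only the standard `TextEncoder` and `TextDecoder` globals are used.

```javascript
// Hypothetical stand-in for "encodeStringIntoUTF8Array": a Uint8Array replaces
// the wasm (array (mut i8)) and a thrown error replaces a wasm trap.
function encodeStringIntoUTF8Bytes(string, bytesOut, start) {
  // Reinterpret a signed i32 as unsigned, as the definitions above do.
  start >>>= 0;

  if (typeof string !== "string") {
    throw new TypeError("expected a string");
  }

  // Encode the string into bytes using UTF-8.
  const bytes = new TextEncoder().encode(string);

  // Both operands are below 2^32, so the sum is exact in a JS Number.
  if (start + bytes.length > bytesOut.length) {
    throw new RangeError("string does not fit into the array");
  }

  bytesOut.set(bytes, start);
  return bytes.length;
}

const out = new Uint8Array(8);

// "h" takes 1 byte and "\u00e9" takes 2 bytes in UTF-8.
const written = encodeStringIntoUTF8Bytes("h\u00e9", out, 2);

// Decode with the same options the patch passes to TextDecoder.
const decoded = new TextDecoder("utf-8", { fatal: false, ignoreBOM: false })
  .decode(out.subarray(2, 2 + written));

// An unpaired surrogate is encoded as U+FFFD (bytes EF BF BD), so encoding
// followed by decoding does not round-trip such strings losslessly.
const lossy = Array.from(new TextEncoder().encode("\uD800"));

// A negative i32 start becomes a large unsigned value and fails the bounds check.
let trapped = false;
try {
  encodeStringIntoUTF8Bytes("a", out, -1);
} catch (e) {
  trapped = e instanceof RangeError;
}
```

In this sketch `written` is the UTF-8 byte length rather than the number of char codes, which is the main difference from the `wasm:js-string` array-copy builtin earlier in the series.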
From 42143e463431d3d650312718f5c415fee24780ab Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Tue, 9 Jan 2024 10:58:17 -0500 Subject: [PATCH 09/70] Review comments --- proposals/js-string-builtins/Overview.md | 77 ++++++++++++++++++------ 1 file changed, 58 insertions(+), 19 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 8c54960499..19c9b88b03 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -153,7 +153,6 @@ The following internal helpers are defined in wasm and used by the below definit ```wasm (module - (type $array_i16 (array i16)) (type $array_i16_mut (array (mut i16))) (func (export "trap") @@ -163,10 +162,10 @@ The following internal helpers are defined in wasm and used by the below definit local.get 0 array.len ) - (func (export "array_i16_get") (param (ref $array_i16) i32) (result i32) + (func (export "array_i16_mut_get") (param (ref $array_i16_mut) i32) (result i32) local.get 0 local.get 1 - array.get_u $array_i16 + array.get_u $array_i16_mut ) (func (export "array_i16_mut_set") (param (ref $array_i16_mut) i32 i32) local.get 0 @@ -183,6 +182,8 @@ The following internal helpers are defined in wasm and used by the below definit func cast( string: externref ) -> (ref extern) { + // Technically a partially redundant test, but want to be clear the null is + // not allowed. if (string === null || typeof string !== "string") trap(); @@ -197,6 +198,8 @@ func cast( func test( string: externref ) -> i32 { + // Technically a partially redundant test, but want to be clear the null is + // not allowed. if (string === null || typeof string !== "string") return 0; @@ -207,19 +210,19 @@ func test( ### "wasm:js-string" "fromCharCodeArray" ``` -/// Convert the specified range of an immutable i16 array into a String, +/// Convert the specified range of a mutable i16 array into a String, /// treating each i16 as an unsigned 16-bit char code. 
/// /// The range is given by [start, end). This function traps if the range is /// outside the bounds of the array. /// -/// NOTE: This function only takes an immutable i16 array defined in its own +/// NOTE: This function only takes a mutable i16 array defined in its own /// recursion group. /// /// If this is an issue for toolchains, we can look into how to relax the /// function type while still maintaining good performance. func fromCharCodeArray( - array: (ref null (array i16)), + array: (ref null (array (mut i16))), start: i32, end: i32 ) -> (ref extern) @@ -239,14 +242,14 @@ func fromCharCodeArray( let result = ""; for(let i = start; i < end; i++) { - let charCode = array_i16_get(array, i); + let charCode = array_i16_mut_get(array, i); result += String.fromCharCode(charCode); } return result; } ``` -### "wasm:js-string" "copyToCharCodeArray" +### "wasm:js-string" "intoCharCodeArray" ``` /// Copy a string into a pre-allocated mutable i16 array at `start` index. @@ -255,7 +258,7 @@ func fromCharCodeArray( /// the string. /// /// Traps if the string doesn't fit into the array. -func copyToCharCodeArray( +func intoCharCodeArray( string: externref, array: (ref null (array (mut i16))), start: i32 @@ -268,6 +271,8 @@ func copyToCharCodeArray( if (array === null) trap(); + // Technically a partially redundant test, but want to be clear the null is + // not allowed. if (string === null || typeof string !== "string") trap(); @@ -332,6 +337,8 @@ func charCodeAt( // a JS value using standard conversions. Reinterpret as unsigned here. index >>>= 0; + // Technically a partially redundant test, but want to be clear the null is + // not allowed. if (string === null || typeof string !== "string") trap(); @@ -355,6 +362,8 @@ func codePointAt( // a JS value using standard conversions. Reinterpret as unsigned here. index >>>= 0; + // Technically a partially redundant test, but want to be clear the null is + // not allowed. 
if (string === null || typeof string !== "string") trap(); @@ -370,6 +379,8 @@ func codePointAt( ``` func length(string: externref) -> i32 { + // Technically a partially redundant test, but want to be clear the null is + // not allowed. if (string === null || typeof string !== "string") trap(); @@ -412,14 +423,15 @@ func substring( start >>>= 0; end >>>= 0; + // Technically a partially redundant test, but want to be clear the null is + // not allowed. if (string === null || typeof string !== "string") trap(); - // Ensure the range is ordered and within bounds to avoid the complex - // behavior that `substring` performs when that is not the case. - if (start > end || - end > string.length) + // Ensure the range is within bounds to avoid the complex behavior that + // `substring` performs when that is not the case. + if (start > string.length) return ""; // [1] @@ -483,7 +495,7 @@ The following internal helpers are defined in wasm and used by the below definit (type $array_i8 (array i8)) (type $array_i8_mut (array (mut i8))) - (func (export "trap") + (func (export "unreachable") unreachable ) (func (export "array_len") (param arrayref) (result i32) @@ -508,6 +520,25 @@ The following internal helpers are defined in wasm and used by the below definit ) ``` +```js +// Triggers a wasm trap, which will generate a WebAssembly.RuntimeError that is +// uncatchable to WebAssembly with an implementation defined message. +function trap() { + // Directly constructing and throwing a WebAssembly.RuntimeError will yield + // an exception that is catchable by the WebAssembly exception-handling + // proposal. Workaround this by executing an unreachable trap and + // modifying it. The final spec will probably use a non-polyfillable + // intrinsic to get this exactly right. + try { + unreachable(); + } catch (err) { + // Wasm trap error messages are not defined by the JS-API spec currently. 
+ err.message = IMPL_DEFINED; + throw err; + } +} +``` + ### "wasm:text-decoder" "decodeStringFromUTF8Array" ``` @@ -562,7 +593,7 @@ func decodeStringFromUTF8Array( ``` /// Returns the number of bytes string would take when encoded as UTF-8. /// -/// Traps if the string doesn't fit into the array. +/// Traps if the length of the UTF-8 encoded string doesn't fit into an i32 func measureStringAsUTF8( string: externref, ) -> i32 @@ -571,9 +602,8 @@ func measureStringAsUTF8( // to a JS value using standard conversions. Reinterpret as unsigned here. start >>>= 0; - if (array === null) - trap(); - + // Technically a partially redundant test, but want to be clear the null is + // not allowed. if (string === null || typeof string !== "string") trap(); @@ -594,7 +624,9 @@ func measureStringAsUTF8( ``` /// Encode a string into a pre-allocated mutable i8 array at `start` index using -/// the UTF-8 encoding. +/// the UTF-8 encoding. This uses the replacement character for unpaired +/// surrogates and so it doesn't support lossless round-tripping with +/// `decodeStringFromUTF8Array`. /// /// Returns the number of bytes written. /// @@ -612,6 +644,8 @@ func encodeStringIntoUTF8Array( if (array === null) trap(); + // Technically a partially redundant test, but want to be clear the null is + // not allowed. if (string === null || typeof string !== "string") trap(); @@ -637,10 +671,15 @@ func encodeStringIntoUTF8Array( ``` /// Encode a string into a new mutable i8 array using UTF-8. +//// +/// This uses the replacement character for unpaired surrogates and so it +/// doesn't support lossless round-tripping with `decodeStringFromUTF8Array`. func encodeStringToUTF8Array( string: externref ) -> (ref (array (mut i8))) { + // Technically a partially redundant test, but want to be clear the null is + // not allowed. 
if (string === null || typeof string !== "string") trap(); From 206a3a095ddce021c95699b65ae4edd3618fac95 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Tue, 16 Jan 2024 09:35:36 -0500 Subject: [PATCH 10/70] Update substring range checking --- proposals/js-string-builtins/Overview.md | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 19c9b88b03..94e5fcbc41 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -429,12 +429,15 @@ func substring( typeof string !== "string") trap(); - // Ensure the range is within bounds to avoid the complex behavior that - // `substring` performs when that is not the case. - if (start > string.length) + // Ensure the range is ordered to avoid the complex behavior that `substring` + // performs when that is not the case. + if (start > end || + start > string.length) return ""; - // [1] + // If end > string.length, `substring` is specified to clamp it + // start is guaranteed to be at least zero (as it is unsigned), so there will + // not be any clamping of start. return string.substring(start, end); } ``` From f8138fec916d9eb6a884e66ec224ecbfdf5456e9 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 19 Jan 2024 15:08:34 -0500 Subject: [PATCH 11/70] Fix mutability of array i8 Fixes #19. 
--- proposals/js-string-builtins/Overview.md | 33 ++++++++++++------------ 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 94e5fcbc41..51c865a44c 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -153,7 +153,7 @@ The following internal helpers are defined in wasm and used by the below definit ```wasm (module - (type $array_i16_mut (array (mut i16))) + (type $array_i16 (array (mut i16))) (func (export "trap") unreachable @@ -162,16 +162,16 @@ The following internal helpers are defined in wasm and used by the below definit local.get 0 array.len ) - (func (export "array_i16_mut_get") (param (ref $array_i16_mut) i32) (result i32) + (func (export "array_i16_get") (param (ref $array_i16) i32) (result i32) local.get 0 local.get 1 - array.get_u $array_i16_mut + array.get_u $array_i16 ) - (func (export "array_i16_mut_set") (param (ref $array_i16_mut) i32 i32) + (func (export "array_i16_set") (param (ref $array_i16) i32 i32) local.get 0 local.get 1 local.get 2 - array.set $array_i16_mut + array.set $array_i16 ) ) ``` @@ -242,7 +242,7 @@ func fromCharCodeArray( let result = ""; for(let i = start; i < end; i++) { - let charCode = array_i16_mut_get(array, i); + let charCode = array_i16_get(array, i); result += String.fromCharCode(charCode); } return result; @@ -284,7 +284,7 @@ func intoCharCodeArray( for (let i = 0; i < string.length; i++) { let charCode = string.charCodeAt(i); - array_i16_mut_set(array, start + i, charCode); + array_i16_set(array, start + i, charCode); } return string.length; } @@ -495,8 +495,7 @@ The following internal helpers are defined in wasm and used by the below definit ```wasm (module - (type $array_i8 (array i8)) - (type $array_i8_mut (array (mut i8))) + (type $array_i8 (array (mut i8))) (func (export "unreachable") unreachable @@ -510,15 +509,15 @@ The following internal helpers are defined in 
wasm and used by the below definit local.get 1 array.get_u $array_i8 ) - (func (export "array_i8_mut_new") (param i32) (result (ref $array_i8_mut)) + (func (export "array_i8_new") (param i32) (result (ref $array_i8)) local.get 0 - array.new_default $array_i8_mut + array.new_default $array_i8 ) - (func (export "array_i8_mut_set") (param (ref $array_i8_mut) i32 i32) + (func (export "array_i8_set") (param (ref $array_i8) i32 i32) local.get 0 local.get 1 local.get 2 - array.set $array_i8_mut + array.set $array_i8 ) ) ``` @@ -556,7 +555,7 @@ function trap() { /// If this is an issue for toolchains, we can look into how to relax the /// function type while still maintaining good performance. func decodeStringFromUTF8Array( - array: (ref null (array i8)), + array: (ref null (array (mut i8))), start: i32, end: i32 ) -> (ref extern) @@ -663,7 +662,7 @@ func encodeStringIntoUTF8Array( trap(); for (let i = 0; i < bytes.length; i++) { - array_i8_mut_set(array, start + i, bytes[i]); + array_i8_set(array, start + i, bytes[i]); } return bytes.length; @@ -691,9 +690,9 @@ func encodeStringToUTF8Array( let encoder = new TextEncoder(); let bytes = encoder.encode(string); - let array = array_i8_mut_new(bytes.length); + let array = array_i8_new(bytes.length); for (let i = 0; i < bytes.length; i++) { - array_i8_mut_set(array, i, bytes[i]); + array_i8_set(array, i, bytes[i]); } return array; } From fee87c1dc79210396392dbde559bfe99ed8d74a3 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 19 Jan 2024 16:06:05 -0500 Subject: [PATCH 12/70] Add streaming-related functions Fixes #19 --- proposals/js-string-builtins/Overview.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 51c865a44c..739b5a18bd 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -116,6 +116,18 @@ namespace WebAssembly { Module moduleObject, optional object 
importObject ); + + # Async streaming compile accepts compile options. + Promise<Module> compileStreaming( + Promise<Response> source, + optional WebAssemblyCompileOptions options); + + # Async streaming compile and instantiate accepts compile options after + # imports. + Promise<InstantiatedSource> instantiateStreaming( + Promise<Response> source, + optional object importObject, + optional WebAssemblyCompileOptions options); }; ``` From 9363099246db61b34f556f378024bc8a283d342f Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 19 Jan 2024 16:15:46 -0500 Subject: [PATCH 13/70] Clarify behavior of flags and function names Fixes #17. --- proposals/js-string-builtins/Overview.md | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 739b5a18bd..c131359b2a 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -58,6 +58,8 @@ There are several implications of this: - Function builtins must be imported with the correct type. - Function builtins may become `funcref`, stored in tables, etc. +The `name of the WebAssembly function` JS-API procedure is extended to return the import field name for builtin functions, not an index value. + ## Type builtins Type builtins could be an instance of the `WebAssembly.Type` interface provided by the [type-imports](https://github.com/webassembly/type-imports) proposal. The values contained in a type builtin would be specified with a predicate. @@ -133,10 +135,16 @@ namespace WebAssembly { A wasm module that has enabled builtins will have the specific import specifier, such as `wasm:js-string` for that interface available and eagerly applied. -Concretely this means that imports that refer to that specifier will be eagerly checked for link errors at compile time, those imports will not show up in `WebAssembly.Module.imports()`, and those imports will not need to be provided at instantiation time.
+Concretely this means that imports that refer to that specifier will be eagerly checked for link errors at compile time, those imports will not show up in `WebAssembly.Module.imports()`, and those imports will not need to be provided at instantiation time. No property lookup on the instantiation imports object will be done for those imports. When the module is instantiated, a unique instantiation of the builtins are created. This means that re-exports of builtin functions will have different identities if they come from different instances. This is a useful property for future extensions to bind memory to builtins or evolve the types as things like type-imports or a core stringref type are added (see below). +## Progressive enhancement + +For engines that don't support builtins, any compile options passed to the JS-API will be ignored (due to WebIDL rules for extra parameters). For engines that do support builtins, any imports that refer to a builtin are not looked up on the instantiation import object. + +Together this means that it's safe for users to request builtins while still providing a polyfill for backup behavior and the optimal path will be chosen. + ## Feature detection Users may wish to detect if a specific builtin is available in their system. 
From 492bbd40ac9a9fc04e5e2cbc22f79a04760c7dad Mon Sep 17 00:00:00 2001 From: Martin Kustermann Date: Fri, 15 Mar 2024 12:37:37 +0100 Subject: [PATCH 14/70] Fix link to type-imports proposal in js-string-builtins/Overview.md --- proposals/js-string-builtins/Overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index c131359b2a..39c2df8c18 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -62,7 +62,7 @@ The `name of the WebAssembly function` JS-API procedure is extended to return th ## Type builtins -Type builtins could be an instance of the `WebAssembly.Type` interface provided by the [type-imports](https://github.com/webassembly/type-imports) proposal. The values contained in a type builtin would be specified with a predicate. +Type builtins could be an instance of the `WebAssembly.Type` interface provided by the [type-imports](https://github.com/WebAssembly/proposal-type-imports) proposal. The values contained in a type builtin would be specified with a predicate. This proposal does not add any type builtins, as the design around type-imports is in flux. From 2b24b0d1ca7595b27cac579b3db5cabc6187ce47 Mon Sep 17 00:00:00 2001 From: Thomas Lively Date: Mon, 18 Mar 2024 09:53:21 -0700 Subject: [PATCH 15/70] Clarify feature detection scheme Explain that users should validate modules that deliberately produce link errors to test for support for particular builtins.
--- proposals/js-string-builtins/Overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 39c2df8c18..7b249fe910 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -149,7 +149,7 @@ Together this means that it's safe for users to request builtins while still pro Users may wish to detect if a specific builtin is available in their system. -For this purpose, `WebAssembly.validate` is extended to take a list of builtins to enable, like compile does. After validating the module, the eager link checking that compile does is also performed. Users can inspect the result of validate on modules importing builtins to see if they are supported. +For this purpose, `WebAssembly.validate` is extended to take a list of builtins to enable, like compile does. After validating the module, the eager link checking that compile does is also performed. Users can validate a module that deliberately imports a builtin operation with an incorrect signature and infer support for that particular builtin if validation reports a link error. ## Polyfilling From cf796dcb4b800c7f60fec709469024abbd9d5686 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Mon, 25 Mar 2024 16:56:54 -0500 Subject: [PATCH 16/70] Add test for js-string-builtins This commit adds a basic suite of tests for the js-string-builtins. This is done by defining a polyfill module matching the overview, and then comparing the host provided builtins against the polyfill on representative inputs. 
--- test/.DS_Store | Bin 0 -> 6148 bytes test/js-api/.DS_Store | Bin 0 -> 6148 bytes test/js-api/js-string/basic.tentative.any.js | 383 +++++++++++++++++++ test/js-api/js-string/polyfill.js | 170 ++++++++ test/js-api/wasm-module-builder.js | 4 + 5 files changed, 557 insertions(+) create mode 100644 test/.DS_Store create mode 100644 test/js-api/.DS_Store create mode 100644 test/js-api/js-string/basic.tentative.any.js create mode 100644 test/js-api/js-string/polyfill.js diff --git a/test/.DS_Store b/test/.DS_Store new file mode 100644 index 0000000000000000000000000000000000000000..f0081cd54879fe0a01a241b176985baa2af64f92 GIT binary patch literal 6148 zcmeHK!A`Q*cVojrX7?-pC(~leRy+qSd)l zQ81tT!A01gbjv#@DjxY^+#l+Iu-8YI>&r0isYzRn<6fd;J+t8yonp6KnNI7iW>q%o z^;uO;TZgr(JZdy&v!b)RcW`{xeTtrA^=e2I_`hk{wm643oGkKr@rH4v;(N4_e;yto zfqn2{0O5w@x|GvOzH?3(nE_^i8CYe&o}o@@b%WQ;05kCK8KCn)q7u3mQ-k{Gz(KbF zh;)tAf^+I6C`MXzEv5!>2Su1rL=!4(iy=%n+NJe#Ev5!dI0)N(2>WJXI~1Ycj?b6c z9fWI;M`nN-SY{w=mSw8{r$6`q%R#(i2AF}3VnF0NUZ;&I+1k369MxKhdW%XzeyPDl k2u^eS5T317H;7QgipK@76&n>%yu?~xz=$4HYGX=`rrFY@_D~8r>kIh;zJlP} zIQvskTET;e$PCPWv$He1*)L%?0|2bn4_$x~0B}@=1qYicqQ;d@NWpsM5SgAMgC0bX zKsy*rM3dt$GN9g_1GkXEI_ih_mv)0#1~~c@bmJ(^s?`@(m?_TA&GUJ_z%Tt9IqJ)*Tz6Cv!@bRFMQm^Fj7APGuWjrfH}2w2A|DK$48J@rix$W5h{lwi#ckKW zmYwc6YMjZ)ct?-b^ZT2 ziF(8UG4QV#V8yy$ui=*L-8!*3b=L~iD^w*aE;sm|f`-0|F;-p0tEgJgFO-4kS { + // Compile a module that exports a function for each builtin that will call + // it. We could just generate a module that re-exports the builtins, but that + // would not catch any special codegen that could happen when direct calling + // a known builtin function from wasm. 
+ const builder = new WasmModuleBuilder(); + const arrayIndex = builder.addArray(kWasmI16, true, kNoSuperType, true); + const builtins = [ + { + name: "test", + params: [kWasmExternRef], + results: [kWasmI32], + }, + { + name: "cast", + params: [kWasmExternRef], + results: [wasmRefType(kWasmExternRef)], + }, + { + name: "fromCharCodeArray", + params: [wasmRefNullType(arrayIndex), kWasmI32, kWasmI32], + results: [wasmRefType(kWasmExternRef)], + }, + { + name: "intoCharCodeArray", + params: [kWasmExternRef, wasmRefNullType(arrayIndex), kWasmI32], + results: [kWasmI32], + }, + { + name: "fromCharCode", + params: [kWasmI32], + results: [wasmRefType(kWasmExternRef)], + }, + { + name: "fromCodePoint", + params: [kWasmI32], + results: [wasmRefType(kWasmExternRef)], + }, + { + name: "charCodeAt", + params: [kWasmExternRef, kWasmI32], + results: [kWasmI32], + }, + { + name: "codePointAt", + params: [kWasmExternRef, kWasmI32], + results: [kWasmI32], + }, + { + name: "length", + params: [kWasmExternRef], + results: [kWasmI32], + }, + { + name: "concat", + params: [kWasmExternRef, kWasmExternRef], + results: [wasmRefType(kWasmExternRef)], + }, + { + name: "substring", + params: [kWasmExternRef, kWasmI32, kWasmI32], + results: [wasmRefType(kWasmExternRef)], + }, + { + name: "equals", + params: [kWasmExternRef, kWasmExternRef], + results: [kWasmI32], + }, + { + name: "compare", + params: [kWasmExternRef, kWasmExternRef], + results: [kWasmI32], + }, + ]; + + // Add a function type for each builtin + for (let builtin of builtins) { + builtin.type = builder.addType({ + params: builtin.params, + results: builtin.results + }); + } + + // Add an import for each builtin + for (let builtin of builtins) { + builtin.importFuncIndex = builder.addImport( + "wasm:js-string", + builtin.name, + builtin.type); + } + + // Generate an exported function to call the builtin + for (let builtin of builtins) { + let func = builder.addFunction(builtin.name + "Imp", builtin.type); + 
func.addLocals(builtin.params.length); + let body = []; + for (let i = 0; i < builtin.params.length; i++) { + body.push(kExprLocalGet); + body.push(...wasmSignedLeb(i)); + } + body.push(kExprCallFunction); + body.push(...wasmSignedLeb(builtin.importFuncIndex)); + func.addBody(body); + func.exportAs(builtin.name); + } + + const buffer = builder.toBuffer(); + + // Instantiate this module using the builtins from the host + const builtinModule = new WebAssembly.Module(buffer, { + builtins: ["js-string"] + }); + const builtinInstance = new WebAssembly.Instance(builtinModule, {}); + builtinExports = builtinInstance.exports; + + // Instantiate this module using the polyfill module + const polyfillModule = new WebAssembly.Module(buffer); + const polyfillInstance = new WebAssembly.Instance(polyfillModule, { + "wasm:js-string": polyfillImports + }); + polyfillExports = polyfillInstance.exports; +}); + +// A helper function to assert that the behavior of two functions are the +// same. +function assert_same_behavior(funcA, funcB, ...params) { + let resultA; + let errA = null; + try { + resultA = funcA(...params); + } catch (err) { + errA = err; + } + + let resultB; + let errB = null; + try { + resultB = funcB(...params); + } catch (err) { + errB = err; + } + + if (errA || errB) { + assert_equals(errA === null, errB === null, errA ? 
errA.message : errB.message); + assert_equals(Object.getPrototypeOf(errA), Object.getPrototypeOf(errB)); + } + assert_equals(resultA, resultB); + + if (errA) { + throw errA; + } + return resultA; +} + +function assert_throws_if(func, shouldThrow, constructor) { + let error = null; + try { + func(); + } catch (e) { + error = e; + } + assert_equals(error !== null, shouldThrow); + if (shouldThrow && error !== null) { + assert_true(error instanceof constructor); + } +} + +// Constant values used in the tests below +const testStrings = [ + "", + "a", + "1", + "ab", + "hello, world", + "\n", + "☺", + "☺☺", + String.fromCodePoint(0x10000, 0x10001) +]; +const testCharCodes = [1, 2, 3, 10, 0x7f, 0xff, 0xfffe, 0xffff]; +const testCodePoints = [1, 2, 3, 10, 0x7f, 0xff, 0xfffe, 0xffff, 0x10000, 0x10001]; +const testExternRefValues = [ + null, + undefined, + true, + false, + {x:1337}, + ["abracadabra"], + 13.37, + -0, + 0x7fffffff + 0.1, + -0x7fffffff - 0.1, + 0x80000000 + 0.1, + -0x80000000 - 0.1, + 0xffffffff + 0.1, + -0xffffffff - 0.1, + Number.EPSILON, + Number.MAX_SAFE_INTEGER, + Number.MIN_SAFE_INTEGER, + Number.MIN_VALUE, + Number.MAX_VALUE, + Number.NaN, + "hi", + 37n, + new Number(42), + new Boolean(true), + Symbol("status"), + () => 1337, +]; + +// Test that `test` and `cast` work on various JS values. Run all the +// other builtins and assert that they also perform equivalent type +// checks. 
+test(() => { + for (let a of testExternRefValues) { + let isString = assert_same_behavior( + builtinExports['test'], + polyfillExports['test'], + a + ); + + assert_throws_if(() => assert_same_behavior( + builtinExports['cast'], + polyfillExports['cast'], + a + ), !isString, WebAssembly.RuntimeError); + + let arrayMutI16 = helperExports.createArrayMutI16(10); + assert_throws_if(() => assert_same_behavior( + builtinExports['intoCharCodeArray'], + polyfillExports['intoCharCodeArray'], + a, arrayMutI16, 0 + ), !isString, WebAssembly.RuntimeError); + + assert_throws_if(() => assert_same_behavior( + builtinExports['charCodeAt'], + polyfillExports['charCodeAt'], + a, 0 + ), !isString, WebAssembly.RuntimeError); + + assert_throws_if(() => assert_same_behavior( + builtinExports['codePointAt'], + polyfillExports['codePointAt'], + a, 0 + ), !isString, WebAssembly.RuntimeError); + + assert_throws_if(() => assert_same_behavior( + builtinExports['length'], + polyfillExports['length'], + a + ), !isString, WebAssembly.RuntimeError); + + assert_throws_if(() => assert_same_behavior( + builtinExports['concat'], + polyfillExports['concat'], + a, a + ), !isString, WebAssembly.RuntimeError); + + assert_throws_if(() => assert_same_behavior( + builtinExports['substring'], + polyfillExports['substring'], + a, 0, 0 + ), !isString, WebAssembly.RuntimeError); + + assert_throws_if(() => assert_same_behavior( + builtinExports['equals'], + polyfillExports['equals'], + a, a + ), !isString, WebAssembly.RuntimeError); + + assert_throws_if(() => assert_same_behavior( + builtinExports['compare'], + polyfillExports['compare'], + a, a + ), !isString, WebAssembly.RuntimeError); + } +}); + +// Test that `fromCharCode` works on various char codes +test(() => { + for (let a of testCharCodes) { + assert_same_behavior( + builtinExports['fromCharCode'], + polyfillExports['fromCharCode'], + a + ); + } +}); + +// Test that `fromCodePoint` works on various code points +test(() => { + for (let a of 
testCodePoints) { + assert_same_behavior( + builtinExports['fromCodePoint'], + polyfillExports['fromCodePoint'], + a + ); + } +}); + +// Perform tests on various strings +test(() => { + for (let a of testStrings) { + let length = assert_same_behavior( + builtinExports['length'], + polyfillExports['length'], + a + ); + + for (let i = 0; i < length; i++) { + let charCode = assert_same_behavior( + builtinExports['charCodeAt'], + polyfillExports['charCodeAt'], + a, i + ); + } + + for (let i = 0; i < length; i++) { + let charCode = assert_same_behavior( + builtinExports['codePointAt'], + polyfillExports['codePointAt'], + a, i + ); + } + + let arrayMutI16 = helperExports.createArrayMutI16(length); + assert_same_behavior( + builtinExports['intoCharCodeArray'], + polyfillExports['intoCharCodeArray'], + a, arrayMutI16, 0 + ); + + assert_same_behavior( + builtinExports['fromCharCodeArray'], + polyfillExports['fromCharCodeArray'], + arrayMutI16, 0, length + ); + + for (let i = 0; i < length; i++) { + for (let j = 0; j < length; j++) { + assert_same_behavior( + builtinExports['substring'], + polyfillExports['substring'], + a, i, j + ); + } + } + } +}); + +// Test various binary operations +test(() => { + for (let a of testStrings) { + for (let b of testStrings) { + assert_same_behavior( + builtinExports['concat'], + polyfillExports['concat'], + a, b + ); + + assert_same_behavior( + builtinExports['equals'], + polyfillExports['equals'], + a, b + ); + + assert_same_behavior( + builtinExports['compare'], + polyfillExports['compare'], + a, b + ); + } + } +}); diff --git a/test/js-api/js-string/polyfill.js b/test/js-api/js-string/polyfill.js new file mode 100644 index 0000000000..e18236899d --- /dev/null +++ b/test/js-api/js-string/polyfill.js @@ -0,0 +1,170 @@ +// Generate some helper functions for manipulating (array (mut i16)) from JS +let helperExports; +{ + const builder = new WasmModuleBuilder(); + const arrayIndex = builder.addArray(kWasmI16, true, kNoSuperType, true); + + 
builder + .addFunction("createArrayMutI16", { + params: [kWasmI32], + results: [kWasmAnyRef] + }) + .addBody([ + kExprLocalGet, + ...wasmSignedLeb(0), + ...GCInstr(kExprArrayNewDefault), + ...wasmSignedLeb(arrayIndex) + ]) + .exportFunc(); + + builder + .addFunction("arrayLength", { + params: [kWasmArrayRef], + results: [kWasmI32] + }) + .addBody([ + kExprLocalGet, + ...wasmSignedLeb(0), + ...GCInstr(kExprArrayLen) + ]) + .exportFunc(); + + builder + .addFunction("arraySet", { + params: [wasmRefNullType(arrayIndex), kWasmI32, kWasmI32], + results: [] + }) + .addBody([ + kExprLocalGet, + ...wasmSignedLeb(0), + kExprLocalGet, + ...wasmSignedLeb(1), + kExprLocalGet, + ...wasmSignedLeb(2), + ...GCInstr(kExprArraySet), + ...wasmSignedLeb(arrayIndex) + ]) + .exportFunc(); + + builder + .addFunction("arrayGet", { + params: [wasmRefNullType(arrayIndex), kWasmI32], + results: [kWasmI32] + }) + .addBody([ + kExprLocalGet, + ...wasmSignedLeb(0), + kExprLocalGet, + ...wasmSignedLeb(1), + ...GCInstr(kExprArrayGetU), + ...wasmSignedLeb(arrayIndex) + ]) + .exportFunc(); + + let bytes = builder.toBuffer(); + let module = new WebAssembly.Module(bytes); + let instance = new WebAssembly.Instance(module); + + helperExports = instance.exports; +} + +function throwIfNotString(a) { + if (typeof a !== "string") { + throw new WebAssembly.RuntimeError(); + } +} + +this.polyfillImports = { + test: (string) => { + if (string === null || + typeof string !== "string") { + return 0; + } + return 1; + }, + cast: (string) => { + throwIfNotString(string); + return string; + }, + fromCharCodeArray: (array, arrayStart, arrayCount) => { + arrayStart >>>= 0; + arrayCount >>>= 0; + let length = helperExports.arrayLength(array); + if (BigInt(arrayStart) + BigInt(arrayCount) > BigInt(length)) { + throw new WebAssembly.RuntimeError(); + } + let result = ''; + for (let i = arrayStart; i < arrayStart + arrayCount; i++) { + result += String.fromCharCode(helperExports.arrayGet(array, i)); + } + return result; 
+ }, + intoCharCodeArray: (string, arr, arrayStart) => { + arrayStart >>>= 0; + throwIfNotString(string); + let arrLength = helperExports.arrayLength(arr); + let stringLength = string.length; + if (BigInt(arrayStart) + BigInt(stringLength) > BigInt(arrLength)) { + throw new WebAssembly.RuntimeError(); + } + for (let i = 0; i < stringLength; i++) { + helperExports.arraySet(arr, arrayStart + i, string[i].charCodeAt(0)); + } + return stringLength; + }, + fromCharCode: (charCode) => { + charCode >>>= 0; + return String.fromCharCode(charCode); + }, + fromCodePoint: (codePoint) => { + codePoint >>>= 0; + return String.fromCodePoint(codePoint); + }, + charCodeAt: (string, stringIndex) => { + stringIndex >>>= 0; + throwIfNotString(string); + if (stringIndex >= string.length) + throw new WebAssembly.RuntimeError(); + return string.charCodeAt(stringIndex); + }, + codePointAt: (string, stringIndex) => { + stringIndex >>>= 0; + throwIfNotString(string); + if (stringIndex >= string.length) + throw new WebAssembly.RuntimeError(); + return string.codePointAt(stringIndex); + }, + length: (string) => { + throwIfNotString(string); + return string.length; + }, + concat: (stringA, stringB) => { + throwIfNotString(stringA); + throwIfNotString(stringB); + return stringA + stringB; + }, + substring: (string, startIndex, endIndex) => { + startIndex >>>= 0; + endIndex >>>= 0; + throwIfNotString(string); + // An end past the string length is clamped by `substring` itself. + if (startIndex > string.length || + endIndex < startIndex) { + return ""; + } + return string.substring(startIndex, endIndex); + }, + equals: (stringA, stringB) => { + throwIfNotString(stringA); + throwIfNotString(stringB); + return stringA === stringB; + }, + compare: (stringA, stringB) => { + throwIfNotString(stringA); + throwIfNotString(stringB); + if (stringA < stringB) { + return -1; + } + return stringA === stringB ?
0 : 1; + }, +}; diff --git a/test/js-api/wasm-module-builder.js b/test/js-api/wasm-module-builder.js index d0f9e78bcd..ae2601f136 100644 --- a/test/js-api/wasm-module-builder.js +++ b/test/js-api/wasm-module-builder.js @@ -100,6 +100,10 @@ let kWasmS128 = 0x7b; let kWasmAnyRef = 0x6f; let kWasmAnyFunc = 0x70; +// Packed storage types +let kWasmI8 = 0x78; +let kWasmI16 = 0x77; + let kExternalFunction = 0; let kExternalTable = 1; let kExternalMemory = 2; From 89513754dbdade7491a85e6c2365d1fd6979ae7f Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 29 Mar 2024 15:56:29 -0500 Subject: [PATCH 17/70] Fix type signature of 'concat' Fixes #24. --- proposals/js-string-builtins/Overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 7b249fe910..6b58f45966 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -415,7 +415,7 @@ func length(string: externref) -> i32 { func concat( first: externref, second: externref -) -> externref +) -> (ref extern) { if (first === null || typeof first !== "string") From 2b634f5a9f00f843418a06bfcd71340ac7957d06 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 29 Mar 2024 17:20:52 -0500 Subject: [PATCH 18/70] Add section on string constants, including Struct.from --- proposals/js-string-builtins/Overview.md | 37 ++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 6b58f45966..aa7921a095 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -718,6 +718,43 @@ func encodeStringToUTF8Array( } ``` +## String constants + +This proposal does not add a way to defined string constants within a wasm module. 
Users are expected to define any string literals needed in a JS source file that their wasm module's can access using several methods. + +The first step is to define the string literals in a JS source. There are many ways to do this: + 1. A JS array literal: `let strings = ["a", ...];` + 2. JSON literal: `let strings = JSON.parse('["a", ...]')` + - This may be faster than a JS array literal for very large strings, due to the complexity of parsing JS vs. JSON. + 3. Imported JSON file using fetch: `let strings = await fetch('strings.json').json();` + - This could allow you to run other startup logic or module compilation + while your strings are fetched, at the cost of managing asynchronicity. + +The second step is to access the string literals in wasm. + +The easiest way to do this is to import every string as a global: `(global (import "strings" "i") (ref extern))`. This has the advantage that string literals can be used in initializer expressions in the module using `global.get`. This approach may have some difficulty scaling to modules with many strings due to an implementation agreed-upon limit of 100,000 imports. + +To support large modules, this proposals adds a new static method: `WebAssembly.Struct.from` which constructs a struct from a JS iterable. The resulting struct can then be imported as a global and accessed using `struct.get`. This proposal also relaxes `struct.get` to be valid in a constant expression, as long as the reference type is non-nullable. + +### WebAssembly.Struct.from + +```webidl +[LegacyNamespace=WebAssembly, Exposed=*] +interface Struct { + static from( + string fieldType, + bool fieldMutable, + unsigned long fieldCount, + sequence fieldValues); +} +``` + +`Struct.from` creates a wasm struct using the `fieldType` and `fieldMutable` to create the field type, and then repeated `fieldCount` times. The initial values are taken from iterating over fieldValues. 
+ +An alternative design could allow multiple different field types to be declared, but this would result in a very large type for the string constant use-case that would also make iterating over the values more difficult. + +This proposal does not change exported wasm GC structs to report `struct instanceof WebAssembly.Struct === true.` The intention is just to have an idiomatic name for this operation. If this is considered an issue, the method could be moved to the WebAssembly namespace and called `WebAssembly.structFrom`. + ## Future extensions There are several extensions we can make in the future as need arrives. From 6a7a39b3cd5e4fc6c5f45554492310e74a54c8d1 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 10 Apr 2024 12:39:19 -0500 Subject: [PATCH 19/70] Rework to use imported string constants idea --- proposals/js-string-builtins/Overview.md | 51 +++++++----------------- 1 file changed, 14 insertions(+), 37 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index aa7921a095..bd1ffb7660 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -163,6 +163,20 @@ JS Strings are semantically a sequence of 16-bit code units (referred to as char There is however, the Encoding API for `TextEncoder`/`TextDecoder` which can be used for UTF-8 support. However, this is technically a separate spec from JS and may not be available on all JS engines (in practice it's available widely). This proposal exposes UTF-8 data conversions using this API under separate `wasm:text-encoder` `wasm:text-decoder` interfaces which are available when the host implements these interfaces. +## String constants + +String constants may be defined in JS and made available to wasm through a variety of means. + +The simplest way is to have a module import each string as an immutable global. 
This can work for small amounts of strings, but has a high cost for when the number of string constants is very large. + +This proposal adds an extension to JS-API compile routine to support optimized 'imported string constants' to address this use-case. + +The `WebAssemblyCompileOptions` is extended with a `boolean importedStringConstants` flag. When this is set, the module may define imports of the form `(import "'" %stringConstant (global externref))`, and the JS-API will use the provided `%stringConstant%` import field name to be the value of the global. This allows for any UTF-8 string to be imported with minimal overhead. + +The string namespace is chosen to be the single quote ASCII character `'`. We may revise this to be a longer name before this proposal is finalized. + +All imports that reference this namespace must be globals of immutable externref. If they are not, an eager compile error is emitted. + ## JS String Builtin API The following is an initial set of function builtins for JavaScript String. The builtins are exposed under `wasm:js-string`. @@ -718,43 +732,6 @@ func encodeStringToUTF8Array( } ``` -## String constants - -This proposal does not add a way to defined string constants within a wasm module. Users are expected to define any string literals needed in a JS source file that their wasm module's can access using several methods. - -The first step is to define the string literals in a JS source. There are many ways to do this: - 1. A JS array literal: `let strings = ["a", ...];` - 2. JSON literal: `let strings = JSON.parse('["a", ...]')` - - This may be faster than a JS array literal for very large strings, due to the complexity of parsing JS vs. JSON. - 3. Imported JSON file using fetch: `let strings = await fetch('strings.json').json();` - - This could allow you to run other startup logic or module compilation - while your strings are fetched, at the cost of managing asynchronicity. 
- -The second step is to access the string literals in wasm. - -The easiest way to do this is to import every string as a global: `(global (import "strings" "i") (ref extern))`. This has the advantage that string literals can be used in initializer expressions in the module using `global.get`. This approach may have some difficulty scaling to modules with many strings due to an implementation agreed-upon limit of 100,000 imports. - -To support large modules, this proposals adds a new static method: `WebAssembly.Struct.from` which constructs a struct from a JS iterable. The resulting struct can then be imported as a global and accessed using `struct.get`. This proposal also relaxes `struct.get` to be valid in a constant expression, as long as the reference type is non-nullable. - -### WebAssembly.Struct.from - -```webidl -[LegacyNamespace=WebAssembly, Exposed=*] -interface Struct { - static from( - string fieldType, - bool fieldMutable, - unsigned long fieldCount, - sequence fieldValues); -} -``` - -`Struct.from` creates a wasm struct using the `fieldType` and `fieldMutable` to create the field type, and then repeated `fieldCount` times. The initial values are taken from iterating over fieldValues. - -An alternative design could allow multiple different field types to be declared, but this would result in a very large type for the string constant use-case that would also make iterating over the values more difficult. - -This proposal does not change exported wasm GC structs to report `struct instanceof WebAssembly.Struct === true.` The intention is just to have an idiomatic name for this operation. If this is considered an issue, the method could be moved to the WebAssembly namespace and called `WebAssembly.structFrom`. - ## Future extensions There are several extensions we can make in the future as need arrives. 
From f13f076771627b6c18d41bb9a7b13080c9373c0e Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 10 Apr 2024 13:31:46 -0500 Subject: [PATCH 20/70] Fixup typos --- proposals/js-string-builtins/Overview.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index bd1ffb7660..0b248e0234 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -171,11 +171,11 @@ The simplest way is to have a module import each string as an immutable global. This proposal adds an extension to JS-API compile routine to support optimized 'imported string constants' to address this use-case. -The `WebAssemblyCompileOptions` is extended with a `boolean importedStringConstants` flag. When this is set, the module may define imports of the form `(import "'" %stringConstant (global externref))`, and the JS-API will use the provided `%stringConstant%` import field name to be the value of the global. This allows for any UTF-8 string to be imported with minimal overhead. +The `WebAssemblyCompileOptions` is extended with a `boolean importedStringConstants` flag. When this is set, the module may define imports of the form `(import "'" "%stringConstant%"" (global externref))`, and the JS-API will use the provided `%stringConstant%` import field name to be the value of the global. This allows for any UTF-8 string to be imported with minimal overhead. The string namespace is chosen to be the single quote ASCII character `'`. We may revise this to be a longer name before this proposal is finalized. -All imports that reference this namespace must be globals of immutable externref. If they are not, an eager compile error is emitted. +All imports that reference this namespace must be globals of type immutable externref. If they are not, an eager compile error is emitted. 
## JS String Builtin API From 8c6595063541f4bec863bdf3d7696e9f37532b07 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C3=96mer=20Sinan=20A=C4=9Facan?= Date: Mon, 22 Apr 2024 12:21:07 +0200 Subject: [PATCH 21/70] Add missing start parameter to measureStringAsUTF8 overview --- proposals/js-string-builtins/Overview.md | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 0b248e0234..f6cd59ff8f 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -632,6 +632,7 @@ func decodeStringFromUTF8Array( /// Traps if the length of the UTF-8 encoded string doesn't fit into an i32 func measureStringAsUTF8( string: externref, + start: i32 ) -> i32 { // NOTE: `start` is interpreted as a signed 32-bit integer when converted From 809dd46a7a7120b4d642455ae69ee294e3a021dd Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C3=96mer=20Sinan=20A=C4=9Facan?= Date: Mon, 22 Apr 2024 15:08:36 +0200 Subject: [PATCH 22/70] Update Overview.md --- proposals/js-string-builtins/Overview.md | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index f6cd59ff8f..ba392f259b 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -631,14 +631,9 @@ func decodeStringFromUTF8Array( /// /// Traps if the length of the UTF-8 encoded string doesn't fit into an i32 func measureStringAsUTF8( - string: externref, - start: i32 + string: externref ) -> i32 { - // NOTE: `start` is interpreted as a signed 32-bit integer when converted - // to a JS value using standard conversions. Reinterpret as unsigned here. - start >>>= 0; - // Technically a partially redundant test, but want to be clear the null is // not allowed. 
if (string === null || From bb95b017c99fa1f3785fc6e92c442c418f8597e1 Mon Sep 17 00:00:00 2001 From: Thomas Steiner Date: Mon, 6 May 2024 13:16:00 +0200 Subject: [PATCH 23/70] Many small improvements - Syntax highlighting - Added links - Code font - Consistency - Grammer fixes --- proposals/js-string-builtins/Overview.md | 73 ++++++++++++------------ 1 file changed, 38 insertions(+), 35 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index ba392f259b..b37d4f020c 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -2,58 +2,58 @@ ## Motivation -JavaScript runtimes have a rich set of [builtin objects and primitives](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects). Some languages targeting WebAssembly may have compatible primitives and would benefit from being able to use the equivalent JavaScript primitive for their implementation. The most pressing use-case here is for languages who would like to use the JavaScript String type to implement their strings. +JavaScript runtimes have a rich set of [builtin objects and primitives](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects). Some languages targeting WebAssembly may have compatible primitives and would benefit from being able to use the equivalent JavaScript primitive for their implementation. The most pressing use-case here is for languages who would like to use the JavaScript [`String`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String) type to implement their strings. -It is already possible to use any JavaScript or Web API from WebAssembly by importing JavaScript 'glue code' which adapts between JavaScript and WebAssembly values and calling conventions. 
Usually, this has a negligible performance impact and work has been done to optimize this [in runtimes when we can](https://hacks.mozilla.org/2018/10/calls-between-javascript-and-webassembly-are-finally-fast-%F0%9F%8E%89/). +It is already possible to use any JavaScript or Web API from WebAssembly by importing JavaScript 'glue code' which adapts between JavaScript and WebAssembly values and calling conventions. Usually, this has a negligible performance impact and work has been done to [optimize this in runtimes when we can](https://hacks.mozilla.org/2018/10/calls-between-javascript-and-webassembly-are-finally-fast-%F0%9F%8E%89/). -However, the overhead of importing glue code is prohibitive for primitives such as Strings, ArrayBuffers, RegExp, Map, and BigInt where the desired overhead of operations is a tight sequence of inline instructions, not an indirect function call (which is typical of imported functions). +However, the overhead of importing glue code is prohibitive for primitives such as [`String`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String), [`ArrayBuffer`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer), [`RegExp`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp), [`Map`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map), and [`BigInt`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt) where the desired overhead of operations is a tight sequence of inline instructions, not an indirect function call (which is typical of imported functions). ## Overview This proposal aims to provide a minimal and general mechanism for importing specific JavaScript primitives for efficient usage in WebAssembly code. -This is done by first adding a set of wasm builtin functions for performing JavaScript String operations. 
These builtin functions mirror a subset of the [JavaScript String API](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String) and adapt it to be efficiently callable without JavaScript glue code. +This is done by first adding a set of Wasm builtin functions for performing JavaScript String operations. These builtin functions mirror a subset of the [JavaScript String API](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String) and adapt it to be efficiently callable without JavaScript glue code. -Then a mechanism for importing these wasm builtin functions is added to the WebAssembly JS-API. These builtins are grouped in modules and exist in a new reserved import namespace `wasm:` that is enabled at compile-time with a flag. +Then a mechanism for importing these Wasm builtin functions is added to the WebAssembly JS-API. These builtins are grouped in modules and exist in a new reserved import namespace `wasm:` that is enabled at compile-time with a flag. -These two pieces in combination allow runtimes to reliably emit optimal code sequences for JavaScript String operations within WebAssembly modules. In the future, other JS builtin objects or JS primitives can be exposed through new wasm builtins. +These two pieces in combination allow runtimes to reliably emit optimal code sequences for JavaScript string operations within WebAssembly modules. In the future, other JS builtin objects or JS primitives can be exposed through new Wasm builtins. -## Do we need new wasm builtin functions? +## Do we need new Wasm builtin functions? -It is already possible today to import JS builtin functions (such as String.prototoype.getCharCodeAt) from wasm modules. Instead of defining new wasm specific-builtins, we could just re-use those directly. 
+It is already possible today to import JS builtin functions (such as [`String.prototoype.harCodeAt()`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt)) from Wasm modules. Instead of defining new Wasm specific-builtins, we could just re-use those directly. There are several problems with this approach. -The first problem is that existing API’s require a calling convention conversion to handle differences around the `this` value, which WebAssembly function import calls leave as `undefined`. The second problem is that certain primitive use JS operators such as `===` and `<` that cannot be imported. A third problem is that most JS builtins are extremely permissive of the types of values they accept, and it's desirable to leverage wasm's type system to remove those checks and coercions wherever we can. +The first problem is that existing APIs require a calling convention conversion to handle differences around the `this` value, which WebAssembly function import calls leave as `undefined`. The second problem is that certain primitives use JS operators such as `===` and `<` that cannot be imported. A third problem is that most JS builtins are extremely permissive of the types of values they accept, and it's desirable to leverage Wasm's type system to remove those checks and coercions wherever we can. It seems that creating new importable definitions that adapt existing JS primitives to WebAssembly is simpler and more flexible in the future. -## Do we need a new import mechanism for wasm builtin functions? +## Do we need a new import mechanism for Wasm builtin functions? -There is a variety of execution techniques for WebAssembly. Some WebAssembly engines compile modules eagerly (at WebAssembly.compile), some use interpreters and dynamic tiering, and some use on-demand compilation (after instantiation) and dynamic tiering. +There is a variety of execution techniques for WebAssembly. 
Some WebAssembly engines compile modules eagerly (at `WebAssembly.compile()`), some use interpreters and dynamic tiering, and some use on-demand compilation (after instantiation) and dynamic tiering. -If we just have builtin functions, it would be possible to normally import them normally through instantiation. However this would prevent engines from using eager compilation when builtins are in use. +If we just have builtin functions, it would be possible to normally import them through instantiation. However this would prevent engines from using eager compilation when builtins are in use. It seems desirable to support a variety of execution techniques, especially because engines may support multiple depending on heuristics or change them over time. -By adding builtins that are in a reserved and known namespace (`wasm:`), engines can know that these builtin functions are being used at `WebAssembly.compile` time and generate optimal code for them. +By adding builtins that are in a reserved and known namespace (`wasm:`), engines can know that these builtin functions are being used at `WebAssembly.compile()` time and generate optimal code for them. ## Goals for builtins Builtins should not provide any new abilities to WebAssembly that JS doesn't already have. They are intended to just wrap existing primitives in such a manner that WebAssembly can efficiently use them. In the cases the primitive already has a name, we should re-use it and not invent a new one. -Most builtins should be simple and do little work outside of calling into the JS functionality to do the operation. The one exception is for operations that convert between a JS primitive and a wasm primitive, such as between JS strings/arrays/linear memory. In this case, the builtin may need some non-trivial code to perform the operation. In these cases, it's still expected that the operation is just semantically copying information and not substantially transforming it into a new interpretation. 
+Most builtins should be simple and do little work outside of calling into the JS functionality to do the operation. The one exception is for operations that convert between a JS primitive and a Wasm primitive, such as between JS strings/arrays/linear memory. In this case, the builtin may need some non-trivial code to perform the operation and it's still expected that the operation is just semantically copying information and not substantially transforming it into a new interpretation. -The standardization of wasm builtins will be governed by the WebAssembly standards process and would exist in the JS-API document. +The standardization of Wasm builtins will be governed by the WebAssembly standards process and would exist in the JS-API document. The bar for adding a new builtin would be that it enables significantly better code generation for an important use-case beyond what is possible with a normal import. We don't want to add a new builtin for every existing API, only ones where adapting the JavaScript API to WebAssembly and allowing inline code generation results in significantly better codegen than a plain function call. ## Function builtins -Function builtins are defined with an external wasm function type, and internal JS-defined behavior. They have the same semantics as following ['create a host function'](https://webassembly.github.io/spec/js-api/#create-a-host-function) for the wasm function type and JS code given to get a wasm `funcaddr` that can be imported. +Function builtins are defined with an external Wasm function type, and internal JS-defined behavior. They have the same semantics as following ['create a host function'](https://webassembly.github.io/spec/js-api/#create-a-host-function) for the Wasm function type and JS code given to get a wasm `funcaddr` that can be imported. 
There are several implications of this: - - Calling a function builtin from wasm will have the wasm parameters converted to JS values, and JS results converted back to wasm values. + - Calling a function builtin from Wasm will have the Wasm parameters converted to JS values, and JS results converted back to Wasm values. - Exported function builtins are wrapped using ['create a new Exported function'](https://webassembly.github.io/spec/js-api/#a-new-exported-function). - Function builtins must be imported with the correct type. - Function builtins may become `funcref`, stored in tables, etc. @@ -75,12 +75,14 @@ An example import specifier could therefore be `(import "wasm:js-string" "equals The JS-API does not reserve a `wasm:` namespace today, so modules theoretically could already be using this namespace. Additionally, some users may wish to disable this feature for modules they compile so they could polyfill it. This feature is therefore opt-in via flags for each interface. To just enabled `js-string` builtins, a user would compile with: -``` + +```js WebAssembly.compile(bytes, { builtins: ['js-string'] }); ``` The full extension to the JS-API WebIDL is: -``` + +```idl dictionary WebAssemblyCompileOptions { optional sequence builtins; } @@ -133,11 +135,11 @@ namespace WebAssembly { }; ``` -A wasm module that has enabled builtins will have the specific import specifier, such as `wasm:js-string` for that interface available and eagerly applied. +A Wasm module that has enabled builtins will have the specific import specifier, such as `wasm:js-string` for that interface available and eagerly applied. Concretely this means that imports that refer to that specifier will be eagerly checked for link errors at compile time, those imports will not show up in `WebAssembly.Module.imports()`, and those imports will not need to be provided at instantiation time. No property lookup on the instantiation imports object will be done for those imports. 
-When the module is instantiated, a unique instantiation of the builtins are created. This means that re-exports of builtin functions will have different identities if they come from different instances. This is a useful property for future extensions to bind memory to builtins or evolve the types as things like type-imports or a core stringref type are added (see below). +When the module is instantiated, a unique instantiation of the builtins is created. This means that re-exports of builtin functions will have different identities if they come from different instances. This is a useful property for future extensions to bind memory to builtins or evolve the types as things like type-imports or a core stringref type are added (see below). ## Progressive enhancement @@ -149,7 +151,7 @@ Together this means that it's safe for users to request builtins while still pro Users may wish to detect if a specific builtin is available in their system. -For this purpose, `WebAssembly.validate` is extended to take a list of builtins to enable, like compile does. After validating the module, the eager link checking that compile does is also performed. Users can validate a module that deliberately imports a builtin operation with an incorrect signature and infer support for that particular builtin if validation reports a link error. +For this purpose, `WebAssembly.validate()` is extended to take a list of builtins to enable, like `compile()` does. After validating the module, the eager link checking that `compile()` does is also performed. Users can validate a module that deliberately imports a builtin operation with an incorrect signature and infer support for that particular builtin if validation reports a link error. ## Polyfilling @@ -159,19 +161,19 @@ If a user wishes to polyfill these imports for some reason, or is running on a s As stated above in 'goals for builtins', builtins are intended to just wrap existing primitives and not invent new functionality. 
-JS Strings are semantically a sequence of 16-bit code units (referred to as char codes in method naming), and there are no builtin operations on them to acquire a UTF-8 or WTF-8 view. This makes it difficult to write wasm builtins for these encodings without introducing significant new logic to them. +JS Strings are semantically a sequence of 16-bit code units (referred to as char codes in method naming), and there are no builtin operations on them to acquire a UTF-8 or WTF-8 view. This makes it difficult to write Wasm builtins for these encodings without introducing significant new logic to them. -There is however, the Encoding API for `TextEncoder`/`TextDecoder` which can be used for UTF-8 support. However, this is technically a separate spec from JS and may not be available on all JS engines (in practice it's available widely). This proposal exposes UTF-8 data conversions using this API under separate `wasm:text-encoder` `wasm:text-decoder` interfaces which are available when the host implements these interfaces. +There is the Encoding API for `TextEncoder`/`TextDecoder` which can be used for UTF-8 support. However, this is technically a separate spec from JS and may not be available on all JS engines (in practice it's available widely). This proposal exposes UTF-8 data conversions using this API under separate `wasm:text-encoder` `wasm:text-decoder` interfaces which are available when the host implements these interfaces. ## String constants -String constants may be defined in JS and made available to wasm through a variety of means. +String constants may be defined in JS and made available to Wasm through a variety of means. The simplest way is to have a module import each string as an immutable global. This can work for small amounts of strings, but has a high cost for when the number of string constants is very large. -This proposal adds an extension to JS-API compile routine to support optimized 'imported string constants' to address this use-case. 
+This proposal adds an extension to the JS-API compile routine to support optimized 'imported string constants' to address this use-case. -The `WebAssemblyCompileOptions` is extended with a `boolean importedStringConstants` flag. When this is set, the module may define imports of the form `(import "'" "%stringConstant%"" (global externref))`, and the JS-API will use the provided `%stringConstant%` import field name to be the value of the global. This allows for any UTF-8 string to be imported with minimal overhead. +The `WebAssemblyCompileOptions` dictionary is extended with a `boolean importedStringConstants` flag. When this is set, the module may define imports of the form `(import "'" "%stringConstant%"" (global externref))`, and the JS-API will use the provided `%stringConstant%` import field name to be the value of the global. This allows for any UTF-8 string to be imported with minimal overhead. The string namespace is chosen to be the single quote ASCII character `'`. We may revise this to be a longer name before this proposal is finalized. @@ -181,9 +183,9 @@ All imports that reference this namespace must be globals of type immutable exte The following is an initial set of function builtins for JavaScript String. The builtins are exposed under `wasm:js-string`. -All below references to builtins on the Global object (e.g. `String.fromCharCode`) refer to the original version on the Global object before any modifications by user code. +All below references to builtins on the Global object (e.g., `String.fromCharCode()`) refer to the original version on the Global object before any modifications by user code. 
-The following internal helpers are defined in wasm and used by the below definitions: +The following internal helpers are defined in Wasm and are used by the below definitions: ```wasm (module @@ -521,11 +523,11 @@ function compare( ## Encoding API -The following is an initial set of function builtins for the [`TextEncoder` interface](https://encoding.spec.whatwg.org/#interface-textencoder) and [`TextDecoder` interface](https://encoding.spec.whatwg.org/#interface-textdecoder) interfaces. These builtins are exposed under `wasm:text-encoder` and `wasm:text-decoder`, respectively. +The following is an initial set of function builtins for the [`TextEncoder`](https://encoding.spec.whatwg.org/#interface-textencoder) and the [`TextDecoder`](https://encoding.spec.whatwg.org/#interface-textdecoder) interfaces. These builtins are exposed under `wasm:text-encoder` and `wasm:text-decoder`, respectively. -All below references to builtins on the Global object (e.g. `String.fromCharCode`) refer to the original version on the Global object before any modifications by user code. +All below references to builtins on the Global object (e.g. `String.fromCharCode()`) refer to the original version on the Global object before any modifications by user code. -The following internal helpers are defined in wasm and used by the below definitions: +The following internal helpers are defined in Wasm and used by the below definitions: ```wasm (module @@ -734,14 +736,15 @@ There are several extensions we can make in the future as need arrives. ### Binding memory to builtins -It may be useful to have a builtin that operates on a specific wasm memory. For JS strings, this could allow us to encode a JS string directly into linear memory. +It may be useful to have a builtin that operates on a specific Wasm memory. For JS strings, this could allow us to encode a JS string directly into linear memory. 
-One way we could do this, is by having the JS-API bind the first imported memory of a module to any imported builtin functions that want to operate on memory. If there is no imported memory and a builtin function that needs memory is imported, then a link error is reported. +One way we could do this is by having the JS-API bind the first imported memory of a module to any imported builtin functions that want to operate on memory. If there is no imported memory and a builtin function that needs memory is imported, then a link error is reported. The memory is imported as opposed to exported so that it is guaranteed to exist when the builtin imports are provided. Using a memory defined only locally would have limited flexibility and would also be exposing a potentially private memory to outside its module. A quick example: -``` + +```wasm (module (; memory 0 ;) (import ... (memory ...)) From 47b47440e82affd5f22b16e43a51423d6a578df7 Mon Sep 17 00:00:00 2001 From: Thomas Steiner Date: Tue, 21 May 2024 16:45:26 +0200 Subject: [PATCH 24/70] Update proposals/js-string-builtins/Overview.md --- proposals/js-string-builtins/Overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index b37d4f020c..17d8191f17 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -20,7 +20,7 @@ These two pieces in combination allow runtimes to reliably emit optimal code seq ## Do we need new Wasm builtin functions? -It is already possible today to import JS builtin functions (such as [`String.prototoype.harCodeAt()`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt)) from Wasm modules. Instead of defining new Wasm specific-builtins, we could just re-use those directly. 
+It is already possible today to import JS builtin functions (such as [`String.prototoype.charCodeAt()`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt)) from Wasm modules. Instead of defining new Wasm specific-builtins, we could just re-use those directly. There are several problems with this approach. From 33291d70d0a4ebd989ca575096178b192956b2e6 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 12 Apr 2024 08:44:49 -0500 Subject: [PATCH 25/70] Fix nullability for string constants --- proposals/js-string-builtins/Overview.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index 17d8191f17..cce34cf1ae 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -175,9 +175,9 @@ This proposal adds an extension to the JS-API compile routine to support optimiz The `WebAssemblyCompileOptions` dictionary is extended with a `boolean importedStringConstants` flag. When this is set, the module may define imports of the form `(import "'" "%stringConstant%"" (global externref))`, and the JS-API will use the provided `%stringConstant%` import field name to be the value of the global. This allows for any UTF-8 string to be imported with minimal overhead. -The string namespace is chosen to be the single quote ASCII character `'`. We may revise this to be a longer name before this proposal is finalized. +This works by having the JS-API create a `(global (ref extern))` for each string constant, and provide that as the import value to the module. The [normal import checking for globals](https://webassembly.github.io/gc/core/valid/matching.html#match-globaltype) is performed, which allows for a user to specify either nullable or non-nullable externref, as long as the import global type is immutable. This check is eager, resulting in a compile error if it fails. 
-All imports that reference this namespace must be globals of type immutable externref. If they are not, an eager compile error is emitted. +The string namespace is chosen to be the single quote ASCII character `'`. This is to reduce binary size impact. We may revise this to be a longer name before this proposal is finalized, if we can mitigate the binary size increase. ## JS String Builtin API From f7a14c3f6fea96b3980670dd55ae2e7cbc602160 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Tue, 25 Jun 2024 12:59:04 -0500 Subject: [PATCH 26/70] Expand section on string constants --- proposals/js-string-builtins/Overview.md | 38 +++++++++++++++++++++--- 1 file changed, 34 insertions(+), 4 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index cce34cf1ae..c4099a2283 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -84,7 +84,7 @@ The full extension to the JS-API WebIDL is: ```idl dictionary WebAssemblyCompileOptions { - optional sequence builtins; + optional record builtins; } [LegacyNamespace=WebAssembly, Exposed=*] @@ -173,11 +173,41 @@ The simplest way is to have a module import each string as an immutable global. This proposal adds an extension to the JS-API compile routine to support optimized 'imported string constants' to address this use-case. -The `WebAssemblyCompileOptions` dictionary is extended with a `boolean importedStringConstants` flag. When this is set, the module may define imports of the form `(import "'" "%stringConstant%"" (global externref))`, and the JS-API will use the provided `%stringConstant%` import field name to be the value of the global. This allows for any UTF-8 string to be imported with minimal overhead. +The `WebAssemblyCompileOptions` dictionary is extended with a `USVString? importedStringConstants` flag. 
-This works by having the JS-API create a `(global (ref extern))` for each string constant, and provide that as the import value to the module. The [normal import checking for globals](https://webassembly.github.io/gc/core/valid/matching.html#match-globaltype) is performed, which allows for a user to specify either nullable or non-nullable externref, as long as the import global type is immutable. This check is eager, resulting in a compile error if it fails. +``` +partial dictionary WebAssemblyCompileOptions { + USVString? importedStringConstants; +} +``` + +When this is set to a non-null value, the module may import globals of the form `(import "%importedStringConstants%" "%stringConstant%" (global ...))`, and the JS-API will use the provided `%stringConstant%` import field name to be the value of the global. This allows for any UTF-8 string to be imported with minimal overhead. + +### Example + +```wasm +(module + (global (import "strings" "my string constant") (ref extern)) + (export "constant" (global 0)) +) +``` + +```js +let {instance} = await WebAssembly.instantiate(bytes, {}, {importedStringConstants: "strings"}); + +// The global is automatically populated with the string constant +assertEq(instance.exports.constant.value, "my string constant"); +``` + +### Details + +When `importedStringConstants` is non-null, the specified string becomes the `imported string namespace`. + +During the ['compile a module'](https://webassembly.github.io/spec/js-api/index.html#compile-a-webassembly-module) step of the JS-API, the imports of the module are examined to see which refer to the imported string namespace. If an import refers to the imported string namespace, then the import type is [matched](https://webassembly.github.io/spec/core/valid/types.html#globals) against an extern type of `(global (ref extern))`. If an import fails to match, then 'compile a module' fails. The resulting module is associated with the imported string namespace for use during instantiation. 
+ +During the ['read the imports'](https://webassembly.github.io/spec/js-api/index.html#read-the-imports) step of the JS-API, if the module has an imported string namespace, then every import that refers to this namespace has a global created to hold the string constant specified in the import field. This global is added to the imports object. -The string namespace is chosen to be the single quote ASCII character `'`. This is to reduce binary size impact. We may revise this to be a longer name before this proposal is finalized, if we can mitigate the binary size increase. +When the imports object is used during ['instantiate a module'](https://webassembly.github.io/spec/js-api/index.html#instantiate-the-core-of-a-webassembly-module), these implicitly created globals should never cause a link error due to the eager matching done in 'compile a module'. ## JS String Builtin API From 7530d20e07866338df2aaaa912310292bf0eef8f Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Tue, 25 Jun 2024 13:04:37 -0500 Subject: [PATCH 27/70] Revert unintentional change to builtins field --- proposals/js-string-builtins/Overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index c4099a2283..afd8c55bf5 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -84,7 +84,7 @@ The full extension to the JS-API WebIDL is: ```idl dictionary WebAssemblyCompileOptions { - optional record builtins; + optional sequence builtins; } [LegacyNamespace=WebAssembly, Exposed=*] From 98953e88ad1ef4ffb7970ff6be8b7a7952eb6cde Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 26 Jun 2024 15:52:58 -0500 Subject: [PATCH 28/70] Add note about empty imports object --- proposals/js-string-builtins/Overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md 
index afd8c55bf5..d40de3567b 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -205,7 +205,7 @@ When `importedStringConstants` is non-null, the specified string becomes the `im During the ['compile a module'](https://webassembly.github.io/spec/js-api/index.html#compile-a-webassembly-module) step of the JS-API, the imports of the module are examined to see which refer to the imported string namespace. If an import refers to the imported string namespace, then the import type is [matched](https://webassembly.github.io/spec/core/valid/types.html#globals) against an extern type of `(global (ref extern))`. If an import fails to match, then 'compile a module' fails. The resulting module is associated with the imported string namespace for use during instantiation. -During the ['read the imports'](https://webassembly.github.io/spec/js-api/index.html#read-the-imports) step of the JS-API, if the module has an imported string namespace, then every import that refers to this namespace has a global created to hold the string constant specified in the import field. This global is added to the imports object. +During the ['read the imports'](https://webassembly.github.io/spec/js-api/index.html#read-the-imports) step of the JS-API, if the module has an imported string namespace, then every import that refers to this namespace has a global created to hold the string constant specified in the import field. This global is added to the imports object. If all imports in a module are from the imported string namespace, no import object needs to be provided. When the imports object is used during ['instantiate a module'](https://webassembly.github.io/spec/js-api/index.html#instantiate-the-core-of-a-webassembly-module), these implicitly created globals should never cause a link error due to the eager matching done in 'compile a module'. 
From c82a5426cbf526838c3f6ea64601bf3315741668 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Tue, 6 Aug 2024 17:06:11 -0500 Subject: [PATCH 29/70] Initial outline of builtin support --- document/core/appendix/embedding.rst | 9 +++ document/js-api/index.bs | 82 ++++++++++++++++++++++++---- document/web-api/index.bs | 14 ++--- 3 files changed, 86 insertions(+), 19 deletions(-) diff --git a/document/core/appendix/embedding.rst b/document/core/appendix/embedding.rst index 737230ef85..cccaa1eb84 100644 --- a/document/core/appendix/embedding.rst +++ b/document/core/appendix/embedding.rst @@ -161,6 +161,15 @@ Modules \end{array} +.. index:: validation +.. _embed-module-module-validate-partial-imports: + +:math:`\F{module\_validate\_partial\_imports}(\module, \imports) : \error^?` +................................................ + +// TODO +1. Return :math:`\ERROR`. + .. index:: instantiation, module instance .. _embed-module-instantiate: diff --git a/document/js-api/index.bs b/document/js-api/index.bs index b666ef074c..1117026f7e 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -93,6 +93,7 @@ urlPrefix: https://webassembly.github.io/spec/core/; spec: WebAssembly; type: df text: store_init; url: appendix/embedding.html#embed-store-init text: module_decode; url: appendix/embedding.html#embed-module-decode text: module_validate; url: appendix/embedding.html#embed-module-validate + text: module_validate_partial_imports; url: appendix/embedding.html#embed-module-validate-partial-imports text: module_instantiate; url: appendix/embedding.html#embed-module-instantiate text: module_imports; url: appendix/embedding.html#embed-module-imports text: module_exports; url: appendix/embedding.html#embed-module-exports @@ -302,13 +303,18 @@ dictionary WebAssemblyInstantiatedSource { required Instance instance; }; +dictionary WebAssemblyCompileOptions { + USVString? 
importedStringConstants; + sequence<USVString> builtins; +}; + [Exposed=*] namespace WebAssembly { - boolean validate(BufferSource bytes); - Promise<Module> compile(BufferSource bytes); + boolean validate(BufferSource bytes, optional WebAssemblyCompileOptions options); + Promise<Module> compile(BufferSource bytes, optional WebAssemblyCompileOptions options); Promise<WebAssemblyInstantiatedSource> instantiate( - BufferSource bytes, optional object importObject); + BufferSource bytes, optional object importObject, optional WebAssemblyCompileOptions options); Promise<Instance> instantiate( Module moduleObject, optional object importObject); @@ -335,10 +341,21 @@ Note:
- The validate(|bytes|) method, when invoked, performs the following steps: + To Validate builtins for WebAssembly module from module |module| and enabled builtins |builtins|, perform the following steps: + 1. Let |store| be the [=surrounding agent=]'s [=associated store=]. + 1. Let |imports| be ... // TODO: create imports from a builtin set name + 1. Let |result| be [=module_validate_partial_imports=](|store|, |module|, |imports|). + 1. If |result| is [=error=], return [=error=]. + 1. Return true. +
+ +
+ The validate(|bytes|, |options|) method, when invoked, performs the following steps: 1. Let |stableBytes| be a [=get a copy of the buffer source|copy of the bytes held by the buffer=] |bytes|. 1. [=Compile a WebAssembly module|Compile=] |stableBytes| as a WebAssembly module and store the results as |module|. 1. If |module| is [=error=], return false. + 1. Let |builtins| be options["builtins"] + 1. If [=Validate builtins for WebAssembly module=] |module| |builtins| is [=error=], return false. 1. Return true.
@@ -346,41 +363,47 @@ A {{Module}} object represents a single WebAssembly module. Each {{Module}} obje * \[[Module]] : a WebAssembly [=/module=] * \[[Bytes]] : the source bytes of \[[Module]]. + * \[[Builtins]] : an ordered set of names of builtin sets
- To construct a WebAssembly module object from a module |module| and source bytes |bytes|, perform the following steps: + To construct a WebAssembly module object from a module |module|, source bytes |bytes|, builtins |builtins|, perform the following steps: 1. Let |moduleObject| be a new {{Module}} object. 1. Set |moduleObject|.\[[Module]] to |module|. 1. Set |moduleObject|.\[[Bytes]] to |bytes|. + 1. Set |moduleObject|.\[[Builtins]] to |builtins|. 1. Return |moduleObject|.
- To asynchronously compile a WebAssembly module from source bytes |bytes|, using optional [=task source=] |taskSource|, perform the following steps: + To asynchronously compile a WebAssembly module from source bytes |bytes| and {{WebAssemblyCompileOptions}} |options| using optional [=task source=] |taskSource|, perform the following steps: 1. Let |promise| be [=a new promise=]. 1. Run the following steps [=in parallel=]: 1. [=compile a WebAssembly module|Compile the WebAssembly module=] |bytes| and store the result as |module|. 1. [=Queue a task=] to perform the following steps. If |taskSource| was provided, queue the task on that task source. 1. If |module| is [=error=], reject |promise| with a {{CompileError}} exception. + 1. Let |builtins| be options["builtins"] + 1. If [=Validate builtins for WebAssembly module=] |module| |builtins| is [=error=], reject |promise| with a {{CompileError}} exception. 1. Otherwise, - 1. [=Construct a WebAssembly module object=] from |module| and |bytes|, and let |moduleObject| be the result. + 1. [=Construct a WebAssembly module object=] from |module|, |bytes|, |builtins|, and let |moduleObject| be the result. 1. [=Resolve=] |promise| with |moduleObject|. 1. Return |promise|.
- The compile(|bytes|) method, when invoked, performs the following steps: + The compile(|bytes|, |options|) method, when invoked, performs the following steps: 1. Let |stableBytes| be a [=get a copy of the buffer source|copy of the bytes held by the buffer=] |bytes|. - 1. [=Asynchronously compile a WebAssembly module=] from |stableBytes| and return the result. + 1. [=Asynchronously compile a WebAssembly module=] from |stableBytes| using |options| and return the result.
To read the imports from a WebAssembly module |module| from imports object |importObject|, perform the following steps: 1. If |module|.[=imports=] [=list/is empty|is not empty=], and |importObject| is undefined, throw a {{TypeError}} exception. + // TODO: instantiate builtin sets 1. Let |imports| be « ». 1. [=list/iterate|For each=] (|moduleName|, |componentName|, |externtype|) of [=module_imports=](|module|), + // TODO: get exports from a builtin set if enabled 1. Let |o| be [=?=] [$Get$](|importObject|, |moduleName|). 1. If [=Type=](|o|) is not Object, throw a {{TypeError}} exception. 1. Let |v| be [=?=] [$Get$](|o|, |componentName|). @@ -527,9 +550,9 @@ The verification of WebAssembly type requirements is deferred to the
- The instantiate(|bytes|, |importObject|) method, when invoked, performs the following steps: + The instantiate(|bytes|, |importObject|, |options|) method, when invoked, performs the following steps: 1. Let |stableBytes| be a [=get a copy of the buffer source|copy of the bytes held by the buffer=] |bytes|. - 1. [=Asynchronously compile a WebAssembly module=] from |stableBytes| and let |promiseOfModule| be the result. + 1. [=Asynchronously compile a WebAssembly module=] from |stableBytes| using |options| and let |promiseOfModule| be the result. 1. [=Instantiate a promise of a module|Instantiate=] |promiseOfModule| with imports |importObject| and return the result.
@@ -570,7 +593,7 @@ dictionary ModuleImportDescriptor { [LegacyNamespace=WebAssembly, Exposed=*] interface Module { - constructor(BufferSource bytes); + constructor(BufferSource bytes, optional WebAssemblyCompileOptions options); static sequence<ModuleExportDescriptor> exports(Module moduleObject); static sequence<ModuleImportDescriptor> imports(Module moduleObject); static sequence<ArrayBuffer> customSections(Module moduleObject, DOMString sectionName); @@ -602,6 +625,7 @@ interface Module { 1. Let |module| be |moduleObject|.\[[Module]]. 1. Let |imports| be « ». 1. [=list/iterate|For each=] (|moduleName|, |name|, |type|) of [=module_imports=](|module|), + // TODO: skip import if in builtin module set 1. Let |kind| be the [=string value of the extern type=] |type|. 1. Let |obj| be «[ "{{ModuleImportDescriptor/module}}" → |moduleName|, "{{ModuleImportDescriptor/name}}" → |name|, "{{ModuleImportDescriptor/kind}}" → |kind| ]». 1. [=list/Append=] |obj| to |imports|. @@ -1646,6 +1670,40 @@ They expose the same interface as native JavaScript errors like {{TypeError}} an Note: It is not currently possible to define this behavior using Web IDL. +

Builtins

+ +The JS-API defines sets of builtin functions which can be imported through a flag when compiling a module. WebAssembly builtin functions mirror existing JavaScript builtins, but adapt them to be useable directly as WebAssembly functions. + +All builtin functions are grouped into builtin sets. Every builtin set has a name that [=read the imports|is used during import lookup=]. All names are prefixed by the `wasm:` namespace. + + +

String Builtins

+ +String builtins adapt the interface of the String builtin object. The import name for this set is `wasm:js-string`. + +
+ +The unwrapString(|v|) method, when invoked, performs the following steps: + +1. If [=Type=](|v|) is not String + 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. +1. Return |v| + +
+ +

charCodeAt

+ +The type of this function is `(func (param externref i32) (result i32))`. + +
+When this builtin is invoked, the following steps must be run: + +1. Let |string| be [$unwrapString$](param0) +1. TODO + +
+ +

Error Condition Mappings to JavaScript

Running WebAssembly programs encounter certain events which halt execution of the WebAssembly code. diff --git a/document/web-api/index.bs b/document/web-api/index.bs index efc6a66635..4439fbb580 100644 --- a/document/web-api/index.bs +++ b/document/web-api/index.bs @@ -89,25 +89,25 @@ This document builds off of the WebAssembly specification [[WEBASSEMBLY]] and th
 [Exposed=(Window,Worker)]
 partial namespace WebAssembly {
-  Promise<Module> compileStreaming(Promise<Response> source);
+  Promise<Module> compileStreaming(Promise<Response> source, optional WebAssemblyCompileOptions options);
   Promise<WebAssemblyInstantiatedSource> instantiateStreaming(
-      Promise<Response> source, optional object importObject);
+      Promise<Response> source, optional object importObject, optional WebAssemblyCompileOptions options);
 };
 
-The compileStreaming(|source|) method, when invoked, returns the result of [=compile a potential WebAssembly response|compiling a potential WebAssembly response=] with |source|. +The compileStreaming(|source|, |options|) method, when invoked, returns the result of [=compile a potential WebAssembly response|compiling a potential WebAssembly response=] with |source| using |options|.
-The instantiateStreaming(|source|, |importObject|) method, when invoked, performs the following steps: +The instantiateStreaming(|source|, |importObject|, |options|) method, when invoked, performs the following steps: - 1. Let |promiseOfModule| be the result of [=compile a potential WebAssembly response|compiling a potential WebAssembly response=] with |source|. + 1. Let |promiseOfModule| be the result of [=compile a potential WebAssembly response|compiling a potential WebAssembly response=] with |source| using |options|. 1. Return the result of [=instantiate a promise of a module|instantiating the promise of a module=] |promiseOfModule| with imports |importObject|.
-To compile a potential WebAssembly response with a promise of a {{Response}} |source|, perform the following steps: +To compile a potential WebAssembly response with a promise of a {{Response}} |source| and {{WebAssemblyCompileOptions}} |options|, perform the following steps: Note: This algorithm accepts a {{Response}} object, or a promise for one, and compiles and instantiates the resulting bytes of the response. This compilation @@ -136,7 +136,7 @@ Note: This algorithm accepts a {{Response}} object, or a 1. [=Upon fulfillment=] of |bodyPromise| with value |bodyArrayBuffer|: 1. Let |stableBytes| be a [=get a copy of the buffer source|copy of the bytes held by the buffer=] |bodyArrayBuffer|. - 1. [=Asynchronously compile a WebAssembly module|Asynchronously compile the WebAssembly module=] |stableBytes| using the [=networking task source=] and [=resolve=] |returnValue| with the result. + 1. [=Asynchronously compile a WebAssembly module|Asynchronously compile the WebAssembly module=] |stableBytes| using the [=networking task source=] and |options| and [=resolve=] |returnValue| with the result. 1. [=Upon rejection=] of |bodyPromise| with reason |reason|: 1. [=Reject=] |returnValue| with |reason|. 1. 
[=Upon rejection=] of |source| with reason |reason|: From 1c865b98f48d645bbe1fc94bb6523f3b01508558 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 7 Aug 2024 13:19:08 -0500 Subject: [PATCH 30/70] Expand specification of builtins --- document/js-api/index.bs | 134 +++++++++++++++++++++++++++++++-------- 1 file changed, 108 insertions(+), 26 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 1117026f7e..50740dfb52 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -93,7 +93,6 @@ urlPrefix: https://webassembly.github.io/spec/core/; spec: WebAssembly; type: df text: store_init; url: appendix/embedding.html#embed-store-init text: module_decode; url: appendix/embedding.html#embed-module-decode text: module_validate; url: appendix/embedding.html#embed-module-validate - text: module_validate_partial_imports; url: appendix/embedding.html#embed-module-validate-partial-imports text: module_instantiate; url: appendix/embedding.html#embed-module-instantiate text: module_imports; url: appendix/embedding.html#embed-module-imports text: module_exports; url: appendix/embedding.html#embed-module-exports @@ -120,6 +119,7 @@ urlPrefix: https://webassembly.github.io/spec/core/; spec: WebAssembly; type: df text: ref_type; url: appendix/embedding.html#embed-ref-type text: val_default; url: appendix/embedding.html#embed-val-default text: match_valtype; url: appendix/embedding.html#embed-match-valtype + text: match_externtype; url: appendix/embedding.html#embed-match-externtype text: error; url: appendix/embedding.html#embed-error text: store; url: exec/runtime.html#syntax-store text: table type; url: syntax/types.html#syntax-tabletype @@ -341,11 +341,10 @@ Note:
- To Validate builtins for WebAssembly module from module |module| and enabled builtins |builtins|, perform the following steps: - 1. Let |store| be the [=surrounding agent=]'s [=associated store=]. - 1. Let |imports| be ... // TODO: create imports from a builtin set name - 1. Let |result| be [=module_validate_partial_imports=](|store|, |module|, |imports|). - 1. If |result| is [=error=], return [=error=]. + To validate builtins for a WebAssembly module from module |module| and enabled builtins |builtinSetNames|, perform the following steps: + 1. If [=validate builtin set names|validating builtin set names=] for |builtinSetNames| is false, return false. + 1. [=list/iterate|For each=] |import| of [=module_imports=](|module|), + 1. If [=validate an import for builtins|validating a import for builtin=] with |import| and |builtinSetNames| is false, return false. 1. Return true.
@@ -354,8 +353,8 @@ Note: 1. Let |stableBytes| be a [=get a copy of the buffer source|copy of the bytes held by the buffer=] |bytes|. 1. [=Compile a WebAssembly module|Compile=] |stableBytes| as a WebAssembly module and store the results as |module|. 1. If |module| is [=error=], return false. - 1. Let |builtins| be options["builtins"] - 1. If [=Validate builtins for WebAssembly module=] |module| |builtins| is [=error=], return false. + 1. Let |builtinSetNames| be options["builtins"] + 1. If [=validate builtins for a WebAssembly module|validating builtins for WebAssembly module=] |module| with |builtinSetNames| returns false, return false. 1. Return true. @@ -363,15 +362,15 @@ A {{Module}} object represents a single WebAssembly module. Each {{Module}} obje * \[[Module]] : a WebAssembly [=/module=] * \[[Bytes]] : the source bytes of \[[Module]]. - * \[[Builtins]] : an ordered set of names of builtin sets + * \[[BuiltinSets]] : an ordered set of names of builtin sets
- To construct a WebAssembly module object from a module |module|, source bytes |bytes|, builtins |builtins|, perform the following steps: + To construct a WebAssembly module object from a module |module|, source bytes |bytes|, enabled builtins |builtinSetNames|, perform the following steps: 1. Let |moduleObject| be a new {{Module}} object. 1. Set |moduleObject|.\[[Module]] to |module|. 1. Set |moduleObject|.\[[Bytes]] to |bytes|. - 1. Set |moduleObject|.\[[Builtins]] to |builtins|. + 1. Set |moduleObject|.\[[BuiltinSets]] to |builtinSetNames|. 1. Return |moduleObject|.
@@ -383,10 +382,10 @@ A {{Module}} object represents a single WebAssembly module. Each {{Module}} obje 1. [=compile a WebAssembly module|Compile the WebAssembly module=] |bytes| and store the result as |module|. 1. [=Queue a task=] to perform the following steps. If |taskSource| was provided, queue the task on that task source. 1. If |module| is [=error=], reject |promise| with a {{CompileError}} exception. - 1. Let |builtins| be options["builtins"] - 1. If [=Validate builtins for WebAssembly module=] |module| |builtins| is [=error=], reject |promise| with a {{CompileError}} exception. + 1. Let |builtinSetNames| be options["builtins"] + 1. If [=validate builtins for a WebAssembly module|validating builtins=] for |module| |builtinSetNames| is false, reject |promise| with a {{CompileError}} exception. 1. Otherwise, - 1. [=Construct a WebAssembly module object=] from |module|, |bytes|, |builtins|, and let |moduleObject| be the result. + 1. [=Construct a WebAssembly module object=] from |module|, |bytes|, |builtinSetNames|, and let |moduleObject| be the result. 1. [=Resolve=] |promise| with |moduleObject|. 1. Return |promise|. @@ -398,13 +397,19 @@ A {{Module}} object represents a single WebAssembly module. Each {{Module}} obje
- To read the imports from a WebAssembly module |module| from imports object |importObject|, perform the following steps: + To read the imports from a WebAssembly module |module| from imports object |importObject| and enabled builtins |builtinSetNames|, perform the following steps: 1. If |module|.[=imports=] [=list/is empty|is not empty=], and |importObject| is undefined, throw a {{TypeError}} exception. - // TODO: instantiate builtin sets + 1. Let |instantiatedBuiltins| be the ordered map « ». + 1. [=list/iterate|For each=] |builtinSetName| of |builtinSetNames|, + 1. Assert: |instantiatedBuiltins| does not contain |builtinSetName| + 1. Let |exportsObject| be the result of [=instantiate a builtin set=] with |builtinSetName| + 1. [=map/set|Set=] |instantiatedBuiltins|[|builtinSetName|] to |exportsObject| 1. Let |imports| be « ». 1. [=list/iterate|For each=] (|moduleName|, |componentName|, |externtype|) of [=module_imports=](|module|), - // TODO: get exports from a builtin set if enabled - 1. Let |o| be [=?=] [$Get$](|importObject|, |moduleName|). + 1. If |instantiatedBuiltins| [=map/exist|contains=] |moduleName|, + 1. Let |o| be |instantiatedBuiltins|[|moduleName|] + 1. Else, + 1. Let |o| be [=?=] [$Get$](|importObject|, |moduleName|). 1. If [=Type=](|o|) is not Object, throw a {{TypeError}} exception. 1. Let |v| be [=?=] [$Get$](|o|, |componentName|). 1. If |externtype| is of the form [=external-type/func=] |functype|, @@ -519,7 +524,8 @@ The verification of WebAssembly type requirements is deferred to the To asynchronously instantiate a WebAssembly module from a {{Module}} |moduleObject| and imports |importObject|, perform the following steps: 1. Let |promise| be [=a new promise=]. 1. Let |module| be |moduleObject|.\[[Module]]. - 1. [=Read the imports=] of |module| with imports |importObject|, and let |imports| be the result. + 1. Let |builtinSetNames| be |moduleObject|.\[[BuiltinSets]]. + 1. 
[=Read the imports=] of |module| with imports |importObject| and |builtinSetNames|, and let |imports| be the result. If this operation throws an exception, catch it, [=reject=] |promise| with the exception, and return |promise|. 1. Run the following steps [=in parallel=]: 1. [=Queue a task=] to perform the following steps: @@ -623,12 +629,13 @@ interface Module {
The imports(|moduleObject|) method, when invoked, performs the following steps: 1. Let |module| be |moduleObject|.\[[Module]]. + 1. Let |builtinSetNames| be |moduleObject|.\[[BuiltinSets]]. 1. Let |imports| be « ». 1. [=list/iterate|For each=] (|moduleName|, |name|, |type|) of [=module_imports=](|module|), - // TODO: skip import if in builtin module set - 1. Let |kind| be the [=string value of the extern type=] |type|. - 1. Let |obj| be «[ "{{ModuleImportDescriptor/module}}" → |moduleName|, "{{ModuleImportDescriptor/name}}" → |name|, "{{ModuleImportDescriptor/kind}}" → |kind| ]». - 1. [=list/Append=] |obj| to |imports|. + 1. If [=find a builtin=] for (|moduleName|, |name|, |type|) and |builtinSetNames| is null, + 1. Let |kind| be the [=string value of the extern type=] |type|. + 1. Let |obj| be «[ "{{ModuleImportDescriptor/module}}" → |moduleName|, "{{ModuleImportDescriptor/name}}" → |name|, "{{ModuleImportDescriptor/kind}}" → |kind| ]». + 1. [=list/Append=] |obj| to |imports|. 1. Return |imports|.
@@ -669,7 +676,8 @@ interface Instance {
The Instance(|module|, |importObject|) constructor, when invoked, runs the following steps: 1. Let |module| be |module|.\[[Module]]. - 1. [=Read the imports=] of |module| with imports |importObject|, and let |imports| be the result. + 1. Let |builtinSetNames| be |module|.\[[BuiltinSets]]. + 1. [=Read the imports=] of |module| with imports |importObject| and |builtinSetNames|, and let |imports| be the result. 1. [=Instantiate the core of a WebAssembly module=] |module| with |imports|, and let |instance| be the result. 1. [=initialize an instance object|Initialize=] **this** from |module| and |instance|. @@ -1672,9 +1680,83 @@ Note: It is not currently possible to define this behavior using Web IDL.

Builtins

-The JS-API defines sets of builtin functions which can be imported through a flag when compiling a module. WebAssembly builtin functions mirror existing JavaScript builtins, but adapt them to be useable directly as WebAssembly functions. +The JS-API defines sets of builtin functions which can be imported through {{WebAssemblyCompileOptions|options}} when compiling a module. WebAssembly builtin functions mirror existing JavaScript builtins, but adapt them to be useable directly as WebAssembly functions with minimal overhead. + +All builtin functions are grouped into sets. Every builtin set has a unique name that [=read the imports|is used during import lookup=]. All names are prefixed by the `wasm:` namespace (e.g. `wasm:js-string`). + +
+To get the builtins for a builtin set with |builtinSetName|, perform the following steps: + +1. Return a list of (|name|, |funcType|, |steps|) for the set with name |builtinSetName| defined within this section. + +
+ +
+To find a builtin with |import| and enabled builtins |builtinSetNames|, perform the following steps: -All builtin functions are grouped into builtin sets. Every builtin set has a name that [=read the imports|is used during import lookup=]. All names are prefixed by the `wasm:` namespace. +1. Assert: [=validate builtin set names=] |builtinSetNames| is true. +1. Let |importModuleName| be |import|[0] +1. Let |importName| be |import|[1] +1. [=list/iterate|For each=] |builtinSetName| of |builtinSetNames|, + 1. If |importModuleName| equals |builtinSetName| + 1. Let |builtins| be the result of [=get the builtins for a builtin set=] |builtinSetName| + 1. [=list/iterate|For each=] |builtin| of |builtins| + 1. Let |builtinName| be |builtin|[0] + 1. If |importName| equals |builtinName|, return (|builtinSetName|, |builtin|). +1. Return null. + +
+ +
+To validate builtin set names with |builtinSetNames|, perform the following steps: + +1. If |builtinSetNames| contains any duplicates, return false. +1. [=list/iterate|For each=] |builtinSetName| of |builtinSetNames|, + 1. If |builtinSetName| does not equal the name of one of the builtin sets defined in this section, return false. +1. Return true. + +
+ +
+ To create a builtin function from type |functype| and execution steps |steps|, perform the following steps: + + 1. Let |stored settings| be the incumbent settings object. + 1. Let |hostfunc| be a [=host function=] which executes |steps| when called. + 1. Let (|store|, |funcaddr|) be [=func_alloc=](|store|, |functype|, |hostfunc|). + 1. Set the [=surrounding agent=]'s [=associated store=] to |store|. + 1. Return |funcaddr|. + +
+ +
+To instantiate a builtin set with name |builtinSetName|, perform the following steps: + +1. Let |builtins| be the result of [=get the builtins for a builtin set=] |builtinSetName| +1. Let |exportsObject| be [=!=] [$OrdinaryObjectCreate$](null). + 1. If |externtype| is of the form [=external-type/func=] functype, +1. [=list/iterate|For each=] (|name|, |funcType|, |algorithm|) of |builtins|, + 1. Let |funcaddr| be the result of [=create a builtin function=] with |funcType| and |algorithm| + 1. Let |func| be the result of creating [=a new Exported Function=] from |funcaddr|. + 1. Let |value| be |func|. + 1. Let |status| be [=!=] [$CreateDataProperty$](|exportsObject|, |name|, |value|). + 1. Assert: |status| is true. +1. Perform [=!=] [$SetIntegrityLevel$](|exportsObject|, `"frozen"`). +1. Return |exportsObject|. + +
+ +
+To validate an import for builtins with |import| and enabled builtins |builtinSetNames|, perform the following steps: + +1. Assert: [=validate builtin set names=] |builtinSetNames| is true. +1. Let |maybeBuiltin| be the result of [=find a builtin|finding a builtin=] for |import| and |builtinSetNames| +1. If |maybeBuiltin| is null, return true. +1. Let |importExternType| be |import|[2] +1. Let |builtinFuncType| be |maybeBuiltin|[1][1] +1. Let |builtinExternType| be `func |builtinFuncType|` +1. Return [=match_externtype=](|builtinExternType|, |importExternType|) + +

String Builtins

From 9cc146e00d8478b84457dae01c3e0b1a33865397 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 7 Aug 2024 13:20:31 -0500 Subject: [PATCH 31/70] Remove unneeded modification to embedding spec --- document/core/appendix/embedding.rst | 9 --------- 1 file changed, 9 deletions(-) diff --git a/document/core/appendix/embedding.rst b/document/core/appendix/embedding.rst index cccaa1eb84..737230ef85 100644 --- a/document/core/appendix/embedding.rst +++ b/document/core/appendix/embedding.rst @@ -161,15 +161,6 @@ Modules \end{array} -.. index:: validation -.. _embed-module-module-validate-partial-imports: - -:math:`\F{module\_validate\_partial\_imports}(\module, \imports) : \error^?` -................................................ - -// TODO -1. Return :math:`\ERROR`. - .. index:: instantiation, module instance .. _embed-module-instantiate: From b9c4a82073311166e308c0ccf7f3d65d84ae32cb Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 7 Aug 2024 13:51:32 -0500 Subject: [PATCH 32/70] Initial string constants support --- document/js-api/index.bs | 69 +++++++++++++++++++++++++++++----------- 1 file changed, 50 insertions(+), 19 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 50740dfb52..3fed558d31 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -341,10 +341,15 @@ Note:
- To validate builtins for a WebAssembly module from module |module| and enabled builtins |builtinSetNames|, perform the following steps: + To validate builtins and imported string for a WebAssembly module from module |module|, enabled builtins |builtinSetNames|, and |importedStringModule|, perform the following steps: 1. If [=validate builtin set names|validating builtin set names=] for |builtinSetNames| is false, return false. 1. [=list/iterate|For each=] |import| of [=module_imports=](|module|), - 1. If [=validate an import for builtins|validating a import for builtin=] with |import| and |builtinSetNames| is false, return false. + 1. If |importedStringModule| is not null and |import|[0] equals |importedStringModule|, + 1. Let |importExternType| be |import|[2] + 1. Let |stringExternType| be `global const (ref extern)` + 1. If [=match_externtype=](|stringExternType|, |importExternType|) is false, return false + 1. Else, + 1. If [=validate an import for builtins|validating an import for builtins=] with |import| and |builtinSetNames| is false, return false. 1. Return true. 
@@ -354,7 +359,8 @@ Note: 1. [=Compile a WebAssembly module|Compile=] |stableBytes| as a WebAssembly module and store the results as |module|. 1. If |module| is [=error=], return false. 1. Let |builtinSetNames| be options["builtins"] - 1. If [=validate builtins for a WebAssembly module|validating builtins for WebAssembly module=] |module| with |builtinSetNames| returns false, return false. + 1. Let |importedStringModule| be options["importedStringConstants"] + 1. If [=validate builtins and imported string for a WebAssembly module|validating builtins and imported strings=] for |module| with |builtinSetNames| and |importedStringModule| returns false, return false. 1. Return true.
@@ -363,14 +369,16 @@ A {{Module}} object represents a single WebAssembly module. Each {{Module}} obje * \[[Module]] : a WebAssembly [=/module=] * \[[Bytes]] : the source bytes of \[[Module]]. * \[[BuiltinSets]] : an ordered set of names of builtin sets + * \[[ImportedStringModule]] : an optional module specifier string where any string constant can be imported from.
- To construct a WebAssembly module object from a module |module|, source bytes |bytes|, enabled builtins |builtinSetNames|, perform the following steps: + To construct a WebAssembly module object from a module |module|, source bytes |bytes|, enabled builtins |builtinSetNames|, and |importedStringModule|, perform the following steps: 1. Let |moduleObject| be a new {{Module}} object. 1. Set |moduleObject|.\[[Module]] to |module|. 1. Set |moduleObject|.\[[Bytes]] to |bytes|. 1. Set |moduleObject|.\[[BuiltinSets]] to |builtinSetNames|. + 1. Set |moduleObject|.\[[ImportedStringModule]] to |importedStringModule|. 1. Return |moduleObject|.
@@ -383,9 +391,10 @@ A {{Module}} object represents a single WebAssembly module. Each {{Module}} obje 1. [=Queue a task=] to perform the following steps. If |taskSource| was provided, queue the task on that task source. 1. If |module| is [=error=], reject |promise| with a {{CompileError}} exception. 1. Let |builtinSetNames| be options["builtins"] - 1. If [=validate builtins for a WebAssembly module|validating builtins=] for |module| |builtinSetNames| is false, reject |promise| with a {{CompileError}} exception. + 1. Let |importedStringModule| be options["importedStringConstants"] + 1. If [=validate builtins and imported string for a WebAssembly module|validating builtins and imported strings=] for |module| with |builtinSetNames| and |importedStringModule| is false, reject |promise| with a {{CompileError}} exception. 1. Otherwise, - 1. [=Construct a WebAssembly module object=] from |module|, |bytes|, |builtinSetNames|, and let |moduleObject| be the result. + 1. [=Construct a WebAssembly module object=] from |module|, |bytes|, |builtinSetNames|, |importedStringModule|, and let |moduleObject| be the result. 1. [=Resolve=] |promise| with |moduleObject|. 1. Return |promise|. @@ -396,18 +405,36 @@ A {{Module}} object represents a single WebAssembly module. Each {{Module}} obje 1. [=Asynchronously compile a WebAssembly module=] from |stableBytes| using |options| and return the result. +
+To instantiate imported strings with module |module| and |importedStringModule|, perform the following steps: +1. Assert: |importedStringModule| is not null. +1. Let |exportsObject| be [=!=] [$OrdinaryObjectCreate$](null). + 1. If |externtype| is of the form [=external-type/func=] functype, +1. [=list/iterate|For each=] (|moduleName|, |componentName|, |externtype|) of [=module_imports=](|module|), + 1. If |moduleName| does not equal |importedStringModule|, then [=iteration/continue=]. + 1. Let |stringConstant| be |componentName|. + 1. Let |status| be [=!=] [$CreateDataProperty$](|exportsObject|, |stringConstant|, |stringConstant|). + 1. Assert: |status| is true. +1. Perform [=!=] [$SetIntegrityLevel$](|exportsObject|, `"frozen"`). +1. Return |exportsObject|. + +
+
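The imported-strings instantiation above can be sketched in JavaScript as follows. This is a rough model, not the spec's implementation: `moduleImports` is a hypothetical array of `[moduleName, componentName, externtype]` triples standing in for [=module_imports=](|module|), and the function name is invented for illustration. The freeze step is left out, since a later patch in this series ("Don't freeze internal instantiation of builtins and string constants") removes it.

```javascript
// Sketch of "instantiate imported strings".
// moduleImports: array of [moduleName, componentName, externtype] triples (assumed shape).
function instantiateImportedStrings(moduleImports, importedStringModule) {
  // Mirrors the "Assert: importedStringModule is not null" step.
  if (importedStringModule === null) {
    throw new Error("importedStringModule must not be null");
  }
  // OrdinaryObjectCreate(null): an object with no prototype.
  const exportsObject = Object.create(null);
  for (const [moduleName, componentName] of moduleImports) {
    // Only imports from the designated string module are materialized.
    if (moduleName !== importedStringModule) continue;
    // CreateDataProperty: the import name doubles as the string value.
    Object.defineProperty(exportsObject, componentName, {
      value: componentName,
      writable: true,
      enumerable: true,
      configurable: true,
    });
  }
  return exportsObject;
}
```

Each import whose module name matches the chosen specifier resolves to a string equal to its own import name, so no string table has to be passed in at instantiation time.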
- To read the imports from a WebAssembly module |module| from imports object |importObject| and enabled builtins |builtinSetNames|, perform the following steps: + To read the imports from a WebAssembly module |module| from imports object |importObject|, enabled builtins |builtinSetNames|, and |importedStringModule|, perform the following steps: 1. If |module|.[=imports=] [=list/is empty|is not empty=], and |importObject| is undefined, throw a {{TypeError}} exception. - 1. Let |instantiatedBuiltins| be the ordered map « ». + 1. Let |builtinOrStringImports| be the ordered map « ». 1. [=list/iterate|For each=] |builtinSetName| of |builtinSetNames|, - 1. Assert: |instantiatedBuiltins| does not contain |builtinSetName| + 1. Assert: |builtinOrStringImports| does not contain |builtinSetName| 1. Let |exportsObject| be the result of [=instantiate a builtin set=] with |builtinSetName| - 1. [=map/set|Set=] |instantiatedBuiltins|[|builtinSetName|] to |exportsObject| + 1. [=map/set|Set=] |builtinOrStringImports|[|builtinSetName|] to |exportsObject| + 1. If |importedStringModule| is not null, + 1. Let |exportsObject| be the result of [=instantiate imported strings=] with |module| and |importedStringModule| + 1. [=map/set|Set=] |builtinOrStringImports|[|importedStringModule|] to |exportsObject| 1. Let |imports| be « ». 1. [=list/iterate|For each=] (|moduleName|, |componentName|, |externtype|) of [=module_imports=](|module|), - 1. If |instantiatedBuiltins| [=map/exist|contains=] |moduleName|, - 1. Let |o| be |instantiatedBuiltins|[|moduleName|] + 1. If |builtinOrStringImports| [=map/exist|contains=] |moduleName|, + 1. Let |o| be |builtinOrStringImports|[|moduleName|] 1. Else, 1. Let |o| be [=?=] [$Get$](|importObject|, |moduleName|). 1. If [=Type=](|o|) is not Object, throw a {{TypeError}} exception. @@ -525,7 +552,8 @@ The verification of WebAssembly type requirements is deferred to the 1. Let |promise| be [=a new promise=]. 1. Let |module| be |moduleObject|.\[[Module]]. 1. 
Let |builtinSetNames| be |moduleObject|.\[[BuiltinSets]]. - 1. [=Read the imports=] of |module| with imports |importObject| and |builtinSetNames|, and let |imports| be the result. + 1. Let |importedStringModule| be |moduleObject|.\[[ImportedStringModule]]. + 1. [=Read the imports=] of |module| with imports |importObject|, |builtinSetNames| and |importedStringModule|, and let |imports| be the result. If this operation throws an exception, catch it, [=reject=] |promise| with the exception, and return |promise|. 1. Run the following steps [=in parallel=]: 1. [=Queue a task=] to perform the following steps: @@ -630,12 +658,14 @@ interface Module { The imports(|moduleObject|) method, when invoked, performs the following steps: 1. Let |module| be |moduleObject|.\[[Module]]. 1. Let |builtinSetNames| be |moduleObject|.\[[BuiltinSets]]. + 1. Let |importedStringModule| be |moduleObject|.\[[ImportedStringModule]]. 1. Let |imports| be « ». 1. [=list/iterate|For each=] (|moduleName|, |name|, |type|) of [=module_imports=](|module|), - 1. If [=find a builtin=] for (|moduleName|, |name|, |type|) and |builtinSetNames| is null, - 1. Let |kind| be the [=string value of the extern type=] |type|. - 1. Let |obj| be «[ "{{ModuleImportDescriptor/module}}" → |moduleName|, "{{ModuleImportDescriptor/name}}" → |name|, "{{ModuleImportDescriptor/kind}}" → |kind| ]». - 1. [=list/Append=] |obj| to |imports|. + 1. If [=find a builtin=] for (|moduleName|, |name|, |type|) and |builtinSetNames| is not null, then [=iteration/continue=] + 1. If |importedStringModule| is not null and |moduleName| equals |importedStringModule|, then [=iteration/continue=] + 1. Let |kind| be the [=string value of the extern type=] |type|. + 1. Let |obj| be «[ "{{ModuleImportDescriptor/module}}" → |moduleName|, "{{ModuleImportDescriptor/name}}" → |name|, "{{ModuleImportDescriptor/kind}}" → |kind| ]». + 1. [=list/Append=] |obj| to |imports|. 1. Return |imports|.
@@ -677,7 +707,8 @@ interface Instance { The Instance(|module|, |importObject|) constructor, when invoked, runs the following steps: 1. Let |module| be |module|.\[[Module]]. 1. Let |builtinSetNames| be |module|.\[[BuiltinSets]]. - 1. [=Read the imports=] of |module| with imports |importObject| and |builtinSetNames|, and let |imports| be the result. + 1. Let |importedStringModule| be |module|.\[[ImportedStringModule]]. + 1. [=Read the imports=] of |module| with imports |importObject|, |builtinSetNames|, and |importedStringModule|, and let |imports| be the result. 1. [=Instantiate the core of a WebAssembly module=] |module| with |imports|, and let |instance| be the result. 1. [=initialize an instance object|Initialize=] **this** from |module| and |instance|. @@ -1746,7 +1777,7 @@ To instantiate a builtin set with name |builtinSetName|, perform the
-To validate an import for builtins with |import| and enabled builtins |builtinSetNames|, perform the following steps: +To validate an import for builtins with |import|, enabled builtins |builtinSetNames|, perform the following steps: 1. Assert: [=validate builtin set names=] |builtinSetNames| is true. 1. Let |maybeBuiltin| be the result of [=find a builtin|finding a builtin=] for |import| and |builtinSetNames| From 6b2a180382d50915247a70c1291e6d13befb73a2 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 7 Aug 2024 13:54:38 -0500 Subject: [PATCH 33/70] Fix module constructor to handle options --- document/js-api/index.bs | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 3fed558d31..4b1c3f8f0a 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -682,13 +682,18 @@ interface Module {
- The Module(|bytes|) constructor, when invoked, performs the following steps: + The Module(|bytes|, |options|) constructor, when invoked, performs the following steps: 1. Let |stableBytes| be a [=get a copy of the buffer source|copy of the bytes held by the buffer=] |bytes|. 1. [=Compile a WebAssembly module|Compile the WebAssembly module=] |stableBytes| and store the result as |module|. 1. If |module| is [=error=], throw a {{CompileError}} exception. + 1. Let |builtinSetNames| be options["builtins"] + 1. Let |importedStringModule| be options["importedStringConstants"] + 1. If [=validate builtins and imported string for a WebAssembly module|validating builtins and imported strings=] for |module| with |builtinSetNames| and |importedStringModule| returns false, throw a {{CompileError}} exception. 1. Set **this**.\[[Module]] to |module|. 1. Set **this**.\[[Bytes]] to |stableBytes|. + 1. Set **this**.\[[BuiltinSets]] to |builtinSetNames|. + 1. Set **this**.\[[ImportedStringModule]] to |importedStringModule|. Note: Some implementations enforce a size limitation on |bytes|. Use of this API is discouraged, in favor of asynchronous APIs.
From 8f2e8dcf40c45a1d7f2e7a78202bb233904ea2f7 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Thu, 8 Aug 2024 12:40:03 -0500 Subject: [PATCH 34/70] Don't freeze internal instantiation of builtins and string constants --- document/js-api/index.bs | 2 -- 1 file changed, 2 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 4b1c3f8f0a..aa9791343a 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -415,7 +415,6 @@ To instantiate imported strings with module |module| and |importedStr 1. Let |stringConstant| be |componentName|. 1. Let |status| be [=!=] [$CreateDataProperty$](|exportsObject|, |stringConstant|, |stringConstant|). 1. Assert: |status| is true. -1. Perform [=!=] [$SetIntegrityLevel$](|exportsObject|, `"frozen"`). 1. Return |exportsObject|. @@ -1776,7 +1775,6 @@ To instantiate a builtin set with name |builtinSetName|, perform the 1. Let |value| be |func|. 1. Let |status| be [=!=] [$CreateDataProperty$](|exportsObject|, |name|, |value|). 1. Assert: |status| is true. -1. Perform [=!=] [$SetIntegrityLevel$](|exportsObject|, `"frozen"`). 1. Return |exportsObject|. From 60e5e69850fc645c3a185d77d939c1b7cf9cd28f Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Thu, 8 Aug 2024 13:24:57 -0500 Subject: [PATCH 35/70] Move UTF-8 support to a future proposal --- proposals/js-string-builtins/Overview.md | 34 ++++++++++++------------ 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/proposals/js-string-builtins/Overview.md b/proposals/js-string-builtins/Overview.md index d40de3567b..6528301602 100644 --- a/proposals/js-string-builtins/Overview.md +++ b/proposals/js-string-builtins/Overview.md @@ -157,14 +157,6 @@ For this purpose, `WebAssembly.validate()` is extended to take a list of builtin If a user wishes to polyfill these imports for some reason, or is running on a system without a builtin, these imports may be provided as normal through instantiation. 
-## UTF8/WTF8 support - -As stated above in 'goals for builtins', builtins are intended to just wrap existing primitives and not invent new functionality. - -JS Strings are semantically a sequence of 16-bit code units (referred to as char codes in method naming), and there are no builtin operations on them to acquire a UTF-8 or WTF-8 view. This makes it difficult to write Wasm builtins for these encodings without introducing significant new logic to them. - -There is the Encoding API for `TextEncoder`/`TextDecoder` which can be used for UTF-8 support. However, this is technically a separate spec from JS and may not be available on all JS engines (in practice it's available widely). This proposal exposes UTF-8 data conversions using this API under separate `wasm:text-encoder` `wasm:text-decoder` interfaces which are available when the host implements these interfaces. - ## String constants String constants may be defined in JS and made available to Wasm through a variety of means. @@ -551,7 +543,19 @@ function compare( } ``` -## Encoding API +## Future extensions + +There are several extensions we can make in the future as need arrives. + +### UTF8/WTF8 support + +As stated above in 'goals for builtins', builtins are intended to just wrap existing primitives and not invent new functionality. + +JS Strings are semantically a sequence of 16-bit code units (referred to as char codes in method naming), and there are no builtin operations on them to acquire a UTF-8 or WTF-8 view. This makes it difficult to write Wasm builtins for these encodings without introducing significant new logic to them. + +There is the Encoding API for `TextEncoder`/`TextDecoder` which can be used for UTF-8 support. However, this is technically a separate spec from JS and may not be available on all JS engines (in practice it's available widely). 
This proposal exposes UTF-8 data conversions using this API under separate `wasm:text-encoder` `wasm:text-decoder` interfaces which are available when the host implements these interfaces. + +### Encoding API The following is an initial set of function builtins for the [`TextEncoder`](https://encoding.spec.whatwg.org/#interface-textencoder) and the [`TextDecoder`](https://encoding.spec.whatwg.org/#interface-textdecoder) interfaces. These builtins are exposed under `wasm:text-encoder` and `wasm:text-decoder`, respectively. @@ -607,7 +611,7 @@ function trap() { } ``` -### "wasm:text-decoder" "decodeStringFromUTF8Array" +#### "wasm:text-decoder" "decodeStringFromUTF8Array" ``` /// Decode the specified range of an i8 array using UTF-8 into a string. @@ -656,7 +660,7 @@ func decodeStringFromUTF8Array( } ``` -### "wasm:text-encoder" "measureStringAsUTF8" +#### "wasm:text-encoder" "measureStringAsUTF8" ``` /// Returns the number of bytes string would take when encoded as UTF-8. @@ -684,7 +688,7 @@ func measureStringAsUTF8( } ``` -### "wasm:text-encoder" "encodeStringIntoUTF8Array" +#### "wasm:text-encoder" "encodeStringIntoUTF8Array" ``` /// Encode a string into a pre-allocated mutable i8 array at `start` index using @@ -731,7 +735,7 @@ func encodeStringIntoUTF8Array( } ``` -### "wasm:text-encoder" "encodeStringToUTF8Array" +#### "wasm:text-encoder" "encodeStringToUTF8Array" ``` /// Encode a string into a new mutable i8 array using UTF-8. @@ -760,10 +764,6 @@ func encodeStringToUTF8Array( } ``` -## Future extensions - -There are several extensions we can make in the future as need arrives. - ### Binding memory to builtins It may be useful to have a builtin that operates on a specific Wasm memory. For JS strings, this could allow us to encode a JS string directly into linear memory. 
From 5126002a6705064b02008872143fb7a3c9b5bba8 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 9 Aug 2024 14:56:01 -0500 Subject: [PATCH 36/70] Remove DS_Store --- test/.DS_Store | Bin 6148 -> 0 bytes 1 file changed, 0 insertions(+), 0 deletions(-) delete mode 100644 test/.DS_Store diff --git a/test/.DS_Store b/test/.DS_Store deleted file mode 100644 index f0081cd54879fe0a01a241b176985baa2af64f92..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 6148 [binary payload omitted] Date: Fri, 9 Aug 2024 15:03:06 -0500 Subject: [PATCH 37/70] Remove DS_Store --- test/js-api/.DS_Store | Bin 6148 -> 0 bytes 1 file changed, 0 insertions(+), 0 deletions(-) delete mode 100644 test/js-api/.DS_Store diff --git a/test/js-api/.DS_Store b/test/js-api/.DS_Store deleted file mode 100644 index 2830d0975c22c1acab77d64cc98b1a717d3a2e14..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 6148 [binary payload omitted] Date: Fri, 9 Aug 2024 11:42:12 -0700 Subject: [PATCH 38/70] Mark up options appropriately --- document/js-api/index.bs | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index aa9791343a..d214b0ca4f 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs 
@@ -358,8 +358,8 @@ Note: 1. Let |stableBytes| be a [=get a copy of the buffer source|copy of the bytes held by the buffer=] |bytes|. 1. [=Compile a WebAssembly module|Compile=] |stableBytes| as a WebAssembly module and store the results as |module|. 1. If |module| is [=error=], return false. - 1. Let |builtinSetNames| be options["builtins"] - 1. Let |importedStringModule| be options["importedStringConstants"] + 1. Let |builtinSetNames| be |options|["builtins"] + 1. Let |importedStringModule| be |options|["importedStringConstants"] 1. If [=validate builtins and imported string for a WebAssembly module|validating builtins and imported strings=] for |module| with |builtinSetNames| and |importedStringModule| returns false, return false. 1. Return true. @@ -390,8 +390,8 @@ A {{Module}} object represents a single WebAssembly module. Each {{Module}} obje 1. [=compile a WebAssembly module|Compile the WebAssembly module=] |bytes| and store the result as |module|. 1. [=Queue a task=] to perform the following steps. If |taskSource| was provided, queue the task on that task source. 1. If |module| is [=error=], reject |promise| with a {{CompileError}} exception. - 1. Let |builtinSetNames| be options["builtins"] - 1. Let |importedStringModule| be options["importedStringConstants"] + 1. Let |builtinSetNames| be |options|["builtins"] + 1. Let |importedStringModule| be |options|["importedStringConstants"] 1. If [=validate builtins and imported string for a WebAssembly module|validating builtins and imported strings=] for |module| with |builtinSetNames| and |importedStringModule| is false, reject |promise| with a {{CompileError}} exception. 1. Otherwise, 1. [=Construct a WebAssembly module object=] from |module|, |bytes|, |builtinSetNames|, |importedStringModule|, and let |moduleObject| be the result. @@ -686,8 +686,8 @@ interface Module { 1. Let |stableBytes| be a [=get a copy of the buffer source|copy of the bytes held by the buffer=] |bytes|. 1. 
[=Compile a WebAssembly module|Compile the WebAssembly module=] |stableBytes| and store the result as |module|. 1. If |module| is [=error=], throw a {{CompileError}} exception. - 1. Let |builtinSetNames| be options["builtins"] - 1. Let |importedStringModule| be options["importedStringConstants"] + 1. Let |builtinSetNames| be |options|["builtins"] + 1. Let |importedStringModule| be |options|["importedStringConstants"] 1. If [=validate builtins and imported string for a WebAssembly module|validating builtins and imported strings=] for |module| with |builtinSetNames| and |importedStringModule| returns false, throw a {{CompileError}} exception. 1. Set **this**.\[[Module]] to |module|. 1. Set **this**.\[[Bytes]] to |stableBytes|. From 0aad8b912aaeb728cd52aab794b57ad0b32dc7dc Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 9 Aug 2024 13:19:30 -0700 Subject: [PATCH 39/70] Make UnwrapString an abstract-op and use it to implement cast --- document/js-api/index.bs | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index d214b0ca4f..1b5afd3fa5 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1797,16 +1797,28 @@ To validate an import for builtins with |import|, enabled builtins |b String builtins adapt the interface of the String builtin object. The import name for this set is `wasm:js-string`. -
+

Abstract operations

+
-The unwrapString(|v|) method, when invoked, performs the following steps: +The UnwrapString(|v|) abstract operation, when invoked, performs the following steps: -1. If [=Type=](|v|) is not String +1. If [=Type=](|v|) is not [=String=] 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. 1. Return |v|
+

cast

+ +The |funcType| of this builtin is `(func (param externref) (result externref))`. + +
+When this builtin is invoked with parameter |v|, the following steps must be run: + +1. Return [=?=] [$UnwrapString$](|v|) + +
+

charCodeAt

The type of this function is `(func (param externref i32) (result i32))`. @@ -1814,7 +1826,7 @@ The type of this function is `(func (param externref i32) (result i32))`.
When this builtin is invoked, the following steps must be run: -1. Let |string| be [$unwrapString$](param0) +1. Let |string| be [$UnwrapString$](param0) 1. TODO
From 57aa62d13400f27666e422fb444411b0aff8ecb7 Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 9 Aug 2024 13:23:46 -0700 Subject: [PATCH 40/70] Consistently use "steps" instead of "algorithm" for builtins --- document/js-api/index.bs | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 1b5afd3fa5..121f29d96f 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1753,7 +1753,7 @@ To validate builtin set names with |builtinSetNames|, perform the fol
- To create a builtin function from type |functype| and execution steps |steps|, perform the following steps: + To create a builtin function from type |funcType| and execution steps |steps|, perform the following steps: 1. Let |stored settings| be the incumbent settings object. 1. Let |hostfunc| be a [=host function=] which executes |steps| when called. @@ -1769,8 +1769,8 @@ To instantiate a builtin set with name |builtinSetName|, perform the 1. Let |builtins| be the result of [=get the builtins for a builtin set=] |builtinSetName| 1. Let |exportsObject| be [=!=] [$OrdinaryObjectCreate$](null). 1. If |externtype| is of the form [=external-type/func=] functype, -1. [=list/iterate|For each=] (|name|, |funcType|, |algorithm|) of |builtins|, - 1. Let |funcaddr| be the result fo [=create a builtin function=] with |funcType| and |algorithm| +1. [=list/iterate|For each=] (|name|, |funcType|, |steps|) of |builtins|, + 1. Let |funcaddr| be the result of [=create a builtin function=] with |funcType| and |steps| 1. Let |func| be the result of creating [=a new Exported Function=] from |funcaddr|. 1. Let |value| be |func|. 1. Let |status| be [=!=] [$CreateDataProperty$](|exportsObject|, |name|, |value|). From acb72c9cd9df91e253ac19df089d27c515e8c88c Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 9 Aug 2024 13:29:52 -0700 Subject: [PATCH 41/70] Add test --- document/js-api/index.bs | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 121f29d96f..4af1e5bb1a 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1819,6 +1819,19 @@ When this builtin is invoked with parameter |v|, the following steps must be run
+

test

+ +The |funcType| of this builtin is `(func (param externref) (result i32))`. + +
+When this builtin is invoked with parameter |v|, the following steps must be run: + +1. If [=Type=](|v|) is not [=String=] + 1. Return 0 +1. Return 1 + +
+
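For readers who want a concrete picture, the `UnwrapString`, `cast` and `test` steps can be approximated in plain JavaScript as below. This is a sketch under stated assumptions: the function names are illustrative, and throwing `WebAssembly.RuntimeError` only stands in for a trap (the spec text itself notes that the real error must not be catchable).

```javascript
// Approximation of the UnwrapString abstract operation.
function unwrapString(v) {
  // Type(v) is String only for primitive strings, not String wrapper objects.
  if (typeof v !== "string") {
    // Stand-in for a trap; a real trap could not be caught by Wasm code.
    throw new WebAssembly.RuntimeError("expected a JS string");
  }
  return v;
}

// "cast": (func (param externref) (result externref))
function cast(v) {
  return unwrapString(v);
}

// "test": (func (param externref) (result i32))
function test(v) {
  return typeof v === "string" ? 1 : 0;
}
```

Note that `test` never throws, so a module can probe a value with it before committing to `cast`.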

charCodeAt

The type of this function is `(func (param externref i32) (result i32))`. From 9c45e27282c492ae4ca41ebbe40c5e318cc06cfb Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 9 Aug 2024 13:36:07 -0700 Subject: [PATCH 42/70] Add stubs for the rest of the operations --- document/js-api/index.bs | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 4af1e5bb1a..47b4b6b029 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1832,6 +1832,22 @@ When this builtin is invoked with parameter |v|, the following steps must be run +

fromCharCodeArray

+ +TODO + +

intoCharCodeArray

+ +TODO + +

fromCharCode

+ +TODO + +

fromCodePoint

+ +TODO +

charCodeAt

The type of this function is `(func (param externref i32) (result i32))`. @@ -1844,6 +1860,29 @@ When this builtin is invoked, the following steps must be run: +

codePointAt

+ +TODO + +

length

+ +TODO + +

concat

+ +TODO + +

substring

+ +TODO + +

equals

+ +TODO + +

compare

+ +TODO

Error Condition Mappings to JavaScript

From a924966d963f43977de22fd099ed43a2fc51299d Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 9 Aug 2024 15:10:47 -0700 Subject: [PATCH 43/70] Add fromCharCode --- document/js-api/index.bs | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 47b4b6b029..e2861a804d 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -62,6 +62,7 @@ urlPrefix: https://tc39.github.io/ecma262/; spec: ECMASCRIPT text: 𝔽; url: #𝔽 text: ℤ; url: #ℤ text: SameValue; url: sec-samevalue + text: String.fromCharCode; url: sec-string.fromcharcode urlPrefix: https://webassembly.github.io/spec/core/; spec: WebAssembly; type: dfn url: valid/modules.html#valid-module text: valid @@ -1842,7 +1843,14 @@ TODO

fromCharCode

-TODO +The |funcType| of this builtin is `(func (param i32) (result externref))`. + +
+When this builtin is invoked with parameter |v|, the following steps must be run: + +1. Return [=!=] [=String.fromCharCode=]([=ToJSValue=](|v|)). + +

fromCodePoint

From 4d1520e4fdf6179e3cc50bc83b4fbca6d18ff0fc Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 9 Aug 2024 15:54:54 -0700 Subject: [PATCH 44/70] Switch to using the Call abstract op, and add fromCodePoint --- document/js-api/index.bs | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index e2861a804d..2e2f32df00 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -63,6 +63,7 @@ urlPrefix: https://tc39.github.io/ecma262/; spec: ECMASCRIPT text: ℤ; url: #ℤ text: SameValue; url: sec-samevalue text: String.fromCharCode; url: sec-string.fromcharcode + text: String.fromCodePoint; url: sec-string.fromcodepoint urlPrefix: https://webassembly.github.io/spec/core/; spec: WebAssembly; type: dfn url: valid/modules.html#valid-module text: valid @@ -1848,7 +1849,7 @@ The |funcType| of this builtin is `(func (param i32) (result externref))`.
When this builtin is invoked with parameter |v|, the following steps must be run: -1. Return [=!=] [=String.fromCharCode=]([=ToJSValue=](|v|)). +1. Return [=!=] [$Call$]([=String.fromCharCode=], undefined, [=ToJSValue=](|v|)).
@@ -1856,6 +1857,15 @@ When this builtin is invoked with parameter |v|, the following steps must be run TODO +The |funcType| of this builtin is `(func (param i32) (result externref))`. + +
+When this builtin is invoked with parameter |v|, the following steps must be run: + +1. Return [=!=] [$Call$]([=String.fromCodePoint=], undefined, [=ToJSValue=](|v|)). + +
+
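The `fromCharCode` and `fromCodePoint` steps can be sketched in JavaScript as follows. The function names are illustrative, and `WebAssembly.RuntimeError` again stands in for a trap. The range check in `fromCodePoint` reflects the check that PATCH 46 adds later in this series; treating the `i32` argument as unsigned for that comparison is an assumption of this sketch.

```javascript
// "fromCharCode": (func (param i32) (result externref))
function fromCharCode(v) {
  // String.fromCharCode applies ToUint16, so only the low 16 bits are used.
  return String.fromCharCode(v);
}

// "fromCodePoint": (func (param i32) (result externref))
function fromCodePoint(v) {
  // Assumption: compare the i32 as an unsigned value.
  const codePoint = v >>> 0;
  if (codePoint > 0x10ffff) {
    // Stand-in for a trap.
    throw new WebAssembly.RuntimeError("invalid code point");
  }
  return String.fromCodePoint(codePoint);
}
```

The explicit check matters because `String.fromCodePoint` would otherwise throw a catchable `RangeError`, which is not how a Wasm trap behaves.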

charCodeAt

The type of this function is `(func (param externref i32) (result i32))`. From a356284edaece50742245b5ee4efe8d90b18943d Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 9 Aug 2024 16:03:15 -0700 Subject: [PATCH 45/70] Add length --- document/js-api/index.bs | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 2e2f32df00..32541d037d 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1884,7 +1884,15 @@ TODO

length

-TODO +The |funcType| of this builtin is `(func (param externref) (result i32))`. + +
+When this builtin is invoked with parameter |v|, the following steps must be run: + +1. Let |string| be [=?=] [$UnwrapString$](|v|). +1. Return the [=string/length=] of |string|. + +
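The `length` steps above, together with the bounds-checked `charCodeAt` steps that PATCH 46 introduces later in this series, might look like this in JavaScript. The helper names are hypothetical, `WebAssembly.RuntimeError` stands in for a trap, and interpreting the `i32` index as unsigned is an assumption of the sketch.

```javascript
// Hypothetical helper standing in for a Wasm trap.
function trap(message) {
  throw new WebAssembly.RuntimeError(message);
}

// Approximation of the UnwrapString abstract operation.
function unwrapString(v) {
  if (typeof v !== "string") trap("expected a JS string");
  return v;
}

// "length": (func (param externref) (result i32))
function length(v) {
  const string = unwrapString(v);
  // Length in 16-bit code units, matching JS string semantics.
  return string.length;
}

// "charCodeAt": (func (param externref i32) (result i32))
function charCodeAt(v, index) {
  const string = unwrapString(v);
  // Assumption: the i32 index is compared as an unsigned value.
  const i = index >>> 0;
  if (i >= string.length) trap("index out of bounds");
  return string.charCodeAt(i);
}
```

Because the out-of-bounds case traps, `charCodeAt` here never returns `NaN`, unlike `String.prototype.charCodeAt` called directly.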

concat

From 05b8756cc584d001b6f2b2b7f32463bdad559a73 Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 9 Aug 2024 16:13:55 -0700 Subject: [PATCH 46/70] Add charCodeAt and codePointAt --- document/js-api/index.bs | 26 ++++++++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 32541d037d..9fe76464f7 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -64,6 +64,8 @@ urlPrefix: https://tc39.github.io/ecma262/; spec: ECMASCRIPT text: SameValue; url: sec-samevalue text: String.fromCharCode; url: sec-string.fromcharcode text: String.fromCodePoint; url: sec-string.fromcodepoint + text: String.prototype.charCodeAt; url: sec-string.prototype.charcodeat + text: String.prototype.codePointAt; url: sec-string.prototype.codepointat urlPrefix: https://webassembly.github.io/spec/core/; spec: WebAssembly; type: dfn url: valid/modules.html#valid-module text: valid @@ -1862,6 +1864,8 @@ The |funcType| of this builtin is `(func (param i32) (result externref))`.
When this builtin is invoked with parameter |v|, the following steps must be run: +1. If |v| > 0x10ffff + 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. 1. Return [=!=] [$Call$]([=String.fromCodePoint=], undefined, [=ToJSValue=](|v|)).
@@ -1871,16 +1875,30 @@ When this builtin is invoked with parameter |v|, the following steps must be run The type of this function is `(func (param externref i32) (result i32))`.
-When this builtin is invoked, the following steps must be run: +When this builtin is invoked with parameters |string| and |index|, the following steps must be run: -1. Let |string| be [$UnwrapString$](param0) -1. TODO +1. Let |string| be [=?=] [$UnwrapString$](|string|). +1. Let |length| be the [=string/length=] of |string|. +1. If |index| >= |length| + 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. +1. Return [=!=] [$Call$]([=String.prototype.charCodeAt=], |string|, [=ToJSValue=](|index|)).

codePointAt

-TODO +The type of this function is `(func (param externref i32) (result i32))`. + +
+When this builtin is invoked with parameters |string| and |index|, the following steps must be run: + +1. Let |string| be [=?=] [$UnwrapString$](|string|). +1. Let |length| be the [=string/length=] of |string|. +1. If |index| >= |length| + 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. +1. Return [=!=] [$Call$]([=String.prototype.codePointAt=], |string|, [=ToJSValue=](|index|)). + +
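Note: The following is a non-normative sketch of the `codePointAt` steps in plain JavaScript, not the specified behavior. The name `codePointAtSketch` is hypothetical. Treating the `i32` index as unsigned via `>>> 0` is an assumption about how the comparison is meant to be read. `WebAssembly.RuntimeError` only approximates a trap here, since an error thrown from JavaScript remains catchable by JavaScript.

```javascript
// Non-normative sketch of the codePointAt builtin steps.
// `codePointAtSketch` is a hypothetical name used only for illustration.
function codePointAtSketch(string, index) {
  // UnwrapString: anything that is not a primitive string is rejected.
  if (typeof string !== "string") {
    throw new WebAssembly.RuntimeError("expected a string");
  }
  // Assumption: the i32 index is compared as an unsigned value.
  const i = index >>> 0;
  if (i >= string.length) {
    throw new WebAssembly.RuntimeError("index out of bounds");
  }
  // Because of the bounds check, this never returns undefined.
  return string.codePointAt(i);
}
```

Because of the bounds check, the result is always a number. An index that points at the trailing half of a surrogate pair yields that lone code unit rather than failing.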

length

From 1253b7bbd0b4b56974bea2d898ae5022df124742 Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 9 Aug 2024 16:16:45 -0700 Subject: [PATCH 47/70] Add concat --- document/js-api/index.bs | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 9fe76464f7..ec7dde67c4 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -66,6 +66,7 @@ urlPrefix: https://tc39.github.io/ecma262/; spec: ECMASCRIPT text: String.fromCodePoint; url: sec-string.fromcodepoint text: String.prototype.charCodeAt; url: sec-string.prototype.charcodeat text: String.prototype.codePointAt; url: sec-string.prototype.codepointat + text: String.prototype.concat; url: sec-string.prototype.concat urlPrefix: https://webassembly.github.io/spec/core/; spec: WebAssembly; type: dfn url: valid/modules.html#valid-module text: valid @@ -1914,7 +1915,16 @@ When this builtin is invoked with parameter |v|, the following steps must be run

concat

-TODO +The |funcType| of this builtin is `(func (param externref externref) (result externref))`. + +
+When this builtin is invoked with parameters |first| and |second|, the following steps must be run: + +1. Let |first| be [=?=] [$UnwrapString$](|first|). +1. Let |second| be [=?=] [$UnwrapString$](|second|). +1. Return [=!=] [$Call$]([=String.prototype.concat=], |first|, |second|). + +

substring

From a98b6862fc7acb53ce1629868dee551ca9a41019 Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 9 Aug 2024 16:39:13 -0700 Subject: [PATCH 48/70] Add substring --- document/js-api/index.bs | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index ec7dde67c4..14bdd7a90e 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -67,6 +67,7 @@ urlPrefix: https://tc39.github.io/ecma262/; spec: ECMASCRIPT text: String.prototype.charCodeAt; url: sec-string.prototype.charcodeat text: String.prototype.codePointAt; url: sec-string.prototype.codepointat text: String.prototype.concat; url: sec-string.prototype.concat + text: String.prototype.substring; url: sec-string.prototype.substring urlPrefix: https://webassembly.github.io/spec/core/; spec: WebAssembly; type: dfn url: valid/modules.html#valid-module text: valid @@ -1922,13 +1923,24 @@ When this builtin is invoked with parameters |first| and |second|, the following 1. Let |first| be [=?=] [$UnwrapString$](|first|). 1. Let |second| be [=?=] [$UnwrapString$](|second|). -1. Return [=!=] [$Call$]([=String.prototype.concat=], |first|, |second|). +1. Return [=!=] [$Call$]([=String.prototype.concat=], |first|, « |second| »).

substring

-TODO +The |funcType| of this builtin is `(func (param externref i32 i32) (result externref))`. + +
+When this builtin is invoked with parameters |string|, |start|, and |end|, the following steps must be run: + +1. Let |string| be [=?=] [$UnwrapString$](|string|). +1. Let |length| be the [=string/length=] of |string|. +1. If |start| > |end| or |start| > |length| + 1. Return the empty string. +1. Return [=!=] [$Call$]([=String.prototype.substring=], |string|, « [=ToJSValue=](|start|), [=ToJSValue=](|end|) »). + +
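Note: The early return in the steps above matters because `String.prototype.substring` swaps its arguments when the start exceeds the end. The non-normative sketch below shows the difference. The name `substringSketch` is hypothetical, and the indices are assumed to be non-negative.

```javascript
// Non-normative sketch of the substring builtin steps.
// `substringSketch` is a hypothetical name used only for illustration.
function substringSketch(string, start, end) {
  if (typeof string !== "string") {
    throw new WebAssembly.RuntimeError("expected a string");
  }
  const length = string.length;
  // Without this guard, substring(3, 1) would swap its arguments
  // and return the characters between indices 1 and 3.
  if (start > end || start > length) {
    return "";
  }
  // An `end` beyond the string length is clamped by substring itself.
  return string.substring(start, end);
}

// The plain JavaScript method swaps; the sketch returns the empty string.
console.log(JSON.stringify("hello".substring(3, 1)));       // "el"
console.log(JSON.stringify(substringSketch("hello", 3, 1))); // ""
```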

equals

From 500233267fd47c6ec0b7af16ce73261288916479 Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 9 Aug 2024 16:44:32 -0700 Subject: [PATCH 49/70] Use angle quotes appropriately --- document/js-api/index.bs | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 14bdd7a90e..6c3d3c4095 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1853,14 +1853,12 @@ The |funcType| of this builtin is `(func (param i32) (result externref))`.
When this builtin is invoked with parameter |v|, the following steps must be run: -1. Return [=!=] [$Call$]([=String.fromCharCode=], undefined, [=ToJSValue=](|v|)). +1. Return [=!=] [$Call$]([=String.fromCharCode=], undefined, « [=ToJSValue=](|v|) »).

fromCodePoint

-TODO - The |funcType| of this builtin is `(func (param i32) (result externref))`.
@@ -1868,7 +1866,7 @@ When this builtin is invoked with parameter |v|, the following steps must be run 1. If |v| > 0x10ffff 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. -1. Return [=!=] [$Call$]([=String.fromCodePoint=], undefined, [=ToJSValue=](|v|)). +1. Return [=!=] [$Call$]([=String.fromCodePoint=], undefined, « [=ToJSValue=](|v|) »).
@@ -1883,7 +1881,7 @@ When this builtin is invoked with parameters |string| and |index|, the following 1. Let |length| be the [=string/length=] of |string|. 1. If |index| >= |length| 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. -1. Return [=!=] [$Call$]([=String.prototype.charCodeAt=], |string|, [=ToJSValue=](|index|)). +1. Return [=!=] [$Call$]([=String.prototype.charCodeAt=], |string|, « [=ToJSValue=](|index|) »). @@ -1898,7 +1896,7 @@ When this builtin is invoked with parameters |string| and |index|, the following 1. Let |length| be the [=string/length=] of |string|. 1. If |index| >= |length| 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. -1. Return [=!=] [$Call$]([=String.prototype.codePointAt=], |string|, [=ToJSValue=](|index|)). +1. Return [=!=] [$Call$]([=String.prototype.codePointAt=], |string|, « [=ToJSValue=](|index|) »). @@ -1923,7 +1921,7 @@ When this builtin is invoked with parameters |first| and |second|, the following 1. Let |first| be [=?=] [$UnwrapString$](|first|). 1. Let |second| be [=?=] [$UnwrapString$](|second|). -1. Return [=!=] [$Call$]([=String.prototype.concat=], |first|, « |second| »). +1. Return [=!=] [$Call$]([=String.prototype.concat=], |first|, « |second| »). @@ -1938,7 +1936,7 @@ When this builtin is invoked with parameters |string|, |start|, and |end|, the f 1. Let |length| be the [=string/length=] of |string|. 1. If |start| > |end| or |start| > |length| 1. Return the empty string. -1. Return [=!=] [$Call$]([=String.prototype.substring=], |string|, « [=ToJSValue=](|start|), [=ToJSValue=](|end|) »). +1. Return [=!=] [$Call$]([=String.prototype.substring=], |string|, « [=ToJSValue=](|start|), [=ToJSValue=](|end|) »). 
From a402a3a378cdb2c53c73acde7e681a4efd831ca2 Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Wed, 14 Aug 2024 12:49:14 -0700 Subject: [PATCH 50/70] Add basic support for fromCharCodeArray Needs more detail to properly integrate with GC array ops. --- document/js-api/index.bs | 31 +++++++++++++++++++++++++++++-- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 6c3d3c4095..1daf657909 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1814,6 +1814,15 @@ The UnwrapString(|v|) abstract operatio +
+ +The FromCharCode(|v|) abstract operation, when invoked, performs the following steps: + +1. Assert: |v| is of type [=i32=]. +1. Return [=!=] [$Call$]([=String.fromCharCode=], undefined, « [=ToJSValue=](|v|) »). + +
+

cast

The |funcType| of this builtin is `(func (param externref) (result externref))`. @@ -1840,7 +1849,25 @@ When this builtin is invoked with parameter |v|, the following steps must be run

fromCharCodeArray

-TODO +The |funcType| of this builtin is `(func (param (ref null (array (mut i16))) i32 i32) (result externref))`. + +
+When this builtin is invoked with parameters |array|, |start|, and |end|, the following steps must be run: + +1. If |array| is nul] + 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. +1. If |start| > |end| or |end| > [=array_len=](|array|) + 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. +1. Let |result| be the empty string +1. Let |i| be |start| +1. While |i| < |end|: + 1. Let |charCode| be [=array_i16_get=](|array|, |i|). + 1. Let |charCodeString| be [$FromCharCode$](|charCode|). + 1. Let |result| be the concatenation of |result| and |charCodeString|. + 1. Set |i| to |i| + 1. +1. Return |result| + +

intoCharCodeArray

@@ -1853,7 +1880,7 @@ The |funcType| of this builtin is `(func (param i32) (result externref))`.
When this builtin is invoked with parameter |v|, the following steps must be run: -1. Return [=!=] [$Call$]([=String.fromCharCode=], undefined, « [=ToJSValue=](|v|) »). +1. Return [$FromCharCode$](|v|).
From e6462a6a99c098ef0a1037f2f397a625767e6f3b Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Wed, 14 Aug 2024 17:49:48 -0700 Subject: [PATCH 51/70] Make fromCharCodeArray slightly less formal This avoids having to refer to actual Wasm instructions, since after all this is a host function. --- document/js-api/index.bs | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 1daf657909..d80a7d644a 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1851,21 +1851,26 @@ When this builtin is invoked with parameter |v|, the following steps must be run The |funcType| of this builtin is `(func (param (ref null (array (mut i16))) i32 i32) (result externref))`. +Note: This function only takes a mutable i16 array defined in its own recursion group. +If this is an issue for toolchains, we can look into how to relax the function type +while still maintaining good performance. +
When this builtin is invoked with parameters |array|, |start|, and |end|, the following steps must be run: -1. If |array| is nul] +1. If |array| is null 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. -1. If |start| > |end| or |end| > [=array_len=](|array|) +1. Let |length| be the number of elements in |array|. +1. If |start| > |end| or |end| > |length| 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. -1. Let |result| be the empty string -1. Let |i| be |start| +1. Let |result| be the empty string. +1. Let |i| be |start|. 1. While |i| < |end|: - 1. Let |charCode| be [=array_i16_get=](|array|, |i|). + 1. Let |charCode| be the value of the element stored at index |i| in |array|. 1. Let |charCodeString| be [$FromCharCode$](|charCode|). 1. Let |result| be the concatenation of |result| and |charCodeString|. 1. Set |i| to |i| + 1. -1. Return |result| +1. Return |result|.
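Note: The loop above can be sketched in plain JavaScript as follows. This is non-normative. A `Uint16Array` is used as a stand-in for the Wasm `(array (mut i16))`, since GC arrays are opaque to JavaScript. The name `fromCharCodeArraySketch` is hypothetical, and `start` and `end` are assumed to be non-negative.

```javascript
// Non-normative sketch of the fromCharCodeArray builtin steps.
// A Uint16Array stands in for the Wasm GC i16 array (an assumption made
// so that the sketch runs in plain JavaScript).
function fromCharCodeArraySketch(array, start, end) {
  if (array === null) {
    throw new WebAssembly.RuntimeError("null array");
  }
  const length = array.length;
  if (start > end || end > length) {
    throw new WebAssembly.RuntimeError("range out of bounds");
  }
  let result = "";
  for (let i = start; i < end; i++) {
    // Each element is a UTF-16 code unit; surrogates are copied as-is.
    result += String.fromCharCode(array[i]);
  }
  return result;
}
```

An empty range (`start` equal to `end`) is valid and produces the empty string, while a reversed range fails instead of being swapped.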
From 86d7160391b61a49bbb3ea846a2e1b51fbd39c4b Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Wed, 14 Aug 2024 18:13:52 -0700 Subject: [PATCH 52/70] Add intoCharCodeArray --- document/js-api/index.bs | 37 +++++++++++++++++++++++++++++++++++-- 1 file changed, 35 insertions(+), 2 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index d80a7d644a..cf2b6b7938 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1823,6 +1823,15 @@ The FromCharCode(|v|) abstract operatio +
+ +The CharCodeAt(|string|, |index|) abstract operation, when invoked, performs the following steps: + +1. Assert: |index| is of type [=i32=]. +1. Return [=!=] [$Call$]([=String.prototype.charCodeAt=], |string|, « [=ToJSValue=](|index|) »). + +
+

cast

The |funcType| of this builtin is `(func (param externref) (result externref))`. @@ -1876,7 +1885,31 @@ When this builtin is invoked with parameters |array|, |start|, and |end|, the fo

intoCharCodeArray

-TODO +The |funcType| of this builtin is `(func (param externref (ref null (array (mut i16))) i32) (result i32))`. + +Note: This function only takes a mutable i16 array defined in its own recursion group. +If this is an issue for toolchains, we can look into how to relax the function type +while still maintaining good performance. + +
+When this builtin is invoked with parameters |string|, |array|, and |start|, the following steps must be run: + +1. If |array| is null + 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. +1. Let |string| be [=?=] [$UnwrapString$](|string|). +1. Let |stringLength| be the [=string/length=] of |string|. +1. Let |arrayLength| be the number of elements in |array|. +1. If |start| + |length| > |arrayLength| + 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. +1. Let |i| be 0. +1. While |i| < |stringLength|: + 1. Let |charCode| be [$CharCodeAt$](|string|, |i|). + 1. Set the element at index |start| + |i| in |array| to [=ToWebAssemblyValue=](|charCode|). + 1. Set |i| to |i| + 1. +1. Return |stringLength|. + +
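Note: A non-normative sketch of these steps in plain JavaScript follows. It reads the bounds check as `start + stringLength > arrayLength`, which is an interpretation of the steps above. A `Uint16Array` stands in for the Wasm `(array (mut i16))`, the name `intoCharCodeArraySketch` is hypothetical, and `start` is assumed to be non-negative.

```javascript
// Non-normative sketch of the intoCharCodeArray builtin steps.
// A Uint16Array stands in for the Wasm GC i16 array (an assumption).
function intoCharCodeArraySketch(string, array, start) {
  if (array === null) {
    throw new WebAssembly.RuntimeError("null array");
  }
  if (typeof string !== "string") {
    throw new WebAssembly.RuntimeError("expected a string");
  }
  const stringLength = string.length;
  const arrayLength = array.length;
  // The whole string must fit; the check runs before any element is written.
  if (start + stringLength > arrayLength) {
    throw new WebAssembly.RuntimeError("string does not fit in array");
  }
  for (let i = 0; i < stringLength; i++) {
    array[start + i] = string.charCodeAt(i);
  }
  // The number of code units written.
  return stringLength;
}
```

Because the check precedes the loop, a string that does not fit leaves the array untouched rather than partially written.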
+

fromCharCode

@@ -1913,7 +1946,7 @@ When this builtin is invoked with parameters |string| and |index|, the following 1. Let |length| be the [=string/length=] of |string|. 1. If |index| >= |length| 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. -1. Return [=!=] [$Call$]([=String.prototype.charCodeAt=], |string|, « [=ToJSValue=](|index|) »). +1. Return [$CharCodeAt$](|string|, |index|). From a3c7562be0c8fbe9dd6d289360e6fa26ed3b29cc Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Fri, 16 Aug 2024 11:45:34 -0700 Subject: [PATCH 53/70] Add equals and compare --- document/js-api/index.bs | 36 ++++++++++++++++++++++++++++++++++-- 1 file changed, 34 insertions(+), 2 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index cf2b6b7938..9e739dec5e 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -62,6 +62,8 @@ urlPrefix: https://tc39.github.io/ecma262/; spec: ECMASCRIPT text: 𝔽; url: #𝔽 text: ℤ; url: #ℤ text: SameValue; url: sec-samevalue + text: IsStrictlyEqual; url: sec-isstrictlyequal + text: IsLessThan; url: sec-islessthan text: String.fromCharCode; url: sec-string.fromcharcode text: String.fromCodePoint; url: sec-string.fromcodepoint text: String.prototype.charCodeAt; url: sec-string.prototype.charcodeat @@ -2007,11 +2009,41 @@ When this builtin is invoked with parameters |string|, |start|, and |end|, the f

equals

-TODO +The |funcType| of this builtin is `(func (param externref externref) (result i32))`. + +Note: Explicitly allow null strings to be compared for equality as that is meaningful. + +
+ +When this builtin is invoked with parameters |first| and |second|, the following steps must be run: + +1. If |first| is not null and [=Type=](|first|) is not [=String=] + 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. +1. If |second| is not null and [=Type=](|second|) is not [=String=] + 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. +1. If [=!=] [=IsStrictlyEqual=](|first|, |second|) is true + 1. Return 1. +1. Return 0. + +
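Note: The null handling above can be sketched in plain JavaScript as follows. This is non-normative, and the name `equalsSketch` is hypothetical. JavaScript `null` stands in for a null `externref`.

```javascript
// Non-normative sketch of the equals builtin steps.
// `equalsSketch` is a hypothetical name used only for illustration.
function equalsSketch(first, second) {
  for (const v of [first, second]) {
    // Unlike the other string builtins, null is accepted here.
    if (v !== null && typeof v !== "string") {
      throw new WebAssembly.RuntimeError("expected a string or null");
    }
  }
  // Strict equality: null equals only null, and strings compare by content.
  return first === second ? 1 : 0;
}
```

Under this reading, two nulls compare equal, while null and the empty string do not.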

compare

-TODO +The |funcType| of this builtin is `(func (param externref externref) (result i32))`. + +
+ +When this builtin is invoked with parameters |first| and |second|, the following steps must be run: + +1. Let |first| be [=?=] [$UnwrapString$](|first|). +1. Let |second| be [=?=] [$UnwrapString$](|second|). +1. If [=!=] [=IsStrictlyEqual=](|first|, |second|) is true + 1. Return 0. +1. If [=!=] [=IsLessThan=](|first|, |second|, true) is true + 1. Return -1. +1. Return 1. + +
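Note: A non-normative sketch of the three-way result in plain JavaScript follows. The name `compareSketch` is hypothetical. The JavaScript `<` operator on two strings orders them by UTF-16 code units, not by code points.

```javascript
// Non-normative sketch of the compare builtin steps.
// `compareSketch` is a hypothetical name used only for illustration.
function compareSketch(first, second) {
  for (const v of [first, second]) {
    // Null is rejected here.
    if (typeof v !== "string") {
      throw new WebAssembly.RuntimeError("expected a string");
    }
  }
  if (first === second) {
    return 0;
  }
  // Code-unit-wise ordering, as performed by `<` on two strings.
  return first < second ? -1 : 1;
}
```

One consequence of code-unit ordering is that U+FF61 sorts after U+1F600, because the latter is stored as a surrogate pair whose first unit is 0xD83D.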

Error Condition Mappings to JavaScript

From d7765e664ad02ba7db00b694e54d64bf908da5f5 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 21 Aug 2024 09:05:39 -0500 Subject: [PATCH 54/70] Review fixes for builtin steps --- document/js-api/index.bs | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 9e739dec5e..790efc910a 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1863,8 +1863,6 @@ When this builtin is invoked with parameter |v|, the following steps must be run The |funcType| of this builtin is `(func (param (ref null (array (mut i16))) i32 i32) (result externref))`. Note: This function only takes a mutable i16 array defined in its own recursion group. -If this is an issue for toolchains, we can look into how to relax the function type -while still maintaining good performance.
When this builtin is invoked with parameters |array|, |start|, and |end|, the following steps must be run: @@ -1890,8 +1888,6 @@ When this builtin is invoked with parameters |array|, |start|, and |end|, the fo The |funcType| of this builtin is `(func (param externref (ref null (array (mut i16))) i32) (result i32))`. Note: This function only takes a mutable i16 array defined in its own recursion group. -If this is an issue for toolchains, we can look into how to relax the function type -while still maintaining good performance.
When this builtin is invoked with parameters |string|, |array|, and |start|, the following steps must be run: @@ -1901,7 +1897,7 @@ When this builtin is invoked with parameters |string|, |array|, and |start|, the 1. Let |string| be [=?=] [$UnwrapString$](|string|). 1. Let |stringLength| be the [=string/length=] of |string|. 1. Let |arrayLength| be the number of elements in |array|. -1. If |start| + |length| > |arrayLength| +1. If |start| + |stringLength| > |arrayLength| 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. 1. Let |i| be 0. 1. While |i| < |stringLength|: From 95701693f482ffa00d0c7c7d592b438e9904f745 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 21 Aug 2024 09:19:22 -0500 Subject: [PATCH 55/70] Tweak wording for trapping in builtins --- document/js-api/index.bs | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 790efc910a..4cb2ef530b 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1803,7 +1803,7 @@ To validate an import for builtins with |import|, enabled builtins |b

String Builtins

-String builtins adapt the interface of the String builtin object. The import name for this set is `wasm:js-string`. +String builtins adapt the interface of the [=String=] builtin object. The import name for this set is `wasm:js-string`.

Abstract operations

@@ -1811,7 +1811,7 @@ String builtins adapt the interface of the String builtin object. The import nam The UnwrapString(|v|) abstract operation, when invoked, performs the following steps: 1. If [=Type=](|v|) is not [=String=] - 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. + 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. Return |v|
@@ -1868,10 +1868,10 @@ Note: This function only takes a mutable i16 array defined in its own recursion When this builtin is invoked with parameters |array|, |start|, and |end|, the following steps must be run: 1. If |array| is null - 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. + 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. Let |length| be the number of elements in |array|. 1. If |start| > |end| or |end| > |length| - 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. + 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. Let |result| be the empty string. 1. Let |i| be |start|. 1. While |i| < |end|: @@ -1893,12 +1893,12 @@ Note: This function only takes a mutable i16 array defined in its own recursion When this builtin is invoked with parameters |string|, |array|, and |start|, the following steps must be run: 1. If |array| is null - 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. + 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. Let |string| be [=?=] [$UnwrapString$](|string|). 1. Let |stringLength| be the [=string/length=] of |string|. 1. Let |arrayLength| be the number of elements in |array|. 1. If |start| + |stringLength| > |arrayLength| - 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. + 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. Let |i| be 0. 1. While |i| < |stringLength|: 1. Let |charCode| be [$CharCodeAt$](|string|, |i|). @@ -1928,7 +1928,7 @@ The |funcType| of this builtin is `(func (param i32) (result externref))`. When this builtin is invoked with parameter |v|, the following steps must be run: 1. If |v| > 0x10ffff - 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. + 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. 
Return [=!=] [$Call$]([=String.fromCodePoint=], undefined, « [=ToJSValue=](|v|) »).
@@ -1943,7 +1943,7 @@ When this builtin is invoked with parameters |string| and |index|, the following 1. Let |string| be [=?=] [$UnwrapString$](|string|). 1. Let |length| be the [=string/length=] of |string|. 1. If |index| >= |length| - 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. + 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. Return [$CharCodeAt$](|string|, |index|).
@@ -1958,7 +1958,7 @@ When this builtin is invoked with parameters |string| and |index|, the following 1. Let |string| be [=?=] [$UnwrapString$](|string|). 1. Let |length| be the [=string/length=] of |string|. 1. If |index| >= |length| - 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. + 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. Return [=!=] [$Call$]([=String.prototype.codePointAt=], |string|, « [=ToJSValue=](|index|) »). @@ -2014,9 +2014,9 @@ Note: Explicitly allow null strings to be compared for equality as that is meani When this builtin is invoked with parameters |first| and |second|, the following steps must be run: 1. If |first| is not null and [=Type=](|first|) is not [=String=] - 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. + 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. If |second| is not null and [=Type=](|second|) is not [=String=] - 1. Throw a {{RuntimeError}} exception. TODO: this needs to not be catchable, like a trap. + 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. If [=!=] [=IsStrictlyEqual=](|first|, |second|) is true 1. Return 1. 1. Return 0. From 91115df2026c65ba2b9647b6eb4e68e77e812bc7 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 21 Aug 2024 09:34:51 -0500 Subject: [PATCH 56/70] Fix spec to separate builtin name used for imports and compile options --- document/js-api/index.bs | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 4cb2ef530b..6c4fdf2b72 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -434,7 +434,8 @@ To instantiate imported strings with module |module| and |importedStr 1. [=list/iterate|For each=] |builtinSetName| of |builtinSetNames|, 1. Assert: |builtinOrStringImports| does not contain |builtinSetName| 1. 
Let |exportsObject| be the result of [=instantiate a builtin set=] with |builtinSetName| - 1. [=map/set|Set=] |builtinOrStringImports|[|builtinSetName|] to |exportsObject| + 1. Let |builtinSetQualifiedName| be |builtinSetName| prefixed with "wasm:" + 1. [=map/set|Set=] |builtinOrStringImports|[|builtinSetQualifiedName|] to |exportsObject| 1. If |importedStringModule| is not null, 1. Let |exportsObject| be the result of [=instantiate imported strings=] with |module| and |importedStringModule| 1. [=map/set|Set=] |builtinOrStringImports|[|importedStringModule|] to |exportsObject| @@ -1725,7 +1726,7 @@ Note: It is not currently possible to define this behavior using Web IDL. The JS-API defines sets of builtin functions which can be imported through {{WebAssemblyCompileOptions|options}} when compiling a module. WebAssembly builtin functions mirror existing JavaScript builtins, but adapt them to be useable directly as WebAssembly functions with minimal overhead. -All builtin functions are grouped into sets. Every builtin set has a unique name that [=read the imports|is used during import lookup=]. All names are prefixed by the `wasm:` namespace (e.g. `wasm:js-string`). +All builtin functions are grouped into sets. Every builtin set has a |name| that is used in {{WebAssemblyCompileOptions}}, and a |qualified name| with a `wasm:` prefix that [=read the imports|is used during import lookup=].
To get the builtins for a builtin set with |builtinSetName|, perform the following steps: @@ -1741,7 +1742,8 @@ To find a builtin with |import| and enabled builtins |builtinSetNames 1. Let |importModuleName| be |import|[0] 1. Let |importName| be |import|[1] 1. [=list/iterate|For each=] |builtinSetName| of |builtinSetNames|, - 1. If |importModuleName| equals |builtinSetName| + 1. Let |builtinSetQualifiedName| be |builtinSetName| prefixed with "wasm:" + 1. If |importModuleName| equals |builtinSetQualifiedName| 1. Let |builtins| be the result of [=get the builtins for a builtin set=] |builtinSetName| 1. [=list/iterate|For each=] |builtin| of |builtins| 1. Let |builtinName| be |builtin|[0] @@ -1803,7 +1805,7 @@ To validate an import for builtins with |import|, enabled builtins |b

String Builtins

-String builtins adapt the interface of the [=String=] builtin object. The import name for this set is `wasm:js-string`. +String builtins adapt the interface of the [=String=] builtin object. The |name| for this set is `js-string`.

Abstract operations

From 1937e008a5010b9b5c72519b46e702f87ed0c28a Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 21 Aug 2024 09:41:57 -0500 Subject: [PATCH 57/70] Add note about patching the String builtins --- document/js-api/index.bs | 2 ++ 1 file changed, 2 insertions(+) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 6c4fdf2b72..de61dbd4f5 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1807,6 +1807,8 @@ To validate an import for builtins with |import|, enabled builtins |b String builtins adapt the interface of the [=String=] builtin object. The |name| for this set is `js-string`. +Note: The algorithms in this section refer to JS builtins defined on [=String=]. These refer to the actual builtin and do not perform a dynamic lookup on the [=String=] object. +

Abstract operations

From e1bbe79449c72084e29d0484f2704a6e4500f143 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 23 Aug 2024 09:46:39 -0500 Subject: [PATCH 58/70] Add more tests --- .../js-string/constants.tentative.any.js | 61 +++++++++++++++++++ .../js-api/js-string/imports.tentative.any.js | 26 ++++++++ 2 files changed, 87 insertions(+) create mode 100644 test/js-api/js-string/constants.tentative.any.js create mode 100644 test/js-api/js-string/imports.tentative.any.js diff --git a/test/js-api/js-string/constants.tentative.any.js b/test/js-api/js-string/constants.tentative.any.js new file mode 100644 index 0000000000..ef391a90b7 --- /dev/null +++ b/test/js-api/js-string/constants.tentative.any.js @@ -0,0 +1,61 @@ +// META: global=window,dedicatedworker,jsshell,shadowrealm +// META: script=/wasm/jsapi/wasm-module-builder.js + +// Instantiate a module with an imported global and return the global. +function instantiateImportedGlobal(module, name, type, mutable, importedStringConstants) { + let builder = new WasmModuleBuilder(); + builder.addImportedGlobal(module, name, type, mutable); + builder.addExportOfKind("global", kExternalGlobal, 0); + let bytes = builder.toBuffer(); + let mod = new WebAssembly.Module(bytes, { importedStringConstants }); + let instance = new WebAssembly.Instance(mod, {}); + return instance.exports["global"]; +} + +const badGlobalTypes = [ + [kWasmAnyRef, false], + [kWasmAnyRef, true], + [wasmRefType(kWasmAnyRef), false], + [wasmRefType(kWasmAnyRef), true], + [kWasmFuncRef, false], + [kWasmFuncRef, true], + [wasmRefType(kWasmFuncRef), false], + [wasmRefType(kWasmFuncRef), true], + [kWasmExternRef, true], + [wasmRefType(kWasmExternRef), true], +]; +for ([type, mutable] of badGlobalTypes) { + test(() => { + assert_throws_js(WebAssembly.CompileError, + () => instantiateImportedGlobal("'", "constant", type, mutable, "'"), + "type mismatch"); + }); +} + +const goodGlobalTypes = [ + [kWasmExternRef, false], + [wasmRefType(kWasmExternRef), false], +]; +const 
constants = [ + '', + '\0', + '0', + '0'.repeat(100000), + '\uD83D\uDE00', +]; +const namespaces = [ + "", + "'", + "strings" +]; + +for (let namespace of namespaces) { + for (let constant of constants) { + for ([type, mutable] of goodGlobalTypes) { + test(() => { + let result = instantiateImportedGlobal(namespace, constant, type, mutable, namespace); + assert_equals(result.value, constant); + }); + } + } +} diff --git a/test/js-api/js-string/imports.tentative.any.js b/test/js-api/js-string/imports.tentative.any.js new file mode 100644 index 0000000000..c357760bef --- /dev/null +++ b/test/js-api/js-string/imports.tentative.any.js @@ -0,0 +1,26 @@ +// META: global=window,dedicatedworker,jsshell,shadowrealm +// META: script=/wasm/jsapi/wasm-module-builder.js + +test(() => { + let builder = new WasmModuleBuilder(); + + // Import a string constant + builder.addImportedGlobal("constants", "constant", kWasmExternRef, false); + + // Import a builtin function + builder.addImport( + "wasm:js-string", + "test", + {params: [kWasmExternRef], results: [kWasmI32]}); + + let buffer = builder.toBuffer(); + let module = new WebAssembly.Module(buffer, { + builtins: ["js-string"], + importedStringConstants: "constants" + }); + let imports = WebAssembly.Module.imports(module); + + // All imports that refer to a builtin module are suppressed from import + // reflection. + assert_equals(imports.length, 0); +}); From 7851a7832b14ec0ddb5103fb003dddaec89df13f Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Fri, 23 Aug 2024 10:05:16 -0500 Subject: [PATCH 59/70] Don't validate that every builtin set option is valid --- document/js-api/index.bs | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index de61dbd4f5..6a3ae0c79b 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -433,6 +433,7 @@ To instantiate imported strings with module |module| and |importedStr 1. 
Let |builtinOrStringImports| be the ordered map « ». 1. [=list/iterate|For each=] |builtinSetName| of |builtinSetNames|, 1. Assert: |builtinOrStringImports| does not contain |builtinSetName| + 1. If |builtinSetName| does not refer to a builtin set, then [=iteration/continue=]. 1. Let |exportsObject| be the result of [=instantiate a builtin set=] with |builtinSetName| 1. Let |builtinSetQualifiedName| be |builtinSetName| prefixed with "wasm:" 1. [=map/set|Set=] |builtinOrStringImports|[|builtinSetQualifiedName|] to |exportsObject| @@ -1742,6 +1743,7 @@ To find a builtin with |import| and enabled builtins |builtinSetNames 1. Let |importModuleName| be |import|[0] 1. Let |importName| be |import|[1] 1. [=list/iterate|For each=] |builtinSetName| of |builtinSetNames|, + 1. If |builtinSetName| does not refer to a builtin set, then [=iteration/continue=]. 1. Let |builtinSetQualifiedName| be |builtinSetName| prefixed with "wasm:" 1. If |importModuleName| equals |builtinSetQualifiedName| 1. Let |builtins| be the result of [=get the builtins for a builtin set=] |builtinSetName| @@ -1756,8 +1758,6 @@ To find a builtin with |import| and enabled builtins |builtinSetNames To validate builtin set names with |builtinSetNames|, perform the following steps: 1. If |builtinSetNames| contains any duplicates, return false. -1. [=list/iterate|For each=] |builtinSetName| of |builtinSetNames|, - 1. If |builtinSetName| does not equal the name of one of the builtin sets defined in this section, return false. 1. Return false.
From ab1be326381cf7e9992f6205a404f25aad0d8e9a Mon Sep 17 00:00:00 2001 From: Adam Klein Date: Wed, 4 Sep 2024 10:43:10 -0700 Subject: [PATCH 60/70] Fix null-handling for `equals` in test and polyfill Equals specifically allows null inputs. Update the JS API tests and the polyfill to match. --- test/js-api/js-string/basic.tentative.any.js | 4 ++-- test/js-api/js-string/polyfill.js | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/test/js-api/js-string/basic.tentative.any.js b/test/js-api/js-string/basic.tentative.any.js index 6275aacd5f..de4a21c976 100644 --- a/test/js-api/js-string/basic.tentative.any.js +++ b/test/js-api/js-string/basic.tentative.any.js @@ -168,7 +168,7 @@ function assert_throws_if(func, shouldThrow, constructor) { } catch (e) { error = e; } - assert_equals(error !== null, shouldThrow); + assert_equals(error !== null, shouldThrow, "shouldThrow mismatch"); if (shouldThrow && error !== null) { assert_true(error instanceof constructor); } @@ -275,7 +275,7 @@ test(() => { builtinExports['equals'], polyfillExports['equals'], a, a - ), !isString, WebAssembly.RuntimeError); + ), a !== null && !isString, WebAssembly.RuntimeError); assert_throws_if(() => assert_same_behavior( builtinExports['compare'], diff --git a/test/js-api/js-string/polyfill.js b/test/js-api/js-string/polyfill.js index e18236899d..7a00d4285d 100644 --- a/test/js-api/js-string/polyfill.js +++ b/test/js-api/js-string/polyfill.js @@ -155,8 +155,8 @@ this.polyfillImports = { return string.substring(startIndex, endIndex); }, equals: (stringA, stringB) => { - throwIfNotString(stringA); - throwIfNotString(stringB); + if (stringA !== null) throwIfNotString(stringA); + if (stringB !== null) throwIfNotString(stringB); return stringA === stringB; }, compare: (stringA, stringB) => { From 4553771999cb239a28536f449b25a2873fddb39a Mon Sep 17 00:00:00 2001 From: Michael Ficarra Date: Fri, 8 Nov 2024 11:04:53 -0700 Subject: [PATCH 61/70] Editorial: replace Type AO with new ECMA-262 
type test convention --- document/js-api/index.bs | 32 +++++++++++++++++++++++++------- 1 file changed, 25 insertions(+), 7 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 6a3ae0c79b..a0741b624c 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -40,6 +40,24 @@ urlPrefix: https://tc39.github.io/ecma262/; spec: ECMASCRIPT text: ! text: ? text: Type; url: sec-ecmascript-data-types-and-values + url: sec-ecmascript-language-types-bigint-type + text: is a BigInt + text: is not a BigInt + url: sec-ecmascript-language-types-boolean-type + text: is a Boolean + text: is not a Boolean + url: sec-ecmascript-language-types-number-type + text: is a Number + text: is not a Number + url: sec-ecmascript-language-types-string-type + text: is a String + text: is not a String + url: sec-ecmascript-language-types-symbol-type + text: is a Symbol + text: is not a Symbol + url: sec-object-type + text: is an Object + text: is not an Object text: current Realm; url: current-realm text: ObjectCreate; url: sec-objectcreate text: CreateBuiltinFunction; url: sec-createbuiltinfunction @@ -446,7 +464,7 @@ To instantiate imported strings with module |module| and |importedStr 1. Let |o| be |builtinOrStringImports|[|moduleName|] 1. Else, 1. Let |o| be [=?=] [$Get$](|importObject|, |moduleName|). - 1. If [=Type=](|o|) is not Object, throw a {{TypeError}} exception. + 1. If |o| [=is not an Object=], throw a {{TypeError}} exception. 1. Let |v| be [=?=] [$Get$](|o|, |componentName|). 1. If |externtype| is of the form [=external-type/func=] |functype|, 1. If [$IsCallable$](|v|) is false, throw a {{LinkError}} exception. @@ -461,9 +479,9 @@ To instantiate imported strings with module |module| and |importedStr 1. If |v| [=implements=] {{Global}}, 1. Let |globaladdr| be |v|.\[[Global]]. 1. Otherwise, - 1. If |valtype| is [=i64=] and [=Type=](|v|) is not BigInt, + 1. If |valtype| is [=i64=] and |v| [=is not a BigInt=], 1. 
Throw a {{LinkError}} exception. - 1. If |valtype| is one of [=i32=], [=f32=] or [=f64=] and [=Type=](|v|) is not Number, + 1. If |valtype| is one of [=i32=], [=f32=] or [=f64=] and |v| [=is not a Number=], 1. Throw a {{LinkError}} exception. 1. If |valtype| is [=v128=], 1. Throw a {{LinkError}} exception. @@ -1814,7 +1832,7 @@ Note: The algorithms in this section refer to JS builtins defined on [=String=]. The UnwrapString(|v|) abstract operation, when invoked, performs the following steps: -1. If [=Type=](|v|) is not [=String=] +1. If |v| [=is not a String=] 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. Return |v| @@ -1856,7 +1874,7 @@ The |funcType| of this builtin is `(func (param externref) (result i32))`.
When this builtin is invoked with parameter |v|, the following steps must be run: -1. If [=Type=](|v|) is not [=String=] +1. If |v| [=is not a String=] 1. Return 0 1. Return 1 @@ -2017,9 +2035,9 @@ Note: Explicitly allow null strings to be compared for equality as that is meani When this builtin is invoked with parameters |first| and |second|, the following steps must be run: -1. If |first| is not null and [=Type=](|first|) is not [=String=] +1. If |first| is not null and |first| [=is not a String=] 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. -1. If |second| is not null and [=Type=](|second|) is not [=String=] +1. If |second| is not null and |second| [=is not a String=] 1. Throw a {{RuntimeError}} exception as if a [=trap=] was executed. 1. If [=!=] [=IsStrictlyEqual=](|first|, |second|) is true 1. Return 1. From dd601fb2f298119e1cc0f64c44cbcf45ab6d3ef7 Mon Sep 17 00:00:00 2001 From: Michael Ficarra Date: Fri, 8 Nov 2024 11:05:54 -0700 Subject: [PATCH 62/70] actually remove Type --- document/js-api/index.bs | 1 - 1 file changed, 1 deletion(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index a0741b624c..01dfc58e72 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -39,7 +39,6 @@ urlPrefix: https://tc39.github.io/ecma262/; spec: ECMASCRIPT url: sec-returnifabrupt-shorthands text: ! text: ? 
- text: Type; url: sec-ecmascript-data-types-and-values url: sec-ecmascript-language-types-bigint-type text: is a BigInt text: is not a BigInt From 6cbeb82bf66197467d61c5298d90923848f87688 Mon Sep 17 00:00:00 2001 From: Guy Bedford Date: Tue, 11 Feb 2025 13:07:18 -0800 Subject: [PATCH 63/70] support per-component polyfill fallbacks --- document/js-api/index.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 01dfc58e72..56d5d581a6 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -459,7 +459,7 @@ To instantiate imported strings with module |module| and |importedStr 1. [=map/set|Set=] |builtinOrStringImports|[|importedStringModule|] to |exportsObject| 1. Let |imports| be « ». 1. [=list/iterate|For each=] (|moduleName|, |componentName|, |externtype|) of [=module_imports=](|module|), - 1. If |builtinOrStringImports| [=map/exist|contains=] |moduleName|, + 1. If |builtinOrStringImports| [=map/exist|contains=] |moduleName| and |builtinOrStringImports|[|moduleName|] [=map/exist|contains=] |componentName|, 1. Let |o| be |builtinOrStringImports|[|moduleName|] 1. Else, 1. Let |o| be [=?=] [$Get$](|importObject|, |moduleName|). From df1f6dfd4dafd8a7f5eb7ede43f27b2e56289ed3 Mon Sep 17 00:00:00 2001 From: Guy Bedford Date: Tue, 11 Feb 2025 13:17:51 -0800 Subject: [PATCH 64/70] spec rework --- document/js-api/index.bs | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 56d5d581a6..f96f2258a3 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -459,8 +459,10 @@ To instantiate imported strings with module |module| and |importedStr 1. [=map/set|Set=] |builtinOrStringImports|[|importedStringModule|] to |exportsObject| 1. Let |imports| be « ». 1. [=list/iterate|For each=] (|moduleName|, |componentName|, |externtype|) of [=module_imports=](|module|), - 1. 
If |builtinOrStringImports| [=map/exist|contains=] |moduleName| and |builtinOrStringImports|[|moduleName|] [=map/exist|contains=] |componentName|, - 1. Let |o| be |builtinOrStringImports|[|moduleName|] + 1. If |builtinOrStringImports| [=map/exist|contains=] |moduleName|, + 1. Let |o| be |builtinOrStringImports|[|moduleName|]. + 1. If |o| [=is not an Object=] of if |o| [=map/exist|does not contain=] |componentName|, + 1. Set |o| to [=?=] [$Get$](|importObject|, |moduleName|). 1. Else, 1. Let |o| be [=?=] [$Get$](|importObject|, |moduleName|). 1. If |o| [=is not an Object=], throw a {{TypeError}} exception. From 9cdd245eb75af912dc6055fae2dec620c468f8cb Mon Sep 17 00:00:00 2001 From: Guy Bedford Date: Tue, 11 Feb 2025 13:24:15 -0800 Subject: [PATCH 65/70] typo --- document/js-api/index.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index f96f2258a3..8c8310decb 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -461,7 +461,7 @@ To instantiate imported strings with module |module| and |importedStr 1. [=list/iterate|For each=] (|moduleName|, |componentName|, |externtype|) of [=module_imports=](|module|), 1. If |builtinOrStringImports| [=map/exist|contains=] |moduleName|, 1. Let |o| be |builtinOrStringImports|[|moduleName|]. - 1. If |o| [=is not an Object=] of if |o| [=map/exist|does not contain=] |componentName|, + 1. If |o| [=is not an Object=] or if |o| [=map/exist|does not contain=] |componentName|, 1. Set |o| to [=?=] [$Get$](|importObject|, |moduleName|). 1. Else, 1. Let |o| be [=?=] [$Get$](|importObject|, |moduleName|). 
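Reviewer note on patches 63-65: the reworked lookup can be read as a per-component fallback. A minimal sketch in plain JavaScript follows, assuming `builtinOrStringImports` is modelled as a `Map` and `importObject` is an ordinary object. The helper name `resolveImportValue` is hypothetical and does not appear in the spec or in any engine API; the sketch only mirrors the prose steps and is not a normative implementation.

```javascript
// Hypothetical helper mirroring the reworked lookup steps (illustration only).
// builtinOrStringImports: Map from module name to an exports object (assumed model).
// importObject: the user-supplied import object (assumed to be an object here).
function resolveImportValue(builtinOrStringImports, importObject, moduleName, componentName) {
  const isObject = (x) =>
    (typeof x === "object" && x !== null) || typeof x === "function";
  const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

  let o;
  if (builtinOrStringImports.has(moduleName)) {
    o = builtinOrStringImports.get(moduleName);
    // Fall back to the import object when the builtin module is not an
    // object or lacks this particular component.
    if (!isObject(o) || !hasOwn(o, componentName)) {
      o = importObject[moduleName];
    }
  } else {
    o = importObject[moduleName];
  }
  if (!isObject(o)) {
    throw new TypeError(`import module "${moduleName}" is not an object`);
  }
  return o[componentName];
}
```

Under this reading, a polyfill supplied under the same module name is only consulted for components the engine does not provide, and the builtin wins whenever it exists.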
From 7d0b519ef4dd966f22d8ad6e9da798030cdc4efb Mon Sep 17 00:00:00 2001 From: autokagami Date: Thu, 22 Aug 2024 03:02:42 +0200 Subject: [PATCH 66/70] Editorial: Align with Web IDL specification --- document/js-api/index.bs | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 8c8310decb..be030155bd 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -335,11 +335,11 @@ dictionary WebAssemblyCompileOptions { [Exposed=*] namespace WebAssembly { - boolean validate(BufferSource bytes, optional WebAssemblyCompileOptions options); - Promise<Module> compile(BufferSource bytes, optional WebAssemblyCompileOptions options); + boolean validate(BufferSource bytes, optional WebAssemblyCompileOptions options = {}); + Promise<Module> compile(BufferSource bytes, optional WebAssemblyCompileOptions options = {}); Promise<WebAssemblyInstantiatedSource> instantiate( - BufferSource bytes, optional object importObject, optional WebAssemblyCompileOptions options); + BufferSource bytes, optional object importObject, optional WebAssemblyCompileOptions options = {}); Promise<Instance> instantiate( Module moduleObject, optional object importObject); @@ -655,7 +655,7 @@ dictionary ModuleImportDescriptor { [LegacyNamespace=WebAssembly, Exposed=*] interface Module { - constructor(BufferSource bytes, optional WebAssemblyCompileOptions options); + constructor(BufferSource bytes, optional WebAssemblyCompileOptions options = {}); static sequence<ModuleExportDescriptor> exports(Module moduleObject); static sequence<ModuleImportDescriptor> imports(Module moduleObject); static sequence<ArrayBuffer> customSections(Module moduleObject, DOMString sectionName); From 753b7a159451c1292abff98f97b60eaa1240d72d Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 25 Jun 2025 15:48:51 -0500 Subject: [PATCH 67/70] [js-api] Fix incorrect return statement in 'validate builtin set names' --- document/js-api/index.bs | 2 
+- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index be030155bd..89a6789ea1 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -1777,7 +1777,7 @@ To find a builtin with |import| and enabled builtins |builtinSetNames To validate builtin set names with |builtinSetNames|, perform the following steps: 1. If |builtinSetNames| contains any duplicates, return false. -1. Return false. +1. Return true.
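Reviewer note on patches 59 and 67: taken together, validation now rejects only duplicate names, and unrecognized builtin set names are skipped later rather than treated as errors. A rough sketch of that behavior in JavaScript is below. The function names and the `KNOWN_BUILTIN_SETS` constant are hypothetical; the only set name taken from this series is `"js-string"`, which the tests use.

```javascript
// Hypothetical list of builtin sets the host knows about (illustration only).
const KNOWN_BUILTIN_SETS = new Set(["js-string"]);

// Mirrors "validate builtin set names": false on duplicates, otherwise true.
function validateBuiltinSetNames(builtinSetNames) {
  return new Set(builtinSetNames).size === builtinSetNames.length;
}

// Mirrors the "does not refer to a builtin set, then continue" steps:
// unknown names are ignored instead of failing validation.
function enabledBuiltinSets(builtinSetNames) {
  return builtinSetNames.filter((name) => KNOWN_BUILTIN_SETS.has(name));
}
```

One consequence of this design, as far as the patches suggest, is forward compatibility: a module compiled with a set name that a given engine does not recognize still validates, and its imports for that set are resolved from the import object instead.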
From 70a3869f42809d818cf8229650eba07c6efdce8b Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 25 Jun 2025 15:52:13 -0500 Subject: [PATCH 68/70] [js-api] Remove copy-pasted if statements --- document/js-api/index.bs | 2 -- 1 file changed, 2 deletions(-) diff --git a/document/js-api/index.bs b/document/js-api/index.bs index 89a6789ea1..1ecfaa5263 100644 --- a/document/js-api/index.bs +++ b/document/js-api/index.bs @@ -434,7 +434,6 @@ A {{Module}} object represents a single WebAssembly module. Each {{Module}} obje To instantiate imported strings with module |module| and |importedStringModule|, perform the following steps: 1. Assert: |importedStringModule| is not null. 1. Let |exportsObject| be [=!=] [$OrdinaryObjectCreate$](null). - 1. If |externtype| is of the form [=external-type/func=] functype, 1. [=list/iterate|For each=] (|moduleName|, |componentName|, |externtype|) of [=module_imports=](|module|), 1. If |moduleName| does not equal |importedStringModule|, then [=iteration/continue=]. 1. Let |stringConstant| be |componentName|. @@ -1797,7 +1796,6 @@ To instantiate a builtin set with name |builtinSetName|, perform the 1. Let |builtins| be the result of [=get the builtins for a builtin set=] |builtinSetName| 1. Let |exportsObject| be [=!=] [$OrdinaryObjectCreate$](null). - 1. If |externtype| is of the form [=external-type/func=] functype, 1. [=list/iterate|For each=] (|name|, |funcType|, |steps|) of |builtins|, 1. Let |funcaddr| be the result fo [=create a builtin function=] with |funcType| and |steps| 1. Let |func| be the result of creating [=a new Exported Function=] from |funcaddr|. 
From b0816d4deaf7387f778481ac74329b5e07032de5 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Wed, 25 Jun 2025 17:11:06 -0500 Subject: [PATCH 69/70] Revert changes to conf and README to prepare for merging --- README.md | 12 ------------ document/core/conf.py | 4 ++-- 2 files changed, 2 insertions(+), 14 deletions(-) diff --git a/README.md b/README.md index cc93baefc3..cbc17b462f 100644 --- a/README.md +++ b/README.md @@ -1,18 +1,6 @@ [![CI for specs](https://github.com/WebAssembly/spec/actions/workflows/ci-spec.yml/badge.svg)](https://github.com/WebAssembly/spec/actions/workflows/ci-spec.yml) [![CI for interpreter & tests](https://github.com/WebAssembly/spec/actions/workflows/ci-interpreter.yml/badge.svg)](https://github.com/WebAssembly/spec/actions/workflows/ci-interpreter.yml) -# JS String Builtins Proposal for WebAssembly - -This repository is a clone of [github.com/WebAssembly/spec/](https://github.com/WebAssembly/spec/). -It is meant for discussion, prototype specification and implementation of a proposal to -add support for efficient access to JS string operations to WebAssembly. - -* See the [overview](proposals/js-string-builtins/Overview.md) for a summary of the proposal. - -* See the [modified spec](https://webassembly.github.io/js-string-builtins/) for details. - -Original `README` from upstream repository follows... 
- # spec This repository holds the sources for the WebAssembly specification, diff --git a/document/core/conf.py b/document/core/conf.py index f021479776..cc2f6bb484 100644 --- a/document/core/conf.py +++ b/document/core/conf.py @@ -66,10 +66,10 @@ logo = 'static/webassembly.png' # The name of the GitHub repository this resides in -repo = 'js-string-builtins' +repo = 'spec' # The name of the proposal it represents, if any -proposal = 'js-string-builtins' +proposal = '' # The draft version string (clear out for release cuts) draft = ' (Draft ' + date.today().strftime("%Y-%m-%d") + ')' From 48e869aa2bd8b2f568ef09eb7e980cf839dca432 Mon Sep 17 00:00:00 2001 From: Ryan Hunt Date: Thu, 26 Jun 2025 09:43:28 -0500 Subject: [PATCH 70/70] [tests] Remove tentative from js-string tests --- test/js-api/js-string/{basic.tentative.any.js => basic.any.js} | 0 .../js-string/{constants.tentative.any.js => constants.any.js} | 0 .../js-api/js-string/{imports.tentative.any.js => imports.any.js} | 0 3 files changed, 0 insertions(+), 0 deletions(-) rename test/js-api/js-string/{basic.tentative.any.js => basic.any.js} (100%) rename test/js-api/js-string/{constants.tentative.any.js => constants.any.js} (100%) rename test/js-api/js-string/{imports.tentative.any.js => imports.any.js} (100%) diff --git a/test/js-api/js-string/basic.tentative.any.js b/test/js-api/js-string/basic.any.js similarity index 100% rename from test/js-api/js-string/basic.tentative.any.js rename to test/js-api/js-string/basic.any.js diff --git a/test/js-api/js-string/constants.tentative.any.js b/test/js-api/js-string/constants.any.js similarity index 100% rename from test/js-api/js-string/constants.tentative.any.js rename to test/js-api/js-string/constants.any.js diff --git a/test/js-api/js-string/imports.tentative.any.js b/test/js-api/js-string/imports.any.js similarity index 100% rename from test/js-api/js-string/imports.tentative.any.js rename to test/js-api/js-string/imports.any.js