Make generated JavaScript Inline Caching friendly using types in BuckleScript version 8
An overview of new data representation in BuckleScript
In the next version of BuckleScript, we will make several major changes to tweak the data representation for various data types, making them more idiomatic and debugger friendly.
Note: since V8 or other JavaScript engines are tweaked to make idiomatic JS code run fast, these changes also results in faster running code.
Another property, for a compiled language like BuckleScript, is that we can reason about IC friendliness by just looking at the type definitions locally; this is very helpful for advanced users to write performance predictable JS code.
This is a nice introduction to inline caching if you are unfamiliar with this topic, or you can skip the section about IC and come back later when you get more familiar.
Note: this article is quite dense, so we will skip the old encoding.
Record (stable)
BuckleScript has compiled records to idiomatic JS objects since version 7. This is great for performance and debugging. We also support label renaming to shorten field names to save space.
Take the code below for example:
REtype int64 = {
loBits : int [@bs.as "lo"],
hiBits : int [@bs.as "hi]
}
let value = {hiBits : 33 , loBits : 32 };
let rand = ({loBits; hiBits}) => loBits + hiBits;
It will generate JS output as below:
JSvar value = {lo : 32, hi : 33}
function rand (param){
return param.lo + param.hi
}
If users want to make it even shorter, in next version, they can choose to compile record as an array, as follows:
REtype int64 = {
[@bs.as "0"] loBits : int,
[@bs.as "1"] hiBits : int
}
let value = {hiBits : 33 , loBits : 32 };
let rand = ({loBits, hiBits}) => loBits + hiBits;
This produces the following JS output:
JSvar value = [32,33]
function rand(param){
return param[0] + param[1]
}
The label renaming techniques can be applied systematically using a syntactic macro; in the future we may provide an advanced mode to apply it automatically. Another nice property is that only the type definition needs to be adapted; other parts of code remain untouched.
IC friendliness
Records are always in a perfect position for Inline Caching(IC); the compiler can ensure all generated records are of the same shape.
Variant (internal)
This encoding for variants may be subject to change in the future, but it is so simple that it makes sense for users to have a basic understanding.
Take the following type definition, for example:
REtype t =
| Black(t, int, t)
| Red(t, int, t)
| Empty;
let empty = Empty ;
let v0 = Black (empty, 3, empty);
let v1 = Red (empty, 3, empty);
The generated JS code would be:
JSvar empty =/*Empty*/ 0;
var v0 = {TAG : 0/*Black*/, _0 : /*Empty*/ 0 , _1 : 3 , _2 : /*Empty */ 0};
var v1 = {TAG : 1/*Red*/, _0 : /*Empty*/ 0 , _1 : 3 , _2 : /*Empty */ 0};
As you can see, variants are divided into two categories.
Variants which do not have a payload are compiled into a number starting from 0,
while variants which have a payload are compiled into an object which has the first slot named TAG
and the following slots named as _0
, _1
..
variant with inline records
Users can give names to the payload, and the compiler respects those names. However, we don't support user-level renaming, i.e, using bs.as
, at this time.
REtype t =
| Black ({l:t, value: int, r: t})
| Red({l:t, value: int, r: t})
| Empty;
let empty = Empty ;
let v0 = Black ({l:empty, value: 3, r:empty});
let v1 = Red ({l:empty, value:3, r: empty});
The generated JS code would be:
JSvar empty =/*Empty*/ 0;
var v0 = {TAG : 0/*Black*/, l: /*Empty*/ 0 , value : 3 , r: : /*Empty */ 0};
var v1 = {TAG : 1/*Red*/, l : /*Empty*/ 0 , value : 3 , r : /*Empty */ 0};
Special case when the number of variants which has payload is only 1.
Take the types below for example:
REtype list =
| Nil
| Cons (int , list);
Since only one variant has a payload, the compiler does not need add TAG
when we destructure the data for pattern matching.
Thus, the following code:
RElet u = Cons(1,Nil)
will generate the following JS output:
JSvar u = {_0: 1, _1 : /*Nil*/ 0 }; // No TAG data.
Specialized for immutable list
The list
type is a built-in type; its type definition is similar to this :
REtype t ('a) =
| []
| (::) ('a , t ('a))
Without any customization, this would generate JS objects with indexes like _0
, _1
, etc.
Since lists are so pervasive, we provide some special treatment so that
RElet u = [0,1,2,3]
Will generate js code as below:
JSvar u = {hd : 0, {tl : {hd : 1, {tl : {hd: 2, {tl : {hd :3 , tl : /*[]*/0 }}}}}}}
This is a minor change; we changed the name of _0
to hd
and _1
to tl
.
IC friendliness
Types that have only one variant with a payload are in a perfect position for IC.
The number of variants which does not carry payload will not affect IC, since the pattern match will do a split first.
Types with variants that have the same number of payloads, such as the red-black-tree example above, are also in a perfect position for IC.
For other cases, it will hit a polymorphic IC in the V8 jit compiler, this is not the fastest running case.
Note that you can always tweak the variant layout to make it IC friendly. For example, you can always introduce one level of indirection to make all variants share the same number of payloads:
REtype t =
| A0 (a0) // 1 payload
| A1 (a1) // 1 payload
| A2 (a2) // 1 payload
| C0
| C1
| C2 // This will not affect IC
Variant in debug mode
Note we only generate constructor names in comments for debugging.
When constructor names are attached to the data, it will be more useful for debugging. When debug mode is activated using -bs-g
,
The generated code will be changed from below
JSvar v0 = {TAG : 0/*Black*/, _0 : /*Empty*/ 0 , _1 : 3 , _2 : /*Empty */ 0};
to
JSvar v0 = {TAG : 0, _0 : /*Empty*/ 0 , _1 : 3 , _2 : /*Empty */ 0, [Symbol.for("name")]: "Black"};
Polymorphic-variant (internal)
Polymorphic variants allow users to use the types without declaring them first:
RElet u = 3 -> `hello
It will generate
JSvar u = {HASH : MAGIC_NUMBER, VAL: 3 }
The field of HASH
is the hash of name "hello"
, while the VAL
is the payload
IC friendliness
Polymorphic variants are always in a perfect position for IC, the compiler can ensure all generated objects are of the same shape. This is because the payload is not unpacked; it is always just one payload.
Polymorphic variant in debug mode
In debug mode, similar to variant, we carry the name in generated code for debugging,
So instead of
JSvar u = {HASH : MAGIC_NUMBER, VAL: 3 }
It will generate
JSvar u = {HASH : MAGIC_NUMBER, VAL: 3 , [Symbol.for("name")] : "hello"}
Conclusion
BuckleScript users will profit from better runtime performance and a better debugging experience for various data types, such as variants, exceptions, lazy values and more since version 8.0.
Happy Hacking!