This post is a summary of my findings while working on the open source project xtuc/js-webassembly-interpreter1.
This is in contrast to statically typed languages like C, C++ or Java. Everything is abstracted as a single primitive type: number.
```javascript
// So simple, I love it
typeof 7        // "number"
typeof .3e-10   // "number"
typeof 0x234    // "number"

// Okay I can live with this
typeof Infinity // "number"

// Now wtf is this
typeof NaN      // "number"
```
This post is not about the typeof operator though.
I think it makes perfect sense that “Not-a-Number” is a number and we might see why in a later post.
The takeaway here is that all these values are of the same primitive type.
- Integers and floats in bits
- How are floating point numbers represented?
- What integers exist in 64-bit floating point?
- Not all is lost
Integers and floats in bits
For example, 1011₂ = 1·2³ + 0·2² + 1·2¹ + 1·2⁰ = 11₁₀. I used subscripts to indicate that the result is to be read as the number eleven (base 10), but the input as a binary string. As you can see, each bit can add a power of two corresponding to its position in the bit string. If you find this confusing, think about how we write down numbers and determine their values. You will see that you do this every day, but with powers of ten instead of two.
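You can check this interpretation directly in JS: `parseInt` with a radix of 2 reads a string as binary, and `toString(2)` goes the other way.

```javascript
// Read "1011" as a binary string and convert a number back to binary.
console.log(parseInt("1011", 2)); // 11
console.log((11).toString(2));    // "1011"
```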
The representation above is simple and unique for all natural numbers. That is great, but we might want negative numbers too. The simplest solution would be to store one additional bit and let it decide about the sign of the value. That is called sign-magnitude representation and can be done, but it turns out that it makes hardware operating on those bit strings unnecessarily complicated. That is why a different representation called two’s complement is used. If you’re interested in how that works, I have written about it before2 but its details are not necessary to understand integers in JS.
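JS's typed arrays make this concrete without any of the details: the same byte can be read as an unsigned value or as a two's complement signed value.

```javascript
// One byte, two interpretations: unsigned vs. two's complement signed.
const bytes = new Uint8Array([0b11111111]);  // all eight bits set
console.log(bytes[0]);                       // 255 as an unsigned byte
console.log(new Int8Array(bytes.buffer)[0]); // -1 in two's complement
```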
Assume we have a way to represent all integers, positive and negative.
That is still not good enough, because the real world deals with real numbers (well, not really, if you're into math). We definitely want values like 94.53471 to work with data in any meaningful way.
These numbers are called floating point numbers. Two's complement cannot handle them, but there is a widely accepted standard called IEEE 754 which describes how to represent floating point numbers with bit strings.
To summarize: A sequence of bits is not a number. Different ways of interpreting the same bit string can yield different values. The rules of interpretation are defined by the type associated to the value. Conversely, the same value can be represented by a different bit string depending on its type.
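A quick sketch of this in JS itself, using a `DataView` over an `ArrayBuffer` to reinterpret raw bytes: the same eight bytes yield different values depending on the type we read them as.

```javascript
// Write 8 bytes as a 64-bit float, then reread the same bytes as an integer.
const buf = new ArrayBuffer(8);
const view = new DataView(buf);

view.setFloat64(0, 1.5);
console.log(view.getFloat64(0));   // 1.5
console.log(view.getBigUint64(0)); // 4609434218613702656n — same bits, different type
```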
“The Number type has exactly 18437736874454810627 (that is, 2⁶⁴ − 2⁵³ + 3) values, representing the double-precision 64-bit format IEEE 754-2008 values as specified in the IEEE Standard for Binary Floating-Point Arithmetic.”
Your reaction to this statement depends mainly on your background: if you come from a place where bits and bytes are too close to the metal to be interesting, you might consider this fact perfectly reasonable. Floating point can do everything we need, so why not have one nice abstraction for it and never worry about tricky conversion rules between number types? However, if you program in C or other languages that sit just above the assembly level and deal more directly with memory, you probably think that this is just crazy. I personally do not think that this design is particularly good or bad, but I want to understand its consequences.
What range of integers can be encoded using 64-bit floating point representation?
But it is also clear that there are fewer integers than with a 64-bit two's complement representation, which is used for large integers in C, for example.
One bit string can only represent one number after all.
So the fact that 0.5 exists in JS means that there is at least one 64-bit integer that is missing in JS.
Don’t think about this too hard; we will count all quantities properly in the next paragraph.
How are floating point numbers represented?
In order to understand what integers exist in floating point, we need a clear understanding of how floating point numbers are encoded. Once we know the meaning of each of the 64 bits, we can look at which of these combinations represent integers. A floating point number is represented by three parts:
- A sign bit s indicating whether the number is positive or negative.
- A number m between 1 and 2 called the mantissa or significand.
- An integer e called the exponent.
The idea behind these numbers is easy to explain: the mantissa is some decimal, e.g. 1.625, and the exponent moves the decimal point to the left or right in that number. The sign bit can be used to add a minus sign in front of it. The advantage of this representation is that it allows us to go from very small to very large values without losing precision. If you want to dig further into this, I suggest you compare this representation to fixed point arithmetic4.
Now that we understand the idea, it is time to look at it bit for bit. Above I cheated by saying how we interpret the three numbers but not how they are encoded in bits. According to the standard, one bit is used for s, 11 bits for e and 52 bits for m, which adds up to the total of 64 bits we want. The sign bit is the easiest: 0 means positive and 1 means negative. The exponent is an integer, so we can use 11-bit two's complement to encode it as a bit string5. The mantissa is between 1 and 2, so we use a simple trick to store it: ignore the leading "1." and just store the rest as a binary fraction. A binary fraction works just like a decimal fraction: 0.101₂ = 1·2⁻¹ + 0·2⁻² + 1·2⁻³ = 0.625₁₀.
Again, note how a sequence of bits is not a number but can be interpreted as different numbers. The exponent moves the point in the binary fraction, not the decimal one, which is different. So if we have the mantissa 101 as above and exponent 3, the represented value is 1101₂ = 13. Positive exponents shift the point to the right and negative ones to the left. In case you are wondering how the value 0.625 relates to 13: with the implicit leading 1 this value represents the mantissa 1.101₂ = 1.625. Shifting the point 3 places to the right is the same as multiplying by 2³ = 8, yielding a result of 13.
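We can pull these three parts out of a real 64-bit float in JS itself, again with a `DataView`. One caveat: the actual standard stores the exponent biased by 1023 rather than in plain two's complement, so this sketch subtracts the bias to recover e.

```javascript
// Decode the sign, exponent, and mantissa bits of 13 as an IEEE 754 double.
const buf = new ArrayBuffer(8);
const view = new DataView(buf);
view.setFloat64(0, 13);

const bits = view.getBigUint64(0);
const sign = bits >> 63n;                          // 1 sign bit
const exponent = ((bits >> 52n) & 0x7FFn) - 1023n; // 11 exponent bits, bias removed
const mantissa = bits & 0xFFFFFFFFFFFFFn;          // 52 mantissa bits

console.log(sign);                 // 0n -> positive
console.log(exponent);             // 3n
console.log(mantissa.toString(2)); // "101" followed by 49 zeros, i.e. .101₂
```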
What integers exist in 64-bit floating point?
So we found the representation of the integer 13. What is the condition for the result to become an integer? Any binary digit after the fractional point has to be zero. As we have seen, the binary representation of the mantissa always starts with "1." followed by 52 digits. So if the binary representation of a natural number has length l, we can represent that number in floating point by setting the mantissa bits to its digits after the leading 1 and the exponent to l − 1. This is exactly what we did for 13 above. The exact same procedure works as long as we manage to shift all digits to the left of the decimal point, so this works for natural numbers whose binary representations are up to 53 digits long.
The maximum integer in 53-bit binary is 2⁵³ − 1 = 9007199254740991, so with the method above, including the sign bit, we can represent all integers from −(2⁵³ − 1) to 2⁵³ − 1.
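JS exposes this bound directly as `Number.MAX_SAFE_INTEGER`, and `Number.isSafeInteger` checks whether a value lies in this range.

```javascript
// The largest integer n such that n and n + 1 are both exactly representable.
console.log(Number.MAX_SAFE_INTEGER === 2 ** 53 - 1); // true
console.log(Number.isSafeInteger(2 ** 53 - 1));       // true
console.log(Number.isSafeInteger(2 ** 53));           // false
```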
This includes the range of 32-bit signed integers, which goes from −2³¹ to 2³¹ − 1.
Good news: it means that the range is not smaller than working with ints in C, for example.
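Incidentally, JS's bitwise operators already coerce their operands to 32-bit two's complement integers internally, which makes the 32-bit range easy to observe.

```javascript
// Bitwise OR with 0 coerces a number to a 32-bit signed integer.
console.log((2 ** 31 - 1) | 0); // 2147483647  — fits, unchanged
console.log((2 ** 31) | 0);     // -2147483648 — wraps past the 32-bit maximum
```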
If you have a feeling for exponentials or powers of two, you can see that 64-bit floating point in fact has a lot more integers than a 32-bit integer representation.
But then it also has a considerably smaller range of integers than 64-bit integer representation6.
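Past 2⁵³ the representable integers start to have gaps, which you can observe directly: adding 1 can silently round back down.

```javascript
// Past 2^53 only every second integer is representable.
console.log(2 ** 53);     // 9007199254740992
console.log(2 ** 53 + 1); // 9007199254740992 — rounds back down, a gap
console.log(2 ** 53 + 2); // 9007199254740994 — representable again
```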
Not all is lost
I have not shown all integers here. For example, 2⁵³ can be represented by setting the mantissa bits to all zeros and the exponent to 53. But there are "holes" once you go past the range described, so these values are less useful to work with.