IEEE-754 Functions
New in version 0.12.
IEEE-754 is a standard for the representation of and computations with floating point numbers in binary systems. It is widely used by floating point implementations in CPUs. These functions implement encoding and decoding binary representations of floating point numbers according to IEEE-754.
An IEEE-754 binary float consists of three parts: a sign bit, the exponent, and the significand
(sometimes called the mantissa). For normal numbers, the value is calculated from these parts using the
following formula: (-1) ^ sign * 2 ^ (exponent - bias) * 1.significand. The standard defines
multiple binary formats of different sizes that all follow these rules but differ in
the number of bits allocated for the exponent and significand. The bias for the default
formats is defined as bias = (2 ^ (exponent_bits - 1)) - 1.
See this article for a more detailed introduction to the subject.
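To make the formula concrete, the following Python sketch applies it to a single binary32 bit pattern and compares the result with Python's own `struct` module. The helper name `decode_normal` is hypothetical and not part of the functions documented here; it assumes the input is an integer bit pattern and only handles normal numbers (no zeros, subnormals, infinities, or NaNs).

```python
import struct


def decode_normal(bits: int, exponent_bits: int, significand_bits: int) -> float:
    """Apply (-1)^sign * 2^(exponent - bias) * 1.significand to a normal number.

    Hypothetical helper for illustration; special values are not handled.
    """
    bias = (1 << (exponent_bits - 1)) - 1
    sign = bits >> (exponent_bits + significand_bits)
    exponent = (bits >> significand_bits) & ((1 << exponent_bits) - 1)
    fraction = bits & ((1 << significand_bits) - 1)
    # "1.significand": the stored bits are the fractional part after an implicit 1.
    significand = 1 + fraction / (1 << significand_bits)
    return (-1) ** sign * 2.0 ** (exponent - bias) * significand


# 0xC0200000 is a binary32 pattern: sign = 1, exponent = 128, fraction = 0x200000.
pattern = 0xC0200000
value = decode_normal(pattern, 8, 23)
reference = struct.unpack(">f", pattern.to_bytes(4, "big"))[0]
print(value, reference)  # both are -2.5
```

Here the exponent field 128 minus the bias 127 gives 2 ^ 1, and the fraction 0x200000 / 2 ^ 23 gives 0.25, so the value is -1 * 2 * 1.25.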
The following binary float formats are defined by the standard:
| Name | Also known as | Exponent bits | Significand bits | 
|---|---|---|---|
| binary16 | Half precision | 5 | 10 | 
| binary32 | Single precision | 8 | 23 | 
| binary64 | Double precision | 11 | 52 | 
| binary128 | Quad precision | 15 | 112 | 
In many programming languages, the binary32 format is available as float
and binary64 is available as double.
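As a quick cross-check of the table, the total width of each format is one sign bit plus the exponent and significand bits, and the default bias follows from the exponent width. The Python sketch below only restates the table; the names `FORMATS`, `total_bits`, and `default_bias` are chosen for illustration.

```python
# (exponent_bits, significand_bits) for each format, copied from the table above.
FORMATS = {
    "binary16": (5, 10),
    "binary32": (8, 23),
    "binary64": (11, 52),
    "binary128": (15, 112),
}


def total_bits(exponent_bits: int, significand_bits: int) -> int:
    """One sign bit plus the exponent and significand fields."""
    return 1 + exponent_bits + significand_bits


def default_bias(exponent_bits: int) -> int:
    """bias = (2 ^ (exponent_bits - 1)) - 1"""
    return (1 << (exponent_bits - 1)) - 1


for name, (e, s) in FORMATS.items():
    print(f"{name}: {total_bits(e, s)} bits, bias {default_bias(e)}")
```

The widths come out as 16, 32, 64, and 128 bits, with biases 15, 127, 1023, and 16383.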
ieee754_encode(x; exponent_bits; significand_bits[; exponent_bias])
- Encode a floating point number into an IEEE-754 binary representation.
- Parameters:
  - x – The floating point value to encode.
  - exponent_bits – The length of the exponent part, in bits.
  - significand_bits – The length of the significand part, in bits.
  - exponent_bias – The exponent bias to use. Derived from the length of the exponent if not specified.
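For readers who want to see what such an encoding involves, here is a minimal Python sketch of a generic encoder. It is not the implementation behind `ieee754_encode`; the function name `encode_sketch`, the integer return type, the round-to-nearest-even behaviour, and the choice of quiet NaN pattern are all assumptions made for illustration.

```python
import math
from fractions import Fraction


def encode_sketch(x: float, exponent_bits: int, significand_bits: int,
                  exponent_bias=None) -> int:
    """Return an integer bit pattern for x (hypothetical helper, assumptions above)."""
    if exponent_bias is None:
        exponent_bias = (1 << (exponent_bits - 1)) - 1
    max_exponent = (1 << exponent_bits) - 1  # all-ones field: infinity or NaN
    sign = 1 if math.copysign(1.0, x) < 0 else 0
    sign_field = sign << (exponent_bits + significand_bits)
    infinity = sign_field | (max_exponent << significand_bits)

    if math.isnan(x):
        # Assumed NaN pattern: exponent all ones, top fraction bit set.
        return infinity | (1 << (significand_bits - 1))
    if math.isinf(x):
        return infinity
    if x == 0:
        return sign_field

    # abs(x) = m * 2**exp with 0.5 <= m < 1, so 2**(exp - 1) <= abs(x) < 2**exp.
    _, exp = math.frexp(abs(x))
    # Values below the normal range share the fixed exponent 1 - bias (subnormals).
    e = max(exp - 1, 1 - exponent_bias)
    # Scale so the significand, including its leading bit, becomes an integer.
    scaled = Fraction(abs(x)) * Fraction(2) ** (significand_bits - e)
    q = round(scaled)  # rounding a Fraction rounds halves to even
    # Adding q lets the leading 1 (or a rounding carry) flow into the exponent field.
    bits = ((e + exponent_bias - 1) << significand_bits) + q
    if (bits >> significand_bits) >= max_exponent:
        return infinity  # too large for this format
    return sign_field | bits


print(hex(encode_sketch(1.0, 8, 23)))   # 0x3f800000
print(hex(encode_sketch(-2.5, 8, 23)))  # 0xc0200000
print(hex(encode_sketch(1.0, 5, 10)))   # 0x3c00
```

Using exact `Fraction` arithmetic keeps the rounding step independent of the host's floating point precision, at the cost of speed.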
 
ieee754_decode(x; exponent_bits; significand_bits[; exponent_bias])
- Calculate the value of an IEEE-754 binary float.
- Parameters:
  - x – The binary float to decode.
  - exponent_bits – The length of the exponent part, in bits.
  - significand_bits – The length of the significand part, in bits.
  - exponent_bias – The exponent bias to use. Derived from the length of the exponent if not specified.
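The decoding direction can be sketched in Python as well. As before, this is an illustration under stated assumptions rather than the actual implementation of `ieee754_decode`: the name `decode_sketch` is hypothetical, the input is assumed to be an integer bit pattern, and the result is a Python float, so wide formats such as binary128 lose precision or overflow when the value does not fit a double.

```python
import math


def decode_sketch(bits: int, exponent_bits: int, significand_bits: int,
                  exponent_bias=None) -> float:
    """Return the value of an integer bit pattern (hypothetical helper)."""
    if exponent_bias is None:
        exponent_bias = (1 << (exponent_bits - 1)) - 1
    max_exponent = (1 << exponent_bits) - 1
    sign = (bits >> (exponent_bits + significand_bits)) & 1
    exponent = (bits >> significand_bits) & max_exponent
    fraction = bits & ((1 << significand_bits) - 1)
    factor = -1.0 if sign else 1.0

    if exponent == max_exponent:
        # All-ones exponent: infinity if the fraction is zero, NaN otherwise.
        return math.nan if fraction else factor * math.inf
    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, exponent fixed at 1 - bias.
        return factor * math.ldexp(fraction, 1 - exponent_bias - significand_bits)
    # Normal number: restore the implicit leading 1 before scaling.
    significand = fraction + (1 << significand_bits)
    return factor * math.ldexp(significand, exponent - exponent_bias - significand_bits)


print(decode_sketch(0x3F800000, 8, 23))  # 1.0
print(decode_sketch(0xC0200000, 8, 23))  # -2.5
print(decode_sketch(0x7C00, 5, 10))      # inf
```

Treating the significand as an integer and scaling it once with `math.ldexp` avoids building the fractional part through repeated division.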
 
ieee754_half_encode(x)
- Encode x in the half-precision binary format.

ieee754_half_decode(x)
- Decode the half-precision binary float x.

ieee754_single_encode(x)
- Encode x in the single-precision binary format.

ieee754_single_decode(x)
- Decode the single-precision binary float x.

ieee754_double_encode(x)
- Encode x in the double-precision binary format.

ieee754_double_decode(x)
- Decode the double-precision binary float x.

ieee754_quad_encode(x)
- Encode x in the quad-precision binary format.

ieee754_quad_decode(x)
- Decode the quad-precision binary float x.
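For comparison, the half, single, and double formats can be reproduced in Python with the standard `struct` module, which is handy for checking expected bit patterns. The helper names `encode_fixed` and `decode_fixed` are hypothetical, and representing the encoded value as an integer is an assumption. `struct` has no format code for quad precision, so that format is left out.

```python
import struct

# struct format code and byte width per precision name.
_CODES = {"half": ("e", 2), "single": ("f", 4), "double": ("d", 8)}


def encode_fixed(precision: str, x: float) -> int:
    """Pack x in the given precision and return the bits as an integer."""
    code, _ = _CODES[precision]
    return int.from_bytes(struct.pack(">" + code, x), "big")


def decode_fixed(precision: str, bits: int) -> float:
    """Interpret an integer bit pattern in the given precision."""
    code, width = _CODES[precision]
    return struct.unpack(">" + code, bits.to_bytes(width, "big"))[0]


print(hex(encode_fixed("half", 1.0)))      # 0x3c00
print(hex(encode_fixed("single", 1.0)))    # 0x3f800000
print(hex(encode_fixed("double", 1.0)))    # 0x3ff0000000000000
print(decode_fixed("single", 0xC0200000))  # -2.5
```

Note that `struct.pack` raises `OverflowError` for finite values too large for the half or single format, which may differ from how the functions documented here behave.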