Class Float128

  • All Implemented Interfaces:
    Serializable

    public class Float128
    extends Number
    Float128 stores immutable floating-point values with a 16-bit signed exponent, a 120-bit fraction, and one sign bit. It provides arithmetic for addition, subtraction, multiplication, and division, as well as several Math-style operations such as signum and abs. The fraction follows the IEEE-754 convention, which means that the leading '1' is implicit and is not stored in the fraction.

    Copyright (c) 2020-2021 Delft University of Technology, Jaffalaan 5, 2628 BX Delft, the Netherlands. All rights reserved. For project information, see https://djutils.org. The DJUTILS project is distributed under a three-clause BSD-style license, which can be found at https://djutils.org/docs/license.html.

    Author:
    Alexander Verbraeck, Peter Knoppers
    See Also:
    Serialized Form
    • Constructor Detail

      • Float128

        public Float128​(double d)
        Create a Float128 based on a double. The IEEE-754 double is built up as follows:
        • bit 63 [0x8000_0000_0000_0000L]: sign bit (1-bit)
        • bits 62-52 [0x7ff0_0000_0000_0000L]: exponent (11-bit), stored as the power-of-2 exponent value + 1023.
        • - exponent bits all 0 and fraction == 0: signed zero
        • - exponent bits all 0 and fraction != 0: subnormal value (underflow)
        • - exponent bits all 1 and fraction == 0: infinity
        • - exponent bits all 1 and fraction != 0: NaN
        • bits 51-0 [0x000f_ffff_ffff_ffffL]: fraction (52-bit)
        Parameters:
        d - double; the double to store
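        The bit layout listed above can be inspected with the standard library. The following is a minimal sketch, not the Float128 implementation: the class DoubleBits and its method names are hypothetical and only illustrate how the three masks separate the fields of a double.

```java
/** Hypothetical helper, not part of Float128: splits a double into its IEEE-754 fields. */
final class DoubleBits {
    static final long SIGN_MASK = 0x8000_0000_0000_0000L;
    static final long EXPONENT_MASK = 0x7ff0_0000_0000_0000L;
    static final long FRACTION_MASK = 0x000f_ffff_ffff_ffffL;

    /** True when bit 63 is set; this also holds for -0.0. */
    static boolean isNegative(double d) {
        return (Double.doubleToRawLongBits(d) & SIGN_MASK) != 0;
    }

    /** The raw 11-bit exponent field (bits 62-52), in the range 0..2047. */
    static int exponentField(double d) {
        return (int) ((Double.doubleToRawLongBits(d) & EXPONENT_MASK) >>> 52);
    }

    /** The 52 stored fraction bits (bits 51-0), without the implicit leading 1. */
    static long fraction(double d) {
        return Double.doubleToRawLongBits(d) & FRACTION_MASK;
    }

    /** Classify a double using the special exponent patterns from the list above. */
    static String classify(double d) {
        int e = exponentField(d);
        long f = fraction(d);
        if (e == 0) {
            return f == 0 ? "signed zero" : "subnormal (underflow)";
        }
        if (e == 0x7ff) {
            return f == 0 ? "infinity" : "NaN";
        }
        return "normal";
    }

    public static void main(String[] args) {
        double d = 10.0; // 1.25 * 2^3
        System.out.println("negative: " + isNegative(d));
        System.out.println("power-of-2 exponent: " + Math.getExponent(d));
        System.out.println("fraction bits: " + Long.toBinaryString(fraction(d)));
        System.out.println("class: " + classify(d));
    }
}
```

        Double.doubleToRawLongBits is used instead of doubleToLongBits so that the payload bits of a NaN are left untouched.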
    • Method Detail

      • plus

        public Float128 plus​(Float128 value)
        Add a Float128 value to this value. Addition works as follows: suppose you add 10 and 100 (decimal).
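        v1 = 10 = 0x(1)0100000000p3 and v2 = 100 = 0x(1)1001000000p6, with the digits written in binary. The digits behind the (1) are the stored fraction; the (1) is the implicit leading bit before the binary point, which is part of the Float128 in bit 60.
        Shift the smaller value (including the leading 1) 3 bits to the right so that both values have exponent 6, and add: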
         0x(0)0010100000p6
         0x(1)1001000000p6
         -----------------+
         0x(1)1011100000p6
         
        The last number indeed represents the value 110.
        Parameters:
        value - Float128; the value to add
        Returns:
        Float128; the sum of this Float128 and the given value
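        The worked example above (10 + 100) can be reproduced with plain long arithmetic. This is a simplified sketch under stated assumptions, not the Float128 algorithm: it handles positive values only, keeps just 10 fraction bits in a single long, and truncates the bits that are shifted out. The class AlignedAddSketch and its methods are hypothetical.

```java
/** Illustrative sketch only; not the internal representation used by Float128. */
final class AlignedAddSketch {
    /** Number of fraction bits kept in this sketch (chosen to match the worked example). */
    static final int FRACTION_BITS = 10;

    /**
     * Add two positive values, each given as a significand that includes the
     * leading 1 (so it has FRACTION_BITS + 1 bits) and a power-of-2 exponent.
     * Returns {significand, exponent} of the sum.
     */
    static long[] add(long sig1, int exp1, long sig2, int exp2) {
        // Make (sig1, exp1) the operand with the larger exponent.
        if (exp1 < exp2) {
            long s = sig1; sig1 = sig2; sig2 = s;
            int e = exp1; exp1 = exp2; exp2 = e;
        }
        int diff = exp1 - exp2;
        // Align the smaller operand; bits shifted out are dropped (truncation).
        long aligned = diff >= 64 ? 0L : sig2 >>> diff;
        long sum = sig1 + aligned;
        int exp = exp1;
        // A carry past the leading 1 means the result must be renormalized.
        if ((sum >>> (FRACTION_BITS + 1)) != 0) {
            sum >>>= 1;
            exp++;
        }
        return new long[] {sum, exp};
    }

    /** Convert a {significand, exponent} pair back to a double for checking. */
    static double toDouble(long[] r) {
        return Math.scalb((double) r[0], (int) r[1] - FRACTION_BITS);
    }

    public static void main(String[] args) {
        // 10 = (1)0100000000 p3, 100 = (1)1001000000 p6
        long[] r = add(0b1_0100000000L, 3, 0b1_1001000000L, 6);
        System.out.println(Long.toBinaryString(r[0]) + "p" + r[1]); // 11011100000p6
        System.out.println(toDouble(r)); // 110.0
    }
}
```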
      • shift

        protected void shift​(long[] v,
                             int bits)
        Shift the bits to the right for the variable v.
        Parameters:
        v - long[]; the variable stored as two longs
        bits - int; the number of bits to shift 'down'; bits must be >= 0.
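        A right shift over a value held in two longs has to carry bits from the upper word into the lower word. The sketch below shows one way to do this. It assumes that v[0] holds the high 64 bits and v[1] the low 64 bits, and that the shift is logical (zero-filling); neither assumption is stated in this documentation, and the class ShiftSketch is hypothetical.

```java
/** Illustrative sketch; the word order (v[0] = high, v[1] = low) is an assumption. */
final class ShiftSketch {
    /** Logical right shift of the 128-bit value {v[0], v[1]} by the given number of bits. */
    static void shiftRight(long[] v, int bits) {
        if (bits < 0) {
            throw new IllegalArgumentException("bits must be >= 0");
        }
        if (bits == 0) {
            return; // nothing to do; also avoids a shift by 64 below
        }
        if (bits >= 128) {
            // Everything is shifted out.
            v[0] = 0L;
            v[1] = 0L;
        } else if (bits >= 64) {
            // The high word moves entirely into the low word.
            v[1] = v[0] >>> (bits - 64);
            v[0] = 0L;
        } else {
            // The low word receives the bits that leave the high word.
            v[1] = (v[1] >>> bits) | (v[0] << (64 - bits));
            v[0] >>>= bits;
        }
    }

    public static void main(String[] args) {
        long[] v = {1L, 0L};
        shiftRight(v, 1);
        System.out.println(Long.toHexString(v[0]) + " " + Long.toHexString(v[1]));
    }
}
```

        The special cases matter because Java masks the shift count of a long to its lowest 6 bits, so a shift by 64 would otherwise leave the word unchanged.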
      • plus

        public Float128 plus​(double value)
        Add a double value to this value.
        Parameters:
        value - double; the value to add
        Returns:
        Float128; the sum of this Float128 and the given value
      • intValue

        public int intValue()
        Specified by:
        intValue in class Number
      • longValue

        public long longValue()
        Specified by:
        longValue in class Number
      • floatValue

        public float floatValue()
        Specified by:
        floatValue in class Number
      • doubleValue

        public double doubleValue()
        Specified by:
        doubleValue in class Number
      • isZero

        public boolean isZero()
        Return whether the stored value is a signed zero.
        Returns:
        boolean; whether the stored value is a signed zero
      • isNaN

        public boolean isNaN()
        Return whether the stored value is NaN.
        Returns:
        boolean; whether the stored value is NaN
      • isInfinite

        public boolean isInfinite()
        Return whether the stored value is infinite.
        Returns:
        boolean; whether the stored value is infinite
      • isFinite

        public boolean isFinite()
        Return whether the stored value is finite.
        Returns:
        boolean; whether the stored value is finite
      • isPositive

        public boolean isPositive()
        Return whether the stored value is positive.
        Returns:
        boolean; whether the stored value is positive
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object
      • toPaddedBinaryString

        public String toPaddedBinaryString()
        Return the binary string representation of this Float128 value.
        Returns:
        String; the binary string representation of this Float128 value
      • toBinaryString

        public String toBinaryString()
        Return the binary string representation of this Float128 value.
        Returns:
        String; the binary string representation of this Float128 value
      • doubleTotring

        public static String doubleTotring​(double d)
        A test for a toString() method for a double.
        Parameters:
        d - double; the value
        Returns:
        String; the decimal 17-digit scientific notation String representation of the double
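        For comparison, a 17-digit scientific notation can also be produced with the standard formatter. This is a minimal sketch and not the implementation of this method; the class SeventeenDigits and the method toScientific17 are hypothetical. Seventeen significant decimal digits are enough for the resulting string to parse back to the same double.

```java
import java.util.Locale;

/** Illustrative sketch; not the implementation behind the method documented above. */
final class SeventeenDigits {
    /**
     * Format a finite double with one digit before and 16 digits after the
     * decimal point, which gives 17 significant digits in scientific notation.
     * Locale.ROOT keeps the decimal separator a '.' on every system.
     */
    static String toScientific17(double d) {
        return String.format(Locale.ROOT, "%.16e", d);
    }

    public static void main(String[] args) {
        double d = Math.PI;
        String s = toScientific17(d);
        System.out.println(s);
        // The string parses back to exactly the same double.
        System.out.println(Double.parseDouble(s) == d);
    }
}
```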
      • of

        public static Float128 of​(double d)
        Create a Float128 from the given double, with a significand precision of 52 bits.
        Parameters:
        d - double; the double value
        Returns:
        Float128; a Float128 created from the given double, with a significand precision of 52 bits
      • of

        public static Float128 of​(String sd)
        Create the Float128 represented by the given String, with a significand precision of up to 120 bits. Up to 39 significant digits will be used to represent this value as a Float128. Currently only scientific notation is parsed; support for regular notation will follow.
        Parameters:
        sd - String; a String representation of a double value
        Returns:
        Float128; a Float128 from this string representation with a significand precision up to 120 bits
      • main

        public static void main​(String[] args)
        Test code.
        Parameters:
        args - String[]; not used