module Fpu: sig .. end
Access to low level floating point functions. THIS LIBRARY ONLY WORKS FOR INTEL PROCESSORS.
Almost all low level functions are implemented using the x87 functions and x87 rounding modes. Unfortunately, there are a few problems to understand first. In principle, the x87 can return the nearest value, an upper bound and a lower bound for each elementary operation it performs. In practice this is not always true: some functions, such as cos(), sin() or tan(), are not properly implemented everywhere.
For example, for the angle a = 1.570 796 326 794 896 557 998 981 734 272 092 580 795 288 085 937 5 the following values are computed for cos(a) by (1) the MPFI library (with 128-bit precision), (2) the x87 in low mode, (3) the x87 in nearest mode (default for the C and OCaml libraries on 32-bit Linux), (4) the x87 in high mode, (5) the SSE2 implementation (default for the C and OCaml libraries on 64-bit Linux):
(1) 6.123 233 995 736 765 886 130 329 661 375 001 464 640 377 798 836e-17
(2) 6.123 031 769 111 885 058 461 925 285 082 049 859 451 216 355 021e-17
(3) 6.123 031 769 111 886 291 057 089 692 912 995 815 277 099 609 375e-17
(4) 6.123 031 769 111 886 291 057 089 692 912 995 815 277 099 609 375e-17
(5) 6.123 233 995 736 766 035 868 820 147 291 983 023 128 460 623 387e-17
The upper bound (4) computed by the x87 is clearly incorrect, as it is lower than the correct value computed by the MPFI library.
The value computed by the SSE2 (5) is much more precise than the one computed by the x87. Unfortunately, there is no way to get an upper and lower bound value, and we are thus stuck with the x87 for computing these (sometimes incorrect) bounds.
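To see which of the values above a given machine produces, the standard cos function can be called on the same angle. This is only a minimal sketch using the OCaml standard library, not the Fpu module; the name a mirrors the text.

```ocaml
(* The double closest to pi/2, written with the full decimal expansion
   given in the text. *)
let a = 1.5707963267948965579989817342720925807952880859375

let () =
  (* The printed digits depend on how the platform computes cos:
     value (3) is expected with the x87, value (5) with SSE2. *)
  Printf.printf "cos(a) = %.17g\n" (cos a)
```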
The problem here is that the value computed by the standard C library (or OCaml) cos function does not always lie in the lower/upper bound interval returned by the x87 functions. This can be a very serious problem when executing Branch and Bound algorithms, which expect the mid-value to be inside the lower/upper interval.
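The containment failure can be checked directly with the figures from the table above. The sketch below copies those values as literals (truncated to the digits a double can hold); the names mpfi_ref, x87_low, x87_high, sse2 and inside are illustrative and not part of the Fpu module.

```ocaml
(* Values of cos(a) copied from the table above. *)
let mpfi_ref = 6.123233995736765886e-17   (* (1) MPFI, 128 bits *)
let x87_low  = 6.123031769111885058e-17   (* (2) x87, low mode *)
let x87_high = 6.123031769111886291e-17   (* (4) x87, high mode *)
let sse2     = 6.123233995736766036e-17   (* (5) SSE2 *)

(* true when v lies in the closed interval [lo, hi] *)
let inside lo hi v = lo <= v && v <= hi

let () =
  Printf.printf "reference inside x87 bounds: %b\n"
    (inside x87_low x87_high mpfi_ref);
  Printf.printf "SSE2 value inside x87 bounds: %b\n"
    (inside x87_low x87_high sse2)
```

Both checks print false: the x87 interval lies entirely below the reference value and below the SSE2 value.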
We solved the problem by rewriting the trigonometric functions in order to make them both consistent and correct. We used the following property: when -pi/4<=a<=pi/4, the 64-bit roundings of the 80-bit low/std/high values returned by the x87 are correct. Moreover, when 0<a<2**53, (a mod (2Pi_low)) and (a mod (2Pi_high)) are in the same quadrant. Last, (a mod Pi/2_High) <= (a mod Pi/2) <= (a mod Pi/2_Low). With this implementation, the lower and upper bounds are properly set and they are always lower (resp. higher) than the value computed by the standard cos functions on 32-bit and 64-bit architectures. This rewriting has been done in assembly language and is quite efficient.
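The last inequality can be illustrated with ordinary doubles. The sketch below is not the assembly implementation: it takes the two doubles that bracket pi/2 as stand-ins for Pi/2_Low and Pi/2_High (the names half_pi_low, half_pi_high and reduce are hypothetical), uses mod_float from the standard library, and assumes inputs for which both divisors give the same integer quotient. Float.succ requires OCaml 4.08 or later.

```ocaml
(* Hypothetical stand-ins for Pi/2_Low and Pi/2_High. *)
let half_pi_low = Float.pi /. 2.            (* largest double below pi/2 *)
let half_pi_high = Float.succ half_pi_low   (* smallest double above pi/2 *)

(* Remainders of a by the upper and the lower bound of pi/2. *)
let reduce a = (mod_float a half_pi_high, mod_float a half_pi_low)

let () =
  List.iter
    (fun a ->
      let r_high, r_low = reduce a in
      Printf.printf "a = %g: (a mod high) <= (a mod low) is %b\n"
        a (r_high <= r_low))
    [ 1.; 10.; 1000.; 1e6 ]
```

Dividing by the larger bound removes slightly more per quadrant, so the remainder by Pi/2_High is the smaller of the two, and the remainder by the true pi/2 lies between them.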
Keep in mind that values returned by the standard (C library or OCaml) cos(), sin() or tan() functions still differ between 32-bit and 64-bit architectures. If you want a program that behaves in exactly the same way on both architectures, you can use the Fpu module fcos, fsin or ftan functions, which always return the same values on all architectures, or even use the Fpu_rename or Fpu_rename_all modules to transparently rename the floating point functions.
The functions are quite efficient (see below). However, they have a serious disadvantage compared to their standard counterparts. When the compiler compiles the expression ''a+.b'', the code of the operation is inlined, whereas when it compiles ''(fadd a b)'', it generates a function call, which is expensive.
Intel Atom 230 Linux 32 bits
Intel 980X Linux 64 bits
val ffloat : int -> float
val ffloat_high : int -> float
val ffloat_low : int -> float
val fadd : float -> float -> float
val fadd_low : float -> float -> float
val fadd_high : float -> float -> float
val fsub : float -> float -> float
val fsub_low : float -> float -> float
val fsub_high : float -> float -> float
val fmul : float -> float -> float
val fmul_low : float -> float -> float
val fmul_high : float -> float -> float
val fdiv : float -> float -> float
val fdiv_low : float -> float -> float
val fdiv_high : float -> float -> float
val fmod : float -> float -> float
val fsqrt : float -> float
val fsqrt_low : float -> float
val fsqrt_high : float -> float
val fexp : float -> float
val fexp_low : float -> float
val fexp_high : float -> float
val flog : float -> float
val flog_low : float -> float
val flog_high : float -> float
val flog_pow : float -> float -> float
val flog_pow_low : float -> float -> float
val flog_pow_high : float -> float -> float
val fpow : float -> float -> float
val fpow_low : float -> float -> float
val fpow_high : float -> float -> float
val fsin : float -> float
val fsin_low : float -> float
val fsin_high : float -> float
val fcos : float -> float
val fcos_low : float -> float
val fcos_high : float -> float
val ftan : float -> float
val ftan_low : float -> float
val ftan_high : float -> float
val fatan : float -> float -> float
val fatan_low : float -> float -> float
val fatan_high : float -> float -> float
val facos : float -> float
val facos_low : float -> float
val facos_high : float -> float
val fasin : float -> float
val fasin_low : float -> float
val fasin_high : float -> float
val fsinh : float -> float
val fsinh_low : float -> float
val fsinh_high : float -> float
val fcosh : float -> float
val fcosh_low : float -> float
val fcosh_high : float -> float
val ftanh : float -> float
val ftanh_low : float -> float
val ftanh_high : float -> float
val is_neg : float -> bool
BE VERY CAREFUL: using these functions unwisely can ruin all your computations.
Remember also that on 64-bit machines these functions won't change the behaviour of the SSE instructions.
When setting the rounding mode to UPWARD or DOWNWARD, it is better to set it immediately back to NEAREST. However we have no guarantee on how the compiler will reorder the instructions generated. It is ALWAYS better to write:
let a = set_high(); let res = 1./.3. in set_nearest (); res;;
The above code will NOT work on linux-x64, where many floating point functions are implemented using SSE instructions. These three functions should only be used when there is no other solution and you really know what you are doing, and this should never happen. Please use the regular functions of the Fpu module for computations. For example, prefer:
let a = fdiv_high 1. 3.;;
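Used in pairs, the _low and _high variants enclose the exact result without touching the rounding mode by hand. The sketch below is self-contained, so it defines stand-ins for fdiv_low and fdiv_high that merely widen the round-to-nearest quotient by one ulp on each side. This is an assumption made for illustration, not how the Fpu module computes its bounds (it uses the x87 rounding modes), and the stand-in bounds are looser than true directed rounding. Float.pred and Float.succ require OCaml 4.08 or later.

```ocaml
(* Stand-ins with the same signatures as Fpu.fdiv_low and Fpu.fdiv_high.
   NOT the library's implementation: they step one double down (resp. up)
   from the round-to-nearest quotient. *)
let fdiv_low a b = Float.pred (a /. b)
let fdiv_high a b = Float.succ (a /. b)

let () =
  (* Enclose 1/3 between a lower and an upper bound. *)
  let lo = fdiv_low 1. 3. and hi = fdiv_high 1. 3. in
  Printf.printf "1/3 lies in [%.17g, %.17g]\n" lo hi
```

With the real module, the two stand-in definitions would be dropped and the calls written as Fpu.fdiv_low 1. 3. and Fpu.fdiv_high 1. 3.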
PS: The Interval module and the Fpu module functions correctly set and restore the rounding mode for all interval computations, so you don't really need these functions.
PPS: Please, don't use them...
val set_low : unit -> unit
val set_high : unit -> unit
val set_nearest : unit -> unit