Ryu-based to_string function #627

Open
@St-Maxwell

While discussing the disp function, I came up with the idea of implementing an efficient function that converts floating-point numbers to decimal strings without using internal IO. Recently I have tried to work it out.

The original Ryu generates the shortest precision-preserving decimal string of a floating-point number, and Ryu printf adds formatted output of floating-point numbers.

Based on the C and Scala versions of the Ryu code, I have implemented a Fortran version of Ryu.

I think we can replace the current implementation in stdlib with Ryu-based code, so I would like to briefly describe the API of ryu_fortran.

Currently, ryu_fortran provides four routines: f2shortest, d2shortest, d2fixed and d2exp.

use ryu, only: f2shortest, d2shortest, d2fixed, d2exp
use iso_fortran_env, only: real32, real64

write (*, "(A)") f2shortest(3.14159_real32)
write (*, "(A)") d2shortest(2.718281828_real64)
write (*, "(A)") d2fixed(1.2345678987654321_real64, 10)
write (*, "(A)") d2exp(299792458._real64, 5)

! 3.14159
! 2.718281828
! 1.2345678988
! 2.99792E+08

Interface

interface
    function f2shortest(f) result(str)
        real(kind=real32), intent(in) :: f
        character(len=:), allocatable :: str
    end function
    function d2shortest(d) result(str)
        real(kind=real64), intent(in) :: d
        character(len=:), allocatable :: str
    end function
    function d2fixed(d, precision_) result(str)
        real(kind=real64), intent(in) :: d
        integer(kind=int32), intent(in) :: precision_
        character(len=:), allocatable :: str
    end function
    function d2exp(d, precision_) result(str)
        real(kind=real64), intent(in) :: d
        integer(kind=int32), intent(in) :: precision_
        character(len=:), allocatable :: str
    end function
end interface

f2shortest and d2shortest produce the shortest precision-preserving decimal string of a floating-point number. That is, if we convert the string back to a floating-point value, we get the same binary representation as the original number. These two routines are suitable for cases where no format is specified. Note: f2shortest and d2shortest always print at least two digits. For example, the C version of Ryu produces "1" for 1._real32 while f2shortest produces "1.0".
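To illustrate what "precision-preserving" means, here is a minimal sketch that does not use Ryu at all. It searches for the smallest number of significant digits that round-trips a real32 value, using plain internal IO and a bitwise comparison. The program name and variables are made up for this example, and it assumes the compiler's runtime rounds correctly on both output and input.

```fortran
program roundtrip_demo
    use iso_fortran_env, only: real32, int32
    implicit none
    real(real32) :: x, y
    character(len=32) :: buf, fmt
    integer :: ndigits

    x = 3.14159_real32
    ! Try 1..9 significant digits; 9 always suffice for IEEE binary32.
    do ndigits = 1, 9
        ! ES descriptor: one digit before the point, ndigits-1 after it.
        write (fmt, '("(ES", I0, ".", I0, ")")') ndigits + 7, ndigits - 1
        write (buf, fmt) x
        read (buf, *) y
        ! Compare bit patterns rather than values.
        if (transfer(x, 0_int32) == transfer(y, 0_int32)) exit
    end do
    print '(A, I0, 2A)', 'digits needed: ', ndigits, ', string: ', trim(adjustl(buf))
end program roundtrip_demo
```

Ryu reaches the same kind of result directly, without the trial-and-error loop and without going through the IO library, which is where the speed advantage comes from.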

d2fixed and d2exp format real64 floating-point numbers in fixed-point and exponential notation, respectively. The main difference between them and Fortran edit descriptors is that they never produce "*****".
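For readers less familiar with the "*****" behaviour, the following small program shows how a standard edit descriptor fills the field with asterisks when the value does not fit. It uses only standard Fortran internal IO; the program and variable names are invented for the example.

```fortran
program asterisk_demo
    implicit none
    character(len=5) :: narrow
    character(len=12) :: wide

    ! F5.2 leaves room for at most "99.99"; 12345.678 cannot fit.
    write (narrow, '(F5.2)') 12345.678
    ! F12.2 is wide enough for the same value.
    write (wide, '(F12.2)') 12345.678
    print '(3A)', '[', narrow, ']'   ! prints [*****]
    print '(3A)', '[', wide, ']'
end program asterisk_demo
```

Because d2fixed and d2exp take only a precision and return an allocatable string, there is no fixed field to overflow.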

With these routines, I wrote a simple prototype of to_string for floating-point numbers in app/main.f90. For stdlib, I think f2shortest and d2shortest can be called when the format argument is not present. But when a format is specified, there might be disagreement over whether we follow the Fortran convention or not. The advantages of Ryu formatting are that it is fast and that it never produces "*****". However, it cannot control the width of the formatted value, which is sometimes required.
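One possible shape for that dispatch is sketched below. This is only an illustration, not the prototype in app/main.f90 and not the stdlib implementation: the module name, the "[*]" error marker, and the buffer size are assumptions. To keep the example buildable without ryu_fortran, the branch without a format uses a fixed 17-significant-digit write as a stand-in for d2shortest; it round-trips but is not the shortest string.

```fortran
module to_string_sketch
    use iso_fortran_env, only: real64
    implicit none
    private
    public :: to_string
contains
    ! Hypothetical dispatcher for real64 values.
    function to_string(x, fmt) result(str)
        real(real64), intent(in) :: x
        character(len=*), intent(in), optional :: fmt
        character(len=:), allocatable :: str
        character(len=128) :: buf
        integer :: stat

        if (present(fmt)) then
            ! Fortran convention: honour the edit descriptor, width included.
            write (buf, '(' // fmt // ')', iostat=stat) x
            if (stat /= 0) buf = '[*]'   ! assumed marker for an invalid format
            str = trim(buf)
        else
            ! Stand-in for d2shortest: 17 significant digits always
            ! round-trip binary64, but the result is not the shortest.
            write (buf, '(ES25.16E3)') x
            str = trim(adjustl(buf))
        end if
    end function to_string
end module to_string_sketch

program to_string_demo
    use to_string_sketch, only: to_string
    use iso_fortran_env, only: real64
    implicit none
    print '(A)', to_string(2.718281828_real64)
    print '(A)', to_string(2.718281828_real64, 'F8.3')
end program to_string_demo
```

Keeping internal IO in the formatted branch preserves width control at the cost of speed and possible "*****" output; switching that branch to d2fixed or d2exp would trade the other way.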

I hope we can discuss the above points in this issue and reach some agreement.

P.S. Benchmark results (edited on 2022.2.9)

Benchmark for f2shortest
f2shortest Time (us): 0.2019531   Std Dev:  0.3438
internal IO Time (us): 1.6445312   Std Dev:  0.3450

Benchmark for d2shortest
d2shortest Time (us): 0.2128906   Std Dev:  0.3496
internal IO Time (us): 2.1968750   Std Dev:  0.4361

Benchmark for d2exp
d2exp Time (us): 0.2976563   Std Dev:  0.3794
internal IO Time (us): 2.0078125   Std Dev:  0.4105

Benchmark for d2fixed
d2fixed Time (us): 0.8589844   Std Dev:  0.9782
internal IO Time (us): 4.4464844   Std Dev:  4.2765
