Edward Catmur

2017-04-10 09:52:41 UTC

First of all, I want to say that I think to_chars and from_chars are a great
addition to the Standard, and I look forward to using them in C++17. I have
a few questions regarding their behavior on floating-point types.

(As background for the first few questions: for each floating-point type
there is a (relatively) small number of large integers that are exactly
halfway between two adjacent values of that type, and which have a
relatively short scientific decimal representation. For example, 1e23 has
hexadecimal floating-point representation 0x1.52d02c7e14af68p76, which is
exactly halfway between the adjacent IEEE 754 (64-bit) double values
0x1.52d02c7e14af6p76 and 0x1.52d02c7e14af7p76. Parsing the string "1e23"
into double using from_chars [utility.from.chars] is required to produce
one of those two values.)
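
The halfway claim for 1e23 can be checked with integer arithmetic alone. The following is a minimal sketch, assuming an IEEE 754 double with a 53-bit significand; the helper names pow5 and significant_bits are hypothetical and not part of any library. It relies on 10^23 = 5^23 * 2^23: if 5^23 needs exactly 54 bits and is odd, then 10^23 lies exactly midway between two adjacent doubles.

```cpp
#include <cstdint>

// Hypothetical helper: 5^n as a 64-bit integer (5^23 < 2^64, so no overflow).
constexpr std::uint64_t pow5(unsigned n) {
    std::uint64_t r = 1;
    for (unsigned i = 0; i < n; ++i) r *= 5;
    return r;
}

// Hypothetical helper: number of significant bits in v (0 for v == 0).
constexpr int significant_bits(std::uint64_t v) {
    int n = 0;
    while (v != 0) { ++n; v >>= 1; }
    return n;
}

// 10^23 = odd_part * 2^23, where odd_part = 5^23.
constexpr std::uint64_t odd_part = pow5(23);

// 5^23 is one bit too wide for a 53-bit significand, and the extra
// (lowest) bit is set, so 10^23 is an exact tie between two doubles.
static_assert(significant_bits(odd_part) == 54, "5^23 needs 54 bits");
static_assert((odd_part & 1) == 1, "5^23 is odd");

// The 53-bit significands of the two neighbours; each is scaled by 2^24.
constexpr std::uint64_t lower_significand = odd_part >> 1;
constexpr std::uint64_t upper_significand = lower_significand + 1;

// These match the hexadecimal values quoted above:
// 0x1.52d02c7e14af6p76 == 0x152d02c7e14af6 * 2^24, and likewise for ...af7.
static_assert(lower_significand == 0x152d02c7e14af6ULL, "lower neighbour");
static_assert(upper_significand == 0x152d02c7e14af7ULL, "upper neighbour");
```

Because the lower significand is even, an implementation that breaks ties to even would pick 0x1.52d02c7e14af6p76.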

Firstly, is from_chars expected to be deterministic (the same input always
yielding the same value), or is its result allowed to depend on e.g. the
floating-point environment or the use of 80-bit floating point (32-bit
Linux on x86)?

Secondly, is from_chars expected or encouraged to have the same behavior as
the compiler? That is, given double d; auto s = "1e23"; should we ever
expect 1e23 == (from_chars(s, s + 4, d), d) to be false?
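
The comparison in question can be written out as a small program fragment. This is a sketch only, assuming a standard library that already provides the floating-point overloads of std::from_chars; parse_double and literal_matches_from_chars are hypothetical helper names. It does not answer the question: it only reports what one particular compiler and library pair does.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parse all of `text` as a double with from_chars.
// Returns nullopt on a parse error or if any characters are left over.
inline std::optional<double> parse_double(std::string_view text) {
    double value = 0.0;
    const char* first = text.data();
    const char* last = first + text.size();
    const std::from_chars_result r = std::from_chars(first, last, value);
    if (r.ec != std::errc{} || r.ptr != last) return std::nullopt;
    return value;
}

// Hypothetical helper: does the compiler's conversion of the literal 1e23
// agree with the library's run-time conversion of the string "1e23"?
inline bool literal_matches_from_chars() {
    const std::optional<double> parsed = parse_double("1e23");
    return parsed.has_value() && *parsed == 1e23;
}
```

A result of false from literal_matches_from_chars() would mean that the compiler and the library resolved the tie in different directions.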

Most importantly, is to_chars permitted to produce an overlong output where
a shorter output round-trips on the same implementation but is not
guaranteed to do so on every implementation? For example, if an
implementation always reads "1e23" as 0x1.52d02c7e14af6p76, is it permitted
to output 0x1.52d02c7e14af6p76 as "9.999999999999999e22" on the basis that
this is guaranteed to be read correctly by a different implementation that
might read "1e23" as 0x1.52d02c7e14af7p76?

In addition, I would be interested in knowing whether the following
underspecification is intentional:

Is the result of to_chars() required to be the string closest to the input
value among the strings of that length that round-trip? For example,
0x1.0000000000001p0 is approx. 1.000000000000000222045, so is
1.0000000000000003 an acceptable output from to_chars, or only
1.0000000000000002? Or consider the smallest positive subnormal IEEE
double, 0x1p-1074, approx. 4.94e-324: is 4e-324 an acceptable output, or
only 5e-324? (In Florian Loitsch's paper [1], this is the "closeness"
property of Grisu3.)
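
That both candidates in each pair round-trip can be checked directly. The sketch below assumes IEEE 754 double, the default round-to-nearest mode, and a correctly rounding std::strtod, which is used here only as a stand-in for from_chars; round_trips is a hypothetical helper name.

```cpp
#include <cstdlib>
#include <limits>

// Hypothetical helper: does `decimal` parse to exactly `target`?
// std::strtod stands in for from_chars in this sketch.
inline bool round_trips(const char* decimal, double target) {
    return std::strtod(decimal, nullptr) == target;
}

// 0x1.0000000000001p0, the next double after 1.0.
inline double one_plus_epsilon() {
    return 1.0 + std::numeric_limits<double>::epsilon();
}

// 0x1p-1074, the smallest positive subnormal double.
inline double smallest_subnormal() {
    return std::numeric_limits<double>::denorm_min();
}
```

Under those assumptions, round_trips("1.0000000000000002", one_plus_epsilon()) and round_trips("1.0000000000000003", one_plus_epsilon()) both hold, as do round_trips("4e-324", smallest_subnormal()) and round_trips("5e-324", smallest_subnormal()). Round-tripping alone therefore does not select between the candidates; only a closeness requirement would.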

I hope the above questions don't come across as overly pedantic; I would be
perfectly satisfied to be told that all of the above are QOI matters, but
I'd like to know what to expect before retiring our current code using
Google double-conversion [2].

Finally, it would be useful to know the minimum buffer size necessary to
guarantee successful conversion in all cases. I would guess this is
something like

    4 + numeric_limits<T>::max_digits10
      + max(log10(numeric_limits<T>::max_exponent10),
            1 + log10(-numeric_limits<T>::min_exponent10))

but it would be useful to have confirmation of this calculation, or indeed
to have it available in the Standard as a constant.
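
One way to make such a bound concrete is to count characters with integer arithmetic instead of log10. The following is a sketch under stated assumptions, not a confirmed answer: it covers only scientific notation with at most max_digits10 significant digits, it assumes the exponent is written with a sign and at least two digits (as printf does), and it assumes subnormal exponents need no more digits than the limits used here (true for IEEE 754 float and double). Fixed notation or an explicit precision can need far more space. The names decimal_digits and scientific_buffer_size are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>

// Hypothetical helper: number of decimal digits in a non-negative integer.
constexpr std::size_t decimal_digits(int n) {
    std::size_t d = 1;
    while (n >= 10) { n /= 10; ++d; }
    return d;
}

// Hypothetical helper: characters needed for "-d.ddd...e+XXX".
template <class T>
constexpr std::size_t scientific_buffer_size() {
    using L = std::numeric_limits<T>;
    const int hi = L::max_exponent10;   // e.g. 308 for IEEE double
    const int lo = -L::min_exponent10;  // e.g. 307 for IEEE double
    const std::size_t exponent_digits =
        std::max<std::size_t>(2, decimal_digits(std::max(hi, lo)));
    return 1                  // sign of the value
         + L::max_digits10    // significant digits
         + 1                  // decimal point
         + 2                  // 'e' and the sign of the exponent
         + exponent_digits;
}
```

For IEEE 754 double this gives 24, which is the length of -1.7976931348623157e+308; for float it gives 15.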

Thanks!

1. http://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf

2. https://github.com/google/double-conversion

--

---

You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.

To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.

To post to this group, send email to std-***@isocpp.org.

Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
