Discussion:
Is it possible to implement begin(), end(), and data() on a vector without UB?
(too old to reply)
b***@gmail.com
2017-01-12 21:39:59 UTC
Permalink
Sometimes, there's a need a fixed-capacity vector. It's got some storage, a
size, and begin/end:

template <class T, size_t Capacity>
class fixed_vector {
using storage_t = std::aligned_storage_t<sizeof(T), alignof(T)>;

size_t length_;
storage_t data_[Capacity];

public:
// ...

T* begin() { return reinterpret_cast<T*>(&data_[0]); }

T* end() { return begin() + length_; } // <== (a)
T* end() { return reinterpret_cast<T*>(&data_[length]); } // <== (b)

// ...
};

The problem I immediately run into is:
(a) There's no T array at &data_[0], so the pointer arithmetic is UB (if
length_ > 1)
(b) This is probably UB in itself (no T at &data_[length_]), but then
additionally any algorithm I pass begin() and end() into will incur UB by
doing its own pointer arithmetic when it traverses that range.

One approach might then to be to implement a real iterator that is a
storage_t* but dereferences to a T. This is an annoying indirection, but
solves the begin()/end() problems, and hopefully the compiler can figure
out what I really mean. But then, what if I want to expose data()?

T* data() { return reinterpret_cast<T*>(&data_[0]); }

Wouldn't just about any use of data() be UB, regardless of what happens?

This isn't merely intellectual. I ask because gcc optimized away a bunch of
code that I actually need to run, and I'm not sure that it was a gcc bug to
have done so.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Chris Hallock
2017-01-12 23:08:02 UTC
Permalink
This is issue 2182 <https://wg21.link/cwg2182>.

Does GCC do anything different with any of these changes?

1. data_ is declared as aligned_storage_t<Capacity*sizeof(T),
alignof(T)> data_;
2. data_ is declared as alignas(T) unsigned char
data_[Capacity*sizeof(T)];
3. std::launder is used on the result of reinterpret_cast
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Myriachan
2017-01-13 20:28:23 UTC
Permalink
Post by Chris Hallock
This is issue 2182 <https://wg21.link/cwg2182>.
Does GCC do anything different with any of these changes?
1. data_ is declared as aligned_storage_t<Capacity*sizeof(T),
alignof(T)> data_;
2. data_ is declared as alignas(T) unsigned char
data_[Capacity*sizeof(T)];
3. std::launder is used on the result of reinterpret_cast
Shouldn't a container call placement new, such that a T exists? If so,
what's the point of std::launder?

The core issue seems to be referring to this case:

alignas(alignof(Class)) unsigned char data[sizeof(Class) * 3];
for (unsigned x = 0; x < 3; ++x)
{
new(&data[sizeof(Class) * x]) Class;
}
Class *meow = reinterpret_cast<Class *>(data);
meow += 2;
meow->Function();

Because placement new was called and not placement new[], only one Class
was created at the three adjacent locations within "data". Adding 2 would
then be undefined behavior, because it went past one-past-the-end of an
array of size 1 (as non-array objects are treated for purposes of pointer
arithmetic).

It doesn't seem that this is the intent of the Standard, hence the core
issue. Without changing it, it's impossible to implement your own
std::vector, requiring making it a "magic" class like
std::initializer_list. (The impossibility comes from the need for
supporting the allocator object.)

We really need a compiler that is extremely strict in order to find issues
like this if the Standard is going to stick to these strict rules. For
example, a type sanitizer mode in which each byte of memory records which
type it is in addition to its contents. The type information would
probably be larger than the data stored there, but such a mode would be
useful for finding this kind of noncompliance issue.

Melissa
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Nicol Bolas
2017-01-13 21:07:07 UTC
Permalink
Post by Myriachan
Post by Chris Hallock
This is issue 2182 <https://wg21.link/cwg2182>.
Does GCC do anything different with any of these changes?
1. data_ is declared as aligned_storage_t<Capacity*sizeof(T),
alignof(T)> data_;
2. data_ is declared as alignas(T) unsigned char
data_[Capacity*sizeof(T)];
3. std::launder is used on the result of reinterpret_cast
Shouldn't a container call placement new, such that a T exists? If so,
what's the point of std::launder?
Just because there's a different object there doesn't mean you're allowed
to access it through the original pointer/variable. You're only allowed to
do that if you didn't change the type of the object. Which you did, in this
case.

So `launder` is needed if you have the old pointer/variable and want to
convert it to the object that exists in its place.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Ville Voutilainen
2017-01-13 21:15:14 UTC
Permalink
Post by Chris Hallock
This is issue 2182.
Does GCC do anything different with any of these changes?
data_ is declared as aligned_storage_t<Capacity*sizeof(T), alignof(T)>
data_;
data_ is declared as alignas(T) unsigned char data_[Capacity*sizeof(T)];
std::launder is used on the result of reinterpret_cast
Shouldn't a container call placement new, such that a T exists? If so,
what's the point of std::launder?
Just because there's a different object there doesn't mean you're allowed to
access it through the original pointer/variable. You're only allowed to do
that if you didn't change the type of the object. Which you did, in this
case.
So `launder` is needed if you have the old pointer/variable and want to
convert it to the object that exists in its place.
The point of launder was to provide a mechanism for telling optimizers
not to assume
that a memory location where there was a reference or a const member
can not change.
This is important for things like std::optional, which placement-new
over existing
storage. The downside of a naive launder is that it might also turn
off devirtualization.
The jury is still out whether launder should be needed, and whether
the optimization
it's supposed to protect against should be allowed. I predict with
fair confidence
that all this will be discussed in the next committee meeting. (Well,
I don't just
predict it with any confidence, I promise it.)
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Nicol Bolas
2017-01-13 21:56:28 UTC
Permalink
Post by Chris Hallock
Post by Nicol Bolas
Post by Chris Hallock
This is issue 2182.
Does GCC do anything different with any of these changes?
data_ is declared as aligned_storage_t<Capacity*sizeof(T), alignof(T)>
data_;
data_ is declared as alignas(T) unsigned char
data_[Capacity*sizeof(T)];
Post by Nicol Bolas
Post by Chris Hallock
std::launder is used on the result of reinterpret_cast
Shouldn't a container call placement new, such that a T exists? If so,
what's the point of std::launder?
Just because there's a different object there doesn't mean you're
allowed to
Post by Nicol Bolas
access it through the original pointer/variable. You're only allowed to
do
Post by Nicol Bolas
that if you didn't change the type of the object. Which you did, in this
case.
So `launder` is needed if you have the old pointer/variable and want to
convert it to the object that exists in its place.
The point of launder was to provide a mechanism for telling optimizers
not to assume
that a memory location where there was a reference or a const member
can not change.
That may be true, but the current wording around launder makes it clear
that it is the only valid way to do the conversion below:

std::aligned_union_t<SomeType> val;
new(&val) SomeType();
SomeType *p = <<Convert From val to SomeType*>>

[basic.life]/8 doesn't allow `reinterpret_cast` alone to work here,
regardless of the members of `SomeType`. Only `std::launder` (as noted in
the non-normative notation) does.
Post by Chris Hallock
If I have enough storage for X copies of `SomeType`, and I construct X of
them, with `sizeof(SomeType)` bytes between each, then that storage counts
as an array of X `SomeType`s for pointer arithmetic purposes.

That's the main problem we have. Pointer arithmetic is defined based on
having arrays, and I know of no language in the specification that allows
you to treat two instances created sequentially in the same storage as an
array.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Myriachan
2017-01-17 21:26:42 UTC
Permalink
Post by Nicol Bolas
Post by Ville Voutilainen
The point of launder was to provide a mechanism for telling optimizers
not to assume
that a memory location where there was a reference or a const member
can not change.
That may be true, but the current wording around launder makes it clear
std::aligned_union_t<SomeType> val;
new(&val) SomeType();
SomeType *p = <<Convert From val to SomeType*>>
[basic.life]/8 doesn't allow `reinterpret_cast` alone to work here,
regardless of the members of `SomeType`. Only `std::launder` (as noted in
the non-normative notation) does.
Post by Ville Voutilainen
If I have enough storage for X copies of `SomeType`, and I construct X
of them, with `sizeof(SomeType)` bytes between each, then that storage
counts as an array of X `SomeType`s for pointer arithmetic purposes.
That's the main problem we have. Pointer arithmetic is defined based on
having arrays, and I know of no language in the specification that allows
you to treat two instances created sequentially in the same storage as an
array.
The rule gets complicated if you don't simply allow all pointer arithmetic
so long as it never leaves the boundaries of a single allocated storage
block, and as long as you don't dereference it unless the created type
matches (for aliasing rules).

It would permit things like this, which not everyone here agrees should be
legal:

#include <cstddef>

struct Meow
{
int x[14];
float y;
int z;
};

int Kitty(Meow *meow)
{
meow->z = 4;
char *temp = reinterpret_cast<char *>(meow->x);
temp += offsetof(Meow, z) - offsetof(Meow, x);
*reinterpret_cast<int *>(temp) = 2;
return meow->z;
}

Also, what is to say that the expression &meow->z - meow->x wouldn't also
be well-defined? Since they both have int alignment, it's not possible
(other than with outside-the-standard things like #pragma pack) for the
distance between the two ints to not be a multiple of sizeof(int).

A rule saying that pointer arithmetic is allowed within a block of
allocated storage would also allow the trick of declaring an array at the
end of a struct as [1] and going past it if storage permits. (That writing
to primitive types without calling placement new on each one is a separate
issue).

Allowing pointer arithmetic within a block of storage but not otherwise
also preserves the undefinedness of attempting pointer arithmetic between
fully unrelated objects, which breaks under some segmented architectures.
(For example, the "large" memory model in MS-DOS and Win16, where pointers
are a 32-bit segment:offset, but size_t and ptrdiff_t are 16 bits; on such
a model, subtracting two unrelated pointers is definitely undefined
behavior, and may even cause a trap depending on the circumstances.) Along
the same lines, it preserves the ability for the compiler to assume that if
you're accessing a pointer derived from local variable p, you're not
actually accessing local variable q, because each local variable is
allocated in separate storage units. Similarly for separate global
variables.

This permission would also make well-defined certain custom heap managers.
A custom_free() would be able to do pointer arithmetic to retrieve the
allocation header from before the pointer given to it.

Melissa
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Demi Obenour
2017-01-17 22:06:44 UTC
Permalink
Melissa gets this correct. All code I have written operates on this
assumption, and I suspect that the vast majority of programmers also make
this assumption.

I think that there does need to be some total ordering on pointers though.
For one, memmove is trivial to implement if such an ordering exists.
Post by Nicol Bolas
Post by Ville Voutilainen
The point of launder was to provide a mechanism for telling optimizers
not to assume
that a memory location where there was a reference or a const member
can not change.
That may be true, but the current wording around launder makes it clear
std::aligned_union_t<SomeType> val;
new(&val) SomeType();
SomeType *p = <<Convert From val to SomeType*>>
[basic.life]/8 doesn't allow `reinterpret_cast` alone to work here,
regardless of the members of `SomeType`. Only `std::launder` (as noted in
the non-normative notation) does.
Post by Ville Voutilainen
If I have enough storage for X copies of `SomeType`, and I construct X
of them, with `sizeof(SomeType)` bytes between each, then that storage
counts as an array of X `SomeType`s for pointer arithmetic purposes.
That's the main problem we have. Pointer arithmetic is defined based on
having arrays, and I know of no language in the specification that allows
you to treat two instances created sequentially in the same storage as an
array.
The rule gets complicated if you don't simply allow all pointer arithmetic
so long as it never leaves the boundaries of a single allocated storage
block, and as long as you don't dereference it unless the created type
matches (for aliasing rules).

It would permit things like this, which not everyone here agrees should be
legal:

#include <cstddef>

struct Meow
{
int x[14];
float y;
int z;
};

int Kitty(Meow *meow)
{
meow->z = 4;
char *temp = reinterpret_cast<char *>(meow->x);
temp += offsetof(Meow, z) - offsetof(Meow, x);
*reinterpret_cast<int *>(temp) = 2;
return meow->z;
}

Also, what is to say that the expression &meow->z - meow->x wouldn't also
be well-defined? Since they both have int alignment, it's not possible
(other than with outside-the-standard things like #pragma pack) for the
distance between the two ints to not be a multiple of sizeof(int).

A rule saying that pointer arithmetic is allowed within a block of
allocated storage would also allow the trick of declaring an array at the
end of a struct as [1] and going past it if storage permits. (That writing
to primitive types without calling placement new on each one is a separate
issue).

Allowing pointer arithmetic within a block of storage but not otherwise
also preserves the undefinedness of attempting pointer arithmetic between
fully unrelated objects, which breaks under some segmented architectures.
(For example, the "large" memory model in MS-DOS and Win16, where pointers
are a 32-bit segment:offset, but size_t and ptrdiff_t are 16 bits; on such
a model, subtracting two unrelated pointers is definitely undefined
behavior, and may even cause a trap depending on the circumstances.) Along
the same lines, it preserves the ability for the compiler to assume that if
you're accessing a pointer derived from local variable p, you're not
actually accessing local variable q, because each local variable is
allocated in separate storage units. Similarly for separate global
variables.

This permission would also make well-defined certain custom heap managers.
A custom_free() would be able to do pointer arithmetic to retrieve the
allocation header from before the pointer given to it.

Melissa
--
---
You received this message because you are subscribed to the Google Groups
"ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-
discussion/.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Richard Hodges
2017-01-17 22:14:28 UTC
Permalink
What would be the problem if the above code being well-defined? Only the
most perverse compiler would make it not so.
Post by Demi Obenour
Melissa gets this correct. All code I have written operates on this
assumption, and I suspect that the vast majority of programmers also make
this assumption.
I think that there does need to be some total ordering on pointers
though. For one, memmove is trivial to implement if such an ordering
exists.
Post by Nicol Bolas
Post by Ville Voutilainen
The point of launder was to provide a mechanism for telling optimizers
not to assume
that a memory location where there was a reference or a const member
can not change.
That may be true, but the current wording around launder makes it clear
std::aligned_union_t<SomeType> val;
new(&val) SomeType();
SomeType *p = <<Convert From val to SomeType*>>
[basic.life]/8 doesn't allow `reinterpret_cast` alone to work here,
regardless of the members of `SomeType`. Only `std::launder` (as noted in
the non-normative notation) does.
Post by Ville Voutilainen
If I have enough storage for X copies of `SomeType`, and I construct X
of them, with `sizeof(SomeType)` bytes between each, then that storage
counts as an array of X `SomeType`s for pointer arithmetic purposes.
That's the main problem we have. Pointer arithmetic is defined based on
having arrays, and I know of no language in the specification that allows
you to treat two instances created sequentially in the same storage as an
array.
The rule gets complicated if you don't simply allow all pointer arithmetic
so long as it never leaves the boundaries of a single allocated storage
block, and as long as you don't dereference it unless the created type
matches (for aliasing rules).
It would permit things like this, which not everyone here agrees should be
#include <cstddef>
struct Meow
{
int x[14];
float y;
int z;
};
int Kitty(Meow *meow)
{
meow->z = 4;
char *temp = reinterpret_cast<char *>(meow->x);
temp += offsetof(Meow, z) - offsetof(Meow, x);
*reinterpret_cast<int *>(temp) = 2;
return meow->z;
}
Also, what is to say that the expression &meow->z - meow->x wouldn't also
be well-defined? Since they both have int alignment, it's not possible
(other than with outside-the-standard things like #pragma pack) for the
distance between the two ints to not be a multiple of sizeof(int).
A rule saying that pointer arithmetic is allowed within a block of
allocated storage would also allow the trick of declaring an array at the
end of a struct as [1] and going past it if storage permits. (That writing
to primitive types without calling placement new on each one is a separate
issue).
Allowing pointer arithmetic within a block of storage but not otherwise
also preserves the undefinedness of attempting pointer arithmetic between
fully unrelated objects, which breaks under some segmented architectures.
(For example, the "large" memory model in MS-DOS and Win16, where pointers
are a 32-bit segment:offset, but size_t and ptrdiff_t are 16 bits; on such
a model, subtracting two unrelated pointers is definitely undefined
behavior, and may even cause a trap depending on the circumstances.) Along
the same lines, it preserves the ability for the compiler to assume that if
you're accessing a pointer derived from local variable p, you're not
actually accessing local variable q, because each local variable is
allocated in separate storage units. Similarly for separate global
variables.
This permission would also make well-defined certain custom heap
managers. A custom_free() would be able to do pointer arithmetic to
retrieve the allocation header from before the pointer given to it.
Melissa
--
---
You received this message because you are subscribed to the Google Groups
"ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an
Visit this group at https://groups.google.com/a/is
ocpp.org/group/std-discussion/.
--
---
You received this message because you are subscribed to the Google Groups
"ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an
Visit this group at https://groups.google.com/a/isocpp.org/group/std-
discussion/.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Nicol Bolas
2017-01-18 15:26:37 UTC
Permalink
Post by Myriachan
Post by Nicol Bolas
Post by Ville Voutilainen
The point of launder was to provide a mechanism for telling optimizers
not to assume
that a memory location where there was a reference or a const member
can not change.
That may be true, but the current wording around launder makes it clear
std::aligned_union_t<SomeType> val;
new(&val) SomeType();
SomeType *p = <<Convert From val to SomeType*>>
[basic.life]/8 doesn't allow `reinterpret_cast` alone to work here,
regardless of the members of `SomeType`. Only `std::launder` (as noted in
the non-normative notation) does.
Post by Ville Voutilainen
If I have enough storage for X copies of `SomeType`, and I construct X
of them, with `sizeof(SomeType)` bytes between each, then that storage
counts as an array of X `SomeType`s for pointer arithmetic purposes.
That's the main problem we have. Pointer arithmetic is defined based on
having arrays, and I know of no language in the specification that allows
you to treat two instances created sequentially in the same storage as an
array.
The rule gets complicated if you don't simply allow all pointer arithmetic
so long as it never leaves the boundaries of a single allocated storage
block, and as long as you don't dereference it unless the created type
matches (for aliasing rules).
It's not that complicated.

You define the concept of "sequence of objects" as referring to a
contiguous collection of objects of the same type. You then redefine
pointer arithmetic to work on sequences. This also allows [basic.types]/4's
"sequence of unsigned char" object representation wording to work.

You don't need generalized pointer arithmetic within an allocation. Just
within an object sequence. The ability to use `offsetof` is essentially an
outgrowth of that.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Columbo
2017-01-27 11:28:26 UTC
Permalink
Post by Ville Voutilainen
The point of launder was to provide a mechanism for telling optimizers
not to assume
that a memory location where there was a reference or a const member
can not change.
std::aligned_union_t<SomeType> val;
new(&val) SomeType();
SomeType *p = <<Convert From val to SomeType*>>
[basic.life]/8 doesn't allow `reinterpret_cast` alone to work here, regardless of the members of `SomeType`. Only `std::launder` (as noted in the non-normative notation) does.
If I have enough storage for X copies of `SomeType`, and I construct X of them, with `sizeof(SomeType)` bytes between each, then that storage counts as an array of X `SomeType`s for pointer arithmetic purposes.
That's the main problem we have. Pointer arithmetic is defined based on having arrays, and I know of no language in the specification that allows you to treat two instances created sequentially in the same storage as an array.
The rule gets complicated if you don't simply allow all pointer arithmetic so long as it never leaves the boundaries of a single allocated storage block, and as long as you don't dereference it unless the created type matches (for aliasing rules).
It's not that complicated.
You define the concept of "sequence of objects" as referring to a contiguous collection of objects of the same type. You then redefine pointer arithmetic to work on sequences. This also allows [basic.types]/4's "sequence of unsigned char" object representation wording to work.
We could take a similar approach as for single object pointer arithmetic. I.e. every contiguous sequence of objects of the same (most-derived) type is, for the purpose of pointer arithmetic, a hypothetical array?
Post by Ville Voutilainen
You don't need generalized pointer arithmetic within an allocation. Just within an object sequence. The ability to use `offsetof` is essentially an outgrowth of that.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Nicol Bolas
2017-01-27 14:07:54 UTC
Permalink
Post by Nicol Bolas
Post by Ville Voutilainen
The point of launder was to provide a mechanism for telling optimizers
not to assume
that a memory location where there was a reference or a const member
can not change.
That may be true, but the current wording around launder makes it clear
std::aligned_union_t<SomeType> val;
new(&val) SomeType();
SomeType *p = <<Convert From val to SomeType*>>
[basic.life]/8 doesn't allow `reinterpret_cast` alone to work here,
regardless of the members of `SomeType`. Only `std::launder` (as noted in
the non-normative notation) does.
Post by Ville Voutilainen
Post by Chris Hallock
If I have enough storage for X copies of `SomeType`, and I construct X
of them, with `sizeof(SomeType)` bytes between each, then that storage
counts as an array of X `SomeType`s for pointer arithmetic purposes.
Post by Ville Voutilainen
That's the main problem we have. Pointer arithmetic is defined based on
having arrays, and I know of no language in the specification that allows
you to treat two instances created sequentially in the same storage as an
array.
Post by Ville Voutilainen
The rule gets complicated if you don't simply allow all pointer
arithmetic so long as it never leaves the boundaries of a single allocated
storage block, and as long as you don't dereference it unless the created
type matches (for aliasing rules).
Post by Ville Voutilainen
It's not that complicated.
You define the concept of "sequence of objects" as referring to a
contiguous collection of objects of the same type. You then redefine
pointer arithmetic to work on sequences. This also allows [basic.types]/4's
"sequence of unsigned char" object representation wording to work.
We could take a similar approach as for single object pointer arithmetic.
I.e. every contiguous sequence of objects of the same (most-derived) type
is, for the purpose of pointer arithmetic, a hypothetical array?
I don't like calling it a "hypothetical array". I much prefer creating a
different term, which arrays fit into, but with user-created constructs can
fit into as well. Pointer arithmetic would be defined in terms of this new
term.

Granted, this would also require scanning through the specification and
finding every place that uses "array" to decide if it should be changed to
"contiguous sequence of objects" instead.
Post by Nicol Bolas
Post by Ville Voutilainen
You don't need generalized pointer arithmetic within an allocation. Just
within an object sequence. The ability to use `offsetof` is essentially an
outgrowth of that.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Demi Obenour
2017-01-13 03:21:59 UTC
Permalink
Have you tried casting to a size_t, doing arithmetic on that, and then
casting back to a pointer? Does -fno-strict-aliasing help? What about both
together? Using char pointers?
Post by b***@gmail.com
Sometimes, there's a need a fixed-capacity vector. It's got some storage,
template <class T, size_t Capacity>
class fixed_vector {
using storage_t = std::aligned_storage_t<sizeof(T), alignof(T)>;
size_t length_;
storage_t data_[Capacity];
// ...
T* begin() { return reinterpret_cast<T*>(&data_[0]); }
T* end() { return begin() + length_; } // <== (a)
T* end() { return reinterpret_cast<T*>(&data_[length]); } // <== (b)
// ...
};
(a) There's no T array at &data_[0], so the pointer arithmetic is UB (if
length_ > 1)
(b) This is probably UB in itself (no T at &data_[length_]), but then
additionally any algorithm I pass begin() and end() into will incur UB by
doing its own pointer arithmetic when it traverses that range.
One approach might then to be to implement a real iterator that is a
storage_t* but dereferences to a T. This is an annoying indirection, but
solves the begin()/end() problems, and hopefully the compiler can figure
out what I really mean. But then, what if I want to expose data()?
T* data() { return reinterpret_cast<T*>(&data_[0]); }
Wouldn't just about any use of data() be UB, regardless of what happens?
This isn't merely intellectual. I ask because gcc optimized away a bunch
of code that I actually need to run, and I'm not sure that it was a gcc bug
to have done so.
--
---
You received this message because you are subscribed to the Google Groups
"ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an
Visit this group at https://groups.google.com/a/isocpp.org/group/std-
discussion/.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Thiago Macieira
2017-01-13 17:09:03 UTC
Permalink
Post by Demi Obenour
Have you tried casting to a size_t, doing arithmetic on that, and then
casting back to a pointer? Does -fno-strict-aliasing help? What about both
together? Using char pointers?
Compiler switches do not apply to the standard. The OP's question is how to do
this without triggering UB as per the standard.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
c***@gmail.com
2017-01-31 23:16:45 UTC
Permalink
I'm wondering if using a union would solve the problems...

template <class T, size_t Capacity>
class fixed_vector {
using union_t = union ustorage { T t; ~ustorage(){} ustorage(){}};
size_t length_;
union {
union_t udata_[Capacity];
T tdata_[Capacity];
};
public:
// ...
fixed_vector():length_(0) {}

void push_back(T&&t)
{
new (&udata_[length_++].t) T(move(t));
}

~fixed_vector()
{
while (length_)
{
udata_[--length_].t.~T();
}
}

T* begin() { return (&udata_[0].t); }
T* end() { return begin() + length_; }
T* data() { return tdata_; }

};

The original code did not have a T array, but this change has one (as an
inactive union member). As long as the "active" part of the array is always
starts at udata[0], the rules for unions with common initial sequence
should apply. Maybe?
Or, although this seems to work, I would not be surprised if UB is still
there.

See also:

http://stackoverflow.com/questions/11373203/accessing-inactive-union-member-and-undefined-behavior
http://en.cppreference.com/w/cpp/language/union

--Chet.
Post by b***@gmail.com
Sometimes, there's a need a fixed-capacity vector. It's got some storage,
template <class T, size_t Capacity>
class fixed_vector {
using storage_t = std::aligned_storage_t<sizeof(T), alignof(T)>;
size_t length_;
storage_t data_[Capacity];
// ...
T* begin() { return reinterpret_cast<T*>(&data_[0]); }
T* end() { return begin() + length_; } // <== (a)
T* end() { return reinterpret_cast<T*>(&data_[length]); } // <== (b)
// ...
};
(a) There's no T array at &data_[0], so the pointer arithmetic is UB (if
length_ > 1)
(b) This is probably UB in itself (no T at &data_[length_]), but then
additionally any algorithm I pass begin() and end() into will incur UB by
doing its own pointer arithmetic when it traverses that range.
One approach might then to be to implement a real iterator that is a
storage_t* but dereferences to a T. This is an annoying indirection, but
solves the begin()/end() problems, and hopefully the compiler can figure
out what I really mean. But then, what if I want to expose data()?
T* data() { return reinterpret_cast<T*>(&data_[0]); }
Wouldn't just about any use of data() be UB, regardless of what happens?
This isn't merely intellectual. I ask because gcc optimized away a bunch
of code that I actually need to run, and I'm not sure that it was a gcc bug
to have done so.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Thiago Macieira
2017-02-01 00:04:22 UTC
Permalink
Post by c***@gmail.com
The original code did not have a T array, but this change has one (as an
inactive union member). As long as the "active" part of the array is always
starts at udata[0], the rules for unions with common initial sequence
should apply. Maybe?
This would help in your case for fixed_vector.

But how do you do it for a dynamically allocated one?

Even what I've so far proposed would only apply to trivial types and your
union, because of a user-supplied constructor, isn't one.

I'm not sure whether adding a non-trivial union wrapper to trivial types helps
or hinders more. For non-trivial element types, you'd need to placement-new
them anyway, so I don't see that as a problem.

Finally, there's the question of whether an array would benefit from the rules
as well.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Continue reading on narkive:
Loading...