This article is not about "delayed initialization" or "lazy loading of singletons". It is about allocating the space an object needs without constructing the object in it (i.e., not yet starting the object's lifetime), or, put more simply, about skipping the execution of the object's constructor.
Usage Scenarios
We know that whether we define an object of a certain type or create one with operator new, the memory is requested and the object's constructor is executed immediately. This is the behavior we expect most of the time.
But there are also a few times when we want the construction of an object not to be executed immediately, but to be deferred.
Lazy loading is one such scenario: the object may be expensive to construct, so we want to create it only when we actually need it.
Another scenario is in a container like small_vector.
small_vector requests a block of stack space up front and then provides a vector-like API that lets the user insert/delete/update elements. Stack space cannot be requested dynamically as easily as heap space, so code that needs stack space is usually written this way:
template <typename Elem, std::size_t N>
class small_vec
{
    std::array<Elem, N> data;
};
I know there are functions like alloca that can be used. However, alloca performs poorly and is not portable, and practically every reference you can find advises against using it in production. The same goes for VLAs, which are not even standard C++ syntax.
Back on topic, there are two downsides to writing this way:
- The type Elem must be default-constructible; otherwise you have to initialize every element of the array explicitly in the constructor.
- Suppose N is 10: we request space for 10 Elems but end up using only 8 (a common situation for vector-like containers), yet Elem is constructed ten times. That is obviously wasteful, and worse, these default-constructed objects are useless and get overwritten later by push_back, so none of the ten constructions should have happened.
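To make the second point concrete, here is a minimal sketch. Counted and constructions_for_array are made-up names for illustration only; the sketch assumes C++17 (for the inline static member) and simply counts how many default constructions a std::array member-style declaration triggers before anything useful is stored.

```cpp
#include <array>
#include <cstddef>

// Hypothetical element type that counts how often it is default-constructed.
struct Counted {
    static inline int default_constructions = 0;
    int value = 0;
    Counted() { ++default_constructions; }
};

// Creates a std::array of N elements without storing anything useful in it
// and reports how many default constructions that triggered.
template <std::size_t N>
int constructions_for_array() {
    Counted::default_constructions = 0;
    std::array<Counted, N> data;  // default-initializes all N elements
    (void)data;
    return Counted::default_constructions;
}
```

Calling constructions_for_array<10>() reports ten constructions even though no element was ever "used", which is exactly the cost the list above describes.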
C++ preaches "don't pay for what you don't use", so deferred construction is a pressing need for containers built on stack space, such as small_vec.
As a language that pursues performance and expressiveness, C++ offers quite a few ways to meet this need, so let's pick three common ones to introduce.
Utilizing std::byte and placement new
The first approach is the trickiest. C++ allows an array of std::byte to provide storage for an object, and allows an object's memory to be inspected as std::byte. So the first option is to use a std::byte array/container in place of the original object array. Constructing the array then only creates std::byte values and does not construct any Elem, and creating a std::byte is trivial, i.e., it does nothing (the bytes are only zeroed if the std::array is value-initialized, for example with {}).
This naturally bypasses Elem's constructor. Let's take a look at the code:
template <typename Elem, std::size_t N>
class small_vec
{
    // SIZE_MAX comes from <cstdint>. Prevent size_t wraparound from making the requested space smaller than required
    static_assert(SIZE_MAX/N > sizeof(Elem));
    // Besides the size, the alignment must also be set correctly, otherwise accessing the Elem objects is an error
    alignas(Elem) std::array<std::byte, sizeof(Elem)*N> data;
    std::size_t size = 0;
};
In addition to the points made in the comments, beware of requesting more space than the stack size set by the system.
I say this approach is the trickiest because instead of constructing Elem directly, we substitute std::byte for it. While it's true that we no longer construct N Elem objects by default, the code gets complicated when we actually need to store or retrieve an Elem.
The first case is push_back, in which we need to use "placement new" to construct the object on a run of consecutive std::byte:
template <typename Elem, std::size_t N>
void small_vec<Elem, N>::push_back(const Elem &e)
{
    // Check whether size has reached the capacity of data; only continue adding the element if it has not.
    new(&this->data[this->size*sizeof(Elem)]) Elem(e);
    ++this->size;
}
You can see that we construct an Elem object directly at the corresponding location. If you can use C++20, there is also a wrapper function available that simplifies the code: std::construct_at.
The code for retrieving an element looks cumbersome, mainly because of the type conversion required:
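template <typename Elem, std::size_t N>
Elem& small_vec<Elem, N>::at(std::size_t idx)
{
    if (idx >= this->size) {
        throw Error{}; // Error stands for whatever exception type you use
    }
    // std::launder (from <new>, C++17) makes the pointer refer to the Elem that lives in these bytes
    return *std::launder(reinterpret_cast<Elem*>(&this->data[idx*sizeof(Elem)]));
}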
In the destructor we have to call Elem's destructor ourselves, because the array holds bytes and will not destroy the Elem objects for us:
~small_vec()
{
    // Only the first size slots hold live Elem objects
    for (std::size_t idx = 0; idx < size; ++idx) {
        Elem *e = std::launder(reinterpret_cast<Elem*>(&this->data[idx*sizeof(Elem)]));
        e->~Elem();
    }
}
This scheme is the most common because it works for more than just stack storage. It is also very error-prone: we must always compute the real byte offset at which each object lives, and we have to keep track of whether each object needs to be destroyed, which is a heavy mental burden.
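Putting the pieces of option 1 together, a self-contained sketch might look like the following. It is only one possible arrangement under my own assumptions: the class is renamed byte_small_vec, a private helper slot() hides the offset arithmetic, assert replaces the exception from the text, copying is disabled to keep it short, and N is assumed to be greater than zero.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

template <typename Elem, std::size_t N>
class byte_small_vec {
    // Raw, suitably aligned storage; no Elem is constructed here.
    alignas(Elem) std::byte data_[sizeof(Elem) * N];
    std::size_t size_ = 0;

    // Centralizes the offset arithmetic and the cast for live elements.
    Elem* slot(std::size_t idx) {
        return std::launder(reinterpret_cast<Elem*>(data_ + idx * sizeof(Elem)));
    }

public:
    byte_small_vec() = default;
    byte_small_vec(const byte_small_vec&) = delete;
    byte_small_vec& operator=(const byte_small_vec&) = delete;

    // Destroy only the elements that were actually constructed.
    ~byte_small_vec() {
        for (std::size_t i = 0; i < size_; ++i) slot(i)->~Elem();
    }

    void push_back(const Elem& e) {
        assert(size_ < N && "capacity exceeded");
        ::new (static_cast<void*>(data_ + size_ * sizeof(Elem))) Elem(e);
        ++size_;
    }

    Elem& at(std::size_t idx) {
        assert(idx < size_);
        return *slot(idx);
    }
};

// Hypothetical helper: stores two strings and joins them.
std::string byte_small_vec_demo() {
    byte_small_vec<std::string, 4> v;
    v.push_back("lazy");
    v.push_back("init");
    return v.at(0) + "-" + v.at(1);
}
```

Funnelling every access through one helper does not remove the risk, but it at least keeps the index calculation in a single place.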
Using union
Using a bare union directly is usually discouraged in C++; if you do use one, it should normally be a tagged union.
However, unions are naturally good at skipping construction and destruction: if a member of a union has a non-trivial default constructor or destructor, then the union's own default constructor or destructor, respectively, is deleted and must be defined by the user. The union also guarantees that no member is initialized or destroyed except those explicitly spelled out in its constructor and destructor.
This means that a union inherently skips the constructors of its members: by simply writing a default constructor for the union that does nothing, we guarantee that its members' constructors are not run automatically.
Look at an example:
#include <iostream>

class Data
{
public:
    Data()
    {
        std::cout << "constructor\n";
    }
    ~Data()
    {
        std::cout << "destructor\n";
    }
};
union LazyData
{
    LazyData() {}
    ~LazyData() {} // Try deleting these two lines and seeing the error.
    Data data;
};
int main()
{
    LazyData d; // Nothing will be output.
}
Output: (nothing is printed)
If LazyData is declared as a struct instead (struct LazyData), the lines "constructor" and "destructor" are output. So we can see that the execution of the constructor is indeed skipped.
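The struct-versus-union difference can also be checked programmatically. In this sketch, Traced, EagerHolder, LazyHolder, event_log and events_for are all made-up names; Traced plays the role of Data but records events in a string instead of printing them. It assumes C++17 for the inline variable.

```cpp
#include <string>

// Records constructor and destructor calls instead of printing them.
inline std::string event_log;

// Hypothetical stand-in for the article's Data.
struct Traced {
    Traced() { event_log += "ctor;"; }
    ~Traced() { event_log += "dtor;"; }
};

struct EagerHolder { Traced data; };  // member is constructed and destroyed

union LazyHolder {
    LazyHolder() {}   // deliberately does not construct data
    ~LazyHolder() {}  // deliberately does not destroy data
    Traced data;
};

// Creates and destroys one Holder and returns what was logged meanwhile.
template <typename Holder>
std::string events_for() {
    event_log.clear();
    { Holder h; (void)h; }
    return event_log;
}
```

With the struct the log contains both events; with the union it stays empty, which matches the behavior described above.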
A union also has the benefit that the compiler works out the size and alignment needed for the type automatically. The array index is now simply the index of the object, so the code is much simpler:
template <typename Elem, std::size_t N>
class small_vec
{
    union ArrElem
    {
        ArrElem() {}
        ~ArrElem() {}
        Elem value;
    };
    std::array<ArrElem, N> data; // No need to manually calculate the size and alignment again, less prone to errors
    std::size_t size = 0;
};
Option 2 does not construct elements automatically either, so adding an element still relies on placement new. Here we use the aforementioned std::construct_at to simplify the code:
template <typename Elem, std::size_t N>
void small_vec<Elem, N>::push_back(const Elem &e)
{
    // Check whether size has reached the capacity of data; only continue adding the element if it has not.
    std::construct_at(std::addressof(this->data[this->size++].value), e);
}
Getting an element is also relatively simple, as there is no longer any need for an explicit type conversion:
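template <typename Elem, std::size_t N>
Elem& small_vec<Elem, N>::at(std::size_t idx)
{
    if (idx >= this->size) {
        throw Error{}; // Error stands for whatever exception type you use
    }
    return this->data[idx].value;
}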
The same goes for the destructor of small_vec: we have to destroy the elements manually, so I won't write it out here. Also, never destroy a member of the union inside the union's own destructor. Remember that a union member may never have been constructed, and calling the destructor of an object that was never constructed is undefined behavior.
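For completeness, here is one way the omitted destructor could look, embedded in a self-contained sketch. The names union_small_vec, Slot, Live and union_small_vec_demo are mine; the sketch uses placement new and std::destroy_at so that it compiles as C++17, uses assert instead of an exception, and disables copying for brevity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

template <typename Elem, std::size_t N>
class union_small_vec {
    union Slot {
        Slot() {}   // leaves value unconstructed
        ~Slot() {}  // must not touch value: it may never have been constructed
        Elem value;
    };
    std::array<Slot, N> data_;
    std::size_t size_ = 0;

public:
    union_small_vec() = default;
    union_small_vec(const union_small_vec&) = delete;
    union_small_vec& operator=(const union_small_vec&) = delete;

    // Only the container knows which slots are live, so it destroys them.
    ~union_small_vec() {
        for (std::size_t i = 0; i < size_; ++i)
            std::destroy_at(std::addressof(data_[i].value));
    }

    void push_back(const Elem& e) {
        assert(size_ < N && "capacity exceeded");
        ::new (static_cast<void*>(std::addressof(data_[size_].value))) Elem(e);
        ++size_;
    }

    Elem& at(std::size_t idx) { assert(idx < size_); return data_[idx].value; }
};

// Hypothetical element type that tracks how many instances are alive.
struct Live {
    static inline int alive = 0;
    Live() { ++alive; }
    Live(const Live&) { ++alive; }
    ~Live() { --alive; }
};

// Returns the live count while the vector holds two copies of sample.
int union_small_vec_demo() {
    union_small_vec<Live, 8> v;  // constructs no Live objects
    Live sample;
    v.push_back(sample);
    v.push_back(sample);
    return Live::alive;
}
```

Inside the demo three objects are alive (sample plus two copies); once it returns, the count drops back to zero, which shows that the container's destructor destroyed exactly the slots it had filled.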
Option 2 is simpler than 1, but still has the annoyance of needing to construct and destruct manually, and if you forget somewhere you're going to get a memory error.
Using std::optional
The first two solutions rely on size to tell whether an object has been initialized, and require managing object lifetimes by hand. Both are potentially risky, because anything manual is fragile.
std::optional is just the thing to solve this problem, even though it was not originally created for this purpose. The default constructor of std::optional only constructs an optional in the "empty" state, which means that no Elem is constructed. Most importantly, optional automatically manages the lifetime of the value stored in it and destroys it when the time comes.
The code can now be changed to look like this:
template <typename Elem, std::size_t N>
class small_vec
{
    std::array<std::optional<Elem>, N> data; // Automated lifecycle management
    std::size_t size = 0;
};
Since we no longer have to destroy elements manually, small_vec does not even need a user-written destructor; the default generated one is enough.
Adding and fetching elements is also made easy. Adding is assigning a value to an optional, and fetching is calling a member function of the optional:
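template <typename Elem, std::size_t N>
void small_vec<Elem, N>::push_back(const Elem &e)
{
    // Check whether size has reached the capacity of data; only continue adding the element if it has not.
    this->data[this->size++] = e; // assigning engages the optional; size must advance as well
}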
template <typename Elem, std::size_t N>
Elem& small_vec<Elem, N>::at(std::size_t idx)
{
    if (idx >= this->size) {
        throw Error{}; // Error stands for whatever exception type you use
    }
    return *this->data[idx]; // value() also works; it additionally checks the state and throws std::bad_optional_access if the optional is empty
}
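A self-contained version of option 3 could be sketched as follows. optional_small_vec, slot_engaged, Label and optional_small_vec_demo are illustrative names of my own, assert stands in for the exception, and emplace is used here where the text assigns; both approaches engage the optional.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

template <typename Elem, std::size_t N>
class optional_small_vec {
    std::array<std::optional<Elem>, N> data_;  // all slots start empty
    std::size_t size_ = 0;

public:
    void push_back(const Elem& e) {
        assert(size_ < N && "capacity exceeded");
        data_[size_].emplace(e);  // constructs the Elem in place
        ++size_;
    }
    Elem& at(std::size_t idx) { assert(idx < size_); return *data_[idx]; }

    // Exposed only so the sketch can show that unused slots stay empty.
    bool slot_engaged(std::size_t idx) const { return data_[idx].has_value(); }
};

// Hypothetical element type without a default constructor.
struct Label {
    std::string text;
    explicit Label(std::string t) : text(std::move(t)) {}
};

std::string optional_small_vec_demo() {
    optional_small_vec<Label, 4> v;  // compiles although Label has no default constructor
    v.push_back(Label{"first"});
    return v.at(0).text + (v.slot_engaged(1) ? "+more" : "");
}
```

Note that the class needs no destructor at all, and that an element type without a default constructor is accepted without extra work.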
But using optional is not free. To tell whether it is empty, an optional needs an extra flag to record its state, which takes additional memory. In our case size already tells us whether a slot holds a value (every slot whose index is smaller than size certainly does), so this extra overhead is unnecessary. Many of optional's member functions also have to check the current state, which is slightly less efficient.
The extra overhead of determining state is usually irrelevant unless it's in a performance hotspot, but the extra memory cost can be tricky, especially on a space-constrained place like the stack. Let's look at the specific overhead:
#include <iostream>
#include <optional>

union ArrElem
{
    ArrElem() {}
    ~ArrElem() {}
    long value;
};
int main()
{
    ArrElem arr1[10];
    std::optional<long> arr2[10];
    std::cout << "sizeof long: " << sizeof(long) << '\n';
    std::cout << "sizeof ArrElem arr1[10]: " << sizeof(arr1) << '\n';
    std::cout << "sizeof std::optional<long> arr2[10]: " << sizeof(arr2) << '\n';
}
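On MSVC long is 4 bytes, so the output is as follows:
sizeof long: 4
sizeof ArrElem arr1[10]: 40
sizeof std::optional<long> arr2[10]: 80
Under GCC on Linux x64 long is 8 bytes and the output becomes this:
sizeof long: 8
sizeof ArrElem arr1[10]: 80
sizeof std::optional<long> arr2[10]: 160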
That means the optional array takes twice as much memory as the union array.
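If you want to estimate the cost for your own element type, a tiny helper like the one below can do it at compile time. optional_overhead and buffer_overhead are hypothetical helpers of mine; the actual numbers depend on the standard library implementation and on the alignment of T, because the flag is usually padded up to T's alignment.

```cpp
#include <cstddef>
#include <optional>

// Bytes per slot that std::optional<T> adds on top of a bare T.
template <typename T>
constexpr std::size_t optional_overhead() {
    return sizeof(std::optional<T>) - sizeof(T);
}

// Extra bytes for an N-slot inline buffer that stores optional<T> instead of T.
template <typename T, std::size_t N>
constexpr std::size_t buffer_overhead() {
    return N * optional_overhead<T>();
}
```

For a small element type with large alignment the flag can double the slot size, as with long above, while for a large element type the relative overhead is much smaller.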
That is why many container libraries, such as Google's, go with option 2 or option 1; option 3 is rarely used in such libraries.
Summary
Why didn't I recommend std::variant? Isn't it the preferred alternative to union in modern C++?
The reason is that, besides wasting memory just like optional, it requires the type of its first template parameter to be default-constructible; otherwise you have to use std::monostate as a placeholder. So using it for deferred construction wastes memory and makes the code verbose with no obvious benefit, and I don't recommend it.
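To show what the std::monostate workaround looks like in practice, here is a short sketch. Token, LazyToken and variant_demo are names I made up for the example.

```cpp
#include <string>
#include <utility>
#include <variant>

// Hypothetical element type without a default constructor.
struct Token {
    std::string text;
    explicit Token(std::string t) : text(std::move(t)) {}
};

// std::variant<Token> could not be default-constructed, because Token has
// no default constructor, so std::monostate goes first as a placeholder.
using LazyToken = std::variant<std::monostate, Token>;

// Reports which alternative is active before and after deferred construction.
std::string variant_demo() {
    LazyToken slot;  // holds std::monostate, no Token yet
    std::string before = std::holds_alternative<Token>(slot) ? "token" : "empty";
    slot.emplace<Token>("ready");  // Token is constructed only now
    return before + "->" + std::get<Token>(slot).text;
}
```

It works, but every access has to name the alternative, and the variant still stores a discriminator next to the value, so it offers nothing over optional for this use case.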
Option 1 is not really recommended either. It is like dancing on the edge of a knife: in skilled hands it works well, but a single slip is fatal.
My advice: if you just want deferred construction and are not very sensitive to wasted memory, go with std::optional. Otherwise, go with Option 2.