C++ auto

| Category: C++  | Tags: C++_keywords  Effective_Modern_C++

Item 5: Prefer auto to Explicit Type Declarations

Let’s move on to the simple joy of declaring a local variable to be initialized by dereferencing an iterator:

template<typename It> // algorithm to dwim ("do what I mean")
void dwim(It b, It e) // for all elements in range from
{                    // b to e
    while (b != e) {
        typename std::iterator_traits<It>::value_type
        currValue = *b;
        ...
    }
}

auto variables have their type deduced from their initializer, so they must be initialized. That means you can wave goodbye to a host of uninitialized variable problems as you speed by on the modern C++ superhighway:
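For instance (x1 through x3 are just illustrative names):

int x1;        // potentially uninitialized
auto x2;       // error! initializer required
auto x3 = 0;   // fine, x3's value is well-defined

And the iterator-dereferencing example above simplifies to: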

template<typename It> // algorithm to dwim ("do what I mean")
void dwim(It b, It e) // for all elements in range from
{                    // b to e
    while (b != e) {
        auto currValue = *b;
        ...
    }
}

And because auto uses type deduction (see Item 2), it can represent types known only to compilers:

auto derefUPLess =
    [](const std::unique_ptr<Widget>& p1,
       const std::unique_ptr<Widget>& p2)
    { return *p1 < *p2; };

In C++14, the temperature drops further, because parameters to lambda expressions may involve auto:

auto derefUPLess =
    [](const auto& p1,
       const auto& p2)
    { return *p1 < *p2; };
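As a usage sketch (assuming a minimal Widget with a comparable member, since the real Widget isn't defined here), the closure can be passed directly to an algorithm such as std::sort:

#include <algorithm>
#include <memory>
#include <vector>

struct Widget { int value; };                                 // hypothetical minimal Widget
bool operator<(const Widget& a, const Widget& b) { return a.value < b.value; }

int main()
{
    auto derefUPLess = [](const auto& p1, const auto& p2)     // C++14 generic lambda
                       { return *p1 < *p2; };

    std::vector<std::unique_ptr<Widget>> widgets;
    widgets.push_back(std::make_unique<Widget>(Widget{3}));
    widgets.push_back(std::make_unique<Widget>(Widget{1}));
    widgets.push_back(std::make_unique<Widget>(Widget{2}));

    std::sort(widgets.begin(), widgets.end(), derefUPLess);   // compares *p, not the pointers
}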

Because lambda expressions yield callable objects, closures can be stored in std::function objects. That means we could declare the C++11 version of derefUPLess without using auto as follows:

std::function<bool(const std::unique_ptr<Widget>&,
                   const std::unique_ptr<Widget>&)>
    derefUPLess = [](const std::unique_ptr<Widget>& p1,
                     const std::unique_ptr<Widget>& p2)
                  { return *p1 < *p2; };

A std::function object typically uses more memory than the auto-declared object. An auto-declared variable holding the closure has the same type as the closure, so it needs only as much memory as the closure requires; a std::function instantiation, by contrast, has a fixed size for any given signature, and when that size isn't adequate for the closure it's asked to store, the std::function constructor allocates heap memory to hold it. And, thanks to implementation details that restrict inlining and yield indirect function calls, invoking a closure via a std::function object is almost certain to be slower than calling it via an auto-declared object. In other words, the std::function approach is generally bigger and slower than the auto approach, and it may yield out-of-memory exceptions, too.
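A sketch of the size difference (again assuming a minimal Widget; the exact numbers depend on the standard library implementation):

#include <cstdio>
#include <functional>
#include <memory>

struct Widget { int value; };                                  // hypothetical minimal Widget
bool operator<(const Widget& a, const Widget& b) { return a.value < b.value; }

int main()
{
    auto derefUPLess =
        [](const std::unique_ptr<Widget>& p1,
           const std::unique_ptr<Widget>& p2)
        { return *p1 < *p2; };

    std::function<bool(const std::unique_ptr<Widget>&,
                       const std::unique_ptr<Widget>&)> derefUPLessFn = derefUPLess;

    // A captureless closure is typically one byte; the std::function wrapper
    // adds fixed-size type-erasure storage on top (and may heap-allocate for
    // callables that don't fit in its internal buffer).
    std::printf("closure: %zu bytes, std::function: %zu bytes\n",
                sizeof derefUPLess, sizeof derefUPLessFn);
}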

The advantages of auto extend beyond the avoidance of uninitialized variables, verbose variable declarations, and the ability to directly hold closures. One is the ability to avoid what I call problems related to “type shortcuts.” Here’s something you’ve probably seen—possibly even written:

std::vector<int> v;
...
unsigned sz = v.size();

The official return type of v.size() is std::vector<int>::size_type, but few developers are aware of that. std::vector<int>::size_type is specified to be an unsigned integral type, so a lot of programmers figure that unsigned is good enough and write code such as the above. This can have some interesting consequences. On 32-bit Windows, for example, both unsigned and std::vector<int>::size_type are the same size, but on 64-bit Windows, unsigned is 32 bits, while std::vector<int>::size_type is 64 bits. This means that code that works under 32-bit Windows may behave incorrectly under 64-bit Windows, and when porting your application from 32 to 64 bits, who wants to spend time on issues like that? Using auto ensures that you don’t have to:

auto sz = v.size(); // sz's type is std::vector<int>::size_type
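A minimal sketch of the difference (the printed sizes are platform-dependent; on typical 64-bit targets unsigned is 4 bytes and size_type is 8):

#include <cstdio>
#include <type_traits>
#include <vector>

int main()
{
    std::vector<int> v{1, 2, 3};

    unsigned sz1 = v.size();   // may narrow: size_type is often 64 bits, unsigned 32
    auto sz2 = v.size();       // sz2 is exactly std::vector<int>::size_type

    static_assert(std::is_same<decltype(sz2),
                               std::vector<int>::size_type>::value,
                  "auto deduces the container's size_type");

    std::printf("sizeof(unsigned) = %zu, sizeof(size_type) = %zu\n",
                sizeof sz1, sizeof sz2);
}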

Then consider this code:

std::unordered_map<std::string, int> m;
...
for (const std::pair<std::string, int>& p : m)
{
    ...             // do something with p
}

Recognizing what’s amiss requires remembering that the key part of a std::unordered_map is const, so the type of std::pair in the hash table (which is what a std::unordered_map is) isn’t std::pair<std::string, int>, it’s std::pair<const std::string, int>. But that’s not the type declared for the variable p in the loop above. As a result, compilers will strive to find a way to convert std::pair<const std::string, int> objects (i.e., what’s in the hash table) to std::pair<std::string, int> objects (the declared type for p). They’ll succeed by creating a temporary object of the type that p wants to bind to by copying each object in m, then binding the reference p to that temporary object. At the end of each loop iteration, the temporary object will be destroyed. If you wrote this loop, you’d likely be surprised by this behavior, because you’d almost certainly intend to simply bind the reference p to each element in m.

Such unintentional type mismatches can be autoed away:

for (const auto& p : m)
{
    ...
}
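A small sketch (with arbitrary map contents) makes the difference observable by comparing the address p binds to against the address of the element actually stored in m:

#include <cstdio>
#include <string>
#include <unordered_map>

int main()
{
    std::unordered_map<std::string, int> m{{"key", 42}};

    const void* inMap = &*m.begin();                    // address of the element stored in m

    for (const std::pair<std::string, int>& p : m)      // wrong value_type: p binds to a temporary copy
        std::printf("element of m? %s\n",
                    static_cast<const void*>(&p) == inMap ? "yes" : "no");   // prints "no"

    for (const auto& p : m)                             // correct type: p binds directly to the element
        std::printf("element of m? %s\n",
                    static_cast<const void*>(&p) == inMap ? "yes" : "no");   // prints "yes"
}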

The last two examples—writing unsigned when you should have written std::vector<int>::size_type and writing std::pair<std::string, int> when you should have written std::pair<const std::string, int>—demonstrate how explicitly specifying types can lead to implicit conversions that you neither want nor expect. If you use auto as the type of the target variable, you need not worry about mismatches between the type of variable you’re declaring and the type of the expression used to initialize it.

Item 6: Use the explicitly typed initializer idiom when auto deduces undesired types

Item 5 explains that using auto to declare variables offers a number of technical advantages over explicitly specifying types, but sometimes auto’s type deduction zigs when you want it to zag. For example, suppose I have a function that takes a Widget and returns a std::vector<bool>, where each bool indicates whether the Widget offers a particular feature:

std::vector<bool> features(const Widget& w);

Further suppose that bit 5 indicates whether the Widget has high priority. We can thus write code like this:

Widget w;
...
bool highPriority = features(w)[5]; // is w high priority?
...
processWidget(w, highPriority); // process w in accord
                                // with its priority

There’s nothing wrong with this code. It’ll work fine. But if we make the seemingly innocuous change of replacing the explicit type for highPriority with auto,

auto highPriority = features(w)[5];   // is w high priority?

the situation changes. All the code will continue to compile, but its behavior is no longer predictable:

processWidget(w, highPriority);    // undefined behavior!

As the comment indicates, the call to processWidget now has undefined behavior. But why? The answer is likely to be surprising. In the code using auto, the type of highPriority is no longer bool. Though std::vector<bool> conceptually holds bools, operator[] for std::vector<bool> doesn’t return a reference to an element of the container (which is what std::vector::operator[] returns for every type except bool). Instead, it returns an object of type std::vector<bool>::reference (a class nested inside std::vector<bool>).

std::vector<bool>::reference exists because std::vector<bool> is specified to represent its bools in packed form, one bit per bool. That creates a problem for std::vector<bool>’s operator[], because operator[] for std::vector<T> is supposed to return a T&, but C++ forbids references to bits. Not being able to return a bool&, operator[] for std::vector<bool> returns an object that acts like a bool&. That is also the source of the undefined behavior above: features(w) returns a temporary std::vector<bool>, and the proxy that operator[] returns typically holds a pointer into that temporary’s bit storage. The temporary is destroyed at the end of the statement, so by the time highPriority is used in the call to processWidget, it holds a dangling pointer.
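The deduced type is easy to verify (a minimal sketch; the vector here is a named local, so it only demonstrates the deduction, not the dangling-pointer problem above):

#include <type_traits>
#include <vector>

int main()
{
    std::vector<bool> bits{false, false, false, false, false, true};

    auto flag = bits[5];         // deduced type is std::vector<bool>::reference
    bool copy = bits[5];         // the proxy converts to a genuine bool

    static_assert(std::is_same<decltype(flag),
                               std::vector<bool>::reference>::value,
                  "auto deduces the proxy type, not bool");
    static_assert(std::is_same<decltype(copy), bool>::value,
                  "an explicit bool forces the conversion");

    (void)flag;
    (void)copy;
}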

std::vector<bool>::reference is an example of a proxy class: a class that exists for the purpose of emulating and augmenting the behavior of some other type. Proxy classes are employed for a variety of purposes. std::vector<bool>::reference exists to offer the illusion that operator[] for std::vector<bool> returns a reference to a bit, for example, and the Standard Library’s smart pointer types (see Chapter 4) are proxy classes that graft resource management onto raw pointers. The utility of proxy classes is well-established. In fact, the design pattern “Proxy” is one of the most longstanding members of the software design patterns Pantheon.

As a general rule, “invisible” proxy classes don’t play well with auto. Objects of such classes are often not designed to live longer than a single statement, so creating variables of those types tends to violate fundamental library design assumptions. That’s the case with std::vector<bool>::reference, and we’ve seen that violating that assumption can lead to undefined behavior. You therefore want to avoid code of this form:

auto someVar = expression of "invisible" proxy class type;

The explicitly typed initializer idiom involves declaring a variable with auto, but casting the initialization expression to the type you want auto to deduce. Here’s how it can be used to force highPriority to be a bool, for example:

auto highPriority = static_cast<bool>(features(w)[5]); // casting to bool

Applications of the idiom aren’t limited to initializers yielding proxy class types. It can also be useful to emphasize that you are deliberately creating a variable of a type that is different from that generated by the initializing expression. For example, suppose you have a function to calculate some tolerance value:

double calcEpsilon();  // return tolerance value

calcEpsilon clearly returns a double, but suppose you know that for your application, the precision of a float is adequate, and you care about the difference in size between floats and doubles. You could declare a float variable to store the result of calcEpsilon,

float ep = calcEpsilon();    // implicitly convert
                             // double → float

but this hardly announces “I’m deliberately reducing the precision of the value returned by the function.” A declaration using the explicitly typed initializer idiom, however, does:

auto ep = static_cast<float>(calcEpsilon());

The explicitly typed initializer idiom, then, is a way to emphasize the type of the variable you’re declaring: it improves the readability of your code and makes explicit which type you intend auto to deduce.

