poita.org

Imagine C++ Without Function Overloading

Posted: 2012-04-29

I’ve spent quite a lot of time in the D programming language community, and one thing I’ve noticed is that people are very quick to suggest new features, but rarely suggest removing one (unless it is outright broken). It’s easy to see the benefits of adding to a language, but the costs are often harder to see, or are simply ignored.

For a bit of fun, this post will explore the benefits of removing a useful feature from C++: function overloading.

Quick recap: function overloading is a feature of C++ that allows you to have two functions with the same name that differ only in their parameters. In C, you would have to name your functions differently, e.g. min and fmin.

int min(int x, int y) { return x < y ? x : y; }
float fmin(float x, float y) { return x < y ? x : y; }

int a = min(1, 2);
float b = fmin(1.0f, 2.0f);

In C++, you give them the same name, and the version is chosen based on the types of the arguments.

int min(int x, int y) { return x < y ? x : y; }
float min(float x, float y) { return x < y ? x : y; }

int a = min(1, 2);
float b = min(1.0f, 2.0f);

In C, you could use macros as a hacky form of overloading, and in C++ you could use templates, but I’ll ignore those for the moment.

One clear advantage to overloading is that you don’t need to name your functions differently if they do the same things, but on different types. It makes things easier to remember and arguably makes the intent of your code clearer. It also means that you can change the types of your arguments (e.g. changing from float to double) without having to go around changing all your function names. It can also help with generic programming.
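To make the generic programming point concrete, here is a minimal sketch. The namespace sketch and the helper min3 are hypothetical names of my own; the two min overloads are the ones from the recap above.

```cpp
#include <cassert>

namespace sketch {

int min(int x, int y) { return x < y ? x : y; }
float min(float x, float y) { return x < y ? x : y; }

// Hypothetical helper: works for any T that has a matching min overload.
// The call to min is resolved separately for each instantiation, so no
// type-specific function name has to appear in the template body.
template <typename T>
T min3(T a, T b, T c) {
    return min(min(a, b), c);
}

// Small usage check.
inline void demo() {
    assert(min3(3, 1, 2) == 1);              // uses min(int, int)
    assert(min3(3.5f, 1.5f, 2.5f) == 1.5f);  // uses min(float, float)
}

} // namespace sketch
```

With separately named functions (min and fmin), a template like this could not be written without some extra layer that maps each type to its function name.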

I’m sure there are other benefits, but those are the main ones. Now let’s go through the costs.

The first cost is the cost of name mangling. In C, since there can only be one function for a given name, the symbol assigned to that function is just the name of the function. In C++, two functions can have the same name, but they need to have different symbols, so C++ compilers must mangle the names to include the parameter type information.

Using g++, the two C++ min functions above mangle to __Z3minii and __Z3minff. Those are quite simple examples; things get nastier when more intricate types are involved. For example, std::sort on a std::deque<std::pair<float, float>> mangles to __ZSt4sortISt15_Deque_iteratorISt4pairIffERS2_PS2_EEvT_S6_.

Why does this matter? It matters if you have to look at call stacks or object dumps, or if you care about the size of your debug files. Many tools handle the common C++ name mangling schemes, but this is extra work for the tool developers. It also doesn’t help that mangling schemes differ between compilers (Visual C++, for example, uses a different scheme from g++), which is one reason you can rarely link code generated by two different C++ compilers.

Without function overloading, there would be no need to mangle the parameter types into the symbol name.
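To get a feel for the work a tool has to do to make these symbols readable, here is a small sketch using abi::__cxa_demangle from <cxxabi.h>, which GCC and Clang provide, so it assumes one of those toolchains. The namespace sketch and the helper name demangle are my own. Note that the function expects names with a single leading underscore; I am assuming the extra underscore in __Z3minii comes from the platform's symbol naming convention (as on OS X) rather than from the mangling scheme itself.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

namespace sketch {

// Hypothetical helper: turn an Itanium ABI mangled name back into a
// readable signature. Returns the input unchanged if demangling fails.
inline std::string demangle(const char* mangled) {
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);  // the buffer is allocated with malloc
    return result;
}

// Small usage check with the two min symbols.
inline void demo() {
    assert(demangle("_Z3minii") == "min(int, int)");
    assert(demangle("_Z3minff") == "min(float, float)");
}

} // namespace sketch
```

Every debugger, profiler and crash reporter that wants to show readable C++ names needs something equivalent to this, for each mangling scheme it supports.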

Another cost of function overloading is the quality of compile-time errors. If I try to call min(1, 2.0f) in C++, I will get an error because the compiler cannot figure out whether I want the int version or the float version. The only way to express my intent is to cast one of the arguments, e.g. min(1, (int)2.0f). Without function overloading, if I want to call the int version, I call min, and if I want to call the float version, I call fmin.
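Here is the ambiguity as a minimal sketch, using the two min overloads from earlier wrapped in a namespace called sketch (my own name, there to keep them apart from anything else called min). The ambiguous call is left commented out so that the rest compiles.

```cpp
#include <cassert>

namespace sketch {

int min(int x, int y) { return x < y ? x : y; }
float min(float x, float y) { return x < y ? x : y; }

inline void demo() {
    // min(1, 2.0f);  // does not compile: each overload needs exactly one
    //                // conversion, so neither is a better match

    // Converting an argument by hand is how an overload gets selected.
    int as_int = min(1, static_cast<int>(2.0f));        // min(int, int)
    float as_float = min(static_cast<float>(1), 2.0f);  // min(float, float)

    assert(as_int == 1);
    assert(as_float == 1.0f);
}

} // namespace sketch
```

The cast changes the argument in order to choose the function, which is a roundabout way of saying which function you meant.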

Having no direct way to specify what overload I want to use has other consequences. Suppose I want to get the address of one of the overloads because I want to use it as a parameter to a higher-order function.

int a[] = {3, 2, 5, 1, 4};
int amin = std::accumulate(a, a + 5, a[0], &min); // get the min

// ERROR!
// error: no matching function for call to 'accumulate(int [5], int*,
//     int, <unresolved overloaded function type>)'

How do you do it? &min could refer to either the int or float version.

Do you know how to do it?

You have to cast the function pointer to the resolved overload function pointer type, i.e. (int (*)(int, int))&min gives you a pointer to the int version. This would be easier if the functions were named separately.
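Spelled out in full, that might look like the sketch below. It uses static_cast instead of the C-style cast, which is my substitution; the namespace sketch and the helper smallest are hypothetical names.

```cpp
#include <cassert>
#include <numeric>

namespace sketch {

int min(int x, int y) { return x < y ? x : y; }
float min(float x, float y) { return x < y ? x : y; }

// Hypothetical helper: smallest element of a non-empty int range.
inline int smallest(const int* first, const int* last) {
    // The target type of the cast tells the compiler which overload to take.
    return std::accumulate(first, last, *first,
                           static_cast<int (*)(int, int)>(&min));
}

inline void demo() {
    int a[] = {3, 2, 5, 1, 4};
    assert(smallest(a, a + 5) == 1);

    // Initialising a typed function pointer resolves the overload the same way.
    float (*float_min)(float, float) = &min;
    assert(float_min(1.0f, 2.0f) == 1.0f);
}

} // namespace sketch
```

Either way, the full function pointer type has to be written out just to name a function that already exists.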

In the above scenario, things would be easier if min were a template as you could just refer to them as &min<int> and &min<float>. Just be careful when you mix overloads and specialisation though, as they don’t play nicely with each other.
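As a minimal sketch of the template version, assuming min is a single function template with no competing overloads (the namespace sketch is my own name, used to keep it apart from std::min):

```cpp
#include <cassert>
#include <numeric>

namespace sketch {

template <typename T>
T min(T x, T y) { return x < y ? x : y; }

inline void demo() {
    int a[] = {3, 2, 5, 1, 4};
    // min<int> names exactly one function, so no cast is needed.
    int amin = std::accumulate(a, a + 5, a[0], &min<int>);
    assert(amin == 1);

    float b[] = {2.5f, 0.5f, 1.5f};
    float bmin = std::accumulate(b, b + 3, b[0], &min<float>);
    assert(bmin == 0.5f);
}

} // namespace sketch
```

The explicit template argument plays the role that the distinct name plays in C: it says which function you mean at the point of use.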

While we’re talking about things that don’t play nicely with each other, have you ever tried to override a particular function overload in a derived class? That’s not simple either.
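One way this bites, assuming the problem in question is name hiding: overriding a single overload in a derived class hides the other overloads of that name unless you bring them back with a using-declaration. All the names in this sketch (sketch, Base, Hiding, Restoring, describe) are hypothetical.

```cpp
#include <cassert>
#include <string>

namespace sketch {

struct Base {
    virtual ~Base() = default;
    virtual std::string describe(int) { return "Base int"; }
    virtual std::string describe(float) { return "Base float"; }
};

// Overrides only the int overload. Declaring describe here hides
// Base::describe(float) for calls made through a Hiding object.
struct Hiding : Base {
    std::string describe(int) override { return "Derived int"; }
};

// Same override, but the using-declaration makes the float overload
// visible again.
struct Restoring : Base {
    using Base::describe;
    std::string describe(int) override { return "Derived int"; }
};

inline void demo() {
    Hiding h;
    Restoring r;
    assert(h.describe(1.5f) == "Derived int");  // float converted to int
    assert(r.describe(1.5f) == "Base float");

    Base& as_base = h;
    assert(as_base.describe(1.5f) == "Base float");  // lookup starts in Base
}

} // namespace sketch
```

The same object gives a different answer depending on the static type you call it through, and a compiler may or may not warn about it.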

So, it would seem there are at least a few costs to overloading. I’m not sure if the costs are so great as to warrant its removal from C++, but it’s worth thinking about these things every once in a while.