Commentary: User-Defined Types

Steven Zeil

Last modified: Aug 31, 2017

Contents:

1 typedef

2 enum

3 namespaces

3.1 Name spaces and namespaces

3.2 Shortening the names

This lesson is devoted to the simpler ways in which programmers can define their own type names.

Programmers do, in fact, frequently introduce new data types in their programs. Usually that is done by declaring new structured types (“structs” or “classes”), which will be discussed in a later section.

In this section, we are looking at some more basic mechanisms that may not be used as often, but are still important. And we also look at the idea of “name spaces”, an attempt to manage the problems that can arise when programmers start adding hundreds (or more) of new identifiers to their programs.

1 typedef

Many times we describe types by describing how they are constructed, e.g., an array of integers, a reference to a string, etc. When written in an actual programming language, those kinds of descriptions are called type expressions.

This is an analogy to the more common expressions that we might write like “x+y”, but instead of using things like “+” to operate on values to get a new value, we are now thinking of phrases like “array of” as operators that are applied to data types to create new data types.

A typedef assigns a name to a type expression.

Sometimes we do this to better indicate the “role” that our data is playing. For example, look at the number “1.0”. Is that a measure of weight, or a measure of speed? Obviously, it could be either, or something else entirely. We can’t tell without some context.

A typedef can supply that context:

typedef double Speed;
typedef double Weight;
⋮
Speed train1, train2;
Weight load1, load2;

means the exact same thing as

double train1, train2, load1, load2;

but the first can be more expressive.
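It may help to see what “the exact same thing” implies in practice: a typedef adds a new name, not a new type, so the compiler will not object when a Weight is used where a Speed is expected. The following is only a sketch; the functions faster and typedefDemo are made-up names for illustration.

```cpp
#include <cassert>
#include <type_traits>

typedef double Speed;
typedef double Weight;

// A typedef adds a name, not a type: both names mean double.
static_assert(std::is_same<Speed, Weight>::value,
              "Speed and Weight are the same type");

// Hypothetical helper that expects two Speed values.
Speed faster(Speed a, Speed b)
{
    return (a > b) ? a : b;
}

// Hypothetical demonstration; call it from main to run the checks.
void typedefDemo()
{
    Speed train1 = 80.0;
    Weight load1 = 120.0;

    // The compiler accepts a Weight where a Speed is expected,
    // because both are simply double.
    Speed odd = faster(train1, load1);
    assert(odd == 120.0);
}
```

So the names document our intent for human readers, but they do not make the compiler catch a speed that is accidentally mixed with a weight.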

Typedefs are sometimes used to provide abbreviations for complicated type expressions that we would not want to type (or read!) over and over. As a beginning C++ programmer, you are still far away from needing to deal with type expressions like std::map<std::string,int>::value_type, but when the time comes, you will be very happy to write

typedef std::map<std::string,int>::value_type CountedWord;
   ⋮
CountedWord wc1 = ...;
   ⋮
CountedWord wc2 = ...;
   ⋮
vocabulary.insert(CountedWord("the",24));

rather than

std::map<std::string,int>::value_type wc1 = ...;
   ⋮
std::map<std::string,int>::value_type wc2 = ...;
   ⋮
vocabulary.insert(std::map<std::string,int>::value_type("the",24));
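For readers who would like to see the abbreviated form in a complete, compilable setting, here is a small sketch. The Vocabulary typedef and the functions makeVocabulary and countOf are assumptions added for illustration; only CountedWord and the inserted pair ("the", 24) come from the listing above.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<std::string, int> Vocabulary;   // hypothetical name
typedef Vocabulary::value_type CountedWord;      // a (word, count) pair

// Build a tiny vocabulary using the short name.
Vocabulary makeVocabulary()
{
    Vocabulary vocabulary;
    vocabulary.insert(CountedWord("the", 24));
    vocabulary.insert(CountedWord("typedef", 3));
    // Both inserts succeed because the two words differ.
    assert(vocabulary.size() == 2);
    return vocabulary;
}

// Look up a word's count, returning 0 if the word is absent.
int countOf(const Vocabulary& vocabulary, const std::string& word)
{
    Vocabulary::const_iterator pos = vocabulary.find(word);
    return (pos == vocabulary.end()) ? 0 : pos->second;
}
```

Note that one typedef can build on another: CountedWord is defined in terms of Vocabulary, which keeps both declarations short.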

2 enum

enum types are common to many programming languages. They are a bit frustrating in C++, however, because of some important limitations:

There’s no easy way to print an enum value. Among other things, this makes debugging code that uses them rather awkward.
There’s no easy way to read an enum value.

They can, however, be used as a convenient way of grouping related named constants, as array indices, and as a basis for loops.
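The uses just listed can be sketched in a few lines. The Color enum and the function names below are hypothetical; the table of names is one common workaround for the printing limitation, not the only one.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum: red, green, blue are 0, 1, 2, so numColors is 3.
enum Color { red, green, blue, numColors };

// Workaround for the lack of built-in printing: a table of names,
// using the enum values as array indices.
const char* const colorNames[numColors] = { "red", "green", "blue" };

std::string nameOf(Color c)
{
    assert(c >= red && c < numColors);   // guard against a bad index
    return colorNames[c];
}

// Use the enum as the basis for a loop that visits every color.
std::string allColorNames()
{
    std::string result;
    for (int i = red; i < numColors; ++i) {
        if (i > red) result += ", ";
        result += nameOf(static_cast<Color>(i));
    }
    return result;
}
```

The trailing numColors entry is a convention, assumed here, that gives both the array size and the loop bound without writing the number 3 anywhere.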

3 namespaces

3.1 Name spaces and namespaces

Whenever we are writing code, there is a certain set of names that we have access to. That set changes based on where in the code we are, and based upon what we have already written.

For example, when I write

⋮  ➀
int x = 2;
⋮  ➁

we can make use of the name “x” in the region marked ➁, but not in the region marked ➀. Or, on the other hand, if we wanted to introduce a different variable/type/function named “x”, we probably can’t do that in either place.

Scopes affect this set of available names as well. If I have written

int foo(int i)
{
  int x = 2*i;
  return x;
}
  ⋮
int x = foo(22);

there’s no conflict between my two declarations of “x” because the scope of the first one is limited to its surrounding { }.

Intuitively we think of all of the already-declared names of things as floating freely in “name space”. A “name space conflict” or “name space clash” occurs when we want to declare something new and give it an appropriate name, only to discover that our chosen name is already in our name space, that something declared elsewhere already is using that name.

This can be a real problem because the number of predefined things can be quite large, if not because of our own code then just because of the C++ standard library. And if we (or the company that we program for) have obtained and installed some extra libraries, those could be adding still more names to our name space. It can become a real challenge to know what is taken and what is not. (And, as libraries are updated, they may change the sets of names they use.)

Example 1: Naming Functions

Suppose that you were writing a bunch of functions that you have tentatively named “min”, “max”, “sum”, and “distance”. How many of those are already in the C++ standard library?

Answer

min, max, and distance are taken, but sum is not.

But if you didn’t know that, don’t feel bad. It’s hard to keep track of all the simple or natural names that we might want to use in our own code but that are already in the standard library.

This is a common problem in most programming languages.

In the early days of C++, I worked with two different libraries, written by different people, both of which declared a new data type named “string”. That made it all but impossible to actually use both libraries in the same program. Later, when the C++ standard library added its own string type, both libraries had to rename theirs.

The namespace (without the blank) is a C++ construct to help manage your name space by allowing you to divide the “space” up into different, named, regions. The best-known of these is std, the C++ standard library.

Example 2: Naming Functions

Suppose that you were writing a bunch of functions that you have tentatively named “min”, “max”, and “distance”. Some spoilsport has already told you that those names are taken in the C++ standard library. Can you do it anyway?

Yes, you can. Those names are already in the std namespace. But if you write
int min (int x, int y);
it goes into the global namespace, the default one that lies outside of any named namespace. Your function can be accessed as min or even as ::min, and the standard function as std::min.
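A short sketch may make the three spellings concrete. The counter callsToOurMin and the function minDemo are assumptions, added only so that we can tell which version of min was actually called.

```cpp
#include <algorithm>
#include <cassert>

int callsToOurMin = 0;   // hypothetical counter, for illustration only

// Our own min, declared in the global (default) namespace.
int min(int x, int y)
{
    ++callsToOurMin;
    return (x < y) ? x : y;
}

// Hypothetical demonstration; call it from main to run the checks.
void minDemo()
{
    int before = callsToOurMin;

    int a = min(3, 7);        // our function
    int b = ::min(8, 2);      // our function again, named explicitly
    int c = std::min(5, 4);   // the standard library's function

    assert(a == 3 && b == 2 && c == 4);
    assert(callsToOurMin == before + 2);   // std::min did not count
}
```

This works only because there is no “using namespace std;” in effect. With that statement present, the unqualified call min(3, 7) would have two equally good candidates and the compiler would typically report it as ambiguous.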

Example 3: Naming Functions

Suppose that you are going to use a typedef to name a “Speed” type, but someone else in the company has already used that name – but you really think it’s the only reasonable name for your data type. Can you do it anyway?

Yes, just put it in a namespace of your own:
namespace myCoolNamespace {
     typedef ... Speed;
}
and you can refer to your data type as “myCoolNamespace::Speed”.
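To see two types with the same short name living side by side, consider the sketch below. The namespaces trains and network, the underlying types, the units, and the functions are all made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

namespace trains {
    typedef double Speed;     // assumed unit: kilometers per hour
}

namespace network {
    typedef long Speed;       // assumed unit: bits per second
}

// Same short name, but two distinct types with distinct full names.
static_assert(!std::is_same<trains::Speed, network::Speed>::value,
              "the two Speed names refer to different types");

trains::Speed averageSpeed(trains::Speed a, trains::Speed b)
{
    return (a + b) / 2.0;
}

network::Speed combinedSpeed(network::Speed a, network::Speed b)
{
    return a + b;
}

// Hypothetical demonstration; call it from main to run the checks.
void namespaceDemo()
{
    trains::Speed express = 120.0;
    trains::Speed local = 60.0;
    network::Speed uplink = 1000000L;
    network::Speed downlink = 8000000L;

    assert(averageSpeed(express, local) == 90.0);
    assert(combinedSpeed(uplink, downlink) == 9000000L);
}
```

Neither declaration of Speed interferes with the other, because each is always referred to by its full name.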

If you are a Java programmer

C++ namespaces are largely the same thing as Java packages.

However, Java ties packages to the actual directories/folders containing the source code. C++ has no such ties. In fact, it is possible to declare things in multiple namespaces within the same file:

namespace formal {
    typedef double Velocity;
}
namespace informal {
    typedef formal::Velocity Speed;
}

and that file would not need to be in a directory named “formal” nor one named “informal”.

3.2 Shortening the names

Often we would prefer not to type out the entire long name of an identifier. The using statement allows us to “import” certain names in their abbreviated form into our current name space.

There are two forms of using statement. (Formally, the first is called a using declaration and the second a using directive.)

First, if we only want to use the shortened form of a few specific identifiers, we can import them with:

using full-name-of-identifier ;

For example, if we have previously done

namespace myCoolNamespace {
     typedef ... Speed;
}

and then later write

using myCoolNamespace::Speed;

then, afterwards, we can refer to that data type as simply “Speed”.

If we want to use the short form of a large number of identifiers from a common namespace, then we can import all of the names in that namespace with the statement

using namespace namespace-name ;

You’ve probably already seen this used many times as “using namespace std;”. For example, we might write the “Hello world” program as

#include <iostream>
#include <string>

const std::string greeting = "Hello, world!";

int main()
{
    std::cout << greeting << std::endl;
}

or, after importing all of the names from std, as

#include <iostream>
#include <string>

using namespace std;

const string greeting = "Hello, world!";

int main()
{
    cout << greeting << endl;
}

The effects of the using statements are limited to the scope in which they appear. In particular, if a using appears within { }, its effect ends at the closing }. So it’s possible to permit short names within a particular function body, for example, without allowing them through the whole program.
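The scope rule can be illustrated with a sketch like the following. The namespace greetings and all of the function names are hypothetical.

```cpp
#include <cassert>
#include <string>

namespace greetings {    // hypothetical namespace
    const std::string hello = "Hello";

    std::string greet(const std::string& name)
    {
        return hello + ", " + name + "!";
    }
}

std::string shortNames()
{
    using namespace greetings;   // in effect only until the closing }
    return greet("world");       // the short name is allowed here
}

std::string fullNames()
{
    // The using in shortNames() ended at that function's closing },
    // so a plain greet("world") would not compile here.
    return greetings::greet("world");
}

// Hypothetical demonstration; call it from main to run the check.
void usingDemo()
{
    assert(shortNames() == fullNames());
}
```

Keeping the using inside one function body means that the rest of the file is unaffected if greetings later gains a name that clashes with one of our own.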

If you are a Java programmer

Java programmers who come to C++ often make the mistake of assuming that C++ #include statements are similar to Java import statements. After all, they both appear in similar places (at the very start of a file of source code) and “include” and “import” would seem to be synonyms.

But that’s actually not true at all. C++ using statements are actually direct analogues of Java import statements.

For example, the C++ statement

using std::list;

does exactly the same thing as the Java

import java.util.List;

In both cases, we are simply saying that the following code will be able to use the short name “list” (or “List” in Java) instead of the full name “std::list” in C++ or “java.util.List” in Java.

And the C++ statement

using namespace myOwnLibrary;

does exactly the same thing as the Java statement

import myOwnLibrary.*;

You might wonder, then, what is the Java equivalent to a C++ #include? The answer is that Java doesn’t have, and doesn’t need, an equivalent to #include. The purpose of #include is to tell the C++ compiler to load a certain file of source code so that we can use the identifiers declared within it. C++ needs this sort of explicit instruction because there are no rules in C++ linking the content of a file of C++ code to the file name and directories in which it is stored.

But Java “knows” that you are going to place your code in directories that match your package names, and it knows that your file names will match the name of the only public class in that file. So when you write code like

myOwnLibrary.MyType mt;

Java “knows” where to look for the code declaring MyType and simply goes out and reads it. C++, on the other hand, would have no clue where to look for a declaration of MyType in similar circumstances. So C++ programmers have to use an #include statement to fetch the appropriate file. It’s the price they pay for the freedom of naming their files whatever they want and arranging their code directories however they like.