Commentary: the std::string Class

Chris Wild and Steven Zeil

Last modified: Aug 31, 2017

One of the most commonly used data types in any programming language is the character string.

In C++, string is NOT a built-in data type, but is part of the standard library.

1 Strings

The std::string type supports:

If you are a C programmer.
If you are a Java programmer.

2 Converting Other Data Types to/from String

2.1 Converting can be a challenge

Although a very common operation, converting to/from strings is not trivial in C++.

If you are a Java programmer

But the most common approach in C++ is to use I/O operations to read and write from “string streams”. Remember that part of the C++ model is that we can read (>>) and write (<<) from streams, but different kinds of streams may connect to different kinds of devices. One such “device” can be a variable holding a string, e.g.:

#include <iostream>
#include <sstream>
#include <string>

using namespace std;

⋮
string dataIn = "42 3.14159";

istringstream in (dataIn); // in actually reads from the string dataIn
int i;
double d;
in >> i >> d; // i will be 42, and d will be 3.14159
// No close() is needed: string streams have no close() function

or, when converting to a string:

#include <iostream>
#include <sstream>
#include <string>

using namespace std;

⋮
string dataOut;

int i = 42;
double d = 3.14159;
ostringstream out; // out actually writes to a string
out << i << ' ' << d;
dataOut = out.str(); // Get the string that has been written into

2.2 String Literals are not Strings

Sometimes programming languages (like some people) do not show their age gracefully.

C++ is the direct descendant of the C programming language. In fact, the very name “C++” is considered to be a bit of a joke: the ++ operator takes us “one step beyond” the value it is applied to. So C++ is just “one step beyond” C.

C dates back to the early 1970s. One of the goals of C++ was to maintain as much backwards compatibility as possible with C: most old code written in C should still compile and run using a C++ compiler.

One of the places where this shows up is in string handling. C did not have a data type named “string”. Instead, C used arrays of characters. When used to store character strings, the convention was to indicate the end of a string by inserting a final character containing the ASCII character code 0, known as NUL. So strings in C are often described as null-terminated arrays. (“NUL” and “null” aren’t actually identical, but tradition has conflated them.)

The one place where you may notice this is that string literals (the string “constants” we write in our code) do not have data type std::string. They actually are considered to be of type “const char*”. In the next module we will see that this means “an array of characters in which we are prohibited from changing the individual characters”.

const char* myName = "Steven Zeil";

You might wonder where the NUL terminator is in the above value. The answer is that you can’t see it. NUL was designed to be an invisible, non-printing value in the ASCII character set. So the array containing "Steven Zeil" will actually have 12 characters, even though you only see 11. The final character is a NUL, automatically inserted by the compiler.

The data type “const char*”, together with the convention of null termination, is generally referred to as “character arrays” or C strings.

Converting from C strings to std::string is easy:

string myNameAsAString = myName;
string myNameAsAString2 = "Steven Zeil";

These lines rely on the fact that the C++ string type is declared in a way that automatically converts const char* values to string values.

Every now and then, however, you run across a bit of code (possibly old code) that takes parameters of type const char*, and you are forced to remember that string literals are not the same thing as strings:

void oldCode (const char* fileName);
  ⋮
oldCode("foo.txt"); // OK
string aFileName = "bar.txt";
oldCode (aFileName); // compilation error

That last statement gets a compilation error because aFileName has type string, but oldCode is expecting a character array. The first call to oldCode worked because the string literal “foo.txt” really is a character array.

For those odd occasions when you really need a character array/C string, the std::string class provides a function to do the conversion:

void oldCode (const char* fileName);
  ⋮
oldCode("foo.txt"); // OK
string aFileName = "bar.txt";
oldCode (aFileName.c_str()); // also OK

3 String I/O

There are three approaches used for reading strings.

3.1 Read characters until a whitespace character is found

3.2 Read until the end of line.

Here is a program for reading names:

stringRead.cpp
#include <string>
#include <iostream>

using namespace std;

int main()
{
    string firstName, lastName, fullName;
    string greeting("Hello ");


    cout << "What is your name? ";
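    getline (cin,  fullName);
    if (fullName.size() > 0 && fullName[0] == ' ')
       { // Trim leading blanks from fullName
         string::size_type charPosition = fullName.find_first_not_of (" ");
         // npos here means the line was all blanks, so nothing remains
         fullName = (charPosition == string::npos) ? string() : fullName.substr(charPosition);
       }

    // Split fullName into parts
    // (positions are string::size_type, the type that string::npos has)
    string::size_type blankPosition = fullName.find(' ');
    if (blankPosition != string::npos)
      {
        firstName = fullName.substr (0, blankPosition);
        string::size_type charPosition = fullName.find_first_not_of (" ", blankPosition);
        if (charPosition != string::npos) // only blanks follow the first name?
            lastName = fullName.substr (charPosition);
      }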
    else
        lastName = fullName;

    greeting.append(lastName);
    greeting.append(", " + firstName);
    string banner(greeting.length() + 4,'$'); // construct string with bunch of '$'s
    cout << banner << endl;
    cout << "$ " << greeting + " $" << endl;
    cout << banner << endl << endl;
    if(firstName < lastName)
        cout << "your first name is alphabetically before your last\n";
    else
        cout << "your first name is alphabetically after your last\n";
    return 0;
}

3.3 Read until a special character is found.

stringRead2.cpp
#include <string>
#include <iostream>

using namespace std;

int main()
{
    string firstName, lastName, fullName;
    string greeting("Hello ");


    cout << "What is your name (first last separated by a space)? ";
    getline (cin, firstName, ' '); // read up to (and discard) the first blank
    getline (cin, lastName);       // read the rest of the line
    greeting.append(lastName);
    greeting.append(", " + firstName);
    string banner(greeting.length() + 4,'$'); // construct string with bunch of '$'s
    cout << banner << endl;
    cout << "$ " << greeting + " $" << endl;
    cout << banner << endl << endl;
    if(firstName < lastName)
        cout << "your first name is alphabetically before your last\n";
    else
        cout << "your first name is alphabetically after your last\n";
    return 0;
}

So which method of inputting strings is the best to use?

Remember always that >> skips over leading whitespace and stops before (but does not consume) trailing whitespace. getline preserves leading whitespace and consumes (discards) its stopping character.