Strings in C++



String Definitions

A string is a sequence of characters like "this is a string".
The length of a string can vary from the empty string to a string containing all the characters in the novel War and Peace.
A constant string is called a string literal and is usually shown between double quotes (like the string above).
The empty string is shown by two double quotes together, like this: ""

Strings, while very useful, are not a primitive data type in C++ (the primitive types include int, float, char and pointers).
Strings are implemented in C++ using an array of characters.
Since strings are a sequence of characters and since arrays can be of any size, it is natural to store strings in character arrays.

However, arrays in C++ have a number of shortcomings and dangers which affect strings as well.

C++ uses a clever convention to mark where a string ends: a null character (written as '\0'), which has the numeric value 0, is stored after the last character of the string.
This is clever since the value 0 is also treated as false, and so one can write efficient loops for traversing a string.
In this way any string up to the maximum storage available in the character array (less one element for the null character) can be stored in that array, and its length can be determined by counting the number of characters before the null character. An array of characters which stores a sequence of characters ending in the null character is usually referred to as a string in C++.

Because of the null character at the end of a string, the space needed for a string is always one greater than the length of the string.
Thus the literal string "hi" takes 3 characters.
Even the empty string ("") takes one character.


String Input/Output

OUTPUT: It is easy to output a string literal.

	cout << "This is a literal string";

This works because a string literal is an array of const char with the null character at the end; when it is given to cout it is treated as a pointer to its first character, and output continues until the null character is reached.

In fact any character array can be output this way, provided it has the null character marking the end of the string.

	char greeting[3];
	greeting[0] = 'H'; greeting[1] = 'i'; greeting[2] = '\0';
	cout << greeting;

INPUT: This is harder since the input contains no null character to mark the end of the string, so something else must decide where the string stops, and the input may be longer than the array.
There are three approaches used for reading strings.

  1. Read characters until a whitespace character is found (whitespace characters include blank, tab and newline).
    Initial whitespace characters are ignored.
    Use the regular istream extraction operator ('>>') for this purpose.
    One problem with this approach is that you cannot read a string with blank characters in it.
    If the string in the input stream is bigger than the array, memory beyond the end of the array is overwritten.
    You must assume that the input, plus the null character, always fits in the array.
  2. Read until the end of line.
    Use the function getline or get for this purpose. Both put all characters of the line into the string, including blanks and tabs (but not the newline character).
    Both also take a parameter for the size of the array. This protects the program from input which is longer than the array size.
  3. Read until a special character is found. The special character is passed as the optional third parameter of the get function (getline accepts one as well).
    This method allows you to read any characters, including the newline, except the special character, and thus can be used to read multi-line input strings.

Examples:

   char someString[11]; // holds up to 10 characters plus the null character
   cin >> someString; // reads the next run of non-whitespace characters into someString
                      // inserts null character at end
                      // assumes that this will be 10 or fewer characters
   cin.getline(someString,11); // reads at most 10 characters or until end of line
        // inserts null character at end
        // if the line contains 10 or fewer characters, the newline character is removed from the input
        // if the line contains more than 10 characters, the extra characters and the newline
        // are kept in the input and the stream is put into a failed state (reset it with cin.clear()).
   cin.get(someString,11);  // reads at most 10 characters or until end of line
        // inserts null character at end
        // Unlike getline above, the newline character is never removed.
   cin.get(someString,11,':'); // reads at most 10 characters or until the character ':' is found
        // the ':' itself is left in the input

So which is the best to use?

// to handle getting one line of input, even if it is too long
// ignoring the extra characters (if any)
// (note that get fails if the line is empty, since no characters are stored)
   cin.get(someString, 11);
   cin.ignore(200,'\n'); // ignore up to 200 characters, stopping early once a newline has been removed
              // assumes that there will be fewer than 200 extra characters on this line.
// Using getline and ignoring extra characters at end of line
// If the line contains 10 or fewer characters (the size of the array - 1),
// getline extracts the newline and nothing more needs to be done.
// If the line is longer, the extra characters and the newline are NOT extracted
// and the stream is left in a failed state, so you need to clear that state
// and then ignore the rest of the line.
   cin.getline(someString,11);
   if(cin.fail() && !cin.eof()) { // line did not fit: newline not consumed
      cin.clear();                // reset the failed state so cin can be used again
      cin.ignore(200,'\n');
   }

string library (string.h)

string.h (available in C++ as <cstring>) contains a number of useful functions for dealing with null-terminated strings.
A brief description of some of the functions declared in "string.h" is given below along with their function prototypes.
String parameters are character arrays. But since an array is passed as a pointer to its first element, these will be shown as pointers to characters instead.
Many of these functions also return a pointer which is equal to one of the parameter pointers.
Also note that most of these functions have two versions, one which assumes that any receiving string is big enough to hold the result and one which has a length parameter which prevents overflowing the array.


char* strcpy(char* To,const char* From);
// Copies string "From" to string "To", also returns a pointer to string "To"

char* strncpy(char* To, const char* From, size_t size);
// Copies string "From" to string "To" but no more than size characters
// if string "From" has size or more characters, then no null character is added to "To"
// otherwise the rest of "To", up to size characters, is filled with null characters

 

int strcmp(const char* string1, const char* string2);
// alphabetically compares string1 and string2 (character by character, using character codes)
// returns 0 if string1 is the same as string2
// returns a negative number if string1 is alphabetically before string2
// returns a positive number if string1 is alphabetically after string2.

 


string library (string.h): strcpy

 

char* strcpy(char* To,const char* From);
// Copies string "From" to string "To", also returns a pointer to string "To"

Example.

   char string1[10] = "hi there";
   char string2[10];
   strcpy(string2, string1);
   // string2 now contains a copy of "hi there"

 

 


string library (string.h): strcmp

int strcmp(const char* string1, const char* string2);
// alphabetically compares string1 and string2 (character by character, using character codes)
// returns 0 if string1 is the same as string2
// returns a negative number if string1 is alphabetically before string2
// returns a positive number if string1 is alphabetically after string2.

Example:

   char string1[10] = "justice";
   char string2[10] = "peace";
   if(strcmp(string1, string2)< 0)
      cout << string1 << " is alphabetically before " << string2 << endl;
   else if(strcmp(string1, string2) > 0)
      cout << string1 << " is alphabetically after " << string2 << endl;
   else if(strcmp(string1, string2) == 0)
      cout << string1 << " is equal to " << string2 << endl;