2. Basic Data Types

Not to oversimplify, but there are four basic data types in C++: booleans, integers, floats and strings. These four data types will typically have different names in an actual program:

A boolean is a relatively new intrinsic data type — it is preferred over various other substitutes that were present in earlier versions and languages. It has the identifier "bool."
An integer might be called a "short," an "int," or a "long," and this list doesn't begin to cover the numerous minor variations.
A float is as likely to be called a "double" in a modern program, but its basic behavior remains the same.
A string is not a number like the other data types; it is a sequence of characters.

Each of these data types has a special area of applicability, and may sometimes be converted into one of the other types. Here is a short exposition of the numeric data types:

Booleans are used to indicate one of two states — true or false:
```
bool equal = (x == y); // true if x equals y
```
Booleans can only contain one of these two states, not a numeric value.
Integers can contain a "whole number," a number that does not have a fractional part:
```
int x = 1234;
```
If you attempt to assign a number with a fractional part to an integer, the fractional part will be truncated:
```
int x = 2.0/3.0;
cout << x << endl;
```
This code snippet will print "0," not ".666666," as you might expect.
Integers typically can contain a fixed range of signed values. The maximum value of these signed numbers may vary from system to system and compiler to compiler, so it is a very good idea to know what that range is. Typically, the C++ header file "climits" (the C++ form of the older C header "limits.h") will contain the numeric limits appropriate to your compiler. An experienced programmer uses the values in that header to ensure the numeric limits are not exceeded.
Floats can contain a "floating-point" (hereafter FP) number, a number that can contain a fractional part, and a number that may have an exponent. Typically a float is declared using the keyword "double," a comparatively large float that is most commonly used in modern programs. The old keyword "float" is only used where storage is limited, as for example when a very large array is declared and the numeric and resolution limits are small and well-established.
Here are examples of FP numbers:
```
double a = 1e6; // one million, or 1 followed by 6 zeros
double b = 1e-6; // 1/one million, or .000001
```
The programming shorthand "1e6" means "one with an exponent of 6," or, more accurately, "one multiplied by ten raised to the 6th power."
Not all the conventions of numeric notation are carried over into C++:
```
double c = 1,000,000; // creates a syntax error: don't use commas.
```

From a programming standpoint, your program will typically run faster if you use integers where there is no need for FP representation, and where the values of your numbers fall within a well-defined range.

Floating-point Cautions

As typically implemented in a C++ compiler, FP numbers have some behaviors that you need to be aware of and guard against. Floats are normally (but are not required to be) stored in binary form, but are displayed in decimal form. This dual identity causes some problems. Here's an example. Let's say you are designing a loop, and you want to increment the loop by 1/10. Here's your code:

#include <iostream>

using namespace std;

int main()
{
	for(double i = 0;i != 10;i += .1) {
		cout << i << endl;
	}
	return 0;
}

This code looks very straightforward, absolutely simple, and yet (on most compilers) it will never terminate. Why?

To know why, you need to know how 1/10 is represented in binary. It looks like this:


	1/10₁₀ = .0001100110011001100110011 ... ₂

The ellipsis symbol "..." means "repeat forever." In essence, 1/10 represented in binary is an infinitely repeating fraction. FP C++ variables can represent a number with a fractional part, but, because of limited storage, they can only do this approximately. In this case, they round off the infinite repeating fraction, so every addition of .1 carries a small error. This means our variable never exactly equals 10 in the loop, and the boolean test "!= 10" is always true, forever.

There are many other examples of FP misbehaviors you need to guard against, too many to catalogue in a finite space, but in general, you should be aware of the limits of FP representation. To make FP variables do what you want, learn the required mathematics, understand the underlying representation of FP numbers, and always test your code in all the ways it will be used in your program.

And, where possible, avoid the use of FP variables as loop indices.

Strings

Strings differ from the numeric data types. A string is a sequence of characters:

	string a = "This is a string";

Some of the symbols used with numeric data types have a different meaning when used with strings:

	string a = "This is a string";
	string b = " and so is this.";
	string c = a + b;

String c is not, as you might think, the sum of string a and string b. It is the "concatenation" of a and b, meaning it contains string a with string b appended to it. String c contains the characters "This is a string and so is this."

Comparison operators can be used with strings, but they act somewhat differently: they compare strings character by character, based on their alphabetic ordering (strictly speaking, on the numeric codes of the characters, so uppercase and lowercase letters are not treated alike). The comparison test below will succeed because "Zebra" is "larger" than "Buffalo":

	string a = "Buffalo";
	string b = "Zebra";
	if(a < b) {
		// do something here
	}

Test some words with this example program:

#include <iostream>
#include <string>

using namespace std;

void display(string a, string b, string comp)
{
	cout << "The word \""
	<< a << "\" is "
	<< comp << " the word \""
	<< b << "\"" << endl;
}

int main(int argc, char **argv)
{
	if(argc < 3) {
		cout << "Please type two words after the program name." << endl;
	}
	else {
		string a = argv[1];
		string b = argv[2];
		if(a < b) {
			display(a,b,"less than");
		}
		else if(a > b) {
			display(a,b,"greater than");
		}
		else if(a == b) {
			display(a,b,"equal to");
		}
		
	}
	return 0;
}

Compile and run this program. When you run it, type two words after the program name. Run the program repeatedly to see how it compares different words.

Now please pay attention to something about this program — it contains unnecessary code. Carefully examine the listing above and see if there is a way to make the program a little simpler.

Give up? Well, the program runs three tests. We test "if(a < b)", then we test "if(a > b)", then we test "if(a == b)". Now think — if string a is neither less than nor greater than string b, it must be equal to string b. The third test serves no purpose. The third statement might as well be:

		else {
			display(a,b,"equal to");
		}

This leads to a programming axiom — there is no program that cannot be improved. And the easiest way to improve a program is to walk away from your workstation, come back, read the program again, and ask yourself what the purpose of each statement is.

Notice also that this program knows how to read the "command line," the optional commands you may type after a program name. In this case, the commands are words for the program to compare. But in other programs, they could be anything — file names, special information to control how your program operates, anything.
