Basic Data Types
Not to oversimplify, but there are four basic data types in C++: booleans,
integers, floats and strings. These four data types will typically have
different names in an actual program:
-
A boolean is a relatively new intrinsic data type. It is preferred over the
various substitutes that were used in earlier versions of the language and in
other languages. It has the identifier "bool."
-
An integer might be called a "short," an "int," or a "long," and this list
doesn't begin to cover the numerous minor variations.
-
A float is as likely to be called a "double" in a modern program, but its
basic behavior remains the same.
-
A string is not a number like the other data types; it is a sequence of
characters.
Each of these data types has a special area of applicability, and may sometimes
be converted into one of the other types. Here is a short exposition of the numeric data types:
-
Booleans
are used to indicate one of two states — true or false:
bool equal = (x == y); // true if x equals y
A boolean holds only one of these two states. C++ will convert it to the integer 0 or 1 where a number is required, but it is not meant for storing numeric values.
-
Integers
can contain a "whole number," a number that does not have a fractional part:
int x = 1234;
If you attempt to assign a number with a fractional part to an integer, the
fractional part will be truncated:
int x = 2.0/3.0;
cout << x << endl;
This code snippet will print "0," not ".666666," as you might expect.
Integers can contain a fixed range of signed values. The limits of that range
vary from system to system and compiler to compiler, so it is a very good idea
to know what they are. The C++ header "climits" (the C++ name for the C header
"limits.h") defines constants such as INT_MAX and INT_MIN appropriate to your
compiler, and the header "limits" offers the same information through
std::numeric_limits. An experienced programmer uses these values to ensure the
numeric limits are not exceeded.
-
Floats
can contain a "floating-point" (hereafter FP) number, a number that can contain
a fractional part and that may have an exponent. Typically a float is declared
using the keyword "double," a comparatively large float that is most commonly
used in modern programs. The older keyword "float" is mostly used where storage
is limited, as for example when a very large array is declared and the range
and precision required are small and well-established.
Here are examples of FP numbers:
double a = 1e6; // one million, or 1 followed by 6 zeros
double b = 1e-6; // 1/one million, or .000001
The programming shorthand "1e6" means "one with an exponent of 6," or, more
accurately, "one multiplied by ten raised to the 6th power."
Not all the conventions of numeric notation are carried over into C++:
double c = 1,000,000; // creates a syntax error — don't use commas.
From a programming standpoint, your program will typically run faster if you
use integers where there is no need for FP representation, and
where the values of your numbers fall within a well-defined range.
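To make the integer points above concrete, here is a minimal sketch of truncation and of a range check made before an addition. The function names are made up for this illustration, and the check uses std::numeric_limits from the "limits" header rather than the macros in "climits"; either approach should work.

```cpp
#include <limits>

// Converting a double to an int discards the fractional part (no rounding).
// Assumes the value fits in an int; out-of-range values are not handled here.
int truncate_to_int(double value)
{
    return static_cast<int>(value);
}

// Returns true if a + b would fall outside the range of int.
// The test is arranged so that it never overflows itself.
bool add_would_overflow(int a, int b)
{
    const int largest = std::numeric_limits<int>::max();
    const int smallest = std::numeric_limits<int>::min();
    if (b > 0) {
        return a > largest - b;
    }
    if (b < 0) {
        return a < smallest - b;
    }
    return false;
}
```

Calling truncate_to_int(2.0/3.0) gives 0, the same result as the assignment shown earlier, and add_would_overflow lets you test two values before adding them.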
Floating-point Cautions
As typically implemented in a C++ compiler, FP numbers have some
behaviors that you need to be aware of and guard against. Floats are normally
(but are not required to be) stored in binary form, but are displayed in
decimal form. This dual identity causes some problems.
Here's an example. Let's say you are designing a loop, and you want to
increment the loop by 1/10. Here's your code:
#include <iostream>
using namespace std;
int main()
{
    for(double i = 0;i != 10;i += .1) {
        cout << i << endl;
    }
    return 0;
}
This code looks very straightforward, absolutely simple, and yet (on most
compilers) it will never terminate. Why?
To know why, you need to know how 1/10 is represented in binary. It looks
like this:
1/10 (base 10) = .0001100110011001100110011 ... (base 2)
The ellipsis symbol "..." means "repeat forever."
In essence, 1/10 represented in binary is an infinitely repeating fraction.
FP C++ variables can represent a number with a fractional part, but, because
of limited storage, they can only do this approximately. In this case, they
round off the infinitely repeating fraction.
Each addition of the rounded value carries a small error, so our variable
passes close to 10 but never exactly equals it, and the boolean test "!= 10"
is always true, forever.
There are many other examples of FP misbehaviors you need to guard against, too
many to catalogue in a finite space, but in general, you should be aware of the
limits of FP representation. To make FP variables do what you want, learn the
required mathematics, understand the underlying representation of FP numbers,
and always test your code in all the ways it will be used in your program.
And, where possible, avoid the use of FP variables as loop indices.
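One way to follow that advice, sketched here with made-up function names, is to drive the loop with an integer counter and compute the FP value from it. When two FP values must be compared, a tolerance test is a common substitute for "==" or "!="; the right tolerance depends on your problem and is an assumption you have to supply.

```cpp
#include <cmath>
#include <vector>

// Produces 0.0, 0.1, ..., 9.9 using an integer loop index.
// The loop runs exactly 100 times because the test is on the integer.
std::vector<double> tenth_steps()
{
    std::vector<double> values;
    for (int i = 0; i < 100; ++i) {
        values.push_back(i / 10.0);  // computed fresh, so errors do not pile up
    }
    return values;
}

// Treats two FP values as equal when they differ by no more than tolerance.
bool nearly_equal(double a, double b, double tolerance)
{
    return std::fabs(a - b) <= tolerance;
}
```

Because the termination test is on an integer, the loop always ends, and each value is computed by a single division instead of by repeated addition.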
Strings
Strings differ from the numeric data types. A string is a sequence of characters:
string a = "This is a string";
Some of the symbols used with numeric data types have a different meaning when used with strings:
string a = "This is a string";
string b = " and so is this.";
string c = a + b;
String c is not, as you might think, the sum of string a and string b. It is the "concatenation" of a and b, meaning it contains string a with string b appended to it. String c contains the characters "This is a string and so is this."
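Concatenation only joins strings, so a number has to be converted to text before it can be appended. The sketch below assumes a C++11 or later compiler for std::to_string, and the helper name make_label is invented for this example.

```cpp
#include <string>

// Builds a label such as "items: 3" from a name and a count.
std::string make_label(const std::string& name, int count)
{
    std::string label = name;        // start with a copy of the name
    label += ": ";                   // += appends to the existing string
    label += std::to_string(count);  // convert the number to text first
    return label;
}
```

Nothing is inserted between the pieces automatically, which is why the separator ": " is appended explicitly.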
Comparison operators can be used with strings, but they act somewhat differently: they compare strings character by character using each character's numeric code, which gives alphabetic ordering for words written in the same letter case. This comparison test --
string a = "Buffalo";
string b = "Zebra";
if(a < b) {
    // do something here
}
— will succeed because "Zebra" is "larger" than "Buffalo".
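One thing to watch for: the comparison works on character codes, and in the common ASCII encoding every uppercase letter comes before every lowercase letter, so "Zebra" compares as less than "apple". If you want an ordering that ignores case, one possible approach is to compare lowercase copies. The helper names below are made up for this sketch, and it assumes plain single-byte text.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns a copy of text with every character converted to lowercase.
std::string to_lower_copy(std::string text)
{
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char ch) {
                       return static_cast<char>(std::tolower(ch));
                   });
    return text;
}

// Compares two strings as if both were written in lowercase.
bool less_ignoring_case(const std::string& a, const std::string& b)
{
    return to_lower_copy(a) < to_lower_copy(b);
}
```

With this helper, "apple" sorts before "Zebra", which is usually what a reader expects from alphabetic order.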
Test some words with this example program:
#include <iostream>
#include <string>
using namespace std;
void display(string a, string b, string comp)
{
    cout << "The word \""
         << a << "\" is "
         << comp << " the word \""
         << b << "\"" << endl;
}
int main(int argc, char **argv)
{
    if(argc < 3) {
        cout << "Please type two words after the program name." << endl;
    }
    else {
        string a = argv[1];
        string b = argv[2];
        if(a < b) {
            display(a,b,"less than");
        }
        else if(a > b) {
            display(a,b,"greater than");
        }
        else if(a == b) {
            display(a,b,"equal to");
        }
    }
    return 0;
}
Compile and run this program. When you run it, type two words after the program name. Run the program repeatedly to see how it compares different words.
Now please pay attention to something about this program — it contains unnecessary code. Carefully examine the listing above and see if there is a way to make the program a little simpler.
Give up? Well, the program runs three tests. We test "if(a < b)", then we test "if(a > b)", then we test "if(a == b)". Now think — if string a is neither less than nor greater than string b, it must be equal to string b. The third test serves no purpose. The third statement might as well be:
else {
    display(a,b,"equal to");
}
This leads to a programming axiom — there is no program that cannot be improved. And the easiest way to improve a program is to walk away from your workstation, come back, read the program again, and ask yourself what the purpose of each statement is.
Notice also that this program knows how to read the "command line," the optional arguments you may type after a program name. In this case, the arguments are words for the program to compare. But in other programs, they could be anything: file names, options that control how your program operates, anything.
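Returning to the three comparison tests, here is one more possible simplification. The string member function compare() returns a negative number, zero, or a positive number, so a single call can stand in for the separate "<" and ">" tests. The function name describe_order is invented for this sketch.

```cpp
#include <string>

// Returns the phrase describing how a is ordered relative to b.
std::string describe_order(const std::string& a, const std::string& b)
{
    const int result = a.compare(b);  // negative, zero, or positive
    if (result < 0) {
        return "less than";
    }
    if (result > 0) {
        return "greater than";
    }
    return "equal to";  // no third test is needed
}
```

In the example program, the whole chain of tests could then become one line: display(a, b, describe_order(a, b));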