Note: C++ is largely, but not strictly, a superset of C.
General
Nature: object oriented language; procedural language
C++ grew out of "C with Classes", which Bjarne Stroustrup began developing in
1979 at Bell Telephone Laboratories to extend C for object oriented
programming. The language was named C++ in 1983.
Hello World example
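#include <iostream>   // declares std::cout and std::endl

int main(int argc, char *argv[])
{
    // write the greeting followed by a newline, then flush the stream
    std::cout << "Hello World" << std::endl;
    return 0;
}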
Structure
Format: free form
Lexical elements
Source code character set: A C++ compiler may use any character set that
includes at least the following characters: the 52 upper case and lower case
alphabetic characters ( A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z ), the 10 decimal digits
( 0 1 2 3 4 5 6 7 8 9 ), the blank or space character, and 29 designated
graphic characters ( ! # % ^ & * ( ) - _ + = ~ [ ] \ | ; : ' " { } , . < > / ? ).
Five formatting characters (backspace, horizontal tab, vertical tab, form
feed, and carriage return) are often used in C++ (formatting characters are
treated as spaces). The dollar sign ($) and the at sign (@) are also commonly
used (but not required by the standard). Some form of line separator is
required, but it doesn’t have to be an actual character or character sequence.
Execution character set: The
execution character set for C++ is required to have the standard characters of
the source code character set, plus a null character and a newline character.
The null character must have the value 0 and is used to mark the end of
strings. The newline character is used to divide character streams into lines
during input or output. Run time libraries may convert between the newline
character and some other character(s) (or lack of characters) during execution
(such as compacting the carriage return/line feed combination into the newline
character or generating the newline character at the end of a logical record or
transforming between various record separators and the newline character).
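The roles of the null character and the newline character can be sketched as follows. This is a minimal illustration, not part of the original text; the helper names count_until_null and count_newlines are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// The null character has the value 0.
static_assert('\0' == 0, "the null character must have the value 0");

// Hypothetical helper: counts the characters before the terminating null,
// which is the same quantity std::strlen reports.
std::size_t count_until_null(const char *text)
{
    std::size_t length = 0;
    while (text[length] != '\0') {
        ++length;
    }
    return length;
}

// Hypothetical helper: counts lines by counting newline characters.
std::size_t count_newlines(const char *text)
{
    std::size_t lines = 0;
    for (; *text != '\0'; ++text) {
        if (*text == '\n') {
            ++lines;
        }
    }
    return lines;
}
```

Note that sizeof("Hello") is 6, not 5, because the compiler appends the null character to every string constant.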
White space: White space
in C++ includes the blank (space character), horizontal tab, end-of-line,
vertical tab, form feed, and comments. White space is ignored by the compiler
(except when required to separate tokens or when used in a character or string
constant), and therefore can be used freely by the programmer to make the
program easy for a human to read. Some implementations of C++ treat nonstandard
source characters as either white space or line breaks.
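As a small sketch of how freely white space can be used, the three functions below consist of the same kinds of tokens and behave identically. The function names are hypothetical and chosen only for illustration.

```cpp
// Compact form: only the white space needed to separate tokens.
int sum_compact(int a,int b){return a+b;}

// The same tokens, spread out freely for readability.
int sum_spaced( int a,
                int b )
{
    return a
         + b;
}

// A comment counts as white space, so it can separate two tokens.
int/* comment as separator */sum_commented(int a, int b) { return a + b; }
```

In sum_compact the space after "int" and after "return" cannot be dropped, because it separates a reserved word from a following identifier.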
Line termination: Each
line in a C++ source program is terminated with an end-of-line character or
character sequence. Optionally, certain formatting characters (such as carriage
return, form feed or vertical tab) can also terminate lines. An empty line is a
line that consists of only a terminating character or character sequence or
white space and line termination. A logical source line can be continued past a
line termination by using the backslash character (\) or the ANSI C trigraph
??/ immediately before the line termination. String constants and preprocessor
command lines can cross line breaks through the use of logical source lines. In
some implementations of C++, tokens can also cross line breaks through the use
of logical source lines.
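Line continuation with the backslash might look like the sketch below. The macro and variable names are hypothetical. The last declaration uses adjacent string constants, which the compiler concatenates; that technique is not described above and is shown only for comparison.

```cpp
#include <cstring>

// A preprocessor command line continued over two physical lines.
#define SQUARE_PLUS_ONE(x) \
    ((x) * (x) + 1)

// A string constant continued with a backslash. The next physical line
// starts in the first column so that no extra spaces enter the string.
const char *continued = "Hello, \
World";

// For comparison: adjacent string constants are joined by the compiler.
const char *concatenated = "Hello, "
                           "World";
```

Both strings hold the same twelve characters. The backslash must be the very last character on its physical line for the continuation to take effect.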
Line length: Many C++ compilers impose a maximum line length (both for
physical source lines and for logical source lines).
Escape characters: The backslash character (\) is used as an escape
character, allowing a programmer to include characters that would normally
have a special meaning for the compiler.
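A few common escape sequences, as a minimal sketch (the variable names are hypothetical):

```cpp
#include <cstring>

// Each escape sequence below denotes a single character.
const char quote_char     = '\'';  // single quote inside a character constant
const char backslash_char = '\\';  // the backslash itself

const char *quoted = "She said \"hi\"";  // double quotes inside a string
const char *tabbed = "a\tb\n";           // horizontal tab and newline
```

Without the backslash, the quote characters would end the constant early and the compiler would reject the declaration.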
Alternative characters: Some C++ compilers support the ANSI C trigraphs
(see C). Trigraphs were removed from the language in C++17.
Multibyte characters: C++ supports both wide characters and multibyte
characters.
Wide characters are characters whose binary representation is more than one
byte, typically used for expressing large alphabets.
Multibyte characters are the external representation of a wide character, in
either the source or execution character set.
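The relationship between the two representations can be sketched with the standard library function std::mbstowcs, which converts a multibyte string to wide characters according to the current C locale. The wrapper name to_wide is hypothetical, and the example assumes the default "C" locale and plain ASCII input.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>

// A wide string constant: each element has type wchar_t.
const wchar_t *wide_text = L"abc";

// Hypothetical helper: converts a multibyte string to wide characters using
// the current C locale. Returns the number of wide characters written (not
// counting the terminating null), or static_cast<std::size_t>(-1) if the
// input is not a valid multibyte sequence in that locale.
std::size_t to_wide(const char *multibyte, wchar_t *out, std::size_t capacity)
{
    return std::mbstowcs(out, multibyte, capacity);
}
```

The size of wchar_t is implementation defined, so code should not assume a particular width.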
Comments: Comments are started by the occurrence of the two character
sequence /* at any time other than within a character or string constant.
Comments are terminated by the two character sequence */. Comments can also
start with // and end with the end of the line.
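Both comment forms, and the string constant exception, in a short sketch (the variable names are hypothetical):

```cpp
#include <cstring>

/* A block comment runs from the opening pair of characters
   to the first closing pair, possibly across several lines. */
int answer = 42;   // a line comment ends at the end of the line

// Inside a string constant the comment markers are ordinary characters.
const char *not_a_comment = "/* still text */ // also text";
```

The string keeps all 29 of its characters, because nothing inside the quotes is treated as a comment.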
Tokens
A C++ compiler always collects characters into the longest possible tokens,
even if the result is not valid C++. White space always divides tokens. White
space must be used to separate an identifier, reserved word, integer constant,
or floating point constant from a following identifier, reserved word, integer
constant, or floating point constant.
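The longest-token rule can be seen in the following sketch, where the helper name munch_sum is hypothetical:

```cpp
// Hypothetical helper: a+++b is collected as the tokens  a ++ + b
// (longest token first), not as  a + ++b. It therefore adds the old
// value of a to b and then increments a.
int munch_sum(int &a, int b)
{
    return a+++b;
}

// By the same rule, a+++++b would be collected as  a ++ ++ + b,
// which is not valid C++, even though  a++ + ++b  would be.
```

Writing the expression as a++ + b makes the intent clear and produces exactly the same tokens.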
Operators: C++ has 15 simple operators ( ! % ^ & * - + = ~ | . < > / ? ), 10
compound assignment operators ( += -= *= /= %= <<= >>= &= ^= |= ), and 11
other compound operators ( -> ++ -- << >> <= >= == != && || ).
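A minimal sketch of how compound assignment operators differ from the equality operator; the function names are hypothetical:

```cpp
// Hypothetical helper: applies several compound assignment operators in turn.
int apply_compound(int value)
{
    value += 3;    // same as value = value + 3
    value *= 2;    // same as value = value * 2
    value <<= 1;   // same as value = value << 1
    value %= 7;    // same as value = value % 7
    return value;
}

// == compares its operands and yields a bool; it does not assign anything.
bool is_equal(int left, int right)
{
    return left == right;
}
```

Starting from 1, apply_compound passes through 4, 8, and 16 before the final remainder operation leaves 2.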
Separators: C++ has 9 separator tokens ( ( ) [ ] { } , ; : ).