We can motivate the need for programming languages to provide
constructs to define data types in much the same way that we motivated the need
for constructs to define control: without them, the code hides the
intent of the programmer. Just as the unrestricted use of
goto's can produce spaghetti code in which it is very
hard to understand the flow of control, so the unrestricted use of pointers
can produce spaghetti storage in which it is very hard to understand
the structure of data. Just as programming languages slowly evolved better
and better constructs for control flow, so they have also evolved better
and better constructs for structuring data. The main concept is that of a
type, which is a name for a collection of data items (called "data
representations" by Sethi, and the same as r-values) having similar
structure; the types of a language divide its data items into distinct (but
not necessarily disjoint) classes.
If a language has a good type system, then programs will be easier to read, because it will be clearer what is going on. Programs will also be easier to write, because the compiler will be able to detect many errors, which makes debugging easier. Moreover, types can help the compiler produce better code, especially for storage allocation. A strongly typed language requires type declarations for all r-values, whereas an untyped language requires no type declarations. Older languages tend to be less strongly typed than newer languages.
In lecture notes from 1967, Christopher Strachey of Oxford University
gave a classification of the different kinds of polymorphism. (This is the
same Strachey who introduced the very useful notions of "l-value" and
"r-value" in these same lecture notes, and who designed CPL, which inspired
C; in addition, he was the co-founder, with Dana Scott, of denotational
semantics.) The kinds of polymorphism are parametric, subtype, and ad hoc.
Ad hoc polymorphism is basically arbitrary overloading, for example,
using + for both integer addition and Boolean exclusive or.
Subtype (or subsort) polymorphism requires consistency across a type (or sort)
hierarchy, for example, + for integer addition should agree
with + for addition of reals and addition of rationals. The
most original idea is parametric polymorphism, where an operation is
parameterized by the type of its arguments; for example, head
makes sense for lists of any type of element, and can be considered to have
rank list a -> a, for any type a. Parametric
polymorphism plays an important role in the ML language.
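The three kinds of polymorphism can be sketched in Python, used here only as convenient notation. This is an illustration under stated assumptions, not code from Sethi or Strachey: the names plus and agrees_across_hierarchy are hypothetical, and Python checks the type annotation on head only with an external type checker, not at run time.

```python
from fractions import Fraction
from functools import singledispatch
from typing import Sequence, TypeVar

T = TypeVar("T")  # stands for "any element type a"


def head(items: Sequence[T]) -> T:
    """Parametric: a single definition that works for every element type."""
    if not items:
        raise ValueError("head of an empty list")
    return items[0]


@singledispatch
def plus(x, y):
    """Ad hoc: one name, with an unrelated implementation per argument type."""
    raise TypeError(f"plus is not defined for {type(x).__name__}")


@plus.register
def _(x: bool, y: bool) -> bool:
    return x != y  # Boolean exclusive or


@plus.register
def _(x: int, y: int) -> int:
    return x + y  # integer addition


def agrees_across_hierarchy(m: int, n: int) -> bool:
    """Subtype: adding as integers, then viewing the result as a rational,
    gives the same answer as viewing both as rationals and then adding."""
    return Fraction(m + n) == Fraction(m) + Fraction(n)


print(head([3, 1, 2]), head(["a", "b"]))  # 3 a
print(plus(True, True), plus(2, 3))       # False 5
print(agrees_across_hierarchy(2, 3))      # True
```

Note that the two versions of plus have nothing to do with each other, which is exactly what makes the overloading ad hoc, whereas head has one body that never inspects its elements.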
Abstract types are an important topic not discussed in this chapter. By hiding the representation of a data type, they make it impossible for certain kinds of problem to arise; one example would be the infamous Y2K problem.
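To make the Y2K remark concrete, here is a minimal sketch in Python of a year stored behind an abstract interface. The class Year and its method years_until are hypothetical names chosen for illustration, and Python enforces the hiding only by convention (the leading underscore), so this shows the idea rather than a guarantee.

```python
class Year:
    """An abstract year: clients use the operations, never the stored form."""

    def __init__(self, year: int) -> None:
        self._value = year  # hidden representation: the full year number

    def years_until(self, other: "Year") -> int:
        """Number of years from this year to the other one."""
        return other._value - self._value

    def __str__(self) -> str:
        return f"{self._value:04d}"


# Exposed two-digit representation: client code subtracts raw digits itself.
exposed_1999, exposed_2000 = 99, 0
print(exposed_2000 - exposed_1999)              # -99

# Hidden representation: the same question asked through the interface.
print(Year(1999).years_until(Year(2000)))       # 1
```

Because client code never touches _value, a poor choice of representation could be repaired inside the class alone, instead of in every program that manipulates dates.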
Sethi uses the word "object" for "something meaningful for an
application" (page 102), but this seems too vague and general, and anyway,
the word is better reserved for entities created as instances of classes in
the object paradigm. So the first bullet about uses of types on page 103
would be better phrased as "classify data representations." On page 104,
note that <field-list> is a comma-separated list of
<field>'s. On page 116, there are two instances of
ord in code that should be ord. On page 123, the last
sentence of Section 4.5 would be more precise if it said "subclass" instead
of "class."
Of historical interest in this chapter is the discussion of Konrad Zuse's Plankalkül, which was perhaps the first language that could reasonably be called "high level," even though, because of World War II, it was never implemented (maybe fortunately for the Allies). See pages 101 and 146.
Finally, the different kinds of type equivalence discussed near the end of the chapter are interesting. Many people do not realize that there are different notions of type equivalence, that different languages make different choices, and that the kind of type equivalence used can sometimes make a big difference in the meaning of code (see page 139 ff).
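The difference between two common notions, structural equivalence and name equivalence, can be sketched with a toy checker written in Python. Everything here is an assumption made for illustration (the classes Basic, Named and Record, the example types Celsius and Fahrenheit, and the simplified rules); it does not reproduce the definitions of any particular language, and it does not handle recursive types.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Basic:
    name: str  # a built-in type such as "real"


@dataclass(frozen=True)
class Named:
    name: str  # a reference to a declared type name


@dataclass(frozen=True)
class Record:
    fields: Tuple[Tuple[str, "Type"], ...]  # (field name, field type) pairs


Type = Union[Basic, Named, Record]
Env = Dict[str, Type]  # maps declared type names to their definitions


def expand(t: Type, env: Env) -> Type:
    """Replace a declared name by its definition (no recursive types)."""
    while isinstance(t, Named):
        t = env[t.name]
    return t


def structurally_equivalent(a: Type, b: Type, env: Env) -> bool:
    """Equivalent if, after expanding names, the two types are built alike."""
    a, b = expand(a, env), expand(b, env)
    if isinstance(a, Basic) and isinstance(b, Basic):
        return a.name == b.name
    if isinstance(a, Record) and isinstance(b, Record):
        return len(a.fields) == len(b.fields) and all(
            fa == fb and structurally_equivalent(ta, tb, env)
            for (fa, ta), (fb, tb) in zip(a.fields, b.fields)
        )
    return False


def name_equivalent(a: Type, b: Type) -> bool:
    """Equivalent only if both are the same declared or built-in name."""
    if isinstance(a, Named) and isinstance(b, Named):
        return a.name == b.name
    if isinstance(a, Basic) and isinstance(b, Basic):
        return a.name == b.name
    return False  # simplification: anonymous record types never match


env: Env = {
    "Celsius": Record((("degrees", Basic("real")),)),
    "Fahrenheit": Record((("degrees", Basic("real")),)),
}
celsius, fahrenheit = Named("Celsius"), Named("Fahrenheit")
print(structurally_equivalent(celsius, fahrenheit, env))  # True
print(name_equivalent(celsius, fahrenheit))               # False
```

Under structural equivalence a Celsius value could be passed where a Fahrenheit value is expected, while under name equivalence the same program would be rejected; this is the sense in which the choice can change the meaning of code.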