Some authors describe object-oriented programming as programming abstract data types and their relationships. Within this section we introduce abstract data types as a basic concept for object-orientation and we explore concepts used in the list example of the last section in more detail.
The first thing with which one is confronted when writing programs is the problem. Typically you are confronted with ``real-life'' problems and you want to make life easier by providing a program for the problem. However, real-life problems are nebulous and the first thing you have to do is to try to understand the problem to separate necessary from unnecessary details: You try to obtain your own abstract view, or model, of the problem. This process of modeling is called abstraction and is illustrated in Figure 3.1.
The model defines an abstract view to the problem. This implies that the model focusses only on problem related stuff and that you try to define properties of the problem. These properties include
by the problem.
As an example consider the administration of employees in an institution. The head of the administration comes to you and ask you to create a program which allows to administer the employees. Well, this is not very specific. For example, what employee information is needed by the administration? What tasks should be allowed? Employees are real persons who can be characterized with many properties; very few are:
Certainly not all of these properties are necessary to solve the administration problem. Only some of them are problem specific. Consequently you create a model of an employee for the problem. This model only implies properties which are needed to fulfill the requirements of the administration, for instance name, date of birth and social number. These properties are called the data of the (employee) model. Now you have described real persons with help of an abstract employee.
Of course, the pure description is not enough. There must be some operations defined with which the administration is able to handle the abstract employees. For example, there must be an operation which allows you to create a new employee once a new person enters the institution. Consequently, you have to identify the operations which should be able to be performed on an abstract employee. You also decide to allow access to the employees' data only with associated operations. This allows you to ensure that data elements are always in a proper state. For example you are able to check if a provided date is valid.
To sum up, abstraction is the structuring of a nebulous problem into well-defined entities by defining their data and operations. Consequently, these entities combine data and operations. They are not decoupled from each other.
The example of the previous section shows, that with abstraction you create a well-defined entity which can be properly handled. These entities define the data structure of a set of items. For example, each administered employee has a name, date of birth and social number.
The data structure can only be accessed with defined operations. This set of operations is called interface and is exported by the entity. An entity with the properties just described is called an abstract data type (ADT).
Figure 3.2 shows an ADT which consists of an abstract data structure and operations. Only the operations are viewable from the outside and define the interface.
Once a new employee is ``created'' the data structure is filled with actual values: You now have an instance of an abstract employee. You can create as many instances of an abstract employee as needed to describe every real employed person.
Let's try to put the characteristics of an ADT in a more formal way:
Definition (Abstract Data Type) An abstract data type (ADT) is characterized by the following properties:
With the first property it is possible to create more than one instance of an ADT as exemplified with the employee example. You might also remember the list example of chapter 2. In the first version we have implemented a list as a module and were only able to use one list at a time. The second version introduces the ``handle'' as a reference to a ``list object''. From what we have learned now, the handle in conjunction with the operations defined in the list module defines an ADT List:
However, all of these properties are only valid due to our understanding of and our discipline in using the list module. It is in our responsibility to use instances of List according to these rules.
The principle of hiding the used data structure and to only provide a well-defined interface is known as encapsulation. Why is it so important to encapsulate the data structure?
To answer this question consider the following mathematical example where we want to define an ADT for complex numbers. For the following it is enough to know that complex numbers consists of two parts: real part and imaginary part. Both parts are represented by real numbers. Complex numbers define several operations: addition, substraction, multiplication or division to name a few. Axioms and preconditions are valid as defined by the mathematical definition of complex numbers. For example, it exists a neutral element for addition.
To represent a complex number it is necessary to define the data structure to be used by its ADT. One can think of at least two possibilities to do this:
Point 3 of the ADT definition says that for each access to the data structure there must be an operation defined. The above access examples seem to contradict this requirement. Is this really true?
Let's look again at the two possibilities for representing imaginary numbers. Let's stick to the real part. In the first version, x equals c[0]. In the second version, x equals c.r. In both cases x equals ``something''. It is this ``something'' which differs from the actual data structure used. But in both cases the performed operation ``equal'' has the same meaning to declare x to be equal to the real part of the complex number c: both cases archieve the same semantics.
If you think of more complex operations the impact of decoupling data structures from operations becomes even more clear. For example the addition of two complex numbers requires you to perform an addition for each part. Consequently, you must access the value of each part which is different for each version. By providing an operation ``add'' you can encapsulate these details from its actual use. In an application context you simply ``add two complex numbers'' regardless of how this functionality is actually archieved.
Once you have created an ADT for complex numbers, say Complex, you can use it in the same way like well-known data types such as integers.
Let's summarize this: The separation of data structures and operations and the constraint to only access the data structure via a well-defined interface allows you to choose data structures appropriate for the application environment.
ADTs are used to define a new type from which instances can be created. As shown in the list example, sometimes these instances should operate on other data types as well. For instance, one can think of lists of apples, cars or even lists. The semantical definition of a list is always the same. Only the type of the data elements change according to what type the list should operate on.
This additional information could be specified by a generic parameter which is specified at instance creation time. Thus an instance of a generic ADT is actually an instance of a particular variant of the ADT. A list of apples can therefore be declared as follows:
List<Apple> listOfApples;
The angle brackets now enclose the data type for which a variant of the generic ADT List should be created. listOfApples offers the same interface as any other list, but operates on instances of type Apple.
As ADTs provide an abstract view to describe properties of sets of entities, their use is independent from a particular programming language. We therefore introduce a notation here which is adopted from [3]. Each ADT description consists of two parts:
As an example the description of the ADT Integer is presented. Let k be an integer expression:
Consequently, the postcondition of this operation is sum = N+k. Don't confuse this with assign statements as used in programming languages! It is rather a mathematical equation which yields ``true'' for each value sum, N and k after add has been performed.
The description above is a specification for the ADT Integer. Please notice, that we use words for names of operations such as ``add''. We could use the more intuitive ``+'' sign instead, but this may lead to some confusion: You must distinguish the operation ``+'' from the mathematical use of ``+'' in the postcondition. The name of the operation is just syntax whereas the semantics is described by the associated pre- and postconditions. However, it is always a good idea to combine both to make reading of ADT specifications easier.
Real programming languages are free to choose an arbitrary implementation for an ADT. For example, they might implement the operation add with the infix operator ``+'' leading to a more intuitive look for addition of integers.
ADTs allows the creation of instances with well-defined properties and behaviour. In object-orientation ADTs are referred to as classes. Therefore a class defines properties of objects which are the instances in an object-oriented environment.
ADTs define functionality by putting main emphasis on the involved data, their structure, operations as well as axioms and preconditions. Consequently, object-oriented programming is ``programming with ADTs'': combining functionality of different ADTs to solve a problem. Therefore instances (objects) of ADTs (classes) are dynamically created, destroyed and used.