Introduction to the Computer Graphics Reference Model

Table of contents

1 Introduction
  1.1 What is a reference model?
  1.2 What is computer graphics?
  1.3 Why do we need a CGRM?

2 Looking out
  2.1 Operator interface
  2.2 Application interface
  2.3 Data capture metafile interface
  2.4 Audit trail metafile interface

3 Looking in

4 The details
  4.1 Data elements
  4.2 Processing elements

5 Output by example
  5.1 Tessellation and transformation
  5.2 Property binding
  5.3 PHIGS implementation examples
  5.4 A more complex example
  5.5 Another input example

6 Input
  6.1 Where do tokens get processed?
  6.2 PHIGS examples

7 Expressing current standards using the CGRM
  7.1 Graphical Kernel System-ISO 7942
  7.2 Graphical Kernel System for Three Dimensions-ISO 8805
  7.3 Programmer's Hierarchical Interactive Graphics System-ISO 9592
  7.4 Metafile for the Storage and Transfer of Picture Description Information-ISO 8632

A History annex
   A.1 Level of detail
   A.2 Audience
   A.3 Architectural evolution
   A.4 Architecture
   A.5 Influence of old standards
   A.6 Components and frameworks

1 Introduction

The most unusual of the computer graphics standards developed by ISO/IEC is ISO/IEC 11072. Its title is the Computer Graphics Reference Model. Unlike other computer graphics standards, the CGRM cannot be implemented. It is under 40 pages in length, but that does not mean it contains little information. It is a terse, formally written document containing few examples.

Our aims here are threefold: to give a readable introduction explaining why this standard exists; to tell you when you should use it; and to give some examples which, hopefully, will make it all a lot easier to understand. Our intended audience is the person who is not a computer graphics expert but does have an interest in computer graphics standards and some knowledge of at least one such standard.

1.1 What is a reference model?

To understand what a reference model is, we must look at the meanings of the two words reference and model. A reference is something that can be referred to as an authority, those large books full of facts that have their own section in the library! A model is an example for imitation or comparison. For example, architects have models to show clients what they are aiming to produce.

So a reference model is an authoritative basis for the development of standards. It provides a pattern or set of principles which they must adhere to. Just as the architect's model will not show all the details, the CGRM gives the overall rules while leaving the individual standards developers to fill in the details for their particular requirements.

1.2 What is computer graphics?

As the CGRM is a Reference Model for Computer Graphics, the first thing it has to do is define computer graphics. That was probably the hardest part! What came out in the end was:

Computer Graphics is the creation of, manipulation of, analysis of, and interaction with pictorial representations of objects and data using computers.

As you can see, this single sentence packs in a lot of detail. Let's look at its phrases one at a time:

Pictorial representation of objects: note that objects themselves do not have to be pictures. Thus a computer graphics standard could deal with objects that are not themselves pictures. For example, the object could be a function of several variables which needs to be visualized.

Creation of: so it includes the construction of objects possibly based on graphical output and input. Scanning in images, using a digitizer and model building using data structures are all examples of creation.

Manipulation of: so you can access these objects, possibly modify them, build new ones out of existing ones and so on. This is all part of computer graphics.

Analysis of: so it includes all the processing which produces objects or their properties using geometric, topological or any other mathematical analysis. So illumination calculations and pattern recognition can both be described as part of computer graphics.

Interaction with: we are not just talking about drawing pictures. The whole area of an application interacting with a user via graphical input and output is regarded as part of computer graphics.

So the area defined as computer graphics is quite wide. It incorporates most aspects of activities such as window management and image processing within its broad definition. The emphasis is that at some stage the objects being dealt with will use pictures to describe them. In fact, a particular graphics standard might only be concerned with manipulating the objects prior to pictorial representation. For example, many of the PHIGS structure creation and manipulation functions are concerned with the production and manipulation of structure hierarchies.

If you look at the Definitions section of the CGRM, you will see all important definitions used in the CGRM organized in a tree-like structure. These start, of course, with the definition of computer graphics and work down to more specific definitions of particular functions. If a word appears in the CGRM that is not in the Definitions section, its meaning is the one that a computer professional would expect it to have.

1.3 Why do we need a CGRM?

In any complex activity-such as building a car or constructing a house-different parts will be done by separate groups of individuals. Consequently, there is a need for all the groups to have an overall view of what is being produced plus the details of how their activity fits into the grand plan. The same applies in development of computer graphics standards. We have groups defining standards for transmitting graphics, others are concerned with the rendering of pictures and so on. The CGRM provides the basic framework for these activities. Individual groups can see where their activity fits in the grand plan and what rules have to be obeyed so that their piece fits into the overall jigsaw.

Just as the car or building has to fit in with its surroundings, computer graphics resides in the milieu of computing. Computer graphics may need to be embedded in textual documents or be part of a CAD suite or need to be transmitted over wide area communications networks. These other areas have their own rules and constraints. Consequently, a second reason for the CGRM is to allow other areas to see where computer graphics fits in the total scene and what demands it will place on the other areas and vice versa.

The first set of computer graphics standards was completed before the CGRM. The aim was that they should all interwork but, due to the absence of this overall design philosophy, there are places where the parts fit less well together than they should. The hope is that the CGRM will ensure much closer compatibility for future computer graphics standards and provide a common terminology and vocabulary for their production.

2 Looking out

The overall structure of the CGRM at a coarse level of detail is shown in figure 1.


Figure 1 - Computer graphics

This view of the CGRM concentrates on the interfaces between computer graphics and external objects. The four important external objects identified by the CGRM are application, operator, data capture metafile, and audit trail metafile. The first of these permits external software to control the graphics system, the second allows human operators to interact with the system, and the last two permit the storage and retrieval of graphical information.

2.1 Operator interface

The ultimate aim of most computer graphics systems is to produce a visual display of information intended for viewing by human beings. To meet this aim, systems often accept graphical input from human beings. The operator interface is the final interface through which such information flows as it enters and leaves the computer graphics system. In some important cases it is convenient to think of the operator as being another system or piece of software rather than a human being. Thus the CGRM gives a very general definition of an operator as the external object that observes the contents of the display in the realization environment (to be explained below) and provides physical input tokens (to be explained below). Because the word "physical" implies a piece of hardware, we have used "realization" to make it clear that the operator may or may not be human and that he or she (the real physical operator) may be accessing the graphics system via some other software or hardware. By defining the operator in such a general way, we can think of complex issues-like the relationship between computer graphics and windowing systems-in convenient terms.

2.2 Application interface

ISO computer graphics standards have thus far consisted of generally useful (i.e. kernel) sets of graphics functions from which application specific toolkits might be built. Thus computer graphics systems based only on standards do not run by themselves. Additional application software must be added to do useful work. For example, graphics standards provide basic output units, like lines and filled areas, but do not directly provide higher level objects like bar and pie charts. Similarly, computer graphics standards provide input tools to identify graphical objects or enter text strings but do not provide a complete user interface for constructing application models. From the perspective of computer graphics, this higher-level software that tailors kernel graphics functionality to the needs of a particular constituency is called an application. The application interacts with computer graphics through the application interface. Typically an application interface is a library of procedure calls or message formats.

2.3 Data capture metafile interface

There are many cases where we want to capture graphical data for later use within the same system or for transfer to another system. For example, we might want to build a picture description which could be imaged to create a 35mm slide, printed on a colour printer, or used as clip art in a presentation at a remote site. In other cases an application wants to build a graphical model that can be saved and reloaded at a later time. The data capture metafile is the external object which represents all or part of the data in computer graphics for storage, retrieval or transmission. Computer graphics systems often provide special interfaces for data capture metafile import and export.

If you look at the detailed statements concerning data capture metafiles in the CGRM, you will see that the CGRM makes some clear statements about what you are allowed to do with them. It says that all the items you put in a metafile must be derived from graphical data at the same level of detail. You have to import metafiles at the same place you exported them - you cannot save some information and then attempt to use it for a different purpose.

If you read the CGRM quickly, you will miss this kind of rule. Nearly every statement in the CGRM gives advice about good practice. The wording has been carefully crafted to make it as succinct as possible. Read it carefully or you will miss some important points. These details are like the fine print in a contract!

2.4 Audit trail metafile interface

It is sometimes useful to capture the information exchanged between an application and a graphics system. For example, such information is useful for performance analysis studies and for resetting the state of a graphics system following an abnormal termination. The CGRM allows an audit trail metafile as the external object that represents the flow of information across the application interface.

You may wonder why audit trails cannot occur anywhere in the computer graphics system. Clearly, you can capture the flow of data through any particular point in a piece of software and can re-run the program using that data to test the remaining code. Sophisticated, commercially available debugging tools allow you to do this. The real problem-as with debuggers-is to know what you are seeing. If you have ever tried using a debugger on code generated by an optimizing compiler you may have noticed that this is not easy. Values that you would have assumed to have been set by a particular point may not be because the compiler has moved the code closer to where it is required.

Similar problems arise in computer graphics. The CGRM gives you a model explaining how a computer graphics system will work. But that is all it is. Different real systems may have completely different internal structures as dictated by their internal hardware and software characteristics and their performance requirements. There is nothing wrong with this as long as the overall effect conceptually fits the model. Consequently, there can be no fixed internal interfaces that an implementation must adhere to. And audit trails are only allowed at the application interface!

3 Looking in

The next level of description of the model is shown in figure 2.

Figure 2 - The five environments

The key idea in this description is that computer graphics can be considered as a series of transformation steps between the application and operator. This view applies equally to output data and input data. On the output side, information is transformed as it flows from the application to the operator. It is changed from an abstract model into light emanating from a display or into ink on a piece of paper. On the input side, physically transduced measurements-such as button presses and relative movements-are converted as they flow from the operator to the application into abstract models of operator intent.
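This chain of transformation steps can be sketched in code. The following Python fragment is purely illustrative: the CGRM defines no API, and every function name and data field here is our own invention. Each environment is modelled as a function that transforms a dictionary describing a primitive on its way from application to operator.

```python
# Illustrative only: five stages transforming output data on its way
# "down" from the application to the operator. All names are ours.

def construction(model):
    # Turn abstract application data into a graphical primitive.
    return {"type": "line", "points": model["points"], "colour": model["colour"]}

def virtual(prim):
    # Device-independent scene representation (world coordinates).
    prim["coords"] = "world"
    return prim

def viewing(prim):
    # Apply a view: project 3D points onto a 2D picture plane.
    prim["points"] = [(x, y) for (x, y, z) in prim["points"]]
    return prim

def logical(prim):
    # Bind rendering attributes such as a nominal line width.
    prim.setdefault("width", 1.0)
    return prim

def realization(prim):
    # Final, device-specific image description (here just a label).
    prim["device"] = "raster"
    return prim

def to_operator(model):
    prim = construction(model)
    for stage in (virtual, viewing, logical, realization):
        prim = stage(prim)
    return prim

out = to_operator({"points": [(0, 0, 0), (1, 1, 1)], "colour": 2})
```

Input would flow through the same stages in the opposite direction, with each stage converting device-level measurements into progressively more abstract tokens.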

The five environments were chosen to capture the most fundamental data and processing abstractions in computer graphics. To some extent the number of environments is arbitrary, reflecting the level of detail most often used to model computer graphics systems. Environments were added or removed based on the amount of insight provided to the description of computer graphics standards and their relationships to the external world. The internal structure of an environment is described in the next section, and we will see that all environments share a common architecture. For the moment it is sufficient to state that each environment has well-defined inputs, outputs and internal state.

Here is a more detailed explanation of the well-defined purpose of each environment:

Construction environment. Most applications are not intrinsically graphical in nature but use graphics to represent application data. This environment allows the application to construct a model representing its application data. This model serves as the basis for eventually producing pictures and for relaying operator input to the application.

Virtual environment. Most graphics systems can be separated into device independent and device dependent parts. There are important economic reasons for making such a distinction. First, a single system may incorporate many different graphical devices based on inherently different technologies. Second, such device technologies are continually evolving. Finally, since computer graphics software is expensive and difficult to create, it is only natural to seek to minimize the changes necessary to add a new device to the system. The virtual environment is the environment in which a device independent representation of graphics is created. We call this representation, consisting of the totality of viewable objects, the scene. On the output side, this means that abstract output primitives are created for later realization (mapping) to device-dependent representations. One example might be the mapping of a coloured virtual line into a gray line on a bitonal output device. On the input side, device dependent input must be represented in terms of abstract device independent characteristics.

Viewing environment. In this environment, a specific view of the scene is taken to produce a picture. This is particularly important in 3D systems where a 3D scene is projected onto a plane to produce a 2D picture for display. The concept is equally applicable to 2D systems where the viewing environment defines the size and orientation of the 2D scene within the 2D picture. On the input side, in a 3D system, 2D input coordinates must be mapped by the inverse of the viewing transformation into the correct 3D information.

Logical environment. The viewing environment produces a picture that is an ideal view of the scene without consideration of device-dependent characteristics of the environment where it will be displayed. Binding of attributes that describe rendering is performed in the logical environment. Examples of such attributes are the nominal width of a line (which might depend on the pen size of a plotter or the pixel size of a CRT display), the linestyle used, and the availability of colour. Just as output devices have differing physical capabilities that must be accommodated, so do input devices. For example, device dependent relative positions from a mouse might be transformed into absolute 2D locations in a device independent coordinate system.

Realization environment. The realization environment finishes the processing of graphical output by completely defining the image to be displayed. By the time it leaves the logical environment, all graphical output is complete in the sense that all attributes needed to view it have been bound. However, more information may need to be added and compromises must be made if the device is not perfect. For example, outputting straight lines on a pixel device is always a compromise. Similarly, input is received from the operator and transformed to the form required by the logical environment. The CGRM conveniently sidesteps the issue of whether physical graphical devices are contained within this environment by taking an abstract view of what can be considered to be an operator. If the operator is a traditional human being, then it is convenient to think of the realization environment as encompassing graphical hardware. In this case the operator interface is in reality a physical one based on light emitted from displays, ink on paper, and the transduction of physical motion. On the other hand, the operator might be another computer system-such as a virtual terminal, window management system or machine vision system-or even another computer graphics system.

Although figure 2 shows only single instances of each environment, the CGRM does allow multiple instances of any environment. Thus fan-in (many to one mappings) and fan-out (one to many mappings) are allowed only at the interfaces between environments. They may occur between any pair of adjacent environments and both fan-in and fan-out may occur at the same time in a graphics system. An obvious example of fan-out occurs in the description of the flow of output data in the graphics standard GKS-3D (ISO/IEC 8805), where a single virtual environment (corresponding to a normalized device coordinate (NDC) space) maps onto multiple viewing, logical and realization environments (workstations). This is illustrated in figure 3. There is a corresponding example of fan-in in the description of the flow of input data in GKS-3D between the environments modelling workstations and the virtual environment.


Figure 3 - Fan-out in GKS-3D
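This fan-out can be sketched as follows. The `Workstation` class and its methods are hypothetical, intended only to show a single shared scene being absorbed by several lower environment pipelines.

```python
# Hypothetical sketch of GKS-3D-style fan-out: one scene held in a
# single virtual environment is distributed to several workstations.

class Workstation:
    """One viewing-to-realization pipeline (names are ours)."""
    def __init__(self, name):
        self.name = name
        self.picture = None

    def absorb(self, scene):
        # Each workstation takes its own view of the shared scene.
        self.picture = f"{self.name}: view of {len(scene)} primitives"

scene = ["polyline", "fill area", "text"]        # single NDC-space scene
workstations = [Workstation("plotter"), Workstation("crt")]
for ws in workstations:                          # fan-out at the interface
    ws.absorb(scene)
```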

4 The details

The CGRM provides a third level of description that defines the internal structure of each environment. This detailed environment model is illustrated in figure 4. The CGRM uses a specific set of data elements (round objects in figure 4) and processing elements (rectangular objects in figure 4) to describe computer graphics. An environment thus consists of a set of such data elements and a set of such processing elements.

Figure 4 - Inside each environment

The number of data and processing elements was chosen based on the level of description felt necessary to characterize the work performed in each environment at this level of abstraction. Their arrangement is based on an abstract view of the common arrangement of data and processing steps found in many graphics standards and systems. Abstract names were chosen for each of them so as not to bias the reader's perception of their purposes based on preconceived notions. For example, the term collection store is used rather than segment store or structure store since the latter two terms have specific meanings in the GKS and PHIGS standards, respectively. The notion of a collection store neatly abstracts the essential common properties of both segments and structures in a way that is completely general and applicable in each environment.

In the model's view, the most important graphical processing occurs as output primitives-atomic units of graphical output like lines, polygons, or text strings-flow through the various environments from the application to the operator, or as input tokens-atomic units of graphical input like locations or text strings-flow through the various environments from the operator to the application. At places in the processing chain output primitives or input tokens might be stored or converted from one form to another. For example, a 3-dimensional line might be projected to make a 2-dimensional line.
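Such a conversion, reducing a 3-dimensional line to a 2-dimensional one, might be sketched as follows (an orthographic projection onto the z = 0 plane; the function is ours, not the CGRM's):

```python
# Orthographic projection of a 3D line segment to 2D: drop the z
# coordinate of each endpoint. A real system would apply a full
# viewing transformation, possibly with perspective.

def project_line(p0, p1):
    """Return the 2D line obtained by discarding z from each endpoint."""
    return (p0[:2], p1[:2])

line2d = project_line((0.0, 0.0, 5.0), (1.0, 2.0, 3.0))
```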

You will notice when you read the CGRM that the wording for input and output is not completely symmetrical. This reflects the perceived difference between human input and human output. Humans produce quite complex, highly-structured graphical output in drawing or painting, but we receive input in quite low-level chunks. Consequently, the imbalance in wording is deliberate and not accidental!

4.1 Data elements

Taking the data elements first, their purposes are:

Composition. The composition provides the abstract notion of the graphical output working set of each environment. This is motivated by considering that the application is striving to create a picture to be observed by the operator. The composition captures this idea of a picture at each level of abstraction recognized in the CGRM. The notion agrees so well with our intuition that well-known and instantly recognizable names can be assigned to the composition in each environment. For example, in the construction environment this output set is called the model, while in the viewing environment it is called the picture. The composition is constructed from output primitives in a well defined, spatially structured order and there is only a single composition at any one time within each environment.

Collection store. A collection is simply a named, structured set of graphical objects. These pieces are the building blocks from which compositions-like models and pictures-are produced. It is common for more than one such collection to exist simultaneously at a given level of abstraction, and for collections to have a degree of permanence spanning at least the duration of a typical interactive graphical session. This allows collections to be manipulated (edited) to eventually produce graphical output. The collection store-as one of the abstract storage mechanisms internal to an environment-provides a mechanism for storing such graphical information.

The composition and collection store are the only places where you would find graphical output within an environment.

Token store. The token store is the analogue in input processing of the composition in output processing. It represents the input working set of the environment.

Aggregation store. The aggregation store is the input analogue of the collection store. It provides a working space that may be used by input processing in constructing input tokens in the form required by the token store.

Environment state. The environment state data element represents other state information, for example current values of properties (such as "modal" settings of graphical attributes) which may be shared between processing elements.
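As a summary, the five data elements of an environment can be sketched as a simple record. The field names follow the CGRM terms, but the types and representation are entirely our own choice.

```python
# Sketch of the five CGRM data elements present in every environment.
# The concrete types (lists, dicts) are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class EnvironmentData:
    composition: list = field(default_factory=list)        # output working set
    collection_store: dict = field(default_factory=dict)   # named collections
    token_store: list = field(default_factory=list)        # input working set
    aggregation_store: list = field(default_factory=list)  # input workspace
    environment_state: dict = field(default_factory=dict)  # shared properties

env = EnvironmentData()
env.collection_store["axes"] = ["polyline", "text"]        # a named collection
env.environment_state["line_colour_index"] = 1             # a "modal" property
```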

4.2 Processing elements

The processing elements also have precise purposes in the model. Of the five processes, four are dedicated to transforming input and output data at the four main interfaces to each environment. The fifth performs processing internal to the environment. A more detailed description of each follows. Note that the interfaces at the top of the construction environment and the bottom of the realization environment are special since they are external interfaces rather than interfaces to other graphical environments. To simplify the presentation, we ignore this in the following descriptions and simply refer to data coming from/going to the next lower environment.

Absorption. It is most often appropriate to think of the transformation of graphical data from one form to another as happening as the data is absorbed into a lower environment. Thus the absorption process does things like coordinate transformation, property binding, and clipping. The location and nature of such processing is so well understood that the CGRM was able to assign commonly understood names to each absorption process. For example, absorption into the viewing environment is called projection, while absorption into the logical environment is called completion.

Emanation. It is most appropriate to think of input as being transformed as it leaves an environment for the next higher environment. Geometric and other transformations may be applied. For example, before input data leaves the viewing environment in a 3D graphics system it must be transformed from a 2D coordinate system to a 3D coordinate system. This is based on the general CGRM principle that the same coordinate system must be used for both input and output information passing between each pair of adjacent environments. Thus, emanation and absorption within each environment must be coordinated so that output entities entering an environment and input entities leaving the environment are defined in the same coordinate system.

Manipulation. The centre process in the detailed environment model is manipulation. In many ways this process does the bulk of the difficult work within the environment. It is responsible for providing all linkages between the collection store, the composition, the aggregation store and the token store. For example, manipulation takes the collections in the construction environment and produces the model that is passed on to lower environments. This generally involves translating abstract graphical object descriptions into graphical output primitives. The second important thing that manipulation does is to link input and output. For example, the echoing of input values and the tracking of cursor movements are done by the manipulation process. It alone is in a position to observe the contents of the aggregation store and reflect any necessary changes back into the composition.

Distribution. Output data must be distributed to lower environments. This is the function of the distribution process. One thing that it does is to fan-out data to several lower environments when this is required. Notice that most of the intense transformation work on graphical output data is done as it is absorbed into the lower environment rather than as it is distributed from a higher environment.

Assembly. Input data must be accepted from lower environments. This is the job of the assembly process. Fan-in to the environment from several lower environments is one thing it does. Notice that most of the intense transformation work on graphical input data is done as it is emanated from the lower environment rather than as it is assembled into the higher one.
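The way these processes fit around the data elements can be sketched as follows. This is a hedged illustration, not an implementation: each process is a stub that shows only the direction of data flow, and all names are ours.

```python
# Stub sketch of the CGRM processing elements wired around an
# environment's data elements. Only the data flow is modelled.

class Environment:
    def __init__(self, name):
        self.name = name
        self.composition = []    # output working set
        self.token_store = []    # input working set
        self.children = []       # next lower environments

    def absorb(self, primitive):
        # Transformation work (binding, clipping, ...) happens on entry.
        self.composition.append(primitive)
        self.distribute(primitive)

    def distribute(self, primitive):
        # Fan-out to lower environments; little transformation here.
        for child in self.children:
            child.absorb(primitive)

    def manipulate(self):
        # Links collections, composition and input stores (e.g. echoing).
        pass

    def emanate(self, token):
        # Input is transformed as it leaves for the next higher environment.
        return token

    def assemble(self, tokens):
        # Fan-in of input from lower environments.
        self.token_store.extend(tokens)

top = Environment("construction")
lower = Environment("virtual")
top.children.append(lower)
top.absorb("polyline")          # flows down into the lower environment
```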

5 Output by example

Output primitives include such things as polylines, text, cell arrays, and splines. They have properties that affect their style, like colour, line width, and fill pattern. Other properties are geometric, such as transformations and viewing information.

5.1 Tessellation and transformation

The absorption process of an environment may replace one output primitive by several new ones, using some of the properties to do this. For example, if the construction environment can handle splines as output primitives but the virtual environment cannot, the absorption process could replace the primitive by one or more polylines, using a given tolerance attribute. After this operation, commonly referred to as tessellation, the information that the output primitive was originally a spline, including its tolerance attribute, is lost.
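A minimal tessellation routine along these lines, written for a quadratic Bezier curve, might look like this. The code and its error test are our own; real systems use more sophisticated error bounds.

```python
# Replace a quadratic Bezier spline by a polyline, subdividing until
# the curve deviates from the chord by less than the given tolerance.

def flatten(p0, p1, p2, tol):
    """Return polyline points approximating the Bezier (p0, p1, p2)."""
    # Distance from control point p1 to the chord p0-p2 bounds the error.
    ax, ay = p2[0] - p0[0], p2[1] - p0[1]
    bx, by = p1[0] - p0[0], p1[1] - p0[1]
    cross = abs(ax * by - ay * bx)
    chord = (ax * ax + ay * ay) ** 0.5 or 1.0
    if cross / chord <= tol:
        return [p0, p2]
    # de Casteljau subdivision at t = 0.5
    m01 = ((p0[0] + p1[0]) / 2, (p0[1] + p1[1]) / 2)
    m12 = ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)
    mid = ((m01[0] + m12[0]) / 2, (m01[1] + m12[1]) / 2)
    left = flatten(p0, m01, mid, tol)
    right = flatten(mid, m12, p2, tol)
    return left + right[1:]   # drop the duplicated midpoint

pts = flatten((0, 0), (1, 2), (2, 0), tol=0.05)
```

After this call only the polyline points remain; the fact that they came from a spline, and the tolerance used, are gone, just as the text describes.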

Whenever a primitive is transformed, the result must be representable as (possibly different) primitives within the environment: in other words, the set of output primitives is invariant under geometric transformations. For example, if clipping is provided above the realization environment, the result of clipping must be primitives in that environment. If the clipped primitive cannot be expressed in this way, clipping has to wait (possibly indicated by a clip attribute) until the realization environment.

While we are on clipping, you should notice that clipping is a word defined in the CGRM as:

The action of constraining the geometric shape and extent of either an output primitive or input token to be within a specified region.

Note that this definition does not say clipping is to be within a certain boundary. The definition of clipping includes shielding and even clipping against a half-plane. Read the definitions carefully or you may think they say things that they don't!
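For instance, clipping a line segment against a half-plane fits the definition just as well as clipping against a rectangle. A small sketch (our own code, not from any standard):

```python
# Clip a 2D line segment to the half-plane x >= 0.
# Returns the surviving segment, or None if nothing survives.

def clip_halfplane(p0, p1):
    (x0, y0), (x1, y1) = p0, p1
    if x0 < 0 and x1 < 0:
        return None               # entirely outside the region
    if x0 >= 0 and x1 >= 0:
        return (p0, p1)           # entirely inside the region
    # Exactly one endpoint is outside: intersect with the line x = 0.
    t = -x0 / (x1 - x0)
    crossing = (0.0, y0 + t * (y1 - y0))
    return (crossing, p1) if x0 < 0 else (p0, crossing)

seg = clip_halfplane((-1.0, 0.0), (1.0, 2.0))
```

The result is again a line segment, so in this case the invariance rule of section 5.1 is satisfied without deferring the clip.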

5.2 Property binding

A property is said to be bound to a primitive at the moment it is used for processing the primitive. For example, in GKS the colour index, line width, and line style are all bound to a polyline primitive at specification time. So, each time this particular polyline is referenced it has the same colour index, and so on. In PHIGS, on the other hand, as the structure is built, only the geometry of the polyline is known. Attributes are not bound until traversal of the hierarchical structure, so the same geometric description can be referenced several times with different attributes. It is also worth noting that the attribute is a colour index, so even though all the attributes have been bound to the primitives, there is still work to do: the required colour is obtained by accessing the colour table with the colour index. Finally, there can be properties that are specified independently of the graphical workstation on which a picture is eventually displayed or printed, but which may be applied on each workstation in potentially different ways. A good example is a device-independent colour specification which must eventually be mapped to a "closest" equivalent available colour on each workstation. A pen plotter might have only a limited set of pen colours available and a monochrome monitor might not be able to display anything but shades of gray.

In the second (PHIGS) example, the property is bound in the absorption process. It may then either belong to the description of the primitive itself, or it may be taken from a store, notably the state of the environment where it is absorbed. Where a property gets bound and to which primitive it is bound is an important distinguishing characteristic of environments.

Construction environment: Here the geometry of output primitives need not yet be fully defined. For example, a fill area may be specified originally by indices into a table of 3D coordinates, a table which is not defined along with the primitive but as part of the virtual environment state. Other properties must be specified no later than this environment, i.e. at the time of primitive creation. This is because they control the interaction with the model, such as segment names and pick identifiers in GKS.

Virtual environment: Before a primitive enters this environment all properties that define its geometry in a scene must be bound. This is because it is not possible to view a scene where parts are missing or whose geometry is not fully defined. For example, the coordinates of a polyline must reside somewhere in the environment: in the primitive definition, the environment state, the collection store, or the composition. They cannot be contained in the state of a lower environment. On the other hand, properties needed for rendering, e.g. colour, need not be bound in the virtual environment.

Viewing environment: On entry into this environment, all properties related to viewing and projection must be bound to the output primitives. Examples are eye point, stereo information, and type and plane for projection. They also include which view is to be applied to specific primitives. Here primitives may have a lower geometric dimensionality than in the virtual environment. Most often the display is a 2D device, so the absorption process will take a 3D primitive and project it (possibly with perspective), reducing its dimensionality to 2D. It could also reduce it to so-called 2.5D, where some depth information is maintained to assist hidden-surface rendering in the lower environment.

Logical environment: Here all attributes that define the rendering of output primitives into an image must be bound. Examples are colour, area fill and texturing, and line styles. Note that actual rendering will not take place until at least the realization environment.

Realization environment: All properties needed for the presentation on a physical display have to be bound to the primitives prior to this environment. For example, in order to present on a raster display, the colour value of each pixel in the display must be known. As noted above, the result of clipping is not necessarily expressible in terms of output primitives in the realization environment. For example, clipping a string of text may cause portions of characters to be shown, i.e. shapes that do not exist in the current repertoire of primitives.
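The progressive binding described above can be caricatured as a primitive record gathering properties as it is absorbed into successively lower environments. Which property is bound where follows the text; the record layout and the `absorb` helper are illustrative assumptions.

```python
# Sketch: a primitive gathers bound properties as it is absorbed into
# successively lower environments. The specific properties bound at
# each step are illustrative; the CGRM only constrains *where* each
# kind of binding may occur.

def absorb(primitive, environment, bindings):
    """Return a copy of the primitive with extra properties bound on entry."""
    bound = dict(primitive)
    bound.update(bindings)
    bound["environment"] = environment
    return bound

p = {"type": "polyline", "points": [(0, 0, 0), (1, 1, 0)]}  # geometry fixed
p = absorb(p, "virtual", {})                                # scene geometry complete
p = absorb(p, "viewing", {"view_index": 2})                 # view/projection bound
p = absorb(p, "logical", {"colour_index": 5})               # rendering attributes bound
p = absorb(p, "realization", {"rgb": (255, 0, 0)})          # device values bound
```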

5.3 PHIGS implementation examples

PHIGS is a standard that fits the CGRM in some areas, but different implementations demonstrate different aspects of the CGRM.

The first example covers the entire realm from application to operator. The application builds structures in the central structure store, which is the collection store in the construction environment. The preparation process takes all the commands from the application and effects all structure operations on the collection store. Workstation dependent settings proceed directly through to distribution to the virtual environment. A PHIGS workstation is modelled as the set of environments from the virtual environment to the realization environment. Hence, if the application has multiple workstations open, it will have multiple sets of virtual to realization environments. The fan-out of the system is between the single construction environment and the multiple sets of virtual to realization environments.

The composition in the construction environment consists of the information in the structure store. There may be more information in this "working set" than is actually traversed. The traversal process, part of the absorption process of the virtual environment, constructs the composition in the virtual environment from the composition in the construction environment.

The absorption process in the viewing environment applies the view and projection to the output primitives, adds the view clip to the attributes of the primitive, and transmits the results to the logical environment. Note that view clipping cannot take place here as PHIGS does not define clipped output primitives.

The completion process in the logical environment takes the projected output primitives and binds the entries from the bundle tables so that all aspects of primitives are bound.

The absorption process of the physical environment takes the output primitives and their associated aspects to produce the composition at this level. It is here that the mapping from colour index to colour model values (such as RGB) would take place.

The major problem with PHIGS is that the ability to delay output means that changes to the composition in the virtual environment do not necessarily result in changes to the compositions at the lower environments. Consequently, there is always an indeterminacy lower down that makes it difficult to define metafile output at those levels.

5.4 A more complex example

Let us see how we would use the CGRM in earnest. Suppose we have decided to produce some new graphics standards for the developers of clip art. We start with the usual back-of-the-envelope diagram in figure 5:

Figure 5 - Clip art graphics

Clip art figures are made up from existing templates using the input device. The clip art library will be built by inputting new graphics or extending existing examples. For complicated figures needing the skills of more than one person, it is useful to pass partial figures between designers in their unfinished state.

It looks as though we need a couple of standards: the first is an application programmer interface standard for clip art design, and the second is a standard for the transmission of clip art.

The principal architect for the set of standards picks up his well-thumbed copy of the CGRM and starts to see where these standards will fit. It turns out that the clip art makers are interested only in the overall pattern generated, and the individual clip art figures do not have any intrinsic size or coordinate system. Consequently, it seems natural to consider the main composition as being achieved at the virtual environment. The only operations performed on existing clip art are scaling and positioning. Nothing gets changed geometrically in the subsequent viewing, so this seems the natural place.

The architect proposes the following:

Figure 6 - Architect's diagram

The client is reasonably happy with the plan so far but is interested in seeing more detail on the design, particularly how the connection to the devices will take place. The architect is not sure that everything is right yet and wants to ask further questions of the customer to ensure that his requirements will be satisfied.

It appears that the clip art designers give unique names to all their creations, and calling up existing ones for use in making new clip art is done using these unique names or by browsing through the current set.

As several designers are at work together, they do not always have the up-to-date version of the clip art catalogue and, consequently, often ship the catalogue backwards and forwards. Finally, when receiving a clip art from another designer in the partially complete state it is often interesting to see how the design progressed.

At this stage, the architect begins to realize that his first attempt at the design is not correct. The problem is with the naming of clip art in the catalogue. These are names known to the application. The CGRM makes it clear that any naming associated with both input and output by the application must be complete in the construction environment. Consequently, as these names are a fundamental part of the dialogue between the operator and the application, the main activities should be at the construction environment. It later turned out that the names of the entities used to construct a new clip art figure are retained in the description of the new one, so these names are attributes of the primitives in the collection store as well.

The second diagram produced is in figure 7:

Figure 7 - Refined architecture

The free-hand input can be put straight into the current composition and does not require the application to be aware of what has been done; consequently, this is delegated to the manipulation in the construction environment. Unfortunately, the free-hand input is not included in the audit trail in this case. The audit trail is only able to capture the actions taken by the application. Other operations are performed by giving the information to the application, which makes the necessary changes to the composition.

As CGI can deliver the required tokens and no manipulation or storage is required upon them, there is no need to have an aggregation store. The distribution process can only pass on the coordinates used in the construction environment, but as these can be set up to be the VDC coordinate system of CGI, there is no need to have a virtual environment with a production process to do the coordinate transformation.

We could carry on with the design, gradually putting more and more detail in. Hopefully, what has been done so far shows how the CGRM is used to define the overall architecture. Effectively, CGI can be used to implement the three lower environments and the virtual environment has no effect.

5.5 Another input example

The next example shows how input tokens can be consumed within a graphics system without being passed to the application. The operator is provided with a set of potentiometers, which can be used to control the view of a scene to be displayed. The application is not concerned with the view selected by the operator and delegates responsibility for this interaction to the graphics system. The input tokens from the potentiometers in the physical environment are transformed to lie within appropriate high-low ranges by the emanation process in the physical environment.

The manipulation process in the logical environment echoes the values of each potentiometer on the screen in an appropriate manner. This might take the form of a slider bar for the potentiometers controlling positional and scaling parameters and dials for potentiometers controlling orientational parameters. Input tokens from the physical environment are also passed to the viewing environment.

The scene is held in the collection store in the viewing environment, and the viewing transformation to be applied to the scene is controlled by the manipulation in the viewing environment. The values of the elements of the viewing transformation matrix are derived from the input tokens generated by the potentiometers in a continuous fashion, such that as the physical potentiometers are rotated by the operator, the view of the scene displayed is changed accordingly.
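The continuous derivation of the viewing matrix elements from potentiometer tokens might be sketched as follows. Mapping one dial to rotation and another to uniform scale, and the 2x2 matrix form itself, are illustrative assumptions; a real system would drive many more viewing parameters.

```python
import math

# Sketch: derive a viewing matrix continuously from two potentiometer
# tokens normalised to [0, 1]. The dial-to-parameter mappings are
# illustrative assumptions.

def view_matrix(rotation_pot, scale_pot):
    """Build a 2x2 viewing matrix from normalised pot values in [0, 1]."""
    angle = rotation_pot * 2 * math.pi  # full turn of the dial
    scale = 0.5 + scale_pot * 1.5       # scale swept over [0.5, 2.0]
    c, s = math.cos(angle), math.sin(angle)
    return [[scale * c, -scale * s],
            [scale * s,  scale * c]]
```

Each time a potentiometer token arrives, the manipulation recomputes the matrix and the displayed view changes accordingly.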

The manipulation in the viewing environment might also give additional feedback to the operator of the view selected, for example by displaying the relative positions and orientations of the object, scene and picture coordinate systems.

The input tokens generated by the physical input devices are thus consumed in the viewing environment and are not passed on to the application.

6 Input

6.1 Where do tokens get processed?

The interaction between the operator and the application requires that input from the graphical devices be acted upon by the application. The CGRM uses the term "input token" as the atomic unit of input information at each environment. "Token" is the name often given to units of information in computer language descriptions, hence its use here. Conceptually, input is directed at the application, but it is possible for the application to delegate the processing of input information to an environment.

The main process of input is emanation. This is where the correct input tokens for the next higher environment are created using the information available in the token and aggregation stores. The assembly process does little more than ship the tokens to either the aggregation or token store. One example from the current standards is STROKE input where the individual positions would be retained in the aggregation store until all are available. It is only then that the single STROKE token for the next higher environment can be deduced. Another example might be mouse input, where the delta x and y movements are placed in the aggregation store to be manipulated into a single token giving the new absolute position of the cursor. For text input, the aggregation store is the input buffer allowing editing and insertion of characters if that is desired. Similarly, for free-hand input, the aggregation store could be used to contain the current path as it is being developed.
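The mouse example above can be sketched as a tiny aggregation store: assembly ships delta tokens into the store, and emanation deduces the single absolute-position token for the next higher environment. The class and token format are illustrative assumptions.

```python
# Sketch: an aggregation store accumulates mouse delta tokens, and
# emanation emits one absolute-position token for the next higher
# environment. The token format is an illustrative assumption.

class AggregationStore:
    def __init__(self, origin=(0, 0)):
        self.x, self.y = origin

    def assemble(self, dx, dy):
        """Ship a delta token from the device into the store."""
        self.x += dx
        self.y += dy

    def emanate(self):
        """Deduce the single token for the next higher environment."""
        return {"type": "LOCATOR", "position": (self.x, self.y)}

store = AggregationStore()
for dx, dy in [(3, 1), (-1, 4), (2, 2)]:
    store.assemble(dx, dy)
```

The STROKE, text, and free-hand cases differ only in what the store holds and when emanation decides a complete token is available.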

Emanation can be a heavyweight process. For example, the transformation of coordinates to those required by the next higher environment is accomplished here. It may need to satisfy priority considerations if more than one transformation is possible. The transformation itself is usually defined in the form relevant for output, so emanation is responsible for establishing the inverse transformation.
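The inversion responsibility can be shown with a deliberately simple case. The scale-and-translate form of the output transformation is an assumed example; real transformations (e.g. viewing operations) need full matrix inversion.

```python
# Sketch: the transformation is defined in the output direction
# (here an assumed scale-then-translate), so emanation must apply
# its inverse to carry an input position up to the higher environment.

def to_device(p, sx, sy, tx, ty):
    """Output direction: higher-environment coordinates to device coordinates."""
    return (p[0] * sx + tx, p[1] * sy + ty)

def emanate(p, sx, sy, tx, ty):
    """Input direction: invert the output transformation analytically."""
    return ((p[0] - tx) / sx, (p[1] - ty) / sy)
```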

The key to good echoing and feedback is to have the manipulation generate appropriate output depending on the input tokens being added to the aggregation and token stores.

6.2 PHIGS examples

Let us look at some examples using the PHIGS input model. We shall assume that the devices are being used in event mode and all have been initialized.

A trivial example might be valuator input. Suppose the operator turns a dial and presses a button; this generates a real number. Probably at the physical environment this value will be scaled and shifted to fit appropriately into some defined high-low range. Other than this, it is likely that the input token will be passed without change through all environments until it reaches the application.
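The scale-and-shift step might look like the following sketch; the raw 0..1023 device range is an assumed characteristic of the dial, not something any standard specifies.

```python
# Sketch: the physical environment scales a raw dial reading into the
# high-low range declared when the valuator was initialized. The raw
# 0..1023 range is an assumed device characteristic.

def scale_valuator(raw, raw_max=1023, low=0.0, high=1.0):
    """Shift and scale a raw dial value onto [low, high]."""
    return low + (raw / raw_max) * (high - low)
```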

The next example increases the complexity a little. Suppose the operator has a tablet and generates a 3D locator input by putting the puck somewhere on the tablet and pressing a button. The physical environment would assemble a token consisting of a 2D absolute position that would be in the surface coordinates of the input device. This would be placed in the aggregation store. The manipulation process would align this position with a display coordinate position so that the entry it places in the token store is the equivalent 2D position in device coordinates.

The emanation process in the physical environment (called accumulation) transforms the 2D position from DC to DC3 by adding a zero z value and performing the inverse of the workstation transformation. We have introduced a logical coordinate system LDC3 as the one used in the logical environment. Many implementations will use NPC3 coordinates here, so that the emanation process in the logical environment (called abstraction) is a null operation and the input token that gets delivered to the viewing environment is in NPC3 coordinates. This value is assembled into the aggregation store. The manipulation process in the viewing environment will have knowledge of the set of views and their priorities. By determining the highest priority view that contains the point in the aggregation store, it is possible to generate an input token consisting of the position in NPC3 space together with the relevant view index. The emanation process in the viewing environment (called elevation) transforms the position from NPC3 to WC3. This is accomplished by taking the view defined by the input token, calculating the inverse of the appropriate viewing operation and applying this transformation.
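The view selection step can be sketched as follows. The view record layout (an input priority plus a viewport extent) is an illustrative assumption; only the rule "highest priority view containing the point" comes from the text.

```python
# Sketch: the manipulation process picks the highest input priority
# view whose viewport contains the NPC position. Each view here is
# (priority, (xmin, xmax, ymin, ymax)); lower numbers mean higher
# priority, as with PHIGS input priority. The layout is an assumption.

def containing_view(point, views):
    """Return the index of the highest-priority view containing the point."""
    x, y = point
    inside = [i for i, (_, (x0, x1, y0, y1)) in enumerate(views)
              if x0 <= x <= x1 and y0 <= y <= y1]
    return min(inside, key=lambda i: views[i][0]) if inside else None

views = [(2, (0.0, 1.0, 0.0, 1.0)),  # view 0: whole NPC space, low priority
         (1, (0.2, 0.8, 0.2, 0.8))]  # view 1: inset area, higher priority
```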

The result of elevation is an entity consisting of a view index and position in WC3 that is passed as an input token directly to the token store in the virtual environment. This is the event queue. When the relevant GET LOCATOR 3 function is performed, the emanation process will be required to deliver this token to the construction environment and eventually to the application.

As you can see, this quite simple example has a lot going on.

One final example shows the full complexity of what is happening when complex echoing is allowed. Again, we are considering event input, but this time we are trying to input a STROKE in three dimensions using the same 2D tablet. As points are input, a cursor gives an echo of the current position on the display. A full perspective view of the stroke positions input so far is echoed as a polyline. Finally, a dashed line is echoed from the last completed point of the STROKE to the position in 3D that is in the process of being defined. The question is in which environments these echoes take place and how they are achieved.

The sequence of points already input is assumed to be S1, S2 and S3. The point S4 is being input but the trigger to accept the fourth position has not been initiated yet. ST is used to denote a tablet position, SD to denote a device coordinate, SN to denote an NPC position and SW to denote a WC position.

Figure 8 shows the state of the various stores while the fourth point is being defined.

Figure 8 - Input after the fourth point

At the physical environment, the current tablet input value is placed in the aggregation store; the manipulation defines the equivalent device coordinate position, placing it in the token store for upward transmission, and causes a cursor picture to be displayed at that position.

The emanation process of the physical environment converts the device coordinate position to NPC and passes it via the token store in the logical environment to the aggregation store in the viewing environment. This contains two entities: the sequence of points already defined and the current value of the new position. The sequence determines the view index of the highest priority view that contains all of its points. This index V is used to transform the fourth point to world coordinates. Note that this position might be outside this view, but while the tablet is in the process of inputting a position it does not seem sensible to change the view used for echoing on the fly.

The aggregation store of the virtual environment will have SW4 added to it. It will already contain the sequence of points in world coordinates already defined together with the associated view index. The manipulation of the virtual environment causes a polyline to be drawn based on the completed sequence and a rubber band line between SW3 and SW4 using the same view.

When the trigger for SN4 fires, the various stores are updated as shown in figure 9.

Figure 9 - Input after a trigger fires

The fourth position causes the relevant view index to be updated to V′, the view with the highest input priority containing all four points.

Finally, when the trigger is hit indicating that the STROKE is complete, say after point S5 has been input, the sequence {SN1, SN2, SN3, SN4, SN5} in the viewing environment will cause a single token {SW1, SW2, SW3, SW4, SW5, V} to be generated and passed through the construction environment to the application. All the intermediate contents of the aggregation and token stores will be removed and the echoes deleted from the various compositions.

7 Expressing current standards using the CGRM

7.1 Graphical Kernel System-ISO 7942

In GKS, the production of primitives in NDC space with attributes bound corresponds to the virtual environment. Since GKS is a 2D standard, the viewing environment performs only the identity transformation. GKS workstations correspond to the logical and physical environments. Binding of bundled aspects is done in the logical environment. Transformation of coordinates from NDC to DC is performed in the logical environment during completion. The presentation process in the physical environment does the transformation of colour index values to RGB.

In GKS, those aspects that are definitely geometric are bound at the virtual (NDC) environment; however, some geometric text attributes (alignment) cannot be completely bound until the logical environment (contrary to the CGRM). The individual/bundled model fits into the CGRM as long as the complete geometry is specified at the NDC level. In the individual mode of working, PHIGS has corrected this GKS deficiency for at least two fonts.

Segment store in GKS is identified as a virtual collection store (WISS) and a viewing collection store (WDSS).

The event queue in GKS corresponds to a token store in the virtual environment. Transformation of locator and stroke input values from NDC to WC coordinates is performed by an emanation in the virtual environment. Echoing occurs when manipulation creates the appropriate output primitives in the image.

GKS has no clear concept of composition at any level. Its poor compatibility with CGM is a result of this.

Figure 10 - GKS concepts in a CGRM context

7.2 Graphical Kernel System for Three Dimensions-ISO 8805

GKS-3D has the same overall architecture as GKS. GKS-3D workstations correspond to the viewing, logical and physical environments. There is a clear separation between the scene created by the application in NDC3 coordinate space in the virtual environment and the picture to be rendered and presented on a particular workstation. The viewing transformation is performed in the viewing environment.

The attribute binding model is the same as the GKS model.

7.3 Programmer's Hierarchical Interactive Graphics System-ISO 9592

Several examples of how PHIGS concepts relate to the CGRM have been given already. Below we show how a particular implementation-and in particular the optimizations it has attempted-might be explained using the CGRM.

The structure store of PHIGS is centralized so it must be workstation independent. When Open PHIGS is called, an instance of a construction environment is created. The collection store in the construction environment corresponds to the Centralized Structure Store (CSS) of PHIGS. The preparation process of the construction environment does all editing operations.

A PHIGS workstation in this implementation comprises the virtual, viewing, logical and physical environments. Each Open Workstation call creates an instance of the virtual to physical environments. Fan-out of graphical output occurs between the construction and virtual environments. Since the updating of the pictures is controlled by a workstation dependent deferral mode, traversal happens in the virtual environment. The operation of posting causes the manipulation process of the construction environment to communicate the structure being posted to the virtual environment of the specified PHIGS workstation. As editing operations are made to structures in the collection store of the construction environment, commands to make changes to the structures copied to the workstation are also required.

In this implementation, the manipulation process of the virtual environment traverses the PHIGS structures as required and requested by the application. The manipulation process also performs modelling clip and applies the composite modelling transformation to the primitives. The primitives and other structure elements are then distributed to the viewing environment. As a clipped primitive does not remain a primitive in PHIGS, the implementation is forced to define its own internal set of clipped primitives.

The projection process of the viewing environment accepts primitives in world coordinates and performs the viewing operations of view orientation transformation, view mapping transformation and view clip. The view tables, which contain the specific information, are kept in the environment state of the viewing environment. The results are then distributed to the logical environment.

The completion process of the logical environment fully defines the primitives and their attributes. It does the final binding of the attributes to the primitives, determining the values from bundle tables maintained in the environment state if necessary. The output of this stage is primitives in the form required by the realization environment. In the case where the collection store is a frame buffer, the output would be pixel locations with colour indices. In the case where the display device is a calligraphic device, the output would be endpoints with colour and/or intensity. The coordinates of the output are in NPC space.
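The final binding from bundle tables can be sketched as a table lookup merged into the primitive record. The table contents and aspect names below are illustrative assumptions, not the PHIGS-defined defaults.

```python
# Sketch: completion resolves bundled aspects from a bundle table held
# in the environment state. Table contents and aspect names are
# illustrative assumptions.

polyline_bundle_table = {
    1: {"linetype": "solid",  "linewidth": 1.0, "colour_index": 1},
    2: {"linetype": "dashed", "linewidth": 2.0, "colour_index": 5},
}

def complete(primitive, table):
    """Bind every aspect of a bundled primitive from its bundle entry."""
    bound = dict(primitive)
    bound.update(table[bound.pop("bundle_index")])
    return bound
```

After completion, no aspect of the primitive is left unbound; only the colour index remains to be resolved against a colour table lower down.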

The presentation process of the realization environment applies the workstation clip and transformation to the output primitives and stores the information in the composition. The manipulation process takes the data from the collection store and presents it to the operator. In the case of a raster system, the video scans out of frame buffer memory and sends the electronic signals to the display that is then viewed by the operator.

The PHIGS archive file corresponds to a metafile to/from the collection store in the construction environment.

7.4 Metafile for the Storage and Transfer of Picture Description Information- ISO 8632

The CGM is a data capture metafile for capturing 2D pictures (compositions) in the viewing environment, as illustrated in figure 11. The various tables and lists used in the CGM-such as the colour table, pattern table, bundle table, and font list-are shorthand notation for representation purposes and do not represent the state tables of any graphical device. The segments of Amendment 1 to CGM are not parts of a collection. Instead, they are simply a shorthand notation for representing the CGM picture.


Figure 11 - CGM in a CGRM context

The term "metafile" is used differently by CGM and the CGRM. In CGRM, a data capture metafile contains a single composition (picture). A CGM metafile may contain multiple pictures. A CGM may be thought of as a structured set of CGRM data capture metafiles.

A History annex

The history of development of the CGRM is best understood in terms of the major controversies that shaped the final document.

A.1 Level of detail

How large should the CGRM be, and how much detail should it contain? On the one hand were those who argued that the CGRM should contain only the barest set of principles with absolutely universal applicability and permanence. On the other side were those who wanted the CGRM to contain enough detail to successfully coordinate the development of the next generation of computer graphics standards. The final text is a compromise. Work is proceeding on a set of Component Models to supplement the CGRM in describing particular aspects of computer graphics.

A.2 Audience

As an International Standard, the CGRM is necessarily terse and provides no examples or explanatory material except for the Annexes. Since these are labelled informative, in legal terms they are not part of the standard. The CGRM is intended to be readable by SC24 standards developers. This Introduction was written to make the CGRM accessible to a general audience.

A.3 Architectural evolution

The number of distinguished environments and their names changed considerably during the development of the CGRM. The goal was always to provide a sufficient number of environments to distinguish concepts at different levels of abstraction while not providing so many different levels as to be unwieldy. The names changed right up to the last meeting as attempts were made to use words that expressed a particular function but did not carry prior assumptions about their meaning.

A.4 Architecture

After the Frankfurt meeting in February 1986, two approaches were considered to CGRM development. The first concentrated on establishing a reference model that was primarily concerned with the relationship between computer graphics and the external world. The external world included applications, the operator, graphical metafiles and audit trails. The second concentrated on establishing an internal reference model for computer graphics showing how the various concepts in graphics should fit together.

Various architectural approaches were tried before the final form emerged; most were essentially "layered" structures in which the layers represented different levels of abstraction. The models differed in terms of the criteria used to decide on the levels of abstraction at which attention should focus, and the names given to a generic layer, for example "environment" and "stage".

There are two natural ways in which levels of abstraction in computer graphics might be described. One focuses attention on the processing steps involved and pays little attention to all but important (retained) data. The second view focuses on the data elements and avoids specific mention of the processing by which the data elements are linked. After the Tucson meeting, two approaches to the CGRM were pursued in parallel, and it was not until the Olinda meeting that it was realized that one approach was essentially a process oriented view and the other a data oriented view. After the Olinda meeting these two were combined and the present general architecture of the CGRM was agreed upon.

A.5 Influence of old standards

The initial approach of CGRM development was to collect important definitions and concepts from current ISO standards-such as GKS and the CGM-and to use these as a basis for the CGRM. It was only after considerable effort had been expended that it was realized that not only were many of these concepts at the wrong level of abstraction for a reference model for all future computer graphics standards, but also that there was no universal agreement on their correctness. Thus the approach based on "condensing" the old standards was abandoned and work started afresh to define a universal set of principles. Because of this, existing computer graphics standards do not necessarily fit precisely into the Reference Model. On the other hand, the concepts embodied in these standards did significantly influence the CGRM.

A.6 Components and frameworks

Most standard graphics systems, in particular the existing and proposed international standards for graphics and many existing graphics packages, are most often seen as graphics processing pipelines. This view gives a framework consisting of levels of abstraction between the application and the operator through which input and output pipelines flow.

In the case of graphics output, graphics data is refined as it passes down the pipeline, by associating graphical attributes, transforming coordinates, clipping etc., until it reaches a form which is suitable for display on a particular workstation or device. Graphical input can be viewed as a pipeline of processes transforming the data resulting from some input interaction into a form suitable for use by the application. The input interaction may also involve processes from the output pipeline in order to achieve any desired prompts and echoes.

Clearly, the composition of these pipelines and the order of components within them may differ widely between models; however, there is often a reasonable number of components common to most. Examples of these are transformations, attributes, clipping, storage, etc. These common components play an equivalent role in each model, even though their internal details will most likely differ. It can be shown that a large number of the differences between graphics system models can be expressed in terms of the different orderings (or configurations) of these components.

This observation leads to a component view of graphical processing that is complementary to the framework view. Early work on the CGRM focused on an abstract reference model of graphics data states; it developed a processing pipeline model by isolating the smallest incremental changes to the states of graphics information (or storage areas), and by defining when graphics data undergoes transitions between these states through the application of specialized processes.

As work on the CGRM progressed, it was realized that a single integrated view of both the framework-the levels of abstraction (construction through physical) from the application to the operator-and the components was possible. The development of the common conceptual model for the processing and data within each environment shown in Figure 2 achieved this. In this integrated view, each environment contains storage components and processing components. The CGRM specifies which component processes perform functions such as attribute binding, transformation, clipping, and dimensionality reduction. In fact, a "stream" view can be constructed by identifying a string of adjacent data and processing steps.

Table 1 - Important meetings

Timberline, Oregon, USA | July 1985 | ad-hoc group formed to look at feasibility
Frankfurt, FRG | February 1986 | component level model based on strands and streams introduced
Egham, UK | September 1986 | New Work Item proposal finalized
Valbonne, France | May 1987 | a multi-stage model was developed with major influences from windowing system work
Tucson, Arizona, USA | July 1988 | introduced the separate naming of input and output processes; developed a 7-stage model
Paris, France | January 1989 | previous drafts discarded and a completely new baseline draft prepared; "internal" and "external" reference models merged into a single, integrated model; four major concepts adopted: pictures, collections, metafiles, and archives; model condensed to 4 environments
Darmstadt, FRG | June 1989 | separation of collections from compositions; better symmetry between input and output
Olinda, Brazil | October 1989 | process and data model added in substantially its final form; initial issues log developed and issues processing started; audience set as developers of SC24 standards; viewing environment introduced
Seal Beach, California, USA | January 1990 | major improvements made to the quality and consistency of the document
Ottawa, Canada | July 1990 | focus on issues processing and defining relationships to windowing and imaging; agreed on 5 environments
Norwich, UK | April 1991 | agreed to advance to DIS
Las Vegas, Nevada, USA | July 1991 | produced DIS text
Amsterdam, Netherlands | February 1992 | substantial work on this introduction; initiated work on component models
Redondo Beach, California, USA | May 1992 | completed IS text; revised windowing annex; improved wording, removing repetition; changed one environment name and one absorption name


Table 2 - Historically important documents

TC97 SC21/WG2 N340 | 10 Sept 1985 | Reference Model Task Group Report
TC97 SC21/WG2 N378 | 24 June 1986 | CGRM draft from Frankfurt meeting
TC97 SC21 N1402 | September 1986 | New Work Item proposal for CGRM
TC97 SC21/WG2 N512 | 14 Sept 1986 | CGRM draft from Egham meeting
JTC1/SC24 N139 | 14 April 1988 | Component Process for the Development of Standards
JTC1/SC24 N177 | 7 Sept 1988 | CGRM draft from Tucson meeting
JTC1/SC24 WG1 N49 | 9 March 1989 | CGRM draft from Paris meeting
JTC1/SC24 WG1 N84 | June 1989 | CGRM draft from Darmstadt meeting
JTC1/SC24 N512 | 27 July 1990 | CD text (output of Ottawa meeting)
JTC1/SC24 WG1 N183 | April 1991 | Final CGRM Issues Log
JTC1/SC24 N594 | 29 July 1991 | DIS text (output of Norwich meeting, as modified at Las Vegas editing meeting)
JTC1/SC24 N??? | May 1992 | IS text