Some observations on modular design technology and the use of microprogramming

As modules become more complex, the advantages and disadvantages of modularity become more pronounced. The cost of modularity is measured not only in added hardware but also in a loss of flexibility. Functions that are easy to implement at the submodule level may be very difficult, or even impossible, to duplicate at the modular level. We term this a loss of transparency.
PMS (Processor-Memory-Switch) level modules could be available in the next four to six years. Their existence will open many significant areas of research. It appears that the overhead for PMS modular systems will be on the order of 30%-50%, but with decreasing hardware costs this will be tolerable. The expendable components will be processors, and there will be no effort to obtain a high utilization factor for the individual processors in a system. An 80%-90% idle time may be acceptable. The high sales volume required by the semiconductor industry suggests that, in the foreseeable future, PMS level components will be oriented towards mass market applications like personal calculators and intelligent terminals. It is interesting to note that as the cost per digital function has decreased, the design time and cost per system has remained relatively constant. So instead of obtaining a cheaper system with the same functions, a user gets a more complex system at the same cost. This is best exemplified by observing the evolution of minicomputers and noting that the cost per system of a 1965 vintage minicomputer (e.g., PDP-8) is about the same as that of a 1974 minicomputer (e.g., the PDP-11)*. Finally, microprogrammed modules are an attractive control element for PMS level modules from both an economic and a transparency point of view.


INTRODUCTION
Increased attention has been focused on the understanding of the design process as systems, whether physical or social, become more complex. The sheer complexity of many systems demands an orderly design approach. Freeman and Newell [Freeman, 1971] have characterized design as the composition of known components into larger functional units. In other words, "functional design" is a hierarchical process. Modules at one level are used to construct super-modules at a higher level, which may in turn be used to construct super-super-modules, and so on. To take an example from current computer technology: chips are interconnected to make boards, boards are plugged into back panels to make system units, and system units are plugged into cabinets to form a system.
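This packaging hierarchy can be viewed as nested composition of modules. A minimal sketch (the per-level counts are illustrative only, not taken from the text):

```python
# Illustrative model of a design hierarchy: modules at one level are
# built from modules at the level below.
def count_primitives(module):
    """Count the chips (leaf modules) in a hierarchical design."""
    name, children = module
    if not children:                      # a leaf: a single chip
        return 1
    return sum(count_primitives(c) for c in children)

chip = ("chip", [])
board = ("board", [chip] * 40)            # e.g., 40 chips per board
unit = ("system unit", [board] * 10)      # boards plugged into a back panel
cabinet = ("cabinet", [unit] * 4)         # system units form a system

print(count_primitives(cabinet))          # 40 * 10 * 4 = 1600 chips
```

The point of the hierarchy is that a designer at the cabinet level reasons about four system units, not 1600 chips.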
The advantages of having a standard set of modules at any given level are well documented [Bell, 1973; Davidow, 1972; Parnas, 1971]. On the other hand, modularity also has disadvantages, not often stated:

1) Rigid intramodule connections may result in both suboptimal use of resources and suboptimal performance.
2) There is an overhead incurred by simply making a collection of functions part of a module set.
3) Functions that are easy to derive given some basic modules may be extremely difficult or impossible to achieve at the supermodule level. We term this a loss of transparency.
This paper will survey the evolution of technology and its impact on modular design. Existing module sets will be reviewed and some quantitative results will be given on the advantages and disadvantages of modular design outlined above.
Microprogramming in modular systems will be shown to be a cost effective substitute for random control logic and a means to minimize the loss of transparency. Finally, some predictions will be given on the future shape of modules and the cost of modular design.

DIGITAL FUNCTIONS AND THE DIGITAL DESIGN HIERARCHY
In this section the "functional design" process described by Freeman and Newell [Freeman, 1971] is given in terms of digital systems design. First the functions are described and subsequently the hierarchy of design levels.
Bell and Newell [Bell, 1971] have identified seven basic component types in terms of functions:

Memory (M) components hold information over time.
Link (L) components transfer information from one location to another.
Control (K) components evoke the operations of other components.
Switch (S) components construct links among other components.
Transducer (T) components change the medium or encoding of information.
Data-operation (D) components produce information with new meanings.
Processor (P) components are capable of interpreting a program to execute a sequence of operations.

Modules with functions of these types can be interconnected into super-modules
that provide one or more of these functions. This hierarchy of digital design levels is given by Bell and Newell [Bell, 1971]. The register transfer (RT) level is clearly an abstraction, since the modules that are used to build the RT components are active all the time; the abstraction allows us to concentrate on those components that are changing values.

The first such module set was the macromodules developed at Washington University in 1967 [Clark, 1967]. Macromodules consist of a set of data and control modules that are stacked together and interconnected via bus cables. Due to the existence of several buses (or data paths) in a macromodule system, a high degree of concurrency is available. The major goal of the macromodule project was to provide a set of easily used modules that could support indefinite expandability (as typified by variable word length).

In 1971 a set of Register Transfer Modules (RTM's) became available from Digital Equipment Corporation (DEC) [Bell, 1972a, 1972b]. RTM's were designed by DEC, whose primary goal was to find a means of incorporating Medium Scale Integration (MSI) circuits into digital systems.

[Figure: an RTM system example, with modules interconnected via a prewired bus]
The connections shown in the figure are all that are required to construct the system.
Note that the primitive functions are very similar to those available at the assembly language level of programming. Other RT level module sets are being developed at MIT [Patil, 1972], the University of Washington, and the University of Delaware [Robinson, 1973]. An interesting feature of the latter module set is that the data part is composed solely of commercially available MSI chips.
The advantages of design with these module sets are dramatic. A PDP-8-like minicomputer could be designed and built in 6-7 man-months using discrete components.

A similar processor built from SSI components might take 2-3 man-months, and from MSI/LSI components about one man-month, to design and construct [Bell, 1974]. Experience with RTM's has enabled the construction of a program called EXPL [Barbacci, 1973], which takes as inputs the description of an algorithm in a RT level language, ISP [Bell, 1971], and a cost-time constraint. EXPL produces a near-optimal RTM solution as output by use of graph transformations and heuristic search techniques. Another effort [Rege, 1974] explores the space of data part designs, producing optimal allocations of operators (physical components) to the data operations of a sequential flowchart.
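EXPL's actual graph transformations and heuristics are not reproduced in this paper, but the cost-time tradeoff such a program searches can be illustrated with a toy exhaustive search (all module costs and times below are hypothetical):

```python
import math

# Toy cost-time design search: choose how many adder modules to allocate
# for 8 independent additions. More adders cost more hardware but permit
# more parallelism, lowering the execution time.
ADDER_COST = 50      # hypothetical cost units per adder module
ADD_TIME = 1         # hypothetical time units per addition
N_ADDS = 8           # independent additions in the algorithm

def evaluate(n_adders):
    """Return (cost, time) of a design with n_adders adder modules."""
    cost = n_adders * ADDER_COST
    time = math.ceil(N_ADDS / n_adders) * ADD_TIME
    return cost, time

def best_design(max_cost):
    """Fastest design satisfying the cost constraint (ties broken by cost)."""
    feasible = [(evaluate(n)[1], evaluate(n)[0], n)
                for n in range(1, N_ADDS + 1) if evaluate(n)[0] <= max_cost]
    time, cost, n = min(feasible)
    return n, cost, time

print(best_design(200))    # with a budget of 200: 4 adders, cost 200, time 2
```

A real system like EXPL explores a far larger space (register allocation, bus structure, control sequencing), but the shape of the problem is the same: minimize time subject to a cost constraint, or vice versa.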
To obtain these advantages a price is paid. A design using module sets tends to be slower and costlier in terms of hardware than a comparable system designed with SSI and MSI components. The RTM PDP-8 cost twice as much and ran at 40% of the speed of the real PDP-8. A system designed with macromodules might cost between two and ten times more than a comparable MSI/SSI system. The extra cost is due to the overhead of making a module part of a module set (e.g., to establish control protocols, to allow word extensibility, to permit physical connections, etc.). This overhead is approximately 30% for RTM's and 70% for macromodules [Fuller, 1973a].
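The RTM PDP-8 figures combine into a single cost/performance penalty; as simple arithmetic:

```python
# RTM PDP-8 versus the production PDP-8 (ratios taken from the text).
cost_ratio = 2.0       # the RTM version cost twice as much
speed_ratio = 0.4      # and ran at 40% of the speed

# Cost per unit of performance is therefore worse by a factor of:
penalty = cost_ratio / speed_ratio
print(penalty)         # a factor of 5 in cost/performance
```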
Although module sets may be slower than comparable systems built from lower level primitives, they are extremely competitive with general purpose computers. An algorithm can be hardwired with modular components and not incur the overhead of fetching and decoding instructions. Also, the modular implementation can take advantage of any parallelism in the algorithm. For example, macromodule implementations of certain algorithms were ten times faster than programs on a PDP-9 and between 1 and 2 times faster than on a CDC 6600 [Fuller, 1973a].

The component interconnection rules that define a RT level module set can also contribute to the increased cost of a modular implementation. These rules can potentially lead to gross inefficiencies, inefficiencies that should be carefully weighed in the design of future module sets.
To formalize this notion we will introduce the concept of transparency. A module suppresses some of the detail of its constituent components and their interconnections while providing a set of functions to the user. This suppression of detail is both the strength and the weakness of a modular approach. It is a strength in that a designer can conceptualize and construct at a higher level; the advantages stem from having a smaller set of components and facts to keep in mind. A weakness arises if some required function, which the constituent modules can perform, is not available to the user. This we term a loss of transparency [Parnas, 1972].

The missing function may be very difficult or even impossible to reproduce at the module level. For example, RTM's use four four-bit ALUs (Signetics SN74181) in their arithmetic element (DMgpa). The ALU is capable of performing 16 arithmetic and 16 Boolean functions of two parameters, yet only 12 of these are usable at the module level. As another example, consider BCD arithmetic, where the carry from each BCD digit is required. Only the carry from the most significant four-bit ALU is available to the user in RTM's. Thus to perform arithmetic on four digit, packed BCD numbers in RTM's, a penalty of a factor of 4-5 in speed is paid over a module set in which the carry bits from each four-bit ALU are available.
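The BCD case can be made concrete. Below is a sketch of digit-serial packed-BCD addition as a module set with exposed per-digit carries would permit it, with the carry out of each digit feeding the next:

```python
def bcd_add(a, b, digits=4):
    """Add two packed BCD numbers, using the carry out of every digit."""
    result, carry = 0, 0
    for i in range(digits):
        s = ((a >> 4 * i) & 0xF) + ((b >> 4 * i) & 0xF) + carry
        carry = 1 if s > 9 else 0
        if carry:
            s -= 10                       # decimal-correct this digit
        result |= s << 4 * i
    return result, carry

s, c = bcd_add(0x1234, 0x0866)
print(hex(s), c)                          # 0x2100 0, i.e. 1234 + 866 = 2100
```

When only the carry out of the most significant ALU is visible, each digit must instead be isolated, added, tested, and corrected in separate module-level steps, which is the source of the 4-5x penalty quoted above.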
As noted before, a primary advantage of RT level module sets is systematic design. Semiconductor manufacturers currently offer a comprehensive set of data part modules (e.g., the Texas Instruments and Signetics catalogs), and these even form the data part of one RT level module set [Robinson, 1973]. It should be noted, however, that a unary encoded, distributed control does not make efficient use of logic gates (i.e., the gates can be replaced by control memory bits on an almost 1-1 ratio). A more realistic estimate would be 5 to 12 control bits to replace a gate [Davidow, 1972]. In summary, microprogramming appears to be a cost effective replacement for random logic for the control of large modular systems. The next section will make a few projections about the shape of future modules.
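The inefficiency of unary (one wire per control point) encoding can be illustrated with simple arithmetic; the field sizes below are hypothetical, not drawn from any particular machine:

```python
import math

# Suppose a data part has mutually exclusive control points grouped into
# fields: e.g., 16 register-load lines, 12 ALU function lines, and
# 8 bus-select lines.
fields = [16, 12, 8]

# Unary control: one control memory bit (or wire) per control point.
unary_bits = sum(fields)

# Encoded control: each field of n mutually exclusive points needs only
# ceil(log2(n)) bits, decoded at the point of use.
encoded_bits = sum(math.ceil(math.log2(n)) for n in fields)

print(unary_bits, encoded_bits)   # 36 bits unary vs 4 + 4 + 3 = 11 encoded
```

The decoding logic added by the encoded form is small compared with the control memory saved, which is why microprogrammed control wins as systems grow.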

FUTURE MODULE SETS
The advent of Large Scale Integration (LSI) technology has made the chip a natural boundary for a module. Currently a 4K-bit MOS random access memory (RAM) or an eight-bit MOS microprocessor is available on a single chip. Since 1960 the optimum chip complexity has doubled every one to two years [Fuller, 1973a]. There is no indication that this trend will not continue for the next several years. Thus in 4-6 years a 16-bit microprocessor with 1K words of memory could be available on a chip. What should a module look like when it is of this complexity? When placed on a chip, memories, due to their regular patterns, can be four times as dense as random logic [Fuller, 1973a]. This strongly suggests that a microprogrammable controller be employed. The ability to alter control sequences (i.e., programming) also implies that the majority of the functions internal to the module will be available to the user, thus reducing the loss of transparency.
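The extrapolation above is compound doubling; as a sketch:

```python
def complexity_growth(years, doubling_period):
    """Factor by which chip complexity grows if it doubles once every
    doubling_period years."""
    return 2 ** (years / doubling_period)

# Doubling every 1 to 2 years, over the 4 to 6 year horizon in the text:
print(complexity_growth(4, 2))    # 4x growth at the slow end
print(complexity_growth(6, 1))    # 64x growth at the fast end
```

Even the slow end of this range turns today's 4K-bit RAM or eight-bit microprocessor chip into the 16-bit processor-plus-memory module described above.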
Assuming that the modules of the future are microprogrammable processors with associated memory, some interesting conclusions can be drawn. First, by observing the Intel MCS-4 and MCS-8 [INTEL, 1972a, 1972b], it appears that a current microprocessor is equivalent in complexity and cost to ~500 memory words (of the same width as the processor's data path). Extrapolating this to microprogrammable processors, it can be seen that for modules with even moderately sized memories (4-8K) the processor is expendable, since the dominating cost is that of the memories. Further, a single, versatile module would be more attractive to the semiconductor industry than a module set consisting of 10 or more modules, since a successful chip needs a sales volume of ~2 million units/year [Fuller, 1973a]. This is in contrast with the current minicomputer market of 30,000 units/year.
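The "expendable processor" claim follows from simple cost fractions; a sketch using the ~500-word equivalence cited above:

```python
PROCESSOR_COST_IN_WORDS = 500   # a microprocessor ~ 500 memory words of cost

def processor_cost_fraction(memory_words):
    """Fraction of total module cost attributable to the processor."""
    return PROCESSOR_COST_IN_WORDS / (PROCESSOR_COST_IN_WORDS + memory_words)

print(round(processor_cost_fraction(4096), 2))   # ~0.11 for a 4K-word module
print(round(processor_cost_fraction(8192), 2))   # ~0.06 for an 8K-word module
```

With the processor accounting for roughly 6-11% of module cost, leaving it idle most of the time is cheap; the memory dominates the economics either way.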
In order to achieve high computational power with these modules a mechanism must be provided for efficient communications between modules. The I/O structure of a conventional processor would be a poor way to conduct intermodule communication since a response to an interrupt might take tens of microseconds. Intermodule communication schemes based on shared memory have been studied in connection with multiprocessors [Wulf, 1972;Heart, 1973] and might be a viable alternative.
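A shared-memory mailbox between two modules can be sketched with threads standing in for processors; the one-slot mailbox layout here is a hypothetical illustration, not the scheme of any cited system:

```python
import threading

# A one-slot mailbox in "shared memory": one module deposits a word and
# the other picks it up, with no I/O-style interrupt in the path.
mailbox = {"full": False, "word": None}
cond = threading.Condition()

def sender(word):
    with cond:
        mailbox["word"], mailbox["full"] = word, True
        cond.notify()                     # wake the receiving module

def receiver(out):
    with cond:
        while not mailbox["full"]:
            cond.wait()                   # sleep until the flag is set
        out.append(mailbox["word"])
        mailbox["full"] = False           # free the slot

result = []
t = threading.Thread(target=receiver, args=(result,))
t.start()
sender(42)
t.join()
print(result)                             # [42]
```

In hardware the "condition" is simply a flag word that the receiver tests; the attraction of the scheme is that a transfer costs a few memory cycles rather than the tens of microseconds of an interrupt response.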

One prototype intermodule communication scheme for PMS level modules based on shared memory is the one employed by Computer Modules (CM's) [Bell, 1973; Fuller, 1973b]. The structure of a typical CM network is shown in Figure 5.

5) Problem decomposition
The last point is particularly important. Methods for decomposing a problem so that it can be executed in parallel on several interconnected, cooperating PMS level modules must be developed in order to realize the potential of PMS modules.
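As a toy illustration of such a decomposition, consider a summation split across k cooperating modules, each computing a partial result that is then combined (simulated sequentially here):

```python
def decompose_sum(data, n_modules):
    """Split a summation across n_modules and combine the partial results."""
    chunk = (len(data) + n_modules - 1) // n_modules
    # Each "module" sums its own slice of the data independently.
    partials = [sum(data[i * chunk:(i + 1) * chunk]) for i in range(n_modules)]
    return sum(partials)                  # final combining step

data = list(range(100))
print(decompose_sum(data, 4))             # 4950, the same as sum(data)
```

Reductions like this decompose trivially; the open research problem flagged in the text is finding such decompositions for algorithms whose parallelism is not this obvious.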

CONCLUSIONS
PMS level modules could be available in the next 4-6 years. Their existence will open many significant areas of research. It appears that the overhead for PMS modular systems will be on the order of 30%-50%, but with decreasing hardware costs this will be tolerable. The expendable components will be processors, and there will be no effort to obtain a high utilization factor for the individual processors in a system. An 80%-90% idle time may be acceptable.
The high sales volume required by the semiconductor industry suggests that, in the foreseeable future, PMS level components will be oriented towards mass market applications like personal calculators and intelligent terminals. It is interesting to note that as the cost per digital function has decreased, the design time and cost per system has remained relatively constant. So instead of obtaining a cheaper system with the same functions, a user gets a more complex system at the same cost. This is best exemplified by observing the evolution of minicomputers and noting that the cost per system of a 1965 vintage minicomputer (e.g., PDP-8) is about the same as that of a 1974 minicomputer (e.g., the PDP-11)* [Bell, 1974].