Prospects for Conversion to C++ of the Andrew User Interface System

Wilfred J. Hansen
Andrew Consortium
December, 1992

Summary

The Andrew User Interface System is written in the C language with a few conventions which create an object-oriented programming environment. The industry has begun to adopt C++ as the object-oriented programming model, so the question addressed in this paper is whether to convert AUIS to C++.

The non-technical aspects of the conversion include the cost of conversion--almost four programmer-years for the Consortium staff and more for member staffs--and the opportunity cost of not pursuing other enhancements to AUIS. After conversion to C++, there is some potential for greater acceptance of AUIS as a toolkit for X, since existing toolkits are closely bound to C; however, the Fresco effort may be maturing at about the time the conversion would be done.

The most important technical issue is dynamic loading. When the 40 megabytes of Andrew source code are statically linked in one object, the link time and memory occupancy are exorbitant. The only solutions are machine dependent, whether we remain in C or convert to C++. Even with dynamic loading solved, converting AUIS might not be feasible, but experiments have indicated it can be done. One outcome of conversion is the availability of C++ features such as multiple inheritance and in-line functions.

Over the next months, Consortium members will have to determine whether the potential benefits justify the conversion effort.

_____________________________________________
Introduction

This paper considers the question of converting the Andrew User Interface System from the C programming language to C++. The issues are delineated but a final decision rests on support from Consortium members.

The Andrew User Interface System (AUIS) is both a graphical user environment for the X Windows system and a toolkit for extending that environment with new objects and applications.
The quintessential AUIS application is ez, an editor for text and programs which encourages modern styled text and allows embedded images, diagrams, equations, and so on. The set of embeddable objects is open-ended; new objects are regularly created at many sites. The key advantage of the toolkit is that applications can take advantage of the text object and other objects, so it is almost trivial to create applications that offer full-featured editing throughout their screen image.

AUIS is written in the C programming language. To implement objects for end users, the programming environment is augmented with a set of conventions to create an object-oriented programming environment. The programmer interface to an object is described in a .ch file which is preprocessed to produce header files to be imported by client objects. Object implementations are written in ordinary C and preprocessed with the standard C preprocessor. (Other than a few simple rules about procedure headings, programs are written in standard C code. The rules are much briefer than those of the Xt environment. A note is available from Todd Inglett (IBM, Rochester) comparing C++ and the AUIS coding conventions for C.)

If all objects were always loaded, the executable size (let alone the time to link the executable) would become unacceptable. To reduce the size of executables, to facilitate arbitrary introduction of new objects, and to speed testing, a dynamic loading facility for objects was developed. It is perfectly possible--and even common--to insert into an ez document an object whose implementation was begun long after ez was completed; an object, moreover, about which the ez application has no built-in knowledge, not even a table containing the name. Few existing versions of Unix offer any form of dynamic loading, so it was incorporated as part of the AUIS environment; porting this mechanism constitutes the major effort required to port AUIS to a new platform.
That this is not an overwhelming task is attested to by the fact that the system has been ported to over thirty combinations of hardware and operating system.

At about the time AUIS was being developed, C++ was also becoming available. We chose not to employ it for AUIS because it was proprietary, had an awkward implementation, was not widely accepted, and lacked a means to introduce dynamic loading. Since that time object-oriented programming has become more popular and C++ has come to be seen as the shortest path into object-oriented programming for programmers already familiar with C.

Given the increasing interest in C++, the question arises as to whether to convert AUIS to C++. Although there are some technical complexities, the decision hinges more on non-technical issues. Both aspects are covered in subsequent sections.

_____________________________________________
Non-technical issues in conversion

The non-technical issues pertaining to the conversion decision pull in both directions. They include desires for standardization, the need for programmer training, and the opportunity costs to the Consortium staff of spending time on the conversion.

Large organizations, such as the Consortium members, have difficulty managing the proliferation of programming environments. With the advent of object-oriented (O-O) programming, managers have seen an opportunity to standardize on C++ as the single programming environment spanning the range from traditional C to O-O programming. This standardization lends impetus to the drive to convert from the informal O-O environment of AUIS to a C++ implementation.

Since C++ is object-oriented, it lends itself more to computer-aided software engineering tools than plain C does. Consequently a number of such tools are emerging for C++. It will be some time before their quantity and features reach those of similar C tools, but they have the potential to exceed the capabilities of C tools because of the O-O approach.
As part of a transition to O-O, organizations will have to retrain programmers. Formal training programs are available for C++, but not as readily so for the AUIS environment. The Consortium staff can provide such training, if desired, but the environment is so close to C that formal training has not been necessary in most cases. Indeed, the AUIS environment offers the distinct advantage of being able to do O-O programming within the familiar C language. Programmers can learn the O-O style of programming with no effort devoted to learning a new language. This experience should prove readily transferable to the full-blown O-O environment of C++.

It might be thought that conversion to C++ would make AUIS interoperate more easily with other programming systems. This is unlikely. The real issues of interoperation are interfaces and not programming languages; interfaces such as those for sharing a data stream, screen space, input devices, and print page space. These interfaces are defined at the level of AUIS, but not at the language level.

An alternative effort, called Fresco, is in progress to define interfaces similar to AUIS interfaces for a toolkit written in C++ and operating under X Windows. This project is based largely on Interviews and to a lesser extent on ideas from the Andrew Toolkit; two of the principal contributors are Mark Linton, who directed the Interviews project, and Andrew Palay, an early manager and designer of the Andrew Toolkit. Were we to convert AUIS to C++, it would become available at about the same time as other products based on the Fresco design. These latter would be newer code and would initially lack many features of AUIS, but could perhaps advance more rapidly due to an architecture based in part on experience with that of Andrew.

From the standpoint of AUIS as a living system, conversion to C++ has the disadvantage of requiring staff time that could otherwise be spent on enhancing and adapting the system.
For instance, the transition from AFS to DFS will require significant effort in the Andrew Mail Delivery System component. And a number of universities including CMU are working on alternative, server-based mail delivery approaches which would require changes to Andrew. Users are continually requesting minor improvements in text and other objects to make them even more suited to, for instance, producing technical papers. For creating applications, we envision changes that would make application creation much less programmer intensive. A major problem area is the printing model, which is still tied to the aging troff mechanism. Even the effort to release the code to the X11R6 tape in December 1993 could be imperiled by conversion costs.

Conversion to C++ is likely to result in a subset of the existing system since some current capabilities are too little used to justify the cost of conversion. Zip, for example, is unlikely to be converted, although existing zip insets can be handled via the new 'figure' inset.

The Appendix shows that the cost of converting to C++ will be essentially an entire year of Consortium staff effort. While this year would have benefits to programmers, it would offer no visible improvements to users, thus constituting a significant hiatus in efforts toward wider adoption of the Andrew User Interface System. One way to bridge this hiatus would be ensured support of the Consortium for an additional year beyond the time required to do the conversion.

_____________________________________________
Dynamic Loading

The most problematic technical issue in the AUIS programming environment is dynamic loading and linking. This facility makes it possible to have a vast arsenal of objects available on demand and extensible without system modification.
However, it also means that porting AUIS to a new hardware/operating-system platform requires more than just recompilation; the dynamic loader must be adapted to the vagaries of the object code file representation on the new platform.

C++ lacks a dynamic loading capability. It is an option, however, to continue using the basic dynamic loading capability already in the AUIS environment. The principal additional difficulty is that C++ compilers "mangle" names to adapt them to the arbitrary restrictions of older loaders. Under this strategy, C++ would be used but porting would still be machine dependent.

An alternative is to statically load the entire system. This can work in an establishment with well defined requirements for a limited set of objects. Indeed, the Consortium staff has already committed itself to produce a statically loaded version of a subset of AUIS in 1993. It may be that this will be an attractive option and that dynamic loading will be seen as a relic. With this option, however, the link time to incorporate a new object will be very large. Thus the edit-compile-test cycle will be far more trying than at present.

Additional factors to consider are introduced by the coming availability of shared libraries. When such a library is employed, multiple ATK applications can be executed with little additional cost in swapping space. In some implementations of sharing, there is even the option of dynamically loading members of the library as they are needed so even less space is required. (However, if members are small and the implementation always loads them on page boundaries, something like half of memory can be wasted in unoccupied space.)

On the IBM RS/6000/AIX and Hewlett-Packard 700/800 series, dynamic loading for AUIS is already implemented using shared library and platform-standard dynamic loading facilities.
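The name-mangling difficulty noted above can be illustrated with a minimal sketch. It is not the AUIS loader: every name in it (Inset, ClockInset, clock_create, RegisterSymbol, LoadByName) is hypothetical, and an in-process table stands in for the symbol lookup that a real loader performs against an object file. The point shown is only that an entry point declared with C linkage keeps a predictable, unmangled name that can be built from the class name at run time.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical base type for loadable objects; not an actual AUIS class.
struct Inset {
    virtual ~Inset() {}
    virtual const char *Name() const = 0;
};

// Hypothetical object that might live in a separately loaded module.
struct ClockInset : Inset {
    const char *Name() const override { return "clock"; }
};

extern "C" {
// C linkage: the function-pointer type and the entry point below keep
// plain, unmangled names, so a loader can find "clock_create" as written.
typedef Inset *(*CreateFn)();
Inset *clock_create() { return new ClockInset; }
}

// Stand-in for the platform symbol table (an assumption for this sketch;
// a real loader would read symbols from the object file instead).
static std::map<std::string, CreateFn> &SymbolTable() {
    static std::map<std::string, CreateFn> table;
    return table;
}

void RegisterSymbol(const std::string &symbol, CreateFn fn) {
    SymbolTable()[symbol] = fn;
}

// Build the entry-point name from the class name and call it if present.
// Returns nullptr when no such symbol is known.
Inset *LoadByName(const std::string &className) {
    assert(!className.empty());
    std::map<std::string, CreateFn>::const_iterator it =
        SymbolTable().find(className + "_create");
    return it == SymbolTable().end() ? nullptr : it->second();
}
```

Without the C linkage declaration, the symbol for clock_create would carry a compiler-specific encoding of its signature, and the lookup string would have to be rebuilt for each compiler and platform.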
Unfortunately, shared library implementations differ sufficiently between platforms that porting via shared libraries will be at least as machine dependent as the present scheme.

In short, the issue of dynamic loading is more or less orthogonal to conversion to C++; we can (and will) introduce static loading without converting and we can convert and retain or discard dynamic loading. If we do convert, however, porting AUIS to new platforms will be more difficult due to the greater differences between implementations of C++ in areas such as name mangling.

If we were to begin conversion in 1993, there would still be too few platforms on which shared libraries are available. Thus it would be necessary to implement the more general model of dynamic loading despite the complexities introduced by C++.

_____________________________________________
Other technical issues

Is conversion technically feasible and advantageous? Various experiments conducted by Todd Inglett and Rob Ryan have demonstrated that a mapping can be defined from the current AUIS code to C++ code and that furthermore a substantial fraction of this mapping can be performed mechanically. Details of the mapping and translator are available on request.

A major difference between the two environments is that in the current environment each object is defined in its own header file and implemented in a separate C file. In C++, multiple objects can be defined in a header and implemented in a source file. This is no impediment to conversion because C++ accepts the single-object-per-file mode as a trivial subset.

Conversion of header files requires reordering of certain parts of declarations and the introduction of new keywords. A Ness function has been created which achieves this for the majority of existing header files. Conversion of source files requires reordering the code in method calls and revising function headers.
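The kind of source change involved can be sketched as follows. This is an illustration under stated assumptions, not output of the actual translator: the names counter_c, counter_c_Add, counter_c_Get, and Counter are hypothetical, and the "before" form only approximates the C style of an explicit object argument rather than reproducing the AUIS conventions exactly.

```cpp
#include <cassert>

// "Before": C style. The object is an explicit first parameter (self),
// and a call names the function first and the object second.
struct counter_c {
    long value;
};

void counter_c_Add(struct counter_c *self, long n) { self->value += n; }
long counter_c_Get(struct counter_c *self) { return self->value; }

// "After": C++ style. The object no longer appears in the function
// header; inside the body it is available as the keyword this.
class Counter {
public:
    Counter() : value(0) {}
    void Add(long n) { this->value += n; }
    long Get() const { return this->value; }

private:
    long value;
};

// Call sites are reordered accordingly:
//     counter_c_Add(&c, 5);   becomes   k.Add(5);
//     counter_c_Get(&c);      becomes   k.Get();
```

Both forms compute the same result; only the position of the object in headers and calls differs, which is why much of the rewrite can be mechanical.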
In C++, the object operated on is not explicitly mentioned in the header and is usually omitted in references in function bodies. However, the keyword -this- is acceptable as a representation of the object, so the most likely conversion is to replace the AUIS word -self- with -this-. A Ness function will effect this conversion as well.

An unfortunate side-effect of any conversion effort is the temptation to 'fix' things as one goes. Done this way, the conversion could easily take twice as long. The result would be cleaner, more maintainable code, but little observable increase in functionality. The development of automatic tools for the conversion will make it possible, we hope, to minimize the temptation to rework the code.

C++ offers a number of new language features of varying degrees of merit. The use of in-line code to replace macros, for instance, can considerably reduce the temptation to create 'clever' macros the real effect of which is to render the code less maintainable.

In the area of the object model, C++ is quite close to the existing environment, but with one addition: C++ offers multiple inheritance. With this feature objects can inherit properties from more than one parent. For instance, it would no longer be necessary to decide if an object were more like a list or a view--it can be both. This can be a significant advantage in designing new objects, though it will have little impact on existing code which has already been created with the single parent inheritance model.

_____________________________________________
Conclusion

Conversion of the Andrew User Interface System from its current C base to C++ is feasible, but costly. The principal cost is the opportunity cost of making no enhancements to Andrew for the entire year required for conversion. The principal advantage is that the system will then be ready if and when there are more C++ programmers available than C programmers.
Otherwise the main advantage to C++ is the possibility of creating new objects utilizing multiple inheritance rather than the single inheritance of the existing environment.

The Consortium staff is able and reluctantly willing to undertake the conversion; however, the decision to begin will depend on the requirements and support of Consortium members.

_____________________________________________
Appendix - Cost estimates

Since hardware and software are in place, the cost of the conversion will be personnel time--measured here in programmer-months. These estimates are based somewhat on our conversion experiments, but have a large component of educated guessing based on past experience with other conversions.

Costs within the Consortium staff:

    Tools development       2 programmer-months
    Convert headers         4
    Convert code            4
    System integration      8
    Retraining              7
    Documentation           4
    5 more platforms       15 (@3 per platform)
    Total                  44 programmer-months (estimate)

This estimate totals almost four programmer-years. Since the Consortium staff consists of four programmers, this will leave us with no slack for other projects. If these numbers appear high, please remember that there is no estimate given for debugging and releasing the resulting code, an effort which usually takes about half a programmer-year.

Costs within member sites:

If the members request the conversion, it can be assumed they are prepared to absorb elsewhere the costs of providing a C++ environment and training their own staff. We also assume below that the members use conversion tools and documentation provided by the Consortium and further that they do the conversion for only one platform.

    Learn new AUIS conventions        .2  pm / programmer
    Learn new AUIS function names     .3  pm / programmer
    Convert object headers            .05 pm / object
    Convert object implementation     .2  pm / object
    Integrate converted objects       .2  pm / object

    (pm = programmer-months)

For a shop with Q objects and P programmers the estimated cost in programmer-months would be

    .45Q + .5P

For a shop with 20 AUIS objects and 6 programmers already familiar with AUIS and C++, the estimated cost would be .45(20) + .5(6) = 9 + 3 = 12 programmer-months; that is, the conversion could be done in about two months,