ISSUE: Integer Portability REVISION HISTORY: Rob MacLachlan, 21 February 94 STATUS: Open RELATED ISSUES: CATEGORY: Addition PROBLEM DESCRIPTION: The current definition of the class and integer arithmetic will cause serious portability problems when Dylan programs are moved between implementations that adopt different strategies on extended precision arithmetic: 1] There is no way for a Dylan compiler to detect that an otherwise portable program depends on extended precision arithmetic. 2] There is no portable way to express that extended arithmetic is unnecessary, and that the implementation is expected to produce good inline code. PROPOSAL: Add: -- , an optional sealed subclass of , and -- The type (a finite subtype of ) Three implementation strategies are permitted: 1] is equivalent to (the same class as) . The class must implement indefinite-precision integers. represents the subrange of integers with efficient representation. This option allows Lisp-like integer semantics. 2] and are disjoint subclasses of the abstract class . The consequences of overflow is implementation defined (see below), whereas use of guarantees indefinite precision arithmetic. 3] Like 2, but no class is defined, and the variable has no definition. This provides a clear way for an implementation to indicate that semantics are not supported. is equivalent to . Implementations of this kind might support upgrading to 2 by loading an library. In other words, implementations may support: -- only extended integers, -- both fixed and extended integers, or -- only fixed integers. In discussion below, these options are referred to as [only-extended], [both] and [only-fixed]. An "integer function" is defined to be one of the following. +, *, - (unary or binary), floor, ceiling, truncate, round, floor/, ceiling/, truncate/, round/, modulo, remainder, expt, logior, logxor, logand, lognot, ash, binary-lcm, binary-gcd The intent is that this list should include all functions which compute a new integer value from integer arguments using arithmetic and logic operations. Representations and Coercion: as (type == , value :: ) [G.F. Method] as (type == , value :: ) [G.F. Method] These methods coerce integers to the fixed or extended types. Conversion to a may cause overflow, see below. [both] In this model, any integer representable as a is also representable as an . In this case there are two semantically distinguishable representations for mathematically identical small integers: id?(3, as(, 3)) ==> #f (3 = as(, 3)) ==> #t For integer functions (as defined above), "extendedness" is contagious: -- If all the arguments are s, then the integer results will also be s (unless overflow occurs, see below.) -- If any of the arguments are , then the integer results are also s, and any internal computation is immune to overflow. [only-extended] In this model, all integers are extended integers. The AS methods are still defined, but simply return the argument value after checking that it is already of the correct type, so: as(, x) <==> check-type(x, ) [only-fixed] In this model, the variable will be undefined. The environment will presumably statically detect undefined variable references (before run-time.) Overflow: overflow occurs when the mathematically correct result of an integer function is outside the range of . Implementations may either return the correct (non-fixed) result or signal an error. (Note that [only-extended] must always return, and [only-fixed] can only signal an error.) Integer literals: If an integer literal can be represented as a , then it is. If an implementation supports extended integers and a literal is too large to be a fixed integer, then it is parsed as an . In [only-fixed] large literals will cause a compile-time error. Size: The type is guaranteed to be able to represent at least all integers between -(2^27) and (2^27)-1, i.e. a 28bit signed integer. The largest and smallest values are stored in: $most-positive-fixed-integer $most-negative-fixed-integer The size and keys of objects are guaranteed to be s. All Dylan functions that currently accept integers still accept any class of integer as arguments, but functions that return vector sizes or vector keys will coerce these values to . If implemented, the class should be an arbitrarily large variable-length integer representation. Limited integers: LIMITED can be used to make subtypes of and as well as . If INT-CLASS is one of , or , then: instance?(object, limited(INT-CLASS, ...)) is true if: instance?(object, INT-CLASS) is true and the value is in the range specified to limited. Given two limited integer types T1 and T2, T1 is a subtype of T2 if: 1] The integer class of T1 is a (possibly improper) subtype of the class of T2, and 2] The range of T1 includes no integer values not also included in T2. Interaction w/ floats: When a float is coerced to an integer by TRUNCATE, TRUNCATE2, CEILING, etc., then, if possible, the result will be represented as a . If the result is too large and is implemented, then the result will be an extended integer. RATIONALE: This proposal allows clear guidelines for portable efficient Dylan code: -- Where the contract is adequate, slots, global variables, and function arguments and return values should be declared as . -- Where extended arithmetic might be needed (non-vector-indices larger than 28 bits), then before doing operations that might overflow, AS should be used to coerce any fixed operands to extended integer. Values could be declared either or , but declaring as is safer, since this provides some checking that necessary coercions have been done. -- If efficiency is unimportant and there is little possibility of porting to implementations that don't implemented extended integers, then declarations (or no declarations at all) are adequate. The reason for making the consequences of overflow implementation-defined is that this allows the [only-extended] model to treat as simply designating a subrange of integers (like Common Lisp fixnum.) Implementations which support [both] are encouraged to also support [only-extended], with the choice being controlled by a system configuration option that can be set according to the local user community. EXAMPLES: If "fx", "fy" and "fz" are variables or expressions, then this expression (which doesn't use any implementation-specific type declarations) should compile with comparable efficiency to any equivalent non-portable expression: fx + fy + fz Consider this definition of the factorial function: define method factorial(n) if (zero?(n)) 1; else n * factorial(n - 1); end; end; If N is already an , then overflow can't occur. This code will compile in implementations that don't support , but will only be guaranteed to work for results up to 28 bits. To ensure that results will always be generated from N's, the code would have to have a coercion added somewhere, like: define method factorial(n) if (zero?(n)) as(, 1); else n * factorial(n - 1); end; end; Note that this version won't compile under [only-fixed]. COST TO IMPLEMENTORS: This proposal offers a wide range of implementation freedom, and is intended to allow implementors to choose the best implementation strategy for their platform and user community. COST TO USERS: Some potential for confusion is created by the optional disjointness of classes and the possibility of overlapping integer classes. This can be avoided in non-efficiency-critical applications (e.g. teaching) by not presenting the subclasses and choosing a Dylan implementation which supports the [only-extended] model. PERFORMANCE IMPACT: This proposal provides clear coding style guidelines which allow even simple implementations to generate good inline integer arithmetic. BENEFITS: Clear access to extended arithmetic and increased portability of efficient code. COST OF NON-ADOPTION: Implementations will be forced to develop idiosyncratic ways to describe fixed-precision arithmetic. Users will be forced to learn and convert between different fixed-integer conventions, and portability of efficiency-critical code will suffer. AESTHETICS: The possibility in [both] of two = but non-id? integers is ugly, but when simplicity is the primary concern, implementations can avoid this problem by using the [only-extended] model instead. FUTURE FEATURES: A standard read/print library would probably want a read/print syntax that makes extended and fixed integers look different.