default nanop ccnop ldnop sb aluz yd car0 ss0 alubr mdrikd marika ; FARITH - floating point arithmetic routines for Ikonas BPS ; Paul Heckbert, NYIT 9 Aug 83 ;---------------------------------------------------------------------- ; floating point format is the same as a VAX's ; a 32-bit float: ffffffff,ffffffff,seeeeeee,efffffff ; where s=sign bit, e=exponent+128, f=fraction with msb lopped off ; -127<=exponent<=127, fraction has 24 bits counting hidden bit ; all bits 0 means zero (a special case - no hidden bit) ; Except when there's a floating point error, MAR and MDR are untouched ; by these subroutines. They use only one entry on the subroutine stack. ; FLOATING POINT ERROR CODES: (get or-ed into "ferr" word) ; You can clear ferr after each check you make to help localize the error. puf=1 ; positive underflow pof=2 ; positive overflow nuf=4 ; negative underflow nof=8 ; negative overflow div0=16 ; divide by zero iof=32 ; integer overflow (in flt_int) ferr: data 0 ; error word global ferr,int_flt,flt_int,fadd,fsub,fmul,fdiv,fcmp ;---------------------------------------------------------------------- ; int_flt: convert 32-bit two's complement integer to a float ; input and output in r14 ; takes 3-10 us. ; uses r13, Q, UDR int_flt: ps b14 qd ldudr 0x4000 ps sq nccneg jmpdf ifpos ifneg: ms car1 sq qd ; take absolute value aluz b14 rlfbd ss1 nccneg jmpdf ifskip ; 80000000 (sign bit) ms sq qd jmpdf ifskip ; -2147483648 -> -2147483647 ifpos: aluz b14 bd cczero naretn ; 00000000; tricky zero check! ifskip: pr rlimm -128-32 b13 lsxbd ; initialize exponent ras rimm 0x0000 sq ; test bit 30 slnml car1 b13 ncczero jmpdf canon ; quit if bit 30 on slnml car1 b13 nccovr jmpdf . ; shift Q left, r13++ till ; normalized, shifts 1-31 bits canon: rps rlimm 0x0080 sq qd ; round ms car1 b13 bd ncccar jmpdf nover ; adjust exponent rlqyd incrs b13 ; if rounding overflow nover: ps b13 rlfbd sr ldudr 0x7fff ras rimm 0xff00 sq qd ; mask fraction rors ra13 sq b13 bd ; tack them together pr rbbr b13 bd ; rotate into standard form rors ra13 b14 bd jmpdf rotate ; or in sign bit ;---------------------------------------------------------------------- ; flt_int: convert float to 32-bit two's complement integer ; input and output in r14 ; takes 3-12 us. ; uses r13, Q, UDR flt_int: pr rbbr b14 qd ldudr 0x8000 ras rimm 0x007f sq b13 llfbd sr ; extract exponent smr car1 rlimm 128 b13 bd ; underflow? nccgtz jmpdf zero ; answer=0 if exponent<=0 smr car1 rlimm 32 b13 ; overflow? nccneg jmpdf iover ; overflow if exponent>=32 pr rbhr b14 bd ldudr 0x807f ras rimm 0xffff b14 bd ; extract fraction (sign/mag) ldudr 0x0080 rors rimm 0x0000 b14 bd ; add hidden bit smr car1 rlimm 24 b13 bd ; compute r14<<(exponent-24) ps b13 ccneg jmpdf rshift smtc carz b14 cczero naretn ; sign/mag -> two's comp. lshift: smr car1 rlimm 1 b13 bd ; SHIFT LEFT 1 to 7 bits ps b14 lafbd ccgtz jmpdf lshift ; r14<<=1 while r13-->0 naretn rshift: ps car1 b13 bd ; SHIFT RIGHT 1 to 23 bits ps b14 rafbd ccneg jmpdf rshift ; r14>>=1 while r13++<0 smtc carz b14 naretn ; sign/mag -> two's comp. ;---------------------------------------------------------------------- ; fadd: floating point add, computes c<-a+b, or r14<-r14+r13 ; fsub: floating point subtract, computes c<-a-b, or r14<-r14-r13 ; takes 7-17 us. ; uses r12, r11, Q, UDR ; ; rounding errors: 16777216-.6=16777216 (see Knuth Vol. 2, Sec. 4.2.1, ex. 5) ; -16777216-1.1=-16777216 ; r14=a,af, cf,c ; r13=b,bf ; r12=ae, ce ; r11=be fsub: ps b13 reos rlimm 0x8000 b13 bd cczero naretn ; negate b, if b=0 return fadd: pr rlimm 255 qd ; UNPACK INTO EXPONENTS & FRACTIONS pr ra14 b12 llfbd sr ldudr 0x807f pr rbhr b14 bd cczero jmpdf a_0; a=0? ras rimm 0xffff b14 lafbd ; af=a&0x807fffff<<1 (a's fraction) ras rbbr b12 sq bd ; ae=a>>7&255 (a's exponent) pr ra13 b11 llfbd sr pr rbhr b13 bd cczero jmpdf b_0; b=0? ras rimm 0xffff b13 lafbd ; bf=b&0x807fffff<<1 (b's fraction) ras rbbr b11 sq bd ldudr 0x0100; be=b>>7&255 (b's exponent) ; fractions in sign/magnitude form now rps rimm 0 b14 lafbd ; add hidden msb rps rimm 0 b13 lafbd ; af,bf now 26 bit (2 guard bits) rms car1 ra12 b11 qd ; Q=ae-be (diff. of exponents) ps sq nccneg jmpdf bsmall ; EQUALIZE EXPONENTS asmall: rps rlimm 25 sq ; now shift af right -Q bits ps car1 sq qd ccneg jmpdf af0 ; if Q < -25 then a negligible ps b14 rafbd nccneg jmpdf aok aloop: ps car1 sq qd ; while Q<0: Q++ ps b14 rafbd ccneg jmpdf aloop ; af=af>>1 aok: pr ra11 b12 bd jmpdf equal ; ce=be af0: aluz b14 bd jmpdf aok ; a negligible: af=0 bf0: aluz b13 bd jmpdf equal ; b negligible: bf=0 bsmall: ms car1 sq qd cczero jmpdf equal ; Q=-Q rps rlimm 25 sq ; now shift bf right -Q bits ps car1 sq qd ccneg jmpdf bf0 ; if Q < -25 then b negligible ps b13 rafbd nccneg jmpdf equal bloop: ps car1 sq qd ; while Q<0: Q++ ps b13 rafbd ccneg jmpdf bloop ; bf=bf>>1 ; ce=ae ; ADD FRACTIONS equal: smtc carz b14 ; convert frac to two's compl. smtc carz b13 rms car1 rlimm -3 b12 lsxbd ; r12=-3-ce rps ra13 b14 bd ldudr 0x7fff ; cf=af+bf, set UDR for later rps ra14 b14 llfbd cczero naretn ; cf=0? rps ra14 b14 qd ccneg jmpdf neg ; Q=cf<<3: preshift saves time ; NORMALIZE FRACTION, c>0 pos: slnml car1 b12 nccovr jmpdf . ; Q<<=1, r12++ till normalized ; loops 2-26 times for fadd, 2 or 3 times for fmul rps rlimm 0x0080 sq qd ; round ms car1 b12 bd ncccar jmpdf pskip ; ce=-r12 rlqyd incrs b12 ; if rounding overflow pskip: nccgtz jmpdf puflow ; if exponent underflow smr car1 rlimm 256 b12 ps b12 rlfbd sr nccneg jmpdf poflow ; if exponent overflow ras rimm 0xff00 sq b14 bd ; mask normalized cf ; UDR set to 7fff long ago rors ra12 b14 bd jmpdf finis ; tack them together ; NORMALIZE FRACTION, c<0 neg: ms car1 sq qd ; take abs. value of fraction slnml car1 b12 nccovr jmpdf . ; Q<<=1, r12++ till normalized ; loops 2-26 times for fadd, 2 or 3 times for fmul rps rlimm 0x007f sq qd ; round ms car1 b12 bd ncccar jmpdf nskip ; ce=-r12 rlqyd incrs b12 ; if rounding overflow nskip: nccgtz jmpdf nuflow ; if exponent underflow smr car1 rlimm 256 b12 ps b12 rlfbd sr nccneg jmpdf noflow ; if exponent overflow ras rimm 0xff00 sq b14 bd ; mask normalized cf ; UDR set to 7fff long ago rors rlimm 0x0080 b14 bd ; add sign bit rors ra12 b14 bd jmpdf finis ; tack them together finis: pr rbbr b14 bd ; PUT RESULT IN STANDARD FORM rotate: pr rbhr b14 bd naretn a_0: pr ra13 b14 bd naretn ; a=0, return b b_0: pr rbbr b12 rlfbd sr ; b=0, repack and return a ps b14 rafbd rors ra12 b14 bd jmpdf rotate ;---------------------------------------------------------------------- ; fmul: floating point multiply, computes c<-a*b, or r14<-r14*r13 ; takes about 8 us. ; uses r12, r11, Q, UDR, MPX, MPY ; r14=a,af, cf,c ; r13=b,bf,temp ; r12=ae, ce ; r11=be,temp fmul: pr rlimm 255 qd ; UNPACK INTO EXPONENTS & FRACTIONS pr ra14 b12 llfbd sr ldudr 0x807f pr rbhr b14 bd cczero naretn ; a=0? ras rimm 0xffff b14 bd ; af=a&0x807fffff (a's fraction) ras rbbr b12 sq bd ; ae=a>>7&255 (a's exponent) pr ra13 b11 llfbd sr pr rbhr b13 bd cczero jmpdf zero; b=0? ras rimm 0xffff b13 bd ; bf=b&0x807fffff (b's fraction) ras rbbr b11 sq bd ldudr 0x0080; be=b>>7&255 (b's exponent) ; fractions in sign/magnitude form now rps rimm 0 b14 bd ; add hidden msb rps rimm 0 b13 bd smtc carz b14 ; convert to two's complement smtc carz b13 rps ra11 b12 bd ; ADD EXPONENTS ; at this point, ce=r12-128 rms car1 rlimm 126 b12 bd ; r12=126-r12 (set up r12 for slnml) ; MULTIPLY FRACTIONS ; NOTATION: cf = af*bf/2^24 = (2^9*a1 + a0) * (2^9*b1 + b0) / 2^24 ; = a1*b1/2^6 + (a0*b1+a1*b0)/2^15 ; where a0,b0 are 9 bit; a1,b1 are 15-bit; cf is 24 bit ras rlimm 0x01ff b14 alumpx ; MPX=a0=am&1ff pr ra13 b11 bd ; r11=bm pr rbbs b13 rlfbd alumpy ; MPY=b1=bm>>9 pr rbbs b14 rlfbd alumpx ; MPX=a1=am>>9 mpzbr b13 bd ; r13=MPZ=a0*b1 (23-24 bits) ras rlimm 0x01ff b11 alumpy ; MPY=b0=bm&1ff mpzbr b14 bd ; r14=MPZ=a1*b1 (29-30 bits) mpzbr b14 smr car1 rlimm 0x8080 b14 bdqd; r14=MPZ=a1*b0 (23-24 bits) ; Q=a1*b1-0x8080 (unround 3x) ; (a0*b0 is insignificant) rps ra13 b14 rlfbd ; r14=r13+r14>>1 ps b14 rlfbd ; r14=r14>>1 pr rbbs b14 lsxbd ; r14=SXT(r14>>8) ps b14 llfbd ldudr 0x7fff ; a0*b1+a1*b0>>9 rps ra14 sq qd pos ; Q+=r14, set UDR for later ccneg jmpRdf neg ; jump to subroutines in fadd ;---------------------------------------------------------------------- ; fdiv: floating point divide, computes c<-a/b, or r14<-r14/r13 ; takes about 11 us. ; uses r12, r11, Q, UDR, CNT ; ; note: 0/0 returns 0 ; r14=a,af, cf,c ; r13=b,bf,temp ; r12=ae, ce ; r11=be,temp fdiv: pr rlimm 255 qd ; UNPACK INTO EXPONENTS & FRACTIONS pr ra14 b12 llfbd sr ldudr 0x807f pr rbhr b14 bd cczero naretn ; a=0? ras rimm 0xffff b14 bd ; af=a&0x807fffff (a's fraction) ras rbbr b12 sq bd ; ae=a>>7&255 (a's exponent) pr ra13 b11 llfbd sr pr rbhr b13 bd cczero jmpdf divby0; b=0? ras rimm 0xffff b13 bd ; bf=b&0x807fffff (b's fraction) ras rbbr b11 sq bd ldudr 0x0080; be=b>>7&255 (b's exponent) ; fractions in sign/magnitude form now rps rimm 0 b14 bd ; add hidden msb rps rimm 0 b13 bd smr car1 ra11 b12 bd ; SUBTRACT EXPONENTS rps rlimm 128 b12 bd ; correct bias to 128 ; DIVIDE FRACTIONS smtc carz b13 ; convert to two's complement smtc carz b14 ccneg jmpdf nD ; is bf<0? pD: aluz qd ccneg jmpdf pDnN ; is af<0? pDpN: RMS CAR1 ra14 b13 ; af>0, bf>0 CCNEG jmpdf ppok ; if af>=bf then ps b14 rlflqbd ; pre-shift numerator right 1 incrs b12 ; the following three lines do: Q = r14,Q / r13; r14 = r14,Q % r13 ppok: dlnml ra13 b14 lr ldcnt 1-25 ; 25th iteration for rounding tcdiv ra13 b14 lr carz ncccntz jmpcnt .; shift and subtract in loop tcdivc ra13 b14 ss1 carz ; divide correction RPS CAR0 ra13 b14 rlflqbd ; r14+=r13; (r14,Q)>>=1 PS CAR1 sq b14 rlfbd LDUDR 0x807f ; r14 gets quotient ps b12 jmpdf ppack pDnN: RPS ra14 b13 ; af<0, bf>0 CCGTZ jmpdf pnok ; if -af>=bf then ps b14 rlflqbd LR LDUDR 0x8000 ; pre-shift numerator right 1 RORS RIMM 0x0000 B14 BD ; restore sign bit incrs b12 pnok: dlnml ra13 b14 lr ldcnt 1-25 ; 25th iteration for rounding tcdiv ra13 b14 lr carz ncccntz jmpcnt .; shift and subtract in loop tcdivc ra13 b14 ss1 carz ; divide correction RPS CAR0 ra13 b14 rlflqbd LDUDR 0xfc00 ; r14+=r13; (r14,Q)>>=1 RPS RIMM 1 sq b14 rlfbd SS1 ; r14 gets quotient ps b12 jmpdf npack nD: aluz qd ccneg jmpdf nDnN ; is af<0? nDpN: RPS ra14 b13 ; af>0, bf<0 CCNEG jmpdf npok ; if af>=-bf then ps b14 rlflqbd ; pre-shift numerator right 1 incrs b12 npok: dlnml ra13 b14 lr ldcnt 1-25 ; 25th iteration for rounding tcdiv ra13 b14 lr carz ncccntz jmpcnt .; shift and subtract in loop tcdivc ra13 b14 ss1 carz ; divide correction SMR CAR1 ra13 b14 rlflqbd LDUDR 0xfc00 ; r14-=r13; (r14,Q)>>=1 RPS RIMM 1 sq b14 rlfbd SS1 ; r14 gets quotient ps b12 jmpdf npack nDnN: RMS CAR1 ra14 b13 ; af<0, bf<0 CCGTZ jmpdf nnok ; if -af>=-bf then ps b14 rlflqbd LR LDUDR 0x8000 ; pre-shift numerator right 1 RORS RIMM 0x0000 B14 BD ; restore sign bit incrs b12 nnok: dlnml ra13 b14 lr ldcnt 1-25 ; 25th iteration for rounding tcdiv ra13 b14 lr carz ncccntz jmpcnt .; shift and subtract in loop tcdivc ra13 b14 ss1 carz ; divide correction SMR CAR1 ra13 b14 rlflqbd ; r14-=r13; (r14,Q)>>=1 PS CAR1 sq b14 rlfbd LDUDR 0x807f ; r14 gets quotient ps b12 jmpdf ppack ; PACK RESULT ppack: nccgtz jmpdf puflow ; if exponent underflow smr car1 rlimm 256 b12 pr rbbr b12 rlfbd sr nccneg jmpdf poflow; if exponent overflow ras rimm 0xffff b14 bd ; mask cf (UDR set earlier) rors ra12 b14 bd jmpdf rotate ; tack them together npack: smtc carz b14 nccgtz jmpdf nuflow ; if exponent underflow smr car1 rlimm 256 b12 pr rbbr b12 rlfbd sr nccneg jmpdf noflow; if exponent overflow ldudr 0x807f ras rimm 0xffff b14 bd ; mask normalized cf rors ra12 b14 bd jmpdf rotate ; tack them together ;---------------------------------------------------------------------- ; fcmp: compares two floating point numbers a and b (in r14 and r13, resp.) ; returns -1 if ab ; takes 3 us. ; uses r12, r11, Q, UDR fcmp: smr car1 ra13 b14 cczero jmpdf zero ; a=b? pr rbbr b14 qd ldudr 0x8000 ; UNPACK EXPONENTS & FRACTIONS ras rimm 0x007f sq b12 llfbd sr ; extract ae pr rbhr b14 bd ldudr 0x807f ras rimm 0xffff b14 bd ; extract af (sign/mag) pr rbbr b13 qd ldudr 0x8000 ras rimm 0x007f sq b11 llfbd sr ; extract be pr rbhr b13 bd ldudr 0x807f ras rimm 0xffff b13 bd ; extract bf (sign/mag) ; don't bother with hidden bit ps b14 ccneg jmpdf bneg bpos: smr car1 ra11 b12 qd ccneg jmpdf less ; ae-be; a<0 & b>0? ps sq ccneg jmpdf less ; ae=0 here) smr car1 ra13 b14 ccgtz jmpdf more ; af-bf; ae>be? ccneg jmpRdf less ; af0 & b<0? ps sq ccneg jmpdf more ; aebe? ccneg jmpRdf more ; af