I) Do bottom-up register allocation with 3 registers. When choosing a register to allocate always choose the lowest numbered one available. When choosing register to spill, choose the non-dirty register that will be used farthest in future. If all registers are dirty, choose the one that is used farthest in future. In case of a tie, choose the lowest numbered register. #### Ans: | | Instruction | Live | |----|-------------|-----------------| | 1 | A = B + C | { A, B } | | 2 | C = A + B | { A, B, C } | | 3 | T1 = B + C | { A, B, C, T1 } | | 4 | T2 = T1 + C | { A, B, C, T2 } | | 5 | D = T2 | { A, B, C, D } | | 6 | E = A + B | { C, D, E } | | 7 | B = E + D | { B, C, D } | | 8 | A = C + D | { A, B } | | 9 | T3 = A + B | { T3 } | | 10 | WRITE(T3) | { } | #### 1. A = B + C | Code generated | Register Map | Comments | | | |----------------|------------------------------|------------------------------------------------------------------|--|--| | | (* indicates dirty register) | | | | | LD B R1 | R1: B R2: A* | <ul> <li>ensure(B) returns R1, associates B with R1</li> </ul> | | | | LD C R2 | | and generates LD B R1, | | | | ADD R1 R2 R2 | | <ul> <li>ensure(C) returns R2, associates C with R2</li> </ul> | | | | | | and generates LD C R2 | | | | | | <ul> <li>free(R2) marks R2 as free</li> </ul> | | | | | | (because C is dead) | | | | | | <ul> <li>allocate(A) returns R2, associates R2 with A</li> </ul> | | | | | | <ul> <li>generate code for ADD</li> </ul> | | | | | | <ul> <li>mark R2 as dirty</li> </ul> | | | ### 2. C = A + B | Code generated | Register Map (* indicates dirty register) | Comments | | | |----------------|-------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--| | ADD R2 R1 R3 | R1: B R2: A* R3: C* | <ul> <li>ensure(A) returns R2</li> <li>ensure(B) returns R1</li> <li>allocate(C) returns R3, associates R3 with C</li> <li>generate code for ADD</li> <li>mark R3 as dirty</li> </ul> | | | # 3. T1 = B + C | Code generated | Register Map (* indicates dirty register) | Comments | | | | |----------------|-------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|--| | | | <ul> <li>ensure(B) returns R1</li> <li>ensure(C) returns R3</li> <li>allocate(T1) returns R1, associates R1 with T1 (No register is free. Chooses R1 to free since R1 is non-dirty and is used farthest in future (in instruction 6))</li> <li>generate code for ADD</li> <li>mark R1 as dirty</li> </ul> | | | | ### 4. T2 = T1 + C | Code generated | Register Map | Comments | | | |----------------|------------------------------|-------------------------------------------------------------------|--|--| | | (* indicates dirty register) | | | | | ADD R1 R3 R1 | R1: T2* R2: A* R3: C* | ensure(T1) returns R1 | | | | | | <ul><li>ensure(C) returns R3</li></ul> | | | | | | <ul> <li>free(T1) marks R1 as free</li> </ul> | | | | | | (No Store despite R1 is dirty because T1 is dead) | | | | | | <ul> <li>allocate(T2) returns R1, associates R1 with C</li> </ul> | | | | | | generate code for ADD | | | | | | <ul> <li>mark R1 as dirty</li> </ul> | | | # 5. D = T2 | Code generated | Register Map | Comments | | |----------------|------------------------------|------------------------------------------------------------------|--| | | (* indicates dirty register) | | | | | R1: D* R2: A* R3: C* | <ul><li>ensure(T2) returns R1</li></ul> | | | | | <ul> <li>free(T2) marks R1 as free</li> </ul> | | | | | (No Store despite R1 is dirty because T2 is dead) | | | | | <ul> <li>allocate(D) returns R1, associates R1 with D</li> </ul> | | | | | No code generated | | | | | mark R1 as dirty | | ### 6. E = A + B | Code generated | Register Map (* indicates dirty register) | Comments | | | |------------------------------------|-------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--| | ST R3 C<br>LD B R3<br>ADD R2 R3 R2 | R1: D* R2: E* R3: | <ul> <li>ensure(A) returns R2</li> <li>ensure(B) returns R3 (No register is free. All registers dirty. So, a call to allocate(B) chooses R3 among R1 and R3 to free. R3 is chosen because C is used farthest in future. free(R3) generates store because C is live. Associates R3 with B. Generates Load from B into R3)</li> <li>free(A) marks R2 as free (No Store despite R2 is dirty because A is dead)</li> <li>free(B) marks R3 as free (because B is dead)</li> <li>allocate(E) returns R2, associates R2 with E</li> <li>Generate code for ADD</li> <li>mark R2 as dirty</li> </ul> | | | ### 7. B = E + D | Code generated | Register Map | Comments | | |----------------|------------------------------|------------------------------------------------------------------|--| | | (* indicates dirty register) | | | | ADD R2 R1 R2 | R1: D* R2: B* R3: | ensure(E) returns R2 | | | | | <ul><li>ensure(D) returns R1</li></ul> | | | | | <ul> <li>free(E) marks R2 as free</li> </ul> | | | | | (No Store despite R2 is dirty because E is dead) | | | | | <ul> <li>allocate(B) returns R2, associates R2 with B</li> </ul> | | | | | Generate code for ADD | | | | | mark R2 as dirty | | # 8. A = C + D | Code generated Register Map (* indicates dirty register) | | Comments | | | |----------------------------------------------------------|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--| | LD C R3<br>ADD R3 R1 R1 | R1: A* R2: B* R3: | <ul> <li>ensure(C) returns R3, associates R3 with C, generates load from C into R3</li> <li>ensure(D) returns R1</li> <li>free(C) marks R3 as free (C is dead)</li> <li>free(D) marks R1 as free</li> <li>(no store despite R1 being dirty because D is dead)</li> <li>allocate(A) returns R1, associates R1 with A</li> <li>Generate code for ADD</li> <li>mark R1 as dirty</li> </ul> | | | # 9. T3 = A + B | Code generated | Register Map | Comments | | | |----------------|------------------------------|--------------------------------------------------------------------|--|--| | | (* indicates dirty register) | | | | | ADD R1 R2 R1 | R1: T3* R2: R3: | <ul><li>ensure(A) returns R1</li></ul> | | | | | | <ul><li>ensure(B) returns R2</li></ul> | | | | | | <ul> <li>free(A) marks R1 as free</li> </ul> | | | | | | (A is dead, No store.) | | | | | | <ul> <li>free(B) marks R2 as free</li> </ul> | | | | | | (B is dead. No store) | | | | | | <ul> <li>allocate(T3) returns R1, associates R1 with T3</li> </ul> | | | | | | Generate code for ADD | | | | | | <ul> <li>mark R1 as dirty</li> </ul> | | | # 10. WRITE(T3) | Code generated Register Map | | Comments | | |-----------------------------|------------------------------|-----------------------------------------------|--| | | (* indicates dirty register) | | | | WRITE(R1) | R1: R2: R3: | <ul><li>ensure(T3) returns R1</li></ul> | | | | | <ul> <li>free(T3) marks R1 as free</li> </ul> | | | | | Generate code for WRITE | | # Summarizing: | | Instruction | Live | Registers | | S | Code | |----|-------------|----------------|-----------|----|----|--------------| | | | | R1 | R2 | R3 | | | 1 | A = B + C | { A, B } | В | A* | | LD B R1 | | | | | | | | LD C R2 | | | | | | | | ADD R1 R2 R2 | | 2 | C = A + B | { A, B, C } | В | A* | C* | ADD R2 R1 R3 | | 3 | T1 = B + | { A, B, C, T1 | T1* | A* | C* | ADD R1 R3 R1 | | | С | } | | | | | | 4 | T2 = T1 + | { A, B, C, T2 | T2* | A* | C* | ADD R1 R3 R1 | | | С | } | | | | | | 5 | D = T2 | { A, B, C, D } | D* | A* | C* | | | 6 | E = A + B | { C, D, E } | D* | E* | | ST R3 C | | | | | | | | LD B R3 | | | | | | | | ADD R2 R3 R2 | | 7 | B = E + D | { B, C, D } | D* | B* | | ADD R2 R1 R2 | | 8 | A = C + D | { A, B } | A* | B* | | LD C R3 | | | | | | | | ADD R3 R1 R1 | | 9 | T3 = A + | { T3 } | T3* | | | ADD R1 R2 R1 | | | В | | | | | | | 10 | WRITE(T3) | { } | | | | WRITE R1 | II. Two ALUs (fully pipelined) and one LD/ST unit (not pipelined) are available. Either of the ALUs can execute ADD (1 cycle). Only one of the ALUs can execute MUL (2 cycles). LDs take up an ALU for 1 cycle and LD/ST unit for two cycles. STs take up an ALU for 1 cycle and LD/ST unit for one cycle. i) Draw reservation tables, ii) DAG for the code shown iii) schedule using height based list scheduling. - 1. LD A R1 - 2. LD B R2 - 3. LD C R3 - 4. LD D R4 - 5. R5 = R1 + R2 - 6. R6 = R5 \* R3 - 7. R7 = R1 + R6 - 8. R8 = R6 + R5 - 9. R9 = R4 + R7 - 10. R10 = R9 + R8 - 11. ST R10 E - 12. ST R7 F #### Ans: (i) #### **Reservation Tables** | ADD | MUL | LD | ST | |----------------------------------|-----------------|-----------------|--------------------| | ALU1 ALU2 LD/ST ALU1 ALU2 LD/ST | ALU1 ALU2 LD/ST | ALU1 ALU2 LD/ST | ALU1 ALU2 LD/ST ✓ | | | | ALU1 ALU2 LD/ST | ALU1 ALU2 LD/ST | ADDs take up either of the ALUs and occupy that ALU for a single cycle. Hence, we show two tables and a single row of occupancy. MULs can execute on only one of the ALUs. Hence, we show a single table. Here, I assume that only ALU1 can execute MUL. Further, MULs take up two cycles to complete. Hence, we show two rows of occupancy. The ALUs are fully pipelined. This means that while ALU1 is executing the second cycle of mul1, it is also available to execute another instruction (mul2 / add1 / ld1 / st1). LDs take up an ALU (either of the ALUs) for one cycle and LD/ST unit for two cycles. Hence, we show two tables and three rows of occupancy. STs take up an ALU (either of the ALUs) for one cycle and LD/ST unit for one cycle. Hence, we show two tables and two rows of occupancy. # (ii) DAG (iii) DAG with heights assigned to nodes (height of a leaf node = latency of that instruction. Height of an interior node = maximum of heights of all children + latency of that instruction) | Cycle<br># | Available<br>Instructions | Scheduled<br>Instructions | Completed<br>Instructions | ALU1 | ALU2 | LD<br>/ST | Comments | |------------|---------------------------|---------------------------|---------------------------|------|------|-----------|--------------------------------------------------------------------------------------------------| | 0 | 1, 2, 3, 4 | 1 | - | 1 | | | 1 and 2 have max height. 1 (picked at random) scheduled. | | 1 | 2, 3, 4 | - | - | | | 1 | | | 2 | 2, 3, 4 | 2 | - | 2 | | 1 | 2 scheduled to utilize available ALUs | | 3 | 3,4 | - | 1 | | | 2 | | | 4 | 3,4 | 3 | - | 3 | | 2 | 3 scheduled to utilize available ALUs | | 5 | 4,5 | 5 | 2 | 5 | | 3 | 5 becomes available as 1 and 2 are complete. 4 has to wait till the last cycle of 3 is executed. | | 6 | 4 | 4 | 5 | 4 | | 3 | 5 finishes. 4 can now be scheduled. | | 7 | 6 | 6 | 3 | 6 | | 4 | 3 finishes. 6 becomes available. Takes two cycles on fully-pipelined ALU1. | | 8 | | | | | | 4 | ALU1 is also executing 6 in its pipeline. | | 9 | 7,8 | 7,8 | 4,6 | 7 | 8 | | 4 and 6 finish. As a result, 7 and 8 become available. Both can be scheduled on different ALUs. | | 10 | 9, 12 | 9, 12 | 7,8 | 9 | 12 | | 7 and 8 finish. As a result, 9 and 12 are available. Both can be scheduled. | | 11 | 10 | 10 | 9 | 10 | | 12 | 9 finishes. 10 becomes available. | | 12 | 11 | 11 | 10,12 | 11 | | | 10 and 12 finish. 11 becomes available. | | 13 | | | | | | 11 | | | 14 | | | 11 | | | | 11 finishes. |