It seems to me there are 10 kinds of people in this server, those who understand and have created CLAs and those who only understand classic Ripple-carry adders.
For this tut I'm going to do a 4-bit CLA, though it should apply to word size. I assume you know how ripple-carry adders work.
Introduction
First some terminology, and a look at why classic Ripple-carry adders are bad:
There are three 4-digit numbers, and I'll call their digits A0..A3, B0..B3, and Q0..Q3 (0 being the least significant). A+B=Q. There is also a carry in C0 and carry out C4. During calculation, intermediate carrys C1..C3 are generated.
Adding goes like this:
For example:
So, the carry out is 1 because the result doesn't fit into a 4-digit number.
As you see, to calculate the carry out C4 you'll have to calculate every digit before it in sequence.[/b]
Going binary
Here we can find a very useful concept that doesn't exist outside of binary: Carry generation and carry propagation.
A carry is said to generate if both An and Bn are 1. In this case, Cn+1 will always be 1, no matter what Cn is. This happens in A0+B0 and A3+B3
A carry is said to propagate if either of An or Bn is 1 but not both. In this case, Cn+1 will be equal to Cn. This happens in A1+B1
In the remaining case, where both An and Bn are 0, the carry is extinguished. This happens in A2+B2.
Using carry-lookahead
With this concept, we can look ahead in time at what a carry will be, and calculate everything in parallel. In the next sections, I'll explain how to calculate them.
For example, assuming carrys have been generated already:
Note that these calculations are calculated 'modulo 10' (⊕) this means that every digit is independent from others and only the least significant digit is kept. So it's not just 10110+1890 on your everyday calculator!
This can even be simplified as:
But in a while I'll show why this isn't efficient in any way.
In binary, modulo 10 is the same as XOR.
Generating carries ahead of time
First we calculate the carry generate and carry propagate lines, P0..P3 and G0..G3:
As you can see, we need the XOR of the inputs to find the carry propagations, P=A⊕B. Hence the simplified version above is inefficient. We already have P=A⊕B, so it would be foolish to calculate Q=A⊕B⊕C when you can calculate Q=P⊕C. We still have to calculate C though.
Let's reflect. When a carry generates, it's 1 per definition. When it propagates it's the same as the previous one. How do we know if a given carry is 1 of 0?
If generate is 1, the next carry becomes 1, or
If propagate is 1 and carry is 1, the next carry becomes 1
Going formal:
Cn+1 = Gn OR (Pn AND Cn)
Expanding this,
As you see, this is still sequential and not an inch better than the classic approach.
Fortunately, we can recursively solve this and then optimize.
Solving recursively
To make it non-recursive, we are only allowed to have P, G and C0 in the input. So you have to fill in the calculated carries with their respective formulas:
Now this might look intimidating. To make it more efficient (2 layers of gates), we have to convert it to either sum-of-products or product-of-sums. As it turns out, sum-of-products wins here:
We can even interpret this in a very intuitive way! Any carry is 1 if:
This is honestly the fastest way to make an adder (save for instant-carry adders which require pistons), though it requires a lot of logic. To save a bit on logic, we can add a 3rd layer by precalculating some repeating elements:
This is just one way out of dozens in which it can be done, and I don't guarantee (even deny) that this is the most efficient way for any practical design. In fact, depending on your design you need to precalculate different things.
The Proper Lookahead Adder (by properenglish) does it in a different way indeed. He even inverts some lines so they can be combined with 0-tick Wired-ORs, and uses the glowstone/slabs trick to prevent signals from propagating back, making all three layers work in just 2 ticks.
As a bonus, here is the complete reverse engineered design from the video above. Note that he doesn't have a carry-in.
For this tut I'm going to do a 4-bit CLA, though it should apply to word size. I assume you know how ripple-carry adders work.
Introduction
First some terminology, and a look at why classic Ripple-carry adders are bad:
There are three 4-digit numbers, and I'll call their digits A0..A3, B0..B3, and Q0..Q3 (0 being the least significant). A+B=Q. There is also a carry in C0 and carry out C4. During calculation, intermediate carrys C1..C3 are generated.
Adding goes like this:
Code:
C3 C2 C1 C0
A3 A2 A1 A0
B3 B2 B1 B0 +
----------------
C4 Q3 Q2 Q1 Q0
For example:
Code:
0 10 110 0110 0110
5432 5432 5432 5432 5432
6468 + 6468 + 6468 + 6468 + 6468 +
------- ------- ------- ------- -------
0 00 900 1900 11900
As you see, to calculate the carry out C4 you'll have to calculate every digit before it in sequence.[/b]
Going binary
Code:
0
1011
1001 +
-------
10100
Here we can find a very useful concept that doesn't exist outside of binary: Carry generation and carry propagation.
A carry is said to generate if both An and Bn are 1. In this case, Cn+1 will always be 1, no matter what Cn is. This happens in A0+B0 and A3+B3
A carry is said to propagate if either of An or Bn is 1 but not both. In this case, Cn+1 will be equal to Cn. This happens in A1+B1
In the remaining case, where both An and Bn are 0, the carry is extinguished. This happens in A2+B2.
Using carry-lookahead
With this concept, we can look ahead in time at what a carry will be, and calculate everything in parallel. In the next sections, I'll explain how to calculate them.
For example, assuming carrys have been generated already:
Code:
5432 10110 (the carries)
6468 ⊕ 1890 ⊕
------- -------
1890 11900
This can even be simplified as:
Code:
10110 (the carries)
5432
6468 ⊕
-------
11900
In binary, modulo 10 is the same as XOR.
Generating carries ahead of time
First we calculate the carry generate and carry propagate lines, P0..P3 and G0..G3:
Code:
G = A AND B
P = A XOR B
As you can see, we need the XOR of the inputs to find the carry propagations, P=A⊕B. Hence the simplified version above is inefficient. We already have P=A⊕B, so it would be foolish to calculate Q=A⊕B⊕C when you can calculate Q=P⊕C. We still have to calculate C though.
Let's reflect. When a carry generates, it's 1 per definition. When it propagates it's the same as the previous one. How do we know if a given carry is 1 of 0?
If generate is 1, the next carry becomes 1, or
If propagate is 1 and carry is 1, the next carry becomes 1
Going formal:
Cn+1 = Gn OR (Pn AND Cn)
Expanding this,
Code:
C1 = G0 OR (P0 AND C0)
C2 = G1 OR (P1 AND C1)
C3 = G2 OR (P2 AND C2)
C4 = G3 OR (P3 AND C3)
As you see, this is still sequential and not an inch better than the classic approach.
Fortunately, we can recursively solve this and then optimize.
Solving recursively
To make it non-recursive, we are only allowed to have P, G and C0 in the input. So you have to fill in the calculated carries with their respective formulas:
Code:
C1 = G0 OR (P0 AND C0)
C2 = G1 OR (P1 AND (G0 OR (P0 AND C0)))
C3 = G2 OR (P2 AND (G1 OR (P1 AND (G0 OR (P0 AND C0)))))
C4 = G3 OR (P3 AND (G2 OR (P2 AND (G1 OR (P1 AND (G0 OR (P0 AND C0)))))))
Now this might look intimidating. To make it more efficient (2 layers of gates), we have to convert it to either sum-of-products or product-of-sums. As it turns out, sum-of-products wins here:
Code:
C1 = G0 OR (P0 AND C0)
C2 = G1 OR (P1 AND G0) OR (P1 AND P0 AND C0)
C3 = G2 OR (P2 AND G1) OR (P2 AND P1 AND G0) OR (P2 AND P1 AND P0 AND C0)
C4 = G3 OR (P3 AND G2) OR (P3 AND P2 AND G1) OR (P3 AND P2 AND P1 AND G0) OR (P3 AND P2 AND P1 AND P0 AND C0)
We can even interpret this in a very intuitive way! Any carry is 1 if:
- It's generated one digit before, or
- It's generated two digits before and propagated once, or
- It's generated three digits before and propagated twice, or
- ...etc.
This is honestly the fastest way to make an adder (save for instant-carry adders which require pistons), though it requires a lot of logic. To save a bit on logic, we can add a 3rd layer by precalculating some repeating elements:
Code:
X = P1 AND P0 AND C0
Y = P1 AND G0
Z = P2 AND G1
W = P3 AND P2
C1 = G0 OR (P0 AND C0)
C2 = G1 OR Y OR X
C3 = G2 OR Z OR (P2 AND Y) OR (P2 AND X)
C4 = G3 OR (P3 AND G2) OR (P3 AND Z) OR (W AND Y) OR (W AND X)
The Proper Lookahead Adder (by properenglish) does it in a different way indeed. He even inverts some lines so they can be combined with 0-tick Wired-ORs, and uses the glowstone/slabs trick to prevent signals from propagating back, making all three layers work in just 2 ticks.
As a bonus, here is the complete reverse engineered design from the video above. Note that he doesn't have a carry-in.