# Line drawing

In this post we are going to take a look at Bresenham’s line drawing algorithm and how the Amiga 500 implements it.

This post starts little differently than previously, since letter XII of the Amiga Machine Code Course only provides code, and is extremely brief on the details! So let’s start with some history 😃.

Bresenham’s line drawing algorithm was invented back in 1962 by Jack Elton Bresenham. His algorithm was one of the earliest in computer graphics, and is still being used today, due to it’s simplicity and speed.

Bresenham was working for IBM, while developing the algorithm. He described his work at the time with these words.

I was working in the computation lab at IBM’s San Jose development lab. A Calcomp plotter had been attached to an IBM 1401 via the 1407 typewriter console. [The algorithm] was in production use by summer 1962, possibly a month or so earlier. Programs in those days were freely exchanged among corporations so Calcomp (Jim Newland and Calvin Hefte) had copies. When I returned to Stanford in Fall 1962, I put a copy in the Stanford comp center library.

The IBM 1401 was announced back in 1959 and was considered an affordable general purpose computer for businesses, that made it very popular in the early 1960s, with several thousand units sold.

In 1959 the system had three units: The 1401 processing unit, the 1402 card reader, and the 1403 text printer. Here’s a promotional picture from back then.

The IBM 1401 did not come with any display. Essentially the IBM 1403 text printer provided the output from running the programs. Output could also be provided using the IBM 1407 typewriter console, and as something new back then, it also made it possible for the user to provider input. However, waiting for that input, could halt the whole system, which in todays money would cost 10 dollars a minute to rent. 😱

Later on the IBM 1401 system was extended with the possibility of connecting a plotter, that made it possible to print lines and figures. This plotter was the CalComp 565, which was also sold as the IBM 1627.

It’s perhaps not surprising that computer graphics had it’s humble beginnings on printed paper - Here’s a video of the CalComp 565 plotter in action ❤️

There’s a video on youtube that shows the IBM 1401 system in action. It was a noisy machine and it was a great achievement to get this old machine up and running again. If you find this stuff interesting, then there’s a page for the revival project over at the Comupter History Musuem. 👍

Bresenham never did patent his algorithm, and it was exchanged freely among peers at the time. In fact a very early version of his code, for the IBM 1440 system, a lower cost version of the IBM 1401, can be found over at bitsavers.org. The code mentions the IBM 1447 - a physical console, which was part of the 1440 system.

Fast forward to the 80’s we have the Amiga 500, that would become one of the most iconic 16-bit home computers of the decade. Underneath the hood of this machine was an implementation of Bresenham’s algorithm, that was not much different than that of the old IBM machines. But in contrast to IBM, the Amiga implemented line drawing in hardware, inside the blitter.

Before we dive into Bresenham’s algorithm, let’s consider the problem of drawing a simple line between two points.

## Let’s Draw a Line

The equation for a line look like this. $$ y = ax + b$$

Now, let’s say the line is drawn between the two points $(x_1, y_1)$ and $(x_2, y_2)$, then we can define the distances $\Delta x = x_2 - x_1$ and $\Delta y = y_2 - y_1$, and calculate the gradient $a$.

$$a = \frac{\Delta y}{\Delta x}$$

And then we solve for $(x_1, y_1)$, and find $b$ $$b = y_1 - \frac{\Delta y}{\Delta x} x_1$$

To keep things simple, we want the gradient to only assume positive values. We do this by adding constraints; let’s say $0 \leq x_1 \lt x_2$ and $0 \leq y_1 \leq y_2$, which gives us a gradient $a \geq 0$. Last, but not least, both $x$ and $y$ should be integers.

Let’s start by making a naive algorithm to draw a line.

```
void naiveLineDraw(int x1, int y1, int x2, int y2) {
int dx = x2 - x1;
int dy = y2 - y1;
float a = (float) dy / dx;
float b = y1 - a * x1;
plot(x1,y1)
while (x1 < x2) {
x1 += 1
y1 = Math.round( a * x1 + b );
plot(x1,y1)
}
}
```

For readability I omitted verious checks - like checking for division by zero or drawing outside the screen. The rounding method rounds to the nearest integer.

The naive algorithm calls a plot function, that delivers an output in pixels, like the one shown below for a line between the points $(1,1)$ and $(11,5)$ with the green squares as pixels.

The $y$ axis points downward, a typical thing in computer graphics, and the pixels are defined by their midpoint. The blue arrow is the *true* line.

Well, that seems to work quite well, so are we done yet?

Not quite, there’s one more problem with the naive algorithm. It has to do with the gradient $a$. When it becomes larger than one, holes starts to appear in our line. E.g. try to use the naive algorithm on the two points $(1,1)$ and $(3,5)$.

To avoid the holes, $y$ must not be incremented by more than $1$, and the only way to ensure this, is by introducing a new constraint on the gradient $a$.

$$0 \leq a \leq 1$$

So, let’s revist our constraints

$$ 0 \leq x_1 < x2 \Rightarrow \Delta x \gt 0$$ $$ 0 \leq y_1 \leq y2 \Rightarrow \Delta y \geq 0 $$ and because $$ 0 \leq a = \frac{\Delta y}{\Delta x} \leq 1$$ it follows that $$ \Delta y \leq \Delta x$$

Now that we have our constraints in place, the next step is to optimize the code.

## Optimizations

The naive algorithm has some issues, especially for old machines, like the IBM 1440, but also for an Amiga 500. The performance hit comes from the division and the multiple multiplications. The use of floating point numbers, are not trivial either, since the 68K CPU does not have an FPU 😱.

We need to get rid of the floating point types, the division, and the multiplications. It all comes together in the rouding function, so let’s start by doing something about that.

```
y1 = Math.round( a * x1 + b );
```

The rounding function rounds the input to the nearest integer. If the fractional part of the input is below $0.5$, it rounds down, otherwise it rounds up.

It’s easy to emulate that behaviour by adding $0.5$ and using the integer part of the result.

```
y1 = (int) (a * x1 + b + 0.5);
```

The cast to an integer indicates that it’s the integer part of the evaluated expression that is stored. It’s basically a truncation, where we strip the result of it’s fractional part.

The above expression has the advantage, that we can come with any $x_1$ and get a $y_1$, but that flexability is not needed. Since $x_1$ is always incremented by one, we can use an incremental calculation instead.

We know that

$$y_n = ax_n + b$$ $$y_{n+1} = ax_{n+1} + b$$

Let’s change the calculation of $y$ to use a difference equation.

$$y_{n+1} - y_{n} = a (x_{n+1} - x_{n})$$

Since each iteration increment $x$ by one, we can further reduce this to

$$y_{n+1} - y_{n} = a$$

And voila, we have a new expression for the next $y$ 👍

$$y_{n+1} = y_{n} + a$$

A nice side effect of the incremental calculation, is that we eliminated the floating point $b$ in the process. Now, let’s see what we can do about $a$.

The rounding effect of $0.5$ is not part of the difference equation, but we must add it somehow. We can’t add it to the initial $y_1$, outside the while loop, because $y_1$ is an integer.

```
y1 += 0.5 + a; // compile error
```

There’s also a problem in the while loop, because $a$ is a floating point.

```
// inside the while loop
y1 += (int) (a); // won't work
```

Updating $y_1$ by truncating $a$ to an integer won’t work. When $a < 1$ the truncated integer value will become zero, and $y_1$ will never increase.

Previously we prevented holes in the line, by only letting $y$ increase by no more than one. Let’s use that, to introduce a floating point accumulator, that we contionously update in the while loop.

```
float accumulator = 0.5 + a;
// inside the while loop
if (accumulator < 1) {
accumulator += a;
} else {
y1 += 1;
accumulator += a - 1
}
```

We know that the *true* line is increased with the gradient $a$, when we increment $x$ by one. We express this in the accumulator, by increasing it with $a$ at every iteration.

When the accumulator becomes larger than one, we know that the *true* line has moved so much, that we can increment $y_1$ with one pixel. And because we do this, we also need to subtract one from the accumulator.

The accumulator can be changed to an integer, without loosing precision. Let’s start by looking at the value of the accumulator, the first time we draw a pixel.

$$\textrm{accumulator} = 0.5 + a = 0.5 + \frac{\Delta y}{\Delta x}$$

We can get rid of the expensive division by multiplying with $2\Delta x$, which yields an equivalent accumulator.

$$\textrm{accumulator} = \Delta x + 2 \Delta y$$

The accumulator is now an integer 😃. We need to update the algorithm also.

```
// multiplying with 2*dx
int accumulator = dx + 2*dy;
// inside the while loop
if (accumulator < 2*dx) {
accumulator += 2*dy;
} else {
y1 += 1;
accumulator += 2*(dy - dx)
}
```

We can make the if-check simpler by subtracting $2\Delta x$ from the initial accumulator.

```
// subtracting 2*dx
int accumulator = 2*dy - dx;
// inside the while loop
if (accumulator < 0) {
accumulator += 2*dy;
} else {
y1 += 1;
accumulator += 2*(dy - dx)
}
```

The floating point numbers are now gone, the division is gone, and the multiplications can be replaced by bit shifting. This has become a usable algorithm! 🚀

The new and better algorithm look like this.

```
void betterLineDraw(int x1, int y1, int x2, int y2) {
int dx = x2 - x1;
int dy = y2 - y1;
plot(x1,y1)
int accumulator = 2*dy - dx;
while (x_1 < x2) {
x1 += 1
if (accumulator < 0) {
accumulator += 2*dy;
} else {
y1 += 1;
accumulator += 2*(dy - dx)
}
plot(x1,y1)
}
}
```

If we try out the better line draw algorithm on a gradient less than zero or larger than one, it won’t work. It’s also quite sensitive to the order of the endpoints. It can draw a line between the points $(1,1)$ and $(11,5)$, but not the other way around. That’s not very usable, so the algorithm has to be extended with the concept of octants.

## The Octants

A correct implementation of Breseneham’s line algorithm, must of course be able to plot lines in all directions. The better line draw algorihthm only works in one region, also known as an octant. An octant is relative to the line starting position and spans an angle of 45 degrees, and it takes eight of them to cover all 360 degrees.

It turns out that if you can draw a line in one octant, then you can draw a line in all directions, due to symetries between octants.

The better line draw algorithm can only draw lines in octant 7, so it must be tweaked a bit to be generalized to all octants.

Try rewriting the better line draw algorithm, so that it works on all octants. When that’s done, try to simplify the algorithm.

The algorithm should end up resembling the Bresenham implementation in WinUAE.

If you’re stuck, then take a look at my linedraw script, that runs in octave. It’s a bit messy though…

## The Amiga Way

Line drawing on the Amiga can be done in two ways: Either in software, by using the CPU, or in hardware, by using the blitter.

The blitter excels at drawing pixels, and is an early precursor for the mordern graphic card. It can draw a million pixels per second, which is an order of magnitude faster than the CPU. The CPU offers more freedom, but the blitter offers more speed.

Under the hood, the blitter uses Bresenham’s line drawing algorithm, which becomes accessible by the build-in line mode. Stick around and I’ll show you.

### Set the Blitter Into Line Mode

The Amiga Hardware Reference Manual gives a good explaination of how the line mode works. There’s even an overview of the register setup, which comes in handy, since there are quite a few things to set. These references should suffice to get started.

The Amiga blitter implementation resembles the *betterLineDraw* algorithm from previous, with the important difference that it has been generalized to all eight octants.

One of the main registers responsible for line mode is BLTCON1. Here’s an overview.

The LINE bit at bit zero sets the blitter into line mode. This changes the meaning of many other registers to something that has to do with line drawing.

### Setting up the octants

Three flags in the BLTCON1 register determines in which octant the line is drawn, relative to it’s starting point. These flags are:

- SUD - Sometimes up or down
- SUL - Sometimes up or left
- AUL - Always up or left

The naming of these flags, relates to the minor and major direction. E.g a line that goes from $(10,10)$ to $(15,8)$ has a major direction of right, and a minor direction of up. Notice that *up* is opposite the $y$-direction.

While drawing the line, the Bresenham algorithm *always* takes a step in the major direction, and *sometimes* a step in the minor direction. This is why the flags are named as they are.

The SUD flag, should be set if the minor direction is up or down. The SUL flag, should be set if the minor direction is up or left. The AUL flag should be set if the major direction is up or left. The three flags uniquely identifies the octant.

The three flags are set in code, by a series of steps. First, evaluate the inequality: $$|\Delta y | \leq |\Delta x|$$

Set the SUD flag, if the result is true.

If true, then the major direction is along the $x$-axis. If false, then swap $\Delta x$ and $\Delta y$, to ensure that the gradient $a = \frac{\Delta y}{\Delta x}$ never is larger than one.

After this step, $\Delta x$ will be a value along the *major direction*, and $\Delta y$ will be a value along the *minor direction*.

Next, evaluate the minor direction:

$$\Delta y < 0$$

Set the SUL flag, if the result is true, and make $\Delta y$ positive.

Last, evaluate the major direction:

$$\Delta x < 0$$

Set the AUL flag, if the result is true and make $\Delta x$ positive.

Example of setting the octant flags:

A line is drawn between $p_1(0,0)$ and $p_2(-2,10)$ and belongs to octant 5.

$$\Delta x = -2, \textrm{ } \Delta y = 10$$

The major direction is found along the $y$-direction.

$$|\Delta y | \leq |\Delta x| \Rightarrow 10 < 2 \Rightarrow \textrm{false}$$

The SUD flag is set to 0, and $\Delta x$ and $\Delta y$ are swapped. By doing this, $\Delta x$ always becomes the major direction, and $\Delta y$ always becomes the minor direction.

$$\Delta x = 10, \textrm{ } \Delta y = -2$$

Next, evaluate the minor direction.

$$\Delta y < 0 \Rightarrow -2 < 0 \Rightarrow \textrm{true}$$

The SUL flag is set to 1. Also remember to make $\Delta y$ positive.

$$\Delta y = 2$$

Finally we evaluate the major direction.

$$\Delta x < 0 \Rightarrow 10 < 0 \Rightarrow \textrm{false}$$

The AUL flag is set to 0. Since $\Delta x$ is already positive, there is nothing more to do.

So to recap, for octant 5 we have:

- SUD = 0
- SUL = 1
- AUL = 0

Try to use the same approach to find the flags for all the octants.

Another way to look at these steps, is that they have the effect of “normalizing” the line into octant 7.

The next part of the line setup assumes that $\Delta x$ and $\Delta y$ have been swapped, if needed, so that $\Delta x$ is a value along the major direction.

### Setting up the accumulator

Three registers are needed to setup the accumulator: The initial value, and the increment and decrement values, by which it’s updated.

*betterLineDraw*algorithm.

Description | Register | Address | Expression |
---|---|---|---|

initial accumulator value | BLTAPTL | $\$DFF052$ | $4\Delta y-2\Delta x$ |

accumulator increment without minor step | BTLBMOD | $\$DFF062$ | $4\Delta y$ |

accumulator decrement with minor step | BLTAMOD | $\$DFF064$ | $4(\Delta y - \Delta x)$ |

All the accumulator expressions are multiplied by two, when compared to the *betterLineDraw* alogrithm. We have to do this, because the
BLTAPTL register only hold even numbers, since it doubles as an address pointer register.

If an uneven number is written to the accumulator registers, it will be trunctated to a smaller even number. This will introduce an error into the accumulator, which we avoid by doubling the expressions. The doubeling is just a scaling factor, that will not change the way the algorithm works.

Because the accumulator registers are unsigned, we have to communicate the sign of the initial accumulator value. If it’s negative, we set the SIGN bit in BLTCON1.

### Setting Up the Texture

The line texture pattern is set by preloading BLTBDAT with a 16 bit texture value.

The BLTCON1 register bits 15-12 holds the start bit for the line texture, where 0 sets the start to the least significant bit of the texture given in BLTBDAT.

The texture pattern repeats itself using a circular shift to the right, where the texture start in BLTCON1 denotes the rotate count of the rotation.

To draw a solid line, set BLTBDAT to $\$FFFF$. By changing this value, a wide range of dotted lines can be drawn.

Example. We set BLTBDAT to hold the texture $\$95c0$ and set the start texture bit in BLTCON1 to $\$1$.

The texture start shift, works the same way as the ROR operation in 68K assembly. It takes the texture and shift it to the right. When the texture start is set to zero, the rotate count is one, which might be surprising. Well, it surprised me 😃

When setting the texture start to $\$1$, the texture is shifted two bits to the right, and those bits that rotate out of the low-order bit go back in at the high-order bit.

$
\begin{split}
Texture &= \$95c0 &= \%1001\ 0101\ 1100\ 0000 \\\

ror(2) &= \$2570 &= \%0010\ 0101\ 0111\ 0000
\end{split}
$

One way to think about the texture, is that it’s painted to the source channel $B$, inside the blitter, and is an input into the blitter logic function.

It’s possible to use different texture start values per octant. Here’s an example of the texture painted in all eight octants, where the texture start is set to zero for all but octant 7, where it’s set to one.

Notice how the pixel at the start point of the line (the origo), has no texture bit set. That’s because our texture ends on zeros, and the texture start is either zero, for octants 0 to 6, or one for octant 7. Had the texture start been e.g. $\$f$, then the texture bit would be set at the start point of the line.

The texture can be seen as mask that blocks or adds line drawing in certain areas. We will dive into this later, when we look at the logic function and it’s minterms.

### Fills

In the BLTCON1 register we also find the SING flag. When set, the blitter will only output one single pixel pr. raster line. This is very usefull, when doing fills, like filling a polygon. Exiting stuff for sure, but it have to wait for another post. 🚀

### Setting the Starting Point

The next important register for line draw, is BLTCCON0. This register sets the starting point and the logic function for the line draw. Here’s an overview.

The four most significant bits are devoted to defining a start point for the line as an offset to a memory address.

The bitplanes are like other memory only addressable on word boundaries i.e. every second byte. We use BLTCPT and BLTDPT to communicate the word that contains the first pixel of the line. According to the register setup, both registers should be set to the same value.

The four start bits defined at bit 12 to 15 in BLTCCON0, provides an offset from 0 to 15 pixels, from the address word boundaries, so that a line can start at an arbitrary place on the screen. Those four start bits are set to the four lowest bits of the starting point $x_1$.

The first pixel is written to the address pointed to by BLTDPT, all following pixels are written to the address pointed to by BLTCPT.

This little trick could come in handy, if the first pixel needs to be skipped for some reason. Source.

### Brush Definition

The line that the blitter draws, is set by the brush definition in the BLTADAT register. The brush is 32 bits wide and if a bit is set, then a pixel is drawn.

According to the documentation, the Amiga 500 only supports a brush of one pixel, which means that BLTADAT should be preloaded with $\$8000$. Any other value is not supported.

What happens if we try something undocumented like setting BLTADAT to e.g. $\$8888$? 😈

Well, then this happens, with the logic function $D = A$.

I would have expected four lines, since four pixels are set in the brush. But the staircase effect shows that something is not right.

If we change the logic function to $D = \overline A$, we see the areas that the blitter is working on.

Yup, something is wrong. The blitter is word based, which is seen by the width of the areas, that are 16 bits. However, the height of the areas have been too tightly optimized for a brush that is one pixel. Source.

### Blitter Channel Setup

The blitter can take input from three memory channels $A$, $B$, and $C$, and produce an output to the destination memory channel $D$. Setting up those channels are done by using bits 11 to 8 in the BLTCCON0 register.

The channel bits corresponds to USEA, USEB, USEC and USED and are grayed out, because they are supposed to be fixed in line mode. Setting them to anything else, will produce undocumented behaviour.

What happens on an Amiga 500, if you don’t obey the documented blitter channel setup? 😈

#### Clearing bit USEA:

This disables source channel $A$, which stops the error term accumulator in BLTAPTL from updating. The effect is that only lines along the octant dividers are drawn, because the acccumulator keeps it’s initial value, through the whole blit.

#### Setting bit USEB:

This should enable source channel $B$, but since line draw does not use BLTBPT, this has no effect.

#### Clearing bit USEC:

This disables source channel $C$. The effect is that no lines are drawn.

#### Clearing bit USED:

I expected would disable destination channel $D$ and thereby any output from the blitter. To my surprise that didn’t happen - line draw works OK without the bit set.

This was inspired by the post: Mysterious world of blitter line mode.

### The Function Generator

The blitter function generator can take input from three source memory channels $A$, $B$, and $C$, which are combined in the function generator to produce an output at the destination memory channel $D$.

The source channels $A$, $B$, and $C$ are combined using a boolean function inside the function generator, by using so called minterms. Those minterms are set by the lower byte in BLTCCON0.

The function generator and its minterms are unafected of line mode.

However, line mode introduces a slight redefinition of the source channels. The $A$ channel represents the line, the $B$ channel represents the pattern, and finally the $C$ channel represents the original source.

Let’s take a look at an example.

We draw a line from $p_1(106,100)$ to $p_2(116,121)$ with a pattern of $\$f0f0$. The patterned line should be drawn upon a stribed bitplane filled with $\$aaaa$ words.

The illustrations below show what happens when minterms are applied. The bounding boxes show where the blitter operates in memory.

When the blitter is in line-mode it operates on serveral bounding boxes, that have been adapted to fit closely to the line. Otherwise it operates on one bounding box.

Ok let’s start with the first illustration.

Here the $A$ channel holds the bits for the line drawn by the blitter, and the $B$ channel holds the pattern. Notice the one pixel white line at the top of the pattern. Apparently, when the start texture bit is set to zero, the pattern will be drawn shifted one bit to the right.

We combine $A$ and $B$ using a minterm that represents the expression

$$AB = D$$

The result is the output $D$, which is a line with a pattern.

Now we move on to the second illustration.

Again we have $A$ channel holding the bits for the line drawn by the blitter, but this time inverted by using a different minterm! The $C$ channel contains the original source, which is taken from the bounding box area of the bitplane filled with $\$aaaa$ words.

Finally, we combine $\overline{A}$ and $C$ using a minterm that represents the expression

$$\overline{A}C = D$$

The resulting output $D$ contains the background source and a negative line.

The third and final illustration, shows more combinatorial fun with the source channels 😃👍.

Here we take the examples from the previous two illustrations and combine them. We calculate a single minterm that represents the expression $$AB + \overline{A}C = D$$ The result is a line with a pattern drawn onto a background from the original source $C$, which is the part of memory pointed to by BLTCPT and BLTDPT. In this example it’s a bitplane filled with $\$aaaa$ words.

It might seem strange that we have to set both BLTCPT and BLTDPT, but that’s because they can point at different parts of memory!

BLTDPT determines where the first pixel of the line goes, while the following pixels are writen to the address pointed to by BLTCPT. This is very handy when doing polygon fills - but that’s for another post 😃.

## More Than Just Line Drawing

When I first started looking at Bresenham’s algorihtm, I thought it only could be used for line drawing. But it turns out it applies to a much wider scope, even things I would never have thought of. Here’s a couple of examples.

Bresenham’s algorithm can be used to do
evenly-spaced sampling. Let’s say we want to pick 7 evenly-spaced samples from an array containing the integers from 0 to 999. Bresenham algorithm gives us something distributed *as evenly as possible*.

$$[0, 167, 333, 500, 666, 833, 999] $$

Another use-case for Bresenham’s algorithm is to make smooth color fades. Here’s an example from the game Petris.

We have covered a lot of grounds in this post, and it’s time to take a little break. In the next post we are going to look at an Amiga assembler program that does some line drawing.

Stay tuned! 😃🚀

Previous post: Amiga Machine Code Letter XII- Horizontal Sine Shifting