Amiga Machine Code Letter X - Memory

Jul 4, 2019

Amiga Machine Code - Letter X

The machine code course is nearing its end, and now it’s time to look at the libraries and operating system of the Amiga. This cathedral of software is actually rather beautiful, and it placed a multitasking system in the hands of the hobby user back in the late 1980s. Multitasking, which previously belonged to the realm of expensive Unix machines, was now entering the living room.

The next series of posts will revolve around Letter X, and all the source code can be found on Disk1. We are going to take a look at memory allocation, at reading and writing files, at how to use the command line interface (CLI) to read arguments from the user, and at using the floppy drive for reading and writing.

Working on these posts has left me with a few eye-openers that have given me much more insight into how computers work - even the modern ones 😃.

In the previous post we looked at interrupts and allocated memory for a small program that was called each time the keyboard interrupt was received. In this post we are going to look a bit more at memory allocation.

Finding the Documentation

Before we dive in, a little word is needed about the documentation. There are multiple sources out on the web. One of the best complete collections I have found is over at Archive.org. The wealth of magazines, disks and books is just incredible. If you, like me, find this work important, then please consider a donation!

I have relied heavily upon the book Mapping the Amiga for information about the library functions that came with Kickstart 1.3. Kickstart is the ROM that contains the core Amiga libraries, Exec among them.

I have also found a great online reference for the include files and autodocs, over at amigadev.elowar.com. This documentation is very rich, but at first I could not find the vital offsets for the library functions in it, so I resorted to those provided by Mapping the Amiga.

EDIT: Found the offsets in the includes and autodocs 😃

The documentation is vital, since Letter X in the Machine Code Course is very brief about the offsets, and also about how to build the structures needed for the more advanced stuff. I think this is on purpose, so as not to overwhelm the reader.

However, I am not afraid of overwhelming anybody, so I will try to give an explanation for all the magic numbers that are littered gratuitously around the code on Disk1 😜.

Allocating and Freeing Memory

The Amiga OS comes with a memory manager that’s accessed via the Exec library. The OS is a so-called multitasking system, where different tasks hold different portions of memory. If a task just used memory as it pleased, without allocating it through the memory manager, bad things could happen, like overwriting memory belonging to other tasks. That’s why we use a memory manager.

In the following, we will make extensive use of calling library functions, so if you don’t know how that’s done, take a look at the previous post.

The next piece of code takes a closer look at how to use the Exec memory routines to allocate and deallocate memory. You can find the mc1001 program on Disk1.

The program is a parade of different memory subroutines that wrap around the Amiga Exec library functions AllocMem and FreeMem.

Before we start, let’s take a closer look at their calling syntax. These are from the book Mapping the Amiga. Especially notice the provided offset, which is essential for jumping to the routine once we know the library base pointer.

AllocMem is used for allocating memory, given a size and attributes as input. The attributes control whether the memory is allocated in chip or fast RAM. If nothing is stated explicitly, the function will try fast RAM first and then move on to chip RAM, which is also what the allocdef subroutine further down relies on.

The function searches the memory for a “hole” of unused memory that is large enough to hold the requested size. The pointer to the memory block is returned in d0. If no memory was found, d0 will contain zero - so remember to check for that!

Also be aware that memory has to be allocated in chip memory if the co-processors should be able to use it.
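The “hole” search can be pictured with a toy model. The sketch below is not Exec's code and Python is only used as a notebook here; it assumes a simple first-fit search over a free list, and the class and method names are made up for illustration. It does mirror the convention described above of returning zero when nothing fits.

```python
# Toy model of a first-fit search over a free list.
# ToyHeap and alloc are hypothetical names, not part of any Amiga API.

class ToyHeap:
    def __init__(self, start, size):
        # Free list: (address, length) pairs, kept in address order.
        self.free = [(start, size)]

    def alloc(self, size):
        """Return the address of the first hole that fits, or 0 if none does."""
        for i, (addr, length) in enumerate(self.free):
            if length >= size:
                if length == size:
                    del self.free[i]                     # hole is used up
                else:
                    self.free[i] = (addr + size, length - size)  # shrink hole
                return addr
        return 0                                         # like d0 = 0


heap = ToyHeap(start=0x1000, size=64)
first = heap.alloc(48)     # fits in the single 64-byte hole
second = heap.alloc(32)    # only 16 bytes are left, so this one fails
```

The second request fails even though the heap started out with 64 bytes, which is exactly the situation the zero check after AllocMem is there to catch.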

AllocMem
Description: allocates memory
Library:     exec.library
Offset:      -$C6 (-198)
Syntax:      memoryBlock = AllocMem(byteSize, attributes)
ML:          d0 = AllocMem(d0,d1)
Arguments:   byteSize = number of bytes required
             attributes = type of memory
             MEMF_ANY      ($00000000)
             MEMF_PUBLIC   ($00000001)
             MEMF_CHIP     ($00000002)
             MEMF_FAST     ($00000004)
             MEMF_LOCAL    ($00000100)
             MEMF_24BITDMA ($00000200)
             MEMF_CLEAR    ($00010000)
Result:      memoryBlock = allocated memory block
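The attributes are single bits that can be OR'ed together into one longword. As a quick sanity check, here is a small Python sketch (used only as a calculator) that decodes an attributes value back into flag names. The constants are copied from the table above; the function name describe is made up for this example.

```python
# Attribute bits copied from the AllocMem table above.
MEMF_PUBLIC = 0x00000001
MEMF_CHIP   = 0x00000002
MEMF_FAST   = 0x00000004
MEMF_CLEAR  = 0x00010000

FLAG_NAMES = {
    MEMF_PUBLIC: "MEMF_PUBLIC",
    MEMF_CHIP: "MEMF_CHIP",
    MEMF_FAST: "MEMF_FAST",
    MEMF_CLEAR: "MEMF_CLEAR",
}

def describe(attributes):
    """List the names of the flags set in an attributes longword."""
    if attributes == 0:
        return ["MEMF_ANY"]
    return [name for bit, name in FLAG_NAMES.items() if attributes & bit]

chip_and_clear = MEMF_CHIP | MEMF_CLEAR   # the value $10002
```

So a magic number like $10002 is simply MEMF_CHIP and MEMF_CLEAR in the same longword.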

When allocating memory, it’s also important to do some cleanup. Every task that allocates memory needs to deallocate it again when it is done, or else the task will leak memory.

The caller deallocates memory by calling FreeMem and providing a pointer to a memory block and a size.

FreeMem
Description: deallocates memory
Library:     exec.library
Offset:      -$D2 (-210)
Syntax:      FreeMem(memoryBlock, byteSize)
ML:          FreeMem(a1,d0)
Arguments:   memoryBlock = the memory block to free
             byteSize = the size of the desired block in bytes;
             this will be rounded to a multiple of the system memory chunk size
Result:      none
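The rounding mentioned in the arguments is plain bit masking. The sketch below assumes a chunk size of 8 bytes, which is what I recall MEM_BLOCKSIZE being in the exec/memory.h include; treat that number as an assumption and check the include file for your system. The function name is made up for this example.

```python
MEM_BLOCKSIZE = 8                   # assumed chunk size in bytes
MEM_BLOCKMASK = MEM_BLOCKSIZE - 1   # low bits that must end up as zero

def round_to_chunk(byte_size):
    """Round a byte count up to the next multiple of the chunk size."""
    return (byte_size + MEM_BLOCKMASK) & ~MEM_BLOCKMASK
```

Under that assumption a request for 1 byte really occupies 8, while the 100000 bytes used by mc1001 are already a multiple of 8 and stay unchanged.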

You can actually pass FreeMem a pointer to an arbitrary memory location, since the Amiga 500 does not have hardware memory partitioning. Compared to modern computers and operating systems, this tastes a bit like the wild west 💥.

The program mc1001 is given below, with my comments added. The gist of the program is just to provide you with some easy-to-use subroutines that will come in handy in your own programs.

For each subroutine, we start by pushing the contents of the registers onto the stack. Then, just before returning with RTS, we pop the contents back into the registers to reestablish the program state from before the subroutine call.

We use a machine code instruction called MOVEM to move multiple registers to and from the stack.
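The save and restore pattern can be modelled in a few lines of Python. This is only a sketch of the idea, not of how MOVEM is implemented, and every name in it is made up. It also shows why a result register has to be left out of the saved list: if d0 were saved and restored, the return value would be overwritten again.

```python
def call_subroutine(registers, saved, body):
    """Run body(registers), then restore every register named in saved."""
    stack = [registers[name] for name in saved]   # like movem.l ...,-(a7)
    body(registers)
    for name, value in zip(saved, stack):         # like movem.l (a7)+,...
        registers[name] = value
    return registers

def fake_alloc(registers):
    # Stand-in for a library call: trashes d1 and leaves a result in d0.
    registers["d1"] = 0xDEADBEEF
    registers["d0"] = 0x00C01234

regs = {"d0": 100000, "d1": 7}
call_subroutine(regs, saved=["d1"], body=fake_alloc)
```

After the call d1 is back to its old value, while d0 carries the result out of the subroutine.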

The program is very simple; it allocates memory and then frees it. There is also some basic error handling for the situation where memory can’t be allocated.

move.l	#100000,d0 ; set d0 input to allocchip to 100000 bytes
bsr	allocchip  ; branch to subroutine allocchip

cmp.l	#0,d0      ; compare output from allocchip with 0
beq	nomem      ; if 0 goto nomem (could not allocate memory)

lea.l	buffer,a0  ; put address of buffer into a0
move.l	d0,(a0)    ; store d0 (pointer to allocated memory) into the address in a0

move.l	#100000,d0 ; set d0 input to freemem to 100000 bytes (the size we allocated)
lea.l	buffer,a0  ; put address of buffer into a0
move.l	(a0),a0    ; put the pointer to the allocated memory into a0
bsr	freemem    ; branch to subroutine freemem to free the allocated memory
rts                ; return from subroutine

nomem:
rts                ; return from subroutine

buffer:
dc.l	0          ; buffer for holding a pointer to allocated memory


allocdef:                   ; subroutine for allocating memory - first fast then chip. ML: d0 = allocdef(d0).
movem.l	d1-d7/a0-a6,-(a7)   ; push registers on the stack
moveq	#1,d1               ; trick to quickly get $10000
swap	d1                  ; d1 is now MEMF_CLEAR (initialize memory to all zeros)
move.l	$4,a6               ; fetch base pointer for exec.library
jsr	-198(a6)            ; call AllocMem. d0 = AllocMem(d0,d1)
movem.l	(a7)+,d1-d7/a0-a6   ; pop registers from the stack
rts                         ; return from subroutine

allocchip:                  ; subroutine for allocating chip memory. ML: d0 = allocchip(d0).
movem.l	d1-d7/a0-a6,-(a7)   ; push registers on the stack
move.l	#$10002,d1          ; set d1 to MEMF_CHIP + MEMF_CLEAR
move.l	$4,a6               ; fetch base pointer for exec.library
jsr	-198(a6)            ; call AllocMem. d0 = AllocMem(d0,d1)
movem.l	(a7)+,d1-d7/a0-a6   ; pop registers from the stack
rts                         ; return from subroutine

allocfast:                  ; subroutine for allocating fast memory. ML: d0 = allocfast(d0).
movem.l	d1-d7/a0-a6,-(a7)   ; push registers on the stack
move.l	#$10004,d1          ; set d1 to MEMF_FAST + MEMF_CLEAR
move.l	$4,a6               ; fetch base pointer for exec.library
jsr	-198(a6)            ; call AllocMem. d0 = AllocMem(d0,d1)
movem.l	(a7)+,d1-d7/a0-a6   ; pop registers from the stack
rts                         ; return from subroutine

freemem:                    ; subroutine for deallocating. ML: freemem(a0,d0).
movem.l	d0-d7/a0-a6,-(a7)   ; push registers on the stack
move.l	a0,a1               ; set a1 to the memory block to free
move.l	$4,a6               ; fetch base pointer for exec.library
jsr	-210(a6)            ; call FreeMem. FreeMem(a1,d0)
movem.l	(a7)+,d0-d7/a0-a6   ; pop registers from the stack
rts                         ; return from subroutine
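The moveq and swap pair in allocdef deserves a closer look. Below is a small Python model of the two instructions, written from my understanding of them: MOVEQ sign-extends an 8-bit immediate to 32 bits, and SWAP exchanges the two 16-bit halves of a data register. The function names simply borrow the mnemonics.

```python
def moveq(immediate):
    """MOVEQ: sign-extend an 8-bit immediate to a 32-bit register value."""
    value = immediate & 0xFF
    if value & 0x80:
        value -= 0x100
    return value & 0xFFFFFFFF

def swap(register):
    """SWAP: exchange the upper and lower 16-bit halves of a 32-bit value."""
    return ((register << 16) | (register >> 16)) & 0xFFFFFFFF

d1 = swap(moveq(1))   # $00000001 becomes $00010000, which is MEMF_CLEAR
```

As far as I know the point of the trick is size: the two instructions take one word each, while a move.l with a 32-bit immediate needs three words.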

Here’s some not very interesting documentation of the subroutines.

allocdef
Description: allocate memory and initialize it to zero 
             tries fast memory first and then chip memory
Syntax:      memoryBlock = allocdef(byteSize)
ML:          d0 = allocdef(d0)
Arguments:   byteSize = the size of the desired block in bytes;
             this will be rounded to a multiple of the system memory chunk size
Result:      memoryBlock = allocated memory block

allocchip
Description: allocates chip memory and initializes it to zero
Syntax:      memoryBlock = allocchip(byteSize)
ML:          d0 = allocchip(d0)
Arguments:   byteSize = the size of the desired block in bytes;
             this will be rounded to a multiple of the system memory chunk size
Result:      memoryBlock = allocated memory block

allocfast
Description: allocates fast memory and initializes it to zero
Syntax:      memoryBlock = allocfast(byteSize)
ML:          d0 = allocfast(d0)
Arguments:   byteSize = the size of the desired block in bytes;
             this will be rounded to a multiple of the system memory chunk size
Result:      memoryBlock = allocated memory block

freemem
Description: deallocates memory
Syntax:      freemem(memoryBlock, byteSize)
ML:          freemem(a0,d0)
Arguments:   memoryBlock = the memory block to free
             byteSize = the size of the desired block in bytes;
             this will be rounded to a multiple of the system memory chunk size
Result:      none

The subroutines’ calling syntax is very similar to that of the library functions, and the wrappers add some overhead each time we push and pop the stack. It’s hard to see the benefit of the subroutines, besides the slight gain in simplicity.

Wild West

Back in the text above, I called FreeMem a bit of a wild west function, since we could just deallocate arbitrary memory locations. You can try this for yourself by rewriting the mc1001 program a bit. Here I added a displacement of 1000 to the load through a0, so a0 ends up holding whatever longword lies 1000 bytes past buffer instead of the pointer AllocMem returned, and we free memory we aren’t supposed to.

move.l	1000(a0),a0 ; load the longword 1000 bytes past buffer into a0 (a bogus pointer)
bsr	freemem         ; branch to subroutine freemem, which now frees the bogus address

I’ve altered the program, and when I ran it I got the following Guru Meditation. In other words, I was allowed to wreak havoc on the system.

Guru Meditation

In the Guide to Guru Meditation Error Codes the format is given as:

Subsystem ID	General Error	Specific Error	Address of task
81	00	0005	48454C50

The error code that I received is a specific alert code in the Exec library that indicates a corrupted memory list. A very fitting error in this case. The last part could be the address of a task, but if the cause is not known, it will just display “HELP”, which in ASCII is 48 45 4C 50. Use FreeMem with care!
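Splitting the number into the fields from the table is easy to do by hand, but here is a small Python sketch that does it. It assumes the alert is written as two 8-digit hex numbers separated by a dot, which is how I remember the Guru Meditation box showing it; the function name and the dictionary keys are made up for this example.

```python
def decode_guru(text):
    """Split an 'XXXXXXXX.YYYYYYYY' Guru Meditation string into its fields."""
    code_text, address_text = text.split(".")
    code = int(code_text, 16)
    return {
        "subsystem": (code >> 24) & 0xFF,   # first byte of the alert code
        "general": (code >> 16) & 0xFF,     # second byte
        "specific": code & 0xFFFF,          # last two bytes
        "address": int(address_text, 16),
        # Interpret the four address bytes as text, in case it spells a word.
        "address_ascii": bytes.fromhex(address_text).decode("ascii", "replace"),
    }

guru = decode_guru("81000005.48454C50")
```

For the alert above this gives subsystem $81, general error $00, specific error $0005, and an address that reads “HELP” when taken as ASCII.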

Upward compatibility

Disclaimer: This section is a bit sketchy - I would love to get feedback. Take the following with a grain of salt.

The mc1001 program contained three subroutines for allocating memory. All of them seem to use attributes that leave out the flag MEMF_PUBLIC, which according to the autodocs does the following:

Memory that must not be mapped, swapped, or otherwise made non-addressable. ALL MEMORY THAT IS REFERENCED VIA INTERRUPTS AND/OR BY OTHER TASKS MUST BE EITHER PUBLIC OR LOCKED INTO MEMORY! This includes both code and data.

Notice the shouting, this is obviously important! Perhaps one reason for not using MEMF_PUBLIC in mc1001 is that the memory is only meant to be referenced within the task itself, and not to be shared with others? However, when looking at the mc0901 program in the previous post, we see the exact same exclusion of MEMF_PUBLIC. This is even more puzzling, since it’s a program that allocates memory for another program that is referenced via an interrupt. It fits the use case of MEMF_PUBLIC perfectly.

As I understand it, the memory allocated in mc0901 is “locked” into memory, i.e. the operating system is not allowed to swap it to disk or otherwise move it around to optimize the memory layout. However, this locking is not done explicitly, which is a bit “secret”. I must admit that my knowledge about the Amiga memory system in Exec is limited. A modern OS will almost never expose programs to a physical address, but will use a memory map, so that memory usage can be optimized in many kinds of ways.
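That none of the three mc1001 subroutines ask for MEMF_PUBLIC can be read straight off the attribute values they load into d1. Here is a quick Python check, with the values copied from the listing and the bit value taken from the AllocMem table; the helper name requests_public is made up.

```python
MEMF_PUBLIC = 0x00000001   # bit 0, from the AllocMem table

# Attribute values loaded into d1 by the three mc1001 subroutines.
MC1001_ATTRIBUTES = {
    "allocdef": 0x10000,
    "allocchip": 0x10002,
    "allocfast": 0x10004,
}

def requests_public(attributes):
    """True if the MEMF_PUBLIC bit is set in an attributes longword."""
    return bool(attributes & MEMF_PUBLIC)

public_users = [name for name, attr in MC1001_ATTRIBUTES.items()
                if requests_public(attr)]
```

The list comes out empty. Asking for public chip memory would mean setting bit 0 as well, for example $10003 instead of $10002.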

What led me down this rabbit hole, was the otherwise excellent book The Kickstart Guide to the AMIGA, which I will come back to in a future post. It has the following to say about MEMF_PUBLIC.

… data structures (such as messages) which are going to be accessed by more than one task should be AllocMem’d MEMF_PUBLIC - this is for upward compatibility with any future products which may support hardware memory partitioning.

Disregard the comment “such as messages” for now. What this quote says is that we should use MEMF_PUBLIC, because some future Amiga product may support partitioned allocation. Code executing in one partition cannot modify memory outside its partition, unless that memory is MEMF_PUBLIC. This would actually be a great enforcement of code quality, since FreeMem would not be allowed to roam free, as we have seen above.

Memory Systems

Even to this day AmigaOS continues to be upgraded, so we must do a bit of time travel to figure out what is up with MEMF_PUBLIC - or rather, what was up with MEMF_PUBLIC.

Today, we have an AmigaOS which, as of this writing, is at version 4.1. It is developed by Hyperion Entertainment.

For this post we will disregard Exec’s modern memory allocation system, and focus on Exec’s legacy memory allocation system, which was used prior to AmigaOS 4.0.

Here’s what is said about the now obsolete MEMF_PUBLIC, which I guess is the correct documentation for Exec, when Kickstart 1.3 ruled.

MEMF_PUBLIC is about one of the most misused features of AmigaOS. MEMF_PUBLIC was more or less described as “memory that cannot be swapped out, moved, or otherwise made unavailable.” Unfortunately, this more or less applied to any memory in the classic AmigaOS. Therefore, many people just added the MEMF_PUBLIC flag to just about any allocation.

MEMF_PUBLIC assumes that a memory block is allocated that cannot be physically moved around, is contiguous and will not be swapped. If you look at it this way, these requirements are almost always unnecessary. A normal application does not need to care about the physical address of a memory block nor will it have to think about the block not being contiguous as long as the virtual addresses are. The only requirement an application may have is to pin a block of memory, preventing it from being swapped out, for performance reasons. Swapped out memory might take a much longer time to be made available which can be an issue depending on the application.

MEMF_PUBLIC, due to its design, implies that the block of memory is available to all tasks and entities in the system. This is important for sending messages, for example. At the moment, there is nothing in the system that actually prevents access to another task’s memory but semantics dictate that messages should be globally shared and a future memory system will enforce these semantics.

So what I gather from all of this is that when the book The Kickstart Guide to the AMIGA mentions that we should be explicit about MEMF_PUBLIC, it’s because there is an expectation that one day an Amiga product will arrive that actually enforces memory partitioning, and then MEMF_PUBLIC will make sense. When that day arrives, your programs will not need to be rewritten, since you already use MEMF_PUBLIC.

From the quote above, what actually happened was that many used MEMF_PUBLIC without regard to the actual use case. This is bad, since locked memory will give memory optimization algorithms a hard time - especially if it is not needed, and used extensively.

To recap: as I see it, the mc1001 program allocates memory correctly, since the memory is intended to be consumed by the program itself.

However, the mc0901 program makes an error by omitting MEMF_PUBLIC, because it needs to allocate memory that must be locked to an address, since that address is placed in the interrupt vector for the keyboard interrupt. The program will work on Kickstart 1.3, but is not future compatible, and will need to be rewritten when an AmigaOS with memory partitioning arrives.

Bonus Material

If you are keen to learn more about memory management, there’s a series of excellent videos about the subject over at Jacob Schrum’s YouTube channel. It is not Amiga specific, but general computer science, explained on a whiteboard. There are many videos, but I can especially recommend the following:

The next post is going to be about reading and writing files. Stay tuned 😃.

Amiga Machine Code Course

Previous post: Amiga Machine Code Letter IX - Interrupts.

Next post: Amiga Machine Code Letter X - Files.