Due to strong similarities between the SA-1100 and SA-1110 microprocessors, a great deal of usable code was already available in ARM Linux by the time Assabet[Ass] was released. The initial porting activity, which was undertaken by several parallel groups, involved only a few changes to the memory setup and serial port configuration. The initial Assabet patch release by the Wearable Group occurred on 13 April, 2000, for kernel version 22.214.171.124.
In subsequent months, a number of additional features have been added to the kernel which improve support for Assabet. The Neponset[Nep] board is now operational, with its SA-1111 companion processor. PCMCIA[PCM] implementations for both the Assabet CompactFlash[Com] slot and the Neponset PCMCIA and CF slots are available. Drivers for a CompactFlash ethernet device, IDE storage devices such as the IBM Microdrive[Mic], and the Lucent WaveLAN/IEEE 802.11[Wav] interface are all functional.
The next section describes the specific contributions to ARM Linux made by the Wearable Group to improve Assabet support in the kernel. Following this is a section describing work on ARM Linux performed by other groups external to Carnegie Mellon involving Assabet.
Development on Assabet at Carnegie Mellon follows more than a year of work on another StrongARM-based platform, Itsy. In addition to low-power research on the standalone Itsy, which was developed at the Digital Equipment Corporation Western Research Laboratory (DEC WRL, now Compaq WRL), more recent work has centered around an enhanced version of the Itsy which includes a higher-capacity lithium ion battery and dual PCMCIA slots. This platform, known as the Itsy/Cue[Itsb], has been used for research in the areas of audio interfaces and mobile wireless networking.
The introduction of the PCMCIA slots to the Itsy hardware presented an interesting challenge in terms of software support. The official kernel release for Itsy at the time of this work was a derivative of Linux 2.0.30 which was somewhat out of sync with the standard ARM Linux tree. A standalone software package, PCMCIA Card Services for Linux, was available for architectures which used one of a small number of PCMCIA controllers, such as the Intel 82365. In order to support the built-in PCMCIA control functionality of the SA-1100, an effort was undertaken by the Wearable Group to adapt Card Services to the StrongARM. The first patch to Card Services 3.1.8 was released on 16 January, 2000.
This early work on ARM Linux and PCMCIA Card Services served as a foundation for development of subsequent patches involving Assabet. A detailed explanation of the Card Services implementation for ARM Linux appears later in this section. Also covered in this section are the topics of kernel configuration, memory, serial ports, debug LEDs, Neponset, the SA-1111, serial audio control, and PCMCIA client drivers.
The original Assabet patch released by the Wearable Group contained a number of changes to the naming conventions used within the kernel source tree. The changes generalized the nomenclature from ``SA-1100'' to ``SA-11x0,'' the better to capture the fact that much of the code applies equally to the SA-1100 and SA-1110 processors. The ARM Linux community chose not to adopt this change, preferring instead to follow the model of other architectures supported by Linux. (For example, i386 encapsulates the 80486 and various models of the Pentium processor.)
In the current ARM Linux kernel, the symbol CONFIG_ARCH_SA1100 is defined whenever the kernel is configured for SA-1100 or SA-1110 processors. Support for Assabet is indicated by the symbol CONFIG_SA1100_ASSABET, and additional support for Neponset is indicated by CONFIG_ASSABET_NEPONSET. Supporting Neponset implies support for the SA-1111 companion processor, which is indicated by the symbol CONFIG_SA1111.
Assabet includes a quantity of 32-bit SDRAM in dynamic RAM bank 0. Neponset, when present, provides additional memory running at half the speed of the Assabet SDRAM. During startup, when the kernel detects that it is running on Assabet, an entry for the single bank of SDRAM at 0xc0000000 is added to the database of memory resources. Space for an initial RAM disk, up to 3MB in size, is reserved at 0xc0800000. Kernel -- or zImage -- execution begins at 0xc0008000, with the page tables residing below this address. The final RAM disk and kernel image locations in SDRAM were selected by Nicolas Pitre.
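The layout just described can be sanity-checked with a few constants. The following is only an illustrative sketch: the macro and function names are hypothetical, and it assumes that "3MB" means 3 x 2^20 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Addresses taken from the text; the names themselves are hypothetical. */
#define SDRAM_BANK0_BASE  0xc0000000u
#define KERNEL_ENTRY      0xc0008000u
#define RAMDISK_BASE      0xc0800000u
#define RAMDISK_MAX_BYTES (3u * 1024u * 1024u)  /* assumes 1MB = 2^20 bytes */

/* First address past the largest permitted initial RAM disk. */
uint32_t ramdisk_limit(void)
{
    return RAMDISK_BASE + RAMDISK_MAX_BYTES;
}

/* Distance from the kernel entry point to the start of the RAM disk area. */
uint32_t entry_to_ramdisk_bytes(void)
{
    assert(SDRAM_BANK0_BASE < KERNEL_ENTRY && KERNEL_ENTRY < RAMDISK_BASE);
    return RAMDISK_BASE - KERNEL_ENTRY;
}

/* True when [base, base + len) lies inside the reserved RAM disk area. */
bool fits_in_ramdisk_area(uint32_t base, uint32_t len)
{
    return base >= RAMDISK_BASE &&
           len <= RAMDISK_MAX_BYTES &&
           base - RAMDISK_BASE <= RAMDISK_MAX_BYTES - len;
}
```

Under these assumptions the reserved RAM disk area ends at 0xc0b00000.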
In addition to RAM, the StrongARM SA-1100 memory map contains regions of the address space which allow access to I/O devices. In order to operate on these regions, a virtual mapping for the physical address ranges must be constructed in the kernel page tables. Such mappings can be constructed on demand, but it is convenient to install several mappings at startup time. These mappings include space for the Peripheral Control Module (USB, serial I/O), the System Control Module (interrupts, GPIO), Memory Control (RAM configuration, PCMCIA timing), LCD control, DMA control, and access to the I/O, attribute, and common memory spaces of both PCMCIA slots. Of course, these regions are common to all SA-1100 and SA-1110 systems; when the kernel is running on Assabet, several additional ranges are mapped as well. These include the Assabet Board Control Register, the Neponset CPLD registers, and the SA-1111 itself.
As mentioned previously, a boot loader may obtain kernel and filesystem images at boot time via a serial port on the processor. Once control has been transferred to the kernel, however, serial I/O continues to be important as it may be the primary method of interaction with the system. Assabet provides one RS-232 interface, routed to serial port 1 on the SA-1110. Neponset provides additional hardware for two RS-232 ports, one of which is routed to serial port 3, while the other preempts the interface on Assabet to implement port 1.
Because of a design choice in the development of the Angel boot loader, which is the most common boot loader used with Assabet, the rôles of the various serial ports are somewhat muddled. When Assabet is booted without the companion board, all of the Angel debug messages are of course directed to serial port 1. Subsequently, the kernel decompressor outputs status information on this same port, as does the kernel itself. When Neponset is attached, however, Angel switches to serial port 3 and uses port 1 for debugging information.
In order to avoid requiring developers to tether their boards with two serial cables when one should be sufficient, several changes were made in both the kernel decompressor and the kernel itself. Using a method described in a later subsection, the decompressor detects the presence of Neponset at boot time, and configures SA-1110 UART 3 to be the target of all status and debug messages. Using the same method, the kernel also detects the companion board and chooses the console serial port accordingly.
Regardless of which UART is selected for output, the port is always configured to run at 9600bps using 8-bit data, no parity, and one stop bit. Hardware flow control is not supported. The userland configuration seen most frequently on Assabet, assembled by Nicolas Pitre, also uses these parameters for programs such as getty. This allows a single terminal emulation session to successfully receive all messages output by Assabet, from the boot loader all the way to the login prompt.
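For a program at the other end of the cable, or for a userland tool on the board, the same parameters map onto a POSIX termios configuration. This is a minimal sketch rather than the configuration shipped with any particular userland; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <termios.h>

/*
 * Fill in *t for 9600bps, 8 data bits, no parity, one stop bit.
 * Hardware flow control is left disabled because no flow-control
 * flag is set. Returns 0 on success, -1 on failure.
 */
int console_termios_9600_8n1(struct termios *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));

    /* CS8: 8 data bits. PARENB and CSTOPB stay clear: no parity, 1 stop bit. */
    t->c_cflag = CS8 | CREAD | CLOCAL;

    if (cfsetispeed(t, B9600) != 0)
        return -1;
    if (cfsetospeed(t, B9600) != 0)
        return -1;
    return 0;
}
```

The resulting structure would be applied to an open serial device with `tcsetattr()`; which device node corresponds to which SA-1110 UART is not specified here.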
Assabet includes six LED indicators, two of which (D8 and D9) are general-purpose and can be manipulated directly by software. During kernel bring-up, these two diodes are the only method of obtaining status information from the board, a constraint which can require some creative assembly coding on the part of the developer. Fortunately, now that the kernel runs on Assabet, these indicators can be assigned to some other task.
If the kernel is suitably configured, the two general-purpose LEDs can indicate a regular ``heartbeat'' strobing pattern as well as a CPU activity monitor. The green LED D8 is assigned to the former task, while the red LED D9 is responsible for the latter. This simple status information has proven to be quite helpful for identifying system lock-ups and other troublesome events. An API -- first authored by Russell King -- also exists in the kernel to allow these assignments to be overridden, thus exposing the LEDs to manual control.
Perhaps more useful than the pair of general-purpose indicators found on Assabet is the array found on Neponset. Thirty-two separate LEDs are accessible through a register implemented by the CPLD on the companion board. This register appears in the address space of the SA-1110, and can be manipulated using simple word loads and stores. The default value displayed on the LED array at startup time is a copy of the Neponset WHOAMI register, which contains the system ID.
Many services in the kernel require specialized configuration when the SA-1111 Microprocessor Development Module (Neponset) is present. Console output may switch to a different serial port, special interrupt handling services may be initialized, and device configuration for hardware on the companion board may have to occur. One solution for dealing with these special requirements might be to statically configure a kernel which would run only on systems with an attached Neponset, but this is clumsy. A better approach would be to detect the presence of Neponset at run time, and perform these additional initialization tasks as needed.
Assabet provides a mechanism for detecting certain aspects of the basic hardware configuration, which is documented in the User's Guide for that board. Through an interface named the ``System Configuration Register'' (SCR), information such as the size of SDRAM, the size of flash RAM, and the presence of optional expansion boards may be obtained. The procedure for reading this register is reviewed below.
GPIO pins 9:2 on the SA-1110 are selectively attached to 100k pull-down resistors. These pins can be driven to 3V by writing 0xff to the relevant bits in the GPIO Pin Output Set Register (GPSR). After precharging the pins, the pin voltages may be sensed once the levels on the pulled-down pins reach a valid logic zero state (2µs), but before the pins which are not pulled down cease to be in the valid logic one region (10µs). The values read constitute the contents of the System Configuration Register, which is, in fact, not a register at all.
The SCR is read once by the kernel decompressor, and again by the kernel during early initialization. Kernel code may test for the presence of Neponset at any time by using the macro machine_has_neponset(), which simply checks for the SA-1111 bit in the value of the SCR sampled at initialization time.
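The precharge-and-sense procedure can be sketched against a small software model of the GPIO block. This is not the kernel's code: the structure, the function names, the position of the SA-1111 bit, and its polarity (taken here to be low when the companion board is present) are all assumptions made for illustration. The sketch also assumes that the pins must be set as outputs to be driven and released to inputs before sensing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCR_GPIO_SHIFT 2u
#define SCR_GPIO_MASK  (0xffu << SCR_GPIO_SHIFT)  /* GPIO pins 9:2 */
#define SCR_SA1111     (1u << 8)  /* hypothetical bit position and polarity */

/* Software stand-in for the SA-1110 GPIO registers used here. */
struct gpio_model {
    uint32_t gpdr;  /* pin direction: 1 = output */
    uint32_t gpsr;  /* last value written to the output-set register */
    uint32_t gplr;  /* pin levels as currently sensed */
};

static uint32_t scr_value;  /* sampled once, consulted afterwards */

/* Drive pins 9:2 high so that the undamped pins hold a logic one. */
void scr_precharge(struct gpio_model *g)
{
    assert(g != NULL);
    g->gpdr |= SCR_GPIO_MASK;
    g->gpsr = SCR_GPIO_MASK;  /* 0xff in bits 9:2 */
}

/* Release the pins and latch their levels. Real code must wait here. */
uint32_t scr_sample(struct gpio_model *g)
{
    assert(g != NULL);
    g->gpdr &= ~SCR_GPIO_MASK;
    scr_value = g->gplr & SCR_GPIO_MASK;
    return scr_value;
}

/* Assumed polarity: the SA-1111 bit reads low when Neponset is attached. */
bool machine_has_neponset_sketch(void)
{
    return (scr_value & SCR_SA1111) == 0;
}
```

Caching the sampled value keeps later presence tests cheap, which matches the description of the SCR being read only during early initialization.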
Currently, there are several resources aboard Neponset which are not being used. These include the additional 32MB of SDRAM, the PS/2 port, and the SMSC9196 ethernet controller. The initial work on Neponset has focused almost exclusively on the SA-1111, which is described next.
The StrongARM SA-1111 Microprocessor Companion Chip is a compact part which encapsulates a number of features which might be useful on a mobile communications device. Specifically, it internalizes the majority of a Compact Flash implementation, a PCMCIA implementation, an I2S[I2S] or AC-Link serial audio bus, PS/2 mouse and trackpad interfaces, a USB host controller, and 18 additional GPIO pins. This section covers the process of activating the SA-1111 and servicing the interrupts it generates.
It has been the experience of the author that the SA-1111 documentation was not as clear on the matter of setup as might have been hoped. In actuality, the initialization process is straightforward, and will be described here.
Data transfers between the SA-1110 and the companion SA-1111 occur via the full 32-bit processor memory bus, using a functional block on the SA-1111 known as the System Bus Interface. Addresses presented to the System Bus Interface may be decoded and directed to some other internal functional block. The datapath which allows this communication is the Register Access Bus (RAB).
In order to operate the RAB, the SA-1111 must receive a clocking signal. The SA-1110 can provide such a signal, using the 3.6864MHz clock output on GPIO pin 27. To enable this clock, pin 27 is configured for its alternate function in the GPIO Alternate Function Register (GAFR), and set to output in the GPIO Pin Direction Register (GPDR). Then, a test-select bit must be set in the Test Unit Control Register (TUCR) to actually route the internal clock out to GPIO 27. Once the clock is enabled, the SKCR on the SA-1111 is programmed to enable the internal phase-locked loop and system bus clocks. At this point, the remaining SA-1111 functional blocks should be accessible.
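The bring-up order can be summarized in a few register writes. The sketch below operates on plain structures rather than real hardware, and the bit positions chosen for the TUCR test-select bit and the SKCR enables are placeholders, not values taken from the manuals. A real driver would presumably also need to wait for the PLL to settle, which is not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GPIO_27               (1u << 27)
#define TUCR_CLOCK_TO_GPIO27  (1u << 29)  /* hypothetical bit position */
#define SKCR_PLL_ENABLE       (1u << 0)   /* hypothetical bit position */
#define SKCR_BUS_CLOCK_ENABLE (1u << 1)   /* hypothetical bit position */

/* Software stand-ins for the registers named in the text. */
struct sa1110_model { uint32_t gafr, gpdr, tucr; };
struct sa1111_model { uint32_t skcr; };

void sa1111_wake_sketch(struct sa1110_model *cpu, struct sa1111_model *chip)
{
    assert(cpu != NULL && chip != NULL);

    /* 1. Select the alternate function of GPIO 27 and make it an output. */
    cpu->gafr |= GPIO_27;
    cpu->gpdr |= GPIO_27;

    /* 2. Route the internal 3.6864MHz clock out to the pin. */
    cpu->tucr |= TUCR_CLOCK_TO_GPIO27;

    /* 3. With a clock present, enable the SA-1111 PLL and bus clocks. */
    chip->skcr |= SKCR_PLL_ENABLE;
    chip->skcr |= SKCR_BUS_CLOCK_ENABLE;
}
```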
One of the most interesting functional blocks in the SA-1111 is the Interrupt Controller, which manages no fewer than 50 separate interrupt sources within the chip. These include 18 for GPIO, six for the PS/2 interfaces, three for the synchronous serial port, 11 for audio, six for USB, and three for each of the PCMCIA interfaces. All fifty of these sources share a single interrupt line back to the SA-1110. Although the SA-1111 manual recommends that this signal be routed to GPIO 0 or GPIO 1 on the SA-1110, we will see that Neponset incorporates a different design.
SA-1111 interrupts are enabled using a pair of Interrupt Enable registers (INTEN0, INTEN1). Once a source is unmasked using these registers, a level transition at the source generates an interrupt according to the setting of the matching Interrupt Polarity registers (INTPOL0, INTPOL1). Interrupts can be set on either a rising edge (the default), or a falling edge. When an interrupt from the SA-1111 is detected, the Interrupt Status/Clear registers (INTSTATCLR0, INTSTATCLR1) can be consulted to determine the source, and then used to clear the pending interrupt from the controller.
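As an illustration of how the three register pairs cooperate, the sketch below models them as arrays and numbers the 50 sources consecutively, with source n living in bit n mod 32 of register n / 32. That numbering, the meaning of a set polarity bit (falling edge), and the way the model clears a pending bit are assumptions; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SA1111_IRQ_SOURCES 50u

/* Software stand-in for INTEN0/1, INTPOL0/1 and INTSTATCLR0/1. */
struct sa1111_intc_model {
    uint32_t inten[2];
    uint32_t intpol[2];
    uint32_t intstatclr[2];
};

/* Unmask one source, selecting rising (default) or falling edge. */
int sa1111_irq_enable(struct sa1111_intc_model *ic, unsigned src, int falling_edge)
{
    assert(ic != NULL);
    if (src >= SA1111_IRQ_SOURCES)
        return -1;

    unsigned reg = src / 32u;
    uint32_t bit = 1u << (src % 32u);

    if (falling_edge)
        ic->intpol[reg] |= bit;
    else
        ic->intpol[reg] &= ~bit;
    ic->inten[reg] |= bit;
    return 0;
}

/* Return the lowest pending, enabled source and clear it; -1 if none. */
int sa1111_irq_next(struct sa1111_intc_model *ic)
{
    assert(ic != NULL);
    for (unsigned src = 0; src < SA1111_IRQ_SOURCES; src++) {
        unsigned reg = src / 32u;
        uint32_t bit = 1u << (src % 32u);

        if (ic->intstatclr[reg] & ic->inten[reg] & bit) {
            ic->intstatclr[reg] &= ~bit;  /* model of acknowledging the source */
            return (int)src;
        }
    }
    return -1;
}
```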
Of course, an interrupt must be configured for the SA-1111 itself back at the SA-1110. This can be set up in the same manner as for all GPIO interrupts; polarity is set via the appropriate bit in either the GPIO Rising-Edge Detect Register (GRER) or the GPIO Falling-Edge Detect Register (GFER), and the interrupt is unmasked using the Interrupt Controller Mask Register (ICMR). When an interrupt occurs, the appropriate registered IRQ handler is invoked. In the case of a GPIO interrupt, the handler may consult the GPIO Edge Detect Status Register (GEDR) to determine which pin experienced a transition. The handler then clears the interrupt, again using the GEDR, and services the requesting device.
There are two unfortunate design choices in the Assabet and Neponset combination that are worth noting with respect to interrupt handling. The first is that the Neponset interrupt signal is routed to GPIO 25 on the SA-1110, which means that the interrupt handler cannot immediately begin by servicing the board, as this pin shares an IRQ with the other pins in the set of GPIO 11-27. Some additional logic is required to ensure that a transition actually occurred on pin 25. Worse, this signal is not the SA-1111 interrupt, but rather, an interrupt shared by three devices on Neponset; the actual source can be decoded from the Interrupt Reason Register (IRR) on the companion board. The flow of execution in servicing SA-1111 interrupts is shown below.
In Figure 2, an example first-tier interrupt source attached directly to the SA-1110 via GPIO pins 0-10 is illustrated for reference. The illustration is meant to indicate that processing of the source can occur immediately, given the IRQ number, once the interrupt arrives. By comparison, handling of any SA-1111 interrupt sources requires several layers of indirection before the identity of the requesting agent is known.
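The first two layers of indirection can be expressed compactly. In this sketch the GEDR and IRR are modeled as ordinary variables, the IRR bit assigned to the SA-1111 is a placeholder, and the names are hypothetical. Decoding the individual SA-1111 source would follow as a third step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GPIO_PIN(n)   (1u << (n))
#define NEPONSET_GPIO GPIO_PIN(25)
#define IRR_SA1111    (1u << 2)  /* hypothetical bit position */

enum irq_target {
    TARGET_OTHER_GPIO,       /* some other pin in GPIO 11-27 fired */
    TARGET_NEPONSET_DEVICE,  /* a Neponset device other than the SA-1111 */
    TARGET_SA1111            /* continue decoding inside the SA-1111 */
};

struct irq_regs_model {
    uint32_t gedr;  /* SA-1110 edge detect status */
    uint32_t irr;   /* Neponset Interrupt Reason Register */
};

enum irq_target demux_gpio_11_27(struct irq_regs_model *r)
{
    assert(r != NULL);

    /* Layer 1: did the shared IRQ come from pin 25 at all? */
    if ((r->gedr & NEPONSET_GPIO) == 0)
        return TARGET_OTHER_GPIO;
    r->gedr &= ~NEPONSET_GPIO;  /* model of clearing the edge */

    /* Layer 2: which of the Neponset devices raised the line? */
    if (r->irr & IRR_SA1111)
        return TARGET_SA1111;
    return TARGET_NEPONSET_DEVICE;
}
```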
Not only does this somewhat awkward interrupt architecture increase servicing latency, it also creates problems for device driver authors seeking to write code which will be portable across StrongARM board designs. Traditionally, a Linux driver requests an IRQ, and unless the IRQ is requested in ``shared'' mode, the driver can assume that when it is entered, the device it manages has asserted an interrupt. This does not necessarily hold for drivers running on the StrongARM, however. The IRQ requested by the device might be shared among several GPIO sources, which themselves may multiplex additional sources downstream as shown in Figure 2. Legacy drivers -- especially those ported from other architectures -- cannot be expected to deal directly with this organization. A solution for dealing with this problem is presented later in this report.
One of the significant functional blocks in the SA-1111 which can be accessed after performing the setup procedures just described is the Serial Audio Controller (SAC). The SAC implements a number of standard control and data interfaces to external audio codecs, which are responsible for the actual analog-to-digital and digital-to-analog audio functions. Specifically, codecs which support the AC-link, I2S, and MSB-Justified audio formats can be used, as can those which implement the I2S and L3 bus command interfaces. The SA-1111 SAC is capable of full-duplex operation, and can either move data between the host processor and the codec using manually-loaded FIFOs, or move data directly between main memory and the codec using internal DMA engines. Audio processing may occur at sample rates up to 22.05kHz, using 16-bit, two-channel samples.
Neponset is populated by default with a Philips UDA1341 audio codec attached to the SA-1111 L3 bus interface. This is the same codec found on Assabet, but on that board, the codec attaches directly to the SA-1110 Synchronous Serial Port (SSP, serial port 4). In the Assabet configuration, four GPIO pins (10-13) are consumed for clocking and framing, three more are taken for L3 functions (15, 17, 18), and yet another is lost in receiving the externally-generated sample clock (19). Adding further complexity to the Assabet approach is the need for a complete software implementation of the L3 bus protocol, which manually clocks all inbound and outbound address and data words.
By comparison, the UDA1341 on Neponset requires fewer SA-1110 resources, and can be implemented with a simpler device driver. All SAC interrupt sources are multiplexed along the single SA-1111 interrupt signal, described in the previous subsection. The sample clock is generated internally from the SA-1111 Phase-Locked Loop (PLL) clock, which is in turn derived from the incoming 3.6864MHz oscillator on SA-1110 GPIO 27. Already, this frees six SA-1110 GPIO pins and eliminates the need for an additional oscillator and clock generator. The real complexity savings come from the SAC L3 bus implementation, however. To read or write command words from or to the codec, all the driver must do is load the desired target address into a control register (L3CAR), and in the case of a write, load a word into a data register (L3CDR). The driver can then poll a SAC status register (SASR0) to determine that the operation has completed, or enable a processor interrupt to indicate completion. In the case of a read, the returned value is placed in the L3 data register. The L3 bus can perform single or multiple-byte transfers in this manner.
The L3 interface is used to configure various features of the codec, such as input gain or bass boost. To actually transfer audio samples to or from the codec, a separate serial interface is used. These samples may be encoded using a number of formats, but in the case of the UDA1341, I2S is the encoding chosen. I2S uses a serial audio data line -- one for each of output and input -- combined with a bit clock (to drive the codec bit-sampling logic) and a SYNC signal to delineate left and right audio channels. Samples, which are always 16 bits per channel, are sent MSB-first, left channel followed by right channel. This encoding and the serial protocol used to transmit it are hidden completely from software by the SAC. In order to exchange samples with the codec, the SA-1110 can load or store up to eight 32-bit samples in a data register bank (SADR). The SAC can interrupt the processor when a configurable transfer threshold condition has been met.
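A polled single-byte L3 write might look roughly as follows. The registers are modeled in software, the position of the completion flag within SASR0 is a placeholder, and the `busy_polls` field exists only so that the model can imitate a transfer which takes some time to finish.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SASR0_L3_WRITE_DONE (1u << 16)  /* hypothetical bit position */

/* Software stand-in for the SAC registers involved in an L3 write. */
struct sac_model {
    uint32_t l3car;       /* L3 address register */
    uint32_t l3cdr;       /* L3 data register */
    uint32_t sasr0;       /* status register */
    unsigned busy_polls;  /* model only: status reads left before completion */
};

/* Model of reading SASR0; the flag appears once the fake transfer ends. */
static uint32_t sac_read_status(struct sac_model *sac)
{
    if (sac->busy_polls > 0)
        sac->busy_polls--;
    else
        sac->sasr0 |= SASR0_L3_WRITE_DONE;
    return sac->sasr0;
}

/* Write one byte to an L3 address; 0 on completion, -1 on timeout. */
int l3_write_byte_sketch(struct sac_model *sac, uint8_t addr, uint8_t data,
                         unsigned max_polls)
{
    assert(sac != NULL);

    sac->sasr0 &= ~SASR0_L3_WRITE_DONE;
    sac->l3car = addr;  /* target address first */
    sac->l3cdr = data;  /* loading the data word starts the write */

    for (unsigned i = 0; i < max_polls; i++) {
        if (sac_read_status(sac) & SASR0_L3_WRITE_DONE)
            return 0;
    }
    return -1;
}
```

Bounding the poll loop is a design choice of this sketch; an interrupt-driven variant would replace the loop with a wait on the completion interrupt.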
A more efficient method for managing audio transfers is the one implemented in the Linux driver; on-board SAC DMA engines can move audio samples directly to or from the SDRAM bank on Assabet. During SA-1111 initialization, just after enabling the SA-1111 clock signal, the kernel configures the memory bus request/grant mechanism needed for SA-1111 DMA. The mechanism consists of a pair of GPIO pins on the SA-1110 (21, 22) whose alternate function it is to allow the SA-1111 to claim the memory bus. Memory request support is enabled in the SA-1110 by setting a bit in the TUCR, discussed in the previous subsection. Because the SA-1111 will be driving the Assabet SDRAM directly, the shared memory controller on the SA-1111 must be configured to generate correct addresses and provide the appropriate access latency. For the default Assabet memory load, the control register (SMCR) is set to use 14-bit row addresses and to expect data to be latched in response to an SDRAM read 3 memory clock cycles after deassertion of the Column Address Strobe (CAS) signal. Once the shared memory interface is set up, 32-bit linear audio samples may be placed in (or retrieved from) main memory as two 16-bit channels (data for the ``left'' channel is placed in bits 15:0, the ``right'' channel in 31:16), with samples having fewer than 16 bits being left-justified in the 16-bit field. Two transfer engines (``A'' and ``B'') are available for each of the transmit (to codec) and receive (from codec) directions. A transfer begins by loading bits 25:0 of the SDRAM buffer physical address into a start register (one of SADTSA, SADTSB, SADRSA, or SADRSB), and loading the length of the transfer in bytes into the corresponding count register (one of SADTCA, SADTCB, SADRCA, or SADRCB). The engine is started by setting the appropriate bit in the control/status register for the transfer direction (SADTCS or SADRCS). When the transfer completes, the SAC can interrupt the processor, and a new transfer can be configured.
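The in-memory sample format and the start-address rule lend themselves to a short worked example. The helper names below are hypothetical; the bit layout follows the description above (left channel in bits 15:0, right channel in bits 31:16, narrow samples left-justified, start register holding bits 25:0 of the physical address).

```c
#include <assert.h>
#include <stdint.h>

/* Left-justify a sample of `bits` significant bits in a 16-bit field. */
uint16_t left_justify(int16_t sample, unsigned bits)
{
    assert(bits >= 1 && bits <= 16);
    return (uint16_t)((uint32_t)(uint16_t)sample << (16u - bits));
}

/* Pack one stereo frame: left in bits 15:0, right in bits 31:16. */
uint32_t pack_frame(uint16_t left, uint16_t right)
{
    return (uint32_t)left | ((uint32_t)right << 16);
}

/* Value for a DMA start register: bits 25:0 of the buffer address. */
uint32_t dma_start_value(uint32_t phys_addr)
{
    return phys_addr & 0x03ffffffu;
}

/* Byte count for a transfer of `frames` 32-bit stereo frames. */
uint32_t dma_count_value(uint32_t frames)
{
    return frames * 4u;
}
```

For instance, a 12-bit sample of 0x7ff becomes 0x7ff0 in its 16-bit field, and a buffer at physical address 0xc0123456 yields a start value of 0x00123456.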
The SA-1111 documentation is unusually poor where the DMA features of the SAC are concerned. The Linux driver code owes a great deal to the experiences of developers on other operating systems, some of which strongly informed the final structure of the software. One developer observed that one or both of the ``A'' and ``B'' engines would occasionally lock up, thus requiring a watchdog timer and a facility to reset the engine. Another noticed that once a DMA engine is ``enabled,'' it can never be ``disabled.'' This developer also observed that the interrupt-enable bits in the DMA control/status registers are nonfunctional, as are the ``transfer done'' indicator bits and the ``buffer in use'' bits. This developer also discovered that the engines lock up if transfers are not launched in alternating sequence (``A,'' ``B,'' ``A,'' ...), or if FIFO depths other than the default are specified. Combined with the DMA corruption bug recently announced in a specification update, the SA-1111 SAC leaves something to be desired, at least in the current metal revision. The author hopes for a respin.
In the opinion of the author, one of the more compelling features of the StrongARM SA-1100 processor is the built-in support for Personal Computer Memory Card International Association-compliant devices, also known as PCMCIA cards, or more recently, ``PC cards.'' The range of peripherals available in this form factor is interesting, and includes ethernet adapters, wireless LAN interfaces, GPS receivers, flash RAM modules, rotating disk drives, POTS modems, speech coprocessors, and more. In addition, a follow-on standard which is a subset of PCMCIA, known as CompactFlash, provides similar functionality in a form factor smaller than a book of matches. The SA-1110, plus some external glue logic and a power controller, can support up to two slots of either type. The SA-1111 internalizes all of the glue logic necessary to support one full PCMCIA device and one CompactFlash slot.
A device driver for a given PCMCIA or CompactFlash device may be written in much the same way as a driver for a conventional device, such as a PCI ethernet controller. Such a driver would not exploit one of the chief advantages of PCMCIA, however: hot-swappability. A PCMCIA implementation has the ability to power on a card at insert time, determine its type, load an appropriate device driver, and configure the device dynamically. Correspondingly, upon card removal, any resources occupied by the device should be released, including the loadable driver.
The software infrastructure necessary to support this kind of operation is known as Card Services, and an implementation for Linux on other architectures (such as the i386) has been available for more than six years at the time of this writing. Card Services abstracts away the details of a particular PCMCIA implementation, allowing drivers for individual cards to be written in a fairly general fashion. In the ideal case, a single Card Services device driver will work on multiple architectures without modification.
The SA-1110 contains internal logic to implement many aspects of the PCMCIA interface, but this is not to say that all SA-1110 board designs are similar in the way they choose to fully implement PCMCIA. Several aspects of the interface are left at the discretion of the designer, such as the routing of interrupts, and the control mechanism for an external power converter. The design elements which can vary among StrongARM SA-1100 implementations are presented below.
Because each of these aspects of the PCMCIA implementation may be handled in at least two or three different manners, it becomes clear that the operating system needs to intervene and provide some abstraction. In the parlance of Card Services, this comes in the form of a software component called the ``socket driver.'' The socket driver receives requests from Card Services using an API named ``Socket Services,'' and implements the requests using the available hardware. An example request might be to configure the attribute memory space of a given slot mapped at a particular location to run at an access speed of 300ns.
With a socket driver to handle the low-level details, much of the remaining portions of Card Services can be made remarkably general. Immediately above the socket driver is the PCMCIA core, which manages resources such as memory maps and IRQs, and also manages communication with the card device drivers as well as userland configuration tools.
Card device drivers, also known as client drivers, make many of their resource requests through the PCMCIA core. This indirection frees clients from having to discover available system resources, such as memory maps, for themselves. Note that some legacy drivers which have recently been made compatible with Card Services may perform their own requests, such as interrupt registration. Such drivers still query the PCMCIA core to learn the set of available IRQ numbers.
Many configuration operations take place with the help of userland tools, and in particular, cardmgr. This dæmon reads a configuration file on startup, and passes a resource policy down to the PCMCIA core, which runs in kernel mode. The policy might include a list of memory ranges to map (or not to map), and a set of allocatable IRQs. Communication occurs via a special client driver which implements the Driver Services interface. The Driver Services client is always installed, and handles all interaction with userland.
The dæmon also maintains a database of information about specific PCMCIA devices it expects to see. Each PCMCIA card contains a Card Information Structure which describes various aspects of the device, such as manufacturer, model, and general device class. When a card is inserted, this structure is read, and the identifying information is passed up to cardmgr, which consults its database. A successful lookup will yield the name of a client driver, which is a kernel loadable module. The dæmon examines the filesystem, and attempts to load the module, which is then configured by the PCMCIA core running in the kernel.
Five separate components of Card Services have just been covered: the socket driver, the PCMCIA core, client drivers, the special Driver Services client driver, and the userland cardmgr. Initially, every one of these save for Driver Services required some degree of attention in the port to the StrongARM.
We begin with the socket driver, which had to be authored from scratch. This component was originally developed on the Itsy/Cue hardware which, while based on the Brutus SA-1100 development platform, already demonstrated significantly different design choices where the PCMCIA interface was concerned. Specifically, while both use a CPLD ``register'' to absorb reset signals, power control, voltage sense, and I/O status change, the register layouts are completely different. Brutus uses GPIO assignments for BVD, CD, and device interrupt, whereas Itsy/Cue leaves BVD in the CPLD (thus requiring only two pairs of GPIO pins, albeit still using different assignments from Brutus). Newer designs are even more different; Neponset uses only a single GPIO interrupt, but requires navigation of the bizarre interrupt source detection hierarchy discussed previously. Rather than consolidating control to a single CPLD register, the SA-1111 distributes PCMCIA control over several internal registers, some of which drive external, supporting hardware.
Early incarnations of the sa1100 socket driver sought to discover a fully-general representation for the various resources which need to be managed on various boards. As it eventually became clear that no wieldy generalization was poised to present itself, an abstraction layer was added to the kernel, below which board-specific details could be hidden. This reorganization allowed the ``generic'' socket driver to shrink to a more manageable 1100 lines of code, with the average board-specific routines occupying about 200 lines.
In the new scheme, a StrongARM PCMCIA implementation can be supported by supplying five routines which are invoked from the ``generic'' driver. The first (init) performs any necessary GPIO setup, power control configuration, and interrupt registration which must be in place for normal operation. The second (shutdown) undoes the effects of init. The third (socket_state) fills in a structure for each configured socket describing the status of various signals such as card detection, battery voltage detection, voltage sense, write protection, and so forth. The fourth (get_irq_info) returns, for the requested socket, the IRQ assignment which should be requested for service to a device in that socket. The last (configure_socket) allows Card Services to configure Vcc and Vpp for a given socket, as well as assert reset. At the time of this writing, the kernel includes versions of these routines for four distinct PCMCIA implementations.
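One natural way to express such a five-routine interface in C is a table of function pointers supplied by each board. The sketch below is an illustration under assumptions: the structure names, field names, and signatures are invented here and should not be read as the kernel's actual definitions, and the "toy" board with its IRQ numbers is entirely fictional.

```c
#include <assert.h>
#include <stddef.h>

/* Snapshot of one socket's signals; field names are hypothetical. */
struct socket_signals {
    unsigned detect : 1;  /* card present */
    unsigned bvd1   : 1;  /* battery voltage detect */
    unsigned bvd2   : 1;
    unsigned vs_3v  : 1;  /* voltage sense: card accepts 3.3V */
    unsigned wrprot : 1;  /* write protect */
};

struct socket_power {
    unsigned sock;
    unsigned vcc_tenths;  /* supply in tenths of a volt: 0, 33 or 50 */
    unsigned vpp_tenths;
    unsigned reset : 1;
};

/* The five board-specific entry points described in the text. */
struct board_pcmcia_ops {
    int (*init)(void);      /* returns the number of sockets */
    int (*shutdown)(void);
    int (*socket_state)(unsigned sock, struct socket_signals *out);
    int (*get_irq_info)(unsigned sock);
    int (*configure_socket)(const struct socket_power *cfg);
};

/* A fictional two-socket board: socket 0 holds a 3.3V card. */
static unsigned toy_last_vcc;

static int toy_init(void)     { return 2; }
static int toy_shutdown(void) { return 0; }

static int toy_socket_state(unsigned sock, struct socket_signals *out)
{
    if (sock > 1 || out == NULL)
        return -1;
    out->detect = (sock == 0);
    out->bvd1 = out->bvd2 = 1;
    out->vs_3v = (sock == 0);
    out->wrprot = 0;
    return 0;
}

static int toy_get_irq_info(unsigned sock)
{
    return sock == 0 ? 40 : 41;  /* made-up IRQ numbers */
}

static int toy_configure_socket(const struct socket_power *cfg)
{
    toy_last_vcc = cfg->vcc_tenths;
    return 0;
}

const struct board_pcmcia_ops toy_board_ops = {
    toy_init, toy_shutdown, toy_socket_state,
    toy_get_irq_info, toy_configure_socket
};

/* Generic-driver side: power a socket using only the ops table. */
int generic_power_up(const struct board_pcmcia_ops *ops, unsigned sock)
{
    struct socket_signals sig;
    struct socket_power cfg = { sock, 0, 0, 0 };

    assert(ops != NULL);
    if (ops->socket_state(sock, &sig) != 0 || !sig.detect)
        return -1;
    cfg.vcc_tenths = sig.vs_3v ? 33 : 50;
    return ops->configure_socket(&cfg);
}

unsigned toy_board_last_vcc(void) { return toy_last_vcc; }
```

The point of the indirection is that `generic_power_up` never touches a board register directly, so supporting a new board means supplying a new table rather than editing the generic driver.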
With the abstraction layer in place, the ``generic'' socket driver need only implement the Socket Services API, occasionally receiving requests from Card Services (and passing some of them to the abstraction layer), and occasionally receiving interrupts from the abstraction layer (and forwarding them up to Card Services). Examples of requests issued by Card Services include set_socket, which configures the per-socket Vcc, Vpp, reset and so forth; and set_io_map, which configures PCMCIA I/O memory mapping and access speed. Examples of interrupts received by the socket driver include those for card detection or a change in battery charge state. When any interruptable event comes in, a bit field is filled in (subject to an event mask) to describe the change in status. This field is passed up to Card Services via a callback that was arranged at initialization time. Note that the one interrupt not serviced by the socket driver is that for the card itself; this interrupt is requested by the card device driver directly.
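The masked event field and its callback can be sketched as follows. The event bit names, the structure, and the functions are hypothetical stand-ins chosen for illustration, not the Socket Services definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status and event bits. */
#define EV_DETECT   (1u << 0)  /* card inserted or removed */
#define EV_BATT_LOW (1u << 1)  /* battery charge state changed */
#define EV_READY    (1u << 2)  /* ready line changed */

typedef void (*event_handler)(void *ctx, uint32_t events);

struct socket_sketch {
    uint32_t status;        /* last status reported by the board layer */
    uint32_t event_mask;    /* events the upper layer asked to hear about */
    event_handler handler;  /* callback arranged at initialization time */
    void *ctx;
};

/* Called on a status interrupt: report only changed, unmasked bits. */
uint32_t socket_status_changed(struct socket_sketch *s, uint32_t new_status)
{
    assert(s != NULL);

    uint32_t events = (s->status ^ new_status) & s->event_mask;

    s->status = new_status;
    if (events != 0 && s->handler != NULL)
        s->handler(s->ctx, events);
    return events;
}

/* Example handler: accumulate every event seen into *ctx. */
void record_events(void *ctx, uint32_t events)
{
    *(uint32_t *)ctx |= events;
}
```

Comparing against the previous status means the upper layer hears about transitions rather than levels, so a repeated interrupt with no change in state produces no callback.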
Continuing with the remaining Card Services components which required modification, the PCMCIA core ended up requiring very few alterations. Card Services, like the Linux kernel, contains a number of legacy assumptions about the nature of I/O access. These include constraints on the kind of addresses used to reach I/O (in particular, they assume 16-bit ISA ``ports''), and the facilities available for performing such access (such as the x86 I/O instructions inb, outb, inw, and outw). This can present an awkward fit for a machine which maps I/O directly into a 32-bit address space, and which can access this memory via normal loads and stores, such as the StrongARM. The address width problem was more or less straightforward to correct, while the I/O ``instruction'' problem required slightly more effort. The Card Services code invokes wrappers for the x86 I/O instructions (e.g., the C routine inb() for inb), as is common for kernel drivers. The ARM architecture port of the kernel implemented versions of these wrappers that were meant to simplify the task of hacking together a PCMCIA client driver to work directly on the StrongARM (this was necessary before the port of the full Card Services suite). The ARM wrappers assumed that the routines would always be used to access I/O memory in PCMCIA slot 0, and so the wrappers performed arithmetic on the target addresses they were passed. When the PCMCIA core was corrected to properly handle 32-bit addresses, the ARM wrappers were also revised to no longer make any assumptions about the locations they were to manipulate. The effects of this latter change were somewhat surprising.34
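To make the difference between the two wrapper styles concrete, here is a user-space sketch. It assumes nothing about the actual ARM kernel sources: the names mm_inb, mm_outb, mm_inw, mm_outw and slot0_inb are invented, and the ``arithmetic'' of the earlier wrappers is modelled as a simple base-plus-offset addition, which the text does not confirm in detail.

```c
#include <stdint.h>

/* Revised style: the argument is a complete address in the processor's
 * address space, and the access is an ordinary load or store. */
uint8_t mm_inb(uintptr_t addr)
{
    return *(volatile uint8_t *)addr;
}

void mm_outb(uint8_t value, uintptr_t addr)
{
    *(volatile uint8_t *)addr = value;
}

uint16_t mm_inw(uintptr_t addr)
{
    return *(volatile uint16_t *)addr;
}

void mm_outw(uint16_t value, uintptr_t addr)
{
    *(volatile uint16_t *)addr = value;
}

/* Earlier style (modelled): the caller passes a 16-bit ISA-like port
 * number, and the wrapper relocates it into the slot 0 I/O window.
 * Any device outside slot 0 is unreachable through this wrapper. */
uint8_t slot0_inb(uintptr_t slot0_base, uint16_t port)
{
    return mm_inb(slot0_base + port);
}
```

Once the wrappers stop adjusting addresses, every caller has to supply a fully formed address, which is why the client drivers' I/O address fields had to grow to 32 bits.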
Several changes were made to the PCMCIA core early in the porting process which involved cardmgr, the userland configuration dæmon. These were later backed out of the patch set, but will be discussed here for historical account. One rôle of cardmgr in Card Services is to read a user configuration file which describes certain resources available for use by PCMCIA software; this information is then delivered to the PCMCIA core running in the kernel. The resources managed by the dæmon include IRQ assignments, I/O ranges, and memory ranges. The original port of Card Services to the StrongARM added the ability to separately manage common memory and attribute memory ranges. Also added were provisions for a number of register addresses, bit masks, and other items necessary for accommodating the range of variation in StrongARM PCMCIA implementations. The goal of this approach was to produce a fully-generic set of kernel modules which would work on any StrongARM SA-1100 board without recompilation; all configuration would happen via the userland configuration tool. In the end, this approach proved to be less wieldy than the per-board abstraction layer described earlier. The various memory ranges do not change across StrongARM SA-1100 processors, so userland management of them is not necessary. The other resources, such as IRQ assignments, register locations, and so forth, are so numerous (more than thirty such features exist) that a substantial redesign of the Card Services resource system -- needed only for the ARM architecture -- would be required. The final design, therefore, goes nearly to the opposite extreme by ignoring the userland configuration entirely. The goal of kernel portability across multiple boards is now being approached in another way: compile-time macros and run-time tests enable various per-board features.
Concluding the discussion of Card Services components are Driver Services and the client drivers. Driver Services is a special case of a client driver which has the distinguishing ability to communicate with the userland cardmgr. No direct changes to the Driver Services module were necessary to support the StrongARM. In general, client drivers have tended to require only a single change: expansion of the internal data structure used to store I/O addresses from a 16-bit data type to the 32-bit field needed to represent StrongARM virtual addresses. Devices which have been confirmed to work using Card Services on StrongARM-based systems include the Lucent WaveLAN/IEEE 802.11 PCMCIA device (first using a port of the Lucent proprietary driver, then using the open source wvlan_cs driver), the IBM Microdrive (using ide_cs), the Socket Communications LP-E CF+[LPE] ethernet device (using a modified version of pcnet_cs; see below), and other CompactFlash flash RAM devices (using ide_cs). In most cases, once a driver is made to function on one StrongARM SA-1100 board, it should be able to support its device on other boards without modification.
Assabet provides one type-II CompactFlash slot, suitable for accepting the included Socket Communications LP-E CF+ ethernet card, or a larger device such as the IBM 1GB Microdrive. The CompactFlash interface is mapped to PCMCIA slot 1 on the SA-1110, and can only be used when Neponset is not attached.35 The control interface for CompactFlash on Assabet consists of four GPIO pins and three pins in the CPLD Board Control Register (BCR). The control interface for the slots on Neponset is completely different, a fact which causes the User's Guide for Assabet to mention, ``Two versions of the CF drivers are required for the SA-1110 Development Platform, one for the SA-1111 Development Module and one for the SA-1110 Development Board.'' As we will see, Card Services makes this statement untrue.
CompactFlash initialization on Assabet begins by asserting the CF_Bus_On bit in the BCR. Next, the four GPIO lines are configured. Briefly, these are the card interrupt signal (CF_IRQ, GPIO 21), the Card Detect signal (CF_CardDetect, GPIO 22), and the Battery Voltage Detect signals (CF_BVD1 and CF_BVD2, GPIO 25 and GPIO 24, respectively). All four signals will be used as inputs, a characteristic which is configured in the GPIO Pin Direction Register (GPDR).
The SA-1110 can generate software interrupts based on level transitions at the GPIO pins. IRQ handlers may be invoked at a rising edge, falling edge, or both based on the values of the GPIO Rising-Edge Detect Register (GRER) and GPIO Falling-Edge Detect Register (GFER). This flexibility was not anticipated in the Linux IRQ request facility, so some extra work is required to properly set up an interrupt handler for devices such as the CompactFlash interface. After trying several approaches, the current solution -- refined outside of Carnegie Mellon -- maintains a set of GPIO pins which should trigger rising edge interrupts, and another set for falling edge interrupts. These sets are configured prior to issuing the normal request_irq() operation, and are loaded into the GPIO edge detect registers during normal interrupt processing. In the case of Assabet, all four GPIO lines have their edge detection policy set during initialization, but only three (CF_CardDetect, CF_BVD1, CF_BVD2) are actually registered. The CompactFlash device IRQ will be registered at card initialization time by the card driver itself using the normal Linux mechanism.
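The bookkeeping described here, two software-maintained sets of pins that are later copied into GRER and GFER, can be sketched as below. The pin numbers come from the text; everything else is an assumption: the function names are invented, plain variables stand in for the hardware registers, and the choice of edges for each signal is a guess made for illustration.

```c
#include <stdint.h>

#define GPIO_BIT(n)   (1u << (n))
#define EDGE_RISING   1
#define EDGE_FALLING  2

/* Assabet CompactFlash pins, as listed in the text. */
#define CF_IRQ         GPIO_BIT(21)
#define CF_CARDDETECT  GPIO_BIT(22)
#define CF_BVD2        GPIO_BIT(24)
#define CF_BVD1        GPIO_BIT(25)

/* Stand-ins for the hardware edge-detect registers. */
uint32_t sim_GRER, sim_GFER;

/* Software copies: which pins should interrupt on which edge. */
uint32_t rising_set, falling_set;

/* Record the desired edge policy for a group of pins. */
void set_gpio_edge(uint32_t pins, int edges)
{
    if (edges & EDGE_RISING)
        rising_set |= pins;
    else
        rising_set &= ~pins;

    if (edges & EDGE_FALLING)
        falling_set |= pins;
    else
        falling_set &= ~pins;
}

/* In the kernel this step happens during interrupt processing;
 * here it is a function the caller invokes by hand. */
void load_edge_registers(void)
{
    sim_GRER = rising_set;
    sim_GFER = falling_set;
}

/* Policy for all four lines is set up front, before request_irq()
 * would be issued for the three status signals. Edge choices assumed. */
void assabet_cf_edge_setup(void)
{
    set_gpio_edge(CF_CARDDETECT | CF_BVD1 | CF_BVD2,
                  EDGE_RISING | EDGE_FALLING);
    set_gpio_edge(CF_IRQ, EDGE_FALLING);
}
```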
After initialization, the generic sa1100 socket driver makes a number of socket_state or configure_socket requests, possibly in response to a CompactFlash-related interrupt or some other Card Services event. Returning the state of the CompactFlash slot is straightforward; the GPIO Pin-Level Register (GPLR) contains the current states of the Card Detect and Battery Voltage Detect signals, which can be copied directly into the state descriptor. Because Assabet does not deliver Write Protect status to the processor, and the power supply can only provide 3.3V to the card, no additional information can be returned to the socket driver beyond the various Detect signals. To configure the socket, the kernel first disables interrupts, then samples the last value written to the Board Control Register, which is write-only. Based on the configuration structure passed, the CF_PWR (3.3V supply) and CF_RST (reset) bits are adjusted, and the resulting value is written to the BCR, after which interrupts are re-enabled. The Vpp, output enable, and speaker enable features are not available for configuration on Assabet.
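Because the BCR cannot be read back, the read-modify-write in configure_socket has to work from a software copy of the last value written. A minimal sketch of that pattern follows; the bit positions, variable names, and the tenths-of-a-volt argument are placeholders rather than values taken from the Assabet documentation, and the interrupt masking is indicated only by comments.

```c
#include <stdint.h>

/* Placeholder bit positions; consult the board documentation for real ones. */
#define BCR_CF_PWR  (1u << 0)  /* 3.3V supply to the CF slot */
#define BCR_CF_RST  (1u << 1)  /* CF reset */

uint32_t bcr_hw;      /* stands in for the write-only hardware register */
uint32_t bcr_shadow;  /* last value written, kept in memory */

/* Clear and set bits in the BCR via the shadow copy. */
void bcr_update(uint32_t clear, uint32_t set)
{
    /* interrupts would be disabled here */
    bcr_shadow = (bcr_shadow & ~clear) | set;
    bcr_hw = bcr_shadow;
    /* interrupts would be re-enabled here */
}

/* vcc is in tenths of a volt (assumed encoding). Only 0V and 3.3V are
 * accepted, matching the single supply described in the text. */
int assabet_configure(int vcc, int reset)
{
    uint32_t set = 0;

    switch (vcc) {
    case 0:
        break;
    case 33:
        set |= BCR_CF_PWR;
        break;
    default:
        return -1;  /* e.g. 5V: not available on this board */
    }
    if (reset)
        set |= BCR_CF_RST;

    bcr_update(BCR_CF_PWR | BCR_CF_RST, set);
    return 0;
}
```

Keeping the update inside one helper means unrelated BCR bits (LCD power, for instance) survive every socket reconfiguration.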
Because Assabet ships with a CompactFlash device for which Linux support has been added, it seems appropriate to discuss the driver for that device here. The Socket Communications LP-E CF+ ethernet card is based around an NE2000-clone controller, for which support previously existed in the kernel. The 8390 driver provides the core support for this controller, including most of the hooks into the Linux ethernet device code. A wrapper for this driver, pcnet_cs, implements the necessary PCMCIA Card Services operations and performs the various dynamic configuration tasks associated with PC card versions of this controller. In order to properly configure the device, a driver must first read the ethernet Medium Access Control (MAC) address from the CompactFlash card attribute memory, and then program the controller using this address. The location of the MAC address is variable across card implementations, and Socket Communications was unwilling to provide documentation for their device. A brute force search method was sufficient to discover the address location, and an entry was added to the hardware address search table in the driver to support the LP-E CF+ card.
A more serious problem associated with the Socket Communications card was discovered by an engineer at Intel, for which a software workaround has been added to the Linux driver. The symptom consists of corruption on the last octet of an odd-length transfer from the card to the processor during a frame receive, when the card is programmed to transfer 16-bit quantities. The corruption is fatal to TCP-dependent applications, or indeed, any protocol with sufficiently sensitive error detection. The current fix is to detect these odd-length buffers, and drop the card into 8-bit transfer mode prior to moving the final octet.
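The shape of the workaround can be shown with a simulated card. This is a sketch of the idea only, not the code in pcnet_cs: struct fake_card and the card_ helper functions are invented, and the simulated FIFO assumes little-endian packing of 16-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Invented stand-in for the card's receive FIFO. */
struct fake_card {
    const uint8_t *fifo;
    size_t pos;
    int word_mode;  /* 1 = 16-bit transfers, 0 = 8-bit transfers */
    int reads8;     /* number of 8-bit reads performed */
};

void card_set_word_mode(struct fake_card *c, int on)
{
    c->word_mode = on;
}

uint16_t card_read16(struct fake_card *c)
{
    uint16_t v = (uint16_t)(c->fifo[c->pos] | (c->fifo[c->pos + 1] << 8));
    c->pos += 2;
    return v;
}

uint8_t card_read8(struct fake_card *c)
{
    c->reads8++;
    return c->fifo[c->pos++];
}

/* Copy a received frame of len octets. Full words move in 16-bit mode;
 * a trailing odd octet is fetched in 8-bit mode so that it is never
 * the tail of a 16-bit transfer. */
void receive_frame(struct fake_card *c, uint8_t *buf, size_t len)
{
    size_t i = 0;

    card_set_word_mode(c, 1);
    for (; i + 1 < len; i += 2) {
        uint16_t w = card_read16(c);
        buf[i]     = (uint8_t)(w & 0xff);
        buf[i + 1] = (uint8_t)(w >> 8);
    }
    if (len & 1) {
        card_set_word_mode(c, 0);  /* drop to 8-bit mode for the last octet */
        buf[i] = card_read8(c);
        card_set_word_mode(c, 1);
    }
}
```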
As mentioned previously, the PCMCIA and CompactFlash slots on Neponset are accessed in a completely different fashion from the Assabet slot. The GPIO pins assigned to CompactFlash on Assabet are instead responsible for the SA-1111 shared memory interface and daughterboard interrupts. Card configuration occurs largely via the PCMCIA functional block on the SA-1111 instead of the Assabet BCR, with some power control features moved to the Neponset Control Register (NCR_0). Despite these significant differences, it is not necessary to implement separate client drivers for a CompactFlash device running on Assabet and Neponset. The socket driver abstraction facility masks these implementation issues completely.
PCMCIA and CompactFlash initialization begins by assuming that access to Neponset has been set up, and that the SA-1111 Register Access Bus has been properly activated. GPIO block A on the SA-1111 includes four signals which are connected to the MAX1600 power switching network on Neponset. As these will be used exclusively as outputs, their direction is configured using the block A Data Direction Register (PA_DDR). The MAX1600 is programmed to standby mode by clearing all four Vcc signals (by writing the block A Data Write Register, PA_DWR) as well as the two Vpp signals (by writing to NCR_0).
Like the SA-1110, the SA-1111 can assert a software interrupt based on level transitions36 for a number of signals. Of the nearly 50 interruptible signals available, six will be of interest to the PCMCIA implementation: S0Readynint, S0CDValid, S0_Bvd1Stschg, and their slot 1 equivalents. The SA-1111 manual notes that non-GPIO interrupts -- such as the PCMCIA signals -- should assert on a rising edge. Interestingly, on Neponset, all six signals are instead properly asserted on their falling edge. Assertion polarity is set by directly accessing the SA-1111 Interrupt Controller registers INTPOL0 and INTPOL1.37 With the polarity configured, interrupts for the Card Detect and Battery Voltage Detect signals are registered using request_irq().
The Neponset versions of socket_state and configure_socket have the same responsibilities as their Assabet equivalents, but enjoy control over more features. All state information for either slot can be sampled through the SA-1111 PCMCIA Status Register (PCSR). For each slot, the following information is returned: Card Detect status, Card Ready status, both Battery Voltage Detect signals, Write Protect status, and the 3.3V and ``X.XV'' Voltage Sense signals. Socket configuration is slightly more complex, and begins by sampling three registers: the PCMCIA Control Register (PCCR), NCR_0, and PA_DWR. Vcc can be configured to either 0V, 3.3V, or 5V for either slot by first programming the PCCR to float the socket control lines (in the case of 0V), or by setting the appropriate one of the S0PSE or S1PSE bits (for 3.3V or 5V). Power is actually switched to the sockets by asserting the appropriate signals attached to the block A GPIO lines, controlled by PA_DWR. In a similar manner, Vpp to either socket is configured on the MAX1600 by two signals accessible through the Neponset CPLD. Bits A0VPP and A1VPP in NCR_0 switch the 12V supply to slot 0 (the PCMCIA slot), but switch 0V to slot 1 (CompactFlash devices do not accept a 12V programming voltage). The last configurable PCMCIA setting is the socket reset signal, which is accessible through the S0_RST and S1_RST bits in the PCCR. Once all of the new power and reset selections have been processed, new values are written to each of the PCCR, NCR_0, and PA_DWR.
Unhappily, there is an exception to the previously offered claim that client drivers require no modification when moving to Neponset from Assabet, or from any other StrongARM SA-1100 board. It has been discovered that the Socket Communications LP-E CF+ card -- and possibly others -- can transfer neither frame headers nor frame data to the processor in 16-bit word mode. A runtime hack has been added to the pcnet_cs driver to disable the use of word mode, requiring transfers to occur one octet at a time. This has serious implications for the effective bandwidth to and from the card. Unfortunately, an insufficient number of cards have been tested on Neponset to determine at this time whether or not this is a fault of the LP-E device.
The Linux kernel in general is a massive distributed software development effort, aided by thousands of developers worldwide. Hundreds of core features which are critical to the usefulness of Linux on Assabet were put in place by these contributors over the last decade, such as the process scheduler, file systems, and serial terminal drivers. To properly acknowledge all of this work would require volumes of text, so this section aims to identify portions of the kernel which relate specifically to Assabet that have been contributed recently by developers not affiliated with Carnegie Mellon.
As discussed earlier in this report, interrupt organization on the StrongARM SA-1100 is perhaps more complex than might be considered ideal. On the SA-1110, some general-purpose I/O pins receive their own IRQ, while others share a common IRQ and require software to consult a register to identify the source. When the SA-1111 is added, more levels of indirection are added in order to consult the interrupt source registers on that chip. If the SA-1111 is part of a board like Neponset, there may be yet additional indirection required just to identify the companion chip as one among many potential requestors.
Historically, this has not been a problem, as many drivers written for Linux running on the StrongARM could be aware of this architecture, and behave accordingly. With the arrival of PCMCIA Card Services, however, the prospect arose of dozens of client drivers appearing which were written in a general fashion, with no expectation of such an unusual organization. A typical client driver issues a query to Card Services for the IRQ number it should request of the kernel, and after a successful IRQ request, the client assumes that the only work performed by its interrupt handler is that which relates specifically to the card itself. No provision exists for interrupt source lookup, or for any additional ``cleanup'' of the hardware after interrupt processing completes.
To avoid requiring these otherwise portable drivers to support every possible StrongARM board, the IRQ management code in the kernel was improved by Nicolas Pitre to simplify interrupt usage by drivers. The SA-1110 supports 32 separate IRQs, twelve of which are associated with the GPIO interface. IRQs 0-10 map directly to GPIO pins 0-10, with the remaining 17 GPIO sources (pins 11-27) sharing IRQ 11. Using the new IRQ system, software can request IRQ numbers 32-48, which are not ``real'' processor interrupts, but instead cause a proxy handler to be registered for IRQ 11. At runtime, when a shared interrupt comes in on IRQ 11, the GPIO Edge Detect Status Register (GEDR) is consulted and reset, and the registered interrupt handler is invoked just as though it were a first-class interrupt.
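The numbering scheme and the proxy dispatch for the shared GPIO interrupt can be sketched in a few lines. The IRQ ranges are those given in the text; the function and type names are invented, a plain variable stands in for the GEDR, and the real register is reset by writing to it rather than by clearing bits in memory as done here.

```c
#include <stddef.h>
#include <stdint.h>

#define FIRST_SHARED_GPIO  11           /* GPIO 11-27 share one IRQ */
#define LAST_GPIO          27
#define VIRT_IRQ_BASE      32           /* virtual IRQs 32-48 */
#define SHARED_GPIO_MASK   0x0ffff800u  /* bits 11 through 27 */

typedef void (*irq_handler)(int irq, void *dev);

struct virt_slot {
    irq_handler fn;
    void *dev;
};

struct virt_slot virt_irq[LAST_GPIO - FIRST_SHARED_GPIO + 1];
uint32_t sim_GEDR;  /* stand-in for the edge detect status register */

/* Map a GPIO pin to the IRQ number a driver would request. */
int gpio_to_irq(int gpio)
{
    if (gpio < 0 || gpio > LAST_GPIO)
        return -1;
    if (gpio < FIRST_SHARED_GPIO)
        return gpio;  /* GPIO 0-10 have first-class IRQs */
    return VIRT_IRQ_BASE + (gpio - FIRST_SHARED_GPIO);
}

/* Register a handler for one of the virtual IRQs 32-48. */
int request_virt_irq(int irq, irq_handler fn, void *dev)
{
    int idx = irq - VIRT_IRQ_BASE;

    if (idx < 0 || idx > LAST_GPIO - FIRST_SHARED_GPIO)
        return -1;
    virt_irq[idx].fn = fn;
    virt_irq[idx].dev = dev;
    return 0;
}

/* Proxy handler for the shared IRQ: consult and reset the edge status,
 * then call whichever virtual handlers correspond to the active pins. */
void shared_gpio_irq(void)
{
    uint32_t pending = sim_GEDR & SHARED_GPIO_MASK;

    sim_GEDR &= ~pending;
    for (int gpio = FIRST_SHARED_GPIO; gpio <= LAST_GPIO; gpio++) {
        int idx = gpio - FIRST_SHARED_GPIO;

        if ((pending & (1u << gpio)) && virt_irq[idx].fn != NULL)
            virt_irq[idx].fn(VIRT_IRQ_BASE + idx, virt_irq[idx].dev);
    }
}

/* Example handler: count invocations. */
void count_irq(int irq, void *dev)
{
    (void)irq;
    ++*(int *)dev;
}
```

Under this mapping GPIO 25 corresponds to virtual IRQ 46, since 32 + (25 - 11) = 46.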
IRQ support for the SA-1111 on Neponset, initially added to this framework at Carnegie Mellon, takes the same approach. New IRQ numbers 49-104 have been added for the interrupt sources on the SA-1111. When one of these interrupts is requested, several proxy handlers are installed. The first, for the shared IRQ 11 GPIO lines, is required because the Neponset interrupt is on GPIO 25. The second, for IRQ ``46,'' is a proxy for the three interrupt sources on Neponset: an ethernet controller, a keyboard encoder, and the SA-1111. This proxy reads the Neponset IRR and dispatches a handler for the requesting device. The last proxy, for SA-1111 interrupts, reads and resets the INTSTATCLR0 and INTSTATCLR1 registers on the SA-1111, then dispatches handlers for the active interrupt sources.
The end result of this organization is that almost any interrupt source on Assabet or Neponset can be requested by software using the standard Linux request_irq() service, albeit with perhaps a higher IRQ number than is typical. As mentioned previously, because the SA-1110 and SA-1111 allow flexibility in configuring the kinds of level transitions that constitute an interrupt condition, this approach cannot mask the underlying implementation completely. Software must still specify using a separate interface -- either through routines associated with IRQ handling, or by accessing the polarity registers directly -- which interpretation should be used. In practice, this has not posed a portability problem, as the correct convention is generally known at startup time, and can be configured in board-specific initialization code.
This section is provided more to mention an aspect of the code which will be interesting to pursue in the future, but which has not been tested as of this writing. Nicolas Pitre has moved Neponset detection to an early part of the kernel setup process in order to potentially configure the 32MB of SDRAM provided on that board. At present, this memory is not available to the kernel, but enabling this bank would be interesting for two reasons. First, memory on Neponset is clocked at half the speed of the Assabet SDRAM, which would result in non-uniform access times. Second, some devices are quite specific about which RAM bank they will access. For example, the SA-1111 performs DMA transfers to only one of the two boards, the selection of which is based on hardware jumpers. The Linux non-cacheable buffer allocation implementation does not provide the ability to express constraints on acceptable memory regions.38 Adding support for this functionality is a to-do item at the time of this writing.
The SA-1111 Serial Audio Controller support described previously could not have been attempted without the availability of a UDA1341 driver, authored by Nicolas Pitre. The original driver was written for the part as it is connected on Assabet, using the GPIO-based L3 bus implementation. When running against the SA-1111, the driver does not make use of the L3-related code, but is able to leverage the existing L3 read/write API. Similarly, the programming interface for SA-1111 DMA is mimicked from the SA-1110 version so as to preserve the code flow in the rest of the driver. Much of the remaining driver code, such as the implementation of the Open Sound System[OSS] API, and the various UDA1341 command register programming features, remains unchanged from the version which works on Assabet.
Data may be transferred between system memory and any of the five SA-1110 serial ports using any of six internal DMA engines. Like the engines in the SA-1111 Serial Audio Controller, an ``A'' and ``B'' transfer can be configured for each engine. Transfers can be issued either to or from the serial controller in 8- or 16-bit quantities, and the engines can perform endianness conversion in flight. The DMA controller is capable of interrupting the processor upon completion of any transfer.
Nicolas Pitre has added an API to the kernel which supports the DMA controller. Calling code passes a register block to sa1100_start_dma(), through which the control registers for one of the six engines are accessible. The service routine selects one of the ``A'' or ``B'' transfers to initiate, and then loads the appropriate registers with the appropriate flags, target address, and transfer count. It is the responsibility of the calling code to handle the interrupt associated with the engine being used. Generally, such code will dequeue a fresh buffer for transfer, and re-invoke the DMA start routine. As mentioned previously, the SA-1111 SAC DMA interface is sufficiently similar that a counterpart API for that chip was easily added alongside the SA-1110 code.
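The division of labour described here, a start routine that picks the ``A'' or ``B'' transfer and a caller-owned interrupt handler that queues the next buffer, can be illustrated with a simulated engine. None of this reflects the real register layout or the signature of sa1100_start_dma(): struct dma_regs, the busy flags, and the demo_ functions are all invented for the sketch.

```c
#include <stdint.h>

/* Invented register block for one engine; not the SA-1110 layout. */
struct dma_regs {
    uint32_t addr_a, count_a;
    uint32_t addr_b, count_b;
    uint32_t status;
};

#define DMA_BUSY_A  (1u << 0)
#define DMA_BUSY_B  (1u << 1)

/* Start a transfer on whichever of A or B is idle.
 * Returns 0 if A was used, 1 if B was used, -1 if both are busy. */
int demo_start_dma(struct dma_regs *r, uint32_t addr, uint32_t count)
{
    if (!(r->status & DMA_BUSY_A)) {
        r->addr_a = addr;
        r->count_a = count;
        r->status |= DMA_BUSY_A;
        return 0;
    }
    if (!(r->status & DMA_BUSY_B)) {
        r->addr_b = addr;
        r->count_b = count;
        r->status |= DMA_BUSY_B;
        return 1;
    }
    return -1;
}

/* Caller-side queue of buffers waiting to be transferred. */
struct buf_queue {
    const uint32_t *addrs;
    int next, total;
    uint32_t len;
};

/* Caller's completion handler: retire the finished transfer, then
 * dequeue a fresh buffer (if any) and re-invoke the start routine. */
void demo_dma_irq(struct dma_regs *r, uint32_t done_bit, struct buf_queue *q)
{
    r->status &= ~done_bit;
    if (q->next < q->total)
        demo_start_dma(r, q->addrs[q->next++], q->len);
}
```

With two transfers per engine, one can be in flight while the other is being refilled, which is what keeps a stream such as audio playing without gaps.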
The SA-1110 includes a special DMA engine which resides directly on the internal ARM data bus, which exclusively serves to transfer graphics data from an SDRAM framebuffer to an external liquid crystal display. The LCD controller is quite flexible, permitting displays of up to 1024×768 pixels at a 16-bit color depth. Assabet ships with a Sharp 3.9-inch LCD touchscreen, which supports either 8-bit or 16-bit color at 320×240 pixels. When running at the lower color depth, a CPLD on Assabet maps the 8-bit pixel data to the RGB444 (12-bit) format expected by the display. At the 16-bit depth, the RGB565 data output by the StrongARM is mapped to the 18-bit (RGB666) encoding used by the LCD. Which translation the CPLD performs is governed by the setting of the LCD12or16 bit in the Board Control Register, with LCD power controlled by the LCD On bit.
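For readers unfamiliar with the pixel formats, the RGB565 to RGB666 widening can be written out in C. This report does not say how the Assabet CPLD fills the extra red and blue bits; the sketch below assumes one common convention (replicating the most significant bit into the new low bit) purely to make the formats concrete, and the function name is invented.

```c
#include <stdint.h>

/* Widen a 16-bit RGB565 pixel to an 18-bit RGB666 value.
 * Assumed layout: red in bits 17-12, green in 11-6, blue in 5-0.
 * Red and blue gain one low-order bit, copied from their top bit so
 * that full-scale 5-bit values map to full-scale 6-bit values. */
uint32_t rgb565_to_rgb666(uint16_t pixel)
{
    uint32_t r5 = (pixel >> 11) & 0x1f;
    uint32_t g6 = (pixel >> 5) & 0x3f;
    uint32_t b5 = pixel & 0x1f;

    uint32_t r6 = (r5 << 1) | (r5 >> 4);
    uint32_t b6 = (b5 << 1) | (b5 >> 4);

    return (r6 << 12) | (g6 << 6) | b6;
}
```

Green already carries six bits in RGB565, so only red and blue need to be extended.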
Support for the CPLD interface, as well as palette configuration and other Assabet-specific details, has been added to the generic SA-1100 framebuffer driver. The result enables the use of Nano-X, XFree86, and other graphical display servers, as well as the standard Linux text console. Principal contributors to this effort include Jeff Sutherland, Tak-Shing Chan, and Nicolas Pitre.
The Sharp 3.9-inch LCD module includes a touchscreen with a standard 4-wire resistive output. On Assabet, this output is directed to a Philips UCB1300 ADC. The kernel presently includes several touchscreen drivers, but an effort is under way to standardize the form of these drivers within the kernel.[Tou] The current version suitable for use on Assabet was authored by Tak-Shing Chan, based largely on a previous driver by Peter Danielsson.