Embedded Systems Architecture: A Comprehensive Guide for Engineers and Programmers

By
Tammy Noergaard

CHAPTER 4 Embedded Processors

An embedded device contains at least one master processor, which acts as the central controlling device, and may have additional slave processors that work with and are controlled by the master processor.

For example, the STPC Atlas is a powerful x86-core, PC-compatible information-appliance system-on-chip (SoC).


The STPC Atlas integrates a standard fifth-generation x86 core with a powerful UMA graphics/video chipset and support logic, including PCI, ISA, local bus, USB, and EIDE controllers, and combines them with standard I/O interfaces to provide a complete PC-compatible subsystem on a single device, suitable for all kinds of terminal and industrial appliances.

In the block diagram of an x86 reference board, the STPC Atlas is the master processor, and the super I/O and Ethernet controllers are slave processors.

The complexity of the master processor usually determines whether it is classified as a microprocessor or a microcontroller:
  • Microprocessors contain a minimal set of integrated memory and I/O components.
  • Microcontrollers have most of the system memory and I/O components integrated on the chip.
With fewer components and lower power requirements, an integrated processor may result in a smaller and cheaper board.
Processors are considered to be of the same architecture when they can execute the same set of machine code instructions.

4.1 ISA Architecture Models

The features that are built into an architecture’s instruction set are commonly referred to as the Instruction Set Architecture or ISA.

CHAPTER 5 Board Memory

Embedded platforms can have a memory hierarchy, a collection of different types of memory, each with unique speeds, sizes, and usages.
  • Some of this memory is physically integrated on the processor, such as registers and certain types of primary memory (memory connected directly to or integrated into the processor), for example ROM, RAM, and level-1 cache.
  • Other memory is located on the board itself: some types of primary memory, such as ROM, level-2+ cache, and main memory, as well as secondary/tertiary memory, which is connected to the board but not directly to the master processor.
The basics of memory operation are essentially the same whether the memory is integrated into an IC or located discretely on a board.

Primary memory is typically a part of a memory subsystem made up of three components:

  • The memory IC, which is itself made up of three units: the memory array, the address decoder, and the data interface (a rough model of these units follows this list)
  • An address bus
  • A data bus
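As a rough illustration of how these three units and the two buses cooperate, the following C sketch models a read cycle through a simplified memory IC: the address decoder selects a row in the memory array, and the data interface drives that row's contents onto the data bus. The sizes, names, and little-endian byte ordering here are hypothetical, chosen only to mirror the structure described above.

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define ROWS      256   /* hypothetical number of rows in the memory array */
    #define ROW_BYTES 4     /* hypothetical data-bus width: 4 bytes (32 bits)  */

    /* The memory array: ROWS rows, each ROW_BYTES wide. */
    static uint8_t memory_array[ROWS][ROW_BYTES];

    /* Address decoder: maps an incoming address to one row of the array. */
    static unsigned address_decode(uint32_t address)
    {
        return (address / ROW_BYTES) % ROWS;    /* select the row holding this address */
    }

    /* Data interface: drives the selected row onto the data bus. */
    static uint32_t data_interface_read(unsigned row)
    {
        uint32_t word = 0;
        for (int i = 0; i < ROW_BYTES; i++)
            word |= (uint32_t)memory_array[row][i] << (8 * i);  /* assemble the bus word */
        return word;
    }

    /* One read cycle: address bus in, data bus out. */
    uint32_t memory_ic_read(uint32_t address_bus)
    {
        return data_interface_read(address_decode(address_bus));
    }

    int main(void)
    {
        memory_array[1][0] = 0xEF;   /* plant a known value at address 0x4 */
        printf("read 0x%08" PRIX32 "\n", memory_ic_read(0x00000004));
        return 0;
    }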
Memory ICs that connect to a board come in a variety of packages, including:
  • dual inline packages (DIPs)
  • single in-line memory modules (SIMMs)
  • dual in-line memory modules (DIMMs)
The capacitors in a DRAM memory array cannot hold their charge (data) indefinitely; the charge gradually dissipates over time, so an additional refresh mechanism is required to maintain the integrity of the data.
SRAMs usually consume less power than DRAMs, since there is no extra energy needed for a refresh.
DRAM is usually used as the "main" memory, where larger quantities are required, and is also used for video RAM and cache.

Level 2+ (level 2 and higher) cache is the level of memory that exists between the CPU and main memory in the memory hierarchy.

Basically, cache is used to store subsets of main memory that are used or accessed often.
  • Writes must eventually be performed in both cache and main memory to ensure that the two remain consistent.
  • When the CPU wants to read data from memory, level-1 cache is checked first. If the data is there, it is called a cache hit; the data is returned to the CPU and the memory access is complete. If the data is not in level-1 cache, it is a cache miss. External off-chip caches are then checked, and if there is a miss there as well, main memory is accessed to retrieve and return the data to the CPU (a sketch of this read path follows).
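The read path just described can be sketched in C as follows. The two tiny direct-mapped caches and the word-sized "main memory" are hypothetical stand-ins; the point is only the order of the checks, level-1 first, then the external (level-2) cache, then main memory.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define LINES     16      /* hypothetical number of lines per cache       */
    #define MEM_WORDS 1024    /* hypothetical size of "main memory", in words */

    typedef struct { bool valid; uint32_t tag; uint32_t data; } line_t;

    static line_t   l1[LINES], l2[LINES];      /* two tiny direct-mapped caches */
    static uint32_t main_memory[MEM_WORDS];    /* stand-in for DRAM             */

    static bool lookup(line_t *cache, uint32_t addr, uint32_t *data)
    {
        line_t *ln = &cache[addr % LINES];
        if (ln->valid && ln->tag == addr) { *data = ln->data; return true; }   /* hit  */
        return false;                                                          /* miss */
    }

    static void fill(line_t *cache, uint32_t addr, uint32_t data)
    {
        line_t *ln = &cache[addr % LINES];
        ln->valid = true; ln->tag = addr; ln->data = data;
    }

    /* Read path: level-1 cache first, then the external cache, then main memory. */
    uint32_t cpu_read(uint32_t addr)
    {
        uint32_t data;
        if (lookup(l1, addr, &data))                 /* level-1 cache hit           */
            return data;
        if (!lookup(l2, addr, &data)) {              /* miss everywhere: go to DRAM */
            data = main_memory[addr % MEM_WORDS];
            fill(l2, addr, data);
        }
        fill(l1, addr, data);                        /* keep the data close to the CPU */
        return data;
    }

    int main(void)
    {
        main_memory[42] = 0xCAFE;
        printf("first read:  0x%X\n", (unsigned)cpu_read(42));   /* misses, fetched from memory */
        printf("second read: 0x%X\n", (unsigned)cpu_read(42));   /* now a level-1 hit           */
        return 0;
    }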
In systems with a memory management unit (MMU) to perform address translation, cache can be integrated between the master processor and the MMU, or between the MMU and main memory.

5.4 Memory Management of External Memory

The two most common types of memory managers found on an embedded board are memory controllers (MEMC) and memory management units (MMUs).

Memory Controller:

  • The memory controller is a hardware component responsible for managing the flow of data between the CPU and the system memory (RAM).
  • It implements and provides glueless interfaces to the different types of memory in the system, such as SRAM and DRAM, synchronizing access to memory and verifying the integrity of the data being transferred.
  • It controls the timing and organization of data transfer between the CPU and RAM, including tasks such as fetching instructions and data from memory and writing data back to memory.
  • The memory controller ensures that data is transferred reliably and efficiently between the CPU and memory modules.
  • It also handles various memory-related tasks such as memory refresh operations (in DRAM), error correction, and sometimes memory mapping.
  • The controller manages the request from the master processor and accesses the appropriate banks, awaiting feedback and returning that feedback to the master processor.
Memory Management Unit (MMU):
  • The MMU is also a hardware component but is more closely associated with the CPU.
  • Its primary function is to translate virtual addresses generated by the CPU into physical addresses used by the memory subsystem.
  • The MMU enables the use of virtual memory, allowing programs to address more memory than physically available by utilizing disk space as an extension of RAM.
  • It implements techniques such as paging and segmentation to manage virtual memory, allocate memory space to processes, and control memory access permissions.
  • Whether the MMU's scheme supports segmentation, paging, or both typically depends on the software (the operating system).
  • Additionally, the MMU often handles memory protection, ensuring that processes cannot access memory locations outside their allocated address space.
To speed up the translation of addresses, the MMU can use level-1 cache, or portions of cache allocated as buffers for caching address translations, commonly referred to as the translation lookaside buffer (TLB), on the processor to store the mappings of logical addresses to physical addresses.

5.5 Board Memory and Performance

Performance (throughput) can be negatively impacted by main memory in particular, since the DRAM used for main memory can have a much lower bandwidth than that of the processor.

CHAPTER 6 Board I/O (Input/Output)

Input/output (I/O) components on a board are responsible for moving information into and out of the board to I/O devices connected to an embedded system.
Board I/O can consist of:
  • input components, which only bring information from an input device to the master processor
  • output components, which take information out of the master processor to an output device
  • components that do both
In short, board I/O can be:
  • as simple as a basic electronic circuit that connects the master processor directly to an I/O device, such as a master processor's I/O port wired to a clock or LED located on the board
  • as complex as an I/O subsystem whose circuitry includes several units

6.2 Interfacing the I/O Components

I/O hardware is made up of all or some combination of integrated master processor I/O, I/O controllers, a communications interface, a communication port, I/O buses, and a transmission medium.

For off-board I/O devices, such as keyboards, mice, LCDs, printers, and so on, a transmission medium is used to interconnect the I/O device to an embedded board via a communication port.

The communication port would then be interfaced to an I/O controller, a communication interface controller, or the master processor (with an integrated communication interface) via an I/O bus on the embedded board.
An I/O bus is essentially a collection of wires transmitting the data.
I/O buses typically support various protocols and standards, such as USB (Universal Serial Bus), SATA (Serial ATA), PCIe (Peripheral Component Interconnect Express), and Ethernet, depending on the type of device and the speed and bandwidth requirements.

The design of the communications interface between the I/O controller and master is based on four requirements:

  1. An ability of the master CPU to initialize and monitor the I/O controller.
    I/O controllers can typically be configured via control registers and monitored via status registers.
  2. A way for the master processor to request I/O.
    The most common mechanism used by the master processor to request I/O via the I/O controller is memory-mapped I/O, in which the I/O controller registers have reserved spaces in main memory (a minimal sketch of this mechanism follows the list).
  3. A way for the I/O device to contact the master CPU.
    Generally, an I/O device initiates an asynchronous interrupt request to signal (for example) that control and status registers can be read from or written to.
  4. Some mechanism for both to exchange data.
    This defines how data is actually exchanged between the I/O controller and the master processor. For example, a DMA controller can manage data transmissions and receptions directly between main memory and an I/O device; essentially, DMA requests control of the bus from the master processor.
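A minimal C sketch of the memory-mapped I/O mechanism named in requirement 2 is shown below: the controller's control, status, and data registers are simply reserved addresses in the memory map that the driver accesses through volatile pointers. The base address, register offsets, and bit definitions are hypothetical; a real driver takes them from the board and controller data sheets, and the code only runs on a target whose memory map actually reserves these locations.

    #include <stdint.h>

    /* Hypothetical addresses reserved in the memory map for an I/O controller. */
    #define IOC_BASE     0x40001000u
    #define IOC_CONTROL  (*(volatile uint32_t *)(IOC_BASE + 0x00))
    #define IOC_STATUS   (*(volatile uint32_t *)(IOC_BASE + 0x04))
    #define IOC_DATA     (*(volatile uint32_t *)(IOC_BASE + 0x08))

    /* Hypothetical bit definitions. */
    #define IOC_CTRL_ENABLE  (1u << 0)
    #define IOC_CTRL_IRQ_EN  (1u << 1)   /* let the device raise an interrupt when ready */
    #define IOC_STAT_READY   (1u << 0)

    /* Requirement 1: the master CPU initializes the controller via its control register. */
    void ioc_init(void)
    {
        IOC_CONTROL = IOC_CTRL_ENABLE | IOC_CTRL_IRQ_EN;
    }

    /* Requirements 2-4: the master requests I/O by touching the mapped registers, learns
     * that the device is ready by polling the status register (or by the device's
     * asynchronous interrupt), and exchanges the data through the data register. */
    uint32_t ioc_read_word(void)
    {
        while ((IOC_STATUS & IOC_STAT_READY) == 0)
            ;                            /* wait until the controller signals data is ready */
        return IOC_DATA;
    }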

CHAPTER 8 Device Drivers

  • A device driver that is architecture-specific manages the hardware that is integrated into the master processor (the architecture).
  • Examples of architecture-specific drivers that initialize and enable components within a master processor include on-chip memory, integrated memory managers (MMUs), and floating point hardware.
  • A device driver that is generic manages hardware that is located on the board and not integrated onto the master processor.
  • A generic driver can be configured to run on a variety of architectures that contain the related board hardware for which the driver is written.

8.2 Example 2: Memory Device Drivers

The master processor and programmers view memory as a large one-dimensional array, commonly referred to as the Memory Map. In the memory map, each cell of the array is a row of bytes (8 bits) and the number of bytes per row depends on the width of the data bus (8-bit, 16-bit, 32-bit, 64-bit, etc.).
[Figure: sample memory map]
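To make the memory-map idea concrete, the sketch below lays out a hypothetical embedded memory map as a table of regions (base address, size, and type), which is how a programmer typically consults it. The addresses and sizes are invented for illustration; a real map is fixed by the board's hardware design.

    #include <stdint.h>
    #include <stdio.h>

    /* One row of a (hypothetical) board memory map. */
    typedef struct {
        uint32_t    base;    /* first address of the region     */
        uint32_t    size;    /* length of the region in bytes   */
        const char *type;    /* what kind of memory lives there */
    } mem_region_t;

    static const mem_region_t memory_map[] = {
        { 0x00000000, 0x00200000, "Flash (boot ROM)"   },
        { 0x10000000, 0x04000000, "DRAM (main memory)" },
        { 0x40000000, 0x00100000, "Memory-mapped I/O"  },
    };

    int main(void)
    {
        for (size_t i = 0; i < sizeof memory_map / sizeof memory_map[0]; i++) {
            const mem_region_t *r = &memory_map[i];
            printf("0x%08X - 0x%08X  %s\n",
                   (unsigned)r->base, (unsigned)(r->base + r->size - 1), r->type);
        }
        return 0;
    }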
When physical memory is referenced from the software’s point-of-view, it is commonly referred to as logical memory, and its most basic unit is the byte.
  • Logical memory refers to the memory space that a process or program can access.
  • It consists of the logical addresses generated by the CPU during program execution.
  • Programs interact with logical memory through pointers, variables, and data structures.
  • Logical memory provides a uniform and abstracted view of memory for programs, allowing them to operate independently of the underlying hardware.
  • Logical memory is made up of all the physical memory (registers, ROM, and RAM) in the entire embedded system.
The memory subsystem includes all types of memory management components, such as memory controllers and MMU, as well as the types of memory in the memory map, such as registers, cache, ROM, DRAM, and so on.
In more complex address translation schemes, the logical address provided via the OS is made up of a segment number (the address of the start of the segment) and an offset (within that segment), which together are used to determine the physical address of the memory location.
The primary role of the Memory Management Unit (MMU) is to translate logical addresses generated by the CPU into physical addresses.
This translation process allows programs to operate using logical addresses while the MMU handles the mapping of these logical addresses to physical memory locations.
Through this address translation mechanism, the logical memory space of each process is mapped to the physical memory space available in the system.

Virtual memory is a memory management technique that extends the available logical memory beyond the physical memory capacity of the system.

  • It allows programs to use more memory than physically available by utilizing disk space as an extension of RAM.
  • Virtual memory systems maintain a mapping between logical addresses and physical addresses, swapping data between physical memory and disk storage as needed.
  • This enables efficient memory utilization and allows multiple programs to run simultaneously without exhausting physical memory resources.
OSes implement various memory management policies to efficiently manage the mapping between logical and physical memory.
These policies include page replacement algorithms (e.g., LRU, FIFO), memory allocation strategies (e.g., paging, segmentation), and memory protection mechanisms.
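As an illustration of one such policy, here is a minimal FIFO page-replacement sketch in C; an LRU policy would differ only in how the victim frame is chosen. The frame count and reference string are arbitrary, and this is a sketch of the policy itself, not of any particular operating system's implementation.

    #include <stdio.h>

    #define NUM_FRAMES 3            /* hypothetical number of physical frames   */

    static int frames[NUM_FRAMES];  /* which page currently occupies each frame */
    static int loaded      = 0;     /* how many frames are in use so far        */
    static int next_victim = 0;     /* FIFO pointer: the oldest resident page   */

    /* Returns the frame holding 'page', loading (and possibly evicting) as needed. */
    static int access_page(int page)
    {
        for (int i = 0; i < loaded; i++)      /* already resident? */
            if (frames[i] == page)
                return i;

        if (loaded < NUM_FRAMES) {            /* a free frame is still available */
            frames[loaded] = page;
            return loaded++;
        }

        int victim = next_victim;             /* FIFO: evict the oldest resident page */
        printf("page fault: evicting page %d for page %d\n", frames[victim], page);
        frames[victim] = page;
        next_victim = (next_victim + 1) % NUM_FRAMES;
        return victim;
    }

    int main(void)
    {
        int refs[] = { 1, 2, 3, 4, 1, 2, 5 }; /* a short page-reference string */
        for (int i = 0; i < (int)(sizeof refs / sizeof refs[0]); i++)
            access_page(refs[i]);
        return 0;
    }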

The terms "virtual address" and "logical address" are sometimes used interchangeably, but they can have distinct meanings depending on the context.
The main differences:

  • In virtual memory systems, each process has its own virtual address space, which may be larger than the physical memory available in the system.
  • Logical addresses typically correspond directly to physical memory locations in traditional memory addressing schemes, where the size of the logical memory space is the same as that of the physical memory.
  • Virtual addresses are translated into physical addresses by the MMU, allowing for dynamic mapping of memory pages to physical memory locations.
Virtual addresses are generated by the CPU during program execution according to the memory addressing scheme implemented by the OS.
The process of generating virtual addresses involves several steps:
  1. Address Space Allocation
    When a program is loaded into memory, the OS allocates a contiguous block of virtual memory to the process. This block represents the process's address space, which includes the code, data, and stack segments.
  2. Segmentation and Paging
    Depending on the memory management scheme used by the operating system, virtual memory may be managed using segmentation, paging, or a combination of both. Segmentation divides the process's address space into logical segments, such as the code segment, data segment, and stack segment; each segment has its own base address and length. Paging divides the address space into fixed-size pages, typically 4 KB or 8 KB in size; pages can be mapped to physical memory frames or swapped out to disk as needed.
  3. Logical Address Generation
    The CPU generates logical addresses during program execution. These addresses are relative to the base address of the segment or page being accessed. For example, when accessing a variable in the data segment, the CPU calculates the logical address as the sum of the base address of the data segment and the offset of the variable within the segment; similarly, when accessing an instruction in the code segment, the CPU calculates the logical address from the base address of the code segment and the offset of the instruction.
  4. Translation to Physical Address
    The MMU (Memory Management Unit) translates logical addresses into physical addresses using hardware mechanisms such as page tables and translation lookaside buffers (TLBs).
    • If the translation is present in the TLB, the MMU retrieves the corresponding physical address directly.
    • If the translation is not in the TLB, the page tables are consulted; a page that is not resident in physical memory triggers a page fault, prompting the operating system to load the required page into physical memory from disk.
  5. Accessing Physical Memory
    Once the MMU has translated the virtual address to a physical address, the CPU can access the corresponding location in physical memory. Data is read from or written to physical memory using the address obtained from the translation process. (A minimal C sketch of steps 3 and 4 follows this list.)
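The following C sketch, referenced from steps 3 and 4 above, shows a logical address being formed as a segment base plus an offset and then being translated through a single-level page table into a physical address. The segment bases, page size, and page-table contents are invented for illustration and are far simpler than a real MMU's structures.

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SIZE 4096u                 /* assume 4 KB pages */

    /* Hypothetical per-process segment table: base logical address of each segment. */
    enum { SEG_CODE, SEG_DATA, SEG_STACK };
    static const uint32_t segment_base[] = { 0x00000000, 0x00100000, 0x7FF00000 };

    /* Hypothetical page table: logical page number -> physical frame number. */
    static uint32_t page_table[1u << 20];   /* one entry per 4 KB page of a 4 GB space */

    /* Step 3: the logical address is the segment's base plus the offset within it. */
    static uint32_t logical_address(int segment, uint32_t offset)
    {
        return segment_base[segment] + offset;
    }

    /* Step 4: the MMU splits the logical address into a page number and a page offset,
     * then replaces the page number with the frame number found in the page table. */
    static uint32_t translate(uint32_t logical)
    {
        uint32_t page   = logical / PAGE_SIZE;
        uint32_t offset = logical % PAGE_SIZE;
        return page_table[page] * PAGE_SIZE + offset;
    }

    int main(void)
    {
        uint32_t la = logical_address(SEG_DATA, 0x24);  /* a variable in the data segment  */
        page_table[la / PAGE_SIZE] = 0x00ABC;           /* pretend the OS mapped this page */
        printf("logical 0x%08X -> physical 0x%08X\n",
               (unsigned)la, (unsigned)translate(la));
        return 0;
    }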

8.2.1 Memory Management Device Driver Pseudocode Examples

The following outlines the implementation of various memory management routines on the MPC860.

  1. Initializing the Memory Controller and connected ROM/RAM
    The on-board memory (Flash, SRAM, DRAM, etc.) is initialized by initializing the MPC860's integrated memory controller. The memory controller has two different types of subunits that exist to connect to certain types of memory:
    • the general-purpose chip-select machine (GPCM), designed to interface to SRAM, EPROM, Flash EPROM, and other peripherals (such as PCMCIA)
    • the user-programmable machines (UPMs), designed to interface to a wide variety of memory, including DRAM
    The pinout of the MPC860's memory controller reflects the different signals that connect these subunits to the various types of memory (for example, the PowerPC core connected to SRAM or to DRAM). For every chip select (CS), there is an associated memory bank.
    With every new access request to external memory, the memory controller determines whether the associated address falls into one of the eight address ranges (one for each bank) defined by the eight pairs of base registers (which specify the start address of each bank) and option registers (which specify the bank length). If it does, the memory access is processed by either the GPCM or one of the UPMs, depending on the type of memory located in the memory bank that contains the desired address. Because each memory bank has a pair of base and option registers (BR0/OR0–BR7/OR7), these register pairs must be configured in the memory controller initialization drivers.
  2. Initializing the Internal Memory Map
    The MPC860's internal memory map contains the architecture's special purpose registers (SPRs).
  3. Initializing the MMU
    The MPC860 MMU supports a 4 GB uniform (user) address space that can be divided into pages of a variety of sizes, specifically 4 KB, 16 KB, 512 KB, or 8 MB, which can be individually protected and mapped to physical memory.
    Using the smallest page size (4 KB), the 4 GB address space divides into roughly one million pages (4 GB / 4 KB = 2^20), so a full translation table (also called a memory map or page table) would contain about a million address translation entries, one for each 4 KB page.
    The MPC860 MMU does not manage the entire translation table at one time (in fact, most MMUs do not); there is typically no need for 4 GB of physical memory to be managed at once. As a result, the MPC860 MMU contains small caches, the translation lookaside buffers (TLBs), to store a subset of this memory map, with separate TLBs for instructions and data.
    In the case of the MPC860,
    • the TLBs are 32-entry, fully associative caches
    • the entire memory map is stored in cheaper off-chip main memory as a two-level tree of data structures that define the physical memory layout of the board and the corresponding effective memory addresses
    The TLB is how the MMU translates (maps) logical/virtual addresses to physical addresses. When software attempts to access a part of the memory map not within the TLB, a TLB miss occurs, which is essentially a trap requiring the system software (through an exception handler) to load the required translation entry into the TLB. The system software loads the new entry through a process called a tablewalk, which is basically the process of traversing the MPC860's two-level memory map tree in main memory to locate the desired entry to be loaded into the TLB (a simplified sketch of this traversal follows the list).
    • The first level has 1024 entries (indexed by the 10-bit level-1 index), where each entry is 4 bytes and represents a 4 MB segment of virtual memory.
    • Each level-1 entry holds a pointer to the base address of the level-2 table that describes the associated 4 MB segment of virtual memory.
    • Within each level-2 table, every entry represents one page of the respective virtual memory segment.
    The format of the 32-bit logical (effective) address generated by the PowerPC core differs depending on the page size.
    • For a 4 KB page (0x1000 bytes, a 12-bit addressing range), the effective address is made up of a 10-bit level-1 index, a 10-bit level-2 index, and a 12-bit page offset.
    • For a 16 KB page (0x4000 bytes, a 14-bit addressing range), the effective address is made up of a 10-bit level-1 index, an 8-bit level-2 index, and a 14-bit page offset.
    The larger the virtual memory page size, the less memory is used for level-2 translation tables.
    In short, the MMU uses these effective address fields (level-1 index, level-2 index, and offset) in conjunction with other registers, the TLBs, the translation tables, and the tablewalk process to determine the associated physical address. The MMU initialization sequence involves initializing the MMU registers and translation table entries.
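The sketch below, referenced from the tablewalk discussion above, shows how the 4 KB-page effective-address format (a 10-bit level-1 index, a 10-bit level-2 index, and a 12-bit page offset) can drive a software walk of a two-level table after a TLB miss. The table-entry layouts, names, and the use of ordinary pointers are simplified placeholders rather than the MPC860's actual register-level formats; the code is meant only to illustrate the two-level traversal.

    #include <stdint.h>
    #include <stdio.h>

    /* Effective-address fields for a 4 KB page:
     * 10-bit level-1 index | 10-bit level-2 index | 12-bit page offset */
    #define L1_INDEX(ea)    (((ea) >> 22) & 0x3FFu)
    #define L2_INDEX(ea)    (((ea) >> 12) & 0x3FFu)
    #define PAGE_OFFSET(ea) ((ea) & 0xFFFu)

    /* Simplified, hypothetical table-entry layouts: each level-1 entry points to a
     * level-2 table covering a 4 MB segment, and each level-2 entry gives the
     * physical base address of one 4 KB page. */
    typedef struct { uint32_t phys_page_base; int valid; } l2_entry_t;
    typedef struct { l2_entry_t *l2_table;    int valid; } l1_entry_t;

    static l1_entry_t level1_table[1024];    /* in a real system: a tree kept in main memory */

    /* Tablewalk: traverse the two-level tree to find the translation that the exception
     * handler would load into the TLB after a TLB miss. Returns 0 if no mapping exists. */
    static int tablewalk(uint32_t effective_addr, uint32_t *phys_addr)
    {
        l1_entry_t *l1 = &level1_table[L1_INDEX(effective_addr)];
        if (!l1->valid)
            return 0;                        /* no level-2 table for this 4 MB segment */

        l2_entry_t *l2 = &l1->l2_table[L2_INDEX(effective_addr)];
        if (!l2->valid)
            return 0;                        /* page not mapped */

        *phys_addr = l2->phys_page_base | PAGE_OFFSET(effective_addr);
        return 1;                            /* a real handler would now load the TLB */
    }

    int main(void)
    {
        static l2_entry_t seg0_pages[1024];               /* level-2 table for segment 0 */
        seg0_pages[2]   = (l2_entry_t){ 0x00ABC000u, 1 }; /* map page 2 of segment 0     */
        level1_table[0] = (l1_entry_t){ seg0_pages, 1 };

        uint32_t pa;
        if (tablewalk(0x00002123u, &pa))     /* level-1 index 0, level-2 index 2, offset 0x123 */
            printf("physical address: 0x%08X\n", (unsigned)pa);
        return 0;
    }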

CHAPTER 9 Embedded Operating Systems

9.3 Memory Management

The CPU only executes task code that is in cache or RAM.
The OS treats memory as one large one-dimensional array, called a memory map. The conversion between logical and physical addresses is done either by a hardware component integrated into the master CPU or located on the board (such as an MMU), or it must be handled by the OS.

Kernel routines run in kernel mode (also referred to as supervisor mode). Higher layers of software run in user mode and can only access anything running in kernel mode via system calls, the higher-level interfaces to the kernel's subroutines.

Because multiple processes share the same physical memory when loaded into RAM for processing, the operating system uses memory "swapping," in which partitions of memory are swapped in and out of memory at run-time. The most common partitions of memory used in swapping are segments and pages. Swapping is the foundation for virtual memory.

A process encapsulates all the information that is involved in executing a program; all of the different types of information within a process are divided into "logical" memory units of variable sizes, called segments. A segment is a set of logical addresses containing the same type of information.
Most OSes typically allow processes to have all or some combination of five types of information within segments:

  • text (or code) segment
  • data segment
  • bss (block started by symbol) segment
  • stack segment
  • heap segment
The data, text, and bss segments are all fixed in size at compile time and are therefore static segments; it is these three segments that typically are part of the executable file.
The OS creates a task’s image by memory mapping the contents of the executable file.
The stack and heap segments, on the other hand, are not fixed at compile time, and can change in size at runtime and so are dynamic allocation components.
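A small C example may make the mapping from program objects to segments concrete; the comments note where each object typically ends up, though the exact placement is toolchain-dependent.

    #include <stdio.h>
    #include <stdlib.h>

    int initialized_value = 42;     /* data segment: initialized global variables            */
    int uninitialized_value;        /* bss segment: zero-initialized (uninitialized) globals */
    const char banner[] = "hello";  /* read-only data, often grouped with the text segment   */

    int main(void)                  /* the compiled function code itself: text segment       */
    {
        int local = 7;                           /* stack segment: automatic variables */
        int *dynamic = malloc(sizeof *dynamic);  /* heap segment: run-time allocation  */

        printf("%d %d %s %d\n", initialized_value, uninitialized_value, banner, local);
        free(dynamic);
        return 0;
    }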

Either with or without segmentation, some OSes divide logical memory into some number of fixed-size partitions, called blocks, frames, pages or some combination of a few or all of these.

  • When a process is loaded in its entirety into memory (in the form of pages), its pages may not be located within a contiguous set of frames. Every process has an associated process table that tracks its pages and each page's corresponding frame in memory.
  • The logical addresses generated for each process are unique, typically made up of a page number, which identifies the page, and an offset, which identifies the actual memory location within that page.
  • In effect, the logical address is formed from the page number combined with the offset.
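For example (with invented numbers): assuming 4 KB pages, logical address 0x3204 falls in page 3 at offset 0x204; if the process table maps page 3 to frame 7, the corresponding physical location is 7 × 0x1000 + 0x204 = 0x7204.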
Dividing up logical memory into pages aids the OS in more easily managing tasks being relocated in and out of various types of memory in the memory hierarchy, a process called swapping.

Virtual memory is typically implemented via demand segmentation and/or demand paging. When virtual memory is implemented via these "demand" techniques, only the pages and/or segments that are currently in use are loaded into RAM.

The OS:

  • generates virtual addresses based on the logical addresses
  • maintains tables for converting sets of logical addresses into virtual addresses
  • manages more than one address space for each process (the physical, logical, and virtual address spaces)
The process views memory as one continuous memory space, whereas the kernel actually manages memory as several fragmented pieces which can be segmented and paged, segmented and unpaged, unsegmented and paged, or unsegmented and unpaged.
