Linux Device Drivers - IV


Content
  • Chapter 11: Data Types in the Kernel
  • Chapter 12: PCI Drivers
  • Chapter 13: USB Drivers
  • Chapter 14: The Linux Device Model

Chapter 11: Data Types in the Kernel

Chapter 12: PCI Drivers

Chapter 13: USB Drivers

Topologically, a USB subsystem is not laid out as a bus; it is rather a tree built out of several point-to-point links.

The USB host controller is in charge of asking every USB device if it has any data to send.
Because of this topology, a USB device can never start sending data without first being asked to by the host controller.

The bus is very simple at the technological level, as it’s a single-master implementation in which the host computer polls the various peripheral devices.

The USB core provides an interface for USB drivers to use to access and control the USB hardware, without having to worry about the different types of USB hardware controllers that are present on the system.

USB Device Basics

The Linux kernel provides a subsystem called the USB core to handle most of the complexity in the USB standard.

Endpoints

A USB endpoint can carry data in only one direction, either from the host computer to the device (called an OUT endpoint) or from the device to the host computer (called an IN endpoint).
Endpoints can be thought of as unidirectional pipes.

A USB endpoint can be one of four different types that describe how the data is transmitted:

  • CONTROL
  • Control endpoints are commonly used for configuring the device, retrieving information about the device, sending commands to the device, or retrieving status reports about the device.
    Every USB device has a control endpoint called “endpoint 0” that is used by the USB core to configure the device at insertion time.
  • INTERRUPT
  • Interrupt endpoints transfer small amounts of data at a fixed rate every time the USB host asks the device for data.
  • BULK
  • Bulk endpoints transfer large amounts of data.
  • ISOCHRONOUS
USB endpoints are described in the kernel with struct usb_host_endpoint.
This structure contains the real endpoint information in another structure called struct usb_endpoint_descriptor.
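
The direction and type of an endpoint are encoded in two one-byte fields of the endpoint descriptor. The following is a minimal userspace sketch of that decoding, not kernel code: struct ep_desc and the helper names are hypothetical, while the bit masks follow the USB specification (the kernel defines the same values in linux/usb/ch9.h).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit layout from the USB specification. */
#define EP_DIR_MASK      0x80  /* bit 7 of bEndpointAddress: 1 = IN, 0 = OUT */
#define EP_NUMBER_MASK   0x0f  /* bits 0..3 of bEndpointAddress */
#define EP_XFERTYPE_MASK 0x03  /* bits 0..1 of bmAttributes */

/* Hypothetical cut-down descriptor holding only the two fields decoded here. */
struct ep_desc {
    uint8_t bEndpointAddress;
    uint8_t bmAttributes;
};

/* Nonzero if data flows from the device to the host. */
int ep_is_in(const struct ep_desc *d)
{
    assert(d != NULL);
    return (d->bEndpointAddress & EP_DIR_MASK) != 0;
}

/* Endpoint number, 0 being the default control endpoint. */
int ep_number(const struct ep_desc *d)
{
    assert(d != NULL);
    return d->bEndpointAddress & EP_NUMBER_MASK;
}

/* Name of the transfer type, indexed by the two low bits of bmAttributes. */
const char *ep_type_name(const struct ep_desc *d)
{
    static const char *const names[] = {
        "CONTROL", "ISOCHRONOUS", "BULK", "INTERRUPT"
    };

    assert(d != NULL);
    return names[d->bmAttributes & EP_XFERTYPE_MASK];
}
```

With these helpers, a descriptor whose bEndpointAddress is 0x81 and whose bmAttributes is 0x03 decodes as endpoint 1, IN, INTERRUPT, which is what a mouse typically exposes.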

Interfaces

USB endpoints are bundled up into interfaces.
  • A USB interface handles only one type of USB logical connection, such as a mouse, a keyboard, or an audio stream.
  • Some USB devices have multiple interfaces, such as a USB speaker that might consist of two interfaces: a USB keyboard for the buttons and a USB audio stream.
Because a USB interface represents basic functionality, each USB driver controls one interface. For the USB speaker example, Linux needs two different drivers for one hardware device.
USB interfaces may have alternate settings, which are different choices for parameters of the interface. The initial state of an interface is the first setting, numbered 0.

USB interfaces are described in the kernel with the struct usb_interface structure.

Configurations

USB interfaces are themselves bundled up into configurations.
A USB device can have multiple configurations and might switch between them in order to change the state of the device.
For example, some devices that allow firmware to be downloaded to them contain multiple configurations to accomplish this.

USB and Sysfs

Both the physical USB device (as represented by a struct usb_device ) and the individual USB interfaces (as represented by a struct usb_interface ) are shown in sysfs.
For example, the kernel log line for a simple USB mouse:

[    3.113394] usb 1-1.1: New USB device found, idVendor=046d, idProduct=c077, bcdDevice=72.00

How the kernel labels the USB devices:

root_hub-hub_port:config.interface
root_hub-hub_port.hub_port:config.interface
  • The first USB device is a root hub.
  • This is the USB controller, usually contained in a PCI device.
    The controller is a bridge between the PCI bus and the USB bus, as well as being the first USB device on that bus.
    All root hubs are assigned a unique number by the USB core.
    There is no limit on the number of root hubs that can be contained in a single system at any time.
  • Every device on a USB bus takes the number of the root hub as the first number in its name. That is followed by a - character and then the number of the port that the device is plugged into.
  • As more USB hubs are chained together, each hub port number is appended to the string after the previous hub port number in the chain.
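
The naming scheme can be made concrete with a small parser. This is a userspace sketch under stated assumptions: struct usb_name, parse_usb_name() and MAX_PORTS are hypothetical names, and the parser accepts either "." or "-" between chained hub ports, since the log line above shows the dotted form (1-1.1).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 8  /* arbitrary limit chosen for this sketch */

struct usb_name {
    int bus;               /* root hub number */
    int ports[MAX_PORTS];  /* chain of hub port numbers */
    int nports;
    int config;
    int interface;
};

/* Parse "bus-port[.port...]:config.interface"; return 0 on success, -1 otherwise. */
int parse_usb_name(const char *s, struct usb_name *out)
{
    int n = 0;

    assert(s != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    /* Root hub number followed by '-'; %n stays 0 if the '-' is missing. */
    if (sscanf(s, "%d-%n", &out->bus, &n) != 1 || n == 0)
        return -1;
    s += n;

    /* One or more hub port numbers. */
    for (;;) {
        int port;

        n = 0;
        if (out->nports == MAX_PORTS || sscanf(s, "%d%n", &port, &n) != 1)
            return -1;
        out->ports[out->nports++] = port;
        s += n;
        if (*s != '.' && *s != '-')
            break;
        s++;
    }

    /* ":config.interface" must follow and end the string. */
    if (*s != ':')
        return -1;
    n = 0;
    if (sscanf(s + 1, "%d.%d%n", &out->config, &out->interface, &n) != 2 ||
        s[1 + n] != '\0')
        return -1;
    return 0;
}
```

For the illustrative name "1-1.1:1.0", this yields root hub 1, the port chain 1 then 1, configuration 1 and interface 0.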

USB Urbs

Writing a USB Driver

Chapter 14: The Linux Device Model


The 2.6 kernel introduced a unified device model: a general abstraction describing the structure of the system.
It is now used within the kernel to support a wide variety of tasks, including:
  • Power management and system shutdown
  • For example, a USB host adaptor cannot be shut down before dealing with all of the devices connected to that adaptor. The device model enables a traversal of the system’s hardware in the right order.
  • Communications with user space
  • The implementation of the sysfs virtual filesystem is tightly tied into the device model.
  • Hotpluggable devices
  • The hotplug mechanism used within the kernel to handle and (especially) communicate with user space about the plugging and unplugging of devices is managed through the device model.
  • Device classes
  • The device model includes a mechanism for assigning devices to classes, which describe those devices at a higher, functional level and allow them to be discovered from user space. Communication with user space via sysfs is a device model function.
  • Object lifecycles
  • The creation and manipulation of objects created within the kernel is complex. The device model provides a set of mechanisms for dealing with object lifecycles, their relationships to each other, and their representation in user space.
A tiny piece of the device model structure associated with a USB mouse:

  • “devices” tree shows how the mouse is connected to the system
  • “bus” tree tracks what is connected to each bus
  • “classes” tree concerns itself with the functions provided by the devices
There are hundreds of nodes like this in a system; it is a difficult data structure to visualize as a whole.
The Linux device model code takes care of all these considerations without imposing itself upon driver authors.
Driver authors can ignore the device model entirely, and trust it to take care of itself.

By showing how the low-level device components work, we can see how those components are used to build the larger structure.

Kobjects, Ksets


The kobject is the fundamental structure that holds the device model together.
The tasks handled by struct kobject and its supporting code now include:
  • Reference counting of objects
  • One way of tracking the lifecycle of such objects is through reference counting. When no code in the kernel holds a reference to a given object, that object has finished its useful life and can be deleted.
  • sysfs representation
  • Every object that shows up in sysfs has a kobject.
  • Data structure glue
  • The device model is a complicated data structure made up of multiple hierarchies with numerous links between them. The kobject implements this structure and holds it together.
  • Hotplug event handling
  • The kobject subsystem handles the generation of events that notify user space about the comings and goings of hardware on the system.

Kobject Basics


A kobject has the type struct kobject defined in <linux/kobject.h>.

struct kobject {
 const char  *name;
 struct list_head entry;
 struct kobject  *parent;
 struct kset  *kset;
 struct kobj_type *ktype;
 struct kernfs_node *sd; /* sysfs directory entry */
 struct kref  kref;
#ifdef CONFIG_DEBUG_KOBJECT_RELEASE
 struct delayed_work release;
#endif
 unsigned int state_initialized:1;
 unsigned int state_in_sysfs:1;
 unsigned int state_add_uevent_sent:1;
 unsigned int state_remove_uevent_sent:1;
 unsigned int uevent_suppress:1;
};
Kobjects are usually embedded in other structures. In object-oriented terms, a kobject can be seen as a top-level, abstract class from which other classes are derived.
For example, struct cdev embeds a kobject:

struct cdev {
    struct kobject kobj;
    struct module *owner;
    struct file_operations *ops;
    struct list_head list;
    dev_t dev;
    unsigned int count;
};
Code that is handed a pointer to an embedded kobject can recover the containing structure with the container_of() macro. For example, to convert kp, a pointer to the struct kobject embedded in a struct cdev, into a pointer to that struct cdev:

struct cdev *device = container_of(kp, struct cdev, kobj);
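
The pointer arithmetic behind container_of() can be tried outside the kernel. The sketch below is a simplified userspace version of the macro (without the kernel's type checking); struct fake_kobject, struct fake_cdev and to_fake_cdev() are hypothetical stand-ins for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Subtract the member's offset from the member's address to reach the container. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct kobject. */
struct fake_kobject {
    const char *name;
};

/* Hypothetical container; the kobject is deliberately not the first member. */
struct fake_cdev {
    int count;
    struct fake_kobject kobj;
};

/* Convert a pointer to the embedded object back into its container. */
struct fake_cdev *to_fake_cdev(struct fake_kobject *kp)
{
    assert(kp != NULL);
    return container_of(kp, struct fake_cdev, kobj);
}
```

Because the conversion relies only on the member offset, it works wherever the kobject sits inside the containing structure.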

The initialization steps for a kobject:
  • zero out a kobject structure
  • set up some of the internal fields
  • 
    void kobject_init(struct kobject *kobj, struct kobj_type *ktype)
    
    This function will properly initialize a kobject such that it can then be passed to the kobject_add() call. After this function is called, the kobject MUST be cleaned up by a call to kobject_put(), not by a call to kfree directly to ensure that all of the memory is cleaned up properly. This sets the kobject’s reference count to 1.
  • set the name of the kobject
  • This name is used in sysfs entries.
    
    int kobject_set_name(struct kobject *kobj, const char *format, ...);
    
    This function takes a printk-style variable argument list.
The low-level functions for manipulating a kobject’s reference counts are:
  • struct kobject *kobject_get(struct kobject *kobj)
  • A successful call to kobject_get() increments the kobject’s reference counter and returns a pointer to the kobject.
  • void kobject_put(struct kobject *kobj)
  • the call to kobject_put() decrements the reference count and, possibly, frees the object.
The reference count in the kobject itself may not be sufficient to prevent race conditions.
A kobject can only be used safely while the module that created it is still loaded.
try_module_get() and module_put() manipulate the module usage count to protect against removal (a module also cannot be removed while another module uses one of its exported symbols). Before calling into module code, call try_module_get() on that module: if it fails, the module is being removed and you should act as if it were not there. Otherwise, you can safely enter the module, and call module_put() when you are finished.
The cdev_get() function does more than just increment the count in the kobject, however; it also increments the reference count for the module which drives that device:

static struct kobject *cdev_get(struct cdev *p)
{
 struct module *owner = p->owner;
 struct kobject *kobj;

 if (owner && !try_module_get(owner))
  return NULL;
 kobj = kobject_get_unless_zero(&p->kobj);
 if (!kobj)
  module_put(owner);
 return kobj;
}
What happens to a kobject when its reference count reaches 0?
The reference count is not under the direct control of the code that created the kobject, so that code must be notified asynchronously when the last reference goes away. This notification is done through the kobject’s release() method. The kobject must persist until the release() method is called.

The release() method is not stored in the kobject itself; instead, it is associated with the type of the structure that contains the kobject, described by struct kobj_type:

struct kobj_type {
 void (*release)(struct kobject *kobj);
 const struct sysfs_ops *sysfs_ops;
 struct attribute **default_attrs; /* use default_groups instead */
 const struct attribute_group **default_groups;
 const struct kobj_ns_type_operations *(*child_ns_type)(struct kobject *kobj);
 const void *(*namespace)(struct kobject *kobj);
 void (*get_ownership)(struct kobject *kobj, kuid_t *uid, kgid_t *gid);
};
The ktype controls what happens when a kobject is no longer referenced and the kobject's default representation in sysfs.
The following helper returns the kobj_type pointer for a given kobject:

struct kobj_type *get_ktype(struct kobject *kobj);
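
The interplay between get, put and a per-type release() method can be sketched in userspace. Everything here is hypothetical (struct obj, struct obj_type, obj_get(), obj_put()), and the counter is a plain int, whereas the kernel uses an atomic kref; the sketch only illustrates the control flow.

```c
#include <assert.h>
#include <stdlib.h>

struct obj;

/* Per-type operations, in the spirit of struct kobj_type. */
struct obj_type {
    void (*release)(struct obj *o);
};

struct obj {
    int refcount;                 /* not atomic: single-threaded sketch only */
    const struct obj_type *type;
};

/* Like kobject_init(): the creator starts out holding one reference. */
void obj_init(struct obj *o, const struct obj_type *type)
{
    assert(o != NULL && type != NULL && type->release != NULL);
    o->refcount = 1;
    o->type = type;
}

/* Take an additional reference and hand the pointer back. */
struct obj *obj_get(struct obj *o)
{
    if (o)
        o->refcount++;
    return o;
}

/* Drop a reference; the type's release() runs when the last one goes away. */
void obj_put(struct obj *o)
{
    if (o && --o->refcount == 0)
        o->type->release(o);
}

/* Example type whose release() frees the object and counts its calls. */
int released_count;

void counting_release(struct obj *o)
{
    released_count++;
    free(o);
}

const struct obj_type counting_type = { .release = counting_release };
```

The creator never frees the object directly: it drops its own reference with obj_put(), and whichever caller drops the last reference triggers release().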

Kobject Hierarchies


The kobject structure is often used to link together objects into a hierarchical structure that matches the structure of the subsystem being modeled. There are two separate mechanisms for this linking: the parent pointer and ksets.
The parent field in struct kobject is a pointer to another kobject — the one representing the next level up in the hierarchy.
For example, if a kobject represents a USB device, its parent pointer may point to the object representing the hub into which the device is plugged.
The main use for the parent pointer is to position the object in the sysfs hierarchy.
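
To illustrate how a chain of parent pointers translates into a position in a directory tree, here is a userspace sketch that builds a path by walking the parents. struct node and node_path() are hypothetical names, and the node names in the usage note are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kobject: just a name and a parent link. */
struct node {
    const char *name;
    struct node *parent;   /* NULL at the top of the hierarchy */
};

/* Write "/top/.../n" into buf; return 0, or -1 if buf is too small. */
int node_path(const struct node *n, char *buf, size_t size)
{
    const struct node *p;
    size_t len = 0;
    char *end;

    assert(n != NULL && buf != NULL);

    /* First pass: total length, one '/' per component. */
    for (p = n; p; p = p->parent)
        len += strlen(p->name) + 1;
    if (len + 1 > size)
        return -1;

    /* Second pass: fill the buffer from right to left. */
    end = buf + len;
    *end = '\0';
    for (p = n; p; p = p->parent) {
        size_t l = strlen(p->name);

        end -= l;
        memcpy(end, p->name, l);
        *--end = '/';
    }
    return 0;
}
```

With three nodes named usb1, 1-1 and 1-1.1 chained through their parent pointers, node_path() on the last one produces /usb1/1-1/1-1.1.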

Whereas struct kobj_type concerns itself with the type of an object, struct kset is concerned with aggregation and collection.

struct kset {
    struct list_head list;
    spinlock_t list_lock;
    struct kobject kobj;
    const struct kset_uevent_ops *uevent_ops;
} __randomize_layout;
A kset keeps its children in a standard kernel linked list, whose node type is defined in linux/types.h:

struct list_head {
 struct list_head *next, *prev;
};
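
A userspace sketch may help show how such a circular, doubly linked list holds the members of a set. The helper names (list_init(), list_add_tail(), list_remove(), sum_ids()) and struct member are hypothetical; the kernel's own helpers live in linux/list.h and differ in detail.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head that points at itself. */
void list_init(struct list_head *head)
{
    assert(head != NULL);
    head->next = head;
    head->prev = head;
}

/* Insert new_node just before head, that is, at the tail of the list. */
void list_add_tail(struct list_head *new_node, struct list_head *head)
{
    assert(new_node != NULL && head != NULL);
    new_node->prev = head->prev;
    new_node->next = head;
    head->prev->next = new_node;
    head->prev = new_node;
}

/* Unlink entry and leave it pointing at itself. */
void list_remove(struct list_head *entry)
{
    assert(entry != NULL);
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = entry;
}

/* Hypothetical set member: the list node is embedded, like kobject.entry. */
struct member {
    int id;
    struct list_head entry;
};

#define member_of(ptr) \
    ((struct member *)((char *)(ptr) - offsetof(struct member, entry)))

/* Walk the list from the head until it wraps around, summing the ids. */
int sum_ids(struct list_head *head)
{
    struct list_head *pos;
    int sum = 0;

    for (pos = head->next; pos != head; pos = pos->next)
        sum += member_of(pos)->id;
    return sum;
}
```

The list itself knows nothing about struct member; each member is recovered from its embedded node, which is the same trick a kset uses with the entry field of its kobjects.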

A kset is merely a collection of kobjects that want to be associated with each other. There is no restriction that they be of the same ktype, but be very careful if they are not. Objects of identical type can appear in distinct sets.
A kset serves these functions:
  • It serves as a bag containing a group of objects. A kset can be used by the kernel to track "all block devices" or "all PCI device drivers."
  • A kset is also a subdirectory in sysfs, where the kobjects associated with the kset can show up. Every kset contains a kobject which can be set up to be the parent of other kobjects; in this way the device model hierarchy is constructed.
  • Ksets can support the "hotplugging" of kobjects and influence how uevent events are reported to user space.
  • A kset can provide a set of default attributes that all kobjects that belong to it automatically inherit and have created whenever a kobject is registered belonging to the kset.
The following fields need attention when a kobject is prepared for its existence in the kernel:
  • name
  • This field should always be initialized with kobject_set_name(), or specified in the original call to kobject_create_and_add().
  • kref
  • This holds the kobject's reference count; it is initialized by kobject_init().
  • parent
  • This is the kobject's parent in whatever hierarchy it belongs to. It can be set explicitly by the creator. If parent is NULL when kobject_add() is called, it will be set to the kobject of the containing kset.
  • kset
  • This is a pointer to the kset which will contain this kobject; it should be set prior to calling kobject_add().
  • ktype
  • This is the type of the kobject; with the prototype shown earlier, it is passed as the second argument to kobject_init().
ksets have an interface very similar to that of kobjects:

 void kset_init(struct kset *kset);
 int __must_check kset_register(struct kset *kset);
 void kset_unregister(struct kset *kset);
 struct kset * __must_check kset_create_and_add(const char *name,
      const struct kset_uevent_ops *u,
      struct kobject *parent_kobj);

Low-Level Sysfs Operations

In addition to /proc, the kernel also exports information to another virtual file system called sysfs. The creation of sysfs helped clean up the proc file system because much of the hardware information has been moved from proc to sysfs. The sysfs file system is mounted on /sys. A brief description of some of these top directories:
  • /sys/block
  • This directory contains entries for each block device in the system. Symbolic links point to the physical device that the device maps to in the physical device tree.
  • /sys/bus
  • This directory contains subdirectories for each physical bus type supported in the kernel. Each bus type has two subdirectories: devices and drivers.
    • The devices directory lists devices discovered on that type of bus.
    • The drivers directory contains directories for each device driver registered with the bus type.
    • Driver parameters can be viewed and manipulated.
  • /sys/class
  • This directory contains every device class registered with the kernel. Device classes describe a functional type of device. Examples include input devices, network devices, and block devices.
  • /sys/devices
  • This directory contains the global device hierarchy of all devices on the system. This directory also contains a platform directory and a system directory. The platform directory contains peripheral devices specific to a particular platform such as device controllers. The system directory contains non-peripheral devices such as CPUs and APICs.
  • /sys/firmware
  • This directory contains subdirectories with firmware objects and attributes.
  • /sys/module
  • This directory contains subdirectories for each module that is loaded into the kernel
  • /sys/power
  • The system power state can be controlled from this directory. The disk attribute controls the method by which the system suspends to disk. The state attribute allows a process to enter a low power state.
The sysctl utility can also be used to view or modify values to writable files in the /proc/sys directory.

Kobjects are the mechanism behind the sysfs virtual filesystem. For every directory found in sysfs, there is a kobject somewhere within the kernel. A call to kobject_add() results in the creation of a directory in sysfs.


int kobject_add(struct kobject *kobj, struct kobject *parent, const char *fmt, ...)
==> kobject_add_varg(kobj, parent, fmt, args);
       ==>  kobject_add_internal(kobj);
               ==> create_dir(kobj);
                      ==> populate_dir(kobj);
                             ==> sysfs_create_file(kobj, attr);

The name assigned to the kobject (with kobject_set_name()) is the name used for the sysfs directory. Every kobject of interest also exports one or more attributes, which appear in that kobject’s sysfs directory as files containing kernel-generated information. Code that works with sysfs should include <linux/sysfs.h>.

Kobject's Attributes

  • Default Attributes
  • Every created kobject is given a set of default attributes. These attributes are specified in the kobj_type structure. The kobj_type->default_attrs field lists the attributes to be created for every kobject of this type, and kobj_type->sysfs_ops provides the methods to implement those attributes. linux/sysfs.h:
    
    struct attribute {
     const char  *name;
     umode_t   mode;
    #ifdef CONFIG_DEBUG_LOCK_ALLOC
     bool   ignore_lockdep:1;
     struct lock_class_key *key;
     struct lock_class_key skey;
    #endif
    };
    
    struct sysfs_ops {
     ssize_t (*show)(struct kobject *, struct attribute *, char *);
     ssize_t (*store)(struct kobject *, struct attribute *, const char *, size_t);
    };
    
    • name
    • The name of the attribute as it appears within the kobject’s sysfs directory.
    • mode
    • The protection bits that are to be applied to this attribute: S_IRUGO for read-only attributes, with S_IWUSR added to give write access to root only.
    The last entry in the default_attrs list must be zero-filled. Whenever an attribute is read from user space, the show() method is called with a pointer to the kobject and the appropriate attribute structure. The show() method should encode the value of the given attribute into the buffer passed as its third argument, being sure not to overrun it (it is PAGE_SIZE bytes), and return the actual length of the returned data. Each attribute should contain a single, human-readable value; if you have a lot of information to return, consider splitting it into multiple attributes. The store() method is similar: it decodes the data written from user space, whose length is given by its last argument, and returns the number of bytes consumed.
  • Nondefault Attributes
  • Attributes can be added to and removed from kobjects at will. To add a new attribute to a kobject’s sysfs directory, fill in an attribute structure and pass it to:
    
    int sysfs_create_file(struct kobject *kobj, const struct attribute *attr);
    
    To remove an attribute, call:
    
    void sysfs_remove_file(struct kobject *kobj, const struct attribute *attr);
    
  • Binary Attributes
  • When data must be passed, untouched, between user space and the device, attributes may be chunks of binary data. For example, uploading firmware to devices requires this feature. Binary attributes are described with a bin_attribute structure:
    
    struct bin_attribute {
     struct attribute attr;
     size_t   size;
     void   *private;
     ssize_t (*read)(struct file *, struct kobject *, struct bin_attribute *,
       char *, loff_t, size_t);
     ssize_t (*write)(struct file *, struct kobject *, struct bin_attribute *,
        char *, loff_t, size_t);
     int (*mmap)(struct file *, struct kobject *, struct bin_attribute *attr,
          struct vm_area_struct *vma);
    };
    
    The read and write methods work similarly to the normal char driver equivalents; they can be called multiple times for a single load with a maximum of one page worth of data in each call. The code implementing a binary attribute must be able to determine the end of the data some other way.
    • To create a binary attribute
    • 
      int sysfs_create_bin_file(struct kobject *kobj, const struct bin_attribute *attr);
      
      
    • To remove a binary attribute
    • 
      void sysfs_remove_bin_file(struct kobject *kobj, const struct bin_attribute *attr);
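
The show() and store() contract described under Default Attributes can be sketched in userspace. The names below (struct counter, counter_show(), counter_store(), SKETCH_PAGE_SIZE) are hypothetical, the 4096-byte page is an assumption, and the sketch assumes the input buffer is NUL-terminated. Kernel code would typically use helpers such as sysfs_emit() and kstrtoint() rather than snprintf() and strtol().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

#define SKETCH_PAGE_SIZE 4096   /* assumption: a typical page size */

/* Hypothetical object behind the attribute. */
struct counter {
    int value;
};

/* show(): encode one human-readable value and return its length. */
ssize_t counter_show(const struct counter *c, char *buf)
{
    assert(c != NULL && buf != NULL);
    return snprintf(buf, SKETCH_PAGE_SIZE, "%d\n", c->value);
}

/* store(): decode the written text; return bytes consumed or a negative errno. */
ssize_t counter_store(struct counter *c, const char *buf, size_t count)
{
    char *end;
    long v;

    assert(c != NULL && buf != NULL);
    errno = 0;
    v = strtol(buf, &end, 10);
    if (end == buf || errno != 0 || (*end != '\n' && *end != '\0'))
        return -EINVAL;
    c->value = (int)v;
    return (ssize_t)count;
}
```

A read of the attribute then corresponds to one counter_show() call on a page-sized buffer, and a write such as echo 7 into the file corresponds to counter_store() being handed "7\n" with a count of 2.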
      
      
Showing additional relationships between different directories in sysfs requires extra pointers. These are implemented through symbolic links. For example:

$ ls -l /sys/bus/event_source/devices
total 0
lrwxrwxrwx 1 root root 0 May 14 13:32 breakpoint -> ../../../devices/breakpoint
lrwxrwxrwx 1 root root 0 May 14 13:32 cpu -> ../../../devices/cpu
lrwxrwxrwx 1 root root 0 May 14 13:32 cstate_core -> ../../../devices/cstate_core
lrwxrwxrwx 1 root root 0 May 14 13:32 cstate_pkg -> ../../../devices/cstate_pkg
lrwxrwxrwx 1 root root 0 May 14 13:32 kprobe -> ../../../devices/kprobe
lrwxrwxrwx 1 root root 0 May 14 13:32 msr -> ../../../devices/msr
lrwxrwxrwx 1 root root 0 May 14 13:32 software -> ../../../devices/software
lrwxrwxrwx 1 root root 0 May 14 13:32 tracepoint -> ../../../devices/tracepoint
lrwxrwxrwx 1 root root 0 May 14 13:32 uncore -> ../../../devices/uncore
lrwxrwxrwx 1 root root 0 May 14 13:32 uprobe -> ../../../devices/uprobe
  • Creating a symbolic link within sysfs
  • 
    int sysfs_create_link(struct kobject *kobj, struct kobject *target, const char *name);
    
  • Removing a symbolic link within sysfs
  • 
    void sysfs_remove_link(struct kobject *kobj, const char *name);
    

Hotplug Event Generation

A hotplug event is a notification to user space from the kernel that something has changed in the system’s configuration. Such events are generated whenever a kobject is created or destroyed. The actual event generation takes place when a kobject is passed to kobject_add() or kobject_del(). Before the event is handed to user space, code associated with the kobject (or, more specifically, the kset to which it belongs) has the opportunity to add information for user space or to disable event generation entirely.

Kernel Hotplug and User Mode Helper (/sbin/hotplug)

The kernel parameter /proc/sys/kernel/hotplug normally holds the pathname /sbin/hotplug. It names a program that any subsystem can invoke, from a thread in that subsystem, as part of its reaction to a configuration change. Whenever a new device is discovered, the kernel spawns a new process that executes the user-mode program named by that parameter, passing it any useful information about the discovered device as environment variables. If udev is installed, the script also creates the proper device file in the /dev directory.

The user-mode hotplug helper can be disabled by writing an empty string into /proc/sys/kernel/hotplug. udev, the successor of the old user-mode hotplug helper, instead listens on a netlink socket and is notified by the kernel about hotplug events.
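
As a rough illustration of the netlink path, the following userspace sketch opens a socket subscribed to kernel uevents and extracts one KEY=value pair from a received message. It assumes a Linux system with the kernel headers installed; uevent_open() and uevent_get() are hypothetical helper names, and this is not how udev itself is implemented.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Open a netlink socket subscribed to kernel uevents; return the fd or -1. */
int uevent_open(void)
{
    struct sockaddr_nl addr;
    int fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_groups = 1;   /* multicast group carrying the kernel's events */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * A kernel uevent message is a run of NUL-terminated strings: a header
 * such as "add@/devices/..." followed by KEY=value pairs.
 * Return the value stored under key, or NULL if the key is absent.
 */
const char *uevent_get(const char *msg, size_t len, const char *key)
{
    size_t klen, off = 0;

    assert(msg != NULL && key != NULL);
    klen = strlen(key);
    while (off < len) {
        const char *s = msg + off;
        const char *nul = memchr(s, '\0', len - off);

        if (!nul)   /* unterminated tail: ignore it */
            break;
        if (strncmp(s, key, klen) == 0 && s[klen] == '=')
            return s + klen + 1;
        off += (size_t)(nul - s) + 1;
    }
    return NULL;
}
```

A listener would call recv() on the descriptor returned by uevent_open() in a loop and pass each received buffer and its length to uevent_get(), for example to look up ACTION or SUBSYSTEM.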

Buses, Devices, and Drivers

Most details at this level are handled by bus-level code, and few driver authors need to add a new bus type.

Buses

A bus is a channel between the processor and one or more devices. All devices are connected via a bus, even if it is an internal, virtual, “platform” bus. Buses can plug into each other—a USB controller is usually a PCI device, for example. The device model represents the actual connections between buses and the devices they control. A bus is represented by the bus_type structure:

struct bus_type {
 const char  *name;
 const char  *dev_name;
 struct device  *dev_root;
 const struct attribute_group **bus_groups;
 const struct attribute_group **dev_groups;
 const struct attribute_group **drv_groups;

 int (*match)(struct device *dev, struct device_driver *drv);
 int (*uevent)(struct device *dev, struct kobj_uevent_env *env);
 int (*probe)(struct device *dev);
 void (*sync_state)(struct device *dev);
 int (*remove)(struct device *dev);
 void (*shutdown)(struct device *dev);
...
};
where
  • name
  • The name of the bus.
  • dev_name
  • Used for subsystems to enumerate devices like ("foo%u", dev->id).
  • dev_root
  • Default device to use as the parent.
  • bus_groups
  • Default attributes of the bus.
  • dev_groups
  • Default attributes of the devices on the bus.
  • drv_groups
  • Default attributes of the device drivers on the bus.
  • match()
  • Called, perhaps multiple times, whenever a new device or driver is added for this bus. It should return a positive value if the given device can be handled by the given driver and zero otherwise. It may also return an error code if determining that the driver supports the device is not possible. In the case of -EPROBE_DEFER, it will queue the device for deferred probing.
  • uevent()
  • Called when a device is added or removed, or on a few other occasions that generate uevents, to add environment variables.
  • probe()
  • Called when a new device or driver is added to this bus; it calls the specific driver's probe to initialize the matched device.
  • sync_state()
  • Called to sync device state to software state after all the state tracking consumers linked to this device (present at the time of late_initcall) have successfully bound to a driver. If the device has no consumers, this function will be called at late_initcall_sync level. If the device has consumers that are never bound to a driver, this function will never get called until they do.
  • remove()
  • Called when a device is removed from this bus.
  • shutdown()
  • Called at shut-down time to quiesce the device.
The name field is the name of the bus, which appears as a directory under /sys/bus.

Bus registration


int bus_register(struct bus_type *bus);
void bus_unregister(struct bus_type *bus);

A new bus must be registered via a call to bus_register(). For example, to register a dummy bus:

struct bus_type dummy_bus_type = {
  .name = "dummy",
  .match = dummy_match,
  .uevent = dummy_uevent,
};

ret = bus_register(&dummy_bus_type);
if (ret)
    return ret;

If it succeeds, the new bus subsystem has been added to the system; it is visible in sysfs under /sys/bus, and it is possible to start adding devices.

Bus methods

  • match()
  • When real hardware is involved, the match function usually makes some sort of comparison between the hardware ID provided by the device itself and the IDs supported by the driver. The dummy bus has a very simple match function, which compares the driver and device names:
    
    static int dummy_match(struct device *dev, struct device_driver *driver)
    {
        /* dev_name() replaces the old dev->bus_id field; nonzero means a match. */
        return !strncmp(dev_name(dev), driver->name, strlen(driver->name));
    }
    
  • uevent()
  • linux/kobject.h:
    
    struct kobj_uevent_env {
     char *argv[3];
     char *envp[UEVENT_NUM_ENVP];
     int envp_idx;
     char buf[UEVENT_BUFFER_SIZE];
     int buflen;
    };
    
    Here, we just add the current revision number to the uevent environment:
    
    static int dummy_uevent(struct device *dev, struct kobj_uevent_env *env)
    {
        /* Version is assumed to be a string defined elsewhere in the module. */
        /* add_uevent_var() appends to env->buf and returns -ENOMEM if it does not fit. */
        return add_uevent_var(env, "DUMMYBUS_VERSION=%s", Version);
    }
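
For buses with real hardware, the ID comparison mentioned under match() usually means scanning a table of IDs supplied by the driver. The sketch below shows that idea in userspace; struct hw_id, struct sketch_device, struct sketch_driver and sketch_match() are hypothetical and much simpler than the kernel's real ID structures. The usage note borrows the vendor and product IDs from the mouse log line earlier in these notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware ID: a vendor/product pair. */
struct hw_id {
    uint16_t vendor;
    uint16_t product;
};

struct sketch_device {
    struct hw_id id;               /* ID reported by the hardware */
};

struct sketch_driver {
    const struct hw_id *id_table;  /* supported IDs, terminated by {0, 0} */
};

/* Return 1 if the driver's table lists the device's ID, 0 otherwise. */
int sketch_match(const struct sketch_device *dev,
                 const struct sketch_driver *drv)
{
    const struct hw_id *id;

    assert(dev != NULL && drv != NULL);
    if (!drv->id_table)
        return 0;
    for (id = drv->id_table; id->vendor || id->product; id++) {
        if (id->vendor == dev->id.vendor && id->product == dev->id.product)
            return 1;
    }
    return 0;
}
```

A driver whose table contains { 0x046d, 0xc077 } would match the mouse from the log line, and no other device.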
    

Iterating over devices and drivers

Bus code often needs to perform some operation on every device or driver registered with the bus. Some helper functions are provided:
  • device iterator
  • To operate on every device known to the bus
    
    int bus_for_each_dev(struct bus_type *bus, struct device *start,
           void *data, int (*fn)(struct device *, void *))
    
    • bus
    • bus type.
    • start
    • device to start iterating from.
    • data
    • data for the callback.
    • fn
    • function to be called for each device.
    Iterate over bus's list of devices, and call fn() for each, passing it data. If start is not NULL, we use that device to begin iterating from. We check the return of fn() each time. If it returns anything other than 0, we break out and return that value.
  • driver iterator
  • To operate on every driver known to the bus
    
    int bus_for_each_drv(struct bus_type *bus, struct device_driver *start,
           void *data, int (*fn)(struct device_driver *, void *))
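
The callback protocol shared by both iterators (call fn() for each entry, stop at the first nonzero return and propagate it) can be sketched in userspace. The types and names here are hypothetical, the list is a simple singly linked one, and the start parameter is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device and bus types for the sketch. */
struct sketch_dev {
    int id;
    struct sketch_dev *next;
};

struct sketch_bus {
    struct sketch_dev *devices;
};

/* Call fn() on each device; stop and return its value as soon as it is nonzero. */
int sketch_for_each_dev(struct sketch_bus *bus, void *data,
                        int (*fn)(struct sketch_dev *, void *))
{
    struct sketch_dev *dev;
    int ret = 0;

    assert(bus != NULL && fn != NULL);
    for (dev = bus->devices; dev && !ret; dev = dev->next)
        ret = fn(dev, data);
    return ret;
}

/* Example callback state: which id to look for, and how many devices were seen. */
struct find_ctx {
    int wanted;
    int visited;
};

/* Example callback: returns nonzero on the wanted device, which ends the walk. */
int find_by_id(struct sketch_dev *dev, void *data)
{
    struct find_ctx *ctx = data;

    ctx->visited++;
    return dev->id == ctx->wanted;
}
```

Passing state through the opaque data pointer keeps the iterator generic: the same sketch_for_each_dev() can search, count, or modify devices depending on the callback.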
    

Bus attributes

linux/device/bus.h:

struct bus_attribute {
 struct attribute attr;
 ssize_t (*show)(struct bus_type *bus, char *buf);
 ssize_t (*store)(struct bus_type *bus, const char *buf, size_t count);
};

#define BUS_ATTR_RW(_name) \
 struct bus_attribute bus_attr_##_name = __ATTR_RW(_name)
#define BUS_ATTR_RO(_name) \
 struct bus_attribute bus_attr_##_name = __ATTR_RO(_name)
#define BUS_ATTR_WO(_name) \
 struct bus_attribute bus_attr_##_name = __ATTR_WO(_name)

int bus_create_file(struct bus_type *,
     struct bus_attribute *);
void bus_remove_file(struct bus_type *, struct bus_attribute *);
The bus_attribute type includes two methods (show() and store()) for displaying and setting the value of the attribute. Most device model layers above the kobject level work this way. Any attributes belonging to a bus should be created explicitly with bus_create_file().

Devices

At the lowest level, every device in a Linux system is represented by an instance of struct device. The device structure contains the information that the device model core needs to model the system. Most subsystems, however, track additional information about the devices they host. As a result, it is rare for devices to be represented by bare device structures; instead, that structure, like kobject structures, is usually embedded within a higher-level representation of the device.

struct device {
 struct kobject kobj;
 struct device  *parent;
 struct device_private *p;
 const char  *init_name; /* initial name of the device */
 const struct device_type *type;
 struct bus_type *bus;  /* type of bus device is on */
 struct device_driver *driver; /* which driver has allocated this device */
 void  *platform_data; /* Platform specific data, device core doesn't touch it */
 void  *driver_data; /* Driver data, set and get with dev_set_drvdata/dev_get_drvdata */
...
 void (*release)(struct device *dev);
...
};
  • kobj
  • The kobject that represents this device and links it into the hierarchy. Note that, as a general rule, device->kobj.parent is equal to &device->parent->kobj.
  • parent
  • The device's "parent" device, the device to which it is attached. In most cases, a parent device is some sort of bus or host controller. If parent is NULL, the device is a top-level device, which is not usually what you want.
  • type
  • The type of device. This identifies the device type and carries type-specific information.
  • bus
  • The type of bus the device is on.
  • driver
  • The driver that has allocated this device.
  • driver_data
  • Private pointer for driver specific info that may be used by the device driver.
  • release()
  • Callback to free the device after all references have gone away. This should be set by the allocator of the device (i.e. the bus driver that discovered the device). This is called from the embedded kobject’s release() method. All device structures registered with the core must have a release() method, or the kernel prints out scary complaints.

Device registration


int device_register(struct device *dev);
void device_unregister(struct device *dev);

Device attributes


/* interface for exporting device attributes */
struct device_attribute {
 struct attribute attr;
 ssize_t (*show)(struct device *dev, struct device_attribute *attr,
   char *buf);
 ssize_t (*store)(struct device *dev, struct device_attribute *attr,
    const char *buf, size_t count);
};

int device_create_file(struct device *device,
         const struct device_attribute *entry);
void device_remove_file(struct device *dev,
          const struct device_attribute *attr);
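
The show()/store() convention can be tried out in user space. The following is a minimal sketch, not kernel code: struct device and struct device_attribute are cut-down stand-ins, and the level field, the 0..100 range, and TOY_PAGE_SIZE are assumptions made for illustration. It shows only the contract: show() formats the value as text and returns the byte count, while store() parses text and returns the bytes consumed or a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>   /* ssize_t */

#define TOY_PAGE_SIZE 4096   /* assumed size of the buffer handed to show() */

/* Toy stand-ins for the kernel structures (assumptions, not the real layout). */
struct device {
    int level;               /* hypothetical per-device setting */
};

struct device_attribute {
    const char *name;
    ssize_t (*show)(struct device *dev, struct device_attribute *attr,
                    char *buf);
    ssize_t (*store)(struct device *dev, struct device_attribute *attr,
                     const char *buf, size_t count);
};

/* show(): format the value as text and return the number of bytes written. */
static ssize_t level_show(struct device *dev, struct device_attribute *attr,
                          char *buf)
{
    (void)attr;
    assert(dev != NULL && buf != NULL);
    return snprintf(buf, TOY_PAGE_SIZE, "%d\n", dev->level);
}

/* store(): parse the text, update the device, return bytes consumed or -errno. */
static ssize_t level_store(struct device *dev, struct device_attribute *attr,
                           const char *buf, size_t count)
{
    char *end;
    long value;

    (void)attr;
    errno = 0;
    value = strtol(buf, &end, 10);
    if (end == buf || errno != 0 || value < 0 || value > 100)
        return -EINVAL;
    dev->level = (int)value;
    return (ssize_t)count;
}

/* One attribute object ties a name to its two callbacks. */
struct device_attribute dev_attr_level = {
    .name  = "level",
    .show  = level_show,
    .store = level_store,
};
```

In a real driver the attribute object would be handed to device_create_file(), and the callbacks would be invoked by sysfs when user space reads or writes the file.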

Device structure embedding

Like kobject structures, struct device is usually embedded within a higher-level representation of the device. For example, struct pci_dev and struct usb_device each contain a struct device.
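
Embedding works because the outer structure can be recovered from a pointer to the embedded one. The sketch below is a user-space illustration under stated assumptions: struct device is reduced to two fields, struct toy_pci_dev is a hypothetical bus-specific device, and container_of() is a simplified version of the kernel macro without its type checking.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

/* Simplified stand-in for the kernel's struct device (assumption). */
struct device {
    const char *init_name;
    void *driver_data;
};

/* Hypothetical bus-specific device that embeds the generic one. */
struct toy_pci_dev {
    unsigned short vendor;
    unsigned short device;
    struct device dev;       /* generic device interface */
};

/* Same idea as the kernel's container_of(), minus its type checking. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Step back from the embedded struct device to the structure around it. */
struct toy_pci_dev *to_toy_pci_dev(struct device *d)
{
    assert(d != NULL);
    return container_of(d, struct toy_pci_dev, dev);
}

/* Generic code passes struct device around; bus code recovers its own type. */
unsigned short toy_vendor_of(struct device *d)
{
    return to_toy_pci_dev(d)->vendor;
}
```

The conversion is only valid when the struct device really is a member of a struct toy_pci_dev; nothing in the pointer itself records that, so the caller has to know which bus the device belongs to.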

Device Drivers

The device model tracks all of the drivers known to the system. The main reason for this tracking is to enable the driver core to match up drivers with new devices. Once drivers are known objects within the system, however, a number of other things become possible. Device drivers can export information and configuration variables that are independent of any specific device. Drivers are described by struct device_driver, available through linux/device.h:

struct device_driver {
 const char  *name;
 struct bus_type  *bus;
 struct module  *owner;
 const char  *mod_name; /* used for built-in modules */
 bool suppress_bind_attrs; /* disables bind/unbind via sysfs */
 enum probe_type probe_type;
 const struct of_device_id *of_match_table;
 const struct acpi_device_id *acpi_match_table;

 int (*probe) (struct device *dev);
 void (*sync_state)(struct device *dev);
 int (*remove) (struct device *dev);
 void (*shutdown) (struct device *dev);
 int (*suspend) (struct device *dev, pm_message_t state);
 int (*resume) (struct device *dev);
 const struct attribute_group **groups;
 const struct attribute_group **dev_groups;

 const struct dev_pm_ops *pm;
 void (*coredump) (struct device *dev);

 struct driver_private *p;
};
where:
  • name
  • Name of the device driver.
  • bus
  • The bus that the devices handled by this driver belong to.
  • owner
  • The module owner.
  • mod_name
  • Used for built-in modules.
  • suppress_bind_attrs
  • Disables bind/unbind via sysfs.
  • probe_type
  • Type of the probe (synchronous or asynchronous) to use.
  • of_match_table
  • The Open Firmware (device tree) match table.
  • acpi_match_table
  • The ACPI match table.
  • probe
  • Called to query the existence of a specific device, whether this driver can work with it, and bind the driver to a specific device.
  • sync_state
  • Called to sync device state to software state after all the state tracking consumers linked to this device (present at the time of late_initcall) have successfully bound to a driver. If the device has no consumers, this function will be called at late_initcall_sync level. If the device has consumers that are never bound to a driver, this function will never get called until they do.
  • remove
  • Called when the device is removed from the system to unbind a device from this driver.
  • shutdown
  • Called at shut-down time to quiesce the device.
  • suspend
  • Called to put the device to sleep mode. Usually to a low power state.
  • resume
  • Called to bring a device from sleep mode.
  • groups
  • Default attributes that get created by the driver core automatically.
  • dev_groups
  • Additional attributes attached to a device instance once it is bound to the driver.
  • pm
  • Power management operations of the device which matched this driver.
  • coredump
  • Called when sysfs entry is written to. The device driver is expected to call the dev_coredump API resulting in a uevent.
  • p
  • Driver core's private data, no one other than the driver core can touch this.
The registration functions are:

 int  driver_register(struct device_driver *drv);
 void driver_unregister(struct device_driver *drv);
The device driver's attribute structure and functions:

struct driver_attribute {
 struct attribute attr;
 ssize_t (*show)(struct device_driver *driver, char *buf);
 ssize_t (*store)(struct device_driver *driver, const char *buf,
    size_t count);
};

int driver_create_file(struct device_driver *driver,
     const struct driver_attribute *attr);
void driver_remove_file(struct device_driver *driver,
          const struct driver_attribute *attr);

Driver structure embedding

Classes

A class is a higher-level view of a device that abstracts out low-level implementation details.
Almost all classes show up in sysfs under /sys/class. The one exception is block devices, which can be found under /sys/block for historical reasons.
For example, all network interfaces can be found under /sys/class/net.
Drivers may see a SCSI disk or an ATA disk, but, at the class level, they are all simply disks.
Class membership is usually handled by high-level code without the need for explicit support from device drivers.
When a subsystem creates a class, it owns the class entirely, so there is no need to worry about which module owns the attributes found there.
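The driver core historically exported two distinct interfaces for managing classes: the class_simple_* routines and the full class interface. The class_simple_* routines have since been removed from the kernel; class_create() and device_create(), described below, now cover the simple case.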

the class_simple_* routines

the full class interface

linux/device/class.h:

struct class {
 const char  *name;
 struct module  *owner;

 const struct attribute_group **class_groups;
 const struct attribute_group **dev_groups;
 struct kobject   *dev_kobj;

 int (*dev_uevent)(struct device *dev, struct kobj_uevent_env *env);
 char *(*devnode)(struct device *dev, umode_t *mode);

 void (*class_release)(struct class *class);
 void (*dev_release)(struct device *dev);
...
};

  • name
  • Name of the class. This is how this class appears under /sys/class
  • owner
  • The module owner.
  • class_groups
  • Default attributes of this class.
  • dev_groups
  • Default attributes of the devices that belong to the class.
  • dev_kobj
  • The kobject that represents this class and links it into the hierarchy.
  • dev_uevent()
  • Called when a device is added to or removed from this class, or on a few other events that generate uevents, so that the class can add environment variables.
  • devnode()
  • Callback that provides the devtmpfs node name for a device and, optionally, its mode.
  • class_release()
  • Called to release this class.
  • dev_release
  • Called to release the device.
struct attribute_group is the data structure used to declare an attribute group.

struct attribute_group {
 const char  *name;
 umode_t   (*is_visible)(struct kobject *,
           struct attribute *, int);
 umode_t   (*is_bin_visible)(struct kobject *,
        struct bin_attribute *, int);
 struct attribute **attrs;
 struct bin_attribute **bin_attrs;
};
  • Creating/Destroying the class
  • 
    struct class *  __class_create(struct module *owner,
            const char *name,
            struct lock_class_key *key);
    void class_destroy(struct class *cls);
    
    /* This is a #define to keep the compiler from merging different
     * instances of the __key variable */
    #define class_create(owner, name)  \
    ({      \
     static struct lock_class_key __key; \
     __class_create(owner, name, &__key); \
    })
    
    class_create() is used to create a struct class pointer that can then be used in calls to device_create().
    device_create() creates a device and registers it with sysfs.
    
           struct device * device_create(struct class * class, struct device * parent, dev_t devt, void * drvdata, const char * fmt, ...);
    	
    • class
    • pointer to the struct class that this device should be registered to
    • parent
    • pointer to the parent struct device of this new device, if any
    • devt
    • the dev_t for the char device to be added
    • drvdata
    • the data to be added to the device for callbacks
    • fmt
    • string for the device's name
    This function can be used by char device classes.
    A struct device will be created in sysfs, registered to the specified class.
    A "dev" file will be created, showing the dev_t for the device, if the dev_t is not 0,0.
    If a pointer to a parent struct device is passed in, the newly created struct device will be a child of that device in sysfs.
    The pointer to the struct device will be returned from the call.
    Any further sysfs files that might be required can be created using this pointer.
  • Registering/Unregistering the class
  • 
    int __class_register(struct class *class, struct lock_class_key *key);
    #define class_register(class)   \
    ({      \
     static struct lock_class_key __key; \
     __class_register(class, &__key); \
    })
    
    void class_unregister(struct class *class);
    
  • Working with class_attribute
  • 
    struct class_attribute {
     struct attribute attr;
     ssize_t (*show)(struct class *class, struct class_attribute *attr,
       char *buf);
     ssize_t (*store)(struct class *class, struct class_attribute *attr,
       const char *buf, size_t count);
    };
    
    int class_create_file(struct class *class,
       const struct class_attribute *attr);
    void class_remove_file(struct class *class,
              const struct class_attribute *attr);
    

Class devices

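The separate struct class_device described in older references is not found in current kernels; devices that belong to a class are represented by struct device.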

Class interfaces

The class subsystem has a mechanism called an interface, a sort of trigger mechanism that can be used to get notification when devices enter or leave the class.

struct class_interface {
 struct list_head node;
 struct class  *class;

 int (*add_dev)  (struct device *, struct class_interface *);
 void (*remove_dev) (struct device *, struct class_interface *);
};
Interfaces can be registered and unregistered with:

int class_interface_register(struct class_interface *);
void class_interface_unregister(struct class_interface *);
Whenever a device is added to the class specified in the class_interface structure, the interface's add_dev() function is called. That function can perform any additional setup required for that device; this setup often takes the form of adding more attributes, but other applications are possible. When the device is removed from the class, the remove_dev() method is called to perform any required cleanup. Multiple interfaces can be registered for a class.
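
The notification pattern can be modelled in a few lines of user-space C. This is a sketch under assumptions, not the driver core's code: struct toy_class, the fixed-size interface array, the toy_* functions, and the extra_attrs counter are all hypothetical. The real implementation also deals with locking and with devices that are already in the class when an interface is registered, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_INTERFACES 4

/* Toy device: extra_attrs counts the setup done by interfaces (assumption). */
struct device {
    const char *name;
    int extra_attrs;
};

struct class_interface;

/* Hypothetical, heavily reduced class that only tracks its interfaces. */
struct toy_class {
    const char *name;
    struct class_interface *interfaces[TOY_MAX_INTERFACES];
    size_t n_interfaces;
};

struct class_interface {
    struct toy_class *class;
    int  (*add_dev)(struct device *, struct class_interface *);
    void (*remove_dev)(struct device *, struct class_interface *);
};

/* Attach an interface to the class named in intf->class. */
int toy_interface_register(struct class_interface *intf)
{
    struct toy_class *c = intf->class;

    assert(c != NULL);
    if (c->n_interfaces == TOY_MAX_INTERFACES)
        return -1;
    c->interfaces[c->n_interfaces++] = intf;
    return 0;
}

/* A device enters the class: every registered interface is notified. */
void toy_class_add_device(struct toy_class *c, struct device *dev)
{
    for (size_t i = 0; i < c->n_interfaces; i++)
        if (c->interfaces[i]->add_dev)
            c->interfaces[i]->add_dev(dev, c->interfaces[i]);
}

/* A device leaves the class: every registered interface cleans up. */
void toy_class_remove_device(struct toy_class *c, struct device *dev)
{
    for (size_t i = 0; i < c->n_interfaces; i++)
        if (c->interfaces[i]->remove_dev)
            c->interfaces[i]->remove_dev(dev, c->interfaces[i]);
}

/* Example callbacks: pretend to add and remove one attribute per device. */
int toy_count_add(struct device *dev, struct class_interface *intf)
{
    (void)intf;
    dev->extra_attrs++;
    return 0;
}

void toy_count_remove(struct device *dev, struct class_interface *intf)
{
    (void)intf;
    dev->extra_attrs--;
}
```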

Putting It All Together

Walking through the PCI subsystem is a good way to understand what the driver model does.

Driver Core


struct bus_type {
  ...
  struct device  *dev_root;
  ...
};

struct device_driver {
 const char  *name;
 struct bus_type  *bus;
 ...
};

struct device {
  struct kobject kobj;
  struct device  *parent;
  ...
  
};

PCI Core: Bus Initialization and Registration

Before devices and their drivers can be registered, a bus is required. The kernel's initialization mechanism collects function pointers into a dedicated section of the binary; those functions can then be called later, in the order in which they were inserted into the section. The kernel uses this mechanism to call device driver initialization routines at boot. (NOTE: This technique depends on the GNU compiler and linker tools, and the ELF executable format must be used. It is not a general-purpose, ANSI C compliant technique.) The PCI subsystem declares a single struct bus_type called pci_bus_type in drivers/pci/pci-driver.c:

struct bus_type pci_bus_type = {
 .name  = "pci",
 .match  = pci_bus_match,
 .uevent  = pci_uevent,
 .probe  = pci_device_probe,
 .remove  = pci_device_remove,
 .shutdown = pci_device_shutdown,
 .dev_groups = pci_dev_groups,
 .bus_groups = pci_bus_groups,
 .drv_groups = pci_drv_groups,
 .pm  = PCI_PM_OPS_PTR,
 .num_vf  = pci_bus_num_vf,
 .dma_configure = pci_dma_configure,
};


static int __init pci_driver_init(void)
{
 int ret;

 ret = bus_register(&pci_bus_type);
 if (ret)
  return ret;

#ifdef CONFIG_PCIEPORTBUS
 ret = bus_register(&pcie_port_bus_type);
 if (ret)
  return ret;
#endif
 dma_debug_add_bus(&pci_bus_type);
 return 0;
}
postcore_initcall(pci_driver_init);
bus_register() registers a driver-core subsystem. It first registers the bus with the kobject infrastructure, then registers the child subsystems it has: the devices and drivers that belong to the bus subsystem.

int bus_register(struct bus_type *bus)
{
  ...
  // new bus is added to the bus subsystem
  retval = kobject_set_name(&priv->subsys.kobj, "%s", bus->name);
  ...
  priv->subsys.kobj.kset = bus_kset;
  priv->subsys.kobj.ktype = &bus_ktype;
  priv->drivers_autoprobe = 1;

  retval = kset_register(&priv->subsys);
  ...
  // registers kernel sets for devices and device drivers
  priv->devices_kset = kset_create_and_add("devices", NULL,
       &priv->subsys.kobj);
  ...
  priv->drivers_kset = kset_create_and_add("drivers", NULL,
       &priv->subsys.kobj);
}
bus_register() creates a sysfs directory /sys/bus/pci that contains two directories: devices and drivers.

PCI Device Driver Initialization

All PCI device drivers must define a struct pci_driver variable that describes the operations the driver supports. It is declared in linux/pci.h:

struct pci_driver {
 struct list_head node;
 const char  *name;
 const struct pci_device_id *id_table; /* Must be non-NULL for probe to be called */
 int  (*probe)(struct pci_dev *dev, const struct pci_device_id *id); /* New device inserted */
 void (*remove)(struct pci_dev *dev); /* Device removed (NULL if not a hot-plug capable driver) */
 int  (*suspend)(struct pci_dev *dev, pm_message_t state); /* Device suspended */
 int  (*resume)(struct pci_dev *dev); /* Device woken up */
 void (*shutdown)(struct pci_dev *dev);
 int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
 const struct pci_error_handlers *err_handler;
 const struct attribute_group **groups;
 struct device_driver driver;
 struct pci_dynids dynids;
};
That structure contains a struct device_driver that is then initialized by the PCI core when the PCI device driver is registered. struct device_driver is the basic device driver structure. The device driver-model tracks all of the drivers known to the system. This tracking is to enable the driver core to match up drivers with new devices. linux/device/driver.h:

struct device_driver {
 const char  *name;
 struct bus_type  *bus;

 struct module  *owner;
 const char  *mod_name; /* used for built-in modules */

 bool suppress_bind_attrs; /* disables bind/unbind via sysfs */
 enum probe_type probe_type;

 const struct of_device_id *of_match_table;
 const struct acpi_device_id *acpi_match_table;

 int (*probe) (struct device *dev);
 void (*sync_state)(struct device *dev);
 int (*remove) (struct device *dev);
 void (*shutdown) (struct device *dev);
 int (*suspend) (struct device *dev, pm_message_t state);
 int (*resume) (struct device *dev);
 const struct attribute_group **groups;
 const struct attribute_group **dev_groups;

 const struct dev_pm_ops *pm;
 void (*coredump) (struct device *dev);

 struct driver_private *p;
};
When a PCI device driver is registered to the kernel, the struct device_driver of the struct pci_driver is then initialized by the PCI core:

#define pci_register_driver(driver)  \
 __pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME)
 
int __pci_register_driver(struct pci_driver *drv, struct module *owner,
     const char *mod_name)
{
 /* initialize common driver fields */
 drv->driver.name = drv->name;
 drv->driver.bus = &pci_bus_type;
 drv->driver.owner = owner;
 drv->driver.mod_name = mod_name;
 drv->driver.groups = drv->groups;

 spin_lock_init(&drv->dynids.lock);
 INIT_LIST_HEAD(&drv->dynids.list);

 /* register with core */
 return driver_register(&drv->driver);
}

int driver_register(struct device_driver *drv)
{
  ...
  bus_add_driver(drv);
  driver_add_groups(drv, drv->groups);
  kobject_uevent(&drv->p->kobj, KOBJ_ADD);
  return ret;
}
EXPORT_SYMBOL_GPL(driver_register);
After the call to pci_register_driver(), the driver is now ready to be bound to any PCI devices it supports.

PCI Device Registration

The PCI core, with help from the architecture-specific code that actually talks to the PCI bus, starts probing the PCI address space, looking for all PCI devices. When a PCI device is found, the PCI core creates a new variable in memory of type struct pci_dev:

/* The pci_dev structure describes PCI devices */
struct pci_dev {
 struct list_head bus_list; /* Node in per-bus list */
 struct pci_bus *bus;  /* Bus this device is on */
...
 unsigned int devfn;  /* Encoded device & function index */
 unsigned short vendor;
 unsigned short device;
...
 struct pci_driver *driver; /* Driver bound to this device */
...
 struct device dev;   /* Generic device interface */
};
After the PCI device structure is initialized, the device is registered with the driver core with a call to:

int device_register(struct device *dev)
{
 device_initialize(dev); // initializes the fields of the device structure (kobject, lists, locks)
 return device_add(dev); // adds the device to the device hierarchy and to sysfs
}

pci_register_host_bridge()
  ==> device_register(&bridge->dev); device_register(&bus->dev);
pci_alloc_child_bus()
  ==> device_register(&child->dev);

device_add() calls bus_add_device() to add links within sysfs: one in the bus directory that points to the device, and one in the device directory that points to the bus subsystem. bus_probe_device() then tries to autoprobe the device.

int device_add(struct device *dev)
{
 ...
 error = kobject_add(&dev->kobj, dev->kobj.parent, NULL);
 ...
 error = bus_add_device(dev);
 ...
 kobject_uevent(&dev->kobj, KOBJ_ADD);
 ...
 bus_probe_device(dev);
 ...
 
}
/*
 * bus_probe_device - probe drivers for a new device
 */
void bus_probe_device(struct device *dev)
{
...
 if (bus->p->drivers_autoprobe)
  device_initial_probe(dev);
 ...
}

void device_initial_probe(struct device *dev)
{
 __device_attach(dev, true);
}

/**
 * device_attach - try to attach device to a driver.
 *
 * Walk the list of drivers that the bus has and call
 * driver_probe_device() for each pair. If a compatible
 * pair is found, break out and return.
 *
 * Returns 1 if the device was bound to a driver;
 * 0 if no matching driver was found;
 * -ENODEV if the device is not registered.
 *
 * When called for a USB interface, dev->parent lock must be held.
 */
int device_attach(struct device *dev)
{
 return __device_attach(dev, false);
}
EXPORT_SYMBOL_GPL(device_attach);

static int __device_attach(struct device *dev, bool allow_async)
{
 ...
 if (dev->driver) {
   if (device_is_bound(dev)) {
     ret = 1;
     goto out_unlock;
   }
   ret = device_bind_driver(dev);
   if (ret == 0)
     ret = 1;
   else {
     dev->driver = NULL;
     ret = 0;
   }
 } else {
   struct device_attach_data data = {
     .dev = dev,
     .check_async = allow_async,
     .want_async = false,
   };
   ...
   ret = bus_for_each_drv(dev->bus, NULL, &data,
                          __device_attach_driver);
   ...
 }
 ...
}
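
The walk over the bus's drivers can be illustrated with a small user-space model. This is a sketch under assumptions rather than the driver core's logic: the toy_* types, the {0, 0} table terminator, and the probe callbacks are hypothetical, and the real code adds locking, asynchronous probing, and deferred probing on top of this. It shows the core idea only: match the device's IDs against each driver's ID table, call probe() on a match, and bind to the first driver that accepts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ID pair; an entry with both fields zero ends a table. */
struct toy_device_id {
    unsigned short vendor;
    unsigned short device;
};

struct toy_device;

struct toy_driver {
    const char *name;
    const struct toy_device_id *id_table;
    int (*probe)(struct toy_device *dev);   /* 0 means "I will drive it" */
};

struct toy_device {
    unsigned short vendor;
    unsigned short device;
    const struct toy_driver *driver;        /* NULL until bound */
};

/* Does any entry of the driver's ID table describe this device? */
static int toy_match(const struct toy_device *dev, const struct toy_driver *drv)
{
    const struct toy_device_id *id;

    for (id = drv->id_table; id && (id->vendor || id->device); id++)
        if (id->vendor == dev->vendor && id->device == dev->device)
            return 1;
    return 0;
}

/* Returns 1 if the device ends up bound, 0 if no driver took it. */
int toy_device_attach(struct toy_device *dev,
                      const struct toy_driver **drivers, size_t n)
{
    assert(dev != NULL);
    if (dev->driver)
        return 1;                           /* already bound */
    for (size_t i = 0; i < n; i++) {
        const struct toy_driver *drv = drivers[i];

        if (!toy_match(dev, drv))
            continue;
        if (drv->probe(dev) == 0) {         /* driver accepted the device */
            dev->driver = drv;
            return 1;
        }
        /* probe failed: keep walking, another driver may still fit */
    }
    return 0;
}

/* Example probe callbacks for experiments. */
int toy_accept_probe(struct toy_device *dev) { (void)dev; return 0; }
int toy_reject_probe(struct toy_device *dev) { (void)dev; return -1; }
```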

Hotplug

There are two different layers to handle hotplugging.
  • kernel space
  • The kernel views hotplugging as an interaction between the hardware, the kernel, and the kernel driver.
  • user space
  • The program /sbin/hotplug is called by the kernel when it wants to notify user space that some type of hotplug event has just happened within the kernel.

Dynamic Devices

“Hotplug” refers to handling devices that appear or disappear while the system is powered on. Each bus type handles the loss of a device in a different way.

The /sbin/hotplug Utility

When a hardware state change occurs, the kernel notifies the hotplugging handler (specified in /proc/sys/kernel/hotplug). This handler is set to /sbin/hotplug by default. The program /sbin/hotplug is typically a very small bash script that merely passes execution on to a list of other programs that are placed in the /etc/hotplug.d/ directory tree. For most Linux distributions, this script looks like the following:

DIR="/etc/hotplug.d"
for I in "${DIR}/$1/"*.hotplug "${DIR}/"default/*.hotplug ; do
  if [ -f $I ]; then
    test -x $I && $I $1 ;
  fi
done
exit 1
The script searches for all programs bearing a .hotplug suffix that might be interested in this event and invokes them, passing along a number of environment variables that have been set by the kernel. Whenever a kobject is created or destroyed, the hotplug program is called with a single command-line argument providing a name for the event. The environment variables carry information about what has just occurred; the hotplug programs use them to determine what happened in the kernel and whether any specific action should take place. The hotplug package is now obsolete: it has been entirely replaced by udev and the udev-related kernel infrastructure.
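
A handler invoked this way only needs its first argument and its environment. The sketch below is a user-space illustration, not a replacement for any real tool: the function names are hypothetical, and ACTION and DEVPATH are the names used in the kernel's uevent environment, which the text above does not spell out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>   /* getenv */

/*
 * Format "<subsystem>: <action> <devpath>" into out.
 * Returns the length written, or -1 if an input is missing or out is too small.
 */
int format_hotplug_event(const char *subsystem, const char *action,
                         const char *devpath, char *out, size_t len)
{
    int n;

    assert(out != NULL && len > 0);
    if (!subsystem || !action || !devpath)
        return -1;
    n = snprintf(out, len, "%s: %s %s", subsystem, action, devpath);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}

/*
 * What a handler would do: the event name arrives as the single
 * command-line argument, the details arrive in the environment.
 */
int describe_hotplug_event(const char *subsystem, char *out, size_t len)
{
    return format_hotplug_event(subsystem, getenv("ACTION"),
                                getenv("DEVPATH"), out, len);
}
```

A handler built around this would call describe_hotplug_event(argv[1], ...) from main() and then decide, based on the action, whether to load a module, create a device node, or do nothing.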

udev

For more information about the udev infrastructure, refer to the following man pages:
  • udev
  • General information about udev, keys, rules, and other important configuration issues.
  • udevinfo
  • udevinfo can be used to query device information from the udev database. On current systems this functionality is provided by udevadm info.
  • udevd
  • Information about the udev event managing daemon.
  • udevmonitor
  • udevmonitor prints the kernel and udev event sequence to the console. This tool is mainly used for debugging purposes. On current systems this functionality is provided by udevadm monitor.

Dealing with Firmware

To save the cost of an EEPROM for storing the firmware, some devices rely on the operating system to convey the firmware to the device.
Drivers containing wired-in firmware are unlikely to be accepted into the mainline kernel because of the licensing issues that the firmware raises.

The Kernel Firmware Interface

The proper solution is to obtain the firmware from user space when you need it.
The correct way to do that is to use the kernel firmware interface.

Synchronous firmware requests

Because this kind of request asks user space for help, it is guaranteed to sleep before returning.
  • int request_firmware(const struct firmware ** firmware_p, const char * name, struct device * device)
  • This sends the firmware request and waits for it.
    It should be called from user context where sleeping is allowed.
    • firmware_p
    • pointer to firmware image.
      
      struct firmware {
          size_t size;
          u8 *data;
      };
          	
    • name
    • name of firmware file.
      name will be used as $FIRMWARE in the uevent environment and should be distinctive enough not to be confused with any other firmware image for this or any other device.
    • device
    • device for which firmware is being loaded
    You would typically request the firmware and then load it into your device somehow.
    The typical firmware work flow is reflected below:
    
    if ( request_firmware(&fw_entry, $FIRMWARE, device) == 0)
           copy_fw_to_device(fw_entry->data, fw_entry->size);
    release_firmware(fw_entry);
        
    If the firmware is successfully loaded, the return value is 0 (otherwise the usual error code is returned).
    Device firmware usually contains identification strings, checksums, and so on; check them all before trusting the data.
  • void release_firmware(struct firmware *fw)
  • After you have sent the firmware to the device, you should release the in-kernel structure with this function.
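
What "check them all" means depends entirely on the device's image format. The following user-space sketch assumes a made-up layout purely for illustration: a 4-byte magic string "TOYF", one checksum byte holding the sum of the payload bytes modulo 256, then the payload. None of this comes from a real device; a driver would apply the checks its hardware vendor documents to the data and size fields of struct firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical image layout: "TOYF" | checksum byte | payload... */
#define TOY_FW_MAGIC    "TOYF"
#define TOY_FW_HDR_LEN  5          /* 4 magic bytes + 1 checksum byte */

/* Returns 0 if the image passes every check, -1 otherwise. */
int toy_fw_validate(const uint8_t *data, size_t size)
{
    uint8_t sum = 0;

    assert(data != NULL);          /* a NULL image is a caller bug */

    /* Too short to hold the header plus at least one payload byte. */
    if (size <= TOY_FW_HDR_LEN)
        return -1;

    /* Identification string must match. */
    if (memcmp(data, TOY_FW_MAGIC, 4) != 0)
        return -1;

    /* Checksum over the payload must match the stored byte. */
    for (size_t i = TOY_FW_HDR_LEN; i < size; i++)
        sum = (uint8_t)(sum + data[i]);
    return sum == data[4] ? 0 : -1;
}
```

Only after a check like this succeeds would the image be copied to the device; on failure the driver would release the firmware and return an error.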

Asynchronous firmware requests

Asynchronous firmware requests allow driver code to not have to wait until the firmware or an error is returned.
Function callbacks are provided so that when the firmware or an error is found the driver is informed through the callback.
  • int request_firmware_nowait(struct module * module, bool uevent, const char * name, struct device * device, gfp_t gfp, void * context, void (*cont) (const struct firmware *fw, void *context))
    • struct module * module
    • The module requesting the firmware.
      This will almost always be THIS_MODULE.
    • bool uevent
    • If this flag is non-zero, a uevent is sent so that user space copies the firmware image; otherwise the firmware copy must be done manually.
    • const char * name
    • The name of the firmware file.
    • struct device * device
    • The device for which firmware is being loaded.
    • gfp_t gfp
    • Allocation flags.
    • void * context
    • A private data pointer that is not used by the firmware subsystem; it is passed through to cont.
    • void (*)(const struct firmware *fw, void *context) cont
    • The callback function, called asynchronously when the firmware request is over. fw may be NULL if the firmware request fails.
    request_firmware_nowait() cannot be called in atomic contexts.

How It Works

The firmware subsystem works with sysfs and the hotplug mechanism.
When a call is made to request_firmware, a new directory is created under /sys/class/firmware using your device’s name.
That directory contains three attributes:
  • loading
  • This attribute should be set to 1 by the user-space process that is loading the firmware.
    When the load process is complete, it should be set to 0.
    Writing a value of -1 to loading aborts the firmware loading process.
  • data
  • data is a binary attribute that receives the firmware data itself.
    After setting loading , the user-space process should write the firmware to this attribute.
  • device
  • This attribute is a symbolic link to the associated entry under /sys/devices.
If a firmware request is not serviced within 10 seconds, the kernel gives up and returns a failure status to the driver.
That time-out period can be changed via the sysfs attribute /sys/class/firmware/timeout.
Firmware belongs to the device manufacturer. Copying and distributing it without permission is a violation of copyright law and an invitation for trouble.
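
The user-space side of the loading/data handshake described above can be sketched in C. This is an illustration under assumptions: the function names are hypothetical, and dir stands for the per-request directory that appears under /sys/class/firmware. On systems where udev services firmware requests, no hand-written loader like this is needed.

```c
#include <assert.h>
#include <stdio.h>

/* Write len bytes to <dir>/<name>; returns 0 on success, -1 on failure. */
static int write_attr(const char *dir, const char *name,
                      const void *buf, size_t len)
{
    char path[512];
    FILE *f;
    int n, ok;

    n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    f = fopen(path, "wb");
    if (!f)
        return -1;
    ok = fwrite(buf, 1, len, f) == len;
    if (fclose(f) != 0)            /* write errors can surface on close */
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Feed a firmware image through the three-step protocol:
 * loading=1, write the image to data, loading=0.
 */
int toy_load_firmware(const char *dir, const void *image, size_t len)
{
    assert(dir != NULL && image != NULL);

    if (write_attr(dir, "loading", "1", 1) != 0)
        return -1;
    if (write_attr(dir, "data", image, len) != 0) {
        write_attr(dir, "loading", "-1", 2);   /* abort the load */
        return -1;
    }
    return write_attr(dir, "loading", "0", 1);
}
```

Writing -1 on failure matters: it tells the kernel to give up immediately instead of leaving the driver waiting until the time-out expires.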
