Netlink
Netlink Library (libnl): Developer Guide
1. Introduction
The core library contains the fundamentals required to communicate over netlink sockets. It deals with:
- connecting and disconnecting sockets
- sending and receiving data
- constructing and parsing messages
- a customizable receiving state machine
- an abstract data type framework that eases the implementation of object-based netlink protocols, where objects are added, removed, or modified using a netlink-based protocol
The libnl suite is split into the following libraries:
- Netlink Library (libnl): socket handling, sending and receiving, message construction and parsing, …
- Routing Family Library (libnl-route): addresses, links, neighbours, routing, traffic control, neighbour tables, …
- Netfilter Library (libnl-nf): connection tracking, logging, queueing
- Generic Netlink Library (libnl-genl): controller API, family and command registration
2. Netlink Protocol Fundamentals
3. Netlink Sockets
4. Sending and Receiving of Messages / Data
5. Message Parsing and Construction
6. Attributes
Any form of payload should be encoded as netlink attributes whenever possible.
6.1. Attribute Format
Netlink attributes allow any number of data chunks of arbitrary length to be attached to a netlink message. Every attribute is encoded with a type field and a length field, both 16 bits wide.
The length of an attribute is used to calculate the offset to the next attribute.
This information is stored in the attribute header (struct nlattr) preceding the attribute payload.
     <----------- nla_total_size(payload) ----------->
     <---------- nla_size(payload) ----------->
    +-----------------+- - -+- - - - - - - - - +- - -+-----------------+- - -
    |  struct nlattr  | Pad |     Payload      | Pad |  struct nlattr  |
    +-----------------+- - -+- - - - - - - - - +- - -+-----------------+- - -
     <---- NLA_HDRLEN -----> <--- NLA_ALIGN(len) ---> <---- NLA_HDRLEN ---
Every attribute must start at an offset which is a multiple of NLA_ALIGNTO (4 bytes).
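The size relationships in the diagram can be checked with a few lines of C. This is a minimal sketch that uses only the NLA_HDRLEN, NLA_ALIGN and NLA_ALIGNTO macros from the kernel's <linux/netlink.h>; the helper names attr_size(), attr_total_size() and attr_padding() are hypothetical and merely mirror what nla_size() and nla_total_size() are described as computing.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/netlink.h>   /* NLA_HDRLEN, NLA_ALIGN, NLA_ALIGNTO */

/* Hypothetical helper: header plus payload, without trailing padding. */
static size_t attr_size(size_t payload)
{
    return NLA_HDRLEN + payload;
}

/* Hypothetical helper: bytes the attribute occupies in the stream,
 * including the padding that aligns the next attribute header. */
static size_t attr_total_size(size_t payload)
{
    return NLA_ALIGN(attr_size(payload));
}

/* Hypothetical helper: number of padding bytes after the payload. */
static size_t attr_padding(size_t payload)
{
    size_t pad = attr_total_size(payload) - attr_size(payload);

    assert(pad < NLA_ALIGNTO);   /* padding never reaches a full alignment unit */
    return pad;
}
```

With a 5-byte payload, attr_size() yields 9, attr_total_size() yields 12, and attr_padding() yields 3, so the next attribute header starts on a 4-byte boundary.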
6.2. Parsing Attributes
Splitting an Attributes Stream into Attributes
The pointer returned by nlmsg_attrdata() points to the first attribute header.
    struct nlattr *nlmsg_attrdata(const struct nlmsghdr *nlh, int hdrlen);
Any subsequent attribute is accessed with the function nla_next() based on the previous header.
    #include <netlink/attr.h>
    struct nlattr *nla_next(const struct nlattr *attr, int *remaining);
The semantics are equivalent to nlmsg_next(): nla_next() also subtracts the size of the previous attribute from the remaining number of bytes in the attribute stream.
The function nla_ok() determines whether another attribute fits into the remaining number of bytes, that is, whether another attribute follows.
A typical use of nla_ok() and nla_next() looks like this:
    #include <netlink/msg.h>
    #include <netlink/attr.h>

    /* nlh is the struct nlmsghdr * of the message being parsed */
    struct nlattr *hdr = nlmsg_attrdata(nlh, 0);
    int remaining = nlmsg_attrlen(nlh, 0);

    while (nla_ok(hdr, remaining)) {
        /* parse attribute here */
        hdr = nla_next(hdr, &remaining);
    }
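To make the bookkeeping behind nla_ok() and nla_next() concrete, the following sketch walks an attribute stream by hand, using only struct nlattr and the NLA_* macros from <linux/netlink.h>. It is not libnl's implementation; raw_attr_ok() and count_attrs() are hypothetical names, and the sketch assumes the stream sits in an ordinary byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>   /* struct nlattr, NLA_HDRLEN, NLA_ALIGN */

/* Hypothetical helper: true if a complete attribute fits at p. */
static int raw_attr_ok(const unsigned char *p, size_t remaining)
{
    struct nlattr hdr;

    if (remaining < (size_t)NLA_HDRLEN)
        return 0;
    memcpy(&hdr, p, sizeof(hdr));            /* avoids unaligned reads */
    return hdr.nla_len >= NLA_HDRLEN && (size_t)hdr.nla_len <= remaining;
}

/* Hypothetical helper: count the attributes of type `wanted`. */
static int count_attrs(const unsigned char *stream, size_t len, uint16_t wanted)
{
    int found = 0;

    assert(stream != NULL);
    while (raw_attr_ok(stream, len)) {
        struct nlattr hdr;
        size_t step;

        memcpy(&hdr, stream, sizeof(hdr));
        if (hdr.nla_type == wanted)
            found++;

        /* Advance by the padded length, but never past the buffer. */
        step = (size_t)NLA_ALIGN(hdr.nla_len);
        if (step > len)
            step = len;
        stream += step;
        len -= step;
    }
    return found;
}
```

The loop stops as soon as the remaining bytes cannot hold a full attribute, which is the same condition the nla_ok() check expresses in the libnl example.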
Accessing Attribute Header and Payload
Attribute Validation
Parsing Attributes the Easy Way
Locating a Single Attribute
Iterating over a Stream of Attributes
6.3. Attribute Construction
The interface for adding attributes to a netlink message builds on the regular message construction interface. It assumes that the message header and any protocol header have already been added to the message.
The function nla_reserve() adds an attribute header at the end of the message and reserves room for len bytes of payload.
    struct nlattr *nla_reserve(struct nl_msg *msg, int attrtype, int len);
The function nla_put() is based on nla_reserve() but takes an additional pointer, data, pointing to a buffer containing the attribute payload.
    int nla_put(struct nl_msg *msg, int attrtype, int attrlen, const void *data);
Example:
    struct my_attr_struct {
        uint32_t a;
        uint32_t b;
    };

    int my_put(struct nl_msg *msg)
    {
        struct my_attr_struct obj = {
            .a = 10,
            .b = 20,
        };

        return nla_put(msg, ATTR_MY_STRUCT, sizeof(obj), &obj);
    }
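At the byte level, adding an attribute amounts to writing a header, copying the payload, and zeroing the padding. The sketch below illustrates that layout on a plain byte buffer; it is not how libnl implements nla_put(), and raw_put_attr() together with its parameters is a hypothetical name chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>   /* struct nlattr, NLA_HDRLEN, NLA_ALIGN */

/* Hypothetical helper: append one attribute at offset *used in buf.
 * Returns 0 on success, -1 if the attribute does not fit. */
static int raw_put_attr(unsigned char *buf, size_t cap, size_t *used,
                        uint16_t type, const void *data, uint16_t len)
{
    struct nlattr hdr;
    size_t unpadded, total;

    assert(buf != NULL && used != NULL);
    if (len > UINT16_MAX - NLA_HDRLEN)       /* nla_len is only 16 bits */
        return -1;

    unpadded = (size_t)NLA_HDRLEN + len;
    total = NLA_ALIGN(unpadded);
    if (*used > cap || total > cap - *used)
        return -1;

    hdr.nla_len = (uint16_t)unpadded;        /* length excludes the padding */
    hdr.nla_type = type;
    memcpy(buf + *used, &hdr, sizeof(hdr));
    if (len > 0)
        memcpy(buf + *used + NLA_HDRLEN, data, len);
    memset(buf + *used + unpadded, 0, total - unpadded);   /* zero the pad */

    *used += total;
    return 0;
}
```

Note that the stored length covers header and payload only; the padding is implied by the alignment rule, which is why a parser advances by NLA_ALIGN(nla_len).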
libnl: APIs
Introduction
libnl is a set of libraries to deal with the netlink protocol and some of the high-level protocols implemented on top of it. The goal is to provide APIs on different levels of abstraction.
The core library libnl provides a fundamental set of functions to deal with sockets, construct messages, and send/receive those messages.
Modules
Command Line Interface API
Core Library (libnl)
Socket handling, connection management, sending and receiving of data, message construction and parsing, object caching system, ...
Caching System
Callbacks/Customization
Data Types
Core library data types.
Message Construction and Parsing
Netlink Attributes Construction/Parsing Interface.
- Data Structures
- Attribute Size Calculation
- Parsing Attributes
- Helper Functions
- Unspecific Attribute
- Integer Attributes
- String Attribute
- Flag Attribute
- Microseconds Attribute
- Nested Attribute
- Basic Attribute Data Types
- Attribute Construction (Exception Based)
Send and Receive Data
Connection management, sending and receiving of data.
Utilities
Collection of helper functions.
How to use netlink socket to communicate with a kernel module?
    #include <linux/module.h>
    #include <net/sock.h>
    #include <net/netlink.h>
    #include <linux/netlink.h>
    #include <linux/skbuff.h>

    #define NETLINK_USER 31

    static struct sock *nl_sk;

    static void hello_nl_recv_msg(struct sk_buff *skb)
    {
        struct nlmsghdr *nlh;
        struct sk_buff *skb_out;
        const char *msg = "Hello from kernel";
        int msg_size = strlen(msg);
        int pid;
        int res;

        printk(KERN_INFO "Entering: %s\n", __func__);

        nlh = (struct nlmsghdr *)skb->data;
        printk(KERN_INFO "Netlink received msg payload:%s\n", (char *)nlmsg_data(nlh));
        pid = nlh->nlmsg_pid; /* port ID of the sending process */

        skb_out = nlmsg_new(msg_size, GFP_KERNEL);
        if (!skb_out) {
            printk(KERN_ERR "Failed to allocate new skb\n");
            return;
        }

        nlh = nlmsg_put(skb_out, 0, 0, NLMSG_DONE, msg_size, 0);
        if (!nlh) {
            nlmsg_free(skb_out);
            return;
        }
        NETLINK_CB(skb_out).dst_group = 0; /* not in mcast group */
        strncpy(nlmsg_data(nlh), msg, msg_size);

        /* nlmsg_unicast() takes ownership of skb_out */
        res = nlmsg_unicast(nl_sk, skb_out, pid);
        if (res < 0)
            printk(KERN_INFO "Error while sending back to user\n");
    }

    static int __init hello_init(void)
    {
        /* Older kernels used:
         * netlink_kernel_create(&init_net, NETLINK_USER, 0, hello_nl_recv_msg, NULL, THIS_MODULE);
         */
        struct netlink_kernel_cfg cfg = {
            .input = hello_nl_recv_msg,
        };

        printk(KERN_INFO "Entering: %s\n", __func__);

        nl_sk = netlink_kernel_create(&init_net, NETLINK_USER, &cfg);
        if (!nl_sk) {
            printk(KERN_ALERT "Error creating socket.\n");
            return -ENOMEM;
        }
        return 0;
    }

    static void __exit hello_exit(void)
    {
        printk(KERN_INFO "exiting hello module\n");
        netlink_kernel_release(nl_sk);
    }

    module_init(hello_init);
    module_exit(hello_exit);
    MODULE_LICENSE("GPL");
User Program:
    #include <linux/netlink.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define NETLINK_USER 31
    #define MAX_PAYLOAD 1024 /* maximum payload size */

    /* File-scope objects are zero-initialized, so msg starts out clean. */
    struct sockaddr_nl src_addr, dest_addr;
    struct nlmsghdr *nlh = NULL;
    struct iovec iov;
    int sock_fd;
    struct msghdr msg;

    int main(void)
    {
        sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_USER);
        if (sock_fd < 0)
            return -1;

        memset(&src_addr, 0, sizeof(src_addr));
        src_addr.nl_family = AF_NETLINK;
        src_addr.nl_pid = getpid(); /* self pid */
        if (bind(sock_fd, (struct sockaddr *)&src_addr, sizeof(src_addr)) < 0) {
            perror("bind");
            close(sock_fd);
            return -1;
        }

        memset(&dest_addr, 0, sizeof(dest_addr));
        dest_addr.nl_family = AF_NETLINK;
        dest_addr.nl_pid = 0;    /* For Linux Kernel */
        dest_addr.nl_groups = 0; /* unicast */

        nlh = (struct nlmsghdr *)malloc(NLMSG_SPACE(MAX_PAYLOAD));
        if (nlh == NULL) {
            close(sock_fd);
            return -1;
        }
        memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));
        nlh->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
        nlh->nlmsg_pid = getpid();
        nlh->nlmsg_flags = 0;
        strcpy(NLMSG_DATA(nlh), "Hello");

        iov.iov_base = (void *)nlh;
        iov.iov_len = nlh->nlmsg_len;
        msg.msg_name = (void *)&dest_addr;
        msg.msg_namelen = sizeof(dest_addr);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        printf("Sending message to kernel\n");
        if (sendmsg(sock_fd, &msg, 0) < 0)
            perror("sendmsg");

        printf("Waiting for message from kernel\n");
        /* Read message from kernel into the same buffer */
        if (recvmsg(sock_fd, &msg, 0) < 0)
            perror("recvmsg");
        else
            printf("Received message payload: %s\n", (char *)NLMSG_DATA(nlh));

        free(nlh);
        close(sock_fd);
        return 0;
    }
Netlink: A Communication Mechanism in Linux
A Netlink socket is a communication mechanism used between user-space processes, and between processes and the kernel.
It is a full-duplex mechanism: the kernel itself can initiate the communication.
Netlink communication uses the familiar socket API.
The Netlink-specific definitions live in <linux/netlink.h>, which both kernel and user-space code include; user-space programs also need <sys/socket.h> for the socket calls themselves. With those in place, processes and the kernel can communicate through the socket API.
Netlink socket is an asynchronous communication method, that is, it queues the messages to be sent in the receiver's Netlink queue. One of the features of a Netlink socket is that it also supports multicast communication, i.e., one process can send a message to a Netlink group address, and many processes can listen on this group address.
To use Netlink sockets in code, the standard socket API is used:
    int socket(AF_NETLINK, type, protocol);
SOCK_RAW or SOCK_DGRAM can be used for type.
protocol selects which Netlink family is to be used. The families are defined in <linux/netlink.h>:
    #define NETLINK_ROUTE           0   /* Routing/device hook */
    #define NETLINK_UNUSED          1   /* Unused number */
    #define NETLINK_USERSOCK        2   /* Reserved for user mode socket protocols */
    #define NETLINK_FIREWALL        3   /* Unused number, formerly ip_queue */
    #define NETLINK_SOCK_DIAG       4   /* socket monitoring */
    #define NETLINK_NFLOG           5   /* netfilter/iptables ULOG */
    #define NETLINK_XFRM            6   /* ipsec */
    #define NETLINK_SELINUX         7   /* SELinux event notifications */
    #define NETLINK_ISCSI           8   /* Open-iSCSI */
    #define NETLINK_AUDIT           9   /* auditing */
    #define NETLINK_FIB_LOOKUP      10
    #define NETLINK_CONNECTOR       11
    #define NETLINK_NETFILTER       12  /* netfilter subsystem */
    #define NETLINK_IP6_FW          13
    #define NETLINK_DNRTMSG         14  /* DECnet routing messages */
    #define NETLINK_KOBJECT_UEVENT  15  /* Kernel messages to userspace */
    #define NETLINK_GENERIC         16
    /* leave room for NETLINK_DM (DM Events) */
    #define NETLINK_SCSITRANSPORT   18  /* SCSI Transports */
    #define NETLINK_ECRYPTFS        19
    #define NETLINK_RDMA            20
    #define NETLINK_CRYPTO          21  /* Crypto layer */
    #define NETLINK_SMC             22  /* SMC monitoring */

    #define NETLINK_INET_DIAG       NETLINK_SOCK_DIAG

    #define MAX_LINKS 32
Each Netlink protocol has its own multicast groups. The nl_groups field of the socket address is a 32-bit bitmask in which each bit represents one group.
Netlink socket programming in the user space
The following data structures need to be understood.
- struct sockaddr_nl
    struct sockaddr_nl {
        __kernel_sa_family_t nl_family; /* AF_NETLINK */
        unsigned short       nl_pad;    /* zero */
        __u32                nl_pid;    /* port ID */
        __u32                nl_groups; /* multicast groups mask */
    };
- nl_family This is the protocol family to be used, which is AF_NETLINK.
- nl_pad This is used for padding.
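- nl_pid This identifies a single process or the kernel; nl_pid = 0 is the special address of the kernel. A process uses it as the destination when it sends a unicast message to another process or to the kernel. To give the local socket a unique value:
- nl_pid can be the PID of the process
    addr.nl_pid = getpid();
- if several threads of one process each open their own Netlink socket, the thread ID can be combined with the PID (this assumes pthread_t is an integer type, as on Linux)
    addr.nl_pid = pthread_self() << 16 | getpid();
- nl_groups This is the multicast groups mask; each bit that is set subscribes the socket to one group. The following mask sets bits 5 and 12:
    addr.nl_groups = 1<<5 | 1<<12;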
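Filling in a socket address and its group mask can be wrapped in two small helpers. This is a sketch under stated assumptions: nl_group_bit() and make_nl_addr() are hypothetical names, and the 1-based group numbering follows the kernel convention in which group n maps to bit n - 1 (for example, RTNLGRP_LINK is 1 and the matching legacy mask RTMGRP_LINK is 0x1).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>      /* AF_NETLINK */
#include <linux/netlink.h>   /* struct sockaddr_nl */

/* Hypothetical helper: mask bit for a 1-based multicast group (1..32). */
static uint32_t nl_group_bit(unsigned int group)
{
    assert(group >= 1 && group <= 32);
    return UINT32_C(1) << (group - 1);
}

/* Hypothetical helper: build a zeroed, fully initialized address. */
static struct sockaddr_nl make_nl_addr(uint32_t port_id, uint32_t groups)
{
    struct sockaddr_nl addr;

    memset(&addr, 0, sizeof(addr));   /* also clears nl_pad */
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = port_id;            /* 0 addresses the kernel */
    addr.nl_groups = groups;
    return addr;
}
```

Under that numbering, the mask 1<<5 | 1<<12 shown above is the same value as nl_group_bit(6) | nl_group_bit(13).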
- struct nlmsghdr
    struct nlmsghdr {
        __u32 nlmsg_len;   /* Length of message including header */
        __u16 nlmsg_type;  /* Message content */
        __u16 nlmsg_flags; /* Additional flags */
        __u32 nlmsg_seq;   /* Sequence number */
        __u32 nlmsg_pid;   /* Sending process port ID */
    };
- nlmsg_len This is the length of the message to be transferred, including the header length.
- nlmsg_type This is the type of message being transferred. Values below NLMSG_MIN_TYPE (0x10) are reserved for control messages such as NLMSG_NOOP, NLMSG_ERROR, and NLMSG_DONE; other values are defined by the application or protocol and are opaque to the Netlink core.
- nlmsg_flags These are flag bits that qualify the message, such as NLM_F_REQUEST, NLM_F_MULTI, and NLM_F_ACK.
- nlmsg_seq This is the sequence number of the message. Applications use it to match replies to requests; the Netlink core treats it as opaque.
- nlmsg_pid This is the port ID of the process that sends the message. It is used by applications; the Netlink core treats it as opaque.
A Netlink message is a structure that holds both the Netlink header and the Netlink payload.
The header file netlink.h defines a few utility macros. Among them:
- NLMSG_LENGTH(payload_size) returns the proper length to put in nlmsg_len
- NLMSG_SPACE(payload_size) returns the number of bytes the netlink message occupies, including alignment padding
- NLMSG_DATA(nlh) given a pointer to the netlink message, returns a pointer to the payload
- NLMSG_PAYLOAD(nlh, len) returns the length of the payload associated with the netlink message
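The macros listed above are enough to build a message by hand. The following is a minimal sketch, assuming a NUL-terminated text payload; make_text_msg() is a hypothetical helper, and NLMSG_MIN_TYPE is used only as a placeholder for an application-defined message type.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <linux/netlink.h>   /* struct nlmsghdr, NLMSG_* macros */

/* Hypothetical helper: allocate a message that carries a copy of `text`,
 * including its terminating NUL. The caller frees the result. */
static struct nlmsghdr *make_text_msg(const char *text, unsigned int port_id)
{
    unsigned int payload;
    struct nlmsghdr *nlh;

    assert(text != NULL);
    payload = (unsigned int)strlen(text) + 1;

    /* NLMSG_SPACE() includes the header and the alignment padding. */
    nlh = calloc(1, NLMSG_SPACE(payload));
    if (nlh == NULL)
        return NULL;

    nlh->nlmsg_len = NLMSG_LENGTH(payload);   /* header + payload, unpadded */
    nlh->nlmsg_type = NLMSG_MIN_TYPE;         /* placeholder application type */
    nlh->nlmsg_pid = port_id;
    memcpy(NLMSG_DATA(nlh), text, payload);
    return nlh;
}
```

For the text "Hello" the payload is 6 bytes, so nlmsg_len is 22 while the allocation is NLMSG_SPACE(6) = 24 bytes; NLMSG_PAYLOAD(nlh, 0) returns 6 again.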
For example, when MAX_NL_MSG_LEN is a multiple of 4:
    NLMSG_SPACE(MAX_NL_MSG_LEN) == sizeof(struct nlmsghdr) + MAX_NL_MSG_LEN
    NLMSG_DATA(nlh) == (char *)nlh + sizeof(struct nlmsghdr)
The netlink "protocol" is message oriented, and programs must send and receive data encapsulated in netlink messages. The system calls sendmsg() and recvmsg() are used to send and receive these messages in user space.
    #include <sys/types.h>
    #include <sys/socket.h>

    struct msghdr {
        void         *msg_name;       /* optional address */
        socklen_t     msg_namelen;    /* size of address */
        struct iovec *msg_iov;        /* scatter/gather array */
        size_t        msg_iovlen;     /* # elements in msg_iov */
        void         *msg_control;    /* ancillary data, see below */
        size_t        msg_controllen; /* ancillary data buffer len */
        int           msg_flags;      /* flags (unused) */
    };

    ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);
    ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);
- The msg_name field is used on an unconnected socket to specify the target address for a datagram. It points to a buffer containing the address; the msg_namelen field should be set to the size of the address. For a connected socket, these fields should be specified as NULL and 0, respectively.
- The msg_iov and msg_iovlen fields specify scatter-gather locations, as for readv() and writev().
- You may send control information using the msg_control and msg_controllen members. The maximum control buffer length the kernel can process is limited per socket by the value in /proc/sys/net/core/optmem_max.
- The msg_flags field is ignored by sendmsg(); recvmsg() sets it on return.
    /* Structure for scatter/gather I/O. */
    struct iovec {
        void  *iov_base; /* Pointer to data. */
        size_t iov_len;  /* Length of data. */
    };
The message is passed to the kernel's Netlink core through the iovec structure. Inside the kernel, <linux/uio.h> defines the kernel-only counterpart, struct kvec:
    #include <uapi/linux/uio.h>

    struct kvec {
        void  *iov_base; /* and that should *never* hold a userland pointer */
        size_t iov_len;
    };
<uapi/linux/uio.h>:
    struct iovec {
        void __user    *iov_base; /* BSD uses caddr_t (1003.1g requires void *) */
        __kernel_size_t iov_len;  /* Must be size_t (1003.1g) */
    };
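The pieces described above (destination address, iovec, msghdr) fit together as in the following sketch. prepare_send() is a hypothetical helper, not part of any library; it only fills in the structures and leaves the actual sendmsg() call to the caller.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>      /* struct msghdr, AF_NETLINK */
#include <sys/uio.h>         /* struct iovec */
#include <linux/netlink.h>   /* struct sockaddr_nl, struct nlmsghdr */

/* Hypothetical helper: describe one Netlink message for sendmsg(). */
static void prepare_send(struct msghdr *msg, struct iovec *iov,
                         struct sockaddr_nl *dest, struct nlmsghdr *nlh)
{
    assert(nlh->nlmsg_len >= (unsigned int)NLMSG_HDRLEN);

    memset(msg, 0, sizeof(*msg));      /* no control data, flags cleared */

    iov->iov_base = nlh;               /* one buffer: header plus payload */
    iov->iov_len = nlh->nlmsg_len;

    msg->msg_name = dest;              /* unconnected socket: give the target */
    msg->msg_namelen = sizeof(*dest);
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
}
```

After this call, sendmsg(sockfd, &msg, 0) would transmit the message. Zeroing the whole msghdr first matters for stack-allocated structures, because stale msg_control or msg_controllen values would otherwise be passed to the kernel.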
Process-to-process unicast communication
Source:
- server
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <linux/netlink.h>
    #include <errno.h>
    #include <string.h>
    #include <unistd.h>

    #define MAX_NL_MSG_LEN 1024 /* maximum payload size */
    #define SERVER_PID 888

    int main(void)
    {
        int sockfd;
        ssize_t received;
        struct sockaddr_nl server_addr, cli_addr;
        struct nlmsghdr *nlh;
        struct iovec iov;
        struct msghdr msg;

        /* open socket */
        if ((sockfd = socket(PF_NETLINK, SOCK_RAW, NETLINK_GENERIC)) < 0) {
            printf("Open socket failed!\n");
            return 1;
        }

        /* bind local socket */
        memset(&server_addr, 0, sizeof(server_addr));
        server_addr.nl_family = AF_NETLINK;
        server_addr.nl_pid = SERVER_PID; /* user space server's port */
        server_addr.nl_groups = 0;       /* not multicast */
        if (bind(sockfd, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) {
            printf("Bind socket failed\n");
            close(sockfd);
            exit(1);
        }
        printf("Bind to pid=%u\n", server_addr.nl_pid);

        nlh = (struct nlmsghdr *)malloc(NLMSG_SPACE(MAX_NL_MSG_LEN));
        if (nlh == NULL) {
            perror("malloc nlmsghdr failed");
            close(sockfd);
            exit(1);
        }
        memset(nlh, 0, NLMSG_SPACE(MAX_NL_MSG_LEN));

        /* the receive buffer must hold the header as well as the payload */
        iov.iov_base = (void *)nlh;
        iov.iov_len = NLMSG_SPACE(MAX_NL_MSG_LEN);
        memset(&msg, 0, sizeof(msg));
        memset(&cli_addr, 0, sizeof(cli_addr));
        msg.msg_name = (void *)&cli_addr; /* filled in with the sender's address */
        msg.msg_namelen = sizeof(cli_addr);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        /* block until a message is received */
        received = recvmsg(sockfd, &msg, 0);
        if (received < 0) {
            printf("recvmsg() error! %s\n", strerror(errno));
            exit(1);
        }

        /* report the payload size: nlmsg_len minus the header */
        if (NLMSG_OK(nlh, received))
            printf("%d bytes received: %s\n", (int)NLMSG_PAYLOAD(nlh, 0),
                   (char *)NLMSG_DATA(nlh));
        else
            printf("0 byte received\n");

        free(nlh);
        close(sockfd);
        exit(0);
    }
- client
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <linux/netlink.h>
    #include <errno.h>
    #include <string.h>
    #include <unistd.h>

    #define MAX_NL_MSG_LEN 1024 /* maximum payload size */
    #define SERVER_PID 888
    #define USER_HELLO "Hello from user space client"

    int main(void)
    {
        int sockfd;
        struct sockaddr_nl server_addr, cli_addr;
        struct nlmsghdr *nlh;
        struct iovec iov;
        struct msghdr msg;

        /* open socket */
        if ((sockfd = socket(PF_NETLINK, SOCK_RAW, NETLINK_GENERIC)) < 0) {
            printf("Open socket failed!\n");
            return 1;
        }

        /* local address */
        memset(&cli_addr, 0, sizeof(cli_addr));
        cli_addr.nl_family = AF_NETLINK;
        cli_addr.nl_pid = getpid(); /* user space client's port */
        cli_addr.nl_groups = 0;     /* not multicast */

        /* bind socket */
        if (bind(sockfd, (struct sockaddr *)&cli_addr, sizeof(cli_addr)) < 0) {
            printf("Bind socket failed\n");
            close(sockfd);
            exit(1);
        }
        printf("Bind to pid=%u\n", cli_addr.nl_pid);

        /* server address */
        memset(&server_addr, 0, sizeof(server_addr));
        server_addr.nl_family = AF_NETLINK;
        server_addr.nl_pid = SERVER_PID; /* user space server's port */
        server_addr.nl_groups = 0;       /* not multicast */

        nlh = (struct nlmsghdr *)malloc(NLMSG_SPACE(MAX_NL_MSG_LEN));
        if (nlh == NULL) {
            perror("malloc nlmsghdr failed");
            close(sockfd);
            exit(1);
        }
        memset(nlh, 0, NLMSG_SPACE(MAX_NL_MSG_LEN));
        nlh->nlmsg_len = NLMSG_SPACE(MAX_NL_MSG_LEN);
        nlh->nlmsg_pid = getpid();
        nlh->nlmsg_flags = 0;
        strcpy(NLMSG_DATA(nlh), USER_HELLO);

        iov.iov_base = (void *)nlh;
        iov.iov_len = NLMSG_SPACE(MAX_NL_MSG_LEN);
        memset(&msg, 0, sizeof(msg));
        msg.msg_name = (void *)&server_addr;
        msg.msg_namelen = sizeof(server_addr);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        /* send the message to the server */
        if (sendmsg(sockfd, &msg, 0) < 0) {
            printf("sendmsg() error! %s\n", strerror(errno));
            exit(1);
        }
        printf("send msg: %s\n", (char *)NLMSG_DATA(nlh));

        free(nlh);
        close(sockfd);
        exit(0);
    }
Execute:
- server
    $ ./server
    Bind to pid=888
    1024 bytes received: Hello from user space client
- client
    $ sudo ./client
    Bind to pid=8414
    send msg: Hello from user space client
Netlink in kernel space
Netlink supports a fixed number (MAX_LINKS = 32) of "protocols". All the sockets for a given "protocol" are linked in a list. The kernel-space netlink API is provided by the netlink core in the kernel, net/netlink/af_netlink.c. Unless you reuse one of the existing netlink protocol types, you need to add your own protocol type by defining a new constant alongside those in netlink.h. In user space, we call socket() to create a netlink socket, but in kernel space, we call the following APIs:
- net/core/net_namespace.c
    struct net init_net = {
        .count = REFCOUNT_INIT(1),
        .dev_base_head = LIST_HEAD_INIT(init_net.dev_base_head),
    #ifdef CONFIG_KEYS
        .key_domain = &init_net_key_domain,
    #endif
    };
- include/linux/netlink.h
    struct netlink_kernel_cfg {
        unsigned int groups;
        unsigned int flags;
        void (*input)(struct sk_buff *skb);
        struct mutex *cb_mutex;
        int (*bind)(struct net *net, int group);
        void (*unbind)(struct net *net, int group);
        bool (*compare)(struct net *net, struct sock *sk);
    };

    struct sock *netlink_kernel_create(struct net *net, int unit,
                                       struct netlink_kernel_cfg *cfg);
    void netlink_kernel_release(struct sock *sk);
The parameters in netlink_kernel_create():
- net In general, fill &init_net directly.
- unit The protocol type, customizable, such as #define NETLINK_TEST 25.
- cfg The fields used in netlink_kernel_cfg:
- groups The number of multicast groups for this protocol
- flags
- input() A callback function that receives each incoming message as a struct sk_buff.
- bind()
- unbind()
- compare()
Process to kernel communication
Source:
- server (kernel module)
Makefile:
    KDIR := /lib/modules/$(shell uname -r)/build
    obj-m += kernel_nl.o

    all:
    	$(MAKE) -C $(KDIR) M=$(PWD) modules

    clean:
    	rm -rf *.o *.ko *.mod.* *.cmd .module* modules* Module* .*.cmd .tmp*
    	$(MAKE) -C $(KDIR) M=$(PWD) clean
kernel_nl.c:
    #include <linux/init.h>
    #include <linux/module.h>
    #include <net/sock.h>
    #include <net/netlink.h>
    #include <linux/string.h>

    #define NETLINK_TEST 25
    #define MAX_MSGSIZE 1024

    static struct sock *nl_sock;

    static void send_msg(const char *msg, int pid)
    {
        struct sk_buff *skb;
        struct nlmsghdr *nlh;
        int msglen;

        if (msg == NULL || nl_sock == NULL)
            return;

        msglen = strlen(msg);
        if (msglen >= MAX_MSGSIZE) /* message plus NUL must fit the payload */
            return;

        skb = alloc_skb(NLMSG_SPACE(MAX_MSGSIZE), GFP_KERNEL);
        if (skb == NULL) {
            printk(KERN_ERR "alloc skb failed!\n");
            return;
        }

        nlh = nlmsg_put(skb, 0, 0, 0, MAX_MSGSIZE, 0);
        NETLINK_CB(skb).portid = 0;
        NETLINK_CB(skb).dst_group = 0;
        memcpy(NLMSG_DATA(nlh), msg, msglen + 1);
        printk(KERN_INFO "send msg: %s\n", (char *)NLMSG_DATA(nlh));

        /* netlink_unicast() takes ownership of skb */
        netlink_unicast(nl_sock, skb, pid, MSG_DONTWAIT);
    }

    static void recv_msg(struct sk_buff *in_skb)
    {
        struct sk_buff *skb = skb_get(in_skb); /* take a reference */
        struct nlmsghdr *nlh;

        if (skb->len >= nlmsg_total_size(0)) {
            nlh = nlmsg_hdr(skb);
            printk(KERN_INFO "receive msg: %s\n", (char *)NLMSG_DATA(nlh));
            send_msg("Hello app!", nlh->nlmsg_pid);
        }
        kfree_skb(skb); /* drop the reference, also for short messages */
    }

    static int __init netlink_init(void)
    {
        struct netlink_kernel_cfg netlink_cfg;

        memset(&netlink_cfg, 0, sizeof(struct netlink_kernel_cfg));
        netlink_cfg.input = recv_msg;

        nl_sock = netlink_kernel_create(&init_net, NETLINK_TEST, &netlink_cfg);
        if (nl_sock == NULL) {
            printk(KERN_ERR "netlink: netlink_kernel_create failed!\n");
            return -ENOMEM;
        }
        printk(KERN_INFO "netlink: netlink module init success!\n");
        return 0;
    }

    static void __exit netlink_exit(void)
    {
        if (nl_sock != NULL)
            netlink_kernel_release(nl_sock);
        printk(KERN_INFO "netlink: netlink module exit success!\n");
    }

    module_init(netlink_init);
    module_exit(netlink_exit);
    MODULE_LICENSE("GPL");
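- client: the user-space client from the previous section can be reused with two changes.
- Use the user-defined netlink protocol
    #define NETLINK_TEST 25
    sockfd = socket(PF_NETLINK, SOCK_RAW, NETLINK_TEST);
- Address the kernel, whose nl_pid is 0, instead of the user-space server
    server_addr.nl_pid = 0;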