Linux Server Tips
Linux OS installation, configuration
init, systemd
The init process is the first process run by the kernel at the end of the bootstrap procedure.
Historically:
init reads system-dependent initialization files from /etc/rc/* and brings the system to the state defined in the /etc/inittab file.
You can use the init command to reread this file or change to a new runlevel.
For example, to change from your current runlevel to runlevel 1 (the single-user mode), you can use the init 1 command.
"Runlevels" are an obsolete way to start and stop groups of services used in SysV init.
systemd provides a compatibility layer that maps runlevels to targets, and associated binaries like runlevel.
Mapping between runlevels and systemd targets:
┌─────────┬───────────────────┐
│Runlevel │ Target │
├─────────┼───────────────────┤
│0 │ poweroff.target │
├─────────┼───────────────────┤
│1 │ rescue.target │
├─────────┼───────────────────┤
│2, 3, 4 │ multi-user.target │
├─────────┼───────────────────┤
│5 │ graphical.target │
├─────────┼───────────────────┤
│6 │ reboot.target │
└─────────┴───────────────────┘
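The mapping above can be sketched as a small shell function (illustrative only; on a real systemd machine you would use `systemctl get-default` to show the default target and `systemctl isolate <target>` in place of `init <runlevel>`):

```shell
# Map a SysV runlevel to its systemd target (mirrors the table above).
runlevel_to_target() {
  case "$1" in
    0) echo poweroff.target ;;
    1) echo rescue.target ;;
    2|3|4) echo multi-user.target ;;
    5) echo graphical.target ;;
    6) echo reboot.target ;;
    *) echo unknown ;;
  esac
}

runlevel_to_target 3   # multi-user.target
```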
The runlevel command prints the previous and current SysV runlevel if they are known.
$ runlevel
N 5
The two runlevel characters are separated by a single space character. If a runlevel cannot be determined, N is printed instead. If neither can be determined, the word "unknown" is printed.
/sbin/init -> /lib/systemd/systemd
systemd is a system and service manager for Linux operating systems. When run as the first process on boot (as PID 1), it acts as the init system that brings up and maintains userspace services.
When run as a system instance, systemd interprets the configuration file /etc/systemd/system.conf and the files in system.conf.d directories; when run as a user instance, systemd interprets the configuration file /etc/systemd/user.conf and the files in user.conf.d directories.
$ tree /etc/systemd -L 2
/etc/systemd
├── journald.conf
├── logind.conf
├── network
├── resolved.conf
├── system
│   ├── bluetooth.target.wants
│   ├── brltty.service -> /dev/null
│   ├── cloud-final.service.wants
│   ├── dbus-fi.w1.wpa_supplicant1.service -> /lib/systemd/system/wpa_supplicant.service
│   ├── dbus-org.bluez.service -> /lib/systemd/system/bluetooth.service
│   ├── dbus-org.freedesktop.Avahi.service -> /lib/systemd/system/avahi-daemon.service
│   ├── dbus-org.freedesktop.ModemManager1.service -> /lib/systemd/system/ModemManager.service
│   ├── dbus-org.freedesktop.nm-dispatcher.service -> /lib/systemd/system/NetworkManager-dispatcher.service
│   ├── dbus-org.freedesktop.resolve1.service -> /lib/systemd/system/systemd-resolved.service
│   ├── dbus-org.freedesktop.thermald.service -> /lib/systemd/system/thermald.service
│   ├── default.target.wants
│   ├── display-manager.service -> /lib/systemd/system/gdm3.service
│   ├── display-manager.service.wants
│   ├── final.target.wants
│   ├── getty.target.wants
│   ├── graphical.target.wants
│   ├── libvirt-bin.service -> /lib/systemd/system/libvirtd.service
│   ├── multi-user.target.wants
│   ├── network-online.target.wants
│   ├── oem-config.service.wants
│   ├── paths.target.wants
│   ├── printer.target.wants
│   ├── snap-core18-1279.mount
│   ├── snap-core18-1288.mount
│   ├── snap-core-8213.mount
│   ├── snap-core-8268.mount
│   ├── snap-gnome\x2d3\x2d26\x2d1604-97.mount
│   ├── snap-gnome\x2d3\x2d26\x2d1604-98.mount
│   ├── snap-gnome\x2d3\x2d28\x2d1804-110.mount
│   ├── snap-gnome\x2d3\x2d28\x2d1804-91.mount
│   ├── snap-gnome\x2dcalculator-536.mount
│   ├── snap-gnome\x2dcalculator-544.mount
│   ├── snap-gnome\x2dcharacters-367.mount
│   ├── snap-gnome\x2dcharacters-375.mount
│   ├── snap-gnome\x2dlogs-73.mount
│   ├── snap-gnome\x2dlogs-81.mount
│   ├── snap-gnome\x2dsystem\x2dmonitor-111.mount
│   ├── snap-gnome\x2dsystem\x2dmonitor-123.mount
│   ├── snap-gtk\x2dcommon\x2dthemes-1313.mount
│   ├── snap-gtk\x2dcommon\x2dthemes-1353.mount
│   ├── snap-vlc-1049.mount
│   ├── snap-vlc-1397.mount
│   ├── sockets.target.wants
│   ├── spice-vdagentd.target.wants
│   ├── sshd.service -> /lib/systemd/system/ssh.service
│   ├── sysinit.target.wants
│   ├── syslog.service -> /lib/systemd/system/rsyslog.service
│   ├── teamviewerd.service
│   └── timers.target.wants
├── system.conf
├── timesyncd.conf
├── user
│   └── default.target.wants
└── user.conf
- The old init boot scripts start tasks strictly one after another, so even services with no dependency on each other must wait in line. Today's hardware and operating systems almost all support multi-core architectures, so there is no reason independent services cannot start at the same time. systemd starts such services concurrently, which is why the system boots noticeably faster.
- systemd is managed with a single command, systemctl.
- systemd can check service dependencies for you: if service B is built on top of service A, systemd will automatically start A when you start B.
- Like the runlevel feature in SysV init, systemd groups related functionality into "target" units. A target describes an operating environment to set up, so it is a collection of daemons; activating a target means starting many daemons at once.
The main command for inspecting and controlling systemd is systemctl. It can be used to view the system state and manage the system and its services.
- Analyzing the system state
- Show system status
$ systemctl status
$ systemctl
The available unit files can be seen in /usr/lib/systemd/system/ and /etc/systemd/system/ (the latter takes precedence).
$ systemctl --failed
$ systemctl list-unit-files
- .service A service unit describes how to manage a service or application on the server. This will include how to start or stop the service, under which circumstances it should be automatically started, and the dependency and ordering information for related software.
- .socket A socket unit file describes a network or IPC socket, or a FIFO buffer that systemd uses for socket-based activation. These always have an associated .service file that will be started when activity is seen on the socket that this unit defines.
- .device A unit that describes a device that has been designated as needing systemd management by udev or the sysfs filesystem. Not all devices will have .device files. Some scenarios where .device units may be necessary are for ordering, mounting, and accessing the devices.
- .mount This unit defines a mountpoint on the system to be managed by systemd. These are named after the mount path, with slashes changed to dashes. Entries within /etc/fstab can have units created automatically.
- .automount An .automount unit configures a mountpoint that will be automatically mounted. These must be named after the mount point they refer to and must have a matching .mount unit to define the specifics of the mount.
- .swap This unit describes swap space on the system. The name of these units must reflect the device or file path of the space.
- .target A target unit is used to provide synchronization points for other units when booting up or changing states. They also can be used to bring the system to a new state. Other units specify their relation to targets to become tied to the target’s operations.
- .path This unit defines a path that can be used for path-based activation. By default, a .service unit of the same base name will be started when the path reaches the specified state. This uses inotify to monitor the path for changes.
- .timer A .timer unit defines a timer that will be managed by systemd, similar to a cron job for delayed or scheduled activation. A matching unit will be started when the timer is reached.
- .snapshot A .snapshot unit is created automatically by the systemctl snapshot command. It allows you to reconstruct the current state of the system after making changes. Snapshots do not survive across sessions and are used to roll back temporary states.
- .slice A .slice unit is associated with Linux Control Group nodes, allowing resources to be restricted or assigned to any processes associated with the slice. The name reflects its hierarchical position within the cgroup tree. Units are placed in certain slices by default depending on their type.
- .scope Scope units are created automatically by systemd from information received from its bus interfaces. These are used to manage sets of system processes that are created externally.
- Start a unit immediately # systemctl start unit
- Stop a unit immediately # systemctl stop unit
- Restart a unit # systemctl restart unit
- Ask a unit to reload its configuration # systemctl reload unit
- Show the status of a unit, including whether it is running or not $ systemctl status unit
- Check whether a unit is already enabled or not $ systemctl is-enabled unit
- Enable a unit to be started on bootup # systemctl enable unit
- Enable a unit to be started on bootup and start it immediately # systemctl enable --now unit
- Disable a unit so it is not started during bootup # systemctl disable unit
- Show the manual page associated with a unit (this has to be supported by the unit file) $ systemctl help unit
- Reload systemd manager configuration, scanning for new or changed units # systemctl daemon-reload
Writing unit files
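A minimal service unit might look like the sketch below. The name myapp.service and the ExecStart path are hypothetical; on a real system the file would go in /etc/systemd/system/ and be activated with systemctl daemon-reload followed by systemctl enable --now myapp. Here it is written to the current directory purely for illustration:

```shell
# Write a minimal, hypothetical unit file (names/paths are made up).
cat > myapp.service << 'EOF'
[Unit]
Description=Example application (hypothetical)
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/myapp --serve
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
```

The [Install] section is what `systemctl enable` reads: WantedBy=multi-user.target creates a symlink in multi-user.target.wants/, tying the unit to that target's operations, exactly as described for .target units above.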
Linux Server Hacks
Hack#2 Console Login without Password
Use GRUB to invoke bash:
- Select the appropriate boot entry in the GRUB menu and press e to edit the line.
- Select the kernel line and press e again to edit it.
- Append init=/bin/bash at the end of the line.
- Press Ctrl-X to boot (this change is only temporary and will not be saved to your menu.lst). After booting you will be at the bash prompt.
- Your root filesystem is mounted read-only now, so remount it read/write:
mount -n -o remount,rw /
- Use the passwd command to create a new root password.
- Mount filesystems manually: /etc/fstab is not processed because init was not executed. Use mount -a to mount everything listed in /etc/fstab, or mount the necessary filesystems one by one.
- Reboot by typing reboot -f and do not lose your password again!
init= Run the specified binary instead of /sbin/init as the init process.
Hack#5 n>&m
Each open file in Linux has a corresponding file descriptor associated with it.
The Bourne shell operator n>&m duplicates file descriptors:
it makes descriptor n point to the same open file as descriptor m.
For example:
- output including error
$ ls test* test.none
ls: cannot access 'test.none': No such file or directory
test.c  testc.cpp  test.cpio.gz  test.cpp  testelf.c  test.sh
$ ls test* test.none 1>'./stdout'
ls: cannot access 'test.none': No such file or directory
$ cat ./stdout
test.c  testc.cpp  test.cpio.gz  test.cpp  testelf.c  test.sh
$ ls test* test.none 1>'./stdout' 2>&1
$ cat ./stdout
ls: cannot access 'test.none': No such file or directory
test.c  testc.cpp  test.cpio.gz  test.cpp  testelf.c  test.sh
The standard error is redirected to the standard output, which has already been redirected to a file.
POSIX definition of Redirection Operator:
In the shell command language, a token that performs a redirection function.
Redirections are processed in the order they appear, from left to right.
It is one of the following symbols:
- command < file.txt Gives input to a command.
- command <> file.txt Opens the file for both reading and writing on standard input. If the file doesn't exist, it will be created.
- command > file.txt Directs the output of a command into a file.
- command >| file.txt Does the same as >, but overwrites the target even when the shell's noclobber option is set.
- command << WORD A here document.
command << WORD
Text
WORD
Everything up to the terminating WORD (here, the line Text) becomes the input to the command.
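Because redirections are processed left to right, the position of 2>&1 matters. A quick sketch:

```shell
# '>out.txt' comes first, then 2>&1 duplicates stderr onto the
# (already redirected) stdout: both streams end up in the file.
ls no-such-file > out.txt 2>&1 || true

# Here 2>&1 happens first, while stdout still points at the terminal,
# so only stdout is captured; the error message stays on the terminal.
ls no-such-file 2>&1 > out2.txt || true
```

After running this, out.txt holds the "No such file or directory" message while out2.txt is empty, since ls writes nothing to stdout for a missing file.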
Linux disk and filesystem management
/etc/fstab
The file fstab contains descriptive information about the filesystems the system can mount.
Each filesystem is described on a separate line. Fields on each line are separated by tabs or spaces.
(file system) (mount point) (type) (options) (dump) (pass)
- file system the block device (/dev/xxx) or remote filesystem (host:dir) to be mounted.
- mount point directory
- type filesystem type
- options It is formatted as a comma-separated list of options.
- dump Defaults to zero (don't dump) if not present. Used by dump to determine which filesystems need to be dumped; this is rarely used today.
- pass This field is used by fsck to determine the order in which filesystem checks are done at boot time. The root filesystem should be specified with 1. Other filesystems should have 2. Defaults to zero (don't fsck) if not present.
LABEL=label or UUID=uuid may be given instead of a device name.
/etc/fstab is a list of filesystems to be mounted at boot time.
/etc/mtab is a list of currently mounted filesystems.
How to determine/find UUID of a partition?
In Linux, a UUID (Universally Unique Identifier) identifies media more accurately and reliably. Identifying media via /dev/hdXY or /dev/sdXY is not a good method because the device order may differ between boots, so it is no longer preferred, especially in fstab or the GRUB configuration.
libuuid is part of the util-linux-ng package (since version 2.15.1) and is installed by default on Linux systems.
The UUIDs generated by this library can be reasonably expected to be unique within a system, and unique across all systems.
UUIDs are represented as 32 hexadecimal (base 16) digits, displayed in five groups separated by hyphens in the form 8-4-4-4-12, for a total of 36 characters (32 hexadecimal digits and four hyphens).
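On Linux the kernel itself can hand out a fresh UUID in this format (the uuidgen tool from util-linux does the same thing):

```shell
# Read a randomly generated version-4 UUID from the kernel (Linux-specific).
uuid=$(cat /proc/sys/kernel/random/uuid)
echo "$uuid"      # five hyphen-separated groups: 8-4-4-4-12 hex digits
echo "${#uuid}"   # 36
```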
How to find UUIDs of my hard disk partitions?
~$ tree /dev/disk/
/dev/disk/
├── by-id
│   ├── ata-TEAC_DVD-ROM_DV18SA_10091725083237 -> ../../sr0
│   ├── ata-WDC_WD2500BEKT-75A25T0_WD-WXQ1A80V7620 -> ../../sda
│   ├── ata-WDC_WD2500BEKT-75A25T0_WD-WXQ1A80V7620-part1 -> ../../sda1
│   ├── wwn-0x50014ee655d0b10e -> ../../sda
│   └── wwn-0x50014ee655d0b10e-part1 -> ../../sda1
├── by-partuuid
│   └── abfa7e81-01 -> ../../sda1
├── by-path
│   ├── pci-0000:00:1f.2-ata-1 -> ../../sda
│   ├── pci-0000:00:1f.2-ata-1-part1 -> ../../sda1
│   └── pci-0000:00:1f.2-ata-2 -> ../../sr0
└── by-uuid
    └── 3db7ffaf-51bc-4f72-a09d-5ec2f3904c08 -> ../../sda1

$ sudo blkid
[sudo] password for jerry:
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
/dev/loop2: TYPE="squashfs"
/dev/loop3: TYPE="squashfs"
/dev/loop4: TYPE="squashfs"
/dev/loop5: TYPE="squashfs"
/dev/loop6: TYPE="squashfs"
/dev/loop7: TYPE="squashfs"
/dev/sda1: UUID="3db7ffaf-51bc-4f72-a09d-5ec2f3904c08" TYPE="ext4" PARTUUID="abfa7e81-01"
/dev/loop8: TYPE="squashfs"
/dev/loop9: TYPE="squashfs"
/dev/loop10: TYPE="squashfs"
/dev/loop11: TYPE="squashfs"
/dev/loop12: TYPE="squashfs"

$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# / was on /dev/sda1 during installation
UUID=3db7ffaf-51bc-4f72-a09d-5ec2f3904c08 / ext4 errors=remount-ro 0 1
/swapfile none
How to generate a new UUID for a partition?
tune2fs allows the system administrator to adjust various tunable filesystem parameters on Linux ext2, ext3, or ext4 filesystems.
tune2fs [ -l ] [ -c max-mount-counts ] [ -e errors-behavior ] [ -f ] [ -i interval-between-checks ] [ -I new_inode_size ] [ -j ] [ -J journal-options ] [ -m reserved-blocks-percentage ] [ -o [^]mount-options[,...] ] [ -r reserved-blocks-count ] [ -u user ] [ -g group ] [ -C mount-count ] [ -E extended-options ] [ -L volume-label ] [ -M last-mounted-directory ] [ -O [^]feature[,...] ] [ -Q quota-options ] [ -T time-last-checked ] [ -U UUID ] [ -z undo_file ] device
The device specifier can be either a filename (e.g., /dev/sda1) or a LABEL or UUID specifier: "LABEL=volume-label" or "UUID=uuid" (e.g., LABEL=home or UUID=e40486c6-84d5-4f2f-b99c-032281799c9d).
~$ sudo tune2fs -U random /dev/sda1
tune2fs 1.44.1 (24-Mar-2018)
The UUID may only be changed when the filesystem is unmounted.
After changing it, /etc/fstab should be updated to use the new UUID.
Linux Server Hacks
Hack#8 Immutable Files in ext2/ext3
Linux provides access control through file and directory permissions at three levels: user, group, and other. These file permissions provide the basic level of security and access control.
The umask utility controls the file-creation mode mask, which determines the initial permission bits of newly created files. Because umask affects the current shell execution environment, it is usually implemented as a shell built-in.
$ umask
0022
$ umask -S
u=rwx,g=rx,o=rx
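A quick sketch of the mask in action: the bits set in the umask are stripped from the default creation mode (666 for regular files):

```shell
umask 077                  # strip all group/other permission bits
touch private.txt          # created as 666 & ~077 = 600
stat -c '%a' private.txt   # 600

umask 022
touch shared.txt           # created as 666 & ~022 = 644
stat -c '%a' shared.txt    # 644
```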
Linux also has advanced access control features like ACLs (Access Control Lists) and attributes. Attributes define properties of files.
a: append only
c: compressed
d: no dump
e: extent format
i: immutable
j: data journalling
s: secure deletion
t: no tail-merging
u: undeletable
A: no atime updates
C: no copy on write
D: synchronous directory updates
S: synchronous updates
T: top of directory hierarchy
For example, a file with the i attribute cannot be modified: it cannot be deleted or renamed, no link can be created to it, and no data can be written to it. When set, this attribute prevents even the superuser from erasing or changing the file's contents.
Several Linux-native filesystems support these attributes; adjust them with the chattr command and display them with the lsattr command.
- chattr
chattr [-RVf] [-+=AacDdijsTtSu] [-v version] files...
lsattr [ -RVadv ] [ files... ]
Hack#11 Finding and Eliminating setuid/setgid Binaries
Linux uses a combination of bits to store the permissions of a file. We can change the permissions using the chmod command, which essentially changes the ‘r’, ‘w’ and ‘x’ characters associated with the file.
$ chmod u=rwx filename
$ chmod go=rx filename
$ chmod g+w foobar
$ chmod a-w foobar
Note: a means all; use it instead of typing "ugo".
Furthermore, file ownership is recorded as the uid (user ID) and gid (group ID) of the creator. When we launch a process, it runs with the uid (effective user ID) and gid (effective group ID) of the user who launched it.
- The setuid bit When the setuid bit is set, an executable runs with the permissions of its owner instead of those of the user who launched it. To spot setuid, look for an 's' instead of an 'x' in the owner's execute bit of the file permissions. An example of an executable with the setuid permission is passwd:
$ ls -l /usr/bin/passwd
-rwsr-xr-x 1 root root 59640 Mar 23 2019 /usr/bin/passwd
This means that passwd executes with the permissions of its owner, root. If a vulnerable program runs with root privileges, an attacker could gain root access to the system through it. To find all setuid files:
find /usr/bin -perm -u+s -type f -print | xargs ls -ld
find /usr/bin -perm -g+s -type f -print | xargs ls -ld
The sticky bit shows up as a 't' in the others' execute bit, as on /tmp:
drwxrwxrwt 22 root root 4096 Jan 1 12:35 tmp
Therefore,
- setuid ==> the program runs with its owner's file-access permissions
- setgid ==> the program runs with its group's permissions; on a directory, new files inherit the directory's group (group sharing)
- sticky bit ==> directory sharing without file sharing: users may delete only their own files
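The special bits can be set with chmod's octal notation, where a fourth leading digit encodes them (4 = setuid, 2 = setgid, 1 = sticky); a sketch on scratch files:

```shell
touch mytool
chmod 4755 mytool                          # setuid: shows as -rwsr-xr-x
stat -c '%a' mytool                        # 4755

mkdir group-dir && chmod 2775 group-dir    # setgid dir: new files inherit its group
mkdir drop-dir  && chmod 1777 drop-dir     # sticky: like /tmp, only owners may delete
```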
File Permissions: ACLs
ACLs are a second level of discretionary permissions, that may override the standard ugo/rwx ones. ACLs are used to define more fine-grained discretionary access control for files and directories.
An ACL consists of entries specifying access permissions on an associated object. ACLs can be configured per user, per group or via the effective rights mask.
- Enabling ACLs in the Filesystem The file system must be mounted with ACLs turned on. To check if it is on,
$ sudo tune2fs -l /dev/sda1 | grep acl
Default mount options: user_xattr acl
The partition(s) on which you want to enable ACLs can be configured in /etc/fstab:
... UUID=07aebd28-24e3-cf19-e37d-1af9a23a45d4 /home ext4 defaults,acl 0 2 ...
$ sudo touch testAcl.txt
$ ll testAcl.txt
-rw-r--r-- 1 root root 0 Jan 12 14:33 testAcl.txt
# confirm settings
$ getfacl testAcl.txt
# file: testAcl.txt
# owner: root
# group: root
user::rw-
group::r--
other::r--
# try to write as "jerry"
$ echo "acl" > testAcl.txt
bash: testAcl.txt: Permission denied
# try to read as "jerry"
$ cat testAcl.txt
# grant write permission to "jerry"
$ sudo setfacl -m u:jerry:w testAcl.txt
$ ll testAcl.txt
-rw-rw-r--+ 1 root root 0 Jan 12 14:33 testAcl.txt
$ getfacl testAcl.txt
# file: testAcl.txt
# owner: root
# group: root
user::rw-
user:jerry:-w-
group::r--
mask::rw-
other::r--
Hack#12 Make sudo Work hard
sudo allows a permitted user to execute a command as the superuser or another user.
sudo is a setuid program.
$ ls -l /usr/bin/sudo
-rwsr-xr-x 1 root root 149080 Oct 11 02:32 /usr/bin/sudo
sudo works in conjunction with security policies. The default security policy is sudoers, configurable via the /etc/sudoers file.
Hack#16 Fun with /proc
The /proc filesystem contains a representation of the kernel's live process table.
The directories named by numbers contain information about every process running on the system; each number corresponds to a PID.
Take a look at the structure for each process; some entries are useful:
- cwd
- exe
- cmdline
- environ
Hack#18 Manage System resources Per Process
Bash provides a utility "ulimit".
"ulimit" isn't a separate binary; it is built into the shell itself.
$ ulimit
unlimited
With no option, ulimit reports the file-size limit (-f); "unlimited" here means the current user may create files of unlimited size.
To get the report in details,
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 15274
max locked memory       (kbytes, -l) 16384
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 15274
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
All the limits applicable to users are defined in:
/etc/security/limits.conf
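Soft limits can also be lowered for the current shell (or a subshell) without touching limits.conf; a sketch:

```shell
# Lower the soft open-files limit inside a subshell; the parent is unaffected.
(
  ulimit -S -n 512
  ulimit -S -n        # 512
)
ulimit -S -n          # still the original value
```

The persistent equivalent in /etc/security/limits.conf would be a line such as `jerry soft nofile 512` (the username is hypothetical). Soft limits may be raised back up only as far as the hard limit; only root can raise a hard limit.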
Partitioning
Partitioning a hard drive divides the available space into sections that can be accessed independently. An entire drive may be allocated to a single partition, or multiple ones for cases such as dual-booting, maintaining a swap partition, or to logically separate data such as audio and video files.
The required information is stored in a partition table scheme such as MBR or GPT.
MBR
The Master Boot Record (MBR) is the first 512 bytes of a storage device. It contains an operating system bootloader and the storage device's partition table. It plays an important role in the boot process under BIOS systems.
Note: The MBR is not located in a partition; it is located in the first sector (usually 512 bytes) of the device (physical offset 0), preceding the first partition.
- bootstrap code The first 440 bytes of MBR are the bootstrap code area. The bootstrap code can be backed up, restored from backup or erased using dd.
- partition table There are 3 types of partitions:
- Primary Primary partitions can be bootable and are limited to four partitions per disk
- Extended A hard disk can contain at most one extended partition. The extended partition also counts as one of the four primary partitions, and it serves as a container for logical partitions.
- Logical Logical partitions live inside the extended partition and are not limited to four per disk.
GPT
The GUID partition table (GPT) partitioning scheme was introduced by Intel as part of an effort to introduce more modern firmware to generic PC hardware.
GPT is part of the Unified Extensible Firmware Interface (UEFI) specification; it uses globally unique identifiers (GUIDs), or UUIDs in the Linux world, to define partitions and partition types.
Compared with MBR, the GPT (Globally Unique Identifier Partition Table, GUID Partition Table) partitioning scheme supports up to 128 partitions, each as large as 18 EB (exabytes).
The high-level summary of the block layout used by GPT:
Block | Description
---|---
0 | Protective MBR
1 | Partition Table Header (primary)
2 through 2+b-1 | Partition Entry Array (primary)
2+b through n-2-b | partition data
n-2-b+1 through n-2 | Partition Entry Array (backup)
n-1 | Partition Table Header (backup)
- Protective MBR At the start of a GPT disk there is a protective Master Boot Record (PMBR) to protect against GPT-unaware software. This protective MBR, just like an ordinary MBR, has a bootstrap code area, which can be used for BIOS/GPT booting with boot loaders that support it. A GPT-unaware program sees the GPT disk as an MBR disk with a single, unknown partition.
- Partition Table Header (primary) A structure that defines various aspects of the disk:
- a GUID to uniquely identify the disk
- the starting block of the partition entry array
- the size of each partition entry in that array
- Partition Entry Array (primary) An array of partition entries, each of which defines a partition (or is all zeros, indicating that the entry is not in use). The array is treated as an array of bytes: the first partition entry starts at the first byte of the array, the next entry follows immediately after it, and so on. The size of each entry is given by a field in the partition table header. Each partition entry contains:
- a GUID to uniquely identify the partition itself
- a GUID to identify the partition type
- the start and end block of the partition
- the partition name
- partition data
- Partition Entry Array (backup)
- Partition Table Header (backup)
Tools and Usages
- Check for an existing partition
$ sudo fdisk -l /dev/sdb
Disk /dev/sdb: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00027921
Disklabel type indicates which partition table is in use: dos (for MBR) or gpt.
$ sudo parted /dev/sdb
GNU Parted 3.2
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
- create a new GPT partition table
(parted) mklabel gpt
- create a new MBR (msdos) partition table
(parted) mklabel msdos
- create a new partition
mkpart part-type fs-type start end
where
- part-type This is meaningful only for MBR partition tables:
- primary
- extended
- logical
- fs-type This can be listed by entering help mkpart.
- start the beginning of the partition from the start of the device
- end the end of the partition from the start of the device
- s sector (n bytes depending on the sector size, often 512)
- MB megabyte (1000000 bytes)
- GB gigabyte (1000000000 bytes)
- % percentage of the device (between 0 and 100)
(parted) mkpart fat32 0GB 30GB
(parted) print
Model: Seagate Portable (scsi)
Disk /dev/sdb: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number  Start   End     Size    File system  Name   Flags
 1      1049kB  30.0GB  30.0GB               fat32

(parted) mkpart ext4 30GB 250GB
(parted) print
Model: Seagate Portable (scsi)
Disk /dev/sdb: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number  Start   End     Size    File system  Name   Flags
 1      1049kB  30.0GB  30.0GB               fat32
 2      30.0GB  250GB   220GB                ext4
- remove a partition
(parted) rm partition-number

To get the partition information:
$ sudo blkid
/dev/sdb1: UUID="EC2E-4699" TYPE="vfat" PARTLABEL="fat32" PARTUUID="a884291d-6558-4acb-9c2e-8d56ba0cbf21"
/dev/sdb2: UUID="99349696-e669-4ec9-8f7c-475ea4e97102" TYPE="ext4" PARTLABEL="ext4" PARTUUID="ec6f13f8-45a5-4e74-9f91-fdd1741fc3b6"
Booting Process
- BIOS Booting The BIOS firmware will be told which disk to boot the system from. It executes the bootloader it finds in the MBR of the specified disk, and that’s it. The firmware is no longer involved in booting. The BIOS firmware layer doesn’t really know what a bootloader is, or what an operating system is. All it can do is run the boot loader from a disk’s MBR.
- bootloader is a Linux loader GRUB does not fit in 440 bytes, the size of the Master Boot Record. Therefore, the bootstrap code that is loaded actually just parses the partition table, finds the /boot partition, and parses the filesystem information, it then loads Stage 2 GRUB. Stage 2 GRUB loads everything it needs, including the GRUB configuration, then presents a menu (or not, depending on user configuration). After a boot sequence is chosen, the Linux loader knows where the kernel file is and will load Linux kernel in RAM and execute it.
- bootloader is not a Linux loader The bootstrap code loads the boot sector of the active partition, in which the Linux loader is installed.
- UEFI Booting UEFI stands for Unified Extensible Firmware Interface, a standard specification for the firmware interface on a computer. UEFI systems require an EFI system partition (ESP): a partition on a data storage device that is used by computers adhering to UEFI. When a computer is booted, the UEFI firmware loads files stored on the ESP to start installed operating systems and various utilities.

The ESP is formatted with a file system whose specification is based on the FAT file system and maintained as part of the UEFI specification. Both GPT- and MBR-partitioned disks can contain an EFI system partition, as UEFI firmware is required to support both partitioning schemes.

UEFI provides backward compatibility with legacy systems by reserving the first block (sector) of the partition for compatibility code; on a legacy BIOS-based system, the first sector of a partition is loaded into memory and execution is transferred to that code. Many UEFI firmwares can also boot a system just as a BIOS firmware would: look for an MBR on a disk, execute the boot loader found in that MBR, and leave everything subsequent to that bootloader.

Regular UEFI boot keeps several lists of possible boot entries in UEFI config variables (normally in NVRAM), with boot-order config variables stored alongside them. This allows many different boot options and a properly defined fallback order; in many cases you can even list and choose which OS / boot loader to use from the system boot menu (similar to the boot-device menu implemented in many BIOSes). The boot sequence for UEFI consists of the following:
- The boot order list is read from a globally defined NVRAM variable. Modifications to this variable are only guaranteed to take effect after the next platform reset. The boot order list defines a list of NVRAM variables that contain information about what is to be booted. Each NVRAM variable defines a name for the boot option that can be displayed to a user.
- The variable also contains a pointer to the hardware device and to a file on that hardware device that contains the UEFI image to be loaded.
- The variable might also contain paths to the OS partition and directory along with other configuration specific directories
Drive and partition backups with dd
dd is a command whose name is commonly said to stand for "data duplicator".
Note that dd copies "empty" space too: if a partition is 200 MB in size, the output file will be 200 MB even if the partition contains only 100 MB of data.
- One of the most typical use cases for the utility is backing up the MBR. To back up the MBR of the /dev/sda disk:
$ sudo dd if=/dev/sda bs=512 count=1 of=mbr.img
- Back up a partition to an image file, and restore it
# dd if=/dev/sda1 of=/srv/boot.img
# dd if=/srv/boot.img of=/dev/sda1
- copy the partition table to another disk with sfdisk (or use fdisk to recreate appropriately-sized partitions manually)
sfdisk -d /dev/sda | sfdisk /dev/sdb
# fdisk -l /dev/sda; fdisk -l /dev/sdb
# dd if=/dev/sda of=/dev/sdb bs=446 count=1
# dd if=/dev/sda1 of=/dev/sdb1 # dd if=/dev/sda2 of=/dev/sdb2
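The dd invocations above can be practiced safely on a file image instead of a real disk (all file names below are hypothetical):

```shell
# Create a small scratch "disk" image: 2048 sectors of 512 bytes = 1 MiB.
dd if=/dev/zero of=disk.img bs=512 count=2048 status=none

# Back up only the bootstrap code area (first 446 bytes of the MBR)...
dd if=disk.img of=bootstrap.bak bs=446 count=1 status=none

# ...or the whole first sector, including the partition table.
dd if=disk.img of=mbr.bak bs=512 count=1 status=none
```

Using bs=446 preserves the partition table on the target when restoring, while bs=512 restores the partition table too.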
Understanding File System Superblock in Linux
Blocks in File System
A hard disk sector is the basic storage unit of the drive.
When a partition or disk is formatted, the sectors are first divided into small groups; each group of sectors is called a block. The block size of a file system is a software construct.
The Linux kernel performs all of its operations on a file system using that file system's block size. The block size can never be smaller than the hard disk's sector size and is always a multiple of it. The kernel also requires the file system block size to be smaller than or equal to the system page size. Use getconf to query system configuration variables:
$ getconf PAGE_SIZE
4096
The block size can be specified when a user formats a partition, using the command-line parameters available:
mkfs -t ext3 -b 4096 /dev/sda1
The block size you select will impact the following:
- Maximum File Size
- Maximum File System Size
- Performance
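Both sizes can be queried at run time; a sketch using getconf for the page size and GNU stat in filesystem mode for the block size of whatever filesystem holds the current directory:

```shell
getconf PAGE_SIZE       # system page size, commonly 4096 on x86-64
stat -f -c '%s' .       # block size of the filesystem holding the cwd
```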
The layout of a standard block group is approximately as follows:
For the special case of block group 0, the first 1024 bytes are unused, to allow for the installation of x86 boot sectors and other oddities. The superblock will start at offset 1024 bytes.
Superblock
The superblock records various information about the enclosing filesystem, such as block counts, inode counts, supported features, maintenance information, and more.
Just as inodes store the metadata of files, the superblock stores the metadata of the filesystem.
The superblock information of an existing file system can be viewed by using dumpe2fs,
$ sudo dumpe2fs -h /dev/sda1
[sudo] password for jerry:
dumpe2fs 1.44.1 (24-Mar-2018)
Filesystem volume name:
Last mounted on:          /
Filesystem UUID:          3db7ffaf-51bc-4f72-a09d-5ec2f3904c08
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              15269888
Block count:              61049344
Reserved block count:     3052467
Free blocks:              47122648
Free inodes:              14341481
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Mon Jun 24 16:11:33 2019
Last mount time:          Tue Dec 31 16:02:30 2019
Last write time:          Tue Dec 31 16:02:23 2019
Mount count:              102
Maximum mount count:      -1
Last checked:             Mon Jun 24 16:11:33 2019
Check interval:           0 ( )
Lifetime writes:          550 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
First orphan inode:       7837802
Default directory hash:   half_md4
Directory Hash Seed:      d9d5e4cd-c2d0-491c-88aa-1762b2295bb1
Journal backup:           inode blocks
Checksum type:            crc32c
Checksum:                 0xcb19d304
Journal features:         journal_incompat_revoke journal_64bit journal_checksum_v3
Journal size:             1024M
Journal length:           262144
Journal sequence:         0x0019c036
Journal start:            131623
Journal checksum type:    crc32c
Journal checksum:         0x0f28cdd7
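Most of the header fields dumpe2fs prints live at fixed offsets in the on-disk superblock. The following Python sketch parses a few of them from a raw image; the offsets follow the ext2 on-disk layout, and the synthetic image at the bottom is fabricated for illustration so the example runs without a real device:

```python
import struct

def read_superblock(image: bytes) -> dict:
    """Pull a few fields out of an ext2/3/4 superblock.

    The superblock starts 1024 bytes into the partition; within it,
    the magic number 0xEF53 lives at offset 56 (fields little-endian).
    """
    sb = image[1024:2048]
    inode_count, block_count = struct.unpack_from("<II", sb, 0)
    (log_block_size,) = struct.unpack_from("<I", sb, 24)
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != 0xEF53:
        raise ValueError("not an ext2/3/4 filesystem")
    return {
        "inode_count": inode_count,
        "block_count": block_count,
        "block_size": 1024 << log_block_size,  # 0 -> 1 KiB, 2 -> 4 KiB
    }

# Synthetic image mirroring the dumpe2fs output above:
img = bytearray(2048)
struct.pack_into("<II", img, 1024, 15269888, 61049344)  # inode/block counts
struct.pack_into("<I", img, 1024 + 24, 2)               # log block size
struct.pack_into("<H", img, 1024 + 56, 0xEF53)          # magic
print(read_superblock(bytes(img)))
```

On a real system you could feed it the first 2 KiB of a partition, e.g. `open("/dev/sda1", "rb").read(2048)` as root.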
STORAGE ADMINISTRATION GUIDE
2.3. THE /PROC VIRTUAL FILE SYSTEM
The following /proc files are relevant in managing and monitoring system storage:
- /proc/devices Displays various character and block devices that are currently configured.
- /proc/filesystems Lists all file system types currently supported by the kernel.
- /proc/mdstat Contains current information on multiple-disk or RAID configurations on the system, if they exist.
- /proc/mounts Lists all mounts currently used by the system.
- /proc/partitions Contains partition block allocation information.
CHAPTER 18. USING THE MOUNT COMMAND
- Listing Currently Mounted File Systems
mount
mount -t ext4
mount [option…] device directory
The device can be identified by:
- a full path to a block device “/dev/sda3”
- a universally unique identifier “UUID=34795a28-ca6d-4fd8-a347-73671d0c19cb”
- a volume label “LABEL=home”
mount --bind old_directory new_directory
This allows the file system under old_directory to be accessed via new_directory as well.
To also make the mounts nested under old_directory accessible, use
mount --rbind old_directory new_directory
6.4. BACKUP EXT2/3/4 FILE SYSTEMS
If the partition being backed up is an operating system partition, boot your system into single-user mode. Use dump to back up the contents of the partition:
# dump -0uf /backup-files/sda1.dump /dev/sda1
Note:
- If the system has been running for a long time, it is advisable to run e2fsck on the partitions before backup.
- dump should not be used on a heavily loaded, mounted filesystem, as it could back up corrupted versions of files.
- -level# The dump level (any integer). Level 0, a full backup, specified by -0, guarantees the entire file system is copied.
- -f file Write the backup to file.
- -u Update the file /var/lib/dumpdates after a successful dump. The format of /var/lib/dumpdates is human-readable.
6.5. RESTORE AN EXT2/3/4 FILE SYSTEM
If you are restoring an operating system partition, boot your system into rescue mode.
- Format the destination partition using the mkfs command.
- Prepare the working directories.
# mkdir /mnt/sda1
# mount -t ext3 /dev/sda1 /mnt/sda1
# cd /mnt/sda1
# restore -rf /backup-files/sda1.dump
12.2. FILESYSTEM-SPECIFIC INFORMATION FOR FSCK
The generic fsck command will attempt to detect the filesystem type, or it will accept parameters specifying the type. e2fsck is essentially a shortcut saying the filesystem is ext2/3/4. They all behave the same way and check the filesystem for errors.
fsck is simply a front end that calls the appropriate tool for the filesystem in question,
lrwxrwxrwx 1 root root      8 Jun 24  2019 /sbin/dosfsck -> fsck.fat
-rwxr-xr-x 1 root root 314080 Sep 27 02:01 /sbin/e2fsck
-rwxr-xr-x 1 root root  47232 Aug 23 07:47 /sbin/fsck
-rwxr-xr-x 1 root root  34928 Aug 23 07:47 /sbin/fsck.cramfs
lrwxrwxrwx 1 root root      6 Sep 27 02:01 /sbin/fsck.ext2 -> e2fsck
lrwxrwxrwx 1 root root      6 Sep 27 02:01 /sbin/fsck.ext3 -> e2fsck
lrwxrwxrwx 1 root root      6 Sep 27 02:01 /sbin/fsck.ext4 -> e2fsck
-rwxr-xr-x 1 root root  59472 Jan 25  2017 /sbin/fsck.fat
-rwxr-xr-x 1 root root  92264 Aug 23 07:47 /sbin/fsck.minix
lrwxrwxrwx 1 root root      8 Jun 24  2019 /sbin/fsck.msdos -> fsck.fat
lrwxrwxrwx 1 root root      8 Jun 24  2019 /sbin/fsck.vfat -> fsck.fat
If these filesystems encounter metadata inconsistencies while mounted, they will record this fact in the filesystem superblock. If e2fsck finds that a filesystem is marked with such an error, e2fsck will perform a full check.
CHAPTER 13. PARTITIONS
parted is a program to manipulate disk partitions.
$ sudo parted -l
[sudo] password for jerry:
Model: ATA WDC WD2500BEKT-7 (scsi)
Disk /dev/sda: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:

Number  Start   End    Size   Type     File system  Flags
 1      1049kB  250GB  250GB  primary  ext4         boot
CHAPTER 14. LVM (LOGICAL VOLUME MANAGER)
LVM2 can be used to gather existing storage devices into groups and allocate logical units from the combined space as needed.
Physical volumes are regular storage devices. LVM writes a header to the device to allocate it for management. LVM combines physical volumes into storage pools known as volume groups. A volume group can be sliced up into any number of logical volumes. Logical volumes are functionally equivalent to partitions on a physical disk, but with much more flexibility.
In summary, LVM can be used to combine physical volumes into volume groups to unify the storage space available on a system. Afterwards, administrators can segment the volume group into arbitrary logical volumes, which act as flexible partitions.
Each volume within a volume group is segmented into small, fixed-size chunks called extents. The extents on a physical volume are called physical extents, while the extents of a logical volume are called logical extents. A logical volume is simply a mapping that LVM maintains between logical and physical extents.
To use LVM, the first step is to scan the system for block devices that LVM can see and manage.
$ sudo lvmdiskscan | grep sd
  /dev/sda1 [ 232.88 GiB]
  /dev/sdb1 [ <27.94 GiB]
  /dev/sdb2 [ 204.89 GiB]
  /dev/sdb3 [ <46.57 GiB]
Warning: Make sure that you double-check that the devices you intend to use with LVM do not have any important data already written to them. Using these devices within LVM will overwrite the current contents.
We can mark these 2 physical partitions as physical volumes within LVM using the pvcreate command:
$ sudo pvcreate /dev/sdb2 /dev/sdb3
WARNING: ext4 signature detected on /dev/sdb2 at offset 1080. Wipe it? [y/n]: y
  Wiping ext4 signature on /dev/sdb2.
WARNING: ext4 signature detected on /dev/sdb3 at offset 1080. Wipe it? [y/n]: y
  Wiping ext4 signature on /dev/sdb3.
  Physical volume "/dev/sdb2" successfully created.
  Physical volume "/dev/sdb3" successfully created.
pvcreate initializes a physical volume (PV) so that it is recognized as belonging to LVM, and allows the physical volume to be used in a volume group (VG). A PV can be a disk partition, whole disk, meta device, or loopback file. We can use pvs to display information about physical volumes,
$ sudo pvs
  PV         VG  Fmt  Attr PSize   PFree
  /dev/sdb2      lvm2 ---  204.89g 204.89g
  /dev/sdb3      lvm2 ---  <46.57g <46.57g
PVs can be removed:
$ sudo pvremove /dev/sdb2 /dev/sdb3
  Labels on physical volume "/dev/sdb2" successfully wiped.
  Labels on physical volume "/dev/sdb3" successfully wiped.
Creating Volume Groups:
$ sudo vgcreate vg1 /dev/sdb2 /dev/sdb3
  Physical volume "/dev/sdb2" successfully created.
  Physical volume "/dev/sdb3" successfully created.
  Volume group "vg1" successfully created
We can see a brief summary of the volume group,
$ sudo vgs
  VG  #PV #LV #SN Attr   VSize   VFree
  vg1   2   0   0 wz--n- 251.45g 251.45g
Currently, the volume group has two physical volumes, zero logical volumes, and the combined capacity of the underlying devices. We can use the VG as a pool from which we can allocate logical volumes. You can use vgcreate to create a new VG on a PV, or vgextend to add a PV to an existing VG. To create logical volumes, we use the lvcreate command:
- pass in the volume group to pull from
- name the logical volume with the -n option
- specify the size with the -L option
$ sudo lvcreate -L 10G -n projects vg1
  Logical volume "projects" created.
$ sudo lvcreate -L 5G -n www vg1
  Logical volume "www" created.
$ sudo lvcreate -L 20G -n db vg1
  Logical volume "db" created.
We can see the logical volumes and their relationship to the volume group,
$ sudo vgs -o +lv_size,lv_name
  VG  #PV #LV #SN Attr   VSize   VFree   LSize  LV
  vg1   2   3   0 wz--n- 251.45g 216.45g 10.00g projects
  vg1   2   3   0 wz--n- 251.45g 216.45g  5.00g www
  vg1   2   3   0 wz--n- 251.45g 216.45g 20.00g db
Now, we can allocate the rest of the space in the volume group to the "workspace" volume using the -l flag,
$ sudo lvcreate -l 100%FREE -n workspace vg1
  Logical volume "workspace" created.
$ sudo vgs -o +lv_size,lv_name
  VG  #PV #LV #SN Attr   VSize   VFree LSize   LV
  vg1   2   4   0 wz--n- 251.45g    0   10.00g projects
  vg1   2   4   0 wz--n- 251.45g    0    5.00g www
  vg1   2   4   0 wz--n- 251.45g    0   20.00g db
  vg1   2   4   0 wz--n- 251.45g    0  216.45g workspace
As you can see, the "vg1" volume group is completely allocated. The logical volume devices are available within the /dev directory just like other storage devices. You can access them in two places:
/dev/volume_group_name/logical_volume_name
/dev/mapper/volume_group_name-logical_volume_name
and format logical volumes with the Ext4 filesystem,
$ sudo mkfs.ext4 /dev/vg1/projects
$ sudo mkfs.ext4 /dev/vg1/www
$ sudo mkfs.ext4 /dev/vg1/db
$ sudo mkfs.ext4 /dev/vg1/workspace
Linux Server Hacks, Volume Two: Storage Management and Backups
#46 Create Flexible Storage with LVM
Logical volumes are filesystems that appear to be a single volume but are actually assembled from space that has been allocated on multiple physical partitions. The size of a logical volume can exceed the size of any single physical storage device on your system, but it cannot exceed the sum of all of their sizes.
Linux process management, user management and package management
Process Management
Every process has 6 or more IDs associated with it:
- real user ID and real group ID The real IDs of the executor. Only the superuser can change the real IDs.
- effective user ID, effective group ID, supplementary group IDs These determine file access permissions. If the set-user-ID or set-group-ID bit of a file is set, the effective user ID or the effective group ID will be set to the ID of the file's owner.
- saved set-user-ID and saved set-group-ID These are copied from the effective IDs by exec.
A process refers to a program in execution; it’s a running instance of a program.
The only way a new process is created by the kernel is when an existing process calls the fork() function.
The new process created by fork is called the child process.
Both the child and parent continue executing with the instruction that follows the call to fork. The child is a copy of its parent:
- data
- heap
- stack
A fork is often followed by an exec.
The child process will have the same environment as its parent; only the process ID number is different. An executing program is identified by its process ID (PID) as well as its parent process's ID (PPID). You can use the pidof command to find the ID of a process:
$ pidof init
1
To find the process ID and parent process ID of the current shell, run:
$ echo $$
$ echo $PPID
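The same IDs, and the fork-then-continue behavior described above, can be observed from Python; os._exit() and os.waitpid() are used here just to keep the demo tidy:

```python
import os

# What the shell shows as $$ and $PPID:
print("PID :", os.getpid())
print("PPID:", os.getppid())

pid = os.fork()                 # both processes continue from here
if pid == 0:
    os._exit(7)                 # child: a copy of the parent, new PID
_, status = os.waitpid(pid, 0)  # parent reaps the child
print("child exited with", os.WEXITSTATUS(status))  # 7
```

fork() returns 0 in the child and the child's PID in the parent, which is how the two copies tell themselves apart.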
If an application can only do one thing at a time, that is a serious limitation. This is where threads step in.
A process can have multiple threads.
Threads are part of a process: all threads of the same process share the same PID.
In Linux, processes and threads are almost the same. The major difference is that threads share the same virtual memory address space(not a copy). Processes run in separate virtual memory spaces.
A thread is a path of execution within a process. Threads share with other threads their code section, data section, and OS resources (like open files and signals). But, like process, a thread has its own program counter (PC), register set, and stack space.
The low level interface to create threads is the clone() system call. The higher level interface is pthread_create().
#include <pthread.h>

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine) (void *), void *arg);

Compile and link with -pthread. The pthread_create() function starts a new thread in the calling process.
The new thread starts execution by invoking start_routine(); arg is passed as the sole argument of start_routine().
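A rough Python analogue of pthread_create() using the threading module; counter and start_routine are made-up names for illustration. It shows that threads share the parent's data and PID rather than receiving copies:

```python
import os
import threading

main_pid = os.getpid()
counter = {"n": 0}                  # shared: all threads see the same object

def start_routine(arg):
    counter["n"] += arg             # mutates the shared data, not a copy
    assert os.getpid() == main_pid  # all threads share one PID

t = threading.Thread(target=start_routine, args=(5,))
t.start()
t.join()
print(counter["n"])  # 5
```

Had we fork()ed instead, the child would have incremented its own copy of counter and the parent's value would still be 0.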
Why threads? Because communication between processes is not simple:
- IPC involves some overhead and is comparatively slow.
- Context switching between threads is faster than switching between processes.
In Linux, thread IDs are shown as LWPs (Light Weight Processes); the corresponding ps column is also named LWP:
$ ps -efL
UID        PID  PPID   LWP  C NLWP STIME TTY      TIME     CMD
root         1     0     1  0    1 Dec23 ?        00:00:10 /sbin/init splash
root         2     0     2  0    1 Dec23 ?        00:00:00 [kthreadd]
...
jerry     2464     1  2464  0   33 Dec23 tty2     00:11:23 /opt/google/chrome/chrome
jerry     2464     1  2472  0   33 Dec23 tty2     00:00:00 /opt/google/chrome/chrome
jerry     2464     1  2479  0   33 Dec23 tty2     00:00:00 /opt/google/chrome/chrome
jerry     2464     1  2480  0   33 Dec23 tty2     00:00:00 /opt/google/chrome/chrome
jerry     2464     1  2483  0   33 Dec23 tty2     00:00:00 /opt/google/chrome/chrome
jerry     2464     1  2484  0   33 Dec23 tty2     00:04:37 /opt/google/chrome/chrome
jerry     2464     1  2485  0   33 Dec23 tty2     00:00:00 /opt/google/chrome/chrome
A process group is a collection of one or more processes. Each process group can have a process group leader: the process whose process ID equals the process group ID.
A session is a collection of one or more process groups. A process establishes a new session by calling the setsid() function.
A controlling terminal is the terminal device (tty/pts) associated with a session:
- A session can only have one controlling terminal.
- The session leader that establishes the connection to the controlling terminal is called the controlling process.
- The process groups within a session can be divided into a single foreground process group and one or more background process groups.
Foreground processes (also referred to as interactive processes) are initialized and controlled through a terminal session. Background processes (also referred to as non-interactive/automatic processes) are processes not connected to a terminal; they don't expect any user input. A new process is normally created when an existing process makes an exact copy of itself in memory by fork().
Job control allows us to start multiple jobs from a single terminal and controls which jobs can access the terminal and which jobs are to run in the background.
Job control requires:
- A shell that supports job control
- A terminal driver that supports job control
- Support for job control signals
pgrep looks through the currently running processes and lists the process IDs which match the selection criteria to stdout. All the criteria have to match. For example,
$ pgrep -u root sshd
will only list the processes named sshd AND owned by root. On the other hand,
$ pgrep -u root,daemon
will list the processes owned by root OR daemon. pkill will send the specified signal (by default SIGTERM) to each process matching a pattern.
$ pkill chrom
The above kills processes whose names contain "chrom", such as chrome. killall, by contrast, needs an exact process name.
PROCESS STATE CODES:
Here are the different values that the s, stat and state output specifiers (header "STAT" or "S") will display to describe the state of a process:
D    uninterruptible sleep (usually IO)
R    running or runnable (on run queue)
S    interruptible sleep (waiting for an event to complete)
T    stopped by job control signal
t    stopped by debugger during tracing
W    paging (not valid since the 2.6.xx kernel)
X    dead (should never be seen)
Z    defunct ("zombie") process, terminated but not reaped by its parent

For BSD formats and when the stat keyword is used, additional characters may be displayed:
<    high-priority (not nice to other users)
N    low-priority (nice to other users)
L    has pages locked into memory (for real-time and custom IO)
s    is a session leader
l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
+    is in the foreground process group
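The same STAT letter can be read directly from /proc/&lt;pid&gt;/stat (Linux-specific); proc_state is a helper name invented for this sketch:

```python
import os

def proc_state(pid: int) -> str:
    """Return the state letter from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # comm (field 2) may contain spaces, so split after its closing ')'
    return data.rsplit(")", 1)[1].split()[0]

print(proc_state(os.getpid()))  # 'R': we are on the CPU reading our own stat
```

Splitting after the last ')' matters because a process name like "(sd-pam)" would otherwise break a naive whitespace split.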
Linux Server Hacks
Remove Unnecessary Services
Use "ps ax" to check whether unnecessary services exist:
- NFS portmap, rpc.mountd, rpc.nfsd
- Samba smbd, nmbd
- automount All static mounts are set up via /etc/fstab.
User Management
- Creating a User adduser
- Deleting, disabling account
- passwd -l 'username' Lock the password of the named account. This option disables a password by changing it to a value which matches no possible encrypted value (it adds a '!' at the beginning of the encrypted password).
- userdel -r 'username' With this option, files in the user's home directory will be removed along with the home directory itself and the user's mail spool.
- Modify groups The groupmod command modifies the definition of the specified GROUP by modifying the appropriate entry in the group database
- Modify an account
- add a user to a group usermod -a -G GROUPNAME USERNAME
- Gives information on all users finger
Package Management
Debian/Ubuntu
dpkg
dpkg is a tool to install, build, remove and manage Debian packages. The primary and more user-friendly front-end for dpkg is aptitude.
aptitude
aptitude is a text-based interface to the Debian GNU/Linux package system. It allows the user to view the list of packages and to perform package management tasks such as installing, upgrading, and removing packages. Actions may be performed from a visual interface or from the command-line.
apt-get, apt
apt-get is the command-line tool for handling packages, and may be considered the user's "back-end" to other tools (aptitude, synaptic and wajig) using the APT library. apt provides a high-level command-line interface for the package management system. It is intended as an end-user interface and enables some options better suited for interactive usage by default; it differs from apt-get mostly in terms of output formatting.
- update update is used to resynchronize the package index files from their sources. The indexes of available packages are fetched from the location(s) specified in /etc/apt/sources.list.
- upgrade upgrade is used to install the newest versions of all packages currently installed on the system from the sources enumerated in /etc/apt/sources.list.
- install install is followed by one or more packages desired for installation or upgrading.
Network skill
IP
IP header
- big endian Packets are transmitted in the order: bits 0-7, 8-15, 16-23, 24-31. This is called big-endian byte ordering, or network byte order. "Endianness" determines the order in which the bytes of a multi-byte value are placed in memory. For a 4-byte (32-bit) value 0x01234567 stored at address 0x100:
- big-endian: the most significant byte is stored at the lowest address
0x100: 01 23 45 67
- little-endian: the least significant byte is stored at the lowest address
0x100: 67 45 23 01
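Python's struct module makes the two byte orders easy to compare; '>' packs big-endian (network byte order) and '<' packs little-endian:

```python
import struct

value = 0x01234567
big = struct.pack(">I", value)     # network byte order (big-endian)
little = struct.pack("<I", value)  # typical x86 in-memory order

print(big.hex())     # 01234567 -> MSB at the lowest address
print(little.hex())  # 67452301 -> LSB at the lowest address
assert struct.unpack("!I", big)[0] == value  # '!' also means network order
```

This is the same conversion that htonl()/ntohl() perform in C before a multi-byte field goes on the wire.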
IP Routing
The IP layer has a routing table in memory that it searches each time it receives a datagram. Each entry in the routing table contains the following information:
- destination IP
- IP of the next-hop router
- flags Specify if the destination IP is for a network or a host.
- interface Which network interface should be used for transmission
Subnet
Class | IP range | Available networks | Hosts per network |
---|---|---|---|
A | 0.0.0.0~127.0.0.0 | 126 | 16,777,214 |
B | 128.0.0.0~191.255.0.0 | 16,383 | 65,534 |
C | 192.0.0.0~223.255.255.0 | 2,097,152 | 254 |
D | 224.0.0.0~239.255.255.255 | | |
E | 240.0.0.0~255.255.255.255 | | |
Private internal addresses are not routed on the Internet and no traffic can be sent to them from the Internet; they are only supposed to work within the local network. Private addresses include IP addresses from the following classes:
- A Range from 10.0.0.0 to 10.255.255.255, 10.0.0.0/255.0.0.0
- B Range from 172.16.0.0 to 172.31.255.255, 172.16.0.0/255.240.0.0
- C Range from 192.168.0.0 to 192.168.255.255, 192.168.0.0/255.255.0.0
The host ID portion can be divided into a subnet ID and a host ID. The subnet mask is a 32-bit value containing "1" bits for the network ID and subnet ID, and "0" bits for the host ID. Therefore,
- IP address tells you what network class you are using
- The subnet mask tells you the boundary between hosts and subnets
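The stdlib ipaddress module does this arithmetic for you; the addresses below are arbitrary examples:

```python
import ipaddress

net = ipaddress.ip_network("192.168.10.0/24")
print(net.netmask)            # 255.255.255.0
print(net.num_addresses - 2)  # 254 usable hosts (minus network + broadcast)

print(ipaddress.ip_address("10.1.2.3").is_private)  # True
print(ipaddress.ip_address("8.8.8.8").is_private)   # False

# A mask of 255.240.0.0 means 12 leading one-bits (/12):
host = ipaddress.ip_interface("172.16.5.9/255.240.0.0")
print(host.network)           # 172.16.0.0/12
```

Note how the module accepts either prefix-length or dotted-netmask notation and normalizes both to the same network.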
TCP
TCP services
TCP provides reliability:
- The unit of data that TCP passes to IP is called a segment.
- TCP uses a timer to wait for an ACK to the segment it sent. If an ACK is not received in time, the segment is re-transmitted.
- TCP sends an ACK for each piece of data it receives. When a new connection is being established, the SYN flag is turned on. The sequence number field contains the initial sequence number chosen by the host initiating the connection. The acknowledgment number contains the next sequence number that the sender expects to receive (acknowledgment number = received sequence number + 1). TCP provides a full-duplex service, so each end of a connection must maintain a sequence number in each direction.
- TCP maintains a checksum on its header and data If a segment arrives with an invalid checksum, TCP discards it and does not ACK it.
- TCP resequences the received data when necessary IP datagram can arrive out of order.
- TCP must discard duplicated received data
- TCP provides flow control A receiving TCP only allows the other end to send as much data as the receiver can buffer. Every time TCP receives a packet, it needs to send an ACK; this ACK message includes the 16-bit current receive window size field,
rwnd_size = ReceiveBuffer - (LastByteReceived - LastByteReadByApplication)
so the sender knows if it can keep sending data. TCP uses a sliding window protocol to control the number of bytes it can send. This makes sure it never has more unacknowledged bytes in flight than the window advertised by the receiver. The sender will always keep this invariant:
LastByteSent - LastByteAcked <= ReceiveWindowAdvertised
In today's networks, this 16-bit window size (max. 65,535) is not enough to provide optimal traffic flow, so TCP options were introduced in RFC 1323 that enable the TCP receive window to be increased exponentially. The specific function is called TCP window scaling, and it is advertised in the handshake process. If one side or the other cannot support scaling, then neither will use this function. The scale factor, or multiplier, is only sent in the SYN packets during the handshake and is used for the life of the connection. When the TCP sender receives an ACK with a zero-window message, it starts the persist timer: TCP stops transmitting data and periodically sends a small packet to the receiver (usually called a zero-window probe), so the receiver has a chance to advertise a nonzero window size. The transmitting host SHOULD send the first zero-window probe when a zero window has existed for the retransmission timeout period, and SHOULD increase exponentially the timeout interval between successive probes.
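A minimal sketch of the two window calculations above (function names are invented for illustration):

```python
def advertised_window(recv_buffer, last_byte_received, last_byte_read):
    # rwnd = buffer space not yet occupied by data the app hasn't read
    return recv_buffer - (last_byte_received - last_byte_read)

def scaled_window(window_field, shift):
    # RFC 1323 window scaling: the 16-bit field is left-shifted by the
    # factor agreed in the SYN packets (shift is at most 14)
    return window_field << shift

print(advertised_window(65536, 40000, 30000))  # 55536
print(scaled_window(65535, 7))                 # 8388480, roughly 8 MiB
```

With the maximum shift of 14, the effective window grows to about 1 GiB, which is what makes high bandwidth-delay-product links usable.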
TCP Connection Establishment and Termination
- establishment: 3-way handshake
- the client sends SYN + ISN
- the server responds with its own SYN + server's ISN + ACK(client's ISN+1)
- the client sends ACK(server's ISN + 1)
- termination Either end can send a FIN when it is done sending data. When a TCP receives a FIN, it sends back an ACK (received SN + 1). Therefore, termination needs 4 steps.
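The kernel performs both exchanges on your behalf; from user space, connect()/accept() hide the 3-way handshake and close() triggers the FIN sequence. A loopback example in Python:

```python
import socket

srv = socket.socket()
srv.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
srv.listen(1)

cli = socket.socket()
cli.connect(srv.getsockname())  # SYN, SYN+ACK, ACK happen inside here
conn, _ = srv.accept()

cli.sendall(b"ping")
data = conn.recv(4)
print(data)                     # b'ping'

cli.close()                     # our FIN; the peer's kernel ACKs it
conn.close()                    # peer's FIN + final ACK
srv.close()
```

Running tcpdump on the loopback interface while this executes shows exactly the SYN/SYN+ACK/ACK and FIN/ACK/FIN/ACK packets described above.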
TCP Timers
TCP maintains the connection state internally and requires timers to keep track of events.
TCP requires 3 timers to maintain state on the transmit side of the protocol.
In Linux, when timers are initialized, they are given an associated function that is called when the timer goes off. Each timer function for TCP is passed a pointer to the sock structure. The timer uses the sock to know which connection it is dealing with. The timer functions can be found in the file linux/net/tcp_timer.c.
- tcp_retransmit_timer() TCP uses a time out timer for retransmission of lost segments. This is called when the retransmit timer expires, indicating that an expected acknowledgment was not received.
- Sender starts a time out timer after transmitting a TCP segment to the receiver.
- If sender receives an ACK before the timer goes off, it stops the timer.
- If the sender does not receive any acknowledgment before the timer goes off, TCP retransmission occurs.
- Sender retransmits the same segment and resets the timer
- The value of time out timer is dynamic and changes with the amount of traffic in the network.
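One common way the timeout adapts to network conditions is the RFC 6298 estimator: smoothed RTT plus four times the RTT variance, with a 1-second floor. This is a sketch of that algorithm, not the kernel's exact code:

```python
# RFC 6298 retransmission-timeout estimator (sketch).
ALPHA, BETA = 1 / 8, 1 / 4

def update_rto(srtt, rttvar, rtt_sample):
    if srtt is None:                       # first RTT measurement
        srtt, rttvar = rtt_sample, rtt_sample / 2
    else:
        rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - rtt_sample)
        srtt = (1 - ALPHA) * srtt + ALPHA * rtt_sample
    rto = max(1.0, srtt + 4 * rttvar)      # RFC 6298 floor of 1 second
    return srtt, rttvar, rto

srtt = rttvar = None
for sample in (0.100, 0.120, 0.300):       # RTT samples in seconds
    srtt, rttvar, rto = update_rto(srtt, rttvar, sample)
print(rto)  # 1.0 (the 1-second floor dominates for these short RTTs)
```

On each retransmission the RTO is additionally doubled (exponential backoff), which this sketch omits.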
- tcp_probe_timer() The zero window timer is set when this side of the connection sends out a zero window probe in response to a zero window advertisement from the peer. We arrive at this function because the timer expired before a response was received to the zero window probe.
- tcp_delack_timer() This is to minimize the number of separate ACKs that are sent. The receiver does not send an ACK as soon as it can. The delayed acknowledgment timer is set to the amount of time to hold the ACK waiting for outgoing data to be ready.
- tcp_keepalive_timer() TCP uses a keepalive timer to clean up long-idle TCP connections. If a client opens a TCP connection to a server, transfers some data, goes silent, and then crashes, the connection would otherwise remain open forever. TCP normally does not perform any keepalive function; keepalive polling is not part of the TCP specification. Keepalive was added outside the TCP specification for use by some application-layer servers whose protocols don't do any connection polling themselves. For example, the telnet daemon sets keepalive mode.
- Each time server hears from the client, it resets the keep alive timer to 2 hours.
- If server does not hear from the client for 2 hours, it sends 10 probe segments to the client.
- These probe segments are sent at a gap of 75 seconds.
- If server receives no response after sending 10 probe segments, it assumes that the client is down.
- Then, server terminates the connection automatically.
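These knobs are exposed per-socket on Linux (the TCP_KEEP* options are Linux-specific); the numbers below simply mirror the defaults described above:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)      # enable keepalive
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 7200)  # 2 h idle before probing
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 75)   # 75 s between probes
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 10)     # give up after 10 probes
print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))  # 1
s.close()
```

The system-wide defaults live in /proc/sys/net/ipv4/tcp_keepalive_time, tcp_keepalive_intvl and tcp_keepalive_probes; the socket options override them per connection.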
- Time Wait Timer TCP uses a time wait timer during connection termination.
- Sender starts the time wait timer after sending the ACK for the second FIN segment.
- It allows the final acknowledgement to be resent if it gets lost.
- It prevents the just closed port from reopening again quickly to some other application.
- It ensures that all the segments heading towards the just closed port are discarded.
- The value of time wait timer is usually set to twice the lifetime of a TCP segment.
Firewall
A Packet-Filtering Firewall
A packet-filtering firewall consists of a list of acceptance and denial rules. The lists of rules defining what can come in and what can go out are called chains. A packet is matched against each rule in the list, one by one, until a match is found or the list is exhausted.
Choosing a Default Packet-filtering Policy
If the packet does not match any rule, the default policy for the chain is applied to the packet,
- ACCEPT means to let the packet through.
- DROP means to drop the packet on the floor.
- RETURN means stop traversing this chain and resume at the next rule in the previous (calling) chain.
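The first-match-then-default traversal can be modeled in a few lines (a toy model, not real netfilter code):

```python
def traverse(chain, packet, policy="DROP"):
    """First match wins; fall through to the chain's default policy."""
    for match, target in chain:
        if match(packet):
            return target
    return policy

input_chain = [
    (lambda p: p.get("dport") == 22,     "ACCEPT"),  # allow ssh
    (lambda p: p.get("proto") == "icmp", "DROP"),    # drop pings
]

print(traverse(input_chain, {"proto": "tcp", "dport": 22}))  # ACCEPT
print(traverse(input_chain, {"proto": "udp", "dport": 53}))  # DROP (default policy)
```

Because the first match wins, rule ordering matters: placing a broad DROP rule before a narrow ACCEPT rule silently disables the ACCEPT.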
How Packets Traverse The Filters
The kernel starts with three lists of rules (chains) in the `filter' table: INPUT, OUTPUT and FORWARD.
 Incoming                                          Outgoing
    |                                                 ^
    v                                                 |
 ------------                                 --------------
 |Pre-Routing|                                |Post-Routing|
 ------------                                 --------------
    |                    _____                        ^
    |                   /     \                       |
    +--->[Routing ]--->|FORWARD|----------->----------+
         [Decision]     \_____/                       ^
             |                                        |
             v                                      ______
           _____                                   /      \
          /     \                                 | OUTPUT |
         | INPUT |                                 \______/
          \_____/                                     ^
             |                                        |
             +---------> Local Process --------->-----+
- When a packet comes in (say, through the Ethernet card) the kernel first looks at the destination of the packet: this is called `routing'.
- If it's destined for this box, the packet passes downwards in the diagram, to the INPUT chain. If it passes this, any processes waiting for that packet will receive it.
- If the kernel does not have forwarding enabled, or it doesn't know how to forward the packet, the packet is dropped.
- If forwarding is enabled, and the packet is destined for another network interface (if you have another one), then the packet goes rightwards on our diagram to the FORWARD chain. If it is ACCEPTed, it will be sent out.
- A program running on the box can send network packets. These packets pass through the OUTPUT chain immediately: if it says ACCEPT, then the packet continues out to whatever interface it is destined for.
- NF_IP_PRE_ROUTING This hook will be triggered by any incoming traffic very soon after entering the network stack. This hook is processed before any routing decisions have been made regarding where to send the packet.
- NF_IP_LOCAL_IN This hook is triggered after an incoming packet has been routed if the packet is destined for the local system.
- NF_IP_FORWARD This hook is triggered after an incoming packet has been routed if the packet is to be forwarded to another host.
- NF_IP_LOCAL_OUT This hook is triggered by any locally created outbound traffic as soon it hits the network stack.
- NF_IP_POST_ROUTING This hook is triggered by any outgoing or forwarded traffic after routing has taken place and just before being put out on the wire.
Using iptables
iptables and ip6tables are used to set up, maintain, and inspect the tables of IPv4 and IPv6 packet filter rules in the Linux kernel. iptables uses tables to organize its rules. These tables classify rules according to the type of decisions they are used to make. The names of the built-in chains mirror the names of the netfilter hooks they are associated with:
- PREROUTING Triggered by the NF_IP_PRE_ROUTING hook.
- INPUT Triggered by the NF_IP_LOCAL_IN hook.
- FORWARD Triggered by the NF_IP_FORWARD hook.
- OUTPUT Triggered by the NF_IP_LOCAL_OUT hook.
- POSTROUTING Triggered by the NF_IP_POST_ROUTING hook.
- filter The default table. The built-in chains:
- INPUT
- OUTPUT
- FORWARD
- nat
- PREROUTING
- OUTPUT
- POSTROUTING
- mangle
- PREROUTING
- OUTPUT
The target of a rule (-j) can be:
- the name of a user-defined chain
- one of the targets described in iptables-extensions(8)
- one of the special values ACCEPT, DROP or RETURN.
- Create a new user-defined chain by the given name. iptables [-t table] -N chain
- Flush the selected chain (all the chains in the table if none is given). iptables [-t table] -F [chain]
- Append/Check/Delete one or more rules to the end of the selected chain. iptables [-t table] {-A|-C|-D} chain rule-specification
- Set the default policy for the built-in (non-user-defined) chain to the given target( ACCEPT or DROP ). iptables [-t table] -P chain target
- List all rules in the selected chain. iptables [-t table] -L [chain [rulenum]] [options...] The list command can take additional options:
- -n List IP addresses and port numbers numerically rather than by name.
- -v List additional information such as counters.
- --line-numbers List the rule's position within the chain
- -x List exact values of counters
- Insert one or more rules in the selected chain as the given rule number. iptables [-t table] -I chain [rulenum] rule-specification
- Replace a rule in the selected chain. iptables [-t table] -R chain rulenum rule-specification
- Delete one or more rules from the selected chain. iptables [-t table] -D chain rulenum
- Print all rules in the selected chain. iptables [-t table] -S [chain [rulenum]]
- Delete the optional user-defined chain specified. iptables [-t table] -X [chain]
- Rename the user specified chain to the user supplied name. iptables [-t table] -E old-chain-name new-chain-name
rule-specification = parameter-1 option-1 ... parameter-n option-n
A rule specification is composed of pairs of parameters and options that define what happens when a packet matches the rule.
Basic parameters:
- -i [!] name Name of an interface via which a packet was received (only for packets entering the INPUT, FORWARD and PREROUTING chains).
- -o [!] name Name of an interface via which a packet is going to be sent (for packets entering the FORWARD, OUTPUT and POSTROUTING chains).
- -p [!] protocol The protocol of the rule or of the packet to check. The specified protocol can be one of tcp, udp, icmp, or all, or it can be a numeric value, representing one of these protocols or a different one. A protocol name from /etc/protocols is also allowed.
- -s [!] address[/mask] Source specification. Address can be either a network name, a hostname, a network IP address (with /mask), or a plain IP address. The mask can be either a network mask or a plain number, specifying the number of 1's at the left side of the network mask. Thus, a mask of 24 is equivalent to 255.255.255.0.
- -d [!] address[/mask] Destination specification.
- -j target This specifies the target of the rule; i.e., what to do if the packet matches it. The target can be a user-defined chain, one of the special builtin targets, or an extension. If this option is omitted in a rule, then matching the rule will have no effect on the packet's fate, but the counters on the rule will be incremented.
- Options available for the TCP protocol (-p tcp):
- --dport [!] port[:port] Sets the destination port for the packet. Use either a network service name (such as www or smtp), port number, or range of port numbers to configure this option.
- --sport [!] port[:port] Sets the source port of the packet
- [!] --syn Only match TCP packets with the SYN bit set and the ACK,RST and FIN bits cleared. Such packets are used to request TCP connection initiation.
- --tcp-flags [!] mask comp The first argument is the flags which we should examine, written as a comma-separated list, and the second argument is a comma-separated list of flags which must be set. For example, "-p tcp --tcp-flags ACK,FIN,SYN SYN" is equivalent to "--syn".
- --tcp-option [!] number Match if the TCP option is set.
- Options available for the UDP protocol (-p udp):
- --dport [!] port[:port]
- --sport [!] port[:port]
- Options available for the ICMP protocol (-p icmp):
- --icmp-type [!] typename Sets the name or number of the ICMP type to match with the rule.
To use a match option module, load the module by name using the -m option.
- -m limit Places limits on how many packets are matched to a particular rule. A rule using this extension will match until this limit is reached. It can be used in combination with the LOG target to give limited logging.
- --limit rate Maximum average matching rate: specified as a number, with an optional '/second', '/minute', '/hour', or '/day' suffix; the default is 3/hour.
- --limit-burst number Maximum initial number of packets to match: this number gets recharged by one every time the limit specified above is not reached, up to this number; the default is 5.
- -m state Enables access to the connection tracking state for this packet.
- --state state Where state is a comma-separated list of the connection states to match:
- ESTABLISHED The matching packet is associated with other packets in an established connection.
- INVALID The matching packet cannot be tied to a known connection.
- NEW The matching packet is either creating a new connection or is part of a two-way connection not previously seen.
- RELATED The matching packet is starting a new connection related in some way to an existing connection.
- -m mac --mac-source [!] address Matches the source MAC address. It must be of the form XX:XX:XX:XX:XX:XX. Note that this only makes sense for packets coming from an Ethernet device and entering the PREROUTING, FORWARD or INPUT chains.
- -m mark --mark value[/mask] Matches packets with the given unsigned mark value which was set at some earlier point.
- -m tos --tos value Matches the Type of Service field; the value can be a string or a numeric value.
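Putting the chain, match, and target pieces together: below is a minimal sketch of a stateful INPUT policy written in iptables-save format, so it can be reviewed as a file and loaded atomically with iptables-restore. The SSH port, rate limit, and log prefix are illustrative assumptions, not a recommended production policy.

```shell
# Sketch: a minimal stateful ruleset in iptables-save format.
# Review the file, then load it (as root) with: iptables-restore < rules.v4
cat > rules.v4 <<'EOF'
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
# Accept packets belonging to, or related to, existing connections
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# Accept new SSH connections (port 22 is an illustrative choice)
-A INPUT -p tcp --dport 22 -m state --state NEW -j ACCEPT
# Log leftovers at a limited rate; the chain policy then drops them
-A INPUT -m limit --limit 3/minute -j LOG --log-prefix "iptables-drop: "
COMMIT
EOF
grep -c '^-A INPUT' rules.v4   # 3 rules appended to INPUT
```

Because the INPUT policy is DROP, anything not matched by the three rules is silently discarded after the rate-limited LOG rule fires.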
The following are the standard targets:
- -j user-defined-chain This target passes the packet to the target user-defined-chain.
- -j ACCEPT Allows the packet to successfully move on to its destination or another chain.
- -j DROP Drops the packet without responding to the requester.
- -j QUEUE The packet is queued for handling by a user-space application.
- -j RETURN Stops checking the packet against rules in the current chain; the packet is returned to the calling chain, which resumes rule checking where it left off.
- -j LOG Logs all packets that match this rule. Since the packets are logged by the kernel, they can be read with dmesg or via syslogd; the /etc/syslog.conf file determines where these log entries are written. By default, they are placed in the /var/log/messages file. This is a "non-terminating target" (the packet will not be dropped), i.e. rule traversal continues at the next rule. To specify the way in which logging occurs:
- --log-level Sets the priority level of a logging event. A list of priority levels can be found within the syslog.conf man page.
- --log-ip-options Logs any options set in the header of a IP packet.
- --log-prefix Places a string of up to 29 characters before the log line when it is written. This is useful for writing syslog filters for use in conjunction with packet logging.
- --log-tcp-options Logs any options set in the header of a TCP packet.
- --log-tcp-sequence Writes the TCP sequence number for the packet in the log.
- -j REJECT This is used to send back an error packet in response to the matched packet. The following option controls the nature of the error packet returned:
--reject-with type
where type is one of:
- icmp-net-unreachable
- icmp-host-unreachable
- icmp-port-unreachable
- icmp-proto-unreachable
- icmp-net-prohibited
- icmp-host-prohibited
- --to-source ipaddr[-ipaddr][:port-port] (used with -j SNAT) Specifies a single new source IP address, an inclusive range of IP addresses, and, optionally, a port range. The source port is mapped to a free port if not assigned.
- --to-destination ipaddr[-ipaddr][:port-port] (used with -j DNAT) Specifies a single new destination IP address, an inclusive range of IP addresses, and, optionally, a port range. If no port range is specified, the destination port is never modified. This is useful when you want to forward connections to internal servers that are not publicly visible.
- --to-ports port[-port] (used with -j MASQUERADE or -j REDIRECT) Specifies a destination port or range of ports to use; without this, the destination port is never altered.
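These three options belong to targets in the nat table. A hedged sketch in iptables-save format (all addresses, interfaces, and ports below are illustrative):

```shell
# Sketch: nat-table rules in iptables-save format.
# Load (as root) with: iptables-restore --noflush < nat.rules
cat > nat.rules <<'EOF'
*nat
# SNAT: rewrite the source address of LAN traffic leaving eth0
-A POSTROUTING -s 192.168.1.0/24 -o eth0 -j SNAT --to-source 203.0.113.5
# DNAT: forward incoming web traffic to an internal server
-A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to-destination 192.168.1.10:8080
COMMIT
EOF
grep -c '^-A' nat.rules   # 2
```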
DNS: The Domain Name System
DNS is a distributed database that is used by TCP/IP applications to map between hostnames and IP addresses. DNS provides the protocol for clients and servers to communicate with each other. From an application's point of view, access to the DNS is through a name resolver which contacts one or more name servers to do the mapping. The resolver can be accessed through two library functions:
- getaddrinfo() (gethostbyname() is deprecated)
- getnameinfo() (gethostbyaddr() is deprecated)
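Both library calls can be exercised from the shell with getent, which resolves through the same NSS/resolver path (the ahosts database uses getaddrinfo(), while hosts keys go through the older gethostbyname2()/gethostbyaddr() calls). localhost is used here so no network is required:

```shell
# Forward lookup via the getaddrinfo() path
getent ahosts localhost
# Lookup by address (gethostbyaddr() path)
getent hosts 127.0.0.1
```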
DNS Basics
The DNS name space is a hierarchical tree, similar to a file system.
- node Every node has a label (max 63 characters)
- root The root is a node with null label.
- domain name The domain name of any node is the list of labels: starting at the node, walking up to the root, using "." to separate labels.
- FQDN (fully qualified domain name) A domain name that ends with a "." is called an absolute domain name or FQDN.
DNS message format
DNS configuration
The file /etc/resolv.conf is now usually maintained indirectly, by the network manager, rather than edited by hand. To see the DNS servers in use for an interface:
nmcli device show interfacename | grep IP4.DNS
Linux Server Hacks, Volume Two: System Services
#20 Quick and Easy DHCP Setup
- Installing a DHCP Server
Debian:
apt-get install isc-dhcp-server
Fedora:
yum install dhcp
# option definitions common to all supported networks...
option domain-name "example.org";
option domain-name-servers ns1.example.org, ns2.example.org;
default-lease-time 600;
max-lease-time 7200;
option domain-name "isc.org";
option domain-name-servers ns1.isc.org, ns2.isc.org;
# The ddns-updates-style parameter controls whether or not the server will
# attempt to do a DNS update when a lease is confirmed. We default to the
# behavior of the version 2 packages ('none', since DHCP v2 didn't
# have support for DDNS.)
ddns-update-style none;
# If this DHCP server is the official DHCP server for the local
# network, the authoritative directive should be uncommented.
#authoritative;
# Use this to send dhcp log messages to a different log file (you also
# have to hack syslog.conf to complete the redirection).
#log-facility local7;
# No service will be given on this subnet, but declaring it helps the
# DHCP server to understand the network topology.
#subnet 10.152.187.0 netmask 255.255.255.0 {
#}
# This is a very basic subnet declaration.
#subnet 10.254.239.0 netmask 255.255.255.224 {
#  range 10.254.239.10 10.254.239.20;
#  option routers rtr-239-0-1.example.org, rtr-239-0-2.example.org;
#}
# This declaration allows BOOTP clients to get dynamic addresses,
# which we don't really recommend.
#subnet 10.254.239.32 netmask 255.255.255.224 {
#  range dynamic-bootp 10.254.239.40 10.254.239.60;
#  option broadcast-address 10.254.239.31;
#  option routers rtr-239-32-1.example.org;
#}
# A slightly different configuration for an internal subnet.
#subnet 10.5.5.0 netmask 255.255.255.224 {
#  range 10.5.5.26 10.5.5.30;
#  option domain-name-servers ns1.internal.example.org;
#  option domain-name "internal.example.org";
#  option subnet-mask 255.255.255.224;
#  option routers 10.5.5.1;
#  option broadcast-address 10.5.5.31;
#  default-lease-time 600;
#  max-lease-time 7200;
#}
# Hosts which require special configuration options can be listed in
# host statements.
# If no address is specified, the address will be
# allocated dynamically (if possible), but the host-specific information
# will still come from the host declaration.
#host passacaglia {
#  hardware ethernet 0:0:c0:5d:bd:95;
#  filename "vmunix.passacaglia";
#  server-name "toccata.example.com";
#}
# Fixed IP addresses can also be specified for hosts. These addresses
# should not also be listed as being available for dynamic assignment.
# Hosts for which fixed IP addresses have been specified can boot using
# BOOTP or DHCP. Hosts for which no fixed address is specified can only
# be booted with DHCP, unless there is an address range on the subnet
# to which a BOOTP client is connected which has the dynamic-bootp flag
# set.
#host fantasia {
#  hardware ethernet 08:00:07:26:c0:a5;
#  fixed-address fantasia.example.com;
#}
# You can declare a class of clients and then do address allocation
# based on that. The example below shows a case where all clients
# in a certain class get addresses on the 10.17.224/24 subnet, and all
# other clients get addresses on the 10.0.29/24 subnet.
#class "foo" {
#  match if substring (option vendor-class-identifier, 0, 4) = "SUNW";
#}
#shared-network 224-29 {
#  subnet 10.17.224.0 netmask 255.255.255.0 {
#    option routers rtr-224.example.org;
#  }
#  subnet 10.0.29.0 netmask 255.255.255.0 {
#    option routers rtr-29.example.org;
#  }
#  pool {
#    allow members of "foo";
#    range 10.17.224.10 10.17.224.250;
#  }
#  pool {
#    deny members of "foo";
#    range 10.0.29.10 10.0.29.230;
#  }
#}
man dhcpd.conf:
- DHCP Global configuration The basic configuration that we need in order to run a DHCP server:
- default-lease-time
- max-lease-time
- INTERFACESv4="eth0" (in /etc/default/isc-dhcp-server) Defines which interface the DHCP server should use to serve DHCP requests.
- authoritative
- Defining the Subnet Each subnet may have its own router.
subnet 204.254.239.0 netmask 255.255.255.224 {
  ( subnet-specific parameters... )
  range 204.254.239.10 204.254.239.30;
}
Note: do not assign fixed addresses that overlap with the pool you've configured in your subnet statement.
group {
  ( group-specific parameters... )
  host zappo.test.isc.org {
    ( host-specific parameters... )
  }
  host beppo.test.isc.org {
    ( host-specific parameters... )
  }
  host harpo.test.isc.org {
    ( host-specific parameters... )
  }
}
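The pieces above combine into a hedged, minimal dhcpd.conf sketch; every address, router, and MAC below is illustrative, and the fixed address is deliberately placed inside the subnet but outside the dynamic range:

```
default-lease-time 600;
max-lease-time 7200;
authoritative;

# One subnet with a dynamic pool
subnet 204.254.239.0 netmask 255.255.255.224 {
  range 204.254.239.10 204.254.239.30;
  option routers 204.254.239.1;
  option domain-name-servers ns1.example.org;
}

# A host pinned outside the dynamic range
host printer1 {
  hardware ethernet 08:00:07:26:c0:a5;
  fixed-address 204.254.239.5;
}
```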
To check the DHCP service status:
sudo systemctl status isc-dhcp-server.service
To start the DHCP service:
sudo systemctl start isc-dhcp-server.service
To stop the DHCP service:
sudo systemctl stop isc-dhcp-server.service
To restart the DHCP service:
sudo systemctl restart isc-dhcp-server.service
#21 Integrate DHCP and DNS with Dynamic DNS Updates
If the DNS and DHCP servers are not in sync, a DHCP lease for a new IP address may cause name-resolution problems. There are two solutions: statically assign addresses to your hosts, or use a tool (or script one yourself) to perform DNS updates. In more recent versions of DHCP and BIND, both services support a mechanism for performing dynamic DNS updates (defined in RFC 2136).
- Generating a session key The two services use a shared key to communicate with each other. The DHCP server uses this key to sign update requests sent to the DNS server, and the DNS server uses it to verify the signed requests from the DHCP server. BIND 9 comes with a utility to generate this key, called dnssec-keygen.
- Configuring the BIND Name Server The next step is to configure BIND to allow updates from the DHCP server, using the key you just generated.
- Configuring the ISC DHCP Server
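A sketch of the configuration stanzas behind those three steps; the key name, secret, zone, and file names are placeholders, and the secret must be byte-identical in both files (the older "interim" update style is shown; newer servers use "standard"):

```
# /etc/bind/named.conf (excerpt)
key DHCP_UPDATER {
  algorithm hmac-md5;
  secret "PASTE-BASE64-KEY-HERE==";
};
zone "example.org" {
  type master;
  file "example.org.db";
  allow-update { key DHCP_UPDATER; };
};

# /etc/dhcp/dhcpd.conf (excerpt)
ddns-update-style interim;
key DHCP_UPDATER {
  algorithm hmac-md5;
  secret "PASTE-BASE64-KEY-HERE==";
}
zone example.org. {
  primary 127.0.0.1;
  key DHCP_UPDATER;
}
```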
Linux network troubleshooting tools
Cheat sheet
ip
ip - show / manipulate routing, network devices, interfaces and tunnels.
link layer
network device. Show the status:
$ ip link show
1: lo: <loopback> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
   link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <no-carrier> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
   link/ether 5c:26:0a:13:82:cf brd ff:ff:ff:ff:ff:ff
3: wlp2s0: <broadcast> mtu 1500 qdisc mq state UP mode DORMANT group default qlen 1000
   link/ether a4:4e:31:a6:78:64 brd ff:ff:ff:ff:ff:ff
Bring the interface up:
$ sudo ip link set eno1 up
We can use the -s flag with the ip command to print additional statistics about an interface:
$ ip -s link show
1: lo: <loopback> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
   link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
   RX: bytes packets errors dropped overrun mcast
   6259323 66161 0 0 0 0
   TX: bytes packets errors dropped carrier collsns
   6259323 66161 0 0 0 0
2: eno1: <no-carrier> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
   link/ether 5c:26:0a:13:82:cf brd ff:ff:ff:ff:ff:ff
   RX: bytes packets errors dropped overrun mcast
   0 0 0 0 0 0
   TX: bytes packets errors dropped carrier collsns
   0 0 0 0 0 0
3: wlp2s0: <broadcast> mtu 1500 qdisc mq state UP mode DORMANT group default qlen 1000
   link/ether a4:4e:31:a6:78:64 brd ff:ff:ff:ff:ff:ff
   RX: bytes packets errors dropped overrun mcast
   1373619334 1354333 0 0 0 0
   TX: bytes packets errors dropped carrier collsns
   229249749 830716 0 0 0 0
For more advanced Layer 1 troubleshooting, the ethtool and wireshark utilities are excellent options.
data link layer
If your localhost can't successfully resolve its gateway's Layer 2 MAC address, then it won't be able to send any traffic to remote networks. We can check the entries in our ARP table with the ip neighbor command:
$ ip neighbor show
192.168.0.1 dev wlp2s0 lladdr c0:a0:bb:ef:3d:d7 REACHABLE
fe80::c2a0:bbff:feef:3dd7 dev wlp2s0 lladdr c0:a0:bb:ef:3d:d7 router REACHABLE
If there was a problem with ARP, then we would see a resolution failure here.
Linux caches ARP entries for a period of time. You can manually delete an ARP entry, which will force a new ARP discovery process:
$ ip neighbor delete 192.168.0.1 dev wlp2s0
network/internet layer
$ ip -br address show
lo      UNKNOWN  127.0.0.1/8 ::1/128
eno1    DOWN
wlp2s0  UP       192.168.0.105/24 2001:b011:5003:14ec:2976:d29d:44a9:5cd6/64 2001:b011:5003:14ec:e136:5b5f:ebb4:4e4e/64 fe80::8205:ef53:5b3b:7756/64
-br prints only basic information in a tabular format for better readability. This option is currently only supported by the ip addr show and ip link show commands.
The lack of an IP address can be caused by a local misconfiguration, such as an incorrect network interface config file, or it can be caused by problems with DHCP.
ping can be an easy way to tell if a host is alive and responding.
The next tool in the Layer 3 troubleshooting tool belt is the traceroute command.
Traceroute will send out one packet at a time, beginning with a TTL of one. Since the packet expires in transit, the upstream router sends back an ICMP Time-to-Live Exceeded packet.
Traceroute then increments the TTL to determine the next hop.
The list of gateways for different routes is stored in a routing table, which can be inspected and manipulated using ip route commands.
We can print the routing table:
$ ip route show
default via 192.168.0.1 dev wlp2s0 proto dhcp metric 600
169.254.0.0/16 dev wlp2s0 scope link metric 1000
192.168.0.0/24 dev wlp2s0 proto kernel scope link src 192.168.0.105 metric 600
We can check the route for a specific prefix:
$ ip route get 10.0.0.0/8
10.0.0.0 via 192.168.0.1 dev wlp2s0 src 192.168.0.105 uid 1000
    cache
To delete a default route:
$ sudo route delete default gw 192.168.1.250 eth0
(or, with iproute2: sudo ip route del default via 192.168.1.250 dev eth0)
The Domain Name System (DNS) translates human-readable names into IP addresses.
A classic sign of DNS trouble is being able to connect to a remote host by its IP address but not by its hostname.
Performing a quick nslookup on the hostname can tell us what happened.
transport layer
To find out which process is listening on a port:
- netstat
$ netstat -tulpn Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
- ss It shows information similar to netstat.
It can display more TCP and state information than other tools.
When no option is used ss displays a list of open non-listening sockets (e.g. TCP/UNIX/UDP) that have established connection.
$ ss -tunlp4
Netid State  Recv-Q Send-Q Local Address:Port  Peer Address:Port
udp  UNCONN  0  0  224.0.0.251:5353  0.0.0.0:*  users:(("chrome",pid=2410,fd=49))
udp  UNCONN  0  0  224.0.0.251:5353  0.0.0.0:*  users:(("chrome",pid=2372,fd=322))
udp  UNCONN  0  0  224.0.0.251:5353  0.0.0.0:*  users:(("chrome",pid=2410,fd=120))
udp  UNCONN  0  0  0.0.0.0:5353  0.0.0.0:*
udp  UNCONN  0  0  127.0.0.53%lo:53  0.0.0.0:*
udp  UNCONN  0  0  0.0.0.0:68  0.0.0.0:*
udp  UNCONN  0  0  0.0.0.0:43241  0.0.0.0:*
udp  UNCONN  0  0  0.0.0.0:631  0.0.0.0:*
tcp  LISTEN  0  128  127.0.0.1:5939  0.0.0.0:*
tcp  LISTEN  0  128  127.0.0.53%lo:53  0.0.0.0:*
tcp  LISTEN  0  5  127.0.0.1:631  0.0.0.0:*
tcp  LISTEN  0  100  127.0.0.1:25  0.0.0.0:*
tcp  LISTEN  0  80  127.0.0.1:3306  0.0.0.0:*
flags:
- -t - Show TCP ports.
- -u - Show UDP ports.
- -n - Do not try to resolve hostnames.
- -l - Show only listening ports.
- -p - Show the processes that are using a particular socket.
- -4 - Show only IPv4 sockets.
- TCP Test a TCP port with telnet:
telnet ip port
- netcat Install it with:
sudo apt-get install netcat
Usage:
nc [-u] ip port
Use nc to test connections:
- Test on a TCP port Server:
nc -lvnp 1234
Client:
nc -vn 192.168.0.112 1234
- Test on a UDP port Server:
nc -lvnup 1234
Client:
nc -vnu 192.168.0.112 1234
- Scan TCP ports:
nc -vnz -w 1 192.168.0.101 20-25
- Scan UDP ports:
nc -vnzu 192.168.40.146 1-65535
- Stream a file over TCP Server:
cat sample_video.avi | nc -l 1234
Client:
nc 192.168.0.101 1234 | mplayer -vo x11 -cache 3000 -
However, a much more powerful tool is nmap, which can scan the TCP and UDP ports listening on a remote host.
$ nmap -v 192.168.122.1
PORT STATE SERVICE
22/tcp open ssh
53/tcp open domain
80/tcp open http
443/tcp open https
8081/tcp open blackice-icecap
A tcpdump Tutorial with Examples — 50 Ways to Isolate Traffic
Traffic can be dumped in different ways:
- on an interface
tcpdump -i eth0
- by host
tcpdump host 1.1.1.1
- by source or destination
tcpdump src 1.1.1.1
tcpdump dst 1.0.0.1
- by network
tcpdump net 1.2.3.0/24
- show packet contents in hex and ASCII (-X)
tcpdump -X icmp
- by port
tcpdump port 3389
tcpdump src port 1025
- by protocol
tcpdump icmp
- IPv6 traffic only
tcpdump ip6
- by port range
tcpdump portrange 21-23
- by packet size
tcpdump less 32
tcpdump greater 64
tcpdump <= 128
- write to / read from a capture file
tcpdump -w capture_file
tcpdump -r capture_file
- -tttt Give maximally human-readable timestamp output.
- -v Verbose output (more v’s gives more output).
tcpdump -nnvvS src 10.5.2.3 and dst port 3389
Troubleshooting and Performance Tuning
System hangs, Linux kernel panic analysis, filesystem failures.
Linux Server Hacks, Volume Two: Troubleshooting and Performance
#69 Find Resource Hogs with Standard Commands
The first step in debugging a resource problem is to log into the machine and run the top command:
Tasks: 368 total, 1 running, 240 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.7 us, 1.8 sy, 0.0 ni, 96.0 id, 0.0 wa, 0.0 hi, 0.5 si, 0.0 st
KiB Mem : 3960656 total, 490284 free, 2295076 used, 1175296 buff/cache
KiB Swap: 2097148 total, 1409676 free, 687472 used. 1323712 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20446 root 20 0 0 0 0 I 6.0 0.0 0:03.42 kworker/u16:23-
1780 jerry 20 0 567372 45480 32776 S 3.0 1.1 55:06.71 Xorg
1894 jerry 20 0 3970804 154548 34504 S 2.3 3.9 32:30.89 gnome-shell
8825 jerry 20 0 753108 29704 18288 S 2.3 0.7 0:04.34 gnome-terminal-
1211 mysql 20 0 1489996 4068 0 S 0.3 0.1 0:44.82 mysqld
1932 jerry 20 0 362680 4988 2868 S 0.3 0.1 0:49.15 ibus-daemon
2372 jerry 20 0 1352908 320352 116632 S 0.3 8.1 47:27.69 chrome
4847 jerry 20 0 806508 128588 73200 S 0.3 3.2 0:29.62 chrome
12658 jerry 20 0 1272228 364528 122448 S 0.3 9.2 2:53.93 chrome
20092 root 20 0 0 0 0 I 0.3 0.0 0:01.78 kworker/0:0-eve
20743 jerry 20 0 51764 4412 3528 R 0.3 0.1 0:00.33 top
1 root 20 0 225852 6188 3684 S 0.0 0.2 0:36.19 systemd
- Tasks: This shows total tasks or threads, further classified as running, sleeping, stopped, or zombie. Only one process can run at a time on a single CPU. Note that a "runnable" process is either currently running or on a runqueue waiting to run. Most processes are in one of the following two states:
- A process that is on the CPU (a running process with state R)
- A process that is off the CPU (a not-running process) A process that is not running appears in one of the following states:
- Runnable state (R) The scheduler keeps that process in the run queue (the list of ready-to-run processes maintained by the kernel). When the CPU is available, this process will enter into Running state.
- Sleeping state A process enters a Sleeping state when it needs resources that are not currently available. When the resource the process is waiting on becomes available, a signal is sent to the CPU. The next time the scheduler gets a chance to schedule this sleeping process, the scheduler will put the process either in Running or Runnable state. There are two types of sleep states:
- Interruptible sleep state (S) An Interruptible sleep state means the process is waiting either for a particular time slot or for a particular event to occur.
- Uninterruptible sleep state (D) An Uninterruptible sleep state is one that won't handle a signal right away. It will wake only as a result of a waited-upon resource becoming available or after a time-out occurs during that wait. The Uninterruptible state is mostly used by device drivers waiting for disk or network I/O.
- Defunct or Zombie state (Z) Between the time when the process terminates and the time the parent reaps the child, the child is in what is referred to as a Zombie state. You cannot kill a Zombie process: the process no longer exists, so there is nothing to receive the signal.
Over its lifetime, a process moves through these states:
- Born or forked
- Ready to run or runnable
- Running in user space or running in kernel space
- Blocked, Waiting, Sleeping, in an Interruptible sleep, or in an Uninterruptible sleep
- The process is sleeping, but it is present in main memory
- The process is sleeping, but it is present in secondary memory storage (swap space on disk)
- Terminated or stopped
- %Cpu(s) When a user initiates a process, the process starts working in user mode. When the kernel starts serving requests from user-level processes, the user-level process enters kernel space. The top command's Cpu line shows the overall percentage of CPU time spent in user mode (us) and system mode (sy), as percentages of the interval since the last refresh. The labels (as shown by recent kernel versions) are:
- us, user : time running un-niced user processes
- sy, system : time running kernel processes
- ni, nice : time running niced user processes
- id, idle : time spent in the kernel idle handler If id is low, the CPU is working hard and doesn't have much excess capacity.
- wa, IO-wait : time waiting for I/O completion If wa is high, the CPU is ready to run, but is waiting on I/O access to complete (like fetching rows from a database table stored on the disk).
- hi : time spent servicing hardware interrupts
- si : time spent servicing software interrupts
- st : time stolen from this VM by the hypervisor. Steal time is the percentage of time a virtual CPU waits for a real CPU while the hypervisor is servicing another virtual processor; it is CPU time needed by the guest virtual machine that is not provided by the host. The VM kernel gets the steal metric from the hypervisor, which does not say which processes it is running; it just says: "I'm busy, and can't allocate any time to you." Large amounts of steal time indicate CPU contention, which can reduce a guest VM's performance. To relieve CPU contention, increase the guest VM's CPU priority or CPU quota, or run fewer guest VMs on the host. This metric is supported in Xen and KVM virtual environments. A general rule of thumb: if steal time is greater than 10% for 20 minutes, the VM is likely running slower than it should. When this happens:
- Shut down the instance and move it to another physical server
- If steal time remains high, increase the CPU resources
- If steal time remains high, contact your hosting provider. Your host may be overselling physical servers.
Test to see the steal value change:
- Start a KVM guest:
$ virsh start ubuntu18.04
- Log into the VM from the host:
$ ssh jerry@192.168.122.145
Run top, then monitor the steal value. Guest status:
%Cpu(s): 0.3 us, 3.0 sy, 96.4 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.3 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1934 root 39 19 226624 115428 71280 R 99.7 5.7 8:02.91 unattended-upgr
Host status:
%Cpu(s): 26.2 us, 0.8 sy, 0.0 ni, 71.9 id, 0.7 wa, 0.0 hi, 0.3 si, 0.0 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3522 libvirt+ 20 0 4802444 1.899g 4200 S 100.3 50.3 12:05.82 qemu-system-x86
(Other codes that can appear in the S column: T = stopped by job control signal, t = stopped by debugger during trace.)
- KiB Mem : Status of physical memory: total, free, used and buff/cache. "used" includes all memory allocated by system processes plus the "buffers" and "cache" categories; the Linux kernel puts otherwise-idle memory to work caching disk data in the page cache, and buff/cache is the size of that cache. Your real used memory, as a fraction of total, is:
( used - buff/cache ) / total
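The same figure can be computed directly from /proc/meminfo. A sketch with awk; note it approximates top's buff/cache with Buffers + Cached (top also counts SReclaimable):

```shell
# real used = MemTotal - MemFree - Buffers - Cached (all values in KiB)
awk '/^MemTotal:/ {t=$2}
     /^MemFree:/  {f=$2}
     /^Buffers:/  {b=$2}
     /^Cached:/   {c=$2}
     END {printf "real used: %d KiB (%.1f%% of total)\n",
                 t-f-b-c, 100*(t-f-b-c)/t}' /proc/meminfo
```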
- %CPU -- CPU Usage The task's share of the elapsed CPU time since the last screen update. Press (Shift+P) to sort processes as per CPU utilization.
- %MEM Memory Usage, simply RES divided by total physical memory
- RES Resident Memory Size (KiB). A subset of the virtual address space (VIRT) representing the non-swapped physical memory a task is currently using. It is also the sum of the RSan, RSfd and RSsh fields.
- RSlk Resident Locked Memory Size (KiB). A subset of resident memory (RES) which cannot be swapped out.
- RSan Resident Anonymous Memory Size (KiB). A subset of resident memory (RES) representing private pages not mapped to a file.
- RSfd Resident File-Backed Memory Size (KiB). A subset of resident memory (RES) representing the implicitly shared pages supporting program images and shared libraries. It also includes explicit file mappings, both private and shared.
- RSsh Resident Shared Memory Size (KiB). A subset of resident memory (RES) representing the explicitly shared anonymous shm*/mmap pages.
- SHR Shared Memory Size (KiB). A subset of resident memory (RES) that may be used by other processes. It will include shared anonymous pages and shared file-backed pages. It also includes private pages mapped to files representing program images and shared libraries.
- SWAP Swapped Size (KiB). The formerly resident portion of a task's address space written to the swap file when physical memory becomes overcommitted.
- USED Memory in Use (KiB). This field represents the non-swapped physical memory a task is using (RES) plus the swapped out portion of its address space (SWAP).
- VIRT Virtual Memory Size (KiB). The total amount of virtual memory used by the task. It includes everything in-use and/or reserved: all code, data and shared libraries plus pages that have been swapped out and pages that have been mapped but not used.
- COMMAND -- Command Name or Command Line. Press 'c' while top is running to toggle display of the absolute path and arguments of each process.
- PR The kernel runs jobs based on their priorities, which are indicated in the PR field. In Linux, priorities range from 0 to 139: 0 to 99 for real-time tasks and 100 to 139 for user tasks.
- NI The nice value is a user-space concept, while priority (PR) is the actual priority used by the Linux kernel. The nice value ranges from -20 to +19: -20 is the highest priority, 0 the default, and +19 the lowest. The relation between nice value and priority is:
PR = 20 + NI
so PR values [0 - 39] map to kernel priorities 100 to 139. To re-nice a process, hit 'r'; top will ask for the input:
PID to renice [default pid = 20998]
To change the priority of everything owned by user jerry:
renice 20 -u jerry
Show only one user's processes:
$ top -u jerry
Sorting hot-keys and the field each sorts on:
command  sorted-field              supported
A        start time (non-display)  No
M        %MEM                      Yes
N        PID                       Yes
P        %CPU                      Yes
T        TIME+                     Yes
Run top once in batch mode (useful for scripts and capturing snapshots):
top -n 1 -b
Monitor a specific process by PID:
$ top -p2100
Press 'H' (or run top -H) to show individual threads instead of processes:
Threads: 1061 total, 1 running, 1007 sleeping, 0 stopped, 0 zombie
%Cpu0 : 16.8 us, 2.3 sy, 0.0 ni, 79.2 id, 0.7 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu1 : 12.8 us, 4.0 sy, 0.0 ni, 80.5 id, 1.3 wa, 0.0 hi, 1.3 si, 0.0 st
%Cpu2 : 16.9 us, 4.7 sy, 0.0 ni, 76.7 id, 1.7 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 18.9 us, 3.0 sy, 0.0 ni, 78.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 3960624 total, 447004 free, 2951536 used, 562084 buff/cache
KiB Swap: 2097148 total, 882012 free, 1215136 used. 672356 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND P
32027 jerry 20 0 1262872 284452 76560 S 11.4 7.2 0:02.23 ThreadPoolForeg 2
32146 jerry 20 0 1262872 284452 76560 S 10.4 7.2 0:00.86 ThreadPoolForeg 2
32096 jerry 20 0 1262872 284452 76560 S 10.1 7.2 0:01.06 ThreadPoolForeg 1
The next tool is vmstat. vmstat reports information about processes, memory, paging, block IO, traps, disks and cpu activity.
$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd    free    buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0 915712 1730676  58020 577300    9   71   217   129 2001 1152 11  3 76 10  0
By default, vmstat produces output once. You can supply a delay value (in seconds), after which the output is updated repeatedly.
Analysis:
- Swap
- si: Amount of memory swapped in from disk (/s).
- so: Amount of memory swapped to disk (/s).
- IO
- bi: Blocks received from a block device (blocks/s).
- bo: Blocks sent to a block device (blocks/s).
- System
- in: The number of interrupts per second, including the clock.
- cs: The number of context switches per second.
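These columns are computed from raw kernel counters; a quick way to look at the underlying sources (a sketch — the swap counters live in /proc/vmstat, the context-switch counter in /proc/stat):

```shell
# si/so in vmstat are rates derived from the cumulative pswpin/pswpout
# counters; cs is derived from the ctxt counter in /proc/stat.
swap_in=$(awk '$1=="pswpin" {print $2}' /proc/vmstat);   swap_in=${swap_in:-0}
swap_out=$(awk '$1=="pswpout" {print $2}' /proc/vmstat); swap_out=${swap_out:-0}
ctxt=$(awk '$1=="ctxt" {print $2}' /proc/stat)
echo "pages swapped in: $swap_in, out: $swap_out, context switches: $ctxt"
```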
Virtual memory is a memory management technique used by Linux that combines active RAM and inactive memory on a disk drive (hard disk/SSD) to form a large range of contiguous addresses.
A page fault occurs when a process accesses a page that is mapped in its virtual address space but not loaded in physical memory. The CPU's memory management unit raises an interrupt, which is normally handled by the operating system. If the OS decides the access is valid, it tries to page the relevant data in from the virtual-memory backing store on disk; if the access is not allowed, it typically terminates the offending process.
A major fault requires disk access to service; a minor fault does not, because the page is already in memory but has not yet been mapped into the faulting process's page tables.
Use "ps" or "top" can get the statistics of page fault.
Disks
You can use pam_limits or the ulimit utility to keep users from going overboard after they log in to the system. The df -h command shows used/free disk statistics for all mounted filesystems. To find the identity of the disk hog within a directory:
$ du /home/* | sort -n
iostat - Report CPU statistics and I/O statistics for devices and partitions.
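A self-contained demonstration of the du-plus-sort idiom, run on a throwaway tree (the directory names here are created just for the demo):

```shell
# Build a small tree where one directory is clearly the disk hog.
tmp=$(mktemp -d)
mkdir -p "$tmp/small" "$tmp/big"
dd if=/dev/zero of="$tmp/big/file"   bs=1024 count=64 2>/dev/null
dd if=/dev/zero of="$tmp/small/file" bs=1024 count=1  2>/dev/null
du -k "$tmp"/* | sort -n            # the largest directory sorts last
hog=$(du -k "$tmp"/* | sort -n | tail -1 | awk '{print $2}')
echo "disk hog: $hog"
rm -rf "$tmp"
```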
When the command "iostat" is run without arguments, it generates a detailed report containing information since the system was booted. While each subsequent report covers the time period since the last report was generated.
You can provide two optional parameters to change this:
iostat [option] [interval] [count]
- interval parameter specifies the duration of time in seconds between each report
- count parameter allows you to specify the number of reports that are generated before iostat exits.
- The CPU Utilization report is similar to the summary shown by top.
- The Device Utilization report provides statistics on a per-physical-device or per-partition basis.
$ iostat -d
Linux 5.0.0-37-generic (jerry-Latitude-E6410)  2020-01-04  _x86_64_  (4 CPU)
Device  tps    kB_read/s  kB_wrtn/s  kB_read  kB_wrtn
sda     12.65  208.85     180.56     3623705  3132796
- Device This column gives the device (or partition) name as listed in the /dev directory.
- tps Indicate the number of transfers per second that were issued to the device.
- kB_read/s The amount of data read from the device, in kB per second.
- kB_wrtn/s The amount of data written to the device, in kB per second.
- kB_read The total amount of data read, in kB.
- kB_wrtn The total amount of data written, in kB.
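When collecting iostat reports from many hosts, these columns are easy to post-process; a sketch over an inlined sample report (the sda line is from the example above, the sdb line is made up for contrast):

```shell
# Pick the device with the highest tps out of a captured `iostat -d` table.
report='Device tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 12.65 208.85 180.56 3623705 3132796
sdb 3.10 15.20 9.75 261000 167000'
busiest=$(echo "$report" | awk 'NR>1 && $2>max {max=$2; dev=$1} END {print dev}')
echo "busiest device: $busiest"
```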
Bandwidth
The command lsof lists open files, including socket files.
$ lsof -U -Pi | grep IP
COMMAND  PID  USER  FD   TYPE DEVICE  SIZE/OFF NODE NAME
chrome  2372 jerry  90u  IPv4 6179770 0t0 UDP jerry-Latitude-E6410:59666
chrome  2372 jerry 205u  IPv4 5864118 0t0 UDP 224.0.0.251:5353
chrome  2410 jerry  30u  IPv6 5861252 0t0 TCP 2001-b011-5003-14ec-2976-d29d-44a9-5cd6.dynamic-ip6.hinet.net:32836->th-in-xbc.1e100.net:5228 (ESTABLISHED)
chrome  2410 jerry  33u  IPv6 6180782 0t0 UDP 2001-b011-5003-14ec-2976-d29d-44a9-5cd6.dynamic-ip6.hinet.net:37080->tsa03s06-in-x0e.1e100.net:443
chrome  2410 jerry  42u  IPv4 5865030 0t0 UDP 224.0.0.251:5353
#70 Reduce Restart Times with Journaling Filesystems
Computer systems can only successfully mount and use filesystems if they can be sure that all of the data structures in each filesystem are consistent. "Consistency" means that:
- all of the disk blocks that are actually used in some file or directory are marked as being in use
- all deleted blocks aren’t linked to anything other than the list of free blocks
- all directories in the filesystem actually have parent directories
- ...
#71 Optimize Your System with sysctl
The files under /proc/sys/ are often collectively referred to as the sysctl interface: they can be written to, and changes made to them are picked up by the running kernel without rebooting. sysctl is also a command that lets administrators configure kernel parameters at runtime. To display all values currently available:
$ sysctl -a
This returns many "key=value"-formatted records. The keys on the left are dotted representations of file paths under /proc/sys. For example, the setting for net.ipv4.ip_forward can be found in /proc/sys/net/ipv4/ip_forward. You can pass the key you want as an argument to sysctl:
- read
$ sysctl net.ipv4.ip_forward net.ipv4.ip_forward = 0
- write
$ sysctl -w net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
If you want to make a change permanent, put your custom settings into the /etc/sysctl.conf file.
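The dotted-key-to-path mapping can be done mechanically; a sketch using a key whose value is stable (kernel.ostype always reads "Linux" on a Linux system):

```shell
# Translate a sysctl key into its /proc/sys path and read it without the
# sysctl utility: dots in the key become slashes in the path.
key="kernel.ostype"
path="/proc/sys/$(echo "$key" | tr . /)"
val=$(cat "$path")
echo "$key = $val"
```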
#74 Profile Your Systems Using /proc
#!/bin/bash
echo ""
echo "#########BASIC SYSTEM INFORMATION########"
echo HOSTNAME: `cat /proc/sys/kernel/hostname`
echo DOMAIN: `cat /proc/sys/kernel/domainname`
echo KERNEL: `uname -r`
top -b | head -8
- "top -b" starts top in Batch mode, which could be useful for sending output from top to other programs or to a file.
- "head -8" prints the first 8 lines of each FILE to standard output.
echo "######## FILESYSTEM INFORMATION #########" echo "" echo "SUPPORTED FILESYSTEM TYPES:" echo ---------------------- echo `cat /proc/filesystems | awk -F'\t' '{print $2}'` echo "" echo "MOUNTED FILESYSTEMS:" echo ---------------------- cat /proc/mounts
#75 Kill Processes the Right Way
First use the ps -ef command to determine the process ID, then simply type:
$ kill pid
A "zombie" process is a child process that has died but has not yet been reaped by its parent (the parent must call wait() to read the child's exit status). Unlike normal processes, the kill command has no effect on a zombie. When a child terminates, the kernel keeps some information about it in the process table (including its exit status); the parent must read that exit status before the kernel removes the child's entry from the table. Once a process is fully dead, all resources associated with it are deallocated so they can be reused by other processes. The problem zombies cause is this: there is only one process table per system, and it has a limited number of unique process identifiers (PIDs); with too many stale entries, the system cannot create new processes. To avoid leaving zombies behind, make sure you kill all child processes before killing their respective parents. You can find a process's children by looking at the PPID (parent process ID) column in the output of ps -ef. Alternatively, you can attempt to kill all processes sharing the same name using killall.
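The advice above — find the children via the PPID, terminate them, and let the parent reap them — can be sketched in shell, scanning /proc directly so it does not depend on particular ps options:

```shell
# Spawn two children, find them by matching the ppid field (field 4 of
# /proc/<pid>/stat) against our own PID, then terminate and reap them.
sleep 100 & sleep 100 &
kids=""
for f in /proc/[0-9]*/stat; do
    set -- $(cat "$f" 2>/dev/null)
    [ "$4" = "$$" ] && kids="$kids $1"
done
count=$(echo $kids | wc -w)
echo "children of $$:$kids"
kill $kids 2>/dev/null     # polite SIGTERM first; escalate only if needed
wait 2>/dev/null           # parent reads exit statuses -> no zombies linger
```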
Logfiles and Monitoring
#78 Avoid Catastrophic Disk Failure
ATA and SCSI drives have long supported a standard mechanism for disk diagnostics called "Self-Monitoring, Analysis, and Reporting Technology" (SMART), aimed at predicting hard drive failures. The smartmontools project (http://smartmontools.sourceforge.net) produces a SMART monitoring daemon called smartd and a command-line utility called smartctl, which can do most things on demand that the daemon does in the background periodically. To find the information of a hard drive:
$ sudo smartctl -i /dev/sda
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.0.0-37-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Black
Device Model:     WDC WD2500BEKT-75A25T0
Serial Number:    WD-WXQ1A80V7620
LU WWN Device Id: 5 0014ee 655d0b10e
Firmware Version: 01.01A01
User Capacity:    250,059,350,016 bytes [250 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Mon Dec  9 18:43:42 2019 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
To ask the drive about its overall health:
$ sudo smartctl -H /dev/sda
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.0.0-37-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
To get all information:
$ sudo smartctl -a /dev/sda
#79 Monitor Network Traffic with MRTG
The Multi-Router Traffic Grapher (MRTG) provides a quick visual snapshot of network traffic, making it easy to find and resolve congestion. Each time you run MRTG, you'll need to specify the location from which you want it to read the config file.
#86 Fine-Tune the syslog Daemon
dmesg is used to examine or control the kernel ring buffer. The default action is to display all messages from the kernel ring buffer.
klogd reads kernel messages either from /proc/kmsg or via the sys_syslog system call, and provides the kernel log data stream to another daemon called syslogd.
The system daemon syslogd listens for messages on a Unix domain socket named /dev/log. Based on classification information in the messages and its configuration file (usually /etc/syslog.conf), syslogd routes them in various ways. Some popular routings are:
- Write to the system console
- Mail to a specific user
- Write to a log file (e.g., /var/log/messages)
- Pass to another daemon
- Discard
Therefore, both streams of log information end up in the same log file, /var/log/messages.
dmesg is most useful for capturing boot-time messages emitted before syslogd has started.
rsyslogd is derived from the sysklogd package. Support for both Internet and Unix domain sockets enables this utility to handle both local and remote logging.
USING Security Enhanced Linux (SELinux)
Chapter 1. Getting started with SELinux
Discretionary access control (DAC) is a means of restricting access to objects based on the identity of subjects and/or the groups to which they belong.
Mandatory access control (MAC) constrains the ability of a subject (or initiator) to access, or generally perform some operation on, an object (or target).
In practice,
- a subject is usually a process or thread;
- objects are constructs such as files, directories, TCP/UDP ports, shared memory segments, IO devices, etc.
For example: May a web server access files in users' home directories?
This enables system administrators to create comprehensive, fine-grained security policies, such as restricting specific applications to only viewing log files.
The main difference between MAC and DAC:
- With mandatory access control (MAC), the security policy is centrally controlled by a security policy administrator; users cannot override the policy, for example by granting access to files that would otherwise be restricted. This lets security administrators define an organization-wide policy that is guaranteed (in principle) to be enforced for all users.
- With discretionary access control (DAC), users themselves can make policy decisions and/or assign security attributes.
SELinux implements MAC.
Every process and system resource has a special security label called an SELinux label/context.
NOTE: Remember that SELinux policy rules are checked after DAC rules. SELinux policy rules are not consulted if DAC rules deny access first.
1.2. Benefits of running SELinux
SELinux provides the following benefits:
- All processes and files are labeled.
- Fine-grained access control.
- SELinux policy is administratively-defined and enforced system-wide.
1.4. SELinux architecture and packages
Linux Security Modules (LSM) is a framework(interface) that allows the Linux kernel to support a variety of computer security models. LSM is built into the Linux kernel. SELinux and AppArmor are implementations of LSM. Both SELinux and AppArmor provide a set of tools to isolate applications from each other to protect the host system from being compromised.
Only a single LSM is allowed to be operational at a time.
The SELinux subsystem in the kernel is driven by a security policy which is controlled by the administrator and loaded at boot. All security-relevant, kernel-level access operations on the system are intercepted by SELinux and examined in the context of the loaded security policy.
By default, Ubuntu uses AppArmor rather than SELinux; AppArmor offers similar performance and is popular for its simplicity.
AppArmor must be disabled before installing SELinux to avoid conflicts. Use the following instructions to disable AppArmor:
$ sudo systemctl status apparmor
● apparmor.service - AppArmor initialization
   Loaded: loaded (/lib/systemd/system/apparmor.service; enabled; vendor preset: enabled)
   Active: active (exited) since Sun 2020-01-12 10:17:46 CST; 29min ago
     Docs: man:apparmor(7)
           http://wiki.apparmor.net/
  Process: 477 ExecStart=/etc/init.d/apparmor start (code=exited, status=0/SUCCESS)
 Main PID: 477 (code=exited, status=0/SUCCESS)
Jan 12 10:17:54 jerry-Latitude-E6410 apparmor[477]: * Starting AppArmor profiles
Jan 12 10:17:54 jerry-Latitude-E6410 apparmor[477]: Skipping profile in /etc/apparmor.d/disable: usr
Jan 12 10:17:54 jerry-Latitude-E6410 apparmor[477]: Skipping profile in /etc/apparmor.d/disable: usr
Jan 12 10:17:54 jerry-Latitude-E6410 apparmor[477]: ...done.
Jan 12 10:17:34 jerry-Latitude-E6410 systemd[1]: Starting AppArmor initialization...
Jan 12 10:17:46 jerry-Latitude-E6410 systemd[1]: Started AppArmor initialization.
$ sudo systemctl stop apparmor
$ sudo systemctl disable apparmor
Note: SELinux is not tested on Ubuntu and is not recommended there. Try a Red Hat-based distribution instead.
Because SELinux is implemented in the kernel, applications do not need to be specially written or rewritten to take advantage of it. Of course, a program that pays attention to the SELinux error codes mentioned later may run more smoothly. When SELinux blocks an action, it reports a standard (or at least conventional) "access denied"-class error to the application. However, many applications do not test the error codes returned by system calls, so they may print no message explaining the problem, or print a misleading one.
1.5. SELinux states and modes
SELinux can run in one of three modes:
- Enforcing mode It enforces the loaded security policy on the entire system.
- Permissive mode It applies the loaded security policy to the entire system, but it does not actually deny any operations; it only emits access-denial entries in the logs.
- Disabled mode The policy is not enforced and persistent objects (such as files) are not labeled. This makes it difficult to enable SELinux in the future.
# getenforce
Enforcing
# setenforce 0
# getenforce
Permissive
# setenforce 1
# getenforce
Enforcing
To get the status of a system running SELinux:
$ sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   permissive
Mode from config file:          permissive
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      31
Chapter 2. Changing SELinux states and modes
While enabling SELinux on systems that previously had it disabled, to avoid problems, follow this procedure:
- Enable SELinux in permissive mode.
- Reboot your system.
- Check for SELinux denial messages. If there are no denials, switch to enforcing mode. For more information, see Changing to enforcing mode.
2.2.1. Changing to permissive mode
- Modify /etc/selinux/config SELINUX=permissive
- Reboot the system
2.2.2. Changing to enforcing mode
- Modify /etc/selinux/config SELINUX=enforcing
- Reboot the system
2.3. Disabling SELinux
- Modify /etc/selinux/config SELINUX=disabled
- Reboot the system
2.4. Changing SELinux modes at boot time
On boot, you can set several kernel parameters to change the way SELinux runs:
- enforcing=0 Cause the machine to boot in permissive mode. Using permissive mode might be the only option to detect a problem if your file system is too corrupted.
- selinux=0 Cause the kernel to not load any part of the SELinux infrastructure.
- autorelabel=1 Force the system to relabel.
Chapter 3. Configuring SELinux for applications and services with non-standard configurations
3.1 Customizing the SELinux policy for the Apache HTTP server in a non-standard configuration
Prerequisites: the Apache HTTP server is configured to listen on TCP port 3131 . Modify /etc/httpd/conf/httpd.conf:
Listen 3131
Procedure:
- Start the httpd service and check the status
$ systemctl start httpd
$ systemctl status httpd
...
Status: "Running, listening on: port 3131"
...
By default, SELinux writes its log through the Linux Auditing System daemon auditd to /var/log/audit/audit.log. SELinux log entries are tagged with the keyword AVC, so tools such as grep can easily filter them out from other messages. Check the log:
$ journalctl | grep avc 一 14 16:02:32 localhost.localdomain audit[2398]: AVC avc: denied { name_bind } for pid=2398 comm="httpd" src=3131 scontext=system_u:system_r:httpd_t:s0 tcontext=system_u:object_r:unreserved_port_t:s0 tclass=tcp_socket permissive=1
# touch /.autorelabel
# reboot
semanage — SELinux Policy Management tool
Non-uniform memory access (NUMA)
Non-uniform memory access (NUMA) is a computer memory design used in multiprocessing.
Under NUMA, a processor can access its own local memory faster than non-local memory (memory local to another processor or memory shared between processors).
Introduction
Hyper-Threading
A single physical CPU core with hyper-threading appears as two logical CPUs to an operating system.
Hyper-threading allows the two logical CPU cores to share physical execution resources. This can speed things up somewhat — if one virtual CPU is stalled and waiting, the other virtual CPU can borrow its execution resources.
Multiple Cores
A dual-core CPU has two central processing units, so it appears to the operating system as two CPUs. A CPU with two cores can run two different processes at the same time. A single CPU socket holding a single physical quad-core CPU can thus present four CPUs to the operating system.
Multiple CPUs
Adding processing power by installing additional CPUs requires a motherboard with multiple CPU sockets, plus additional hardware to connect those sockets to the RAM and other resources.
Most multi-processor computers are considered Symmetric Multi-Processors(SMP) as each processor is equal and has equal access to all system resources (e.g., memory and I/O busses).
As SMP systems have increased their processor count, the system bus has increasingly become a bottleneck.
CPU Affinity
The ability in Linux to bind one or more processes to one or more processors is called CPU affinity.
The idea is to say “always run this process on processor one” or “run these processes on all processors but processor zero”.
CPU affinity is a scheduler property that "bonds" a process to a given set of CPUs on the system. The Linux scheduler will honor the given CPU affinity and the process will not run on any other CPUs.
taskset is used to set or retrieve the CPU affinity of a running process given its pid, or to launch a new command with a given CPU affinity.
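Both taskset and the scheduler work from the same per-task affinity mask, which is also visible in /proc; a small sketch (the PID in the final comment is hypothetical):

```shell
# The kernel exposes each task's current CPU affinity in /proc/<pid>/status;
# taskset(1) reads and writes the same mask.
allowed=$(awk '/^Cpus_allowed_list/ {print $2}' /proc/self/status)
echo "this task may run on CPUs: $allowed"
# To pin a running process (hypothetical PID 1234) to CPU 0:
#   taskset -cp 0 1234
```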
NUMA
Historically, all memory on AMD64 and Intel 64 systems was equally accessible by all CPUs. Known as Uniform Memory Access (UMA), access times were the same no matter which CPU performed the operation.
Non-Uniform Memory Access (NUMA) refers to multiprocessor systems whose memory is divided into multiple memory nodes. The access time of a memory node depends on the relative locations of the accessing CPU and the accessed node.
Thus, there are multiple physical regions of memory, but all memory is tied together into a single cache-coherent physical address space. The resulting system has the property such that for any given region of physical memory, some processors are closer to it than other processors. Conversely, for any processor, some memory is considered local (i.e., it is close to the processor) and other memory is remote.
To maximize performance on a NUMA platform, Linux must take into account the way the system resources are physically laid out. This includes information such as which CPUs are on which node, which range of physical memory is on each node, and what node an I/O bus is connected to. This type of information describes the topology of the system.
NUMA divides the system into nodes; each processor and memory bank belongs to some node. A processor accessing memory on its own node enjoys higher access speed, while accessing memory on another node requires data transfers between nodes and therefore takes longer.
Linux NUMA Support
To improve memory-access efficiency, the operating system sets memory-allocation policies that match the hardware's NUMA layout, and provides NUMA-related APIs for querying the system's NUMA configuration and changing the allocation policy.
Linux manages memory in zones. How Linux has arranged memory can be determined by looking at /proc/zoneinfo.
On boot-up, Linux will detect the organization of memory and then create zones that map to the NUMA nodes and DMA areas as needed.
NUMA Memory Allocation Policies
How memory is allocated from the nodes in a system is determined by a memory policy:
The most important memory policies are:
- interleave During boot up, the system default policy will be set to interleave allocations across all nodes with “sufficient” memory, so as not to overload the initial boot node with boot-time allocations. Allocation occurs round-robin. First a page will be allocated from node 0, then from node 1, then again from node 0, etc. Interleaving is used to distribute memory accesses for structures that may be accessed from multiple processors in the system in order to have an even load on the interconnect and the memory of each node.
- local allocation When the system is “up and running”, when the first userspace process (init daemon) is started, the system default policy will be changed to “local allocation”. The allocation occurs from the memory node local to where the code is currently executing.
Basic Operations On Process Startup
The main tool used to set up the NUMA execution environment for a process is numactl.
numactl controls NUMA policy for processes or shared memory. It is possible to restrict processes to a set of processors, as well as to a set of memory nodes.
The hardware NUMA configuration of a system can be viewed by using
$ numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3
node 0 size: 3867 MB
node 0 free: 931 MB
node distances:
node   0
  0:  10
numastat displays per-node NUMA hit and miss system statistics from the kernel memory allocator.
$ numastat
                    node0
numa_hit        103812126
numa_miss               0
numa_foreign            0
interleave_hit      33350
local_node      103812126
other_node              0
- numa_hit is memory successfully allocated on this node as intended.
- numa_miss is memory allocated on this node despite the process preferring some different node. Each numa_miss has a numa_foreign on another node.
- numa_foreign is memory intended for this node, but actually allocated on some different node. Each numa_foreign has a numa_miss on another node.
- interleave_hit is interleaved memory successfully allocated on this node as intended.
- local_node is memory allocated on this node while a process was running on it.
- other_node is memory allocated on this node while a process was running on some other node.
The information about a process's NUMA memory policy and allocation can be displayed via /proc/[pid]/numa_maps. For ex.,
$ sudo cat /proc/2907/numa_maps
3473fdedd000 default
3473fdede000 default anon=14265 dirty=14265 active=6585 N0=14265 kernelpagesize_kB=4
...
7fb1ad2ca000 default file=/lib/x86_64-linux-gnu/ld-2.27.so anon=1 dirty=1 active=0 N0=1 kernelpagesize_kB=4
7fb1ad2cb000 default anon=1 dirty=1 active=0 N0=1 kernelpagesize_kB=4
7ffc1dbb3000 default stack anon=29 dirty=29 active=20 N0=29 kernelpagesize_kB=4
7ffc1dbe2000 default
7ffc1dbe5000 default
Each line contains information about a memory range used by the process.
- The first field of each line shows the starting address of the memory range. This field allows a correlation with the contents of the /proc/[pid]/maps file, which contains the end address of the range and other information, such as the access permissions and sharing.
- The second field shows the memory policy currently in effect for the memory range.
- anon=[pages] The number of anonymous pages in the range.
- stack Memory range is used for the stack.
- file=[filename] The file backing the memory range.
- dirty=[pages] Number of dirty pages.
- N[node]=[nr_pages] The number of pages allocated on [node].
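These per-node counts are convenient to aggregate; a sketch that sums the N&lt;node&gt;= fields over numa_maps-style text (the sample is abbreviated from the output above and inlined, since reading another process's numa_maps requires privileges):

```shell
# Sum pages per NUMA node from numa_maps-style lines.
maps='3473fdede000 default anon=14265 dirty=14265 N0=14265 kernelpagesize_kB=4
7fb1ad2ca000 default file=/lib/ld.so anon=1 dirty=1 N0=1 kernelpagesize_kB=4
7ffc1dbb3000 default stack anon=29 dirty=29 N0=29 kernelpagesize_kB=4'
total=$(echo "$maps" | awk '
    {for (i = 1; i <= NF; i++)
        if ($i ~ /^N[0-9]+=/) { split($i, a, "="); sum[a[1]] += a[2] }}
    END {for (n in sum) print n, sum[n]}')
echo "$total"
```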
Create a Linux Swap File
When a Linux system runs out of RAM, inactive pages are moved from RAM to the swap space. Swap space can take the form of either a dedicated swap partition or a swap file.
How to add Swap File
- Create a file that will be used for swap. To create a 1 GB swap file:
sudo dd if=/dev/zero of=/swapfile bs=1024 count=1048576
This initializes the swap file with zeros.
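The sizing arithmetic is bs × count: 1024-byte blocks × 1048576 blocks = 1 GiB. The same arithmetic can be checked safely at a smaller scale:

```shell
# 1024 blocks of 1024 bytes each = 1 MiB (1048576 bytes).
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1024 count=1024 2>/dev/null
size=$(wc -c < "$f")
echo "file size: $size bytes"
rm -f "$f"
```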
- Restrict access to root only:
sudo chmod 600 /swapfile
- Set up a Linux swap area on the file:
sudo mkswap /swapfile
- Enable the swap file:
sudo swapon /swapfile
- To make the change permanent, add this line to /etc/fstab:
/swapfile swap swap defaults 0 0
- Verify:
sudo swapon --show
Cpusets on Linux
On a personal machine, if you want to tune compute performance in a multi-core environment, the cpuset facility provided by the Linux kernel is a simple and convenient mechanism.
Cpusets are logical, hierarchical groupings of CPUs and units of memory.
The cpuset facility is primarily a workload manager tool permitting a system administrator to restrict the number of processor and memory resources that a process or set of processes may use.
- A cpuset defines a list of CPUs and memory nodes.
- A process contained in a cpuset may only execute on the CPUs in that cpuset and may only allocate memory on the memory nodes in that cpuset. Essentially, cpusets provide you with a CPU and memory containers or “soft partitions” within which you can run sets of related tasks.
Linux 2.6 Kernel Support for Cpusets
- Each task has a link to a cpuset structure that specifies the CPUs and memory nodes available for its use.
- The kernel task scheduler is constrained to only schedule a task on the CPUs in that task's cpuset.
- The kernel memory allocation mechanism is constrained to only allocate physical memory to a task from the memory nodes in that task's cpuset.
Cpuset Facility Capabilities
The cpuset facility allows you and your system service software to do the following:
- Create and delete named cpusets.
- Decide which CPUs and memory nodes are available to a cpuset.
- Attach a task to a particular cpuset.
- Identify all tasks sharing the same cpuset.
- Exclude any other cpuset from overlapping a given cpuset, thereby, giving the tasks running in that cpuset exclusive use of those CPUs and memory nodes.
- Perform bulk operations on all tasks associated with a cpuset, such as varying the resources available to that cpuset or hibernating those tasks in temporary favor of some other job.
- Perform sub-partitioning of system resources using hierarchical permissions and resource management.
Initializing Cpusets
The kernel, at system boot time, initializes one cpuset, the root cpuset, containing the entire system's CPUs and memory nodes. Subsequent user space operations can create additional cpusets.
Mounting the cpuset virtual file system (VFS) at /dev/cpuset exposes the kernel mechanism to user space.
How to Determine if Cpusets are Installed
Check if the /proc/filesystems contains cpusets,
$ grep cpuset /proc/filesystems
nodev	cpuset
If the /dev/cpuset/tasks file is not present on your system, the cpuset file system is not mounted. You can mount it as follows:
$ sudo mkdir /dev/cpuset
$ sudo mount -t cpuset cpuset /dev/cpuset
$ ls /dev/cpuset
cgroup.clone_children  cpuset.effective_cpus  cpuset.memory_pressure          cpuset.sched_load_balance        tasks
cgroup.procs           cpuset.effective_mems  cpuset.memory_pressure_enabled  cpuset.sched_relax_domain_level
cgroup.sane_behavior   cpuset.mem_exclusive   cpuset.memory_spread_page       machine
cpuset.cpu_exclusive   cpuset.mem_hardwall    cpuset.memory_spread_slab       notify_on_release
cpuset.cpus            cpuset.memory_migrate  cpuset.mems                     release_agent
Cpuset File System Directories
Each cpuset is represented by a directory in the cpuset virtual file system.
The state of each cpuset is represented by small text files in the directory for the cpuset. These files may be read and written using traditional shell utilities or using ordinary file access routines from programming languages.
Descriptions of the files in the cpuset directory,
- tasks List of process IDs (PIDs) of tasks in the cpuset. The list is formatted as a series of ASCII decimal numbers, each followed by a newline. A task may be added to a cpuset (removing it from the cpuset previously containing it) by writing its PID to that cpuset's tasks file (with or without a trailing newline.)
- notify_on_release A flag: if set, the kernel runs the command named in the release_agent file when the last task leaves the cpuset and it has no child cpusets.
WiFi Debug
How to hide your MAC address?
You are not modifying the hardware; you are changing what is loaded in RAM. When the computer starts, the MAC address is loaded into RAM, and macchanger alters that already-loaded copy.
- install macchanger
$ sudo apt-get install macchanger
# ifconfig eno1 down
$ sudo macchanger -r eno1
Current MAC:   5c:26:xx:xx:xx:xx (Dell Inc.)
Permanent MAC: 5c:26:xx:xx:xx:xx (Dell Inc.)
New MAC:       f2:68:fb:e3:61:54 (unknown)
Wireshark
Wireshark can be used to capture packets directly:
$ sudo apt-get install wireshark
$ sudo wireshark -i wlan0 -w
Wireless modes
Basically, the default "managed" mode of your networking card allows the networking device to receive packets that are sent to its MAC address.
$ sudo iwconfig wlp2s0
wlp2s0    IEEE 802.11  ESSID:"Jerry_DSL-5G"
          Mode:Managed  Frequency:5.745 GHz ...
You can put the card into "monitor" mode by typing these commands:
$ sudo ifconfig wlp2s0 down
$ sudo iwconfig wlp2s0 mode monitor
$ sudo ifconfig wlp2s0 up
$ sudo iwconfig wlp2s0
Aircrack-ng
Aircrack-ng is a whole suite of tools for Wireless Security Auditing. It can be used to monitor, test, crack or attack Wireless Security Protocols like WEP, WPA, WPA2. Aircrack-ng is command line based and is available for Windows and Mac OS and other Unix based Operating systems.
We’ll only look at some important tools that are used more often in Wireless Security testing.
airodump-ng is used for packet capturing of raw 802.11 frames for the intent of using them with aircrack-ng.
Installation
sudo apt-get install -y aircrack-ng
Usage
- Kill all processes that interfere with the wireless card, using airmon-ng:
$ sudo airmon-ng check kill
- Start monitor mode on the wireless interface:
$ sudo airmon-ng start wlp2s0
$ sudo iwconfig
Once airmon-ng has started monitor mode on the wireless card, the interface appears under a different name.
$ sudo airodump-ng wlan0mon
You can narrow down the search using MAC (--bssid) and channel (-c) filters.
$ sudo airodump-ng --channel [channel] --bssid [bssid] --write [file-name] wlan0mon
$ sudo aireplay-ng --deauth [number of deauth packets] -a [bssid] -c [target_client_mac] wlan0mon
- http://www.hackreports.com/2013/05/biggest-password-cracking-wordlist-with.html
- https://crackstation.net/buy-crackstation-wordlist-password-cracking-dictionary.htm
$ sudo aircrack-ng [handshake_filename] -w [dictionary_wordlist]
$ sudo airmon-ng stop wlan0mon
The Linux-PAM configuration file
/etc/pam.conf is made up of a list of rules:
service type control module-path module-arguments
- service The service is the name of the file in the /etc/pam.d/ directory which defines the rule:
/etc/pam.d
├── chfn
├── cron
├── login
├── passwd
└── systemd-user
- account account management: checks unrelated to authentication, e.g., whether the account or password has expired and whether access is allowed
- auth authentication: verifies the user's identity and grants credentials
- password updating authentication tokens (typically passwords)
- session tasks performed before or after a session is established, e.g., mounting directories or logging
Run a shell script as another user without password
sudo -H -u otheruser bash -c 'echo "I am $USER, with uid $UID"'
- -H The -H (HOME) option requests that the security policy set the HOME environment variable to the home directory of the target user (root by default) as specified by the password database. Depending on the policy, this may be the default behavior.
- -u The -u (user) option causes sudo to run the specified command as a user other than root. To specify a uid instead of a user name, use #uid. When running commands as a uid, many shells require that the '#' be escaped with a backslash ('\'). Security policies may restrict uids to those listed in the password database. The sudoers policy allows uids that are not in the password database as long as the targetpw option is not set. Other security policies may not support this.
You can modify the /etc/pam.d/su file to allow su without password.
With the following modification in /etc/pam.d/su, any user that was part of group somegroup could su to otheruser without a password:
auth  sufficient                 pam_rootok.so
auth  [success=ignore default=1] pam_succeed_if.so user = otheruser
auth  sufficient                 pam_succeed_if.so use_uid user ingroup somegroup
ip COMMAND CHEAT SHEET
Discussions of Networking Issues
Unable to use ping as regular user
Description
On Linux, a regular user running the ping command may get the error "ping: icmp open socket: Operation not permitted", while root can use the command normally.
$ ip a
5: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:0d:48:5f:45:73 brd ff:ff:ff:ff:ff:ff
    inet 192.168.168.170/24 brd 192.168.168.255 scope global dynamic noprefixroute enp4s0
       valid_lft 85616sec preferred_lft 85616sec
    inet6 2001:b011:5003:19dc:5b27:dfe9:ba8d:e6c/64 scope global dynamic noprefixroute
       valid_lft 85618sec preferred_lft 85618sec
    inet6 fe80::7623:d501:113b:8e77/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
$ ping 192.168.168.1 -c 2 -w 3 -I enp4s0
ping: SO_BINDTODEVICE enp4s0: Operation not permitted
$ sudo ping 192.168.168.1 -c 2 -w 3 -I enp4s0
PING 192.168.168.1 (192.168.168.1) from 192.168.168.170 enp4s0: 56(84) bytes of data.
64 bytes from 192.168.168.1: icmp_seq=1 ttl=64 time=0.464 ms
64 bytes from 192.168.168.1: icmp_seq=2 ttl=64 time=0.499 ms
Solution
$ sudo chmod u+s /usr/bin/ping
$ ping 192.168.168.1 -c 2 -w 3 -I enp4s0
PING 192.168.168.1 (192.168.168.1) from 192.168.168.170 enp4s0: 56(84) bytes of data.
64 bytes from 192.168.168.1: icmp_seq=1 ttl=64 time=0.497 ms
Analysis
ping uses the ICMP protocol and must send ICMP packets, but only root can create raw ICMP sockets. Normally the permissions on the ping binary are -rwsr-xr-x, i.e., a setuid-root file; once that bit is removed, regular users can no longer use the command. See also the Fedora change "Enable SysctlPingGroupRange".
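The SysctlPingGroupRange change referenced above takes the other route: it lets ordinary groups open unprivileged ICMP Echo sockets, so no setuid bit is needed. A sketch (the write needs root, so it is left commented):

```shell
# Groups within this range may use ICMP Echo sockets without privileges.
# The kernel default "1 0" is an empty range (no groups allowed).
range=$(cat /proc/sys/net/ipv4/ping_group_range)
echo "ping_group_range: $range"
# To open it up to every group (run as root):
#   sysctl -w net.ipv4.ping_group_range="0 2147483647"
```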
Set Up the NFS server
server:
- Install the NFS server
$ sudo apt install nfs-kernel-server
- Export the directory by adding a line like this to /etc/exports:
/home/jerrlee/work2 10.19.108.147/8(rw,sync,no_subtree_check,fsid=0,insecure_locks,insecure,no_root_squash)
$ sudo chown jerrlee /home/jerrlee/work2
$ sudo systemctl restart nfs-kernel-server
$ sudo service nfs-server status
- Re-export everything listed in /etc/exports:
$ sudo exportfs -a
sudo ufw allow nfs
sudo ufw allow sunrpc
sudo ufw allow 111
sudo ufw allow from 10.19.108.147/8
sudo ufw status
client:
- Install the client
sudo apt install nfs-common   # Debian/Ubuntu (on RPM-based systems the package is nfs-utils)
$ showmount -e build-server Export list for build-server: /home/jerrlee/work2 10.19.108.147/8
sudo ufw allow ssh
sudo ufw allow nfs
sudo iptables --flush
sudo ufw allow from 10.19.108.147/8
nc -v -u 10.19.108.147 111
nc -v 10.19.108.147 111
nc -v 10.19.108.147 2049
nc -v -u 10.19.108.147 2049
rpcinfo -p 10.19.108.147
sudo mount -o v3 10.19.108.147:/home/jerrlee/work2 /home/jerry/work2