Linux 101¶
In this course, we will look at some generic Linux concepts and how Linux can be used to perform various tasks more easily than in other operating systems. Also see: Building HPC Systems course 2024
Foreword¶
Use the menu on the left to navigate between the different topics.
Throughout this site, code blocks indicate commands to be executed. Even though one can easily copy and paste the commands from the code blocks, it may be better to type the commands one after the other for training purposes.
Typing the commands and observing the results and interactions presents a better training experience.
Example code block:
For most parts of this Linux 101 training section, we will separate the results from the commands and put the results in a collapsed box. Instead of simply looking at the results, it is recommended that you execute the commands yourselves and verify that your result is the same (or at least partially, where slight variations are expected).
To illustrate what to expect, the first line (starting with a hash #) in the following code block is a comment. The second line is a command, and the collapsed output is underneath the block.
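A minimal sketch of such a block, assuming the long listing is taken of the current directory (the target directory is an assumption, not something this page specifies):

```shell
# List the contents of the current directory in long format
ls -l
```

The output differs from machine to machine, so only the general layout (permissions, owner, size, date, name) should match.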
Results of the long listing. To expand, click on this line
Section 01¶
GNU/Linux a brief introduction¶
When we talk about Linux, we should in fact refer to it as GNU/Linux.
GNU is a recursive acronym meaning GNU's Not Unix.
GNU is an operating system that is free software—that is, it respects users' freedom. The GNU operating system consists of GNU packages (programs specifically released by the GNU Project) as well as free software released by third parties. The development of GNU made it possible to use a computer without software that would trample your freedom. 1
GNU/Linux is modelled on Unix but was written from scratch. Richard Stallman announced the GNU Project in 1983 and founded the Free Software Foundation in 1985, seeking a way to create and share software freely.
In 1991, Linus Torvalds independently started the Linux kernel. Combining this kernel with the GNU software made the distribution of a complete free operating system possible by 1994.
The Free Software Foundation considers it important to refer to GNU/Linux: referring only to Linux understates the contribution of the GNU Project to the system's existence.
Who uses GNU/Linux?¶
Although the exact number of Linux users is unknown, various sources estimate that Linux runs on around 3% of all personal computers but on up to 97% of servers in data centres.2 Determining the exact number is difficult because users don’t have to register or license their software.
That being said, GNU/Linux:
- Runs on smartphones, smart home devices, desktop computers, laptops, tablets, mainframes, supercomputers, routers, switches, TVs, and even cars
- Runs on only about 3% of desktop PCs
- Is the most popular choice for hosting web services and other security infrastructure
- Most enterprise core systems run on GNU/Linux
- All the Top500 clusters run on GNU/Linux
- Most Cloud-based solutions are built and run on GNU/Linux technology
- even Microsoft Azure
What is GNU/Linux?¶
The most distinctive feature of GNU/Linux compared to commercial operating systems is that it is open source. Most other operating systems also have only one mainstream version, with perhaps one or two legacy versions still maintained. Linux has hundreds of distributions, each with a slightly different look and feel.
- GNU/Linux is Open Source
- You can opt to pay for support: Red Hat and SUSE Linux
- The core of GNU/Linux is the Kernel
- GNU/Linux is a largely POSIX-compliant OS, which makes applications portable between most vendors' hardware
- The file systems also use POSIX semantics
- Holds a hierarchical (tree) layout ./ ../ ../../ ../../../
- ACL
- User Permissions and ownerships per file or directory
- File/Directory permissions ( ugo rwx 421 )
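The permission notation in the last item can be illustrated with a small sketch. It assumes GNU coreutils (the `stat -c` option) and works on a throw-away temporary file; the variable names are illustrative only.

```shell
# Work on a throw-away file so nothing important is touched.
demo_file=$(mktemp)

# Each permission has a numeric value: read = 4, write = 2, execute = 1.
user_bits=$((4 + 2))   # rw- for the owning user (u)
group_bits=4           # r-- for the group (g)
other_bits=0           # --- for everyone else (o)
mode="${user_bits}${group_bits}${other_bits}"

chmod "$mode" "$demo_file"

# GNU stat: %a prints the octal mode, %A the symbolic form.
perms=$(stat -c '%a %A' "$demo_file")
echo "$perms"

rm -f "$demo_file"
```

Adding the values per class (user, group, other) gives the three-digit mode that `chmod` accepts.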
The file system layout can be inspected by using the tree command. The tree command shows the hierarchical structure with the different levels in the structure.
Listing of the first level of the file system
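A possible way to produce such a first-level listing is sketched below. The `tree` package is not always installed by default, so the sketch falls back to `ls`; the fallback and the counting step are additions for illustration.

```shell
# Show only the first level below the root directory.
if command -v tree >/dev/null 2>&1; then
    tree -L 1 /
else
    # Fallback when tree is not installed: one entry per line.
    ls -1 /
fi

# Count the top-level entries (for reference only).
top_level_count=$(ls -1 / | wc -l)
echo "Entries directly under /: $top_level_count"
```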
The Linux Kernel¶
The Linux Kernel is the heart of the operating system. It manages all processes and devices through system calls and drivers running on top of it.
Linus Torvalds originally released the kernel in September 1991. At the time of writing, the most recent version of the kernel is 6.9.8, and new versions are released constantly. However, this is not the only version being maintained: several long-term versions are still actively maintained.
The table below lists the currently supported long-term kernel versions.
Version | Maintainer | Released | EOL |
---|---|---|---|
6.6 | Greg Kroah-Hartman & Sasha Levin | 2023-10-29 | 2026-12-31 |
6.1 | Greg Kroah-Hartman & Sasha Levin | 2022-12-11 | 2026-12-31 |
5.15 | Greg Kroah-Hartman & Sasha Levin | 2021-10-31 | 2026-12-31 |
5.10 | Greg Kroah-Hartman & Sasha Levin | 2020-12-13 | 2026-12-31 |
5.4 | Greg Kroah-Hartman & Sasha Levin | 2019-11-24 | 2025-12-31 |
4.19 | Greg Kroah-Hartman & Sasha Levin | 2018-10-22 | 2024-12-31 |
Certain Linux vendors, such as Red Hat, use older kernel versions known to be stable in their server releases and newer kernel versions in their workstation-oriented releases, such as Fedora Linux. The latest release of Rocky Linux, 9.4 (a Red Hat-derived server edition), runs kernel version 5.14, whereas Fedora 40 uses the newer kernel version 6.8. If a vulnerability is detected in either kernel version, a mitigation is released as soon as possible.
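To check which kernel a given machine runs, something along these lines should work on any distribution. The variable names are illustrative, and the series is simply the first two components of the release string.

```shell
# Full kernel release string, e.g. something like 5.14.0-... on Rocky Linux 9
kernel_release=$(uname -r)

# Keep only the major.minor part to compare against the table above.
kernel_series=$(echo "$kernel_release" | cut -d. -f1,2)

echo "Kernel release: $kernel_release"
echo "Kernel series:  $kernel_series"
```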
Linux distributions¶
There are currently over 600 active Linux distributions. Most distributions are based on one of three parent distributions: Slackware, Debian and Red Hat. Three other smaller parent distributions named Gentoo, SUSE and Arch Linux form the basis of a few other distributions, but we won't discuss them. The figure below shows a complete timeline3 of all the distributions and their origins. We will discuss some distinguishing factors between the principal three distributions and some of their children. Still, for the most part, these distributions are closely related, and several packages are shared among them.
A Linux distribution mainly consists of:
- The Linux Kernel
- System Drivers
- Software
- Graphical User Interface
- KDE Plasma, GNOME, XFCE, LXDE, and MATE
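To see which distribution, and which parent family, a machine belongs to, most modern distributions provide the file /etc/os-release. The sketch below assumes that file exists and falls back to "unknown" otherwise; the variable names are illustrative.

```shell
# /etc/os-release contains shell-style VAR=value lines, so it can be sourced.
# A subshell keeps the sourced variables out of the current session.
if [ -r /etc/os-release ]; then
    distro_id=$(. /etc/os-release && echo "${ID:-unknown}")
    distro_family=$(. /etc/os-release && echo "${ID_LIKE:-${ID:-unknown}}")
else
    distro_id="unknown"
    distro_family="unknown"
fi

echo "Distribution: $distro_id"
echo "Based on:     $distro_family"
```

On a Red Hat-derived system, the second line would typically mention rhel or fedora, and on Ubuntu it would mention debian.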
Slackware Linux¶
Slackware Linux distributions are probably the lesser-known distributions among the longer-existing distributions. In fact, the most well-known distribution moved away from this baseline in 1996.
- Released in 1993 and based on the Softlanding Linux System (SLS), which became defunct in 1994
- Community based
- Uses pkgtools (Collective Package Tools) as package manager
Popular distributions:
Slackware distributions are less known in the Linux Community. The best known distribution is SUSE, which was recreated in 1996 based on the Jurix Linux distribution (existed between 1993 and 1999).
A lot of Slackware distributions become defunct before the age of 20, with only 21 out of the 48 Slackware distributions still active today.
Debian Linux¶
Debian Linux distributions are mainly used by end-users. However, some servers in data centres use the LTS releases to extend the lifetime of the installed operating system. One will also notice that many technical support forums use Debian-based distributions (mainly Ubuntu) in their tutorials and examples.
- Released in 1993
- Community based
- Uses APT (Advanced Package Tool) as package manager
- Over 50 000 software packages
Popular distributions:
- Ubuntu - Mark Shuttleworth
- Ubuntu has several children distributions
- Linux Mint
- Knoppix
- Kali
Red Hat Linux¶
Red Hat Linux distributions are often used by systems engineers in data centres. The most prominent distribution in data centres used to be CentOS. The top 10% of the Fortune 500 runs over 50 000 instances of CentOS. China’s entire telecoms backend also runs on CentOS.
However, a massive controversy started when Red Hat’s principal solution architect, Magnus Glantz, stated that clones like CentOS are "making money off others' hard work".4 This, among other movements such as the acquisition by IBM, led to the creation of yet more clones, such as Rocky Linux and AlmaLinux.
It remains to be seen precisely how the market share of Red Hat will change, but in the Linux community, there seems to be a move over to Debian systems already.
Red Hat Linux is perceived as a very stable Linux distribution that goes through rigorous testing before each release. Official training and certification make it more appealing to enterprises.
- Released in 1994
- One of the first commercial distributions
- Most popular corporate distribution
- Exceeded $1 billion in revenue in 2012
- Acquired by IBM (Big Blue) in 2019 for $34 billion
- Uses Red Hat Package Manager (RPM)
- YUM (Yellowdog Updater, Modified)
- DNF (Dandified YUM)
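As a rough illustration of the package managers named in the distribution sections, the sketch below detects which one is present on the current machine. The candidate list and variable names are assumptions for this example; the commented commands are ordinary read-only queries and are not executed.

```shell
# Look for the package managers discussed above, newest Red Hat tool first.
package_manager="none found"
for candidate in dnf yum apt-get; do
    if command -v "$candidate" >/dev/null 2>&1; then
        package_manager="$candidate"
        break
    fi
done
echo "Package manager: $package_manager"

# Typical read-only queries on a Red Hat-derived system (not run here):
#   dnf search htop      # search the repositories for a package
#   dnf info bash        # show details about a package
#   rpm -q bash          # check whether a package is installed
```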
Most popular distributions:
Storage¶
Logical or Software RAID¶
One aspect to consider when installing an operating system on a server is the redundancy of storage devices. The best method of ensuring redundancy used to be a hardware RAID controller, which mirrors data over multiple hard drives. If the RAID is configured correctly, one or more drives can fail without data loss. The problem with a RAID controller is that the controller itself can fail. Without a replacement controller of the same model and specifications, the data could be unavailable for weeks until the issue has been resolved.
Linux also supports software RAID. This eliminates the dependency on RAID controllers at the cost of a slight performance impact, which most end-users will not notice, especially with modern multicore CPUs. Software RAID volumes also make it easier to add hard drives to the pool to grow its available size, whereas expanding the capacity of a hardware RAID sometimes entails replacing all the drives with larger-capacity drives. Software RAID can use hard drives of different sizes in the configuration. In hardware RAID, the hard drives in the same pool have to be the same size because the data is mirrored block for block to all the drives in the pool.
The most common way to pool drives in software on Linux is Logical Volume Management (LVM), followed by ZFS. ZFS was created by Sun Microsystems (later acquired by Oracle) for its Solaris operating system, with development starting in 2001. It has since moved to the OpenZFS Project and is supported by various Linux distributions. One can use the EXT4 file system on top of ZFS and have the ZFS pool expand the EXT4 file system as the pool expands. See the file systems section for more information.
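To inspect how the drives on a machine are organised, the following read-only sketch may help. It assumes the util-linux `lsblk` tool is installed and that the kernel exposes /proc/mdstat when the md (software RAID) driver is loaded; neither is guaranteed inside containers.

```shell
# List block devices; the TYPE column shows entries such as disk, part,
# lvm or raid1 when LVM or software RAID is in use.
if command -v lsblk >/dev/null 2>&1; then
    lsblk -o NAME,TYPE,SIZE,FSTYPE,MOUNTPOINT || true
fi

# /proc/mdstat summarises md software RAID arrays, if the driver is loaded.
if [ -r /proc/mdstat ]; then
    md_status=$(cat /proc/mdstat)
else
    md_status="md driver not loaded"
fi
echo "$md_status"
```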
File systems¶
Choosing the correct file system depends on the intended use of the system. Some file systems perform better with smaller files than others. Some file systems are easier to extend (grow) as storage requirements increase.
The primary considerations when choosing a file system include:
- Native support for the file system on the operating system
- Expansion of the file system, should it be required in future
- POSIX compliance
- General maintenance of the file system
- File size or limitations of the file system’s size itself
- Handling of small files, large amounts of files or directory structures
- Redundancy, should it be important
- Encryption, should it be important
In the Microsoft Windows space, the most supported file systems are:
- NTFS, FAT32, FAT16, FAT12, FAT, and MSDOS.
On Apple macOS, the main file systems used are:
- HFS+ and APFS
On GNU/Linux, most distributions support and use:
- XFS, ZFS, EXT4, EXT3, ReiserFS, EXT2, and EXT
The following table shows some of the technical features of the different file systems.
File System | Indexing5 | Journal6 | Extents7 | COW8 | Use case |
---|---|---|---|---|---|
ext2 | h-tree | | | | Legacy and not really in use |
ext3 | h-tree | | | | Legacy |
ext4 | h-tree | | | | Older standard but still in use |
xfs | b-tree | | | | Current Standard |
btrfs | b-tree | | | | Flexible and supports large files |
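To find out which of these file systems a machine actually uses, a sketch such as the following can be run. It assumes GNU `df` (the `-T` option prints the file system type, and `-P` keeps each entry on one line); the variable name is illustrative.

```shell
# Second line of the output describes the root file system;
# the second column is its type (for example xfs or ext4).
root_fs_type=$(df -PT / | awk 'NR==2 {print $2}')
echo "Root file system type: $root_fs_type"
```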
One often talks about gigabytes but technically means gibibytes when discussing file systems or file sizes, especially when copying files from one operating system to another.
The difference is that gigabytes (10⁹ bytes) are measured in decimal units, whereas gibibytes (2³⁰ bytes) are measured in binary units. A gibibyte is thus 73 741 824 bytes larger than a gigabyte, a difference of about 73.7 MB. This difference is not substantial at the gigabyte scale, but at larger scales, such as petabytes, the difference is about 125 terabytes. Imagine having to transfer files from one system to another, thinking that you have enough space, only to find out that you are a couple of gigabytes or terabytes short.
This is perhaps a bit too technical or “nitty gritty”, but just note the difference. When installing Linux and defining file systems, you will be working in gibibytes instead of gigabytes, etc.
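The figures quoted above can be checked with shell arithmetic. This is only a sketch; the variable names are illustrative, and the bit shift `1 << n` is used as a portable way to write 2 to the power n.

```shell
gigabyte=1000000000            # 10^9 bytes (decimal)
gibibyte=$((1 << 30))          # 2^30 bytes (binary)
gib_difference=$((gibibyte - gigabyte))

petabyte=1000000000000000      # 10^15 bytes (decimal)
pebibyte=$((1 << 50))          # 2^50 bytes (binary)
pib_difference=$((pebibyte - petabyte))

echo "GiB - GB = $gib_difference bytes"
echo "PiB - PB = $pib_difference bytes"
```

The first difference is 73 741 824 bytes (about 73.7 MB) and the second is 125 899 906 842 624 bytes (roughly 125.9 TB).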
The following table indicates the relative decimal and binary sizes and limitations of several file systems at that scale.
Name | Unit | Bytes in Decimal | Binary Name | Binary Unit | Bytes in Binary | File system within this maximum size |
---|---|---|---|---|---|---|
kilobyte | KB | 10³ | kibibyte | KiB | 2¹⁰ | |
megabyte | MB | 10⁶ | mebibyte | MiB | 2²⁰ | |
gigabyte | GB | 10⁹ | gibibyte | GiB | 2³⁰ | FAT16: 16GiB (256KB clusters, 4KB sectors) |
terabyte | TB | 10¹² | tebibyte | TiB | 2⁴⁰ | EXT2/3: 2-32TiB (1KiB block size - 8KiB block size) FAT32: 16TB (64KB clusters, 4KB sectors) |
petabyte | PB | 10¹⁵ | pebibyte | PiB | 2⁵⁰ | NTFS: 8PB |
exabyte | EB | 10¹⁸ | exbibyte | EiB | 2⁶⁰ | EXT4: 1EiB |
zettabyte | ZB | 10²¹ | zebibyte | ZiB | 2⁷⁰ | EXT4: 64ZiB (4KiB block size) (Theoretical) |
yottabyte | YB | 10²⁴ | yobibyte | YiB | 2⁸⁰ | EXT4: 1YiB (64KiB block size) (Theoretical) |
Notice
Excluded from this table is ZFS, which can theoretically store a staggering 2¹²⁸ bytes (281 474 976 710 656 YiB) on a volume, with a maximum of 2¹²⁸ (340 282 366 920 938 463 463 374 607 431 768 211 456, or roughly 340 billion billion billion billion) files on the file system.
SWAP Space¶
SWAP space is a dedicated storage location on a hard drive that temporarily holds volatile data that would normally reside in RAM, and it is used when the RAM becomes exhausted.
When a system is running out of RAM, the swap space prevents the system from crashing. SWAP space is much slower than RAM because it is stored on a drive. Even a Non-Volatile Memory Express (NVMe) drive or Solid-State Drive (SSD) can’t read and write as fast as RAM. It is therefore advisable not to depend too much on SWAP.
- Emulated Memory on a hard drive
- The rule of thumb was 2 x size of the RAM but nowadays, servers have 256GB and even a couple of terabytes of RAM
- Use the free, ps and vmstat commands to determine the actual need
- cat /proc/meminfo
- top
- (VIRT = virtual = total virtual memory the process can access)
- (RES = resident = memory currently in use)
- Page Cache: memory used to hold data read from secondary storage in primary storage, such as when reading files from disk. This is an important performance enhancement
- Virtual Memory is the total allocatable memory, also known as the Process Address Space (on a typical 64-bit x86 system with 48-bit addressing, each process has 128 TiB of user address space)
- SWAP is not Virtual Memory.
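The commands listed above can be combined into a quick check of how much RAM and SWAP a machine has. The sketch assumes a Linux system with /proc/meminfo, where the values are reported in KiB; the variable names are illustrative.

```shell
# Extract the totals (in KiB) from /proc/meminfo.
mem_total_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
swap_total_kib=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)

echo "RAM:  $((mem_total_kib / 1024)) MiB"
echo "SWAP: $((swap_total_kib / 1024)) MiB"

# free gives the same information, including current usage.
if command -v free >/dev/null 2>&1; then
    free -h
fi
```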
Here are some guidelines when choosing the size of SWAP:
- If you have 4GB of RAM or less: allocate 50% of the RAM size
- E.g. 4GB RAM system: allocate 2GB SWAP
- If you have more than 4GB of RAM: allocate 25% of the RAM size
- E.g. 256GB RAM system: allocate 64GB SWAP
- However, for scientific applications, 50% is still recommended for the nodes, up to 128GB of SWAP
- It could also be useful to experiment with swappiness:
- cat /proc/sys/vm/swappiness
- sysctl: vm.swappiness=xx
- I would recommend a value of 60 to 80 if you use an SSD
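The sizing guideline can be written down as a small helper. This is only a sketch of the general rule: the function name is hypothetical, a 4GB system is treated as belonging to the 50% case (as in the example above), sizes are whole gigabytes, and the separate recommendation for scientific applications is not modelled.

```shell
# Print the suggested SWAP size (in GB) for a given RAM size (in GB).
recommended_swap_gb() {
    ram_gb=$1
    if [ "$ram_gb" -le 4 ]; then
        echo $((ram_gb / 2))    # 50% of RAM (integer division)
    else
        echo $((ram_gb / 4))    # 25% of RAM (integer division)
    fi
}

recommended_swap_gb 4
recommended_swap_gb 256

# Read the current swappiness value, if the kernel exposes it.
if [ -r /proc/sys/vm/swappiness ]; then
    current_swappiness=$(cat /proc/sys/vm/swappiness)
    echo "Current swappiness: $current_swappiness"
fi

# Changing the value requires root privileges (not run here):
#   sysctl -w vm.swappiness=60
```

The two calls print 2 and 64, matching the examples in the list.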
Exercise 01¶
In these sessions, we will use the Red Hat-derived distribution called Rocky Linux.
An infrastructure will be provided, but installing a virtual machine running Rocky Linux would be beneficial for practice (and playing around).
The default installation should suffice for this course, but we will dive deeper into more specifics in the later sessions.
Using the local UFS mirror, you can download the installation ISO of the latest Rocky Linux release. Currently the latest major release is Rocky Linux 9.
Install Rocky Linux 9
Using a hypervisor or cloud service, install the latest version of Rocky Linux.
If you want to explore Linux as a desktop environment instead, install Fedora.
Note
The installation media is quite large (approximately 10GB for the full release). Download the correct ISO using a fast and uncapped Internet connection.
For Fedora, you may want to download the "Everything" ISO. For Rocky Linux, the minimal ISO should suffice.
Any hypervisor, such as Oracle VirtualBox, can be used. If you work in a computer laboratory, you may be limited in what you can install. Laboratory computers are also often limited in terms of system memory, especially GPU memory. If the installer refuses to start in graphical mode, this may indicate that you cannot run your Linux virtual machine with a graphical interface. Don’t despair. We only need a terminal (non-graphical interface) for this course and, as mentioned, an infrastructure will be provided for the remainder of the course.
You can also use Windows Subsystem for Linux (WSL) to install a Linux distribution, but it is out of the scope of this course. Several YouTube videos explain how to install, run and use Windows Subsystem for Linux. The following video should suffice to get you going.
Summary¶
One of the most notable features of Linux and its file systems is that commands, file names, variables, etc., are case-sensitive.
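A quick way to see case sensitivity in action is sketched below. It assumes the temporary directory lives on a case-sensitive file system, which is the norm on Linux; the file names are arbitrary.

```shell
# Create a scratch directory and three files whose names differ only in case.
demo_dir=$(mktemp -d)
touch "$demo_dir/notes.txt" "$demo_dir/Notes.txt" "$demo_dir/NOTES.txt"

# On a case-sensitive file system these are three distinct files.
file_count=$(ls -1 "$demo_dir" | wc -l)
echo "Files created: $file_count"

rm -rf "$demo_dir"
```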
GNU/Linux distributions¶
A GNU/Linux distribution consists of:
- The Linux Kernel
- System Drivers
- Graphical User Interface
- Tools
- Software
- Compilers
- Libraries
There are several hundred distributions, based mainly on Slackware, Debian and Red Hat.
Which distribution should be used:
- For desktop computers, good options include Ubuntu Desktop, Fedora, and Linux Mint.
- For server solutions, Rocky Linux or Ubuntu LTS are considered good choices.
- Kali Linux is the most widely used operating system for security and penetration testing (ethical hacking).
Storage¶
Software RAID is suitable for utilising multiple storage drives in a single file system. This will allow one to extend the file system later by adding additional drives to the pool.
When choosing a file system, depending on the use case, a file system such as XFS or ZFS should be used where possible. EXT4 (or even EXT3) is still an option and is sometimes used for more basic file systems such as boot partitions.
SWAP space¶
- SWAP space is used to write temporary data to a hard drive.
- SWAP space is much slower than System Memory (RAM).
Section 02¶
Notes from the previous session
- Please ensure that you sign the register.
- Some students asked what clusters are and what they are used for.
- This course is not compulsory; please email me if you can’t continue.
- Passwords are now preset and permanent. See the relevant events page for your information.
- Linux is primarily written in C.
- Therefore, most commands, parameters, file names, usernames, passwords, etc., are case-sensitive.
- This course does not count toward your degree, but the experience can set you above another candidate when applying for a job.
- The career path you can follow is diverse, but note that a path in HPC is challenging (especially in SA).
- The first class is the worst. From here on, we will do a lot of practical work.
- Please feel free to engage, but time might be limited to solve your specific issue on the spot.
- Today’s session is paramount; please ensure you can connect to the infrastructure.
Remote Access¶
One usually connects remotely to gain access to a GNU/Linux machine, that is, when the machine is not your own desktop or laptop computer and you don't have physical access to it. To connect, one uses a secure connection protocol called Secure Shell (SSH). The connection between the two computers is encrypted and safeguarded by authentication requirements.
With older versions of Windows, you needed a separate client to open an SSH session. This is no longer necessary: GNU/Linux, macOS and current Windows machines all include an SSH client. However, especially in Windows, it is often more convenient to install a client in which you can save your session information for regular reuse.
Several clients are available, of which some are licensed or commercial solutions. One open-source solution is PuTTY. PuTTY enables you to connect to remote devices over SSH or Telnet. You can save several profiles and perform port forwarding. PuTTY has been popular for several years and is still a good solution.
Another client is Tabby. Tabby is the client chosen for this course and is discussed in the next section.
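For reference, the built-in command-line client can open the same kind of connection without any additional software. The username, hostname and port below are hypothetical placeholders; use the details provided for your course. The sketch only prints the command so that nothing is contacted by accident.

```shell
# Hypothetical connection details: replace with the values provided to you.
ssh_user="student01"
ssh_host="login.example.org"
ssh_port=22

# Build the command: -p selects the port, user@host selects the account.
ssh_command="ssh -p ${ssh_port} ${ssh_user}@${ssh_host}"
echo "$ssh_command"

# To actually connect, run the printed command in a terminal.
```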
Installing Tabby ¶
Tabby is an Electron-based application that is rich in features. As the name may imply, Tabby has a window in which multiple session tabs can be opened simultaneously. Another differentiating feature is that Tabby can make serial connections to devices. Serial connections enable the configuration of devices such as routers or switches before they are accessible through other means. Another feature often unavailable in free clients is the built-in support for transferring files to and from the remote host.
Select the download link from Tabby's website. The current version is available from the GitHub download site. Various binaries are available from the GitHub site; for most Windows users, the tabby-XXXXXX-setup-x64.exe should work. In this case, the file tabby-1.0.211-setup-x64.exe should be used.
The default installation options will suffice. The only noteworthy setting is that, if you have administrative rights, you can install Tabby for all user profiles.
Create an SSH profile¶
Tabby allows you to connect ad hoc to remote machines. However, you may want to save your connection profiles for later reuse.
When Tabby opens, the default screen has an option for Profiles and Connections. This item is also available as a small icon of two windows on top of each other in the title or menu bar. Profiles and Connections can also be accessed as a sub-item in the settings menu.
- Select either one of the abovementioned options to open the Profiles and Connections menu
- If you are not in the settings window already
- Select the last item in the list: Manage Profiles
- At the top, click on the New button
- Select New profile
- A drop-down list will appear on which the new profile is going to be based
- Scroll down in the list and select SSH connection
- Give the session a descriptive Name
- e.g. HPC Training
- Now, click on the input box under Host
- Type in the login hostname provided for this course
- If a different port number was provided, set the Port Number accordingly
- Next, set the Username to the username that you want to log in with
- If the host you are connecting to is not managed by yourself, use the provided username
- If the host you are connecting to is managed by yourself, the username should be the name you chose during installation
- It is not a good idea to SSH directly as the root user; instead, log in as a regular user and then become root when needed
- By default, the Authentication method is set to Auto
- This is sufficient and will automatically select the best-known method to connect
- We will use a password for now, but you don't need to specify the Password now.
- We can leave this option as Auto and continue
- It is not recommended to click on Set Password if you are sharing the computer with others
- Finally, click on the Save button to save the profile
- You can click on the X next to the Settings tab in the Title/Menu bar to close the Manage Profiles settings
- The next time you open Tabby, you can select the profile by clicking on the double window icon and scrolling to the profile name you chose.
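OpenSSH's command-line client can store a comparable profile in a configuration file. The sketch below writes such an entry to a temporary file rather than to ~/.ssh/config, so that no existing settings are touched. The alias hpc-training, the hostname and the username are hypothetical placeholders.

```shell
# Write a one-host profile to a scratch configuration file.
config_file=$(mktemp)
cat > "$config_file" <<'EOF'
Host hpc-training
    HostName login.example.org
    User student01
    Port 22
EOF

cat "$config_file"

# The profile could then be used with (not run here):
#   ssh -F "$config_file" hpc-training
```

Placing the same block in ~/.ssh/config would make the alias available to every ssh invocation for that user.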
Connecting to a remote GNU/Linux machine¶
If you haven't done so, please first install Tabby.
For convenience's sake, it is also recommended to Create an SSH profile.
Select the profile you would like to connect with from the list. Following the instructions above, the profile should be saved at the top, under the ungrouped profiles. There is also an option to search for the session by typing either the name you provided or the hostname/IP address of the destination machine.
After selecting the machine, a connection will be opened to the destination host. If it is the first time connecting to the host, a popup message will be displayed, asking whether to accept the remote host's fingerprint. A fingerprint is a short hash of the SSH server's public host key, which is used to verify the authenticity of the connection. It verifies that the host you are connecting to is the host you expect to contact. This mitigates the risk of connecting to a malicious party attempting to capture your credentials.
The recommended option here is to select Accept and Remember key, so that you are warned if the host's key changes in future.
Depending on the authentication method, you will be prompted for your username, your password, or both.
Note
The password prompt might appear near the bottom of the window.
You should now be connected to the remote host and able to execute a few commands.
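To make the fingerprint concept from this section more concrete, the sketch below generates a throw-away key pair and prints the fingerprint of its public key. It assumes the OpenSSH `ssh-keygen` tool is installed; the file name demo_host_key is arbitrary and the key is deleted afterwards.

```shell
key_dir=$(mktemp -d)

if command -v ssh-keygen >/dev/null 2>&1; then
    # Generate an ED25519 key pair without a passphrase (-N "") and quietly (-q).
    ssh-keygen -q -t ed25519 -N "" -f "$key_dir/demo_host_key"
    # -l prints the fingerprint of the given public key; the hash is the second field.
    fingerprint=$(ssh-keygen -l -f "$key_dir/demo_host_key.pub" | awk '{print $2}')
else
    fingerprint="ssh-keygen not available"
fi
echo "Fingerprint: $fingerprint"

rm -rf "$key_dir"

# On a server, the administrator can print the real host fingerprint with:
#   ssh-keygen -l -f /etc/ssh/ssh_host_ed25519_key.pub
```

Comparing that value with the one shown in the popup is how a fingerprint is verified in practice.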
Exercise 02¶
We will install a Rocky Linux server on the virtual infrastructure for this class. The lecturer will provide the login details and how to connect to the (probably) web interface of the infrastructure. If you are not part of an official HPC class or want to practice, please use a different hypervisor, such as Oracle VirtualBox.
Practical
Please pay careful attention to the instructions, or you may fall behind and have little time to catch up.
There are checkboxes next to the instructions. This is only for your reference and does not affect the workflow.
Please make sure to click the checkbox (for your own reference) to indicate that you have completed that instruction.
Note
The instructions provided here will only work for the official HPC class 2024. Other courses may differ, and instructions may have to be amended.
The virtual infrastructure is only available on campus.
Log into the cloud infrastructure.
- Log in to the cloud interface.
The lecturer will provide the address and credentials to use.
- If the VM is switched off, turn it on by right-clicking on the VM's name and selecting Power On.
- Click on the VM's name to view the summary.
The VM may take some time to start up. To refresh the screen and see when the status changes, click on any other tab, such as Console, and then click back on the Summary tab.
- When the state is On, click on the More drop-down menu.
- Select Launch console to view the screen that is displayed on the VM. You can resize the window to scroll more easily. The default VM configuration should have the Rocky Linux DVD mounted and boot from the image.
Install Rocky Linux¶
If you have an active console to the VM, please follow these instructions to install Rocky Linux.
- Select the option to Install Rocky Linux.
- Select the language.
It is highly recommended that you select English and not your native language. It will make debugging and assistance from others easier.
We will go through the installation items from left to right, line by line.
- The first option is the Keyboard layout.
The default English (US) will work well for most SA keyboards.
- The next item is the Installation Source. This item can be left as is, installing from the local media.
- The first thing that needs our attention is the Installation Destination.
Click on this item to select and partition the hard drives we will use.
- Ensure that the hard drive(s) are selected.
For demonstration purposes, we will choose the custom option.
- Select Custom
- Click on the Done button.
By default, the installer will use Logical Volume Management (LVM).
- However, for our purposes, we will select the Standard Partition option.
This is due to a technical challenge that may be experienced later during the course.
It is fixable but may cause unnecessary effort.
- Click on the option to create the partition layout automatically.
This will select the most reasonable layout according to your machine's physical resources.
- Remove one gigabyte of storage from the / (root) partition.
- Add one gigabyte to the swap space for demonstration purposes.
When making modifications, you can click the Update Settings button to see the effect or simply click away on one of the other partitions.
If the value exceeds the available capacity, the maximum available space will be used as the partition size.
- When all the changes have been made, click on Done.
A popup window will ask for confirmation of the changes.
Note that at this point, the partition table of the selected hard drives will be erased and modified according to your selection.
- Click on Accept Changes to confirm this action.
- The next item in the installation wizard is the Language Support.
The default value should be acceptable if you selected English on the first screen.
- The next option that requires our attention is the Software Selection.
With the DVD as media, the default will be to install a server with a graphical user interface.
- Click on the Software Selection and select Minimal Install instead.
- Click on Done.
- The next item is KDUMP.
This option captures a dump of the system's memory to a file when the kernel crashes.
- We can turn off this option for our purposes.
- Click on Done.
- Click on Time & Date.
- Set the time zone by clicking on the region or selecting it from the drop-down.
- Enable the Network Time Protocol (NTP).
It is essential for all cluster members/nodes to have the same time. Any deviations, even at a microsecond level, can compromise the system's stability.
- Click on Done.
- The next option is to check our Network & Host Name configuration.
In the lab environment, the network address should be obtained automatically via DHCP.
However, one should manually set the IP addresses and other settings in a production environment.
- It is essential to have the IP address configured correctly, seeing that we will connect to the machine through the network and an incorrect configuration will not allow us to connect.
- The hostname can also be set on this screen, but it can be set later on should we need to modify it.
- The next step is to set a password for the Root user.
The root user has the highest level of administrator privileges for the system.
When selecting a password, it is essential to choose a strong password.
However, if no password is set, the Root user cannot log in directly.
This is a good option when several systems engineers must manage a server.
It also heightens security, as the account is deactivated without a password.
- The final task is the User Creation.
Note that you need to create a user with administrative rights if you deactivate the Root account.
It is best practice to create an account with administrative access and use that account to log in instead of directly logging in with the Root user.
For this reason, it is essential to choose a strong password in a production environment.
- Now, you can click on the button to proceed with the installation.
The installation will take a couple of minutes, depending on the software selection and hardware used by the system.
- After the installation, you must restart the system.
Connect to the newly installed machine¶
After installing the VM, you should check that the machine is accessible. The machine will only be accessible through a jump node.
We will use the jump node profile configured in the Create an SSH profile section.
Example login
The following simulation shows the expected behaviour.
ED25519 key fingerprint is SHA256:3pntj0KR1cT4MW2qcoU425nBUg/lWwCG7R/NSuCiX5E.
No matching host key fingerprint found in DNS.
This key is not known by any other names
yes
**************
whoami
admin
Example
Execute a couple of commands on the machine to verify that, in fact, it is the machine that you have configured:
The following simulation shows the expected behaviour.
hostname -f
trn-usr01
df -h /home
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 25G 1.4G 24G 6% /
uptime
15:39:09 up 1 min, 1 user, load average: 0.00, 0.00, 0.00
ip address
...
...
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 50:6b:8d:00:00:42 brd ff:ff:ff:ff:ff:ff
altname enp0s3
inet 172.21.0.142/24 brd 172.21.0.255 scope global dynamic noprefixroute ens3
...
...
For the remainder of this course, we will simply return the necessary output instead of the above simulations.
Destroy the newly installed machine¶
Danger
This section will "damage" the Linux system beyond repair.
Never execute these commands in a production environment.
Typically, one would format a physical machine when you want to reinstall it; this step will not be necessary for physical machines.
With virtual infrastructure, one can delete the VM and recreate it or delete the disk and create a new disk. However, some unique configurations may have been performed by the lecturer, which will be lost upon the destruction of the VM or hard drive itself.
Therefore, we will render the GNU/Linux OS unbootable without deleting or changing the VM itself.
- First, log into the VM.
- Ensure you are the regular user and not root by executing the `whoami` command.
- Execute the following command:
dd if=/dev/zero of=/dev/sda count=500
Example
The following result should be displayed:
The following section is destructive.
Now, become the root user by executing the command:
Reissue the command:
Shut down the VMs¶
It is essential to shut down your infrastructure while you are not working on the machines. If you use, for instance, Amazon EC2 and forget to shut down your machines, you will be charged their hourly rate and end up with an unnecessarily high bill.
Please perform these actions to shut down the machines.
Log into the cloud administration page. Right-click on your machine and select Power Off.
If you have multiple machines to switch off, you can click on the Select All box, select Actions, and select Power Off.
Note
Please always switch off your machines when you are not using them.
Section 03¶
Notes from the previous session
- The lecturer may have to reset passwords and VMs from time to time.
- If you no longer attend classes, please inform the lecturer to deregister you.
- User accounts are managed by a system that enforces intruder detection.
- Multiple incorrect attempts may lock your account.
- Ask the lecturer for assistance.
- The session regarding connecting to VMs is specific to this course.
- Usually, a hostname such as trn-usr01.xxxxxxdomain.com will not resolve.
- Graphical access to the VMs (such as the installer) is only available on campus.
- If you can, attempt to use the same seat/computer each week.
- This will make it easier, for instance, to access your specific Tabby session.
- The command `dd if=xxx of=xxx ...` was only for demonstration purposes.
Instructions¶
See the Information Page, or the provided instructions for login information.
Note
We are going to install software and make changes to the system. You must use your own VM or GNU/Linux installation and become root.
Log into the jump node (login.xxxxx)
Log into your VM as the admin user:
Software installation¶
Because most software in the GNU/Linux realm is open source, one can usually download, compile and install software relatively easily. To perform this, you need to download the source code, compile it using a compiler for the specific programming language (usually C), install the compiled code and libraries, change any file permissions required, and finally configure the package if it is a service that needs to run.
However, most systems engineers only install software from source code when necessary, for instance, when a specific version of an application is required or when the distribution's packages are dated. When you install software from source code, you must rebuild and reinstall it later to perform an update. This puts some maintenance overhead on your workflow and is less desirable when you have multiple systems to maintain. The alternative method is to use the distribution's package manager.
Software is installed on Rocky Linux (a Red Hat-derived OS) using Red Hat Package Manager (RPM) packages. An RPM contains pre-compiled binaries, the associated libraries, and configuration files. An RPM also contains metadata that indicates any files, permissions and dependencies a software package might require or provide. Finally, the RPM includes a section in which commands are automatically executed upon the package's installation, upgrade or removal.
To keep the download and installation of software as lean as possible, a package does not bundle its dependencies; they are only installed when requested. To install a package, one can use the `rpm` command. Note that when installing using the `rpm` command, if a dependency is not installed, the installation will fail until all dependencies are met.
Installing a package using the `rpm` command¶
In this section, we will download and install a package known to have few or no additional dependencies. We will install the `wget` package. The `wget` package can be used in a terminal to download files from the Internet. A similar package (`curl`) has already been installed; thus, `wget` is not technically required, but we will install it nonetheless for demonstration purposes. Execute the following command to install the `wget` package:
Querying a package using the `rpm` command¶
We used the `rpm` command in the previous section to download and install the required package automatically. We can also inspect installed packages on the system using the `rpm` command.
For more information, let's inspect the `wget` package using `rpm`:
Success
The above listing shows each file forming part of the `wget` package. The file of most importance is the binary (`/usr/bin/wget`) itself, but other comments were added about some of the other files.
As mentioned, we can see information about packages installed on the system. First, let’s see all the packages installed on the system.
The following (reduced) list should be displayed, showing each rpm package installed:
Success
You may notice the last entry in the list is the previous package we installed.
Let's get more information about the `wget` package itself. We already know the package version and the list of its files, but let's see more details about the package, such as its dependencies.
Success
Even though we installed the package without requiring any additional packages, from the above list, we can see that the package, in fact, had dependencies that were already met by our GNU/Linux installation.
Finally, let's see some descriptive information about the package:
Success
These are some of the package metadata used by other tools that we will use in the next section to determine which packages are required and provide a short description of the package itself.
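The queries from this section can be summarised in one short sketch. The original command blocks are not reproduced here, so the exact options below are an assumption based on the descriptions above; they are standard `rpm` query options. The `command -v` guard simply skips the queries on a system without `rpm`.

```shell
pkg=wget   # the package installed earlier in this section

if command -v rpm >/dev/null 2>&1; then
    rpm -q  "$pkg"              # is the package installed, and in which version?
    rpm -ql "$pkg"              # list every file that belongs to the package
    rpm -qa                     # list all packages installed on the system
    rpm -q --requires "$pkg"    # capabilities (dependencies) the package requires
    rpm -qi "$pkg"              # name, version, summary, description, and so on
else
    echo "rpm is not available on this system"
fi
```

All of these are read-only queries, so they can be run as a regular user.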
Get more help with the `rpm` command¶
Most commands in GNU/Linux have a `--help` option to see more information about a command. For instance, to see most of the options that are often used with the `rpm` command, execute the following command:
Info
The above command only shows some of the options. If you want more information or want to know what an option means, read the manual pages for the specific command.
Reading the manual pages of a command¶
In the above section, we gathered some information about a command by executing the command with the `--help` option. However, the list returned only includes some of the more often used options, with little to no description of each option. We can read the installed manual pages if we need more information (without resorting to an online source).
To read the manual pages for the wget command, execute the following command:
This will open an interactive manual page. The following "shortcuts" can be used to navigate the file:
Info
Installing MariaDB using `rpm`¶
Let's attempt to install a specific package called MariaDB using the `rpm` command. MariaDB is an open-source database used by various companies as an alternative to Oracle Database and Microsoft SQL Server.
We will install the client that is used to connect to remote servers.
Failure
From the feedback given by the `rpm` command, we see that the installation failed because various packages were required. The first mentioned requirement is /usr/bin/perl. This is an executable for the Perl package. After that, mention is made of mariadb-common, mariadb-connector-c and a few Perl modules.
Let’s try and install the mariadb-common package first:
Failure
The installation failed because we require a file named `/etc/my.cnf`. Unfortunately, the `rpm` command does not resolve which package provides the required dependency, and we need to download and install the package ourselves.
Knowing that this file is part of a package that forms part of the dependency list, let's try installing one of the dependencies first. One dependency (`mariadb-connector-c`) was mentioned in an earlier attempt. Let's install it and see if this resolves the issue:
Failure
The command also failed. Let’s add another dependency:
We have resolved three of the package dependencies. Let’s try installing the initial mariadb package again:
Failure
All the mariadb-xxx dependencies are resolved, but the Perl packages are still missing when installing the MariaDB client. Let’s try installing Perl:
Failure
We once more see that there are more dependencies. This is becoming a lot of effort to install one package. Fortunately, this is only an example to illustrate how dependencies can make software installation quite complex.
Installing MariaDB using `dnf`¶
In the previous section, we performed several commands to resolve the dependencies and install the MariaDB client. As we observed, this task is tedious and not the best way to install packages. Let’s use a different tool to perform this function.
Red Hat-based distributions traditionally made use of the Yellowdog Updater, Modified (YUM) package manager.
With the release of Red Hat Enterprise Linux 8, the YUM package manager was replaced (phased out) by Dandified YUM (DNF). Their function and operations are similar, so moving to DNF is relatively simple.
The primary benefit of using DNF instead of installing software manually is that DNF can use additional XML configuration files (metadata in repositories) to resolve dependencies on your behalf.
Let’s install mariadb using dnf:
Success
When executing the `dnf` command, you will notice that the metadata of three repositories (Rocky Linux 9 BaseOS, AppStream and Extras) was downloaded/refreshed first. This metadata indicates which packages are available in each repository and what their dependencies are. From this information, the requested packages are determined, and the packages that will be downloaded, installed, and possibly updated are listed.
When prompted to continue, type y and press enter. The download will commence. When the packages (RPMs) are all locally available, the wizard may ask you to import the GNU Privacy Guard (GPG) fingerprint. This ensures that the software being installed comes from a reputable source. In this case, it is the repository manager for Rocky Linux, so you can safely type y and press enter.
The result shows that 59 packages would be downloaded and installed. This saved us a lot of guessing and manual work.
Software repositories for `dnf`¶
We downloaded various packages from the Rocky (AppStream) repository in the previous section. However, sometimes, we need to add additional repositories to install software.
Let's try and install a helpful command tool called `htop`:
Failure
This installation failed because no known repository provides the package. From experience, I know that the repository required is named Extra Packages for Enterprise Linux (EPEL). This repository is often used and is directly available from the Rocky Linux repository.
Success
After installing the repository, we can attempt to install `htop` again:
Success
After installing the repository, we reissued the installation of `htop`. This time, the repository's metadata is downloaded, and the package is available for installation from the EPEL repository. Because we used the `-y` option, we were not prompted to accept the GPG fingerprint or to confirm the installation.
The `htop` package is now installed and can be used:
Success
The `htop` command displays various system statistics. Note the options at the bottom of the screen. Press F10 or q to quit.
DNF software repositories¶
We installed the EPEL repository in the previous section, but we are uncertain what this repository does and which software is available from it.
First, let's see what the `epel-release` package installs:
Info
Various files are installed from this package. However, we are interested in the files installed under `/etc/yum.repos.d/`. Let's investigate the contents of the `/etc/yum.repos.d/epel.repo` file:
Info
Line 1 defines the name of the repository. Line 2 describes what the repository is used for. Line 5 is commented out but indicates an example website from which the software can be downloaded. Let’s browse to an example EPEL repository hosted on kernel.org.
Another line of importance is line 7, indicating that this repository is active and will be used to search for software. In contrast, lines 18 and 28 suggest that those repositories are disabled.
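The file listing itself is not reproduced here, but the same information can be pulled out of the repository files on any RPM-based system. The following is a minimal sketch that assumes the default `/etc/yum.repos.d` location; it prints only the repository IDs and their `enabled=` lines, then asks `dnf` for its own view.

```shell
repo_dir=/etc/yum.repos.d

if [ -d "$repo_dir" ]; then
    ls -l "$repo_dir"
    # Show only the section headers ([repo-id]) and the enabled= lines,
    # each prefixed with the file it comes from
    grep -E '^\[|^enabled=' "$repo_dir"/*.repo
fi

# dnf can report enabled and disabled repositories itself
if command -v dnf >/dev/null 2>&1; then
    dnf repolist --all
fi
```

A repository with `enabled=1` is searched for software; one with `enabled=0` is defined but ignored until it is enabled.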
Some valuable options for `dnf`¶
In earlier code blocks, we attempted to install dependencies, which were difficult to determine. For instance, we required a package that provides a specific file. If you don't know which package provides a particular file, you can use the `dnf` command to determine it.
Info
The above result indicates that two packages (in this case, one package stored on two different repositories) provide the particular file.
One can also use wildcards with the asterisk (*). For instance:
The provides option only indicates packages that form part of a dependency. If you want to search for a string, for instance, the description of a package, you can use the search feature.
Info
With the search option, we get the two additional packages (ebranch.noarch and fedrq.noarch), which do not contain epel in their name but in the description.
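The lookups described above could be issued as follows. This is a sketch: the file name `/etc/my.cnf` is the one from the earlier MariaDB attempt, and `epel` is the search term mentioned above; the exact commands used for the original output are an assumption.

```shell
wanted_file=/etc/my.cnf

if command -v dnf >/dev/null 2>&1; then
    dnf provides "$wanted_file"    # which package ships this exact file?
    dnf provides '*/my.cnf'        # wildcard: any path ending in /my.cnf
    dnf search epel                # search package names and summaries
fi
```

The wildcard is quoted so that the shell passes the asterisk to `dnf` instead of trying to expand it against local file names.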
Another helpful option is the `info` option. Unlike the `rpm -q --info htop` command, the `dnf info htop` command will show information for packages, whether or not they are installed. For instance:
Info
Finally, the `dnf` command can be used to upgrade a package, or the whole system:
Success
The above result indicates that various packages need to be upgraded. Installing updates regularly or before installing a new package is good practice. Depending on the Internet speed, downloading and installing could take a while. In this example, the download took 22 seconds, and the entire transaction took just under two minutes.
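As a hedged sketch of the last two options: `dnf info` and `dnf check-update` are read-only, while the upgrade itself changes the system and needs root, so it is left commented out here. `htop` is only used as the example package.

```shell
example_pkg=htop

if command -v dnf >/dev/null 2>&1; then
    dnf info "$example_pkg"    # works whether or not the package is installed
    dnf check-update           # list pending updates without installing them
    # As root, apply the updates:
    # dnf upgrade                  # the whole system
    # dnf upgrade "$example_pkg"   # a single package
fi
```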
Exercise 03¶
- Install a package that provides a binary file called fortune
- Install the Apache HTTP server package using dnf
- After installing the service, start it using the command:
By default, the firewall blocks the service. The management of the firewall is described later.
- For now, we can simply stop the firewall:
Test the connection to your web server using your local Web browser and opening the page as specified in the instructions page.
Note
To make this work, a lot of "magic" is done on your behalf.
If you get an error "503 Service Unavailable", check the following:
- You have specified the correct URL.
- The service is started.
- The firewall is stopped, or port 80 is opened in the firewall.
If the test page displayed correctly, create a basic web page with the current date, etc.
Danger
We disabled our firewall for now.
When testing is done, restart the firewall:
Reminder
Remember to switch off your VM!
Section 04¶
Notes from the previous session
- The lecturer may have to reset passwords and VMs from time to time.
- Please attempt to keep up.
- If you are stuck at a point, read the instructions again; you may solve your own issue.
- If you are able to complete a task, please assist students next to you if they struggle.
- Only one student switched off their VM.
- This is what happened to a test machine during the week:
Instructions¶
See the Information Page to log into the cloud infrastructure. For this session, we will be working directly on the VM terminal. Usually, we would use an SSH session to access the remote host, but to ensure that you type the commands out (better practice) and do not simply copy and paste them, we will use a terminal session instead.
Note
We are going to execute commands directly in the terminal.
You must use your own VM or GNU/Linux installation.
- Open the cloud infrastructure
- Select your VM
- Click on the Open Console button/option at the top
- A terminal is already open
- Become the normal user `admin`:
Note
Just note, if you logged into the system yourself, the terminal would have looked like:
Also note: While typing your password, it will not be displayed on-screen.
Overview¶
Until now, we somewhat mindlessly followed instructions and executed commands without much thought. In this section, we will delve deeper and gain a better understanding of the GNU/Linux operating system by using the command line more efficiently.
To learn some of the commands and observe their effect or workings, we may have to use or learn some commands simultaneously.
In this section, we will start familiarising ourselves with some of the most used command-line tools and methods to navigate GNU/Linux and interact with files. We will begin with the more basic commands and progress to a point where the commands and concepts may seem challenging. However, following the instructions and going back later, you will learn concepts often invaluable to researchers and power users to transform data, use advanced toolsets to perform data analysis, etc.
The terminal and command line¶
The terminal or command line of GNU/Linux encapsulates the power of Linux. Almost every action that can be performed on Linux can be performed without a graphical interface. The benefit of using a terminal instead of a graphical interface is that with a graphical interface, one often makes errors or ineffective gestures to achieve a task. In the terminal, one can be more consistent and perform actions more repetitively, which is more scientific in reproducing a procedure.
In a later session, we will look at writing scripts to perform specific actions more effectively. To illustrate, think about how you would create a user using a web-based system. That is probably quite simple. But what about performing that same action, say, ten times for ten different user accounts? It becomes challenging but still doable. What about a hundred or even a thousand? The rate of errors will be significant, and the switching between the source and the web-based system will be tedious. But more about that in the later session.
As you may know, GNU/Linux is case-sensitive in most aspects. Ensure that the commands in this section are typed correctly, taking extra caution with the cases and sequence of execution.
Important concepts of the terminal/command line:
- Case sensitivity
  Commands, options, parameters, file and directory names
- The number of spaces/tabs between options is of no significance
- Long commands can continue over multiple lines by ending the line with a backslash ( \ )
- The order of parameters (and sometimes options) for a command usually does not matter
  For instance, `ls -l -a` is the same as `ls -a -l`
  `ls -l -a /var/` can also be executed as `ls /var/ -a -l`
- Single-letter options may be combined for most commands
  For instance, `ls -l -a -r -t` can be combined as: `ls -lart`
- Completion and auto-completion
  Using the Tab key, one can complete commands, filenames, or paths
  A useful package named `bash-completion` can extend this behaviour further
- Help about a command can be seen for most commands using the --help option
- The manual pages of a command can be seen by placing the `man` command in front of the command name
- While typing a command, one can press Ctrl+C to cancel the input and the command
- To clear the screen, Ctrl+L can be pressed at any time, even while typing a command
- A file and a directory in the same directory can't be named the same
- A history of executed commands is kept
  One can press the Up/Down arrow to scroll through them
  Use Left and Right arrows to modify the line if needed
  Press Enter to execute the command as displayed
- If you typed a path for one command, you can reuse the path by typing Alt+. in the next command.
Example:
File system structure¶
In the first section, we briefly looked at the layout of the file system. To refresh your memory, the following table indicates some of the more pertinent directories. We will only look at the top-level directories and not go too deep into the sub-directories. However, looking into some of the sub-directories may be helpful.
Directory | Description |
---|---|
/ | Root Directory ..... much like c:\ |
/bin | Most system-wide executables like c:\windows |
/boot | Grub boot partition with kernel images |
/etc | System-wide configuration files |
/home | Holds users' home directories |
/lib(64) | System libraries ..... much like dll files |
/mnt | Can be used to mount external drives etc. |
/proc | System devices (Hardware) and processes |
/root | Root user's home directory |
/run | Some temp files for services & processes |
/tmp | Temporary files |
/usr | Usually installation path of some apps/databases |
/var | Files/databases/logfiles that change a lot |
File/Directory permissions¶
With a POSIX-compliant file system, file permissions and ownership are essential. The `ls -l` command can be used to see the ownership and permission of a file/directory.
To explain our observation, we will look at a long listing for one directory. We will deconstruct each entry into different sections to determine what each indicates.
Example
The first character, (d), indicates this is a directory. The character would have been a minus (-) for a file. Another option often observed is a lowercase l, suggesting a link.
The character is followed by three groups of three characters (rwx rwx r-x). Each of these groups shows the permissions (in order) for:
Owner, Group owner, and World.
The owner is the person who created or owns the file/directory and usually can modify the object. In this example (drwxrwxr-x), the first group (rwx) shows that the owner can Read, Write, and Execute the directory.
The next group (rwx) indicates that the group that owns the directory can also read, write, and execute the directory.
The last group (r-x) indicates that the world (everyone else) can read and execute the directory, but there is no write permission.
To enter a directory, you need execute permission on it; to list its contents as well, you also need read permission.
The permissions are then followed by a number (6 in this case). This is the number of hard links to the object; for a directory, it typically equals two plus the number of its subdirectories.
Next, we observe the name `root`. This is the owner of the file/directory. Then, we observe the name `chrony`. This is the group to which the group permissions refer.
After that, the size and a date and time follow. The date and time indicate when the file was last modified. Finally, the name of the directory is displayed.
The above structure is used throughout the system.
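To see the permission string change, one can experiment in a scratch directory. The sketch below uses `mktemp` and `chmod`, which have not been introduced in this course yet, and the directory and file names are hypothetical; it only illustrates how the three groups of three characters map to owner, group and world.

```shell
demo=$(mktemp -d)                      # scratch directory, safe to experiment in
mkdir "$demo/shared"
chmod 775 "$demo/shared"               # rwx for owner, rwx for group, r-x for world
ls -ld "$demo/shared"                  # the line starts with drwxrwxr-x

touch "$demo/shared/notes.txt"
chmod 640 "$demo/shared/notes.txt"     # rw- for owner, r-- for group, --- for world
ls -l "$demo/shared/notes.txt"         # the line starts with -rw-r-----

# Keep only the first ten characters: the type and the nine permission bits
perms=$(ls -ld "$demo/shared" | cut -c1-10)
echo "$perms"
```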
Tip
Almost 99% of the time, when a service or user complains about not being able to find/access a file, it is a permission error.
Commands¶
We will look at various commands often used. Participants should execute these commands to experience their working.
Note
We will only look at a few commands.
A minimal Rocky 9 installation has ± 1 150 commands.
This can be verified by pressing tab twice in the terminal: Tab+Tab
Change the Directory (`cd`) and determine the Present Working Directory (`pwd`)¶
The `cd` command is used to change directories. The basic usage is the `cd` command followed by the destination path. For instance, execute the following command to move into the /tmp directory.
To see where in the file system you are currently, execute the `pwd` command.
Note
In the following code block, execute the `pwd` command after each `cd` command to see the directory you are in.
You can also execute the `ls` command to see a listing of the files/directories in the current directory.
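A short sequence along these lines could look as follows. It is a sketch of typical `cd` usage, not necessarily the exact commands of the original exercise.

```shell
cd ~         # the tilde is shorthand for your home directory
pwd
cd /tmp      # absolute path: starts with a slash
pwd          # /tmp
cd ..        # relative path: one level up, which is /
pwd          # /
cd -         # jump back to the previous directory (its name is printed)
pwd          # /tmp
```

Running `cd` without any argument also returns you to your home directory.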
Listing files/directories (`ls`)¶
We have been using the `ls` command in various examples, but let's see the use of `ls` in more depth.
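The following sketch shows some commonly used `ls` options. The paths are only examples that exist on most GNU/Linux systems.

```shell
ls /                   # plain listing of the top-level directories
ls -l /etc/passwd      # long listing: permissions, owner, group, size, date
ls -a ~                # include hidden entries (names starting with a dot)
ls -lh /etc/passwd     # long listing with human-readable sizes
ls -lart /tmp          # long, all, reverse, sorted by time: newest entry last
ls -ld /tmp            # describe the directory itself, not its contents
```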
Make a directory (mkdir)¶
The `mkdir` command is used to create directories and subdirectories. One does not need to be in the directory in which you want to create a subdirectory.
Note that in the last `mkdir` example, a list was provided and separated by commas. Furthermore, note that the list starts with a comma, indicating that the first item is empty. So, in this case, the first directory to be created was parent/. The second would have been parent/with; the third would be parent/multiple, and so on.
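A sequence of this kind can be tried safely in a scratch directory. This is a sketch: `mktemp` is used only to get a throwaway location, the names `projects` and `children` are hypothetical, and the comma-separated list relies on brace expansion, which is a Bash feature.

```shell
demo=$(mktemp -d)      # scratch directory so nothing else is touched
cd "$demo"

mkdir projects                      # one directory in the current directory
mkdir -p projects/2024/data         # -p also creates missing parent directories
mkdir "$demo/projects/archive"      # a full path works from anywhere

# Brace expansion: Bash turns this into
#   mkdir -p parent/ parent/with parent/multiple parent/children
mkdir -p parent/{,with,multiple,children}

ls -R "$demo"
```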
Remove files and directories (`rm`)¶
Removing files or directories is done using the `rm` command. Care should be taken when removing files, especially when removing directories. GNU/Linux does not use a recycle bin when removing files.
Finally, the directory was removed. Again, care should be taken when removing files and directories, especially when removing directories recursively and even more so when the forceful option is used.
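The steps can be rehearsed without risk in a scratch directory. The sketch below uses hypothetical file names and only ever removes things it created itself.

```shell
demo=$(mktemp -d)      # scratch directory created just for this test
cd "$demo"
mkdir -p olddata/sub
touch notes.txt olddata/a.txt olddata/sub/b.txt

rm notes.txt           # remove a single file
rmdir olddata 2>/dev/null || echo "rmdir refuses: olddata is not empty"
rm -r olddata          # remove the directory and everything below it
ls -A "$demo"          # prints nothing: the scratch directory is empty again

cd /
rmdir "$demo"          # rmdir only removes empty directories
```

Adding `-i` to `rm` asks for confirmation before each removal, which is a useful habit while learning.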
Danger
The most dangerous command would be to execute the command `rm -rf /` as the root user.
This command can destroy the filesystem and operating system in seconds.
Copying files/directories (`cp`)¶
The `cp` command is used to copy files and directories on a local or removable file system.
Note
After executing each of the following commands, execute the `ls` command to see the effect.
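A minimal sketch of typical `cp` usage follows. It copies into a scratch directory instead of your home directory, and the target names are hypothetical.

```shell
demo=$(mktemp -d)
cd "$demo"

cp /etc/passwd .                  # copy a file into the current directory
cp passwd passwd.bak              # copy it again under a new name
mkdir backup
cp passwd passwd.bak backup/      # several sources: the last argument is the target directory
cp -r backup backup-copy          # -r copies a directory with its contents
cp -p passwd passwd.keep          # -p preserves permissions and timestamps
ls -lR
```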
View the content of files¶
We have copied files to our home directory in the previous section, but we would like to see the content of these files. The first option is to view the top part of the file using the `head` command. By default, the `head` command shows the first ten lines of a file. This is useful, especially when viewing a CSV file.
The opposite command also exists: the `tail` command shows the last lines of a file (ten by default). This command is helpful, especially when monitoring a file in which new entries are appended to the end of the file.
The next command shows the entire content of a file. The concatenate (`cat`) command prints a file's content on-screen. The `cat` command has other uses, but we will return to those later.
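The three commands can be compared on a small sample file. This sketch creates a file with the numbers 1 to 20 (using `seq` and the `>` redirection, which is explained later in this section) so that it is obvious which lines each command prints.

```shell
demo=$(mktemp -d)
seq 1 20 > "$demo/numbers.txt"    # sample file: the numbers 1 to 20, one per line

head "$demo/numbers.txt"          # first ten lines (1 to 10)
head -n 3 "$demo/numbers.txt"     # first three lines only
tail "$demo/numbers.txt"          # last ten lines (11 to 20)
tail -n 4 "$demo/numbers.txt"     # last four lines (17 to 20)
cat "$demo/numbers.txt"           # the whole file

# Follow a growing log file; stop with Ctrl+C:
# tail -f /var/log/dnf.log
```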
We noticed that the content was displayed on the screen, but a larger file, such as the log file, scrolled over the screen at an unreadable pace. We can use the `more` command to view the content better.
With `more`, you can use the spacebar to scroll down a page, but `more` can only scroll in one direction. We can use the `less` command to scroll up and down a file. The `less` command also allows you to search for terms by typing a forward-slash directly, followed by the term you want to search for.
Use the following to navigate a file using the `less` command.
Info
Up/Down to scroll up and down
Page Up/Page Down, scrolls the page up or down
Space, skip to the next page
G = go to the end of the document
lowercase G = go to the top of the document
/term = search for term (case-sensitive) in the document
lowercase N = repeat the search (next occurrence)
N = search backwards (previous occurrence) for the term specified
-i = toggle case-sensitivity for searches
Q = quit the less command
Explore the `/var/log/dnf.log` log file and browse around.
Scroll up and down the document, go to the end of the log and thereafter back to the top.
Search for a term.
Search for the next occurrence of the term.
Search for the next occurrence of the term.
Search for the previous occurrence of the term.
Quit.
Looking for text in a file using `grep`¶
The `grep` command (the name comes from "globally search for a regular expression and print") can be used to search a file for text matching a specific pattern. The command is often used to filter text as it is streamed to the screen, but this will be demonstrated later. A different section will also explain the description and use of regular expressions.
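A few common `grep` options are sketched below on a small, made-up log file. The file name and the log format are hypothetical; the here-document (`<<'EOF'`) is only used to create the sample data.

```shell
demo=$(mktemp -d)
cat > "$demo/sample.log" <<'EOF'
2024-03-01 INFO  Installed: wget
2024-03-01 DEBUG cache refreshed
2024-03-02 INFO  Installed: htop
2024-03-02 ERROR install failed: mariadb
EOF

grep Installed "$demo/sample.log"       # lines containing "Installed" (case-sensitive)
grep -i install "$demo/sample.log"      # -i ignores case, so "Installed" matches too
grep -v DEBUG "$demo/sample.log"        # -v inverts: lines NOT containing DEBUG
grep -n ERROR "$demo/sample.log"        # -n prefixes each match with its line number
grep -c 2024-03-02 "$demo/sample.log"   # -c counts the matching lines

matches=$(grep -c -i install "$demo/sample.log")
echo "$matches lines mention install"
```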
Executing a command in another command¶
It is sometimes helpful to execute a command and use that output within another command. For instance, we may want to print a line to the screen with a short message and the current date. This can be done as follows:
In a previous section, we used the `rpm` command to determine which package provided a specific file. Using the above method, we can, for instance, determine which package provides a file `/etc/passwd` and, in the same command, determine more information about the package itself.
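Both ideas can be sketched with command substitution, written as `$(command)`: the shell runs the inner command first and places its output into the outer command line. The exact commands of the original examples are not reproduced here, so the ones below are an assumption.

```shell
# The output of date is inserted into the message
echo "The current date is: $(date)"

# Substitutions can be stored in a variable and combined
msg="Logged in as $(whoami) on $(uname -n)"
echo "$msg"

# Which package owns /etc/passwd, and what does rpm know about that package?
if command -v rpm >/dev/null 2>&1; then
    rpm -qi "$(rpm -qf /etc/passwd)"
fi
```

In older scripts, the same substitution is often written with backticks instead of `$( )`.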
Using output as input for another command¶
Up to this point, all the output of the commands was displayed on the screen. However, we can use the output of one command and feed that into another command. The output of one command can be passed to the next command using the pipe ( | ) character.
For instance, we can use the grep command to search for all lines that contain the word install; we can use that list and further search, for example, for the word kernel. The following command will achieve this:
In a previous example, we executed the grep command to search for specific dates in a file; we can use that command again and uniquely sort the results to print only the unique dates. It is easier to build the command step by step. For instance, to get the result we want, we may execute the following:
The power of this type of manipulation will become more evident as we progress, especially when using regular expressions and the `sed` command.
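Building a pipeline step by step could look like this. The sketch uses a small, made-up log file instead of `/var/log/dnf.log`, and `grep -o` (print only the matching part) to extract the dates; both are assumptions, not the original commands.

```shell
demo=$(mktemp -d)
cat > "$demo/sample.log" <<'EOF'
2024-03-01 INFO install kernel-core
2024-03-01 INFO install wget
2024-03-02 INFO remove htop
2024-03-02 INFO install kernel-modules
2024-03-02 INFO install mariadb
EOF

# Lines containing "install", and of those, only the lines containing "kernel"
grep install "$demo/sample.log" | grep kernel

# Step 1: print only the date at the start of each line
grep -o '^[0-9-]*' "$demo/sample.log"
# Step 2: sort the dates
grep -o '^[0-9-]*' "$demo/sample.log" | sort
# Step 3: drop the duplicates
grep -o '^[0-9-]*' "$demo/sample.log" | sort | uniq
# Step 4: count what is left
unique_dates=$(grep -o '^[0-9-]*' "$demo/sample.log" | sort | uniq | wc -l)
echo "$unique_dates"
```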
Redirecting output to a file¶
One often wants to save the output of a command into a file. This can enable you to keep track of records for preservation or to act upon later. In this section, we will execute various commands that have not yet been discussed, but their operation will be noted here.
As mentioned in the previous section, the output of a command can be processed by another command using the pipe | character. To redirect the output of a command to a file, we can use the greater than character >
Let’s use the echo command to display a message and redirect it to a file:
You can also execute a combination of redirection of output to other commands and then output the final result to a file as follows:
The file (if it exists) will be overwritten whenever we use the greater-than sign. To append the results to the end of the file instead, use a double greater-than sign >>
Note
Execute the following commands using the Up arrow, pressing Enter and repeating the command as indicated.
After that, `cat` the resulting file to inspect the file:
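The difference between `>` and `>>` can be seen in a short sketch. The file names are hypothetical and everything is written to a scratch directory.

```shell
demo=$(mktemp -d)
out="$demo/message.txt"

echo "first run"  > "$out"     # > creates the file, or overwrites it if it exists
echo "second run" > "$out"     # "first run" is gone: the file was overwritten
cat "$out"

echo "third run" >> "$out"     # >> appends to the end of the file instead
date >> "$out"
cat "$out"

# Pipes and redirection combine: only the final result lands in the file
grep -v '^#' /etc/passwd | sort | head -n 3 > "$demo/first-users.txt"

lines=$(cat "$out" | wc -l)
echo "$out now has $lines lines"
```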
Exercise 04¶
- Return to your home directory
- Create a directory and a subdirectory under it called `exercise04/files`
- Print a single line that specifies which package provides the file `/var/log/dnf.log`
  It should read (don't hardcode the output):
- Print the same message to a file: `~/exercise04/files/dnf`
- Copy the file `/var/log/dnf.log` into the `~/exercise04/files` directory
- Print (on-screen) all the lines from this file that contain "ssh"
- Print (on-screen) all the lines from this file that contain "ssh" and the word "install"
- Print (on-screen) the same result, but exclude lines containing "DEBUG"
- Produce this result again, but this time redirect the output to a file called: `~/exercise04/files/result`
- Print the last four lines from the file `/etc/passwd` into the `~/exercise04/files/result` file without overwriting the file
Reminder
Remember to switch off your VM!
Section 05¶
Notes from the previous session
- The lecturer may have to reset passwords and VMs from time to time.
- Please attempt to keep up.
- If you are stuck at a point, read the instructions again; you may solve your own issue.
- If you are able to complete a task, please assist students next to you if they struggle.
- Several students didn't switch their VMs off.
- Current Amazon costing is ± R100 per VM per day.
- This would have cost the UFS R 9 800.00 for the week.
Instructions¶
See the Information Page to log into the jump node.
Note
We will all be working on this machine for today’s class.
- Log into the jump node
- Copy the file /opt/example05/examples.log to your home directory
- Change into your home directory
- Attempt to execute the commands that the instructor is executing
Overview¶
In this session, we will look at regular expressions and how to use them to search for specially formatted text.
We will investigate environmental variables, conditional statements, basic loop operators, and while loops.
Regular expressions¶
Regular expressions (regex) describe a pattern that tools use to search for, print, or change matching text.
The commands most often used with regular expressions are grep, sed and awk. We will only look at the grep command, but the usage for other tools, such as sed and awk, is similar.
In some examples in the previous sections, we used the grep command to search for text in a file or output matching a specific pattern.
We looked, for instance, for text starting with a particular date or ending with a specific word.
Let's see how we can define or narrow our search for other cases. To effectively use regex, we need to know how a regex is constructed and which components can be used.
We will look at anchors, character classes, POSIX mnemonics, quantifiers, and grouped matches to build regular expressions.
Anchors¶
As the name may suggest, an anchor indicates the position at which a pattern should be matched. The following table shows the anchors with their meanings; examples follow.
Anchor | Description |
---|---|
^ | Beginning of a line. |
$ | End of a line. |
Anchors for the beginning and ending of strings also exist, but we will not use those in this course.
Example
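A minimal sketch of both anchors with grep. The sample lines are hypothetical and are written to a temporary file:

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf '%s\n' \
  '2024-01-10 service started' \
  'backup finished on 2024-01-10' \
  '2024-01-11 service stopped' > "$sample"

starts_with_date=$(grep '^2024-01-10' "$sample")  # ^ anchors at the line start
ends_with_stopped=$(grep 'stopped$' "$sample")    # $ anchors at the line end

echo "$starts_with_date"
echo "$ends_with_stopped"
rm -f "$sample"
```

The second sample line contains the date as well, but not at the start of the line, so the first search skips it.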
Character classes¶
The following character classes can be used to search for a pattern:
Characters | Description | Example |
---|---|---|
[abc] | A single character | a, b, or c |
[a-h] | A character range | a, b, c, d, e, f, g, or h |
[^a-f] | A character other than a, b, c, d, e, and f | g, h, i, j, k, l, ..., 123456, !@#$%^&... |
[^aeiou] | A character that is not a lowercase vowel | b, c, d, f, ..., 123, !@#$%^&... |
\w | A word character (letter, digit, or underscore) | The characters in 007, hat, cat, log, dog, etc. |
\s | A whitespace character | Space, tab, or form feed |
\d | A digit | 0-9 |
Note
The capital versions of w, s, and d are the negation of their lowercase meanings.
For instance, \S is any non-whitespace character.
Example
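A minimal sketch of bracket expressions with grep, using hypothetical sample words. The digit test uses [0-9] rather than \d, because GNU grep only understands \d in its Perl-compatible mode (-P):

```shell
# Hypothetical sample words, one per line.
sample=$(mktemp)
printf '%s\n' 'cat' 'cot' 'cut' 'c4t' 'c t' > "$sample"

a_or_o=$(grep -c 'c[ao]t' "$sample")        # a or o in the middle
no_vowel=$(grep -c 'c[^aeiou]t' "$sample")  # anything except a lowercase vowel
has_digit=$(grep 'c[0-9]t' "$sample")       # a digit in the middle

echo "a or o: $a_or_o, no vowel: $no_vowel, digit: $has_digit"
rm -f "$sample"
```

Note that the negated class also matches the space in "c t"; a negation means any other character, not any other letter.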
POSIX mnemonic¶
The POSIX mnemonic classes are used to describe a pattern classification.
A mnemonic class technically defines a class of characters in a more readable/memorable fashion.
POSIX mnemonic | Description | Example |
---|---|---|
[:alnum:] | Alphanumeric characters | a-z, A-Z, 0-9 |
[:alpha:] | Alphabetic characters | a-z, A-Z |
[:blank:] | Blank characters | Space and tab |
[:digit:] | Numerical digits | 0-9 |
[:cntrl:] | Control characters | \a alert bell, \b backspace, \n newline, etc. |
[:graph:] | A visible and printable character | a-z. A space, for instance, is printable but not visible |
[:lower:] | Lowercase characters | a-z |
[:print:] | A printable character that is not a control character | a-z, A-Z, 0-9, !@#$%^&.... |
[:punct:] | Punctuation characters | !@#$%^&()_-+=.?<>/ |
[:space:] | Whitespace characters | Space, tab, newline, form feed |
[:upper:] | Uppercase characters | A-Z |
[:xdigit:] | Hexadecimal digits | 0-9, a-f, A-F |
To use the mnemonic classes in a pattern, enclose the class in an additional set of square brackets;
for instance, [:lower:] becomes [[:lower:]]
Example
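A minimal sketch of POSIX classes inside bracket expressions, using hypothetical sample lines. Each class is wrapped in its extra pair of square brackets:

```shell
# Hypothetical sample lines; the fourth line contains only spaces.
sample=$(mktemp)
printf '%s\n' 'alpha' 'BETA' 'gamma42' '   ' '#comment' > "$sample"

starts_upper=$(grep '^[[:upper:]]' "$sample")   # line starts with an uppercase letter
with_digit=$(grep '[[:digit:]]' "$sample")      # line contains a digit
starts_punct=$(grep '^[[:punct:]]' "$sample")   # line starts with punctuation
starts_space=$(grep -c '^[[:space:]]' "$sample") # number of lines starting with whitespace

echo "$starts_upper / $with_digit / $starts_punct / $starts_space"
rm -f "$sample"
```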
Quantifiers¶
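A quantifier specifies how many times the preceding character, or set of characters, must occur. The period is included in the table as well, although it is strictly a wildcard rather than a quantifier.
Quantifier | Description |
---|---|
. | Matches any single character |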
+ | Matches the preceding set one or more times |
* | Matches the preceding set zero or more times |
? | Matches the preceding set zero or one time |
{n} | Matches the preceding set n times |
{n,} | Matches the preceding set n or more times |
{n,m} | Matches the preceding set between n and m times |
Note
You may have to escape the quantifier using a backslash when searching for entries containing a quantifier without the quantifier having its special meaning.
For instance, if you are searching for a value of 12.34, if you don’t escape the period, you may also get results like 12x34, 12-34, or 12<34, 12 34, etc.
Example
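A minimal sketch of quantifiers, using hypothetical sample lines. The -E option (extended regular expressions) is used where ?, + or {n,} appear, because grep's basic syntax treats those characters literally unless they are backslash-escaped:

```shell
# Hypothetical sample lines.
sample=$(mktemp)
printf '%s\n' 'price 12.34' 'price 12x34' 'colour' 'color' 'goooal' 'gal' > "$sample"

unescaped=$(grep -c '12.34' "$sample")          # . matches any character
escaped=$(grep -c '12\.34' "$sample")           # \. matches only a literal period
optional_u=$(grep -c -E 'colou?r' "$sample")    # u may occur zero or one time
many_o=$(grep -E 'go{3,}al' "$sample")          # o must occur three or more times

echo "unescaped: $unescaped, escaped: $escaped, colou?r: $optional_u, go{3,}al: $many_o"
rm -f "$sample"
```

The unescaped period matches both price lines, which is exactly the 12.34 versus 12x34 situation described in the note above.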
Grouped matches¶
Grouped matches are used to group characters together in the search. When two or more characters are specified in parentheses, they must appear together and in that order. A pipe character indicates an or statement in the combination.
Pattern | Description |
---|---|
(og) | Parentheses enclose contiguous matches, such as dog, fog, frog, but not goat |
| | The pipe character indicates an or statement within grouped matches |
There is only an or operator (|); no and operator is needed, because characters placed next to each other inside the parentheses must already all match.
Example
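A minimal sketch of grouping and alternation with grep -E, using hypothetical sample words:

```shell
# Hypothetical sample words, one per line.
sample=$(mktemp)
printf '%s\n' 'dog' 'frog' 'goat' 'cat' 'cot' > "$sample"

contains_og=$(grep -c -E '(og)' "$sample")        # o directly followed by g
cat_or_dog=$(grep -c -E '^(cat|dog)$' "$sample")  # the whole line is cat or dog
a_or_o=$(grep -c -E 'c(a|o)t' "$sample")          # alternation inside a word

echo "og: $contains_og, cat|dog: $cat_or_dog, c(a|o)t: $a_or_o"
rm -f "$sample"
```

The word goat contains g followed by o, not o followed by g, so it is not counted by the first search.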
Question
Using the examples.log file, search for the following:
- Only lines containing an IP address
- Lines containing three or more capital letters
- Lines containing two or more consecutive periods
Environmental variables¶
The terminal makes extensive use of Environmental Variables for its regular operation. An environmental variable is defined by providing a name, an equal sign, and a value. There should not be any spaces between the name, the equal sign, and the value when declaring variables.
Environmental variables of interest:
Variable | Description |
---|---|
HOME | The home directory of the current user |
OLDPWD | The previous working directory. When executing cd - , this is the directory where the user will end up. |
PATH | Executable files in the path can be executed from anywhere |
PPID | Parent Process ID. A unique number representing the parent process that started the current session |
PS1 | The string used to set the prompt. For instance '[\u@\h \W]\$ ' displays as: '[user@hostname Working_Directory]$ ' (the \$ is shown as # for the root user) |
PWD | The present working directory, which is displayed by the pwd command |
SHELL | The path to the shell that the current user is using |
UID | The unique user ID number. Zero is reserved to the root user |
USER | The name of the current user, also displayed when executing whoami |
$$ | A special variable returning the process ID of the current shell |
Considerations
- Use the set command to list all variables and functions
  - Use the set command and pipe the result to grep to search for lines beginning with an alphabetic character ([[:alpha:]]), which will show only variables and functions without the content of the functions
- Use the echo command to view the value of a specific variable
- A variable’s name cannot start with a number but may contain numbers
- A variable’s value can be exported using the export command to make the variable available to child processes
- Values with spaces should be declared between single or double quotes
- The variable's value is not displayed when using single quotes with commands such as echo. Instead, the name of the variable is shown as it was typed
- A type (for example, integer) can optionally be specified when declaring a variable using the declare command
- A variable's value is referenced by adding a dollar sign ( $ ) in front of its name
- The dollar sign has a special meaning. When you want to print it, you need to escape it using a backslash: \$
- When accessing an undeclared variable, an empty value will be returned
- When using a variable directly adjacent to other text, the ${Variable_Name} format can be used
  Example: echo "The backup of ${USER}_home is stored in $HOME."
  If the curly braces weren’t used, the shell would have attempted to use $USER_home, which is undefined
- The output of a command can be used to set the value of a variable
- Use the unset command to remove the variable
Example
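A minimal sketch of several of the considerations above. The variable names (my_name, greeting, and so on) are chosen for illustration only:

```shell
my_name="Albert"                 # no spaces around the equal sign
greeting="Hello, ${my_name}!"    # braces separate the name from adjacent text
literal='Hello, ${my_name}!'     # single quotes prevent the expansion
this_year=$(date +%Y)            # command output stored in a variable

echo "$greeting"
echo "$literal"
echo "Year: $this_year"

unset my_name
after_unset="[${my_name}]"       # an unset variable expands to nothing
echo "$after_unset"
```

The second echo prints the text exactly as typed, including the dollar sign and the braces.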
Context testing (test, if, and case statements)¶
You can use the open square bracket ( [ ) to perform a test statement in the terminal. Note that the square bracket is actually a command.
This can be confirmed by typing Space+Tab+Tab.
You will see that [ and [[ are listed as commands.
A basic test syntax would look like:
In the above example, we tested if the value for the user id is equal to zero, meaning the current user is the root user. We did nothing with the result, so nothing was returned to the terminal. In the subsequent sessions, we will see how to use the test statement, for instance, in if-statements, etc.
Note
Because the square bracket ( [ ) is, in fact, a command, it is essential to use a space as a separator between the bracket and the variable/value being tested.
For instance, [-e /etc/hosts] will fail, saying that the command [-e does not exist.
Testing for files, directories, links, etc.¶
To see all the options for a test, execute the command man test.
We will look at some of the options often used.
Test | Description |
---|---|
-e | A file or directory with the name exists |
-d | A directory with the name exists in the path specified |
-f | A regular file with the name specified exists |
-s | A file exists and is not empty (larger than zero bytes) |
To see the usage of these tests, we first have to decide how to act on the result, using and/or statements, which are discussed in the following section.
The and, and or-operator¶
In the test command, acting on whether the test is true or false is essential. In the command line, if you perform a test, you can act according to the value. Two options are available: an and-statement and an or-statement.
The and-statement is indicated using a double ampersand: &&
For instance,
The or-statement is indicated using a double pipe: ||
For instance,
Of course, it may be used in a combination:
Example
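A minimal sketch of the and-statement, the or-statement, and the combination. The path /no/such/path is a hypothetical name that is assumed not to exist:

```shell
# && runs the echo only when the test succeeds.
[ -e /etc ] && echo "/etc exists"

# || runs the echo only when the test fails.
[ -e /no/such/path ] || echo "/no/such/path does not exist"

# Combined: the first assignment on success, the second on failure.
[ -d / ] && result="a directory" || result="not a directory"
echo "/ is $result"
```

Be aware that in the combined form the part after || also runs when the command after && fails, so it is not a complete replacement for an if-statement.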
Testing operands¶
Operand | Description |
---|---|
-gt | Testing for a number (integer) greater than |
-ge | A number greater than or equal |
-lt | A number less than |
-le | A number less or equal to |
! | The not (inverse) of a test |
-z | The string is of zero length |
-n | The string is not zero length |
Example
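A minimal sketch of the operands in the table above. The variable names and values are chosen for illustration:

```shell
count=7
empty=""

[ "$count" -gt 5 ] && gt_result="greater than 5"
[ "$count" -le 7 ] && le_result="less than or equal to 7"
[ ! "$count" -lt 5 ] && not_result="not less than 5"
[ -z "$empty" ] && z_result="zero length"
[ -n "$count" ] && n_result="not zero length"

echo "$count is $gt_result, $le_result and $not_result"
echo "empty is $z_result; count is $n_result"
```

Quoting the variables matters here: without the quotes, an empty variable disappears from the test and changes its meaning.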
If-statement¶
The above examples effectively performed an if-statement without explicitly mentioning the if keyword.
All students should have a programming background, and the following example will therefore be understood:
Example
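A minimal sketch of an if-statement built from the file tests introduced earlier. The variable names path and kind are chosen for illustration:

```shell
path="/"   # path to inspect; the root directory always exists

if [ -d "$path" ]; then
    kind="directory"
elif [ -f "$path" ]; then
    kind="regular file"
else
    kind="missing or of another type"
fi

echo "$path: $kind"
```

The semicolon before then is only needed because then is written on the same line as the test.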
The strangest of the above statements is probably the fi statement.
It is the if statement’s inverse (if spelled backwards) and represents the closing part of the if statement.
Case-statement¶
The case statement is similar to the if statement but used for multiple values. In the above examples, we were performing binary tests. A value was either true or false.
The syntax of a case statement is, again, simple to computer scientists if an example is provided.
Example
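A minimal sketch of a case statement. The variable names (animal, kind, prefix) and the values are hypothetical; the default branch changes the prefix so that the sentence stays grammatical:

```shell
animal="owl"   # hypothetical value to test
prefix="a"

case $animal in
    cat|dog)                 # multiple values for the same test
        kind="domestic"
        ;;
    cow|sheep)
        kind="farm"
        ;;
    *)                       # everything not listed above
        kind="unknown"
        prefix="an"
        ;;
esac

message="This is $prefix $kind animal."
echo "$message"
```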
The case statement starts by using a case $variable in statement, indicating which variable is tested.
The options are then tested and can have multiple values for the same test.
A double semi-colon (;;) indicates that the commands for that section end.
The *) represents all remaining values not explicitly defined and tested.
In this case, we also set a value to the prefix, indicating that we don’t want to print a unknown, which is grammatically incorrect, but rather print an unknown.
Finally, the case statement is closed by the inverse of case, which is esac.
Loops, basic mathematical arithmetic and iterations¶
The first iteration that we can observe is shell expansion.
With shell-expansion, one can write a statement such as:
This is a type of iteration through three values.
Perhaps using a numeric value will have more value:
Example
The above command creates sub-directories such as 0, 1, 2, 3, etc.
However, say we want the directory names to have the same length (zero-padded); we can use the sequence (seq) command:
The remaining 001 to 100 were created in the parent directory, not the subdirectory.
We can do this correctly by creating a directory first and then changing into the directory and executing a version of the previous command:
For loop¶
The above three commands (mkdir..., cd..., mkdir...) in the previous section provided the correct results but were clunky.
It would be better to iterate over the values of the command. The first iterator we will use is the for loop.
A for loop can iterate over multiple string values (space-separated) or numbers.
For instance,
Example
To write the command to create sub-directories in a better form than the one in the last code block in the previous section, we can write it as:
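A minimal sketch of both kinds of for loop. It works in a temporary directory (the variable base is an assumption for this illustration) and relies on the -w option of seq, which pads the numbers with zeros to equal width:

```shell
base=$(mktemp -d)    # temporary parent directory for the demonstration

# Iterate over space-separated string values.
for name in alpha beta gamma; do
    mkdir "$base/$name"
done

# Iterate over zero-padded numbers: 001, 002, ... 100.
for number in $(seq -w 1 100); do
    mkdir "$base/$number"
done

dir_count=$(ls "$base" | wc -l)
[ -d "$base/001" ] && padded="yes" || padded="no"
echo "Created $dir_count directories; zero-padded: $padded"

rm -rf "$base"
```

Because the full path is used in the mkdir command, there is no need to change into the directory first.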
While loop¶
When performing a for loop, the starting and ending conditions are predetermined. This means that a limit is eventually reached.
However, a while loop keeps running as long as the condition it tests remains true, which can cause an infinite loop when the condition never becomes false.
Therefore, extra caution should be taken to ensure that an infinite loop does not occur. A simple infinite loop example is: while [ 1 -gt 0 ]; do ...
Danger
The following code will cause an infinite loop.
Press Ctrl+C to cancel the process.
Note
Knowing that the above code will cause an infinite loop, a sleep was added to safeguard the system from being flooded.
The code block in the danger section above executes an infinite while loop.
In line five, we mistakenly changed the end condition, which means the ending condition will never be reached.
Let’s implement it correctly by removing line five.
It is better to use a for loop to avoid an infinite loop, but sometimes, a while loop is sound.
A while loop is often used when prompting for user input.
For instance, the following code block won’t exit until the user types an acceptable value:
Example
Another use for a while loop is to loop over each line in a file.
The following code block will read each line from a file and process it line by line, with a logical condition as an example in the code.
Example
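A minimal sketch of reading a file line by line with a logical condition inside the loop. The input data (name:id pairs) is hypothetical and is written to a temporary file:

```shell
# Hypothetical input: one name:id pair per line.
input=$(mktemp)
printf '%s\n' 'alice:1001' 'root:0' 'bob:1002' > "$input"

regular_users=0
while IFS= read -r line; do
    user_name=${line%%:*}    # text before the first colon
    user_id=${line#*:}       # text after the first colon
    if [ "$user_id" -eq 0 ]; then
        echo "$user_name is the root user"
    else
        regular_users=$((regular_users + 1))
    fi
done < "$input"

echo "Regular users: $regular_users"
rm -f "$input"
```

The file is fed to the loop with a redirect after done rather than through a pipe, so the counter keeps its value once the loop has finished.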
Exercise 05¶
Timed Challenge
Execute the following code block (COPY AND PASTE)
Await further instructions.
Note
Execute the command: c to see the status of your results and submit it if it was correct.
Question
- Define a variable named my_tmp with a value set to /tmp/YOUR_USERNAME/exercise05
  Note: Use the value of an existing variable instead of typing your username
  e.g.: /tmp/usr00/exercise05
- Create a directory (and subdirectories, if needed) with the path as specified in the above variable
- Create 100 sub-directories numbered from 1 to 100 in the same directory:
  e.g.: /tmp/usr00/exercise05/1
  /tmp/usr00/exercise05/2
  ...
  /tmp/usr00/exercise05/100
- Use a for loop that creates a file $my_tmp/hosts with 99 line entries in the format:
- Using the examples.log file (available from /opt/example05/examples.log), search for the following:
  A unique list of "valid" IP addresses contained in the file
  (Ensuring you don’t get, for instance, ... or 890800003...)
  - Redirect the output of this command to a file called ips.txt in the $my_tmp directory
- Write a while loop that reads each line in the $my_tmp/ips.txt file.
  If the IP address starts with 172 or 192, it should print "xxx.xxx.xxx.xxx is a private IP"
  If it does not start with 172 or 192, it should print "xxx.xxx.xxx.xxx is a public IP"
  Where xxx.xxx.xxx.xxx represents the actual IP address observed in the corresponding line.
  Tip: It may be easier to write the loop in Notepad and then copy and paste it to test.
The last question in the exercise may be a bit challenging. If you don’t get it right during the class, please revisit it during the week and see if you can complete it.
Reminder
This time it is not necessary, but always remember to switch off your VM!
Section 06¶
Notes from the previous session
- The lecturer may have to reset passwords and VMs from time to time.
- Please attempt to keep up.
- If you are stuck at a point, read the instructions again; you may solve your own issue.
- If you are able to complete a task, please assist students next to you if they struggle.
Instructions¶
See the Information Page to log into the jump node.
Note
We will all be working on this machine for today’s class.
- Log into the jump node
- Copy the /opt/example06/ directory to your home directory and cd into your home directory:
- Attempt to execute the commands that the instructor is executing
Overview¶
In this session, we will learn how to change file content using text manipulation tools like sed. After that, we will look at how to use two popular text editors named nano and vi.
Text manipulation using sed¶
In the previous session, we looked at regular expressions and searched for text in a file using the grep command. We now need to look at how to modify text in a file.
We will look at the stream editor sed. Several tools, such as sed and awk, can be used for this purpose. However, we will only be looking at sed. To use awk, see the awk manual pages or online sources.
Using sed to modify output¶
The easiest way to use sed is to execute a command and modify the output using a regular expression (pattern) or specific text.
The sed command has various functions within the command that can be called. We will only use a few available options, which are helpful throughout this course.
Using sed to delete lines containing text/pattern¶
The first option is to delete lines containing specific text. The syntax is:
/search_string_we_are_looking_to_exclude/d
Note
The forward slash / is the default delimiter for the delete option; any other delimiter must be introduced with a backslash, for instance \%search_string%d.
For other options, such as the substitute hereafter, the delimiter can simply be changed to another character, such as % or |.
The same delimiter must be used throughout the command.
For instance, if we print 15 lines and we want to delete a line that contains the number three (3), we can execute the following:
If we want to use a regular expression, we can simply include the regular expression in our search.
For instance:
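A minimal sketch of both deletions, using seq to print the fifteen lines. The second pattern is an illustrative regular expression chosen for this sketch:

```shell
# seq prints the numbers 1 to 15, one per line.
without_three=$(seq 1 15 | sed '/3/d')       # deletes every line containing a 3
without_fives=$(seq 1 15 | sed '/[05]$/d')   # deletes lines ending in 0 or 5

echo "$without_three"
echo "---"
echo "$without_fives"
```

Note that the first command removes both 3 and 13, because the pattern matches anywhere in the line.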
Using sed to replace specific text¶
We deleted complete lines in the above examples using the "/search_line_to_delete/d" syntax.
If you want to replace only specific values from a string, you need to use the following syntax:
Example:
- The search and replace function (substitute) starts with an s.
- After that, a delimiter is used. In this case, a forward slash /.
- The delimiter is followed by the string we are looking for.
- Another delimiter / is used to indicate that the replacement is to follow.
- After the string that is replacing the search string, another delimiter / is used.
- After the final delimiter is specified, a value is specified.
  This value indicates which occurrence should be replaced.
  In the above example, a value of g is used, meaning a global replacement is done.
  The g-option is the most common and replaces all occurrences of the text.
- Note: the occurrence of a replacement string is per line, so if you specify:
  "s/replace_me/with_me/2"
  then the second occurrence of replace_me in each line will be replaced by with_me.
Note
Note that other special characters can be delimiters instead of the forward slash /.
Using a pipe character | instead when modifying a path will be easier.
Simple example
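A minimal sketch of the substitute function, showing the g flag, a numeric occurrence, and a pipe character as the delimiter. The sample text and path are hypothetical:

```shell
text="one cat, two cat, three cat"

all=$(echo "$text" | sed 's/cat/dog/g')      # replace every occurrence
second=$(echo "$text" | sed 's/cat/dog/2')   # replace only the second occurrence

# With | as the delimiter, the slashes in the path need no escaping.
path_changed=$(echo "/opt/old/bin/tool" | sed 's|/opt/old|/opt/new|')

echo "$all"
echo "$second"
echo "$path_changed"
```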
Another option is to replace from the nth occurrence to the last occurrence in the line, using the syntax "s/search/and_replace/ng".
Print a specific line using sed¶
The sed command can also print specific lines from a file. The print function is used and has the following syntax to print line three of /etc/passwd:
Ranges can also be specified; example:
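A minimal sketch of the print function. To keep the result predictable, it uses a temporary file holding the numbers 1 to 10 instead of /etc/passwd; the -n option suppresses the normal output so that only the requested lines are printed:

```shell
# A temporary file with the numbers 1 to 10, one per line.
sample=$(mktemp)
seq 1 10 > "$sample"

third=$(sed -n '3p' "$sample")      # print line three only
range=$(sed -n '3,5p' "$sample")    # print lines three to five

echo "$third"
echo "$range"
rm -f "$sample"
```

Without -n, sed would print every line once and the selected lines a second time.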
Changing a file in place, using sed¶
The above examples simply modified the output provided. The sed command can write modifications to a file using the in-place (-i) option.
Example
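A minimal sketch of an in-place edit, assuming GNU sed as found on the Linux systems used in this course (other sed implementations may expect an argument after -i). The configuration keys and values are hypothetical:

```shell
# Hypothetical configuration file in a temporary location.
config=$(mktemp)
printf '%s\n' 'port=8080' 'logdir=/var/log/old' > "$config"

sed -i 's/8080/9090/' "$config"                  # change a plain value
sed -i 's|/var/log/old|/var/log/new|' "$config"  # change a path, using | as delimiter

new_content=$(cat "$config")
echo "$new_content"
rm -f "$config"
```

Nothing is printed by sed itself; the changes are written straight into the file, so test the expression without -i first.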
Note
Note that using a different delimiter (other than / ) is recommended when using paths.
If a forward slash / is used as a delimiter, any forward slashes searched for must be escaped using a backslash \.
The in-place instruction is the most common use for the sed command to modify configuration or template files.
Exercise
Using a copy of the /etc/passwd file, complete the following:
- Copy the original /etc/passwd file over the current file in your home directory
- Using sed, modify the file (in place) to change the nologin at the end of the lines to the word false
- Replace all occurrences of /usr/sbin with /bin
  Note: using a different delimiter (other than / ) is recommended
  If a forward slash / is used as a delimiter, any forward slashes searched for need to be escaped using a backslash \.
The nano text editor¶
In a GNU/Linux system, configuration and data files are usually in plain text format. It is, therefore, valuable to be able to modify files directly in the terminal. The first and most user-friendly text editor we will look at is nano.
One drawback of the nano editor is that few GNU/Linux systems have nano installed by default.
However, it is available in their package managers, and thus, a simple dnf install nano will install nano on a Red Hat-derived Linux system with Internet connectivity.
The nano editor can be used by executing the nano command, followed by the path and name of the file to edit.
Example
An empty file will be created if the file does not exist.
The following figure shows the primary nano interface.
Note the options at the bottom of the screen.
In this case, the most prominent line is the warning that the file is not writable.
For the other options, we see, for instance, that Ctrl+O will write the current file out (save).
To close nano, we press Ctrl+X.
Nano file management¶
Task | Keystroke | Notes |
---|---|---|
Open a file from within nano | Ctrl+R | NOTE: tab completion is in effect; also, once this command has been entered, notice the new menu items are at the bottom of the screen. For example, Ctrl+T will allow you to browse the file system and look for a file to open. |
Display the next file buffer | Alt+> | |
Display the previous file buffer | Alt+< | |
Save the current file buffer to disk | Ctrl+O | |
Close the current file buffer | Ctrl+X | NOTE: If the file hasn't been saved, you'll be asked if you want to save it. Also, if only one file buffer is open, closing it will exit from nano. |
Nano copy and paste¶
Task | Keystroke | Notes |
---|---|---|
Select a region for a cut or copy operation | Alt+A | NOTE: After setting a mark with Alt+A, move the cursor to define the region; you should see it highlighted as you move the cursor. Also, to cancel the definition of the region, just press Alt+A again. |
Copy a highlighted region into the clipboard | Alt+^ | |
Cut a highlighted region into the clipboard | Ctrl+K | |
Paste the contents of the clipboard at the current cursor position | Ctrl+U | |
Cut from the current cursor position to the end-of-line (EOL) | Ctrl+K | NOTE: This command doesn't require highlighting of the region. |
Nano navigation¶
Task | Keystroke | Notes |
---|---|---|
Go to the beginning of the file | Alt+\ | |
Go to end of file | Alt+/ | |
Move forward one screenful | Ctrl+V | |
Move backwards one screenful | Ctrl+Y | |
Go to a target line number | Alt+G | |
Jump to matching open/close symbol | Alt+] | NOTE: Handy for finding mismatched brace compiler errors! |
Window scrolling | Alt+= | To scroll down |
Window scrolling | Alt+- | To scroll up |
Indenting selected blocks | Alt+A to select a block | Then Alt+} will indent the selected block. |
Outdenting selected blocks | Alt+A to select a block | Then Alt+{ will outdent the selected block. |
The vi text editor¶
The first release of vi was in 1976. The vi editor is installed by default in most GNU/Linux distributions. You can use vi to modify a repository or network configuration file to install nano on a clean installation.
The vi or newer vim (vi-improved) is a powerful text editor with various commands and shortcuts that are less intuitive and not displayed on screen like for nano. You will notice some syntax highlighting when opening a file such as a configuration or other Linux system file in vi.
Like with nano, you can simply use the vi command followed by the path of the file to edit.
The above command will open the vi editor and will look like the following figure:
The most noticeable difference between the nano and vi interfaces is the syntax/colour highlighting displayed in vi.
The different vi modes¶
The first concept in learning to use vi is identifying the different modes. Six modes exist, but you will most often only use two modes. Nonetheless, the following table lists and explains the six modes.
Name | Description | Help page |
---|---|---|
normal | For navigation and manipulation of text. vim usually starts in this mode, which you can usually get back to with Esc | :help Normal-mode |
insert | For inserting new text. The main difference from vi is that many important "normal" commands are also available in insert mode - provided you have a keyboard with enough meta keys (such as Ctrl, Alt, Win, etc.). | :help Insert-mode |
visual | For navigation and manipulation of text selections, this mode allows you to perform most normal commands, and a few extra commands, on selected text | :help Visual-mode |
select | Similar to visual, but with a more MS Windows-like behaviour. | :help Select-mode |
command-line | For entering editor commands - like the help commands in the 3rd column. | :help Command-line-mode |
Ex-mode | Similar to the command-line mode but optimized for batch processing. | :help Ex-mode |
You will most likely work only in the insert and normal modes 90% of the time.
Two visual modes are essential when working with vi.
Visual Mode | Key Stroke |
---|---|
Visual line mode | Shift+V |
Visual block mode | Ctrl+V |
See the Learning vi editor Wiki page for more information on the editor modes.
One-pager for vi¶
The following one-pager cheat sheet is included as a quick reference guide.
Demonstration using vi¶
The following demonstration will showcase the regular use of the vi editor.
- The script below should be copied and pasted into a file named example06.sh
- The script will not work as expected, seeing that the two functions contained in the script should be defined before they are called
- There is also no indentation, which is fine in bash, but makes reading the code harder
- Two variables (i01 and i02) are ambiguous and should be named better
Note
One can also execute sed-like commands in vi.
To demonstrate, press Esc twice to go into normal mode.
Then press : (colon) to go into command-line mode.
Now, type the following:
%s/iNumber/myNumber/g
Press Enter to execute the sed command.
The above sed command will replace all iNumber entries with myNumber.
Example
- Open the file ~/example06/moby.txt using vi
- Search for the second line starting with ***
- From that position, delete 351 lines
- Go to the first line, delete the first 843 lines
- On which line does Chapter 42 start?
- What is the first word of the 4th sentence in that chapter?
- Go to the bottom of the file
  How many lines are in the file?
- Save the changes to the current file
- Go to line 6789
- Go into visual line mode
- Go to the end of the paragraph
- Yank the selection
- Execute the following in the command-line mode:
  :6789,6828w long.txt
- Exit vi
- Execute the following command in the shell to see the word count of the long.txt file:
  cat long.txt | wc
- Execute a similar command to see how many lines the ~/example06/moby.txt file contains now
  It should be the same value (21122) as observed when going to the end of the file in a previous instruction
More advanced commands in vi¶
In a previous challenge, you were asked to create a file with 99 entries in the format:
172.21.0.101 trn-usr01 trn-usr01.examplesdomain.com
172.21.0.102 trn-usr02 trn-usr02.examplesdomain.com
…
…
172.21.0.199 trn-usr99 trn-usr99.examplesdomain.com
This section is quite advanced and will most likely not be something you would often (or ever again) attempt in a text editor. This is only to showcase some of the powerful actions that can be performed in a text editor.
Let’s create this file in vi, carefully following the instructions:
Example
- Execute the following command to create a file in your home directory:
- Press 99 followed by I (lower case i)
  This will instruct vi to execute the following insert statement 99 times.
- Type (or paste) the following line:
- Press Enter to go to the following line
- Press Esc
  99 lines with the exact text will appear, followed by an empty line
- Delete the last (empty) line:
  Press D+D (lower case d, twice)
- Go to the second line:
  Press :+2+Enter (colon + 2 + enter)
- Scroll to the zero of 101 (column 11), having the cursor blink on the zero
- Press Ctrl+V to enter the visual block mode
- Press Shift+G to go to the last line of the file
  Note that the first columns will be highlighted
- Press the Right (right arrow) until the last two digits (01) of the 101 column are selected
- Press G (lower case g) to execute a global change
- Press Ctrl+A to add one and auto increment each line in the column
- Press :+2+Enter (colon + 2 + enter) to go to the second line again
- Move the cursor to the 0 of trn-usr01
- Press . (period) to repeat the previous command
- Move the cursor to the 0 of trn-usr01.examplesdomain.com
- Press . (period) again to repeat the previous command
- Press Shift+G to go to the last line in the file
  You will notice that all the lines were modified
- Press :+X+Enter (colon + x + enter) to write the file and exit vi
Timed Challenge
Execute the following code block (COPY AND PASTE)
Await further instructions.
Note
Execute the command: c to see the status of your results and submit it if it was correct.
Reminder
This time it is not necessary, but always remember to switch off your VM!
Section 07¶
Notes from the previous session
- The lecturer may have to reset passwords and VMs from time to time.
- Please attempt to keep up.
- If you are stuck at a point, read the instructions again; you may solve your own issue.
- If you are able to complete a task, please assist students next to you if they struggle.
Instructions¶
See the Information Page to log into the jump node.
Note
We will all be working on this machine for today’s class.
- Log into the jump node
- Attempt to execute the commands that the instructor is executing
Overview¶
This session will examine how shell scripts can be created to document and somewhat automate processes.
Bash Shell Scripting¶
- A bash script is essentially a file containing a few commands that are executed one after the other (sequentially)
- Most prominently, you will notice the use of variables, functions, return codes, and logical testing
- The first line should start by defining the shell that is used:
#!/bin/bash
This line is referred to as the shebang. - After the file has been created, it can be made executable by changing the file permission:
chmod ugo+x my_script.sh
- The file extension does not matter, but often users will add the ".sh" extension to the filename
- To execute a script (after chmod has been executed at some stage):
./scriptname.sh
or
sh scriptname.sh
- In a script, anything after a # is seen as a comment
- There are some cases (like the shebang) where the comment has meaning to the interpreter.
Another instance of "special comments" is for PBS jobs:
- If (for formatting reasons) you want to continue a command on the next line, you can use a "\" followed by nothing other than an enter/line break:
The above command will be interpreted as a single command:
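A minimal sketch of line continuation, using a hypothetical echo command with three arguments. The continued form and the single-line form produce the same result.

```bash
# The backslash must be the very last character on the line.
long_output=$(echo "one" \
    "two" \
    "three")

# The same command written on a single line.
short_output=$(echo "one" "two" "three")

echo "$long_output"
echo "$short_output"
```

Both echo statements print the same three words, which shows that the shell joined the continued lines before executing the command.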
Exercise
Write a Hello World script.
Don't forget to add the shebang.
Make the script executable.
Execute the script.
Recap of previous sessions¶
In an earlier section, we examined using environment variables, loops, logical testing, and functions.
Because these concepts are essential when writing scripts, a brief extract of them is repeated here.
Environment variables in scripts¶
- Environment variables declared and exported in the shell before the script is executed can also be accessed from within the script
- Parameters can be passed to the script when it is executed:
./my_script parameter1 parameter2
- A variable is declared, and the value is assigned as follows:
myName="Albert"
Special variables often used in scripting:
Variable | Description |
---|---|
$0 | Name of the script |
$1 | First parameter passed to the script |
$7 | Seventh parameter passed to the script |
$# | Number of parameters passed to the script |
$@ | All parameters passed to the script |
$$ | The process id of the script |
$? | Exit status of the previous command: 0 if it was successful, or a non-zero value if not |
Example
Create a script called test.sh
with the following content:
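A minimal sketch of such a script, assuming it only needs to print the special variables from the table above. The parameters alpha and beta are hypothetical, and the script is written to a temporary directory so that no existing file is overwritten. The lines between the two EOF markers are the content of test.sh.

```bash
# Create test.sh in a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/test.sh" <<'EOF'
#!/bin/bash
echo "Script name: $0"
echo "First parameter: $1"
echo "Number of parameters: $#"
echo "All parameters: $@"
echo "Process id: $$"
# The "false" command always fails, so $? is non-zero afterwards.
false
echo "Exit status of false: $?"
EOF

# Make the script executable and run it with two hypothetical parameters.
chmod u+x "$workdir/test.sh"
test_output=$("$workdir/test.sh" alpha beta)
echo "$test_output"
```

The script name and the process id differ from run to run; the other lines depend only on the parameters that were passed.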
Testing¶
Testing was described earlier, but to recap, the following basic testing is often used in scripting.
- A basic test to see if a file or directory exists is expressed as follows:
[ -e /home ] && echo "/home exists"
The above reads: if /home exists, then echo "/home exists"
Other tests often used in scripting:
Expression | Description |
---|---|
-e | File/directory exists |
-d | Exists and is a directory |
-h | File is a symbolic link |
-x | File/directory is executable |
-eq | Number is equal to |
-ne | Number is not equal |
-gt | Number is greater than |
-ge | Number is greater than or equal |
For more information, see the manual page (man test) or the earlier section that describes testing in more detail.
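A short sketch that combines a few of the expressions from the table. The paths / and /bin/sh are assumed to exist, as they do on typical Linux systems, and file_count is a hypothetical variable.

```bash
# File tests: -d checks for a directory, -x for execute permission.
[ -d / ] && dir_result="/ is a directory"
[ -x /bin/sh ] && exec_result="/bin/sh is executable"

# Numeric tests: -gt is "greater than", -ne is "not equal".
file_count=3
[ "$file_count" -gt 2 ] && count_result="more than two files"
[ "$file_count" -ne 3 ] || equal_result="exactly three files"

echo "$dir_result"
echo "$exec_result"
echo "$count_result"
echo "$equal_result"
```

The last test uses || instead of &&, so the assignment runs only when the test is false, that is, when file_count is equal to 3.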
Logical testing¶
When testing for values or the existence of a file/directory, one often needs to execute several commands when the statement is true and different commands when it is false.
In those cases, we can group the commands using an if-statement.
The if-statement was discussed earlier, but here is a brief recap:
- An "if" statement is closed by a "fi" statement (the inverse of if)
- It is important to have a space after [ and before ]
- An "if not" statement is written as:
- Note the space between the ! and the [, also note the use of the double quotes that enclose the text variables
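The points above can be sketched as follows. This is a minimal illustration rather than the exact listing used in class; the name Bob is a hypothetical value for comparison.

```bash
user_name="Albert"

# A plain if/else: note the spaces after [ and before ].
if [ "$user_name" = "Albert" ]; then
    greeting="Hello, Albert"
else
    greeting="Hello, stranger"
fi
echo "$greeting"

# An "if not": note the space between ! and [, and the quoted variable.
if ! [ "$user_name" = "Bob" ]; then
    not_bob="The user is not Bob"
fi
echo "$not_bob"
```

Quoting "$user_name" keeps the test valid even when the variable is empty or contains spaces.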
Logical Case-statement¶
The case-statement was also discussed earlier, but for reference, the syntax of a case-statement is as follows:
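A minimal sketch of a case-statement, assuming a hypothetical variable named answer. Each pattern ends with ), each branch ends with ;;, and the statement is closed with esac (the inverse of case).

```bash
answer="yes"

case "$answer" in
    y|yes)
        reply="Continuing"
        ;;
    n|no)
        reply="Stopping"
        ;;
    *)
        # The * pattern matches anything not matched above.
        reply="Unknown answer"
        ;;
esac

echo "$reply"
```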
For loops¶
The following for-loops reflect the syntax to loop (iterate) over strings and numbers.
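A minimal sketch of both kinds of loop. The fruit names are hypothetical values, and the number range is produced with the seq command, which is assumed to be installed.

```bash
# Loop over a list of strings.
fruit_list=""
for fruit in apple banana cherry; do
    fruit_list="$fruit_list $fruit"
done
echo "Fruits:$fruit_list"

# Loop over the numbers 1 to 5.
# In bash, {1..5} expands to the same list as $(seq 1 5).
number_list=""
for number in $(seq 1 5); do
    number_list="$number_list $number"
done
echo "Numbers:$number_list"
```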
Basic math¶
This section is new and just indicates how basic math can be performed.
One sometimes needs to calculate values (usually integers) for various purposes.
The first option is to perform the calculation using the arithmetic expansion that is built into bash.
The syntax is to enclose the expression in a $(( xxxxx )) clause.
Example of basic calculation using the built-in bash math:
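A minimal sketch with two hypothetical integer values. Note that the built-in arithmetic only works with integers, so a division is truncated.

```bash
a=7
b=3

sum=$(( a + b ))
difference=$(( a - b ))
product=$(( a * b ))
quotient=$(( a / b ))     # integer division: the fraction is dropped
remainder=$(( a % b ))    # modulo: what is left after the division

echo "sum=$sum difference=$difference product=$product"
echo "quotient=$quotient remainder=$remainder"
```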
The following method uses the bc (basic calculator) application.
The bc command must be installed before it can be used.
The input for the calculator must be piped to the bc command:
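A minimal sketch, assuming a hypothetical division of 7 by 3. The scale setting tells bc how many decimal places to keep; the check with command -v only guards against bc not being installed.

```bash
if command -v bc >/dev/null 2>&1; then
    # scale=2 keeps two decimal places in the result of the division.
    ratio=$(echo "scale=2; 7 / 3" | bc)
    echo "7 / 3 = $ratio"
else
    echo "bc is not installed"
fi
```

Unlike the built-in $(( )) arithmetic, bc can work with decimal fractions.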
Using math in an increment for-loop:
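A minimal sketch, assuming a loop over the numbers 1 to 5 that adds twice the loop counter to a running total on every pass. The variable names are hypothetical.

```bash
running_total=0

# bash also accepts the C-style form: for (( step = 1; step <= 5; step++ ))
for step in $(seq 1 5); do
    running_total=$(( running_total + step * 2 ))
    echo "Step $step: running total is $running_total"
done
```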
Functions¶
Functions were mentioned earlier. To recap, a function groups several commands under one name so that they can be executed together.
- A function can return a value.
- To return a value, the function prints it (as if printing to the screen), and the caller captures that output.
- The function can also return whether it executed successfully or not.
This value should be zero (0) if the function is successful or a non-zero value if the function failed.
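Both ways of returning something can be sketched as follows. The function names add_numbers and is_positive are hypothetical.

```bash
# Returns a value by printing it; the caller captures it with $( ).
add_numbers() {
    echo $(( $1 + $2 ))
}

# Returns only a status: 0 for success, non-zero for failure.
is_positive() {
    if [ "$1" -gt 0 ]; then
        return 0
    fi
    return 1
}

total=$(add_numbers 2 3)
echo "Total: $total"

if is_positive "$total"; then
    echo "$total is positive"
fi
```

Inside a function, $1 and $2 refer to the parameters of the function, not to those of the script.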
Scripting Example¶
Example with the guidance of the tutor.
Example
- Write a script that takes two numbers as input.
- In the script, create two variables and set their values equal to the numbers provided.
- Add a function in the script that calculates the sum of the two values.
- Add a function that calculates the average of the two numbers.
- Test if two values were provided.
- We can assume that the user will provide numbers (we don’t have to test for it).
- Display the following, with the calculated values:
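One possible sketch of such a script is given below. The function names and the wording of the output lines are assumptions made for illustration, and the average is an integer, so any fraction is dropped.

```bash
#!/bin/bash

# Prints the sum of two numbers.
calc_sum() {
    echo $(( $1 + $2 ))
}

# Prints the integer average of two numbers.
calc_average() {
    echo $(( ($1 + $2) / 2 ))
}

main() {
    # Test if exactly two values were provided.
    if [ "$#" -ne 2 ]; then
        echo "Usage: $0 number1 number2"
        return 1
    fi
    first=$1
    second=$2
    echo "Sum: $(calc_sum "$first" "$second")"
    echo "Average: $(calc_average "$first" "$second")"
}

# In a real script the parameters would be handed over with: main "$@"
# Two hypothetical numbers are used here so that the sketch runs on its own.
main 4 8
```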
- Indexing: Method used to keep and seek records (starting positions of files) on a file system
- Journal: The operating system holds and writes a journal of open files to check in case the file system or system crashes
- Extents: Usually a file system has 4k blocks, extents creates sub-allocation blocks to support large files
- Copy on Write: won’t overwrite old data blocks but instead writes new data blocks. Easy to revert to previous state of a file which makes a journal redundant