How do programs run on Linux?

Ever wondered what happens when you run a program? Ever downloaded software from the Net that didn’t work, even though it may have compiled and installed correctly?

If the answer to either of these questions is yes, you probably want to know more about how Linux runs programs. Even if it is no, you may well be curious. In this section, we’ll look at how a program starts up when it is run from the command line. We won’t deal with running a program from a menu in X, as this depends on which window manager you’re using.

You’re at the console and you type foo and then press Enter. One of two things will happen: the shell will print a message saying that the command ‘foo’ was not found, or it will run a command, foo.

When asked to run a command like this, Bash takes a number of steps to figure out what to do. First, it checks whether foo is an alias and, if so, expands it. Next, it checks whether foo is a shell function, and then whether it is a built-in Bash command such as cd, echo or export (to see a list of all the built-in commands, type help at the prompt). Words like while, for and if are not commands at all but reserved words of the shell language. Aliases and functions can be defined at the command prompt or in one of the Bash initialization files: either the per-user ones (such as ~/.bash_profile, ~/.bash_login or ~/.bashrc) or the global one (/etc/profile).

If foo was not found in any of the above steps, Bash then looks for it in the directories listed in the PATH environment variable, in order. Finally, if it cannot find the command in any of these directories, Bash prints the dreaded ‘bash: foo: command not found’ message.

If foo does exist and the person running Bash has execute permission for the file, Bash asks the kernel to execute it. Inside the Linux kernel, there is code to figure out how to run the file. For instance, the file may be an ELF executable or a script for one of the many scripting languages common on Linux. If foo is a script, the kernel finds the interpreter’s name in the first line of the file (such as ‘#!/bin/bash’) and starts the interpreter, passing it the name of the script to execute.

If, on the other hand, foo is an ELF executable, things are more complex. To begin with, ELF executables can be either dynamically or statically linked.
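
The steps described above can be made concrete with a short sketch. The Python below is only an illustration, not how Bash or the kernel is actually implemented. The helper names find_in_path and classify are hypothetical. The PATH walk ignores aliases, functions, built-ins and Bash’s internal command cache, and classify simply peeks at the first bytes of the file, which is roughly how a script is told apart from an ELF executable.

```python
import os

def find_in_path(name, path=None):
    """Return the first executable regular file called `name` found in
    the PATH-style string `path`, or None if there is no such file."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or ".", name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None  # the caller would report "command not found"

def classify(filename):
    """Look at the start of a file and return (kind, interpreter)."""
    with open(filename, "rb") as f:
        head = f.read(256)
    if head.startswith(b"\x7fELF"):
        # ELF files begin with the byte 0x7f followed by "ELF".
        return ("elf", None)
    if head.startswith(b"#!"):
        # The rest of the first line names the interpreter
        # (possibly followed by an argument).
        first_line = head[2:].split(b"\n", 1)[0].strip()
        return ("script", first_line.decode(errors="replace"))
    return ("unknown", None)
```

For example, classify(find_in_path("ls")) would normally report an ELF file on a Linux system, while a shell script starting with ‘#!/bin/bash’ gives ("script", "/bin/bash").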

Statically linked executables are completely self-contained and do not need any external libraries to run. When a statically linked binary is running, the only time it executes code outside of itself is when it makes a system call. A system call is a call into the Linux kernel to perform tasks like opening, reading or writing a file, starting another program, getting process information, or any of a number of other tasks. Library calls look similar to the programmer, but a pure library call does not need to enter the Linux kernel to complete its work.

Dynamically linked executables, on the other hand, use code from external libraries, which are shared with other programs, to perform some of their work. There are a number of advantages to this. Firstly, the program takes up less disk space because part of its code is stored elsewhere. In addition, the application programmer’s job is easier, as they can use code written and debugged by someone else instead of writing it from scratch.

More importantly, it reduces the use of system memory. Consider three programs all running at the same time and all linked to ‘libxxxx.so’. This library contains code, static data such as strings and constants, and data that is modified while the library code runs. The Linux kernel loads the code and static data sections of this library into memory only once and gives all three programs access to them. The only part of the library that has to be replicated is the dynamic data section, the part that changes during its running lifetime.

You might already have seen packages such as ‘gnome-libs’ and ‘kdelibs’. When you install a new Gnome or KDE program, it can’t run without the Gnome or KDE libraries installed, as it calls on these to perform certain functions, such as showing a file save dialog or drawing buttons a certain way.
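
The memory sharing described above can be observed on a running system. On Linux, /proc/<pid>/maps lists a process’s memory mappings, one per line, with a permission field such as r-xp or rw-p. The sketch below groups those permission strings by library. The function name library_mappings is hypothetical, and the check for ‘.so’ in the file name is an assumed, rough way of spotting shared libraries. Read-only and executable mappings are backed by the library file and can share physical pages between processes, while the writable mapping is the part each process gets its own copy of once it writes to it.

```python
from collections import defaultdict

def library_mappings(maps_text):
    """Group the permission strings of every mapped shared library.

    `maps_text` is the content of a /proc/<pid>/maps file. Each line has
    the fields: address range, permissions, offset, device, inode and
    (optionally) a path name.
    """
    result = defaultdict(set)
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no path name
        perms, path = fields[1], fields[5]
        # Rough heuristic (an assumption of this sketch): shared
        # libraries have ".so" somewhere in their file name.
        if ".so" in path.rsplit("/", 1)[-1]:
            result[path].add(perms)
    return dict(result)
```

On a Linux machine, library_mappings(open("/proc/self/maps").read()) shows the libraries mapped into the Python interpreter itself.
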
KDE itself is built with the Qt toolkit, a set of libraries that provide GUI functions to make producing an interface a breeze. Hence, to run a KDE program you need both the KDE and Qt libraries installed.

When the Linux kernel reads the headers of a program and detects that it is a statically linked executable, it simply runs the program. For a dynamically linked program, however, it needs to figure out which dynamic linker the program requires. Historically there have been three different dynamic linkers on Linux, and most systems have all three: /lib/ld.so, /lib/ld-linux.so.1 and /lib/ld-linux.so.2. The last one is the most recent; it is used by programs linked to glibc-2.1 and is the only one we will look at here.
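
To see which dynamic linker a given program asks for, you can read it straight out of the file. An ELF executable records the linker’s path in a program header of type PT_INTERP, and a statically linked executable has no such header. The following is a minimal sketch based on the standard ELF32 and ELF64 layouts. The function name elf_interpreter is hypothetical, and the code does no validation beyond checking the magic bytes.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic linker

def elf_interpreter(data):
    """Return the dynamic linker path stored in an ELF image (bytes),
    or None if the image has no PT_INTERP header (statically linked)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    if is_64:
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 32)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 54)
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", data, 28)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 42)
    # Walk the program header table looking for PT_INTERP.
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(endian + "I", data, base)
        if p_type != PT_INTERP:
            continue
        if is_64:
            (p_offset,) = struct.unpack_from(endian + "Q", data, base + 8)
            (p_filesz,) = struct.unpack_from(endian + "Q", data, base + 32)
        else:
            (p_offset,) = struct.unpack_from(endian + "I", data, base + 4)
            (p_filesz,) = struct.unpack_from(endian + "I", data, base + 16)
        # The segment holds the path as a NUL-terminated string.
        return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None
```

Calling elf_interpreter(open("/bin/ls", "rb").read()) prints the linker that ls requests. On a 64-bit x86 glibc system this is typically /lib64/ld-linux-x86-64.so.2 rather than one of the older names listed above.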

All dynamically linked programs name their dynamic linker in the ELF program headers (in an entry of type PT_INTERP). Once the kernel has validated the ELF executable and figured out which dynamic linker to use, it starts the linker and hands the executable over to it. The dynamic linker then reads the headers of the file and finds the list of dynamic libraries that the program requires to run. The list does not contain the full paths of where the program expects the libraries to be, just their names. This allows the libraries to be placed anywhere on the system: as long as the dynamic linker knows where they are, they will be linked successfully.

So, how does the dynamic linker know where to find the libraries? This is Linux, so it’ll be in a configuration file. This file is /etc/ld.so.conf, which contains a list of library directories with one directory per line. Strictly speaking, the ldconfig command reads this file and builds the cache /etc/ld.so.cache, which is what the linker consults at run time, so run ldconfig after changing it. If you have a look at /etc/ld.so.conf on your system you’ll notice that ‘/lib’ is not in the list. That’s because the dynamic linker searches the standard directories /lib and /usr/lib automatically. Once the linker has found and loaded all the libraries the program needs, it hands control to the program itself.

Now, back to where we started: why won’t a program sometimes run? If you have a situation like this, you should now be able to figure out why. If, when you type foo at the command line, you get a ‘command not found’ message, Bash can’t find foo anywhere on the current PATH. You can fix this by adding its directory to PATH or by typing the full path to the program. If, instead, Bash says something like ‘Permission denied’, you have a file permissions problem, which you might be able to fix with chmod if you own the file. If foo is available and has execute permission, it still may not run if a dynamic library it requires is missing. You can use the command file /full/path/to/foo to tell you whether it is dynamically or statically linked, along with some other useful information.
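
A missing library means that the directory search described above came up empty. As a rough sketch of that search, the Python below parses text in the /etc/ld.so.conf format and looks for a library by name. The function names are hypothetical, and the real dynamic linker is considerably more involved: it consults a prebuilt cache and other settings. In this sketch, ‘include’ lines are simply skipped and the default directories are an assumed parameter.

```python
import os

def parse_ld_so_conf(text):
    """Return the directories listed in ld.so.conf-style text."""
    dirs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("include"):
            continue  # blank line, or an include directive (ignored here)
        dirs.append(line)
    return dirs

def find_library(name, conf_dirs, default_dirs=("/lib", "/usr/lib")):
    """Return the full path of library `name`, or None if it is missing.

    Simplification: the configured directories are tried first, then
    the default ones, which is not the exact order the real linker uses.
    """
    for directory in list(conf_dirs) + list(default_dirs):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With these helpers, find_library("libxxxx.so", parse_ld_so_conf(open("/etc/ld.so.conf").read())) returns None when no listed directory contains the library.
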
To find out which libraries a program needs, use the ldd command: run ldd /full/path/to/foo to get a full list of the libraries used and their locations. Any library the dynamic linker cannot locate is reported as ‘not found’. Now that you know what goes on under the hood, you’re better equipped to solve these problems when they arise.
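
If you check many programs, the ldd step can be scripted. The sketch below picks out the libraries that ldd reports as ‘not found’. The function names are hypothetical, and the parsing assumes the usual ‘name => location’ layout of ldd’s output.

```python
import subprocess

def missing_libraries(ldd_output):
    """Return the names of libraries that ldd output marks as missing."""
    missing = []
    for line in ldd_output.splitlines():
        # Lines of interest look like: "libxxxx.so => not found"
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found":
            missing.append(name)
    return missing

def check_program(path):
    """Run ldd on `path` and return its missing libraries."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    return missing_libraries(result.stdout)
```

An empty list from check_program("/full/path/to/foo") means every library was located, so the cause of the failure lies elsewhere.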
