A Linux process can create a new, nearly identical copy of itself by calling fork().
Let’s watch it happen. Imagine you’re in a shell, and you type echo hello. (Most shells have a built-in echo; assume here that the external /bin/echo runs instead.) What you’re seeing is actually a sequence of system calls.
First, the shell, which is itself a process, calls fork(). This creates a new process, the "child," that is almost identical to the original, the "parent." The child gets its own copy of the parent’s memory (made cheaply through copy-on-write), copies of the parent’s open file descriptors, and the same environment variables. It differs in a few respects, most visibly in having its own PID.
Right after fork(), both the parent and the child continue execution from the point where fork() returns. The critical difference is the return value: 0 in the child, the child’s process ID (PID) in the parent, and -1 in the parent (with no child created) if the call fails. This return value is how each process knows which one it is.
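To make the two return values concrete, here is a minimal sketch in Python, whose os.fork() wraps the same system call. The function name fork_return_values and the pipe used to report back from the child are illustrative choices, not anything fork() requires. One difference from C: Python raises OSError on failure instead of returning -1.

```python
import os


def fork_return_values():
    """Fork once and report what each side saw as fork()'s return value."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # Python raises OSError here instead of returning -1
    if pid == 0:
        # Child: fork() returned 0. Report that, plus our real PID, to the parent.
        try:
            os.close(read_fd)
            os.write(write_fd, f"{pid} {os.getpid()}".encode())
        finally:
            os._exit(0)  # leave immediately; never fall through into parent code
    # Parent: fork() returned the child's PID.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_saw, child_pid = map(int, reader.read().split())
    os.waitpid(pid, 0)  # collect the child so it does not linger
    return pid, child_saw, child_pid


if __name__ == "__main__":
    parent_saw, child_saw, child_pid = fork_return_values()
    print(f"parent saw {parent_saw}, child saw {child_saw}, child's PID was {child_pid}")
```

The value the parent receives should match the PID the child reports for itself, while the child always sees 0.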
# In your shell, imagine this is what's happening under the hood
# This is conceptual, you can't directly type fork() in a shell
# but you can see the effects.
# Let's say your shell is PID 1234
# You type 'echo hello'
# Shell (PID 1234) calls fork()
# fork() returns 5678 (child's PID) to the shell (parent)
# fork() returns 0 to the new child process (PID 5678)
# Now, both processes are executing code right after the fork()
# The child process (PID 5678) checks the return value: it's 0.
# It knows it's the child.
# It then calls execvp() to replace itself with the 'echo' program.
# execvp('/bin/echo', ['echo', 'hello'])
# The parent process (PID 1234) checks the return value: it's 5678.
# It knows it's the parent.
# It then calls waitpid() to pause and wait for the child to finish.
# The child (PID 5678) executes /bin/echo 'hello' and prints 'hello'
# The child process then exits.
# The parent (PID 1234) wakes up from waitpid() because the child exited.
# The parent resumes its own execution (e.g., showing the prompt again).
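The trace above can be turned into something runnable. This is a rough sketch of the fork, exec, wait pattern, not how any particular shell is implemented; run_and_wait is a hypothetical helper name, and the exit status 127 for a failed exec follows the common shell convention.

```python
import os
import sys


def run_and_wait(argv):
    """Run argv roughly the way a shell would and return its exit status."""
    sys.stdout.flush()  # so buffered output is not duplicated into the child
    pid = os.fork()
    if pid == 0:
        # Child: replace this copy of the interpreter with the requested program.
        try:
            os.execvp(argv[0], argv)  # on success this call never returns
        finally:
            os._exit(127)  # reached only if exec failed
    # Parent: block until that specific child terminates.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal


if __name__ == "__main__":
    print("exit status:", run_and_wait(["echo", "hello"]))
```

Because execvp() searches PATH, passing "echo" here resolves to the external program, which matches the assumption made in the trace.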
The child process, having identified itself by the 0 return value from fork(), then typically calls one of the exec() family of functions, such as execvp(). exec() replaces the current process image with a new program. It doesn’t create a new process; it transforms the existing one, and it returns only if it fails. So the child is still the same process, but the copy of the shell’s code and data it started with is gone, and the echo program runs in its place.
The parent process, meanwhile, usually calls wait() or waitpid(). This makes the parent pause its execution until one of its child processes terminates. This is important for process management, as it allows the parent to collect the exit status of the child and prevents "zombie" processes (processes that have terminated but whose parent hasn’t yet collected their exit status).
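Here is a small sketch of what "collecting the exit status" looks like in practice. The helper name reap is made up for this example. It also shows that once a child has been waited for, the kernel forgets it: a second waitpid() on the same PID fails because there is nothing left to collect.

```python
import os


def reap(exit_code):
    """Fork a child that exits with exit_code, then collect its status."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child terminates right away
    # Until this call returns, the terminated child is a zombie.
    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
    try:
        os.waitpid(pid, 0)  # the child was already reaped above
        already_gone = False
    except ChildProcessError:
        already_gone = True  # ECHILD: no such child remains
    return code, already_gone


if __name__ == "__main__":
    print(reap(7))
```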
Signals are a form of inter-process communication (IPC) used to notify a process of an event. They are asynchronous notifications. When a signal is sent to a process, the operating system interrupts the process’s normal flow of execution. For most signals, the process can decide in advance how to respond. It can:
- Perform the default action: For many signals, like SIGSEGV (segmentation fault), the default action is to terminate the process.
- Ignore the signal: The process can ask the kernel to discard it. For some signals, like SIGCHLD (child process status change), ignoring is already the default action, but a process can explicitly decide to handle it.
- Catch the signal: The process can register a signal handler, a specific function that will be executed when the signal arrives.
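The catch and ignore options can be sketched with Python's signal module. SIGUSR1 is used here only because it has no predefined meaning, and try_dispositions is a hypothetical name. The default action for SIGUSR1 is to terminate the process, so that case is deliberately not exercised in-process. Note that signal.signal() must be called from the main thread.

```python
import os
import signal


def try_dispositions():
    """Send SIGUSR1 to ourselves twice: once caught, once ignored."""
    received = []

    def handler(signum, frame):
        received.append(signum)  # runs when the signal is delivered

    previous = signal.signal(signal.SIGUSR1, handler)  # catch
    try:
        os.kill(os.getpid(), signal.SIGUSR1)  # handler records this one
        signal.signal(signal.SIGUSR1, signal.SIG_IGN)  # ignore
        os.kill(os.getpid(), signal.SIGUSR1)  # silently discarded
    finally:
        # Put back whatever disposition was in place before we started.
        signal.signal(
            signal.SIGUSR1,
            previous if previous is not None else signal.SIG_DFL,
        )
    return received


if __name__ == "__main__":
    print(try_dispositions())
```

Only the first signal should show up in the returned list; the second is dropped by the kernel before the process ever sees it.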
This mechanism is fundamental for controlling processes. For instance, pressing Ctrl+C in your terminal sends a SIGINT (interrupt signal) to the foreground process group. If a process has a signal handler for SIGINT, it can gracefully shut down, save its state, or clean up resources before exiting. If it doesn’t, the default action for SIGINT terminates the process.
The kill command, despite its name, doesn’t necessarily kill a process. It sends a signal. With no signal specified, kill PID sends SIGTERM, a termination request that the process can catch. kill -9 PID sends SIGKILL, which cannot be caught, blocked, or ignored, forcing termination. kill -1 PID sends SIGHUP (hangup signal), often used to tell daemons to re-read their configuration files.
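The same thing can be done programmatically: os.kill() in Python is a thin wrapper over the kill() system call that the kill command uses. The sketch below uses two hypothetical helpers. One checks whether a handler can be installed for a given signal, and the other sends a signal to a sleeping child and reports which signal ended it.

```python
import os
import signal
import time


def can_install_handler(sig):
    """Return True if the kernel lets us register a handler for sig."""
    try:
        previous = signal.signal(sig, lambda signum, frame: None)
    except (OSError, ValueError):
        return False  # e.g. SIGKILL: the kernel refuses with EINVAL
    signal.signal(sig, previous if previous is not None else signal.SIG_DFL)
    return True


def send_signal_to_child(sig):
    """Fork a sleeping child, send it sig, and return the signal that killed it."""
    pid = os.fork()
    if pid == 0:
        try:
            time.sleep(30)  # child just waits to be signalled
        finally:
            os._exit(0)
    os.kill(pid, sig)  # equivalent to `kill -<sig> <pid>` in a shell
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None


if __name__ == "__main__":
    print("handler for SIGKILL allowed:", can_install_handler(signal.SIGKILL))
    print("child ended by signal:", send_signal_to_child(signal.SIGKILL))
```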
The interplay of fork(), exec(), and signals forms the bedrock of how programs are launched and managed in Linux. When you run a command, a new process is forked, the child execs the command, and the parent often waits, all while signals can be used to manage their lifecycle.
The most surprising part of this whole dance is that exec() doesn’t create a new process; it overwrites the program running in the existing one, so the PID does not change when a process execs a new program.
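You can check this for yourself. In the sketch below (pid_after_exec is an illustrative name), the parent notes the PID that fork() handed back, the child execs sh and has it print its own PID via $$, and the parent compares the two. It assumes an sh binary is available on PATH.

```python
import os


def pid_after_exec():
    """Return (PID from fork, PID reported by the program the child exec'd)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)  # point the child's stdout at the pipe
            os.execvp("sh", ["sh", "-c", "echo $$"])  # $$ is the shell's own PID
        finally:
            os._exit(127)  # reached only if exec failed
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        reported = int(reader.read())
    os.waitpid(pid, 0)
    return pid, reported


if __name__ == "__main__":
    forked, reported = pid_after_exec()
    print(f"fork() returned {forked}, exec'd program reports {reported}")
```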
Understanding how a parent process can choose to ignore the SIGCHLD signal, or more commonly, how it registers a handler for SIGCHLD to reap terminated children, is crucial for building robust server applications that manage many child processes.
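As a starting point, here is one common shape for such a handler, sketched under a few assumptions: the name reap_with_sigchld and the five-second safety timeout are arbitrary, and the handler reaps any child of the process, which may not be what you want if other code also manages children. The loop with WNOHANG matters because several children can exit while only one SIGCHLD is delivered.

```python
import os
import signal
import time


def reap_with_sigchld(n_children=3):
    """Fork n_children and reap them from a SIGCHLD handler."""
    reaped = {}

    def on_sigchld(signum, frame):
        # Drain every terminated child; signals may have been coalesced.
        while True:
            try:
                pid, status = os.waitpid(-1, os.WNOHANG)
            except ChildProcessError:
                return  # no children left at all
            if pid == 0:
                return  # children exist, but none has exited yet
            reaped[pid] = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None

    previous = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        pids = []
        for i in range(n_children):
            pid = os.fork()
            if pid == 0:
                os._exit(i)  # each child exits with a distinct status
            pids.append(pid)
        deadline = time.monotonic() + 5  # safety net so we never hang
        while len(reaped) < n_children and time.monotonic() < deadline:
            time.sleep(0.01)  # the parent is free to do other work here
    finally:
        signal.signal(
            signal.SIGCHLD,
            previous if previous is not None else signal.SIG_DFL,
        )
    return [reaped.get(pid) for pid in pids]


if __name__ == "__main__":
    print(reap_with_sigchld())
```

The parent never blocks in wait() here; it keeps running and the handler cleans up each child as it exits, which is the property a server juggling many children usually needs.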