What is the Unix System Interface?
This part presents ONLY the material from the lecture. The
book's material is at the end and is considered extra.
Imagine you're writing a program that saves data to a file. In your C
code, you call fwrite()
and the data gets written. But
have you wondered how this actually works? How does your program
communicate with the hard drive, and how does the operating system
ensure that your program can't accidentally corrupt another program's
files?
The answer lies in the Unix System Interface, a
carefully designed layer sitting between your programs and the
hardware. This interface provides
system calls, special functions that allow programs to
safely interact with files, processes, memory, and other system
resources.
The Core Problem
Modern computers run multiple programs simultaneously. Allowing each
program direct access to hardware would result in:
-
Security issues: Programs could read private
files.
-
Data corruption: Programs could overwrite each
other's data.
-
System crashes: A single buggy program could
crash the entire system.
-
Resource conflicts: Multiple programs competing
for the same hardware.
The Unix Solution
Unix solves these issues by creating a controlled interface:
-
System Calls: Programs request services from the
OS through system calls.
-
Kernel Protection: The kernel controls all
hardware access and enforces security.
-
Process Isolation: Each program runs within its
own protected memory space.
Understanding Unix Processes
Every program you run becomes a process, a
self-contained instance of that program. Understanding processes is
essential to understanding the Unix System Interface.
A process includes:
-
Memory Space: Contains the program's code, data,
and variables.
-
Process ID (PID): Unique identifier for the
process.
-
File Descriptor Table: Tracks open files and
their permissions.
-
Environment Variables: Configuration settings for
the process.
-
Working Directory: The process's current
operating directory.
-
Security Context: Permissions and access rights.
How a Process Is Assembled
When you run a program, the operating system:
- Allocates memory space for the program.
- Loads the program instructions into memory.
- Assigns a unique PID.
-
Sets up file descriptors (stdin, stdout, stderr).
- Sets working directory and environment variables.
- Assigns permissions based on user context.
- Starts executing the program instructions.
Each process operates in isolation, ensuring system security and
stability.
File Descriptors:
Now that we understand the basic flow, let's talk about how Unix
organizes file access. Instead of letting programs directly manipulate
files, Unix uses a system called
file descriptors.
Think of file descriptors as simple "tickets" or "handles" that your
program gets when it opens a file. Each process maintains a
file descriptor table with unique integer IDs (0, 1,
2, 3...) that track which files are open and what permissions they
have.
File Descriptor | Name | Default Stream | Flags | Description
0 | stdin | Standard Input | R | Read input from keyboard/terminal
1 | stdout | Standard Output | W | Write normal output to terminal
2 | stderr | Standard Error | W | Write error messages to terminal
3 | message.txt | Our File | W | The file we created in our example
4+ | Other Files | User-defined | R/W/A | Additional files opened by the program
-
fd 0, 1, 2: Always pre-opened for every program
(stdin, stdout, stderr)
-
fd 3+: Your program gets these when opening files
with open() or fopen()
-
Flags: R=Read, W=Write, A=Append - determines
what operations are allowed
-
Simple integers: Much easier than dealing with
complex file paths and permissions
Process ID vs File Descriptor:
Don't confuse Process ID (PID) with
File Descriptor (FD):
-
PID: Identifies the process itself (unique
system-wide)
-
FD: Identifies an open file within that specific
process
Each process has its own set of file descriptors (0, 1, 2, 3...),
but only one PID. The same file descriptor number in different
processes can refer to completely different files.
Example - Two Text Editors Running Simultaneously:
-
Process A: TextEditor (PID 1234) opens "data.txt" → gets FD 3
File descriptor table: 0=stdin, 1=stdout, 2=stderr, 3=data.txt
-
Process B: TextEditor (PID 5678) opens "config.txt" → also gets FD 3
File descriptor table: 0=stdin, 1=stdout, 2=stderr, 3=config.txt
Key Point: Both processes have FD 3, but when
Process A writes to FD 3, it modifies "data.txt", while Process B
writing to FD 3 modifies "config.txt". The file descriptor numbers
are process-local, not system-wide!
Flags:
- O_RDONLY: Read-only access.
- O_WRONLY: Write-only access.
- O_RDWR: Read and write access.
- O_CREAT: Create file if nonexistent.
- O_APPEND: Append data to end of file.
- O_TRUNC: Clear file contents upon opening.
Permissions (Who Can Do What):
Owner (User), Group, Others permissions with read (r), write (w),
execute (x).
Common examples:
- 0644: Owner read/write, others read-only.
-
0755: Owner full access, others read/execute.
- 0600: Owner read/write only.
How System Calls Work
Let us start with the following example:
#include <stdio.h>
#include <string.h>
int main() {
    char str[] = "Hi";
    FILE* fp = fopen("a.txt", "w");
    if (fp == NULL)
        return 1;                      /* could not open the file */
    fwrite(str, 1, strlen(str), fp);   /* write "Hi" without the trailing '\0' */
    fclose(fp);
    return 0;
}
We will inspect the fwrite call in this code. Here's what happens:
-
Program Call: You invoke a file-writing function (fwrite).
-
C Library Translation: High-level calls
internally call low-level system calls.
-
Kernel Mode Switch: The CPU switches from user
mode to kernel mode securely.
-
Kernel Execution: Kernel routine performs the
requested file operation safely.
-
Hardware Access: Data is written safely to the
storage device.
Understanding this low-level interaction provides precise control,
performance improvements, deeper system knowledge, and better
debugging capability.
Your Program
↓ File writing function call
C Library (glibc)
↓ write() wrapper function
USER / KERNEL BOUNDARY (security checkpoint)
↓ syscall instruction
System Call Table
↓ sys_write() lookup
Kernel Routine
↓ Safe hardware access
Hard Drive
The Three Essential System Calls
System Call Functions - The Building Blocks:
Unix file operations are built on three fundamental system calls.
Input and output use the read and write system calls, which
are accessed from C programs through two functions called
read and write.
read() - Reading Data from Files:
int n_read = read(int fd, char *buf, int n);
-
fd: File descriptor (which file to read from)
-
buf: Character array in your program where the
data will go
- n: Number of bytes to be transferred
-
Returns: Count of bytes actually read. Zero
indicates end of file, -1 indicates an error
Note: The number of bytes returned may be less than the number
requested.
write() - Writing Data to Files:
int n_written = write(int fd, char *buf, int n);
-
fd: File descriptor (which file to write to)
-
buf: Character array in your program where the
data comes from
- n: Number of bytes to be transferred
-
Returns: Number of bytes actually written. An
error has occurred if this isn't equal to the number requested
open() - Opening Files for Access:
int open(char *name, int flags, int perms);
- name: Pathname of the file to open
-
flags: How you want to access the file (read,
write, create, etc.)
-
perms: File permissions when creating new files
(always zero for existing files)
-
Returns: File descriptor (non-negative integer) on
success, -1 on error
Important: Other than the default standard input,
output and error, you must explicitly open files in order to read or
write them. There are two system calls for this, open and creat.
However, creat is rarely used these days since it can be fully
replaced with open.
Every file operation on Unix ultimately uses these system calls,
no matter how high-level your programming language!
Example
Now that you understand the basics, let's see a more advanced example
that demonstrates the power of low-level system calls. This example
shows operations that would be difficult or impossible with high-level
functions:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
int main() {
    int fd = open("demo.txt", O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    write(fd, "Hello World\n", 12);
    printf("Created file with: Hello World\n");
    off_t pos = lseek(fd, 6, SEEK_SET);
    printf("Moved file pointer to position: %ld\n", (long)pos);
    int bytes = write(fd, "Unix", 4);
    printf("Overwrote %d bytes at position 6\n", bytes);
    /* "Unix" is one byte shorter than "World": end the line here and cut
       the leftover byte, otherwise the file would read "Hello Unixd" */
    write(fd, "\n", 1);
    ftruncate(fd, 11);
    close(fd);
    printf("Final result: Hello Unix\n");
    return 0;
}
What This Advanced Example Demonstrates:
-
Initial Write: Creates "demo.txt" and writes
"Hello World\n" (12 bytes)
-
Precise Positioning: lseek() moves file pointer
to byte 6 (after "Hello ")
-
Selective Overwriting: Writes the 4 bytes of "Unix" over the start of
"World", leaving the "Hello " before it untouched
-
Result: File now contains "Hello Unix\n" instead
of "Hello World\n"
This type of precise file manipulation is essential for
databases, editors, and other applications that need to modify
specific parts of files efficiently.
Understanding lseek() - The File Position Controller:
-
SEEK_SET - Position from beginning of file: lseek(fd, 6, SEEK_SET) = "go to byte 6"
-
SEEK_CUR - Position relative to current location: lseek(fd, 5, SEEK_CUR) = "move forward 5 bytes"
-
SEEK_END - Position from end of file: lseek(fd, 0, SEEK_END) = "go to end of file"
lseek() is like a cursor in a text editor: it determines where
the next read or write operation will happen.
Why close() is Critical:
Always remember to close your file descriptors! Here's why:
-
Data Safety: With fclose(), flushes the C library's buffered data to
the kernel (close() by itself does not force data onto the disk; use
fsync() for that)
-
Resource Management: Frees the file descriptor number so the
process can reuse it
-
System Limits: Each process has a limit (often 1024 open
file descriptors by default on Linux)
-
File Locking: Releases locks the process holds on the file so
other programs can access it
Forgetting to close files can lead to data loss, resource
exhaustion, and system instability!
Here's how the advanced example looks in the terminal:
user@vm:~/unix-demo $ gcc advanced_demo.c -o advanced_demo
user@vm:~/unix-demo $ ./advanced_demo
Created file with: Hello World
Moved file pointer to position: 6
Overwrote 4 bytes at position 6
Final result: Hello Unix
user@vm:~/unix-demo $ cat demo.txt
Hello Unix
user@vm:~/unix-demo $
Assembly Level: System Calls in Their Purest Form
Now that you understand how system calls work at the C level, let's
see what happens at the lowest level - assembly language. This is
where system calls directly communicate with the kernel using CPU
instructions.
When your C program calls write(), it eventually becomes
assembly instructions that load values into specific CPU registers and
execute the syscall instruction. Let's see exactly how
this works.
sys_read
system call number (in rax): 0
arguments:
- rdi: file descriptor (to read from it)
-
rsi: pointer to buffer (where the data that is read will be
stored)
-
rdx: maximum number of bytes to read (at most the
buffer size)
return value (in rax):
- number of bytes received
- On errors: negative number
.section .bss
buffer: .space 1
.section .text
.global _start
_start:
movq $0, %rax
movq $0, %rdi
leaq buffer(%rip), %rsi
movq $1, %rdx
syscall
movq $60, %rax
movq $0, %rdi
syscall
sys_write
system call number (in rax): 1
arguments:
- rdi: file descriptor (to write to it)
- rsi: pointer to buffer (data to write)
- rdx: number of bytes to write
return value (in rax):
- number of bytes written
- On errors: negative number
.section .data
msg: .ascii "Hello\n"
len = . - msg
.section .text
.global _start
_start:
movq $1, %rax
movq $1, %rdi
leaq msg(%rip), %rsi
movq $len, %rdx
syscall
movq $60, %rax
movq $0, %rdi
syscall
Assembly Code to Print a Positive Integer
What this code does:
Converts the integer 234 to its ASCII string
representation and prints it to stdout.
Key techniques:
-
Backward string building - Builds the string
from right to left
-
Division by 10 - Extracts digits one by one
-
ASCII conversion - Adds '0' (48) to convert
digit to ASCII
-
Dynamic length - Calculates string length
during conversion
Register usage:
-
%rax - Holds the number being converted
(quotient after division)
-
%rdx - Contains remainder after division
(current digit)
-
%rcx - Holds divisor (10) for division
operation
-
%rdi - Buffer pointer, moves backward as digits
are stored
-
%rsi - Points to start of converted string for
sys_write
Algorithm:
- Load number (234) into %rax register
- Point %rdi to end of buffer
- Divide %rax by 10, get remainder in %rdx (last digit)
- Convert digit in %rdx to ASCII and store at (%rdi)
- Repeat until %rax becomes 0
- Print using %rsi pointer to start of digits
.section .data
x: .long 234
buflen: .long 0
.section .bss
buffer: .space 10
.section .text
.global _start
_start:
xorq %rax, %rax
movl x, %eax
leaq buffer, %rdi
addq $10, %rdi
convert_loop:
movq $0, %rdx
movq $10, %rcx
divq %rcx
addq $'0', %rdx
decq %rdi
movb %dl, (%rdi)
incl buflen
testq %rax, %rax
jnz convert_loop
movq $1, %rax
movq %rdi, %rsi
movq $1, %rdi
movl buflen, %edx        # buflen is a 4-byte .long; movl also clears the upper half of %rdx
syscall
movq $60, %rax
movq $0, %rdi
syscall
Print the .text Section Byte by Byte
What this code does:
Prints each byte of the .text section (the
executable code) one by one to stdout.
Key concepts:
-
Section boundaries - Uses labels to mark start
and end of .text section
-
Byte-by-byte iteration - Loops through each
byte in the section
-
Pointer arithmetic - Increments memory address
to access next byte
-
Memory inspection - Reads and displays raw
executable code
Register usage:
-
%rcx - Current byte pointer, starts at label1
(section start)
- %rax - System call number for sys_write
- %rdi - File descriptor (stdout = 1)
- %rsi - Pointer to current byte to print
-
%rdx - Number of bytes to write (always 1)
Algorithm:
- Set %rcx to label1 (start of .text section)
- Print the byte at address %rcx
- Increment %rcx to point to next byte
- Check if %rcx reached label2 (end of section)
- If not at end, repeat from step 2
- Exit when entire section is printed
.section .text
.global _start
label1:
_start:
leaq label1(%rip), %rcx
print_loop:
leaq label2(%rip), %rax
cmpq %rax, %rcx
jge exit_program
movq $1, %rax
movq $1, %rdi
movq %rcx, %rsi
movq $1, %rdx
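pushq %rcx               # syscall overwrites %rcx and %r11, so save the byte pointer
syscall
popq %rcx                # restore the byte pointer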
incq %rcx
jmp print_loop
exit_program:
movq $60, %rax
movq $0, %rdi
syscall
label2:
sys_open
system call number (in rax): 2
arguments:
- rdi: pathname of the file to open/create
-
rsi: file access bits (bitwise OR'ed together)
- O_RDONLY (0) - read only
- O_WRONLY (1) - write only
- O_RDWR (2) - read and write
- O_APPEND (1024) - append to end
- O_TRUNC (512) - truncate existing content
- O_CREAT (64) - create if doesn't exist
-
rdx: file permissions (when O_CREAT is set)
- S_IRWXU (0700) - RWX mask for owner
- S_IRUSR (0400) - Read for owner
- S_IWUSR (0200) - Write for owner
- S_IXUSR (0100) - Execute for owner
return value (in rax):
- file descriptor (non-negative integer)
- On errors: negative number
Example breakdown:
$66 = O_RDWR | O_CREAT (2 + 64)
$0700 = S_IRWXU (owner: read/write/execute)
.section .data
fileName: .string "file.txt"
fd: .quad 0
.section .text
.global main
main:
movq $2, %rax
leaq fileName(%rip), %rdi
movq $66, %rsi
movq $0700, %rdx
syscall
movq %rax, fd(%rip)
movq $60, %rax
movq $0, %rdi
syscall
sys_close
system call number (in rax): 3
arguments:
-
rdi: file descriptor (obtained from sys_open)
return value (in rax):
- 0 on success
- On errors: negative number
What sys_close does:
-
Releases file descriptor - Makes it available
for reuse
-
Does not force data to disk - Written data may still sit in the
kernel's cache after closing; fsync is the call that forces it out
-
Frees kernel resources - Cleans up internal
file structures
-
Prevents resource leaks - Essential for
long-running programs
Example workflow:
-
file_open: Opens "file.txt" with O_RDONLY
-
file_close: Closes the file using saved
descriptor
- exit_program: Clean program termination
Important notes:
Always close files when done! A process has a limited number of file
descriptors (~1024).
xorq %rdx, %rdx efficiently sets %rdx to 0.
.section .data
fileName: .string "file.txt"
fd: .quad 0
.section .text
.global main
main:
file_open:
movq $2, %rax
leaq fileName(%rip), %rdi
movq $0, %rsi
xorq %rdx, %rdx
syscall
movq %rax, fd(%rip)
file_close:
movq $3, %rax
movq fd(%rip), %rdi
syscall
exit_program:
movq $60, %rax
xorq %rdi, %rdi
syscall
sys_lseek
system call number (in rax): 8
arguments:
- rdi: file descriptor
- rsi: offset (number of bytes to move)
-
rdx: where to move from
- SEEK_SET (0) - beginning of the file
- SEEK_CUR (1) - current position of the file pointer
- SEEK_END (2) - end of file
return value (in rax):
- Current position of the file pointer
- On errors: negative number
What sys_lseek does:
-
Repositions file pointer - Changes where next
read/write will occur
-
Non-destructive operation - Doesn't modify file
content, only pointer position
-
Enables random access - Jump to any position in
file without reading sequentially
-
Essential for file editing - Allows overwriting
specific parts of files
Example workflow:
-
file_open: Opens "file.txt" with O_RDONLY
(read-only)
-
file_lseek: Moves file pointer 15 bytes from
beginning (SEEK_SET)
- exit_program: Clean program termination
Important notes:
SEEK_SET (0) = absolute positioning from file
start.
SEEK_CUR (1) = relative positioning from current
location.
SEEK_END (2) = positioning relative to file
end.
Moving beyond file end with write operations extends the file.
.section .data
fileName: .string "file.txt"
fd: .quad 0
.section .text
.global main
main:
file_open:
movq $2, %rax
leaq fileName(%rip), %rdi
movq $0, %rsi
xorq %rdx, %rdx
syscall
movq %rax, fd(%rip)
file_lseek:
movq $8, %rax
movq fd(%rip), %rdi
movq $15, %rsi
movq $0, %rdx
syscall
exit_program:
movq $60, %rax
xorq %rdi, %rdi
syscall
Unix Processes
Unix processes are independent programs running in memory. Each
process has its own memory space, file descriptors, and process ID
(PID). The operating system manages these processes and provides
system calls to create, control, and communicate between them.
fork
The fork() system call creates a new process by
duplicating the current process. The new process (child) is a nearly
exact copy of the original process (parent); the two differ in the
return value of fork() and in their PIDs.
How fork() works:
- Returns 0 - In the child process
- Returns child PID - In the parent process
- Returns -1 - On error (fork failed)
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int main() {
    pid_t pid = fork();
    if (pid == 0) {
        printf("I'm the child! PID: %d\n", getpid());
    } else if (pid > 0) {
        printf("I'm the parent! Child PID: %d\n", pid);
        wait(NULL);
    } else {
        printf("Fork failed!\n");
    }
    return 0;
}
What happens step by step:
-
fork() is called - Creates an identical copy of
the current process
-
Two processes exist - Parent and child, both
continue from the fork() line
-
Different return values - Child gets 0, parent
gets child's PID
-
Both execute if-else - Each process takes a
different branch
-
wait() synchronizes - Parent waits for child to
complete
Here's how it looks when you run it:
user@vm:~/SPLab $ gcc fork_demo.c -o fork_demo
user@vm:~/SPLab $ ./fork_demo
I'm the parent! Child PID: 1234
I'm the child! PID: 1234
user@vm:~/SPLab $
Key Points:
-
Memory isolation - Child has its own copy of
variables
-
File descriptors - Child inherits parent's open
files
-
Process hierarchy - Parent-child relationship is
established
-
Concurrent execution - Both processes run
simultaneously
#include <stdio.h>
#include <unistd.h>
int main() {
    int counter = 0;
    printf("Before fork: counter = %d\n", counter);
    pid_t pid = fork();
    if (pid == 0) {
        counter += 10;
        printf("Child: counter = %d\n", counter);
    } else if (pid > 0) {
        counter += 5;
        printf("Parent: counter = %d\n", counter);
    }
    return 0;
}
Memory Isolation Demo:
This example shows that each process has its own memory space. When
the child modifies counter, it doesn't affect the
parent's copy.
Output: Parent shows counter = 5, Child shows counter = 10
Understanding wait() - Process Synchronization
The wait() system call is crucial for parent-child
process coordination. It allows the parent to wait for a child process
to terminate and retrieve its exit status.
How wait() works:
-
Blocks the parent - Parent stops executing until
a child terminates
-
Returns child PID - The PID of the terminated
child process
-
Collects exit status - Stores termination
information in the status parameter
-
Prevents zombie processes - Cleans up terminated
child processes
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>
int main() {
    int pid = fork();
    int status;
    if(pid) {
        printf("Parent: waiting for child (PID %d) to finish...\n", pid);
        int child_pid = wait(&status);
        if (WIFEXITED(status))
            printf("Parent: child PID %d exited with status %d\n",
                   child_pid, WEXITSTATUS(status));
    }
    else {
        printf("Child: I'm about to exit with status 18\n");
        _exit(18);
    }
    return 0;
}
Process Flow
- Parent (pid > 0): printf("Parent: waiting...") → child_pid = wait(&status) → WIFEXITED(status) check → printf("child PID %d exited...")
- Child (pid == 0): printf("Child: I'm about...")
Step-by-Step Execution:
-
fork() creates child - Parent gets child PID,
child gets 0
-
Child executes else block - Prints message and
exits with status 18
-
Parent calls wait() - Blocks until child
terminates
-
wait() returns - Returns the PID of the
terminated child
-
Status is analyzed - WIFEXITED() checks if child
exited normally
-
Exit status extracted - WEXITSTATUS() gets the
actual exit code (18)
Here's the terminal output showing the synchronization:
user@vm:~/SPLab $ gcc wait_demo.c -o wait_demo
user@vm:~/SPLab $ ./wait_demo
Parent: waiting for child (PID 1234) to finish...
Child: I'm about to exit with status 18
Parent: child PID 1234 exited with status 18
user@vm:~/SPLab $
Why wait() Returns the Child PID:
Think of it logically: a parent process might have multiple children
running simultaneously. When wait() returns, the parent
needs to know which specific child just terminated.
The return value (child PID) identifies exactly which child process
finished.
-
Multiple children scenario: Parent can
distinguish between different child processes
-
Process tracking: Parent can maintain records of
which children are still running
-
Selective waiting: Parent can take different
actions based on which child terminated
Process Termination Status Explained:
-
status parameter: Contains packed information
about how the child terminated
-
WIFEXITED(status): Returns true if child exited
normally (not killed by signal)
-
WEXITSTATUS(status): Extracts the actual exit
code passed to _exit() or return
-
_exit(18): Child terminates immediately with exit
status 18
The parent can use this information to determine if the child
completed successfully or encountered an error.
Advanced Process Management Examples
Now let's see practical examples of the three scenarios mentioned
above. Each example demonstrates a different aspect of process
management with accompanying flowcharts.
1. Multiple Children Scenario
This example shows how a parent process can create and distinguish
between multiple child processes:
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int main() {
    pid_t child1, child2;
    child1 = fork();
    if (child1 == 0) {
        printf("Child 1: PID %d, doing task A\n", getpid());
        sleep(2);
        return 10;
    }
    child2 = fork();
    if (child2 == 0) {
        printf("Child 2: PID %d, doing task B\n", getpid());
        sleep(1);
        return 20;
    }
    int status;
    pid_t finished = wait(&status);
    if (finished == child1) {
        printf("Child 1 finished with code %d\n",
               WEXITSTATUS(status));
    } else if (finished == child2) {
        printf("Child 2 finished with code %d\n",
               WEXITSTATUS(status));
    }
    finished = wait(&status);
    if (finished == child1) {
        printf("Child 1 finished with code %d\n",
               WEXITSTATUS(status));
    } else {
        printf("Child 2 finished with code %d\n",
               WEXITSTATUS(status));
    }
    printf("All children finished!\n");
    return 0;
}
Multiple Children Process Flow
- Parent: identify the finished child by PID
user@vm:$ gcc multiple_children.c -o multiple_children
user@vm:$ ./multiple_children
Child 1: PID 1234, doing task A
Child 2: PID 1235, doing task B
Child 2 finished with code 20
Child 1 finished with code 10
All children finished!
user@vm:$
2. Process Tracking Scenario
This example shows how a parent can maintain records of which children
are still running:
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int main() {
    pid_t children[3];
    int active_children = 3;
    for (int i = 0; i < 3; i++) {
        children[i] = fork();
        if (children[i] == 0) {
            printf("Child %d started (PID: %d)\n", i+1, getpid());
            sleep((i+1) * 2);
            printf("Child %d finishing\n", i+1);
            return i+1;
        }
    }
    while (active_children > 0) {
        int status;
        pid_t finished_pid = wait(&status);
        for (int i = 0; i < 3; i++) {
            if (children[i] == finished_pid) {
                printf("Tracked: Child %d (PID %d) finished\n",
                       i+1, finished_pid);
                children[i] = -1;
                active_children--;
                break;
            }
        }
        printf("Remaining active children: %d\n", active_children);
    }
    printf("All children tracked and finished!\n");
    return 0;
}
Process Tracking Flow
- Parent creates 3 children
- Initialize tracking array: children[3] = {PID1, PID2, PID3}
- Parent waits in loop; wait() returns a PID
- Find the PID in the array, mark it as finished (-1), update the counter
- If not all children have finished: continue the loop (back to the wait step)
user@vm:$ gcc process_tracking.c -o process_tracking
user@vm:$ ./process_tracking
Child 1 started (PID: 1234)
Child 2 started (PID: 1235)
Child 3 started (PID: 1236)
Child 1 finishing
Tracked: Child 1 (PID 1234) finished
Remaining active children: 2
Child 2 finishing
Tracked: Child 2 (PID 1235) finished
Remaining active children: 1
Child 3 finishing
Tracked: Child 3 (PID 1236) finished
Remaining active children: 0
All children tracked and finished!
user@vm:$
3. Selective Waiting Scenario
This example shows how a parent can take different actions based on
which child terminated:
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int main() {
    pid_t worker_pid, monitor_pid;
    worker_pid = fork();
    if (worker_pid == 0) {
        printf("Worker: Processing data...\n");
        sleep(3);
        printf("Worker: Data processing complete\n");
        return 0;
    }
    monitor_pid = fork();
    if (monitor_pid == 0) {
        printf("Monitor: Watching system health...\n");
        sleep(5);
        printf("Monitor: System check complete\n");
        return 1;
    }
    int status;
    pid_t finished_pid = wait(&status);
    if (finished_pid == worker_pid) {
        printf("Action: Worker finished first\n");
        printf("Action: Proceeding with next phase\n");
        printf("Action: Terminating monitor\n");
    } else if (finished_pid == monitor_pid) {
        printf("Action: Monitor finished first\n");
        printf("Action: Checking if worker needs help\n");
    }
    wait(&status);
    printf("Parent: All tasks completed\n");
    return 0;
}
Selective Waiting Flow
- fork() → Worker child: process data (3s)
- fork() → Monitor child: watch system (5s)
- Parent: wait() returns the PID of the first child to finish
- Wait for the remaining child and continue
user@vm:$ gcc selective_waiting.c -o selective_waiting
user@vm:$ ./selective_waiting
Worker: Processing data...
Monitor: Watching system health...
Worker: Data processing complete
Action: Worker finished first
Action: Proceeding with next phase
Action: Terminating monitor
Monitor: System check complete
Parent: All tasks completed
user@vm:$
execvp() - Running External Programs
The execvp() function (a C library wrapper around the execve system
call) is how Unix processes run external commands and programs. Think
of it as "replacing" the current process with a completely different
program. It's like taking over someone's body: the process ID stays
the same, but everything else changes.
How execvp() works:
-
Completely replaces the process - The current
program is thrown away and replaced with a new one
-
Same process ID (PID) - The process keeps its ID,
but becomes a totally different program
-
Passes command-line arguments - The new program
receives the arguments you specify
-
Never returns on success - If it works, your
original program is gone forever
-
Returns only on failure - If the command doesn't
exist, execvp() fails and your original program continues
Real-world example:
Imagine you're a manager (parent process) and you need to send an
employee (child process) to deliver a presentation. You make a copy
of the employee (fork), then completely transform that copy into a
presentation-delivery specialist (execvp). The specialist delivers
the presentation and then disappears (process ends). Meanwhile, you
(the original manager) wait for the job to be done, then continue
with your normal work.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>
int main() {
    char* argv[3] = { "ls", "-l", 0 };
    int pid = fork();
    if (pid)
        wait(NULL);
    else
        execvp(argv[0], argv);
    return 0;
}
Process Flow (execvp Success)
- argv[3] = {"ls", "-l", 0}
Code Breakdown - Step by Step:
-
argv[3] = {"ls", "-l", 0} - This creates the
command "ls -l" (list files in long format). The array must end
with 0 to mark the end of arguments.
-
fork() creates two processes - Parent (original
program) and child (copy of the program)
-
Parent process (if pid > 0) - Calls wait() and
sleeps until the child finishes
-
Child process (if pid == 0) - Calls execvp() and
transforms into the "ls" program
-
execvp(argv[0], argv) - Child is completely
replaced by the program named in argv[0] ("ls") with all arguments
in argv
-
ls runs and exits - The transformed child process
lists files and then terminates
-
Parent wakes up - wait() returns, parent
continues and exits normally
Important: Understanding argv[0] and the argv array
argv[0] is NOT a process ID! It's the
program name that the new process should identify
itself as.
-
argv[0] = "ls" - The name of the program to
execute
-
argv[1] = "-l" - First command-line argument
-
argv[2] = 0 - NULL terminator (marks end of
arguments)
When execvp(argv[0], argv) runs, it searches for a program named
"ls", and when that program starts, it receives the entire argv
array as its command-line arguments. The "ls" program will see
argv[0]="ls" and argv[1]="-l", just like if you typed "ls -l" in the
terminal!
Key Point:
The child process doesn't "run the ls command" - it
becomes the ls command! It's like a complete
personality change. The child process stops being your program and
starts being the ls program instead.
Here's the terminal output showing the ls command execution:
user@vm:~/SPLab $ gcc execvp_demo.c -o execvp_demo
user@vm:~/SPLab $ ./execvp_demo
total 24
-rwxr-xr-x 1 user user 8760 Jan 15 10:30 execvp_demo
-rw-r--r-- 1 user user 245 Jan 15 10:29 execvp_demo.c
-rw-r--r-- 1 user user 123 Jan 15 09:45 demo.txt
user@vm:~/SPLab $
What Actually Happened - The Complete Story:
-
Program starts - Your main() function begins
executing
-
fork() creates identical twin - Now there are two
copies of your program running simultaneously
-
Parent says "I'll wait" - Parent process calls
wait() and goes to sleep
-
Child transforms completely - Child calls
execvp() and becomes the "ls -l" program (your original program in
the child is gone!)
-
ls does its job - The transformed child (now ls)
lists directory contents and prints them
-
ls finishes and dies - The ls program completes
and the child process terminates
-
Parent wakes up and exits - wait() returns,
parent realizes child is done, then parent exits too
Important: This is exactly how your shell works
when you type "ls -l" in the terminal! The shell forks itself,
transforms the child into ls, waits for ls to finish, then shows you
the prompt again.
What if we change the code like this:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>
int main() {
    char* argv[2] = { "junk", 0 };
    int pid = fork();
    if (pid)
        wait(NULL);
    else
        execvp(argv[0], argv);
    return 0;
}
user@vm:~/SPLab $ gcc junk_demo.c -o junk_demo
user@vm:~/SPLab $ ./junk_demo
user@vm:~/SPLab $
(No output appears: execvp() fails silently, and this program prints no
error message of its own.)
Process Flow (execvp Failure)
- Parent (pid > 0): resumes after the child exits
- Child (pid == 0): execvp() FAILS! "junk" not found → child continues with the original program
Understanding Command Failure with "junk":
We intentionally use "junk" - a command that doesn't exist - to
demonstrate what happens when execvp() fails. This is educational
because it shows the difference between success and failure.
What happens when execvp() fails:
-
Child tries to become "junk" - execvp() looks for
a program called "junk"
-
System can't find "junk" - No such program exists
on the system
-
execvp() fails and returns -1 - Instead of
transforming, the child continues as your original program
-
Child reaches return 0 - Since execvp() failed,
the child process falls through to the end of main() and exits normally
-
Parent wakes up - wait() returns when child
exits, parent continues and exits
Key insight: When execvp() succeeds, it never
returns (the process becomes something else). When it fails, it
returns -1, sets errno, and your original program continues running.
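Because a successful execvp() never returns, any statement placed right after it runs only on failure, which makes that spot the natural place for error handling. The following is a minimal sketch rather than the lecture's code: the function name run_and_wait is hypothetical, and the exit code 127 for "command not found" is borrowed from shell convention. It also collects the child's exit status in the parent, so the caller can tell success from failure.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: run argv[0] in a child and return its exit status,
 * 127 if exec failed, or -1 if fork/wait failed or the child was killed
 * by a signal. */
int run_and_wait(char *const argv[]) {
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        execvp(argv[0], argv);
        /* Only reached when execvp() failed: report why and leave at once,
         * so the child does not fall through into the parent's code. */
        perror(argv[0]);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With an argument array of { "junk", NULL }, the child prints an error message to standard error and the function returns 127. With { "ls", "-l", NULL } it returns whatever exit status ls reports.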
Back to our working example:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>

int main() {
    char* argv[3] = { "ls", "-l", 0 };
    int pid = fork();
    if (pid)
        wait(NULL);
    else
        execvp(argv[0], argv);
    return 0;
}
Critical Question:
In the last example, the parent and the child processes do not
execute in parallel: the parent just waits. So why bother having a
child process at all? Aren't there simpler solutions, such as using
a single process and adding a few more loops?
Let's find out...
Unix Shell
A Unix shell is a command-line interpreter that provides a user
interface for the Unix operating system. It acts as an intermediary
between the user and the kernel, allowing users to execute commands,
run programs, and manage files. The shell reads commands from the
user, interprets them, and executes the appropriate programs.
Why does the shell need a child process if the two do not run in
parallel?
The Answer:
If the shell called execvp() directly on itself, we would lose the
shell process entirely! The shell would be replaced by the command
being executed, and when that command finished, there would be no
shell left to return to.
Here's how the shell solves this problem:
-
Shell displays prompt - Waits for user input
-
User types command - Shell parses the command
and parameters
-
Shell forks a child - Creates a copy of itself
-
Child executes command - Child process image is
replaced by the command
-
Parent waits - Shell waits for command to
complete
-
Command finishes - Child process terminates
-
Shell continues - Returns to step 1, ready for
next command
Key insight:
The shell preserves itself by using a child
process as a "sacrificial" process that gets replaced by the
user's command. This way, the original shell process remains alive
and can continue accepting new commands after each one completes.
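while (TRUE) {
    typePrompt();                        /* display the prompt */
    getCommand(&command, &parameters);   /* read and parse the input line */
    if (fork() > 0)
        wait(NULL);                      /* parent (the shell) waits */
    else
        execvp(command, parameters);     /* child becomes the command */
}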
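The loop above leaves typePrompt() and getCommand() undefined. As a rough illustration of how they might be filled in, the sketch below splits a line on whitespace and runs it with the same fork/execvp/wait pattern. The names run_line, shell_loop, and MAX_ARGS, the "$ " prompt, and the 127 exit code are choices made for this example, not part of the lecture. Quoting, pipes, and built-ins such as cd are deliberately left out.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define MAX_ARGS 64 /* arbitrary limit for this sketch */

/* Split `line` in place on whitespace, run it in a child, and return the
 * child's exit status (0 for an empty line, -1 on fork/wait failure). */
int run_line(char *line) {
    char *argv[MAX_ARGS];
    int argc = 0;

    for (char *tok = strtok(line, " \t\n");
         tok != NULL && argc < MAX_ARGS - 1;
         tok = strtok(NULL, " \t\n"))
        argv[argc++] = tok;
    argv[argc] = NULL;

    if (argc == 0)
        return 0; /* nothing to run */

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv); /* child becomes the command */
        perror(argv[0]);       /* only reached if exec failed */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Prompt, read, run, repeat until end of input (Ctrl-D). */
void shell_loop(FILE *in) {
    char line[1024];

    for (;;) {
        fputs("$ ", stdout);
        fflush(stdout);
        if (fgets(line, sizeof line, in) == NULL)
            break;
        run_line(line);
    }
}
```

A main() that simply calls shell_loop(stdin) gives a tiny interactive shell. The parent process survives every command because only the forked child is ever replaced.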
Assembly and Fork
Looking at fork() and execve() at the assembly level shows what the C
library wrappers hide: each call loads a syscall number and its
arguments into registers, executes the syscall instruction, and reads
the kernel's result back from a register. The example below creates a
child, replaces it with /bin/ls, and waits for it in the parent, first
in C and then in assembly.
Equivalent C Code:
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>

int main() {
    pid_t pid = fork();
    if (pid < 0) {
        exit(2);                            /* fork failed */
    }
    else if (pid == 0) {
        write(1, "child\n", 6);
        char* argv[] = { "/bin/ls", NULL };
        char* envp[] = { NULL };
        execve("/bin/ls", argv, envp);
        exit(1);                            /* reached only if execve failed */
    }
    else {
        wait(NULL);
        write(1, "parent\n", 7);
        exit(0);
    }
    return 0;
}
user@vm:~/assembly $ gcc -o fork_demo fork_demo.c
user@vm:~/assembly $ ./fork_demo
child
demo.txt fork_demo fork_demo.c fork_demo.s
parent
user@vm:~/assembly $
Assembly Implementation:
.section .data
bin_ls: .string "/bin/ls"               # .string appends the terminating NUL
child_msg: .string "child\n"
parent_msg: .string "parent\n"
argv: .quad bin_ls, 0                   # argv = { "/bin/ls", NULL }
envp: .quad 0                           # envp = { NULL }

.section .text
.globl _start                           # entry label assumed; the extracted listing omits it
_start:
    movq $57, %rax                      # syscall number for fork()
    syscall
    testq %rax, %rax                    # set flags from fork()'s return value
    js fork_error                       # jump to error handler if rax < 0
    jz child_process                    # jump if rax == 0 (child process)

    movq %rax, %rdi                     # parent: rdi = child's PID
    movq $61, %rax                      # syscall number for wait4()
    xorq %rsi, %rsi                     # rsi = NULL (exit status not stored)
    xorq %rdx, %rdx                     # rdx = 0 (no options)
    xorq %r10, %r10                     # r10 = NULL (no rusage)
    syscall
    movq $1, %rdi                       # file descriptor 1 (stdout)
    leaq parent_msg(%rip), %rsi         # pointer to parent message
    movq $7, %rdx                       # size of the parent message
    movq $1, %rax                       # syscall number for write()
    syscall
exit:                                   # exit parent process
    movq $60, %rax                      # syscall number for exit()
    xorq %rdi, %rdi                     # rdi = 0 (exit code)
    syscall

child_process:
    movq $1, %rdi                       # file descriptor 1 (stdout)
    leaq child_msg(%rip), %rsi          # pointer to child message
    movq $6, %rdx                       # size of the child message
    movq $1, %rax                       # syscall number for write()
    syscall
    leaq bin_ls(%rip), %rdi             # rdi = pointer to filename
    leaq argv(%rip), %rsi               # rsi = pointer to argv array
    leaq envp(%rip), %rdx               # rdx = pointer to envp array
    movq $59, %rax                      # syscall number for execve()
    syscall
    movq $60, %rax                      # reached only if execve() failed
    movq $1, %rdi                       # rdi = 1 (exit code)
    syscall

fork_error:
    movq $60, %rax                      # syscall number for exit()
    movq $2, %rdi                       # rdi = 2 (exit code)
    syscall
How This Assembly Implementation Works:
-
Data Section Setup: The assembly defines string
literals and data structures in memory, including the program path
("/bin/ls"), argument array, and environment pointer - all stored
in the .data section with proper null termination.
-
Direct System Call Interface: Instead of using C
library functions, this code makes direct syscalls to the kernel
using specific syscall numbers (57 for fork, 61 for wait4, which
implements wait, 59 for execve, 60 for exit, 1 for write) and the
syscall instruction.
-
Register-Based Parameter Passing: System call
arguments are passed through specific registers (%rdi, %rsi, %rdx,
%r10) following the Linux x86-64 system call convention, with %rax
holding the syscall number and receiving the return value.
-
Conditional Branching Logic: The assembly uses
conditional jumps (js, jz) to handle fork's three possible
outcomes: error (negative), child process (zero), and parent
process (positive PID).
-
Process Transformation: In the child process,
execve() completely replaces the process image with the new
program (/bin/ls), while the parent waits and then prints its
message, demonstrating the fundamental Unix process model at the
lowest level.
Key Insight: This assembly code reveals the raw
mechanics that C library functions abstract away - every high-level
operation like fork(), wait(), and execve() ultimately translates to
these precise register manipulations and syscall invocations,
providing direct communication with the kernel.
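The same numbers-in-registers interface can also be reached from C without writing assembly. The sketch below assumes Linux with glibc, whose syscall() function takes a syscall number followed by its arguments. The helper names raw_getpid and raw_write are invented for the example. The SYS_ constants come from <sys/syscall.h> and expand to the architecture's syscall numbers, so on x86-64 SYS_write is 1, the same value the assembly listing loads into %rax.

```c
#define _GNU_SOURCE /* exposes syscall() in <unistd.h> on glibc */
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>

/* getpid() via the generic syscall() entry point instead of the C wrapper. */
pid_t raw_getpid(void) {
    return (pid_t)syscall(SYS_getpid);
}

/* write() via syscall(): returns bytes written, or -1 with errno set. */
ssize_t raw_write(int fd, const char *text) {
    return (ssize_t)syscall(SYS_write, fd, text, strlen(text));
}
```

Here raw_getpid() should agree with getpid(), and raw_write(1, "hi\n") behaves like write(1, "hi\n", 3). The library still loads the registers and executes the syscall instruction on the program's behalf, but the syscall number is now visible in the C source.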