# K22 - Operating Systems: Design Principles and Internals

Fall 2025 @dit

Vaggelis Atlidakis

# K22 - Operating Systems: Design Principles and Internals

Fall 2025 @dit

Vaggelis Atlidakis std05010

References: Similar OS courses @Columbia, @Stanford, @UC San Diego, @Brown, @di (previous years); and textbooks: Operating Systems: Three Easy Pieces, Operating Systems: Principles and Practice, Operating System Concepts, Linux Kernel Development, Understanding the Linux Kernel

# What you probably ordered?



# What you ended up getting?



# Why study OSes?

Revenue of leading cloud vendors (2019-21)



[1] https://iot-analytics.com/cloud-market/

# Why study OSes?

- These days: Code developed with superficial understanding
  - No free lunch: Bugs won't fix themselves in production
- In the near future: Requirement for software engineers with legit systems understanding will soon be at an all-time high
  - A line of code is not just a line of code
  - Myriads of things happen under the hood

# Why study OSes?

- These days: Code developed with superficial understanding
  - No free lunch: Bugs won't fix themselves in production
- In the near future: Requirement for software engineers with legit systems understanding will soon be at an all-time high
  - A line of code is not just a line of code
  - Myriads of things happen under the hood
- Learning to navigate a complex codebase, such as an OS kernel, will make you a fearless software engineer!

# What you need to do to pass the class?

- Programming assignments (groups of three): 50%
  - You'll add features to the Linux kernel and run a VM with it
  - You'll need to understand lots of code; won't need to write much

#### - Prerequisites

- Do you have a laptop that can run a Linux VM?
- Can you make teams of three ASAP?
- Can you use git?
- Exams: 50%
  - Midterm: 20%
  - Final: 30%

# What you <u>really</u> need to do?

- Come to class, ask questions, be curious
- Choose a team you can collaborate well with
- Own your code
  - You know what you must not do...
  - This is not a class you pass by outsourcing your work
  - There is no magic: You do the work ⇒ You pass the class
  - GenAI usage: Use LLMs to help you understand
- It's a difficult course. Hang in there, we will help...

- We'll start from hardware and follow a question-oriented approach

- We'll start from hardware and follow a question-oriented approach
  - Intro [Q: What is an OS?]
  - Events [Q: When does the OS run?]
  - Runtime [Q: How does a program look like in memory?]

\* Basic (H/W & S/W)

- We'll start from hardware and follow a question-oriented approach
  - Intro [Q: What is an OS?]
  - Events [Q: When does the OS run?]
  - Runtime [Q: How does a program look like in memory?]
  - Processes [Q: What is a process?]
  - Threads [Q: What is a thread?]

- Files [Q: What is a file descriptor?]

- \* Basic (H/W & S/W)
- \* Abstractions

- We'll start from hardware and follow a question-oriented approach
  - Intro [Q: What is an OS?]
  - Events [Q: When does the OS run?]
  - Runtime [Q: How does a program look like in memory?]
  - Processes [Q: What is a process?]
  - IPC [Q: How do processes communicate?]
  - Threads [Q: What is a thread?]
  - Synchronization [Q: What goes wrong w/o synchronization?]

- Files [Q: What is a file descriptor?]

- \* Basic (H/W & S/W)
- \* Abstractions
- \* Primitives

- We'll start from hardware and follow a question-oriented approach
  - Intro [Q: What is an OS?]
  - Events [Q: When does the OS run?]
  - Runtime [Q: How does a program look like in memory?]
  - Processes [Q: What is a process?]
  - IPC [Q: How do processes communicate?]
  - Threads [Q: What is a thread?]
  - Synchronization [Q: What goes wrong w/o synchronization?]
  - Time Management [Q: What is scheduling?]
  - Memory Management [Q: What is virtual memory?]
  - Files [Q: What is a file descriptor?]
  - Storage Management [Q: How do we allocate disk space to files?]

- \* Basic (H/W & S/W)
- \* Abstractions
- \* Primitives
- \* Mechanisms









- A layer of abstraction that translates the hardware of a machine into (standardized) software concepts that applications use

- A layer of abstraction that translates the hardware of a machine into (standardized) software concepts that applications use

Why need an O5?

- A layer of abstraction that translates the hardware of a machine into (standardized) software concepts that applications use

Why need an OS?



- A layer of abstraction that translates the hardware of a machine into (standardized) software concepts that applications use

#### Why need an OS?

- Multiple users not necessarily sane, build and run different applications, at the same time on shared hardware



- A layer of abstraction that translates the hardware of a machine into (standardized) software concepts that applications use

#### Why need an O5?

- Multiple users not necessarily sane, build and run different applications, at the same time on shared hardware (design assumptions)



- A layer of abstraction that translates the hardware of a machine into (standardized) software concepts that applications use

#### Why need an O5?

- Multiple users not necessarily sane, build and run different applications, at the same time on shared hardware (design assumptions)



- A layer of abstraction that translates the hardware of a machine into (standardized) software concepts that applications use

#### Why need an OS?

- Multiple users not necessarily sane, build and run different applications, at the same time on shared hardware (design assumptions)

#### Desirable properties?

<u>Security</u>: Does the system behave as expected in adversarial contexts?



- A layer of abstraction that translates the hardware of a machine into (standardized) software concepts that applications use

#### Why need an O5?

- Multiple users not necessarily sane, build and run different applications, at the same time on shared hardware (design assumptions)

- <u>Security</u>: Does the system behave as expected in adversarial contexts?
- **Reliability**: Does the system behave as expected given benign failures?



- A layer of abstraction that translates the hardware of a machine into (standardized) software concepts that applications use

#### Why need an O5?

- Multiple users not necessarily sane, build and run different applications, at the same time on shared hardware (design assumptions)

- <u>Security</u>: Does the system behave as expected in adversarial contexts?
- <u>Reliability</u>: Does the system behave as expected given benign failures?
- <u>Portability</u>: Is it easy for developers to build and maintain apps?



- A layer of abstraction that translates the hardware of a machine into (standardized) software concepts that applications use

#### Why need an O5?

- Multiple users not necessarily sane, build and run different applications, at the same time on shared hardware (design assumptions)

- <u>Security</u>: Does the system behave as expected in adversarial contexts?
- <u>Reliability</u>: Does the system behave as expected given benign failures?
- <u>Portability</u>: Is it easy for developers to build and maintain apps?
- <u>Fairness</u>: Is the system "pleasant" to use?



- Physical Memory (PM)
  - Stores data addressable on a byte granularity

#### Physical memory



#### Processor



#### Physical memory



#### - Physical Memory (PM)

- Stores data addressable on a byte granularity

#### Processor

- Reads data from memory to its registers
- Performs computations
- Writes the data from its registers to memory



#### Physical Memory (PM)

- Stores data addressable on a byte granularity

#### Processor

- Reads data from memory to its registers
- Performs computations
- Writes the data from its registers to memory

#### System bus

- Memory and processor communication channel



#### Physical Memory (PM)

- Stores data addressable on a byte granularity

#### - Processor

- Reads data from memory to its registers
- Performs computations
- Writes the data from its registers to memory

#### System bus

- Memory and processor communication channel
- Is this model satisfactory?
- Reminds you of something?

## Designing an OS kernel

#### Desirable properties

1) Security, 2) Reliability, 3) Portability, 4) Fairness





# Designing an OS kernel

#### Desirable properties

1) Security, 2) Reliability, 3) Portability, 4) Fairness

From desirable properties to design principles





## Designing an OS kernel

## Desirable properties

1) Security, 2) Reliability, 3) Portability, 4) Fairness

### From desirable properties to design principles

 <u>Fault Isolation</u>: Errors in one running program do not affect any other program





## Designing an OS kernel

## Desirable properties

1) Security, 2) Reliability, 3) Portability, 4) Fairness

## From desirable properties to design principles

- <u>Fault Isolation</u>: Errors in one running program do not affect any other program
- <u>Principle of Least Privilege (PoPL)</u>: Any program has the minimum privileges necessary to perform its function



Physical address for program A

## Designing an OS kernel

## Desirable properties

1) Security, 2) Reliability, 3) Portability, 4) Fairness

## From desirable properties to design principles

- <u>Fault Isolation</u>: Errors in one running program do not affect any other program
- Principle of Least Privilege (PoPL): Any program has the minimum privileges necessary to perform its function
- <u>Preemption</u>: The OS is always able to take control
  of the processor, regardless of what programs
  are running



# Implementing fault isolation

- In simple words: load/ store/ jmp instructions of a program cannot read, write, or jump to another program's memory

# Implementing fault isolation

- In simple words: load/store/jmp instructions of a program cannot read, write, or jump to another program's memory



# Implementing dual-mode execution

- In simple words: A trusted portion of the code (the OS) must have full control of the hardware

# Implementing dual-mode execution

- In simple words: A trusted portion of the code (the OS) must have full control of the hardware



# Implementing preemption

- In simple words: The OS should be able to periodically gain control of the processor regardless of what programs are executing

# Implementing preemption

- In simple words: The OS should be able to periodically gain control of the processor regardless of what programs are executing



# Implementing preemption

- In simple words: The OS should be able to periodically gain control of the processor regardless of what programs are executing



# Hardware components leveraged by OS mechanisms





### Nahalem Intel® core i7



1.8 cm

#### Nahalem Intel® core i7



Access times (1 clock cycle ~ 0.5ns)

#### Nahalem Intel® core i7



Access times (1 clock cycle ~ 0.5ns)

- L1 Cache (hit): ~ 5 clock cycles

### Nahalem Intel® core i7



### Access times (1 clock cycle ~ 0.5ns)

- L1 Cache (hit): ~ 5 clock cycles
- L2 Cache (hit): ~10 clock cycles

1.8 cm

#### Nahalem Intel® core i7



### Access times (1 clock cycle ~ 0.5ns)

- L1 Cache (hit): ~ 5 clock cycles
- L2 Cache (hit): ~10 clock cycles
- L3 Cache (hit): ~40-60 clock cycles

1.8 cm

#### Nahalem Intel® core i7



### Access times (1 clock cycle ~ 0.5ns)

- L1 Cache (hit): ~ 5 clock cycles
- L2 Cache (hit): ~10 clock cycles
- L3 Cache (hit): ~40-60 clock cycles
- L3 Cache miss (MM access): ~100 clock cycles