I ended up dropping this course almost immediately because I didn't want to do the work and I hated the flipped-classroom style so much, where you'd watch lectures online and then come into the classroom and just do worksheets in groups. But the content was interesting, so I spent a weekend watching all the lectures, which was cool.


Timesharing (this still happens today): have an interrupt occur at regular intervals, which calls a function in the OS (something like schedule()) that figures out whether the CPU should start executing a different job/program. That decision is made algorithmically by the scheduler.
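
A minimal sketch of that idea, with hypothetical stand-in names (task, pick_next, context_switch are made up here, not real kernel calls):

```c
/* Sketch of timer-driven scheduling. All names are hypothetical. */
struct task { int id; /* plus saved registers, stack pointer, etc. */ };

struct task *current;                     /* task now on the CPU */
extern struct task *pick_next(void);      /* the scheduling policy lives here */
extern void context_switch(struct task *from, struct task *to);

/* Called by the hardware timer at a fixed interval. */
void timer_interrupt(void)
{
    struct task *next = pick_next();      /* e.g. round-robin */
    if (next != current) {
        struct task *prev = current;
        current = next;
        context_switch(prev, next);       /* save prev's registers, load next's */
    }
}
```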

UNIX Architecture: At the very bottom you have the hardware stuff, which is I/O devices, memory, CPU, and on top of that you’ve got process management, memory management, file systems, interprocess communication, all of which is considered the kernel pretty much. Then on top of that you’ve got some middleware for doing all sorts of stuff like maybe video graphics, and you’ve got the window system, user applications, and the command interpreter. You can consider all of that the OS, really. The kernel is the core of the OS.

DOS: (again, the foundation of Windows). Essentially, it wasn’t strictly layered, so applications could actually talk to lower levels like the BIOS directly.

Virtual Machines: you can have a VM emulator which lives on top of the native operating system. It emulates certain hardware, and on top of that hardware you run another operating system that can only run on that emulated hardware. The VM takes the instructions sent to the emulated hardware, figures out what they mean on the device’s actual hardware, and sends them along (I guess through the native OS?).

Java Virtual Machine (JVM): it is essentially a virtual machine itself, which translates the Java bytecode into whatever it needs to be (machine code, I guess) on the actual physical processor you have. You could also have the JVM be in the hardware, where you have a processor that can directly execute whatever Java compiles to.

Darwin (implementation of UNIX): It seems as though you’ve got some OS kernel which I guess interacts with the hardware, and then you’ve got a bunch of applications on top which are all at the same level as user applications. So these applications could be the file server, the window server, the memory server, etc. Essentially everything is a user-level application. Now Mac OS X is based on Darwin, and we can see that there is a user-level process essentially for window management. This is called the “client/server or microkernel” model. The client programs are still communicating through the OS kernel at the bottom, but this OS kernel is actually directing the communication to other user-level processes (like the file server, etc). Looking up Mach suggests that the Mac OS X kernel is not strictly the client/server model…interesting.

Process’ memory state: So when you’re executing a program, you’ll have the following loaded into memory: program code (the actual binary that’s executing), data, and the execution stack. Data includes all of your constants, your static/global variables, and any other data. The execution stack essentially keeps track of where you are in your program (so maybe you’re in a method that called another method, and the stack keeps track of that nesting).
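
You can actually see these regions from a plain C program by printing a few addresses. The exact layout varies by OS and architecture; this is just a demo:

```c
#include <stdio.h>
#include <stdlib.h>

int global_initialized = 42;          /* data segment */
static int global_zeroed;             /* bss (zero-initialized data) */

int main(void)
{
    int local = 0;                    /* lives on the execution stack */
    int *heap = malloc(sizeof *heap); /* dynamically allocated data */

    printf("code:  %p\n", (void *)main);
    printf("data:  %p\n", (void *)&global_initialized);
    printf("bss:   %p\n", (void *)&global_zeroed);
    printf("heap:  %p\n", (void *)heap);
    printf("stack: %p\n", (void *)&local);
    free(heap);
    return 0;
}
```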

exec in Linux: In your code (assuming C), you can have the line “exec(foo)” (really one of the exec family, like execlp or execvp), which will find the file named foo and then execute it within your process. The contents of foo then become the code that is run: the binary image in memory is replaced by the code for foo. So importantly, in Linux, say you’re running a shell and you want to use the command ls. What happens is the shell makes a copy of itself via fork(), and then this child copy would exec(ls), where ls is really just an executable file on the disk somewhere. The parent shell then waits until ls is done; there’s no explicit passing back of output, since the child inherits the shell’s file descriptors and ls writes straight to the terminal.
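
A minimal runnable version of that shell behavior, using execlp from the real exec family (ls -l is just an example command):

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    pid_t pid = fork();                /* the "shell" clones itself */
    if (pid == 0) {
        /* child: replace this process image with ls */
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("execlp");              /* only reached if exec fails */
        return 1;
    }
    int status;
    waitpid(pid, &status, 0);          /* parent waits for ls to finish */
    printf("ls exited with status %d\n", WEXITSTATUS(status));
    return 0;
}
```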

User login in Linux: there’s a master process called “init”. When booting, init is run as the first process. For each terminal line (say there are n of them), init forks a copy of itself, and each copy execs getty, which prints the login prompt and then reads the login and password. Once you’ve typed in your login info, getty execs “login”, which verifies it. After you’re verified, login execs the shell, e.g. “csh”. Which shell gets execed is part of your user configuration (and there’s a .login file in your home directory that the shell reads at login).

Thread: A thread can be thought of as a “thread of control within a process”. Threads share the same memory state; each thread just has its own stack. By creating a thread you’re making a new processor state (registers, program counter), which means essentially all the memory is shared, but you have two execution streams running independently, so the values in their registers may differ. What we typically think of as a multithreaded process is what Mac OS X (Mach) people may refer to as a “task”. One good thing about threads is that they allow multiprogramming to occur within a process, that is, while one thread is waiting for I/O, another thread can continue executing. Essentially, you don’t need to have a multicore processor in order for threads to be useful.
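
A small pthreads demo of exactly that: the global is one shared copy, while each thread's local variable lives on its own stack (compile with -pthread):

```c
#include <stdio.h>
#include <pthread.h>

int shared = 0;                              /* one copy, seen by all threads */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg)
{
    int local = *(int *)arg;                 /* lives on this thread's own stack */
    pthread_mutex_lock(&lock);
    shared += local;                         /* updates the single shared copy */
    pthread_mutex_unlock(&lock);
    printf("local at %p, shared at %p\n", (void *)&local, (void *)&shared);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    int a = 1, b = 2;
    pthread_create(&t1, NULL, worker, &a);
    pthread_create(&t2, NULL, worker, &b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared = %d\n", shared);         /* 3: both threads touched it */
    return 0;
}
```

Each run prints a different address for local in each thread but the same address for shared, which is the whole point: separate stacks, one memory.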

Protection of OS operations: So when you call OS operations, what’s actually happening is that you call a little stub for that code (say it’s read()), and that stub generates a software interrupt (a trap), which switches the processor into a privileged mode where it’s only executing kernel code, from memory the user has no access to. That way, the user can’t mess around with any of the lower-level stuff.
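
On Linux you can even bypass the libc stub and issue the trap yourself via syscall(2), which shows how thin the stub really is (Linux-specific; SYS_write is the system-call number):

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    /* The libc write() stub ultimately does exactly this: put the
     * system-call number and arguments in registers and trap into
     * the kernel, which then runs in privileged mode. */
    syscall(SYS_write, 1, "hello via trap\n", 15);
    return 0;
}
```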

Memory Management: There are physical addresses for all the locations in RAM. This is sort of already known, but a program’s view of its address space may not match the physical addresses: it may run from 0 to maxForProgram, while its zero location is actually somewhere in the middle of the physical addresses in RAM.

Compilation: People mainly refer to the whole thing as compiling, but really there’s more. The first step is called compiling, and that’s turning your code into assembly. The next step is assembling, which turns the assembly code into machine code; at this stage, symbolic labels and jump targets are replaced with relative memory addresses, and constants (e.g. #defines) get hard-coded. Then you get to the linking stage, which takes all the library routines you called and actually puts them into the code (in the case of static linking), shifting any memory addresses in the code by the appropriate amount. Lastly there’s the loading step, which loads the program into physical memory; when it does that, it needs to again adjust the addresses so that they correspond to the actual physical addresses (because say your program was loaded starting at the 500th byte of RAM, then you need to add 500 to each memory address for it to work in physical memory).

Dynamic Linking: This is contrasted with static linking, where all the libraries are actually copied right into the program code at link time. Dynamic linking is essentially a “use as you go” method, where libraries are linked and loaded into memory at run-time. You do need to do some address fix-ups during runtime as a result, but it requires much less space. Apparently with dynamic linking you don’t actually have to load the entire library, only what you referenced. The run-time part essentially works by having the program use logical (relative) addresses; before an instruction executes, the CPU first checks that the program isn’t pointing to something out of its memory bound, and then adds a register called the “base”, which holds the actual physical address where the program resides.
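
A toy version of that base-and-bound check, with made-up numbers (real hardware does this on every memory reference):

```c
#include <stdio.h>

unsigned base  = 50000;   /* where the process actually sits in RAM */
unsigned limit = 12000;   /* size of its logical address space */

/* What the hardware conceptually does per access. */
int translate(unsigned logical, unsigned *physical)
{
    if (logical >= limit)             /* out of bounds: hardware traps */
        return -1;                    /* OS would deliver e.g. a fault */
    *physical = base + logical;       /* relocation happens per access */
    return 0;
}

int main(void)
{
    unsigned p;
    if (translate(3000, &p) == 0)
        printf("logical 3000 -> physical %u\n", p);   /* 53000 */
    if (translate(20000, &p) != 0)
        printf("logical 20000 -> protection fault\n");
    return 0;
}
```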

Memory Fragmentation: so if you’re splitting up your memory space into partitions (of equal or variable size) and you assign programs to those partitions based on how much memory they say they need, then when a program isn’t using its full partition, that leftover memory can’t be used for anything else. That wasted space is fragmented memory.

Swapping: Essentially just taking a program out of main memory and writing it to disk because it’s not being used. This is to free up RAM.

Overlays: this is a way you can run programs without loading them entirely into memory at one time (so this is sort of like dynamic loading). The loader understands which functions call which other functions, so based on the current stack it can sort of figure out which functions it will need to load into RAM, based on what could possibly get called. This is interesting. It may explain to some extent why running things on your phone can be slower: you may have to move things between disk and RAM more often to be able to run large programs.

Virtual Memory: seems like this is reliant on paging, which is splitting up physical memory into really small memory chunks (called page frames) which, again, are really small, like 2048 bytes. An address within a frame is then the frame’s start plus an offset in bytes. Essentially, all this allows you to do is not have to store things in RAM contiguously, so you can split a given program up based on what’s free in physical memory. This is done via some hardware (the MMU, the memory management unit), which creates this mapping. The MMU uses what’s called a page table: the page-number part of the virtual address is the index into the page table, and the value at that entry is the physical frame. So say you’re executing a program: not all of the program is going to be loaded into memory at a time, only really what needs to be executing. And the same is true for the stack and the data of your program too; they can be stored in some file on disk that you don’t see. The OS manages what’s actually put in memory and what’s kept on disk.
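
A toy page-table walk in C, with made-up table contents and the 2048-byte pages from above (the valid bit comes up again under page faults below):

```c
#include <stdio.h>

#define PAGE_SIZE 2048                 /* matches the lecture's example */
#define NPAGES    8

/* Toy page table: entry i maps virtual page i to a physical frame.
 * valid == 0 means the page isn't in RAM (touching it page-faults). */
struct pte { unsigned frame; int valid; };

struct pte page_table[NPAGES] = {
    {5, 1}, {0, 0}, {9, 1}, {2, 1},    /* pages 0,2,3 resident; 1 is not */
};

int translate(unsigned vaddr, unsigned *paddr)
{
    unsigned page   = vaddr / PAGE_SIZE;   /* which page table entry */
    unsigned offset = vaddr % PAGE_SIZE;   /* position within the page */
    if (page >= NPAGES || !page_table[page].valid)
        return -1;                         /* page fault */
    *paddr = page_table[page].frame * PAGE_SIZE + offset;
    return 0;
}

int main(void)
{
    unsigned p;
    if (translate(4100, &p) == 0)          /* page 2, offset 4 */
        printf("vaddr 4100 -> paddr %u\n", p);   /* 9*2048 + 4 = 18436 */
    if (translate(2100, &p) != 0)          /* page 1 isn't resident */
        printf("vaddr 2100 -> page fault\n");
    return 0;
}
```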

Paging: strictly, the scheme of splitting memory into fixed-size pages; taking a page from disk and putting it in main memory only when it’s needed is called demand paging.

Page fault: your program sort of sees that it has a huge virtual address space it can fill, and your OS deals with what to store/not store in actual RAM out of all the programs’ virtual memory. So, essentially, not everything is actually stored in RAM (as mentioned above); for instance, parts of your program’s code/data/stack segments may not be in RAM. When you access a virtual memory address, it goes through the MMU, and at the MMU stage (each page table entry has a bit for whether the page is actually in RAM or not) it may turn out the entry isn’t valid. That means you essentially had a “miss” in the cache, and we call that a page fault. Page faults have to be really, really rare because they take a long, long time.
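
Roughly what the OS does to service one, sketched with hypothetical helper names (nothing here is a real kernel API):

```c
struct pte { unsigned frame; int valid; };

extern struct pte *lookup_pte(unsigned vaddr);
extern unsigned find_free_frame(void);          /* may evict another page */
extern void read_page_from_disk(unsigned vaddr, unsigned frame);

void handle_page_fault(unsigned faulting_vaddr)
{
    struct pte *e = lookup_pte(faulting_vaddr);
    unsigned frame = find_free_frame();          /* slow: may first write a
                                                    dirty page out to disk */
    read_page_from_disk(faulting_vaddr, frame);  /* slower still: disk I/O */
    e->frame = frame;
    e->valid = 1;                                /* mark the page resident... */
    /* ...then the faulting instruction is restarted, and this time the
       MMU translation succeeds. */
}
```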

Thrashing: This has to do with load control, which is essentially deciding how many programs share the CPU/main memory. In general, when an OS notices that the CPU is idle, it’ll often give it more work to do (more processes, for instance). Now assume memory is nearly full (happens all the time) and a few page faults occur; say three programs are sharing the CPU, but each of them tried to read something in virtual memory that wasn’t actually in main memory, so they all halt and wait for the memory they need to be loaded from disk. The CPU is now pretty idle, so the OS gives it more processes, but those page faults happened in the first place because not enough main memory had been allocated, so the new processes page fault too. This continues until you have really crappy CPU utilization, your RAM is all filled up, and you’re reading everything off of disk, so your computer is slow. That’s called thrashing.

Disk: the unit of read/write on a disk is called a “block” or a “sector”, and these are typically 512 bytes in size. Much of what was discussed was about HDDs, and it’s all pretty obvious: there are multiple platters stacked on top of each other and a head for each platter, and the heads move in and out while the disk spins. It’s obviously slower to move the head, so you want related files located at one radius of the disk so that you don’t have to move the head between reading them. Each block also has a CRC (cyclic redundancy check), which indicates whether the block is still good or has failed. On read and write, you compare against this check in the block to see if what you read/wrote was correct/was stored correctly. When a disk is formatted, it sets aside spare blocks hidden from the OS, so that when a block goes bad, anything that’s supposed to read/write from the bad block gets remapped to one of the spares. This is all completely transparent to the OS; the OS has no idea that blocks failed.

Device Directory: this is what keeps track of the names, locations, lengths, owners, etc. of all the files on the disk. It also stores the free block list. It has operations such as search (find a file), create a file, delete a file, list the contents of a directory, etc. It lives at a fixed location because, when booting, the system needs to look at a specific spot to figure out where the files are. Now, on the hard drive, remember blocks are 512 bytes in size, and essentially every directory has its own block dedicated to it; in that block you’ve got pointers to the actual disk locations of the files, and pointers to subdirectories, which themselves have a block dedicated to them. The device directory also has what’s called a free list, which shows which blocks currently have files stored in them and which are free. You could have each free block store a pointer to the next free block, which allows for not-too-difficult writing. Sometimes blocks fail, and if a free block fails, that means you have to repair the linked list to recover the free list.
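
A toy version of that linked free list, with the "pointer to the next free block" kept in an array for simplicity (block numbers are made up):

```c
#include <stdio.h>

#define NBLOCKS 8
#define END 0xFFFFFFFFu     /* marks the end of the free list */

/* Each free block's entry holds the number of the next free block. */
unsigned next_free[NBLOCKS] = {3, END, END, 6, END, END, END, END};
unsigned free_head = 0;     /* free blocks: 0 -> 3 -> 6 */

/* Pop a block off the free list to store new file data. */
int alloc_block(unsigned *out)
{
    if (free_head == END)
        return -1;                      /* disk full */
    *out = free_head;
    free_head = next_free[free_head];   /* follow the pointer */
    return 0;
}

int main(void)
{
    unsigned b;
    while (alloc_block(&b) == 0)
        printf("allocated block %u\n", b);   /* 0, 3, 6 */
    return 0;
}
```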

File Allocation Methods: If you’re doing a contiguous allocation method, where every file is stored in contiguous blocks (the device directory points to just the starting block of each file): because you want to keep a file contiguous and things are often appended to the end of files, when allocating for a file you leave a little empty space at the end so that the file can expand into it. You can also do linked allocation, which is when a file is stored in blocks and each block has a pointer to the next part of the file; in this instance, the device directory points to the start and end of each file. In the indexed allocation method, you have one index block per file, which points to the blocks of that file, and the device directory points to that index block. Unix uses the index block implementation for small files; for larger files it adds levels of indirection, with index blocks that point to other index blocks, which point to the data blocks. Also note that the OS can treat the data blocks as whatever size it likes, so instead of 512 bytes it could pretend each block is 5K; the hardware will still implement that as 512-byte blocks, but all the reading and writing will occur assuming 5K data blocks, I suppose.
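
A toy version of indexed allocation with made-up block numbers; the index ("inode") block is what lets a file's bytes live in scattered disk blocks:

```c
#include <stdio.h>

#define BLOCK_SIZE 512
#define NDIRECT    4        /* toy index block: 4 direct pointers */

/* The index block lists the data blocks of a file in order. */
struct inode { unsigned direct[NDIRECT]; unsigned size; };

struct inode file = { {17, 3, 91, 40}, 1800 };  /* blocks scattered on disk */

/* Which disk block holds a given byte offset of the file? */
int block_for_offset(const struct inode *ino, unsigned off, unsigned *blk)
{
    if (off >= ino->size || off / BLOCK_SIZE >= NDIRECT)
        return -1;
    *blk = ino->direct[off / BLOCK_SIZE];
    return 0;
}

int main(void)
{
    unsigned blk;
    if (block_for_offset(&file, 1025, &blk) == 0)
        printf("byte 1025 lives in disk block %u\n", blk);  /* 91 */
    return 0;
}
```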

RAID: So RAID (redundant array of inexpensive disks) is actually faster, because what you do is split each write into, say, 4 parts, and then write each of those parts simultaneously to four different disks. But now the redundancy comes in, because having 4 disks means you have roughly 4 times the chance of something going bad. A simple thing to do would be to have a mirrored disk, which is just a backup (you can even use it to do simultaneous reads, which makes reads even faster). But that’s not how RAID-3 does it (there are multiple levels of RAID). RAID-3 essentially uses “parity striping”: say it needs to store 1000 (binary), then it’ll store the first bit on one disk, the second bit on the second disk, etc. Then it has a “parity disk”, which stores the sum mod 2 of what’s stored across the disks. So (1+0+0+0) mod 2 = 1, so a 1 would be stored. So realize now that if one of the disks fails, you can reconstruct the value that was lost from the parity disk and the surviving disks.
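
The parity trick is just XOR (addition mod 2); a tiny demo of losing a disk and rebuilding it:

```c
#include <stdio.h>

int main(void)
{
    /* One bit per disk, storing 1000 (binary) striped across 4 disks. */
    unsigned char disk[4] = {1, 0, 0, 0};
    unsigned char parity = disk[0] ^ disk[1] ^ disk[2] ^ disk[3];  /* = 1 */

    /* Disk 0 dies. Rebuild its bit from the survivors plus parity. */
    unsigned char rebuilt = parity ^ disk[1] ^ disk[2] ^ disk[3];
    printf("parity = %u, rebuilt disk 0 = %u\n", parity, rebuilt); /* 1, 1 */
    return 0;
}
```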

Deadlock: When two or more processes have halted execution because they’re waiting for each other to give up resources, which essentially means there’s no way for any of these processes to continue. The way you avoid this: each time a process requests more resources, before granting them you check the worst case, which is every process requesting its estimated worst-case resources; would you still be able to grant those requests in some order that doesn’t cause deadlock? Linux to some extent detects whether deadlock is occurring.
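
What's described here is essentially the Banker's algorithm safety check; a toy single-resource version with made-up numbers:

```c
#include <stdio.h>

/* Grant a request only if, in the worst case (every process demands
 * its stated maximum), the processes can still finish in SOME order. */
#define N 3

int alloc[N] = {1, 2, 2};   /* units each process holds right now */
int max[N]   = {4, 6, 3};   /* worst-case total each process may need */
int avail    = 2;           /* free units left */

int is_safe(void)
{
    int work = avail, done[N] = {0};
    for (int pass = 0; pass < N; pass++)          /* at most N rounds */
        for (int i = 0; i < N; i++)
            if (!done[i] && max[i] - alloc[i] <= work) {
                work += alloc[i];                 /* i finishes, frees all */
                done[i] = 1;
            }
    for (int i = 0; i < N; i++)
        if (!done[i]) return 0;                   /* someone may deadlock */
    return 1;
}

int main(void)
{
    printf("state is %s\n", is_safe() ? "safe" : "unsafe");
    return 0;
}
```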