cs357 Lecture Notes
Spring 2000
Week 5, Tuesday

For today you should have read Chapter 4,
Sections 4.1 through 4.4.

And you should have picked up Homework 3, compiled
and run the code, looked up documentation for the
tools you will need, and thought about the experiments
you are going to perform.

Topics for today
----------------

1) processes

2) homework 3


Chapter 4 intro
---------------

Processes, threads, interprocess communication.

Processes
---------

What's the API?

What does the OS implementation look like?

What does the HW interface look like (implementation of
     context switching)?


Threads
-------

What's the difference between a thread and a process?

Again, what are the interfaces (API, HW) and implementation issues?


Interprocess communication
--------------------------

Implicit -- through shared memory
	    I write, you read, we've communicated

Explicit -- we create a socket and pass messages

	 (underlying mechanism might be the OS, passing notes
	 between processes on the same machine, or a network,
	 passing between processes on different machines).


Processes
---------

A process is a program in execution.

Processes are created by the OS in order to execute programs.

The process consists of all the state associated with
a running program:

In memory:

   loaded program text
   static data segment
   execution stack (parameters, local variables, temporary variables)
   heap (dynamically allocated structures)

On chip:

   program counter (where in the flow of execution are we?)
   contents of registers (primarily GP registers, but some others,
                          depending on the architecture)

In OS memory space:

   Program Control Block
   status info
   copy of hardware state (while process is not running)
   info about this process' memory (base and limit, or page table)
   list of open files, file locks, pending signals, etc.
   accounting info (how long has this process run -- see ps man page)


Process status
--------------

Most processes are in one of the following states:

1) ready: the process can run as soon as the scheduler allow it

2) running: the process is currently executing (its state is
   in hardware and the CPU is executing instructions from its
   text segment)

3) waiting/blocked: the process has made a synchronous I/O
   request that is not complete.  This state is similar to
   ready, except that the job should not be scheduled to
   execute.

   (what would happen if it were)


Warning: distinguish between a process'

	 status: is it running, ready or waiting
and
         state: the sum total of all the information
	        pertaining to this process (including the status)


Transitions
-----------

How do you get from:

ready to running?

running to waiting?

running to ready?

waiting to ready?


Context switch
--------------

Switching out means moving the process from running to
ready or waiting.

1) interrupt occurs (which toggles the mode bit)

   interrupt mechanism cleans up some of the state that
   is scattered around the disk, bringing the process to
   a state that can be resumed

   flow of execution enters the OS

2) OS copies the hardware state into the PCB

   registers

   how do we get the program counter?

3) OS updates the PCB

   change the status

   increment the accounting stats


Switching in means moving a process from ready to running

1) copy the info from the PCB onto the hardware  (how?)

2) jump to the location recorded in the PC

   (don't forget to toggle the mode bit!!!)


Context Switch Overhead
-----------------------

Minimum 100 cycles = 200 ns = 0.2 us

Typically more like 10-100 us, although I am not sure
	  where all the extra time is going.

Let's take the worst case... switch time = 0.1 ms

Typical quantum sizes are 10 ms, so the cost of the
overhead is only one part in 100.

With ten ready processes competing for CPU, the time
between runs for each process is 100ms = 0.1 s, which
is still below human-perceived time.


One little note: a hidden cost of context switching is that
the process that gets switched out loses some of the info
that had accumulated in cache.

By the time it runs again, its data may be gone and it
will be greeted with a hail of cache misses.


Time
----

The cache program uses the times system call to find out
how much time has been allocated to the process:

user time: time that has been spent running user code

system time: time that has been spent running system code
             on behalf of this process

wall time: time that has elapsed in the real world since
           this process began

You can get user time and system time from times.
You can get wall clock time from gettimeofday.

Both involve slightly ugly interfaces with built-in structures.


Ideally:

wall time = user time + system time
          + OS overhead (like context switch time)
          + time spent running other processes

But keep in mind that the system is estimating some of these
things, so it doesn't always work out.

The shell provides a command called time that runs other
programs and then reports their usage:

/rocky/downey % time rdist

...output omitted...

4.370u 4.430s 1:47.51 8.1%      0+0k 0+0io 1370pf+0w

I will give a prize to anyone who can find the documentation
for this command (explaining the output).  I will send out
the answer tomorrow.

Homework 3: how can we measure process creation time accurately?


Fork and exec
-------------

In UNIX, the API for creating a new process is fork...

NAME
       fork, vfork - create a child process

SYNOPSIS
       #include <unistd.h>

       pid_t fork(void);
       pid_t vfork(void);

DESCRIPTION
       fork  creates a child process that differs from the parent
       process only in its PID and PPID, and  in  the  fact  that
       resource  utilizations are set to 0.  File locks and pend-
       ing signals are not inherited.

       Under  Linux,  fork  is  implemented  using  copy-on-write
       pages,  so  the  only penalty incurred by fork is the time
       and memory required to duplicate the parent's page tables,
       and to create a unique task structure for the child.

RETURN VALUE
       On  success,  the  PID of the child process is returned in
       the parent's thread of execution, and a 0 is  returned  in
       the child's thread of execution.  On failure, a -1 will be
       returned in the parent's context, no child process will be
       created, and errno will be set appropriately.

ERRORS
       EAGAIN fork  cannot allocate sufficient memory to copy the
              parent's page tables and allocate a task  structure
              for the child.

       ENOMEM fork failed to allocate the necessary kernel struc-
              tures because memory is tight.


Implementation
--------------

What do we guess the implementation looks like?

1) copy the PCB (but not pending signals or file locks)

2) copy the memory of the parent process

   actually, just the page tables.

   we won't copy the pages themselves until one of the
   processes does a write

	     (implicitly, we have a shared text segment)

3) we have to put different values into the "stack" of
   the two processes (return value may be in registers)

4) both processes are now ready to run -- which one gets
   scheduled first?


Parent and child execute concurrently.

Eventually parent may wait for child to complete.

   (read the documentation of wait for next time)


Inter-process communication
---------------------------

parent to child:   everything in address space

		   open files, etc.

OS->parent \        who am I (parent or child?)
OS->child  /        getpid, getppid

child->parent       child pid (returned by fork)
		    return value (wait)

What kind of information can we get from wait?

How can we get richer communication between parent and child?

1) use sockets

2) make parent and child threads in the same address space
   rather than processes with their own address spaces


Next time: threads!