cs357 Lecture Notes Spring 2000 Week 5, Thursday For today you should have worked on homework 3. You should have finished and written up the first experiment (many concurrent children), and designed the second experiment (measuring the OS overhead of creating a process). You should have printed and read the documentation for fork, wait, times, gettimeofday, and any other system calls you think you will need, and you should have written short programs that exercise each of these calls in isolation. For next time you should finish homework 3 and finish reading chapter 4. Next week we will start talking about threads! Topics for today ---------------- 1) homework 2 2) processes continued Homework 2 comments ------------------- In general, the hw2 writeups were not very good, but that is not entirely unexpected. Here are some suggestions for making things better: 1) Do not hide data in the appendix. Unexplained data is useless. First, choose the design of your graphs carefully to make the point you want to make as clearly and simply as possible. Then, integrate those figures into the text, either by inlining them or by referring to them by number in the text. Finally, add captions to figures explaining what they are meant to show. 2) Make the logical flow of the argument more explicit: Good: a) we are trying to find the size of the cache b) as the size of the array increases, we expect a discontinuity in the access times, because the performance will get drastically worse as soon as the array is bigger than the cache c) as we expected, there is a discontinuity in the graph (see the arrow in figure whatever). d) we conclude that the L1 cache is 32K (the largest array size that still yields good performance) Bad: "We can tell by looking at the data that the cache is 32K." 3) Write as if you are addressing an audience that does not know what experiments you are performing. You can assume they have taken cs357, but that's all you should assume. So, let's make the next batch better! Processes, continued... Context Switch Overhead ----------------------- Minimum 100 cycles = 200 ns = 0.2 us Typically more like 10-100 us, although I am not sure where all the extra time is going. Let's take the worst case... switch time = 0.1 ms Typical quantum sizes are 10 ms, so the cost of the overhead is only one part in 100. With ten ready processes competing for CPU, the time between runs for each process is 100ms = 0.1 s, which is still below human-perceived time. One little note: a hidden cost of context switching is that the process that gets switched out loses some of the info that had accumulated in cache. By the time it runs again, its data may be gone and it will be greeted with a hail of cache misses. Time ---- The cache program uses the times system call to find out how much time has been allocated to the process: user time: time that has been spent running user code system time: time that has been spent running system code on behalf of this process wall time: time that has elapsed in the real world since this process began You can get user time and system time from times. You can get wall clock time from gettimeofday. Both involve slightly ugly interfaces with built-in structures. Ideally: wall time = user time + system time + OS overhead (like context switch time) + time spent running other processes But keep in mind that the system is estimating some of these things, so it doesn't always work out. The shell provides a command called time that runs other programs and then reports their usage: /rocky/downey % time rdist ...output omitted... 4.370u 4.430s 1:47.51 8.1% 0+0k 0+0io 1370pf+0w I will give a prize to anyone who can find the documentation for this command (explaining the output). I will send out the answer tomorrow. Homework 3: how can we measure process creation time accurately? Fork and exec ------------- In UNIX, the API for creating a new process is fork... NAME fork, vfork - create a child process SYNOPSIS #include pid_t fork(void); pid_t vfork(void); DESCRIPTION fork creates a child process that differs from the parent process only in its PID and PPID, and in the fact that resource utilizations are set to 0. File locks and pend- ing signals are not inherited. Under Linux, fork is implemented using copy-on-write pages, so the only penalty incurred by fork is the time and memory required to duplicate the parent's page tables, and to create a unique task structure for the child. RETURN VALUE On success, the PID of the child process is returned in the parent's thread of execution, and a 0 is returned in the child's thread of execution. On failure, a -1 will be returned in the parent's context, no child process will be created, and errno will be set appropriately. ERRORS EAGAIN fork cannot allocate sufficient memory to copy the parent's page tables and allocate a task structure for the child. ENOMEM fork failed to allocate the necessary kernel struc- tures because memory is tight. Implementation -------------- What do we guess the implementation looks like? 1) copy the PCB (but not pending signals or file locks) 2) copy the memory of the parent process actually, just the page tables. we won't copy the pages themselves until one of the processes does a write (implicitly, we have a shared text segment) 3) we have to put different values into the "stack" of the two processes (return value may be in registers) 4) both processes are now ready to run -- which one gets scheduled first? Parent and child execute concurrently. Eventually parent may wait for child to complete. (read the documentation of wait for next time) Inter-process communication --------------------------- parent to child: everything in address space open files, etc. OS->parent \ who am I (parent or child?) OS->child / getpid, getppid child->parent child pid (returned by fork) return value (wait) What kind of information can we get from wait? How can we get richer communication between parent and child? Interprocess communication -------------------------- Two fundamental mechanisms: 1) Explicit message-passing, like sockets. 2) Shared state a) Parent and child share a file system. One can read what the other rights. b) A generalization of that idea is that two processes can share some part of their process state. Such processes are called threads! Look at the definition of process state and see what part can be shared between processes/threads and which parts have to be "private" (i.e. every thread gets its own version). Next time: threads! Homework 3 hints ---------------- Include files: These header files contain definitions of data types like pid_t and struct tms. The man page usually tells you which headers you need in order to use a given system call. Usually the order in which you include the header files doesn't matter. If you want to see the header files, they are in /usr/include and its subdirectories. For example, the command #include includes the file /usr/include/sys/wait.h which contains the definitions of macros like WEXITSTATUS(status) that are used to interpret the bits of the status word. Passing arguments by "reference": C usually passes arguments by value (like Java) which means that the callee cannot modify the value of the caller's variable. It also provides a mechanism that is similar to call by reference. You can pass the address of a variable, and the callee can use it to modify the variable. The & operator in C takes the address of a variable, so &status is an expression that means "the address of the variable named status." The type of this expression is int*, or "pointer to integer." That's why the prototype for wait is pid_t wait (int *status) It takes an int* as an argument (and modifies it). It also returns a pid_t. This is a cheap mechanism for returning two values from a procedure. The suffix _t in C usually means "type", so pid_t just means "the type that contains pids". The interface for times and gettimeofday is similar. You have to give them the address of a structure, and it will fill in the fields of the structure. This is very similar to the mechanism in Java where objects (stuctures) are passed by reference. In fact, this is how Java implements call-by-reference. I am counting on you to pick up the C you need as we go along, but I am happy to help. If you see something you don't understand 1) print the relevant man pages and read them 2) find the relevant sections of the cow book and read them 3) if that fails, ask a question in class, or out of class, depending on your estimate of the likelihood that someone else will find it useful.