Software Systems
Spring 2005

For today, you should have:

1) read the Lottery Scheduling paper and answered the reading questions
2) worked on Homework 3

Outline:

1) lottery scheduling paper
2) final comments on scheduling
3) workload characterization
4) performance evaluation overview
5) the project

For next time you should:

1) read Tanenbaum pages 202-219 (notice that there are gaps)
2) prepare for a quiz on CPU scheduling
3) think about projects: proposal due March 4!


Workload characterization
-------------------------

    wget wb/ss/code/lifetimes.tgz
    tar -xzf lifetimes.tgz
    cd lifetimes
    ls

    ctc.lifetimes         # data from the SP2 at the Cornell Theory Center
    t3e.batch.lifetimes   # data from the T3E at SDSC
    t3e.inter.lifetimes
    ucb.lifetimes         # data from workstations at Berkeley
    process               # a script for processing the data

    emacs process:

    # replace this with the path to your copy of cdf
    cdf=/home/downey/bin/cdf

    for file in *.lifetimes
    do
        echo $file
        $cdf -t logx $file > $file.cdf
        $cdf -t loglog $file > $file.ccdf
    done

Edit the script so it knows where you put cdf, and then run it:

    bash process

Let's start by looking at the data from UCB:

    wc ucb.lifetimes
    xgraph ucb.lifetimes.cdf

1) The x axis is log2(process lifetime in seconds).
2) Left of 0 is less than 1 second.
3) Resolution is coarse for short jobs.

What is the median lifetime?

What percentage of processes run longer than 1 second?

    xgraph ucb.lifetimes.ccdf

1) The vertical axis is now log2(Prob {lifetime > x}), that is,
   log2(1 - CDF(x)).
2) This view of the data makes the tail more visible.
3) A straight line on this graph is evidence of a Pareto distribution.

What percentage of processes run longer than 2^5 seconds?

Now let's look at the data from CTC:

    xgraph ctc.lifetimes.cdf
    xgraph ctc.lifetimes.ccdf

What do we make of this?

We can also look at all the cdfs:

    xgraph *.cdf
    xgraph *.ccdf

What conclusions can we draw from these datasets?

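
For checking answers by hand, here is a minimal Python sketch of what the two views compute. It is not the cdf tool and makes no claim about how cdf works internally. The function names are made up for illustration, and the sample is synthetic Pareto data standing in for a lifetimes file; reading the real files is left out because their format is not described in these notes.

```python
import math
import random


def median(lifetimes):
    """Middle value of the sample (mean of the two middle values if even)."""
    xs = sorted(lifetimes)
    mid = len(xs) // 2
    return xs[mid] if len(xs) % 2 else (xs[mid - 1] + xs[mid]) / 2


def fraction_longer_than(lifetimes, threshold):
    """Fraction of processes whose lifetime exceeds threshold seconds."""
    return sum(1 for t in lifetimes if t > threshold) / len(lifetimes)


def empirical_ccdf(lifetimes):
    """Sorted (x, Prob{lifetime > x}) pairs, one per distinct value."""
    xs = sorted(lifetimes)
    n = len(xs)
    points = []
    for i, x in enumerate(xs):
        # Keep only the last copy of a repeated value, so that i + 1
        # counts every observation less than or equal to x.
        if i + 1 < n and xs[i + 1] == x:
            continue
        points.append((x, 1.0 - (i + 1) / n))
    return points


def loglog_slope(points):
    """Least-squares slope of log2(ccdf) against log2(x).

    Points where either logarithm is undefined (x <= 0 or ccdf == 0)
    are skipped.
    """
    pairs = [(math.log2(x), math.log2(p)) for x, p in points if x > 0 and p > 0]
    n = len(pairs)
    mean_u = sum(u for u, _ in pairs) / n
    mean_v = sum(v for _, v in pairs) / n
    sxx = sum((u - mean_u) ** 2 for u, _ in pairs)
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in pairs)
    return sxy / sxx


# Synthetic stand-in for a lifetimes file (an assumption, not real data):
# Pareto-distributed values with shape parameter 1 and minimum 1.
rng = random.Random(17)
sample = [rng.paretovariate(1.0) for _ in range(10000)]

print("median:", median(sample))
print("fraction > 1 second:", fraction_longer_than(sample, 1.0))
print("fraction > 2^5 seconds:", fraction_longer_than(sample, 2 ** 5))
print("log-log slope of ccdf:", loglog_slope(empirical_ccdf(sample)))
```

For a Pareto sample the log-log ccdf is roughly a straight line whose slope is minus the shape parameter, which is why a straight line in the xgraph plot is taken as evidence of a Pareto distribution. A plain least-squares fit is only a rough check, since the few points in the far tail are noisy.
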

Performance Evaluation
----------------------

As a running example, we'll look at:

    Exploiting Process Lifetime Distributions for Dynamic Load Balancing
    Harchol-Balter and Downey, ACM Transactions on Computer Systems, 1997.

Measurement

1) what do you want to know?
2) what environments do you care about?
3) what do you have access to?
4) what instruments do you have?

Workload characterization

1) what generalizations can we make from measurements?
2) what simplifications can we make?

Modeling

1) what features of the system do we need to include?
2) what can we leave out?
3) what kind of analysis/simulation are we planning?

Analysis

1) what can we derive/prove about the system?
2) how do our assumptions affect these results?

Simulation

1) if we relax the assumptions needed for analysis, do the
   results change?
2) are there additional results we can get?

Implementation

1) what do our results tell us about real systems?
2) if we have developed new algorithms, can we implement them,
   or at least an approximation?

Validation

1) does the implementation behave as we expect?
2) how does the actual workload affect performance?
3) what metrics are important?


Project ideas
-------------

Strategies for finding a project

1) If there is a part of an operating system, network or database
   that interests you, start with textbooks and follow their
   pointers to additional reading.

2) If you read something in the press, or someplace like slashdot,
   and you want to look deeper, I can give you some pointers.

   Example: http://www.theregister.co.uk/2005/02/18/intel_tcpip_attack/

3) If you have an idea for a hardware/software product, we can
   try to build and evaluate an implementation.

4) If you would like to start in on publishable work in this area,
   I can help you find the research frontier.

5) If there is a company that is doing something interesting,
   maybe we can make a deal!

Some ideas I have kicked around:

1) Perform a measurement study of some part of the Olin computing
   infrastructure.

       wireless network workload / performance
       laptop / hard drive reliability -- data loss modeling

2) Design and implement a software system for local use.

       a backup/restore system for your laptops
       general interaction between laptops and other systems
       load sharing, storage sharing, distributed processing

3) Implement some OS feature on a microcontroller.

       process abstraction (timesharing, virtual memory)
       real time scheduling
       joint POE project to build a boot loader for enhanced PIC

4) Investigate and improve some part of Linux.

       where are the real performance bottlenecks?
       eliminate thrashing
       implement an application-specific memory allocation system
           (or evaluate existing ones)
       evaluate alternative file system implementations

5) Develop a higher-level abstraction.

       replace the file system abstraction with something like
           a searchable database

6) Perform a simulation study of a real-life "system".

       optimize some of the algorithms we use in real life
       example: virtual postal addresses

7) Investigate alternative architectures.

       Build and evaluate a system that uses flash memory instead of disk.
       Use a simulation study to evaluate other wacky ideas.