Coding Java Expert needed
Data structures assignment/week4b.pdf
CI583: Data Structures and Operating Systems
Algorithmic strategies to solve NP-hard problems
1 / 15
Depth-�rst and breadth-�rst
Many NP problems can be reduced to a problem in which the nodes in a graph must be visited exactly once � this is a more general idea of traversal, which we used with trees.
For instance, we may need to transmit a network packet to every computer on a network, making sure that no computer receives it twice.
There are two ways we might do this: depth-�rst and breadth-�rst.
2 / 15
Depth-�rst and breadth-�rst
A depth-�rst traversal will go as far as possible down a given path before it considers any other. A breadth-�rst traversal goes evenly in many directions.
7 8 9
456
321
3 / 15
Depth-�rst traversal
In a depth-�rst traversal, we visit the starting node then follow edges through the graph until we reach a dead-end. In an undirected graph a node is a dead end if all nodes adjacent to it have been visited. In a directed graph a node is also a dead end if it has no outgoing edges.
7 8 9
456
321
4 / 15
Depth-�rst traversal
When we reach a dead end we go back up the path until we �nd a node with an unvisited adjacent node. One traversal of this graph starting at 1 reaches a dead end at 6: [1, 2, 3, 4, 7, 5, 6].
7 8 9
456
321
5 / 15
Depth-�rst traversal
We then go back to node 7 and �nd that 8 is unvisited. We visit 8 and reach a dead end.
7 8 9
456
321
6 / 15
Depth-�rst traversal
We then go back to 4 and �nd that 9 is unvisited. The next time we go back up the path we end up at the starting node and we are done.
7 8 9
456
321
7 / 15
Breadth-�rst traversal
With a breadth-�rst traversal we �rst visit all nodes which are adjacent to the starting node. Then we visit all nodes which are two edges away from the starting node, and so on. For the graph below, one traversal starting at 1 is [1, 2, 8, 3, 7, 4, 5, 9, 6].
7 8 9
456
321
8 / 15
Breadth-�rst traversal
With a breadth-�rst traversal we �rst visit all nodes which are adjacent to the starting node. Then we visit all nodes which are two edges away from the starting node, and so on. For the graph below, one traversal starting at 1 is [1, 2, 8, 3, 7, 4, 5, 9, 6].
7 8 9
456
321
9 / 15
Breadth-�rst traversal
With a breadth-�rst traversal we �rst visit all nodes which are adjacent to the starting node. Then we visit all nodes which are two edges away from the starting node, and so on. For the graph below, one traversal starting at 1 is [1, 2, 8, 3, 7, 4, 5, 9, 6].
7 8 9
456
321
10 / 15
Breadth-�rst traversal
With a breadth-�rst traversal we �rst visit all nodes which are adjacent to the starting node. Then we visit all nodes which are two edges away from the starting node, and so on. For the graph below, one traversal starting at 1 is [1, 2, 8, 3, 7, 4, 5, 9, 6].
7 8 9
456
321
11 / 15
Breadth-�rst traversal
With a breadth-�rst traversal we �rst visit all nodes which are adjacent to the starting node. Then we visit all nodes which are two edges away from the starting node, and so on. For the graph below, one traversal starting at 1 is [1, 2, 8, 3, 7, 4, 5, 9, 6].
7 8 9
456
321
12 / 15
Traversal
Implementations of depth-�rst traversal often use a stack. Implementations of breadth-�rst traversal often use a queue. Give DF and BF traversals starting at the node labelled 1 for this directed graph:
5 4 3
76
21
13 / 15
Weighted graphs
Consider the problem of �nding �cheapest� paths in a directed and weighted graph:
5 4 3
76
21
5
2
1
3
4
1
2
1
2 5
14 / 15
Minimum spanning trees
The problem of �nding cheapest or shortest paths in a a weighted and connected graph reduces to that of �nding the minimum spanning tree (MST). A spanning tree for a graph, G, contains all the nodes of G and a subset of the edges and has no cycles. A connected graph is one in which there is a path from every node to any other.
The MST for a graph, G, is a spanning tree in which the total of the edge weights is minimal.
We can calculate the MST by brute force � for n edges this takes 2n comparisons between potential MSTs � this is an NP problem.
15 / 15
- Depth-first and breadth-first
- Greedy algorithms
__MACOSX/Data structures assignment/._week4b.pdf
Data structures assignment/week8a.pdf
Processes and threads
CI583: Data Structures and Operating Systems OS Design Principles
1 / 17
Processes and threads
Outline
1 Processes and threads
2 / 17
Processes and threads
Processes and threads
Conceptually, the thread is a unit of computation, to be stopped and started by the OS.
T im
e
Process
Thread #1 Thread #2
3 / 17
Processes and threads
Processes and threads
Each thread belongs to an enclosing process, which can also be stopped and started by the OS and which is used to store context which includes:
an address space (the set of memory locations that contains the code and data for the program),
a list of references to open files, and
other information might be common to several threads.
4 / 17
Processes and threads
Processes and threads
To represent a process we use a data structure called a process control block (PCB), containing references to all the info given above and to a list of threads.
Each thread is represented by a thread control block (TCB). The TCB contains references to thread-specific context, including the thread’s stack and contents of registers.
stack pointer
other registers
current state
stack pointer
other registers
current state
Address space description
open file descriptors
list of threads
current state
PCB TCB
stacks
stack pointer
other registers
current state
5 / 17
Processes and threads
Processes and threads
The OS needs to be able to create, delete and synchronise processes.
In Windows and *NIX a process is either active or terminated.
Terminated processes are deleted (by a process dedicated to this purpose) when all of its threads are terminated and there are no more references to the process from elsewhere.
6 / 17
Processes and threads
Processes and threads
When a process is created, the address space is loaded into memory. On Linux, this is accomplished by the exec program.
How does the OS initialise the address space?
One approach would be to copy the code and data of the program into the address space, but this would be inefficient.
7 / 17
Processes and threads
Processes and threads
The code section of the program is read-only and so can be shared by any processes executing the same program.
The parts of the data section that are never modified can also be shared.
8 / 17
Processes and threads
Processes and threads
A better approach is to map the executable file into the address space.
Both the address space and the executable file are divided into blocks of equal size called pages.
9 / 17
Processes and threads
Processes and threads
The text regions of all processes running this executable are set up using hardware-address translation facilities, with each process mapping to the same location.
The data regions of each process initially refers to a single copy of the data portion of the executable, which has been copied into memory.
10 / 17
Processes and threads
Processes and Threads
$ find / -amin +1$ find ./ -name Main.java
Process A
TEXT
DATA
Process B
TEXT
DATA
Memory Map
A: TEXT
A: DATA
B: TEXT
B: DATA
11 / 17
Processes and threads
Processes and threads
When a process modifies data for the first time, it is given a new, private page containing a copy of the pristine page.
This is called a private mapping.
Modern systems also use the notion of a shared mapping: when data is modified the original page is altered and all processes see the change.
12 / 17
Processes and threads
Processes and threads
$ find / -amin +1$ find ./ -name Main.java
Process A
TEXT
DATA
Process B
TEXT
DATA
Memory Map
A: TEXT
A: DATA
B: TEXT
B: DATA
13 / 17
Processes and threads
Processes and threads
When a thread is created it is in the runnable state.
At some point it is switched into the running state by the scheduler (which we will come back to in more detail).
A thread that is running may then be put into the waiting state, for instance because it has initiated an I/O action and needs to wait for the result before doing anything else.
14 / 17
Processes and threads
Processes and threads
The thread can put itself into the waiting state, or can be moved into the state by the OS.
If the OS uses time-slicing, whereby threads are run for a maximum period of time before relinquishing the processor, the scheduler can move the thread into the waiting state.
15 / 17
Processes and threads
Processes and threads
A thread can move itself into the terminated state, or it can be moved there by the OS.
Note that a thread must be in the running state at least once before terminating.
Terminated threads are removed in various ways, such as by a dedicated “reaper” process.
16 / 17
Processes and threads
Processes and threads
Running
Waiting
TerminatedRunnable
17 / 17
- Processes and threads
__MACOSX/Data structures assignment/._week8a.pdf
Data structures assignment/week9e.pdf
Journalled file systems
CI583: Data Structures and Operating Systems Journalled File systems
1 / 16
Journalled file systems
Journalled file systems
In journalling, the steps of a transaction are written to a journal before being committed, or written to disk.
If anything goes wrong whilst the changes are being written to disk, a recovery procedure repeats the steps from the journal when the system restarts.
If any change is made twice, that’s not a problem because changes are idempotent – carrying them out n times is the same as carrying them out once.
2 / 16
Journalled file systems
Journalled file systems
There are two ways to do this:
redo journalling (as described above), and
undo journalling, in which the old contents of data blocks are stored in the journal, allowing the user to get back to the previous consistent state after a crash.
3 / 16
Journalled file systems
Journalled file systems
Making every change twice (once in a journal, once for real) risks doubling the amount of work, and journalling would not have been practical on the systems of the 70s.
However, aggressive caching and other techniques mean that we can minimise the work involved by batching up sets of changes and carrying them out in one go, rather than writing to the journal every time one byte is altered.
4 / 16
Journalled file systems
Journalled file systems
How big, then, should a transaction be?
Deleting a large file might require millions of operations, so that a single task is too big for the journal.
Carrying out each small change in its own transaction would be inefficient.
5 / 16
Journalled file systems
Journalled file systems
In practice, small changes are batched together into one transaction and big changes are broken up into several transactions.
Some systems (e.g. ext3) use time-based transactions.
The important thing is that we respect the ACID rules and take the system from one consistent state to another.
6 / 16
Journalled file systems
Shadow-paged file systems
Journalled file systems ensure consistency but involve rather a lot of work. There is a simpler solution: shadow-paged file systems.
Like journalling, shadow-paged systems are only viable in a context where memory is plentiful and processors are fast.
Examples include ZFS from Sun.
7 / 16
Journalled file systems
Shadow-paged file systems
The idea is simple – the whole file system is represented in memory as a data structure called the shadow-page tree, the root of which is called the überblock.
The tree contains pointers to all metadata and data blocks.
8 / 16
Journalled file systems
Shadow-paged file systems
Whenever a disk block is about to be changed a copy is made and it is that which is modified – the original is kept unchanged.
To link the modified copy into the tree, the parent node must be altered, but instead of changing it directly, that node is also copied, and so on, up to the root node.
This procedure is called copy-on-write.
9 / 16
Journalled file systems
Shadow-paged file systems
The root node is modified directly in a single disk write.
If the system crashes while an update is in progress but before the root node has been altered, the system comes back up with the old copy of the tree.
By keeping the original root node, we have a snapshot of the system as a whole.
10 / 16
Journalled file systems
Shadow-paged file systems
root
indirect inode blocks
direct inode blocks
indirect data blocks
direct data blocks
11 / 16
Journalled file systems
Shadow-paged file systems
root
indirect inode blocks
direct inode blocks
indirect data blocks
direct data blocks
12 / 16
Journalled file systems
Shadow-paged file systems
root
indirect inode blocks
direct inode blocks
indirect data blocks
direct data blocks
13 / 16
Journalled file systems
Shadow-paged file systems
root
indirect inode blocks
direct inode blocks
indirect data blocks
direct data blocks
14 / 16
Journalled file systems
Shadow-paged file systems
root
indirect inode blocks
direct inode blocks
indirect data blocks
direct data blocks
15 / 16
Journalled file systems
Shadow-paged file systems
root
indirect inode blocks
direct inode blocks
indirect data blocks
direct data blocks
16 / 16
- Journalled file systems
__MACOSX/Data structures assignment/._week9e.pdf
Data structures assignment/week9d.pdf
Resiliency
CI583: Data Structures and Operating Systems File systems: Resiliency
1 / 12
Resiliency
Resiliency
Early file systems such as S5FS performed badly in the face of crashes and other unexpected shutdowns.
Not only could the files which were open at the time be corrupted, but other completely unrelated files could also be damaged.
It is easy to understand how unwritten modifications to a file could be lost, but how were other files being affected?
2 / 12
Resiliency
Resiliency
This happened because data structures which describe the whole file system, such as the superblock and the I-list, could easily become corrupted.
This is even more serious than losing the latest version of a user’s file.
In the worst case, two separate inodes could point to the same data, or system files could be marked as free space, etc!
3 / 12
Resiliency
Resiliency
The problem stems from the fact that many operations that we think of as a “semantic unit”, such as the system call write, actually comprise many operations.
Consider the task of adding an element to back of a queue.
In combination with caching, an interruption here can lead to the “last” element in the queue being some random memory location.
4 / 12
Resiliency
Resiliency Corrupting a data structure
. . .
5 / 12
Resiliency
Resiliency Corrupting a data structure
. . . .
6 / 12
Resiliency
Resiliency Corrupting a data structure
. . . .
7 / 12
Resiliency
Resiliency Corrupting a data structure
. . . ?
8 / 12
Resiliency
Resiliency
When a cache is being used, the user may actually have instructed the OS to save their work and been given feedback that the task is done, whilst in fact the changes are still in the cache and will not survive a crash.
We require the metadata structures on disk (list of free space, inodes, diskmaps within inodes, directories, etc) to be well formed at all times even if, at a push, they might not contain the latest updates.
If this is the case, then the data is consistent.
9 / 12
Resiliency
Resiliency
The measures taken by file systems to ensure consistency have normally focused on metadata consistency, rather than the contents of user files, although this is obviously important to the user.
One approach is to ensure that every change to the system (write, creat, rename, unlink) takes the system from one consistent state to another.
This is called a consistency-preserving approach, as used by FFS (80s UNIX file system, one of the successors to S5FS) for example. This is difficult in practice as every system change involves many individual changes.
10 / 12
Resiliency
Resiliency
A second approach is called the transactional approach.
Using this approach, we collect updates into groups called transactions, inspired by the database world.
This group of updates can be treated as a single action with the ACID properties.
11 / 12
Resiliency
Resiliency
Atomic: each transaction is all-or-nothing – either all of it takes place or none of it does.
Consistent: the system is always consistent. None of the inconsistent states that might exist while a transaction is taking place are visible outside of the transaction.
Isolated: a transaction has no effect until it is committed to disk, and is not affected by any other ongoing (uncommitted) transactions.
Durable: once a transaction is committed, its effects persist.
The two main approaches to providing these properties are journalling and shadow-paging, which we will look at in turn.
12 / 12
- Resiliency
__MACOSX/Data structures assignment/._week9d.pdf
Data structures assignment/week4c.pdf
CI583: Data Structures and Operating Systems
Algorithmic strategies to solve NP-hard problems
1 / 23
Greedy algorithms
Greedy algorithms work by looking at a subset of a larger problem
and �nding the best solution for that subset.
Dijkstra's Shortest Path algorithm is a greedy algorithm developed
in the late 50s. At each step it looks at adjacent edges and decides
which one to add to the spanning tree. By doing this repeatedly,
we end up with a spanning tree that has the minimum overall total,
the MST.
2 / 23
Dijkstra's shortest path
The algorithm works by putting all nodes into one of three
categories:
1 in the tree: nodes already added to the spanning tree,
2 on the fringe of the tree: those nodes adjacent to the current
node, and
3 not yet considered.
3 / 23
Dijkstra's shortest path
The algorithm, informally:
1 Select a starting node.
2 Build the initial fringe from nodes adjacent to
the starting node.
3 While there are nodes not yet considered, do
1 Choose the edge in the fringe with the smallest
weight.
2 Add the associated node to the tree.
3 Continue with the new selected node and updated
fringe.
4 / 23
Backtracking
The class of problems related to �nding a path through a maze or
avoiding a series of obstacles can be solved by backtracking
algorithms. This type of algorithm makes choices at each step and,
when it reaches a dead end, retraces its steps to a point at which
an alternative choice can be made.
Image c©http://www.liyenchong.com
5 / 23
Backtracking
Backtracking techniques save the state of the problem each time a
choice is made. These techniques are not only useful for
path-�nding.
Consider the N-queens problem. This problem asks how to position
N queens on an N × N chess board in such a way that they don't threaten each other.
6 / 23
4-queens
We can develop a state space tree to describe the 4-queens
problem. Note that no row, column or diagonal can contain more
than one queen. Each level of the state space tree represents the
possible places where each queen can be placed for one of the rows
of the board.
The state space tree will have 256 leaves, with each path from the
root to a leaf representing a possible solution. Most of these are
not solutions of course.
7 / 23
4-queens
root
1,1 1,31,2 1,4
2,1 2,2 2,3 2,4 2,1 2,2 2,3 2,4
8 / 23
4-queens
Given the state space tree, we can carry out a depth-�rst traversal
to place 4 queens on the board, then check whether they can
attach each other:
root
1,2
2,4
3,1
4,3
9 / 23
4-queens
What is wrong with this approach?
Using this approach, much more work is done than necessary. As
soon as we place a queen so that it threatens another, a path that
passes through that node will not lead to a solution and we call
that a nonpromising node. So there is no need to populate or
search subtrees with a nonpromising root.
A backtracking recursive solution to n-queens would stop recursing
whenever it reaches a nonpromising node. Note that, in this case,
the algorithm does not literally need to retrace its steps.
10 / 23
4-queens
What is wrong with this approach?
Using this approach, much more work is done than necessary. As
soon as we place a queen so that it threatens another, a path that
passes through that node will not lead to a solution and we call
that a nonpromising node. So there is no need to populate or
search subtrees with a nonpromising root.
A backtracking recursive solution to n-queens would stop recursing
whenever it reaches a nonpromising node. Note that, in this case,
the algorithm does not literally need to retrace its steps.
10 / 23
Backtracking
The general strategy:
procedure SearchSpace(i) if there is a solution then
output the solution
else
for every possible next step do
if the step is promising then
SearchSpace(i+1)
end if
end for
end if
end procedure
11 / 23
Dynamic programming
Dynamic programming techniques include algorithms in which the
most e�cient solutions depend on choices that might change with
time.
The key feature to dynamic programming is memoisation � this is
the technique of storing the result of expensive computations in
order to reuse them.
12 / 23
Dynamic programming
This Java method calculates the nth element in the Fibonacci sequence (0, 1, 1, 2, 3, 5, 8 . . .):
1 int fibonacci(int n) {
2 if (n<2) return n;
3 return fibonacci(n-1)+fibonacci(n-2);
4
5 }
13 / 23
Dynamic programming
This is very ine�cient. To calculate fibonacci(10) we calculate
fibonacci(9) and fibonacci(8). To calculate fibonacci(9)
we calculate fibonacci(8) (again) and fibonacci(7), and so on.
A solution that uses memoisation:
1 int fibonacci(int n) {
2 if (n<2) return n;
3 int[] data = new int[n];
4 data [0] = 1;
5 data [1] = 1;
6 for(int i=2;i<n;i++)
7 data[i] = data[i-1]+ data[i-2];
8 return data[n-1];
9
10 }
An even more e�cient solution would store just the last two values,
as these are all we will need.
14 / 23
Next time
We have described the set of problems for which there is
(currently!) no polynomial time solution, looked at some examples,
and at some algorithmic techniques that can be used to tackle
these problems.
In a later lecture we will look at other important algorithmic
techniques: ones which make use of probability and chance, and
ones which are suited for paralellisation.
15 / 23
- Backtracking
- Dynamic programming
__MACOSX/Data structures assignment/._week4c.pdf
Data structures assignment/week5e.pdf
CI583: Data Structures and Operating Systems
Encryption
1 / 19
Outline
1 Encryption
2 Lossy algorithms
2 / 19
Random numbers
Before beginning our discussion of encryption, we need to consider random and pseudo-random numbers.
The only real randomness is found in nature.
If you really need a random number you might start by sampling radiation coming from deep space.
We can emulate this kind of chaos with a hardware random number generator that generates electronic static or something similar.
Whether nature is truly chaotic or follows patterns that we can't yet perceive is an open question.
3 / 19
Random numbers
The �random� numbers we normally encounter in computing are pseudo-random, generated by deterministic algorithms called PRNGs.
Every time we follow the same process starting from the same point we will get the same series of numbers.
We can usually make a process like this appear to be �random enough� by starting from a di�erent point, such as the current time elapsed since the UNIX era began (1st January 1970).
When owners of the �rst iPods complained that the shu�e function �wasn't random enough�, Apple made it less random so that it seemed more random.
4 / 19
Random numbers
An e�ective type of PRNG is based on the idea of a seed and uses two constants, a multiplier, a, and a modulus, m. The seed holds the last number generated and is initialised to a value in the range [1, m − 1]. Many implementations divide the seed by m, generating a number between 0 and 1.
1 int a = 16807; // values for a and m suggested by Park
and Miller , 1988
2 int m = 2147483647;
3 int seed;
4 double rand() = {
5 if (seed == 0) seed = time() % m;
6 seed = (seed*a) % m;
7 return seed / m;
8 }
We can use this to get a number in the range low-high as follows: (high − low) × rand() + low.
5 / 19
Encryption
Encryption and, more generally, cryptography, are very large subjects which often rely on advanced mathematics.
They include some important ideas from an algorithms point of view though, so we will consider these from quite a high level.
Encryption is a reversible process that enables us to safeguard data so that is can only be read by known parties � those to whom we have shared a decryption key.
6 / 19
Encryption
There are two types of encryption: symmetric key (both parties use the same key to encrypt/decrypt) and public-key (the sender uses a private key to encrypt and distributes a second, public decryption key to receivers).
Public-key encryption is widely used to protect data in transit or �sign� data such as emails in a way that veri�es its origin. It is built into several internet protocols such as Secure Shell (SSH).
7 / 19
Public-key encryption
Public-key encryption depends on the existence of a pair of keys, public and private, and on the security of the private key.
The keys are mathematically linked but it is impossible or impractical to derive the contents of the private key from the public one.
1 $ cat .ssh/id_dsa.pub
2 ssh -dss AAAAB3NzaC1kc3MAAACBAJdslmtghbFRlYB5QlLnCHpK
3 xeDKhy93Ba3gF02+RLpBU2QuzozDT +0 xP0vO6R65qbrJlSt1xiuW
4 FuomFLg/PeIwTy3dbExVujhA/OSGEMLJ9moEjmTqyu8OiZvFKu55
5 bW33cl6g46suMW +5 cNAAAAFQDobWfSlzR9YeTKRIw+Zx3YoBj9IQ
6 AAAIBcM+Hy9cGLbNJRZVJUDWnVWQcajxtz
7 ...
8 / 19
Key generation
A widely-used public key algorithm is RSA, developed in 1977 by Rivest, Shamir and Adleman of MIT.
The public key is based on the product of two large prime numbers.
If the numbers used are big enough, working out the candidates by brute force will either take years on a supercomputer or be completely impractical.
1 Choose two large prime numbers, p and q.
2 Compute n = pq. The bit length of n is the key length.
9 / 19
Key generation
3 Compute Φ(n), the number of integers less than or equal to n which are coprime with n � no number divides them both except 1.
4 Choose an e where 1 < e < Φ(n) and e and Φ(n) are coprime.
5 Compute d, the multiplicative inverse of e (modulus Φ(n)). I.e. e−1 ≡ d (modulus Φ(n)) � example on last slide.
Then we know that de ≡ 1 (modulus Φ(n)).
The public key consists of (n, e).
The private key consists of (n, d). d and anything that can help recreate it (p, q and Φ(n)) are secret.
10 / 19
RSA encryption
Alice sends her public key (n, e) to Bob. Bob wants to send a message, M, to Alice.
Bob turns M into an integer m, 0 < m < n using a reversible scheme that adds some randomised padding to the message.
Then Bob creates the encrypted message c ≡ me (modulus n). (See Cormen book on RSA for an e�cient method.)
Bob sends c to Alice.
11 / 19
RSA decryption
Alice receives c and computes m ≡ cd (modulus n).
She recovers M from m by reversing the padding scheme.
RSA is the �rst of the public-key algorithms and the simplest to describe at a high level.
Although the maths can be daunting, the details and implications of this and other schemes are very interesting.
See, for example, Understanding Cryptography by Prenner, Paar and Pelzl, Springer, 2009.
12 / 19
Summing up
We have seen some simple and elegant algorithms for compressing human-readable content and more complex ones that compress binary data based on human perception.
We have also described some of the algorithms for encrypting and decrypting data that secure digital communications depend on.
Next week: Probabilistic and non-deterministic algorithms.
13 / 19
Example of multiplicate inverse and multiplication by powers
Suppose we want to �nd x, the multiplicative inverse of 3, modulus 11. That is,
3−1 ≡ x(modulus11).
This is the same as 3x ≡ 1 (modulus 11). One value for x is 12, since 3 × 4 = 12 ≡ 1 (modulus 11).
Method for exponentiation by squaring xn in O(lg n) time (not tail-recursive):
1 int expBySquare(int x, int n) {
2 if(n==1) return x;
3 else if(n%2==0) return expBySquare(x*x, n/2);
4 else return x * expBySquare(x*x, (n-1) /2);
5 }
14 / 19
Lossy algorithms
Lossy compression depends on understanding how we perceive sensory data. It involves trade o�s between the compression achieved, the quality of the end result and the computational complexity of the algorithms used.
How aggressively this can be carried out depends on the context.
Loss of precision in a bitmap image stored on a digital camera is quite acceptable within certain bounds, and may not even be detectable to the human eye. It all depends on how the data is perceived.
15 / 19
Lossy algorithms
Lossy algorithms work by discarding data. This could be done by �rounding� the data so that, for example, all shades of grey in an image �le are converted to a single value, working on the assumption that human vision is more sensitive to changes in luminosity than it is to hue.
We will brie�y consider methods for audio compression.
16 / 19
Audio compression
There are many lossless audio compression techniques, such as Free Lossless Audio Compression (FLAC), but lossy techniques, such as that used by MP3, achieve much better compression � e.g. 80-95% as opposed to 40-50%.
Most techniques (e.g. MP3) work by �rst converting the input from a series of data over time to data over frequencies: i.e. how much of the input falls within a given frequency. This transformation is reversible.
17 / 19
Audio compression
Once transformed in this way, di�erent frequencies can be allocated priority (bits in the compressed output) depending on how audible they are.
The �rst step is to discard frequencies beyond or close to the limits of human hearing.
Next, the algorithms consider simultaneous masking: when two sounds occur simultaneously, one (stronger) signal may make the other completely inaudible, or we may hear a single tone which is a combination of the two. Understanding how the ear works allows us to discard a lot of data at this stage.
18 / 19
Audio compression
A related phenomenon is temporal masking: a weaker signal occurring immediately after a strong signal, depending on their respective frequencies, may be inaudible and can be discarded.
Variations in loudness can also be normalised in some circumstances, since they may be perceived as equally loud by someone with normal hearing.
19 / 19
- Encryption
- Lossy algorithms
__MACOSX/Data structures assignment/._week5e.pdf
Data structures assignment/week4a.pdf
CI583: Data Structures and Operating Systems
What are NP-hard problems?
1 / 10
What is NP?
All of the algorithms we have looked at so far have had solutions in
polynomial time or better.
The worst-case scenario is O(nm), where n is the size of the data.
So we can get a solution to a problem using one of these algorithms
in a �reasonable� time.
This class of problems is called P (polynomial), and the problems in
it are said to be tractable.
2 / 10
P and NP
This is in contrast to the class of problems for which there is no
polynomial (or better) solution: NP (non-deterministic polynomial),
the class of intractable problems.
Algorithms known which can solve these problems tend to be of
exponential (O(2n)) or factorial (O(n!)) order.
If an algorithm is O(2n) then every time we add a single element to the input the time taken doubles.
3 / 10
Travelling salesman
As an example, consider the travelling salesman problem. We are
given a set of cities and a cost of travelling between each of them.
The goal is to visit each city once, ending up back at the city from
which we began, and minimising the cost.
4 / 10
Travelling salesman
This a problem that often needs to be solved in the real world �
e.g. calculating the best route for a delivery van or how to
e�ciently send messages among nodes in a network.
If we have 8 cites, there are 40,320 orderings. A brute force
solution is one that inspects every possibility.
For 10 cities there are 3,628,800 possibilities � to �nd the best
route we have to examine them all.
For 15 cities, if we could do 100 route comparisons per second then
it would take more than 400 years to �nd the answer by brute force.
5 / 10
P and NP
Thus, a deterministic algorithm (one that examines all possibilities)
would take an extremely long time to complete.
To show that the problem is in NP we have to describe a two-step
process for solving it � the �rst step generates a non-deterministic
potential solution to the problem, the second step checks whether
this is a true solution.
In our case, the �rst step would be to generate a list of the cities to
visit in some random order. This step would run in O(n). The second step would be to check the cost of the solution, also in
O(n). The point is that we don't know how many times we will need to repeat this.
6 / 10
NP-complete
We may be tempted to think that a problem is in NP just because
we haven't come up with a polynomial algorithm yet.
The term NP-complete is used to describe the hardest problems in
NP.
If we were ever to �nd a polynomial solution to an NP-complete
problem then we could �nd a polynomial solution for all of them.
7 / 10
NP-complete
This is because a problem is NP-complete if and only if every other
problem in the class can be transformed into it.
To show that problem A is NP-complete we need to show how to transform it into another NP-complete problem, B.
Because B is NP-complete, every problem in NP could be transformed into B.
8 / 10
P and NP
Typical NP problems:
1 Graph colouring: assign a �colour� to each node in a graph so
that no adjacent nodes have the same colour. Applications
include timetabling problems.
2 Bin packing: given a number of bins and a set of objects with
varying sizes, what is the smallest number of bins we need to
store all objects? This can be used to work out how to use
materials e�ciently, such as how to cut a piece of metal into
smaller parts with minimal waste.
3 CNF-SAT problem: given a logical expression in conjunctive
normal form (i.e. is a conjunction of disjunctive clauses), is
there some assignment that makes the whole thing true?
9 / 10
P and NP
This sets the scene for one of the most famous question in CS:
P = NP?
We know that P is a subset of NP � for every problem that has a
polynomial solution (e.g. sorting) we could come up with a worse,
non-deterministic solution. Given a problem in NP, X, for which there is no polynomial deterministic solution, we need to show that
X is not in P.
All can say so far about whether X is also in P is that we haven't found a deterministic solution yet. From a common-sense point of
view computer scientists are con�dent that P 6= NP, but the interesting thing is why this is so hard to prove. The search for a
proof continues.
10 / 10
- P = NP?
__MACOSX/Data structures assignment/._week4a.pdf
Data structures assignment/week9f.pdf
Efficient directories
CI583: Data Structures and Operating Systems File systems: directories
1 / 9
Efficient directories
Efficient directories
As a last word on file systems, we will briefly look at how directories might be implemented in a modern system.
In simple file systems such as S5FS, a directory is an ordinary file that maps some file names to metadata (file size, starting block, ownership, etc), listed in the order in which the files were added.
2 / 9
Efficient directories
Efficient directories
This works OK when directories contain only a small number of files.
Even when directories are small, how should we maintain these directory files when files are added to and removed from the directory?
If the first 3 files are removed from the directory, is that space now lost due to fragmentation?
S5FS ignores these issues (and was right to do so – read about worse is better!).
https://en.wikipedia.org/wiki/Worse_is_better
3 / 9
Efficient directories
Efficient directories
Times have changed, however.
Directories might contain thousands of files.
Organising them in this linear (O(n)) way means, in the worst case, searching to the end of the list every time we add, rename, or otherwise access a file.
4 / 9
Efficient directories
Efficient directories
There are two main approaches to providing more efficient directories:
1 hashing: construct a function, f , such that f (/home/me/myfile) returns the block that contains the directory entry for /home/me/myfile .
The details of this approach get complicated when we consider adding and removing items from directories, which require us to modify the hashing function (and make sure it continues to work for the previous cases),
5 / 9
Efficient directories
Efficient directories
2 indexes: represent the contents of a directory as an efficient data structure of the type used in an RDBMS.
6 / 9
Efficient directories
Efficient directories
Indexing involves uses a balanced search tree to represent the contents of a directory.
The tree is balanced because each leaf node is the same distance from the root.
Searching such a tree is very efficient, but insertion and deletion are more expensive operations.
7 / 9
Efficient directories
Efficient directories
The most common type of balanced tree used in file systems is the B+ tree. Only the leaves contain data, which are blocks containing directory entries. This figure is an over-simplification, but conveys the idea.
rob
dan tim
ann dan rob timDATA
< rob >= rob
Demo
8 / 9
Efficient directories
Next time
Memory management: virtual memory, page tables, etc.
9 / 9
- Efficient directories
__MACOSX/Data structures assignment/._week9f.pdf
Data structures assignment/week8b.pdf
Managing hardware Shared libraries
CI583: Data Structures and Operating Systems OS Design Principles
1 / 21
Managing hardware Shared libraries
Outline
1 Managing hardware
2 Shared libraries
2 / 21
Managing hardware Shared libraries
Managing hardware
In an earlier lecture we discussed device drivers, which contain the code that knows how to interact with a given device.
How do we make sure that the right driver is associated with the right device, that all devices are properly initialised when the system starts, or that new devices are recognised correctly when attached?
3 / 21
Managing hardware Shared libraries
Managing hardware
In early UNIX systems the kernel had hard-coded support for the relevant devices. In order to add a new device, the user needed to recompile the kernel image.
Each device was identified by a device number which was a pair of the major device number, which identified the driver, and minor device number, which identified the particular device among those handled by this driver.
4 / 21
Managing hardware Shared libraries
Managing hardware
Special files were created on the file system in the /dev directory to refer to devices.
These “special” files refer to the device using its device number. Thus, when an application opens the “file” /dev/sda1 the kernel uses the right device and driver.
5 / 21
Managing hardware Shared libraries
Managing hardware
This approach was laborious and inadequate for the number of devices that can be used with today’s systems.
The modern approach uses what we might call meta-drivers, each responsible for the class of devices that can use a particular bus,
So, the USB driver knows about USB devices and probes the system at start-up time to see which are attached.
6 / 21
Managing hardware Shared libraries
Managing hardware
The correct drivers and kernel modules are then loaded to initialise these devices.
Whilst a modern Linux system is running, the udev “daemon” listen for the connection and removal of devices and updates the contents of /dev accordingly.
Most of what udev does is in userland (doesn’t require special privileges).
7 / 21
Managing hardware Shared libraries
Interacting with a device
We begin by considering how the OS might interact with a very simple device – a terminal, which takes user input at the command line and displays the results in a simple text-only UI.
Although this might seem like an obsolete example, the way that the OS reads from and writes to this device is something that can be adapted for many contexts.
8 / 21
Managing hardware Shared libraries
Interacting with a device
We can see straight away that it’s not enough to read one character at a time from the device, or to write data straight to the device. Why not?
1 Data may be sent to the device faster than they can be processed, or generated by the user faster than the application can accept it.
9 / 21
Managing hardware Shared libraries
Interacting with a device
2 Chars may arrive from the keyboard when there is no waiting read request.
10 / 21
Managing hardware Shared libraries
Interacting with a device
3 Input may need to be processed in some way before they read the application.
For instance, chars need to be echoed to the screen so that the user can see what they are typing.
Chars may be grouped into lines which can be edited before submitting by pressing enter.
The terminal may support tab-completion.
11 / 21
Managing hardware Shared libraries
Interacting with a device
Items 1 and 2 are examples of the producer-consumer problem, in which two processes need to synchronise their access to a finite queue or buffer.
The producer’s job is to generate a piece of data, put it into the buffer and start again.
12 / 21
Managing hardware Shared libraries
Interacting with a device
At the same time, the consumer is consuming the data (i.e., removing it from the buffer) one piece at a time.
The problem is to make sure that the producer won’t try to add data into the buffer if it’s full and that the consumer won’t try to remove data from an empty buffer.
13 / 21
Managing hardware Shared libraries
Interacting with a device
The OS will communicate with many devices using the same model of input and output queues (even a window manager, which controls the gui – here the input comes from keyboard and mouse, and the window manager needs to keep track of which application window has the focus, whilst the output is a stream of requests to repaint parts of the display using new data).
We separate this common functionality into something called a line discipline module.
14 / 21
Managing hardware Shared libraries
Shared libraries
We have already mentioned the fact that each OS has an API and that high-level languages provide convenient calls to the API in standard libraries.
As well as wrappers for the API, standard libraries contain large numbers of standard functions for convenience.
15 / 21
Managing hardware Shared libraries
Shared libraries
These shared libraries are called dynamic-linked libraries on Windows and shared objects on Linux.
One advantage of using them is that they need not be loaded until needed, improving the start-up time of a program.
16 / 21
Managing hardware Shared libraries
Shared libraries
The biggest advantage is code reuse – few programmers would attempt GUI programming without a library.
One disadvantage is that they bring increased complexity (DLL A depends on DLL B, which depends on DLL C – are the versions of these DLLs compatible?) – this complexity is greatly tamed by managed runtime environments like .NET or the JVM.
17 / 21
Managing hardware Shared libraries
Shared libraries
When a program is converted into machine code that can be executed directly by the OS, it needs to contain everything it needs to run.
When we include a call to a shared function in our program, such as a call to the C function printf (or, more indirectly, System.out.println in Java), a compiler could take one of several choices:
18 / 21
Managing hardware Shared libraries
Shared libraries
Approaches to shared libraries:
1 Include a copy of printf with our program. This would make our programs much larger than they need to be and be a waste of storage.
2 Load a copy of printf at runtime, along with our program. There would only be one copy on disk but each program would load it into memory: again, this would be very inefficient.
19 / 21
Managing hardware Shared libraries
Shared libraries
Approaches to shared libraries:
3 Ensure that all processes that make use of printf refer to the same known location, L. If some process wants to make use of that location for another purpose, move the single copy of printf to another location, M. This is the approach taken by Windows.
4 Ensure that each process that calls on printf maintains a table mapping an arbitrary memory location, L1, to the real location in memory of printf, L2. This approach is called position-independent code and is the approach used by most modern UNIX systems.
20 / 21
Managing hardware Shared libraries
After the break
Storage, virtual memory, virtualisation, scheduling.
21 / 21
- Managing hardware
- Shared libraries
__MACOSX/Data structures assignment/._week8b.pdf
Data structures assignment/week8c.pdf
Scheduling
CI583: Data Structures and Operating Systems Scheduling
1 / 20
Scheduling
Scheduling
We have said that the main purpose of an OS is to manage resources, and one of the ways of doing this is to ensure that processes and threads have access to a “fair” amount of processor time.
Managing the access to processors is called scheduling. This is a dynamic process – the scheduler has to respond immediately to demands for processor time, as threads move in and out of the runnable state.
2 / 20
Scheduling
Scheduling
At the highest level, we can consider that we have a list of runnable threads which we want to sort so that the most urgent ones are at the front of the list. But what do we mean by “urgent”? That could be a question of:
giving priority to interactive threads,
giving priority to system-level threads,
maximising the number of threads processed per unit of time,
a combination of the above.
3 / 20
Scheduling
Scheduling strategies
The strategy we use will depend on the type of system we are designing:
1 Simple batch systems (SBS): Each process runs to completion and the job of the scheduler is to pick the next task to be run.
The two main considerations are system throughput and average wait time.
4 / 20
Scheduling
Scheduling strategies
2 Multiprogrammed batch systems: Same as SBS but several jobs are run concurrently.
The considerations from the SBS are supplemented with the questions of how many jobs should be running at any one time and how the processor time is apportioned amongst running jobs.
5 / 20
Scheduling
Scheduling strategies
3 Time-sharing systems: here the main question becomes apportioning time to the jobs that are ready to execute – the runnable threads.
The main concern is response time – the time from when a command is given to when it is completed. Short requests should be completed quickly.
6 / 20
Scheduling
Scheduling strategies
4 Shared servers: A single computer being used by many clients.
The question arises here of giving each client a reasonable apportionment of time, instead of focusing only on runnable threads.
7 / 20
Scheduling
Scheduling strategies
5 Real-time systems: Can be soft or hard.
An example of soft real-time would be video processing – data must be processed in a strictly synchronised fashion and as efficiently as possible, but some lag is not a disaster.
An example of hard real-time would be autonomous vehicle software that alters the direction of a car – certain commands must be handled in a timely way.
8 / 20
Scheduling
Simple Batch Systems
On an SBS, jobs are run one at a time and, when a job ends, we can choose between several queued jobs to decide which to run next.
There are two approaches we could take:
1 First-in-first-out (FIFO): jobs are executed in the order they were submitted to the system.
2 Shortest-job-first (SJF): some (relatively crude) measure is used to decide how long a job is going to take, and the shortest will be executed first.
9 / 20
Scheduling
Simple Batch Systems
FIFO might seem fairest, but if we take average throughput as the measure of effectiveness for our scheduler, then SJF will perform best.
However, without further measures, SJF could mean that a very long job is queued indefinitely, so we need to make sure that doesn’t happen.
10 / 20
Scheduling
Multiprogrammed Batch Systems
An MBS is essentially the same as SBS except that several jobs are held in memory at one time and time-slicing is used to switch focus between the currently active job.
To do this we need to decide how many jobs to hold in memory at any one time and to decide on a time quantum – the amount of time to allocate to a job before it is preempted in favour of the next job.
A solution which gets round some of the problem with SJF is to hold multiple queues of relatively short and relatively long jobs, so a short and a long job will be held in memory at any one time.
11 / 20
Scheduling
Time-Sharing Systems
A TSS is one which has several users logged on at any one time.
The main concern of the scheduler is that the system should feel responsive. A job which ought to be short should be short.
12 / 20
Scheduling
Time-Sharing Systems
If a job that normally takes 3 or 4 minutes, such as compiling some code, takes 5 minutes, users probably won’t mind.
But if a job that should be very quick, such as opening a file in a word processor, takes 1 minute then the users will think the system is slow.
13 / 20
Scheduling
Time-Sharing Systems
A scheduling strategy for a TSS should favour short and interactive operations at the possible expense of longer ones.
As a first attempt, consider a time-sliced scheduler that uses a round robin strategy. When a thread uses up its time quantum, it is moved to the back of the queue and the next thread gets to run.
14 / 20
Scheduling
Time-Sharing Systems
As we want to prioritise short, interactive operations, we can modify this strategy to assign a priority to each thread. We can use this notion of high/low priority to move short, interactive threads to the top of the list or to maintain multiple queues.
SHORT
SHORT/ INTERACTIVE
LONG
15 / 20
Scheduling
Time-Sharing Systems
Determining the level of interactivity of a process will necessarily involve some guesswork.
A more effective algorithm is to reduce the priority of a thread each time it uses a time quantum.
16 / 20
Scheduling
Time-Sharing Systems
The first time a thread runs, it does so at the highest priority level and, presuming it isn’t completed, its priority is reduced.
The next time it runs, it’s priority is reduced again, and so on. This is called a multilevel feedback queue.
In this system, threads in low priority queue can only run if there are no threads in a higher queue.
17 / 20
Scheduling
Time-Sharing Systems
HIGH
LOW
18 / 20
Scheduling
Time-Sharing Systems
However, we can improve on this by taking other factors than the number of time quanta consumed into account.
When we run a thread, we can modify its priority based on how long it has been waiting since it last ran (e.g. it may have been in the paused state for a long time, waiting for an IO operation to end).
19 / 20
Scheduling
Time-Sharing Systems
We can sum this up by saying a thread’s priority gets worse while it is running, and better while it is waiting.
This is the strategy used by the schedulers in Windows and Linux today.
At the same time, each OS allows the user some way of stating how important they consider a process and its threads to be (on Linux, this is the nice command).
20 / 20
- Scheduling
__MACOSX/Data structures assignment/._week8c.pdf
Data structures assignment/week5d.pdf
CI583: Data Structures and Operating Systems
Compression algorithms
1 / 25
Outline
1 Compression
2 Run-length encoding
3 Hu�man coding
2 / 25
Data, data everywhere!
In the digital age, we're drowning in data. Google pioneered the �Big Data� era by never throwing anything anyway and storing loosely connected or unconnected data with the aim of applying meaning to it later on.
Almost all data generated is metadata produced automatically.
The global data volume is estimated to have exceeded 64 zettabytes in 2020.
3 / 25
Data, data everywhere!
It's hard to imagine how much data is contained in 64 zettabytes...
1 zettabyte = 1000 exabyte = 1 million terabytes = 1 trillion GB
It has been estimated that all human words ever spoken could be encoded into 5 exabytes1.
1 New York Times: http://tx0.org/5ls
4 / 25
Compression algorithms
Storage might be relatively cheap but the fact that we can store so much data and transfer it across networks etc is largely thanks to e�ective and widely used compression algorithms which retain, to a greater or lesser degree, the meaning of the input.
The �eld that studies this is called Information Theory, combining mathematics and computer science, and founded by Claude Shannon with a series of landmark works in the 1940s and 50s.
A compression algorithm can be lossless (information-preserving) or lossy (some information considered non-essential is discarded).
5 / 25
Run-length encoding
Perhaps the simplest example of a lossless encoding is a run-length encoding (RLE). This works by replacing repetition with a single value and a count. Thus, if we have a protocol that encodes the pixel values of one row of a black-and-white image like so:
WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWW
WWBWWWWWWWWWWWWWW
this has the following RLE:
12 W1B12W3B24W1B14W
6 / 25
Optimising RLE
RLE algorithms are simple and can be extremely e�ective especially for data with little variation, where savings of up to 90% are not unusual.
To store data with more variation in it, such as text, we can optimise the process by transforming the input before applying the RLE. Obviously, such a transformation needs to be reversible so that we can recover the original.
7 / 25
Burrows-Wheeler Transform
The Burrows-Wheeler Transform (BWT) transforms a sequence of characters so that the most commonly repeating characters are adjacent to each other, making the application of RLE more e�ective. That would be easy enough in itself � just sort the input � but, marvellously enough, BWT is also reversible.
This transformation is used in the bzip compression format, widely used on UNIX systems.
8 / 25
Burrows-Wheeler Transform
Take some input, e.g. the word billing. To optimise this for RLE we want multiple occurrences of the same character to be next to each other. Put special characters at the start and end of the input, e.g. ^billing$.
Form the set of rotations of the input. We rotate a word by moving one character from the beginning to end.
9 / 25
Burrows-Wheeler Transform
^billing$
billing$^
illing$^b
lling$^bi
ling$^bil
ing$^bill
ng$^billi
g$^billin
$^billing
Then sort this table in lexicographic order, where x < ^ < $.
10 / 25
Burrows-Wheeler Transform
billing$^
g$^billin
illing$^b
ing$^bill
ling$^bil
lling$^bi
ng$^billi
^billing$
$^billing
Then the last column of the table is the BWT of the input: ^nbllii$g.
11 / 25
Burrows-Wheeler Transform
To take the inverse BWT, i.e. to recover the original, we can insert the BWT as one column of a table, then sort the rows of the table, add another column, sort, and so on. When we have added all columns and sorted for the last time, the row which ends with $ is the original input. Try it yourself with a short string.
As well as being used with �regular� text �les, BWT is often applied to large amounts of data in bioinformatics � e.g. sequences of millions of characters or bits representing DNA.
12 / 25
Hu�man coding
In textual data, each character typically takes 8 (ASCII) or 16 (UTF-16) bits. Clearly, we should be able to save some space just by encoding text in a binary format.
Hu�man coding is an elegant compression algorithm that combines ideas from RLE (i.e. it exploits frequency information) with binary encoding to save space. Hu�man coding was invented by David Hu�man in 1952.
13 / 25
Hu�man coding
The idea is to generate a binary sequence that represents each character required. This might be the English alphabet, some subset of that, or any collection of symbols. Say we have a 100KB �le made up of repetitions of the letters a to f.
We start by creating a frequency table:
a b c d e f
Frequency (thousands) 45 13 12 16 9 5
14 / 25
Hu�man coding
If we use a �xed-length code we can encode this data in about 37.5KB. If we use a variable-length code and assign the shortest code to the most frequently used characters, we can encode it in just 28KB.
a b c d e f
Frequency (thousands) 45 13 12 16 9 5 Code (�xed-length) 000 001 010 011 100 101 Code (variable-length) 0 101 100 111 1101 1100
15 / 25
Hu�man coding
To send this data over the wire or store it in a �le, we need to send/store the mapping from code to characters followed by the concatenation of all binary codes as they appear in the unencoded document.
Any sequence of codes must be unambiguous � we can only decode this at the other end if no code is a pre�x of any other. If we had used 1 for c and 11 for d, how would we decode 111?
a b c d e f
Code (variable-length) 0 101 100 111 1101 1100
16 / 25
Hu�man coding
To send this data over the wire or store it in a �le, we need to send/store the mapping from code to characters followed by the concatenation of all binary codes as they appear in the unencoded document.
Any sequence of codes must be unambiguous � we can only decode this at the other end if no code is a pre�x of any other. If we had used 1 for c and 11 for d, how would we decode 111?
a b c d e f
Code (variable-length) 0 101 100 111 1101 1100
16 / 25
Hu�man trees
We can create these variable-length codes using a binary tree (not a search tree). In a Hu�man Tree the leaves contain the data, a character and its frequency. Internal nodes are labelled with the combined frequencies of their children.
100
55
25
a: 45
d: 16c: 12
0 1
0
b: 13
1
0
30
14
f: 5 e: 9
0
0
1
1
1
17 / 25
Hu�man trees
To decode data we go start at the root and go left for 0, and right for 1 until we get to a leaf. So, to decode 0101100, we start at the root, read an a, start at the root again, and so on to read abc. An optimal Hu�man tree is full.
100
55
25
a: 45
d: 16c: 12
0 1
0
b: 13
1
0
30
14
f: 5 e: 9
0
0
1
1
1
18 / 25
Creating the Hu�man coding
The most elegant part of this scheme is the algorithm used to create the tree:
1 Make a tree node object for each character, with an extra label for its frequency.
2 Put these nodes in a priority queue, where the lowest frequency has highest priority.
3 Repeatedly: 1 Remove two nodes from the queue and insert them as children
to a new node. The char label of the new node is blank and
the frequency label is the sum of the labels of its children. 2 Put the new node back in the queue. 3 When there is only one item in the queue, that's the Hu�man
tree.
19 / 25
Creating the Hu�man coding
a: 45d: 16c: 12 b: 13f: 5 e: 9
20 / 25
Creating the Hu�man coding
a: 45d: 16c: 12 b: 13
f: 5 e: 9
14
21 / 25
Creating the Hu�man coding
a: 45d: 16
f: 5 e: 9
14
c: 12 b: 13
25
22 / 25
Creating the Hu�man coding
a: 45
c: 12 b: 13
25
d: 16
f: 5 e: 9
14
30
23 / 25
Creating the Hu�man coding
a: 45
c: 12 b: 13
25
d: 16
f: 5 e: 9
14
30
55
24 / 25
Creating the Hu�man coding
100
55
25
a: 45
d: 16c: 12
0 1
0
b: 13
1
0
30
14
f: 5 e: 9
0
0
1
1
1
25 / 25
Transmitting the Hu�man coding
To use a Hu�man coding as part of a compressed �le format we begin the �le with a �magic number� identifying the format used and the length of the alphabet, A, used by the �le. We then store the table as an array in the next |A| bits, followed by one code after another until the EOF character.
This scheme is used by some of the most widely used compressed formats such as GIF and ZIP.
26 / 25
- Compression
- Run-length encoding
- Huffman coding
__MACOSX/Data structures assignment/._week5d.pdf
Data structures assignment/MemoryManagementSimulator.pdf
CI283 Operating Systems Memory Management Simulator
Purpose In this exercise, you will use a Memory Management Simulator. This guide explains how to use the simulator and describes the display and the various input files used and the output files produced by the simulator.
The memory management simulator illustrates page fault behaviour in a paged virtual memory system. The program reads the initial state of the page table and a sequence of virtual memory instructions and writes a trace log indicating the effect of each instruction. It includes a graphical user interface so that you can observe page replacement algorithms at work.
Download memory.zip form Studentcentral and unzip it.
Running the Simulator The program reads a command file, optionally reads a configuration file, displays a GUI window which allows you to execute the command file, and optionally writes a trace file.
To run the program, first compile all Java files and then enter the following command line.
$ java MemoryManagement commands memory.conf
The program will display a window allowing you to run the simulator. You will notice a row of command buttons across the top, two columns of "page" buttons on the left, and an informational display on the right.
Typically, you will use the step button to execute a command from the input file, examine information about any pages by clicking on a page button, and when you're done, quit the simulation using the exit button.
The buttons:
Button Description run runs the simulation to completion. Note that the simulation pauses and
updates the screen between each step.
step runs a single setup of the simulation and updates the display. reset initializes the simulator and starts from the beginning of the command file. exit exits the simulation. page n display information about this virtual page in the display area at the right.
The informational display:
Field Description status: RUN, STEP, or STOP. This indicates whether the current run or
step is completed. time: number of "ns" since the start of the simulation. instruction: READ or WRITE. The operation last performed. address: the virtual memory address of the operation last performed. page fault: whether the last operation caused a page fault to occur. virtual page: the number of the virtual page being displayed in the fields
below. This is the last virtual page accessed by the simulator, or the last page n button pressed.
physical page: the physical page for this virtual page, if any. -1 indicates that no physical page is associated with this virtual page.
R: whether this page has been read. (1=yes, 0=no) M: whether this page has been modified. (1=yes, 0=no) inMemTime: number of ns ago the physical page was allocated to this virtual
page. lastTouchTime: number of ns ago the physical page was last modified. low: low virtual memory address of the virtual page. high: high virtual memory address of the virtual page.
The Command File The command file for the simulator specifies a sequence of memory instructions to be performed. Each instruction is either a memory READ or WRITE operation, and includes a virtual memory address to be read or written. Depending on whether the virtual page for the address is present in physical memory, the operation will succeed, or, if not, a page fault will occur.
Operations on Virtual Memory
There are two operations one can carry out on pages in memory: READ and WRITE.
The format for each command is
operation address
or operation random
where operation is READ or WRITE, and address is the numeric virtual memory address, optionally preceded by one of the radix keywords bin, oct, or hex. If no radix is supplied, the number is assumed to be decimal. The keyword random will generate a random virtual memory address (for those who want to experiment quickly) rather than having to type an address.
For example, the sequence
READ bin 01010101 WRITE bin 10101010 READ random WRITE random
causes the virtual memory manager to:
1. read from virtual memory address 85 2. write to virtual memory address 170 3. read from some random virtual memory address 4. write to some random virtual memory address
Sample Command File
The "commands" input file looks like this:
// Enter READ/WRITE commands into this file // READ // WRITE READ bin 100 READ 19 WRITE hex CC32 READ bin 100000000000000 READ bin 100000000000000 WRITE bin 110000000000001 WRITE random
The Configuration File The configuration file memory.conf is used to specify the the initial content of the virtual memory map (which pages of virtual memory are mapped to which pages in physical memory) and provide other configuration information, such as whether operation should be logged to a file.
Setting Up the Virtual Memory Map
The memset command is used to initialize each entry in the virtual page map. memset is followed by six integer values:
1. The virtual page # to initialize
2. The physical page # associated with this virtual page (-1 if no page assigned) 3. If the page has been read from (R) (0=no, 1=yes) 4. If the page has been modified (M) (0=no, 1=yes) 5. The amount of time the page has been in memory (in ns) 6. The last time the page has been modified (in ns)
The first two parameters define the mapping between the virtual page and a physical page, if any. The last four parameters are values that might be used by a page replacement algorithm.
For example,
memset 34 23 0 0 0 0
specifies that virtual page 34 maps to physical page 23, and that the page has not been read or modified.
Note:
• Each physical page should be mapped to exactly one virtual page. • The number of virtual pages is fixed at 64 (0..63). • The number of physical pages cannot exceed 64 (0..63). • If a virtual page is not specified by any memset command, it is assumed that
the page is not mapped.
Other Configuration File Options
There are a number of other options which can be specified in the configuration file. These are summarized in the table below.
Keyword Values Description enable_logging true
false Whether logging of the operations should be enabled. If logging is enabled, then the program writes a one- line message for each READ or WRITE operation. By default, no logging is enabled. See also the log_file option.
log_file trace- file- name
The name of the file to which log messages should be written. If no filename is given, then log messages are written to stdout. This option has no effect if enable_logging is false or not specified.
pagesize n power p
The size of the page in bytes as a power of two. This can be given as a decimal number which is a power of two (1, 2, 4, 8, etc.) or as a power of two using the power keyword. The maximum page size is 67108864 or power 26. The default page size is power 26.
addressradix n The radix in which numerical values are displayed. The default radix is 2 (binary). You may prefer radix 8 (octal), 10 (decimal), or 16 (hexadecimal).
Sample Configuration File
The "memory.conf" configuration file looks like this:
// memset virt page # physical page # R (read from) M (modified) inMemTime (ns) lastTouchTime (ns) memset 0 0 0 0 0 0 memset 1 1 0 0 0 0 memset 2 2 0 0 0 0 memset 3 3 0 0 0 0 memset 4 4 0 0 0 0 memset 5 5 0 0 0 0 memset 6 6 0 0 0 0 memset 7 7 0 0 0 0 memset 8 8 0 0 0 0 memset 9 9 0 0 0 0 memset 10 10 0 0 0 0 memset 11 11 0 0 0 0 memset 12 12 0 0 0 0 memset 13 13 0 0 0 0 memset 14 14 0 0 0 0 memset 15 15 0 0 0 0 memset 16 16 0 0 0 0 memset 17 17 0 0 0 0 memset 18 18 0 0 0 0 memset 19 19 0 0 0 0 memset 20 20 0 0 0 0 memset 21 21 0 0 0 0 memset 22 22 0 0 0 0 memset 23 23 0 0 0 0 memset 24 24 0 0 0 0 memset 25 25 0 0 0 0 memset 26 26 0 0 0 0 memset 27 27 0 0 0 0 memset 28 28 0 0 0 0 memset 29 29 0 0 0 0 memset 30 30 0 0 0 0 memset 31 31 0 0 0 0 // enable_logging 'true' or 'false' // When true specify a log_file or leave blank for stdout enable_logging true // log_file // Where is the name of the file you want output // to be print to. log_file tracefile
// page size, defaults to 2^14 and cannot be greater than 2^26 // pagesize or <'power' num (base 2)> pagesize 16384 // addressradix sets the radix in which numerical values are displayed // 2 is the default value // addressradix addressradix 16 // numpages sets the number of pages (physical and virtual) // 64 is the default value // numpages must be at least 2 and no more than 64 // numpages numpages 64
The Output File The output file contains a log of the operations since the simulation started (or since the last reset). It lists the command that was attempted and what happened as a result. You can review this file after executing the simulation.
The output file contains one line per operation executed. The format of each line is:
command address ... status
where:
• command is READ or WRITE, • address is a number corresponding to a virtual memory address, and • status is okay or page fault.
Sample Output
The output "tracefile" looks something like this: READ 4 ... okay READ 13 ... okay WRITE 3acc32 ... okay READ 10000000 ... okay READ 10000000 ... okay WRITE c0001000 ... page fault WRITE 2aeea2ef ... okay
Suggested Exercises
1. Create a command file that maps any 8 pages of physical memory to the first 8 pages of virtual memory, and then reads from one virtual memory address on each of the 64 virtual pages. Step through the simulator one operation at a time and see if you can predict which virtual memory addresses cause page faults. What page replacement algorithm is being used?
__MACOSX/Data structures assignment/._MemoryManagementSimulator.pdf
Data structures assignment/.DS_Store
__MACOSX/Data structures assignment/._.DS_Store
Data structures assignment/week4d.pdf
CI583: Data Structures and Operating Systems
Recursion, and some recursive problems
1 / 22
Outline
1 Recursion
2 The Towers of Hanoi
2 / 22
Recursion
Recursion occurs when an algorithm (or method, or function), A, requires us to execute A again, repeatedly.
So that the execution of A does not diverge (run forever) we need to de�ne conditions under which the recursion will end.
We call this the base case: the conditions under which the recursion continues are called the recursive case.
Recursion is closely linked to mathematical induction.
3 / 22
Recursion
Any iterative algorithm can be expressed via recursion, and vice versa.
There are (Turing complete) programming languages which don't include any construct for looping, such as Haskell.
Sometimes an iterative solution is the clearest, but recursive code is often more concise and expresses the solution to the problem elegantly and directly.
Recursion and iteration are more general constructs than the algorithmic strategies we considered in the last lecture (greedy algorithms, dynamic programming, etc).
4 / 22
Recursion
We have seen recursion in OO style, when traversing tree structures:
1 class Node extends Tree {
2 void traverse () {
3 left.traverse ();
4 this.visit ();
5 right.traverse ();
6 }
7 }
8 class Leaf extends Tree {
9 void traverse () {
10 this.visit();
11 }
12 }
5 / 22
Recursion
We can also use it directly within a single method. A simple (naive, in fact) way to compute the nth triangular number (the nth triangular number is found by adding n to the (n − 1)th, so the sequence runs [1, 3, 6, 10 ...]):
1 int triangle(int n) {
2 if (n == 1) return 1;
3 else return n + triangle(n - 1);
4 }
The recursive call should normally be made with smaller input, so that we know the algorithm will terminate. In some recursive algorithms the input might grow before it shrinks but, obviously, shrink it must.
6 / 22
Recursion
The reason triangle can be considered naive is that it will use much more memory at runtime than an iterative solution. In fact, it could cause stack over�ow errors for even fairly small values of n. Look at what happens when we run it:
1 triangle (5)
2 if (5==1) 1 else 5 + triangle (5-1)
3 if (false) 1 else 5 + triangle (5-1)
4 5 + triangle (4)
5 5 + (if (4==1) 1 else 4 + triangle (4-1))
6 5 + (4 + (if (3==1) 1 else 3 + triangle (3-1)))
7 5 + (4 + (3 + (if (2==1) 1 else 2 + triangle (2-1))))
8 5 + (4 + (3 + (2 + (if (1==1) 1 else 1 + triangle (1-1)
))))
9 5 + (4 + (3 + (2 + 1)))
10 15
7 / 22
Recursion
The solution to this is to make triangle tail-recursive, meaning that the recursive call is the last thing the method does.
In this way, we don't need to keep intermediate values hanging around.
Fortunately, every non-tail-recursive algorithm can be rewritten to a tail-recursive version.
Unfortunately, the resulting code is often much less intuitive.
8 / 22
Recursion
In terms of the algorithmic strategies from the last lecture, recursion is particular important in divide-and-conquer strategies.
These are strategies in which we divide the problem into two parts to solved separately (or discard one of them).
An example would be Binary Search, though the algorithm we gave in week 1 was iterative.
We will look at two e�cient sorting algorithms that use a divide-and-conquer approach, MergeSort and QuickSort.
9 / 22
The Towers of Hanoi
The Towers of Hanoi is an ancient puzzle. Move the discs from the �rst peg to the last. You may only move one disc at a time, and you can never put a disc on top of a smaller one. Solving it manually, a pattern soon emerges � see the applet on studentcentral.
10 / 22
The Towers of Hanoi
We will call the set of all discs, in their right order, the stack. (Nothing to do with the data structure.)
4
3
2
1
A B C
11 / 22
The Towers of Hanoi
If you solve the problem manually you will �nd that intermediate, smaller substacks form during the process of solving the puzzle. The only way to transfer disc 4 to tower C is to move everything on top of it out of the way!
4 3
2
1
A B C
12 / 22
The Towers of Hanoi
A recursive solution to move n discs from a source tower, S, to a destination tower, D, via an intermediate tower, I:
1 Move the substack of n − 1 discs from S to I. 2 Move the largest disc from S to D.
3 Move the substack from I to D.
(Recursive de�nitions feel like cheating sometimes!) When we begin, n = 4, S = A, I = B and D = C.
13 / 22
The Towers of Hanoi
4
3
2
1
A B C
14 / 22
The Towers of Hanoi
4 3
2
1
A B C
15 / 22
The Towers of Hanoi
43
2
1
A B C
16 / 22
The Towers of Hanoi
4
3
2
1
A B C
17 / 22
The Towers of Hanoi
But we can't move the substack of discs [1,2,3] all at once. Still, it's easier than moving 4 discs...
In fact, we can move the three-disc substack from A to B in the same way as moving the entire stack but for di�erent values of S, I and D: move substack [1,2] to intermediate tower C, then 3 to destination tower B, then substack [1,2] from C to B.
The problem is getting smaller...
18 / 22
The Towers of Hanoi
How do we move the substack [1,2] from A to C?
This one is quite easy: move 1 to B, then 2 to C, then 1 to C.
This is the base case: note that we have said exactly what we will do without hand-waving like �move the substack...�
19 / 22
The Towers of Hanoi
A recursive solution in Java:
1 class TowersApp {
2 static int nDiscs = 3;
3
4 public static void main(String [] args) {
5 doTowers(nDiscs , 'A', 'B', 'C');
6 }
7 //...
8 }
20 / 22
The Towers of Hanoi
A recursive solution in Java:
1 class TowersApp {
2 //...
3 public static void doTowers(int n, char from , char
inter , char to) {
4 if (n==1) {
5 System.out.printf("Disc 1 from %c to %c\n", from
, to);
6 } else {
7 doTowers(n-1, from , to, inter);//swap from and
inter
8 System.out.printf("Disc %d from %c to %c\n", n,
from , to);
9 doTowers(n-1, inter , from , to);//swap inter and
to
10 }
11 }
12 }
21 / 22
The Towers of Hanoi
It seems astonishing at �rst that a problem that seems like it should be quite complicated can be solved with so little code!
This is certainly a case of the recursive solution being elegant and concise.
We can of course produce an iterative solution to the Towers of Hanoi problem, but it is longer and less clear (to my mind) what is going on.
22 / 22
- Recursion
- The Towers of Hanoi
__MACOSX/Data structures assignment/._week4d.pdf
Data structures assignment/week9c.pdf
Physical media for external storage Optimisations
CI583: Data Structures and Operating Systems File systems: improvements over the early
systems
1 / 15
Physical media for external storage Optimisations
Outline
1 Physical media for external storage
2 Optimisations
2 / 15
Physical media for external storage Optimisations
Media
So far, we have been ignoring the issue of the actual media that data blocks are stored on.
We have been assuming that any block of data can be retrieved with the same cost (O(1)), as if we were accessing locations in a big array.
For a solid-state drive (SSD) this is more or less true.
3 / 15
Physical media for external storage Optimisations
Media
However, if our data is stored on a (mechanical) hard disk drive (HDD) this is very far from being the case. The fact that S5FS also makes this simplifying assumption is one of the reasons for its poor performance.
Image copyright http://www.file-recovery.com/
4 / 15
Physical media for external storage Optimisations
Disk architecture
A typical disk drive consists of several platters each of which has one or two recording surfaces.
Image copyright http://www.file-recovery.com/
5 / 15
Physical media for external storage Optimisations
Disk architecture
Each surface is divided into a number of concentric tracks, and each track is divided into a number of sectors. All the sectors are the same size, and outer tracks have more sectors.
6 / 15
Physical media for external storage Optimisations
Disk architecture
Data is read and written using a set of read/write heads, one per surface. The heads are connected to arms that move together across the surfaces. Only one head is active at any one time. The set of tracks that is under the heads at any one time is called a cylinder.
7 / 15
Physical media for external storage Optimisations
Disk architecture
It should be clear that the idea of the cylinder is an important one – if we take steps to store data on separate surfaces but within the same cylinder we can switch from one head to another (which we can assume takes no time at all) without needing to spin the surfaces. To accommodate this, the position of the heads is offset from each other.
8 / 15
Physical media for external storage Optimisations
Disk architecture
In order to read or write to a location on disk, the disk controller does the following:
1 Move the heads over the correct cylinder. This is the seek time.
2 Rotate the platter until the desired sector is under the head. This is the rotational latency.
3 Rotate the platter so that the entire sector passes under the head, in order to read or write data. This is the transfer time.
9 / 15
Physical media for external storage Optimisations
Disk architecture
Rotational latency depends on the rate at which the disk spins (e.g., 10,000 RPM), and transfer time depends on the spin rate and the number of sectors per track.
Average seek time is usually the most important factor, in the low milliseconds in a typical drive – a very long time compared to the speed at which processors work.
An efficient file system has to take steps to reduce this.
10 / 15
Physical media for external storage Optimisations
Optimisations
With regard to storage media, the two key optimisations the designer of a file system can make are to reduce seek time and increase the amount of data transferred.
A straightforward way to reduce seek time is to use buffering.
11 / 15
Physical media for external storage Optimisations
Optimisations
Just as the OS buffers data from the file system (fetches more than it needs and stores recently accessed data in a cache), disk controllers use a pre-fetch buffer in which the whole of the most recently accessed sector is stored.
This will improve latency for reads but not writes.
12 / 15
Physical media for external storage Optimisations
Optimisations
With regard to storage media, the two key optimisations the designer of a file system can make are to reduce seek time and increase the amount of data transferred.
Other improvements (used in file systems such as Linux ext2):
1 Increased block size. Helpful, but complex data allocation strategies are need to avoid excessive fragmentation (see Doeppner, section 6.1).
2 Reduce seek time by data allocation strategies. Allocate the next block in a way that takes disk architecture into account – use as few cylinders as possible.
13 / 15
Physical media for external storage Optimisations
Optimisations
3 Reduce rotational latency by data allocation strategies. Allocate blocks so that they can be read without repositioning the heads.
This is not a simple as allocating the blocks sequentially – the OS reads data one block at a time. So, to access two blocks, the OS issues the first instruction then waits for an interrupt to say that the work is complete.
By the time the OS responds by asking for the next block, the disk has spun past the subsequent block and will need to complete an entire revolution before the heads are over it again. So, subsequent blocks are interleaved to give the OS time to ask for them just as they are approaching the heads.
14 / 15
Physical media for external storage Optimisations
Optimisations
4 Clustering. Allocate blocks in groups, rather than one by one. ext2 pre-allocates 8 blocks at a time, eventually giving them up if they are not used or space becomes short.
5 Aggressive caching. If memory is abundant, maintain a very large cache and pre-fetch entire files. Works well for reading. For writing, we need to periodically write the cached data back to disk and maintain a log of updates so that work is not lost in case of a crash.
15 / 15
- Physical media for external storage
- Optimisations
__MACOSX/Data structures assignment/._week9c.pdf
Data structures assignment/memory.zip
memory/.DS_Store
__MACOSX/memory/._.DS_Store
memory/commands
// Enter READ/WRITE commands into this file // READ <OPTIONAL number type: bin/hex/oct> <virtual memory address or random> // WRITE <OPTIONAL number type: bin/hex/oct> <virtual memory address or random> READ bin 100 READ 19 WRITE hex CC32 READ bin 100000000000000 READ bin 100000000000000 WRITE bin 110000000000001 WRITE random
memory/Common.java
memory/Common.java
public
class
Common
{
static
public
long
s2l
(
String
s
)
{
long
i
=
0
;
try
{
i
=
Long
.
parseLong
(
s
.
trim
());
}
catch
(
NumberFormatException
nfe
)
{
System
.
out
.
println
(
"NumberFormatException: "
+
nfe
.
getMessage
());
}
return
i
;
}
static
public
int
s2i
(
String
s
)
{
int
i
=
0
;
try
{
i
=
Integer
.
parseInt
(
s
.
trim
());
}
catch
(
NumberFormatException
nfe
)
{
System
.
out
.
println
(
"NumberFormatException: "
+
nfe
.
getMessage
());
}
return
i
;
}
static
public
byte
s2b
(
String
s
)
{
int
i
=
0
;
byte
b
=
0
;
try
{
i
=
Integer
.
parseInt
(
s
.
trim
());
}
catch
(
NumberFormatException
nfe
)
{
System
.
out
.
println
(
"NumberFormatException: "
+
nfe
.
getMessage
());
}
b
=
(
byte
)
i
;
return
b
;
}
public
static
long
randomLong
(
long
MAX
)
{
long
i
=
-
1
;
java
.
util
.
Random
generator
=
new
java
.
util
.
Random
(
System
.
currentTimeMillis
());
while
(
i
>
MAX
||
i
<
0
)
{
int
intOne
=
generator
.
nextInt
();
int
intTwo
=
generator
.
nextInt
();
i
=
(
long
)
intOne
+
intTwo
;
}
return
i
;
}
}
memory/ControlPanel.java
memory/ControlPanel.java
import
java
.
applet
.
*
;
import
java
.
awt
.
*
;
public
class
ControlPanel
extends
Frame
{
Kernel
kernel
;
Button
runButton
=
new
Button
(
"run"
);
Button
stepButton
=
new
Button
(
"step"
);
Button
resetButton
=
new
Button
(
"reset"
);
Button
exitButton
=
new
Button
(
"exit"
);
Button
b0
=
new
Button
(
"page "
+
(
0
));
Button
b1
=
new
Button
(
"page "
+
(
1
));
Button
b2
=
new
Button
(
"page "
+
(
2
));
Button
b3
=
new
Button
(
"page "
+
(
3
));
Button
b4
=
new
Button
(
"page "
+
(
4
));
Button
b5
=
new
Button
(
"page "
+
(
5
));
Button
b6
=
new
Button
(
"page "
+
(
6
));
Button
b7
=
new
Button
(
"page "
+
(
7
));
Button
b8
=
new
Button
(
"page "
+
(
8
));
Button
b9
=
new
Button
(
"page "
+
(
9
));
Button
b10
=
new
Button
(
"page "
+
(
10
));
Button
b11
=
new
Button
(
"page "
+
(
11
));
Button
b12
=
new
Button
(
"page "
+
(
12
));
Button
b13
=
new
Button
(
"page "
+
(
13
));
Button
b14
=
new
Button
(
"page "
+
(
14
));
Button
b15
=
new
Button
(
"page "
+
(
15
));
Button
b16
=
new
Button
(
"page "
+
(
16
));
Button
b17
=
new
Button
(
"page "
+
(
17
));
Button
b18
=
new
Button
(
"page "
+
(
18
));
Button
b19
=
new
Button
(
"page "
+
(
19
));
Button
b20
=
new
Button
(
"page "
+
(
20
));
Button
b21
=
new
Button
(
"page "
+
(
21
));
Button
b22
=
new
Button
(
"page "
+
(
22
));
Button
b23
=
new
Button
(
"page "
+
(
23
));
Button
b24
=
new
Button
(
"page "
+
(
24
));
Button
b25
=
new
Button
(
"page "
+
(
25
));
Button
b26
=
new
Button
(
"page "
+
(
26
));
Button
b27
=
new
Button
(
"page "
+
(
27
));
Button
b28
=
new
Button
(
"page "
+
(
28
));
Button
b29
=
new
Button
(
"page "
+
(
29
));
Button
b30
=
new
Button
(
"page "
+
(
30
));
Button
b31
=
new
Button
(
"page "
+
(
31
));
Button
b32
=
new
Button
(
"page "
+
(
32
));
Button
b33
=
new
Button
(
"page "
+
(
33
));
Button
b34
=
new
Button
(
"page "
+
(
34
));
Button
b35
=
new
Button
(
"page "
+
(
35
));
Button
b36
=
new
Button
(
"page "
+
(
36
));
Button
b37
=
new
Button
(
"page "
+
(
37
));
Button
b38
=
new
Button
(
"page "
+
(
38
));
Button
b39
=
new
Button
(
"page "
+
(
39
));
Button
b40
=
new
Button
(
"page "
+
(
40
));
Button
b41
=
new
Button
(
"page "
+
(
41
));
Button
b42
=
new
Button
(
"page "
+
(
42
));
Button
b43
=
new
Button
(
"page "
+
(
43
));
Button
b44
=
new
Button
(
"page "
+
(
44
));
Button
b45
=
new
Button
(
"page "
+
(
45
));
Button
b46
=
new
Button
(
"page "
+
(
46
));
Button
b47
=
new
Button
(
"page "
+
(
47
));
Button
b48
=
new
Button
(
"page "
+
(
48
));
Button
b49
=
new
Button
(
"page "
+
(
49
));
Button
b50
=
new
Button
(
"page "
+
(
50
));
Button
b51
=
new
Button
(
"page "
+
(
51
));
Button
b52
=
new
Button
(
"page "
+
(
52
));
Button
b53
=
new
Button
(
"page "
+
(
53
));
Button
b54
=
new
Button
(
"page "
+
(
54
));
Button
b55
=
new
Button
(
"page "
+
(
55
));
Button
b56
=
new
Button
(
"page "
+
(
56
));
Button
b57
=
new
Button
(
"page "
+
(
57
));
Button
b58
=
new
Button
(
"page "
+
(
58
));
Button
b59
=
new
Button
(
"page "
+
(
59
));
Button
b60
=
new
Button
(
"page "
+
(
60
));
Button
b61
=
new
Button
(
"page "
+
(
61
));
Button
b62
=
new
Button
(
"page "
+
(
62
));
Button
b63
=
new
Button
(
"page "
+
(
63
));
Label
statusValueLabel
=
new
Label
(
"STOP"
,
Label
.
LEFT
)
;
Label
timeValueLabel
=
new
Label
(
"0"
,
Label
.
LEFT
)
;
Label
instructionValueLabel
=
new
Label
(
"NONE"
,
Label
.
LEFT
)
;
Label
addressValueLabel
=
new
Label
(
"NULL"
,
Label
.
LEFT
)
;
Label
pageFaultValueLabel
=
new
Label
(
"NO"
,
Label
.
LEFT
)
;
Label
virtualPageValueLabel
=
new
Label
(
"x"
,
Label
.
LEFT
)
;
Label
physicalPageValueLabel
=
new
Label
(
"0"
,
Label
.
LEFT
)
;
Label
RValueLabel
=
new
Label
(
"0"
,
Label
.
LEFT
)
;
Label
MValueLabel
=
new
Label
(
"0"
,
Label
.
LEFT
)
;
Label
inMemTimeValueLabel
=
new
Label
(
"0"
,
Label
.
LEFT
)
;
Label
lastTouchTimeValueLabel
=
new
Label
(
"0"
,
Label
.
LEFT
)
;
Label
lowValueLabel
=
new
Label
(
"0"
,
Label
.
LEFT
)
;
Label
highValueLabel
=
new
Label
(
"0"
,
Label
.
LEFT
)
;
Label
l0
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l1
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l2
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l3
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l4
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l5
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l6
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l7
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l8
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l9
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l10
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l11
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l12
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l13
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l14
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l15
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l16
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l17
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l18
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l19
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l20
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l21
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l22
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l23
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l24
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l25
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l26
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l27
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l28
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l29
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l30
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l31
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l32
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l33
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l34
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l35
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l36
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l37
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l38
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l39
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l40
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l41
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l42
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l43
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l44
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l45
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l46
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l47
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l48
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l49
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l50
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l51
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l52
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l53
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l54
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l55
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l56
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l57
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l58
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l59
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l60
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l61
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l62
=
new
Label
(
null
,
Label
.
CENTER
);
Label
l63
=
new
Label
(
null
,
Label
.
CENTER
);
public
ControlPanel
()
{
super
();
}
public
ControlPanel
(
String
title
)
{
super
(
title
);
}
public
void
init
(
Kernel
useKernel
,
String
commands
,
String
config
)
{
kernel
=
useKernel
;
kernel
.
setControlPanel
(
this
);
setLayout
(
null
);
setBackground
(
Color
.
white
);
setForeground
(
Color
.
black
);
resize
(
635
,
545
);
setFont
(
new
Font
(
"Courier"
,
0
,
12
)
);
runButton
.
setForeground
(
Color
.
blue
);
runButton
.
setBackground
(
Color
.
lightGray
);
runButton
.
reshape
(
0
,
25
,
70
,
15
);
add
(
runButton
);
stepButton
.
setForeground
(
Color
.
blue
);
stepButton
.
setBackground
(
Color
.
lightGray
);
stepButton
.
reshape
(
70
,
25
,
70
,
15
);
add
(
stepButton
);
resetButton
.
setForeground
(
Color
.
blue
);
resetButton
.
setBackground
(
Color
.
lightGray
);
resetButton
.
reshape
(
140
,
25
,
70
,
15
);
add
(
resetButton
);
exitButton
.
setForeground
(
Color
.
blue
);
exitButton
.
setBackground
(
Color
.
lightGray
);
exitButton
.
reshape
(
210
,
25
,
70
,
15
);
add
(
exitButton
);
b0
.
reshape
(
0
,
(
0
+
2
)
*
15
+
25
,
70
,
15
);
b0
.
setForeground
(
Color
.
magenta
);
b0
.
setBackground
(
Color
.
lightGray
);
add
(
b0
);
b1
.
reshape
(
0
,
(
1
+
2
)
*
15
+
25
,
70
,
15
);
b1
.
setForeground
(
Color
.
magenta
);
b1
.
setBackground
(
Color
.
lightGray
);
add
(
b1
);
b2
.
reshape
(
0
,
(
2
+
2
)
*
15
+
25
,
70
,
15
);
b2
.
setForeground
(
Color
.
magenta
);
b2
.
setBackground
(
Color
.
lightGray
);
add
(
b2
);
b3
.
reshape
(
0
,
(
3
+
2
)
*
15
+
25
,
70
,
15
);
b3
.
setForeground
(
Color
.
magenta
);
b3
.
setBackground
(
Color
.
lightGray
);
add
(
b3
);
b4
.
reshape
(
0
,
(
4
+
2
)
*
15
+
25
,
70
,
15
);
b4
.
setForeground
(
Color
.
magenta
);
b4
.
setBackground
(
Color
.
lightGray
);
add
(
b4
);
b5
.
reshape
(
0
,
(
5
+
2
)
*
15
+
25
,
70
,
15
);
b5
.
setForeground
(
Color
.
magenta
);
b5
.
setBackground
(
Color
.
lightGray
);