lab03_capacity_planning.doc

Lab Assignment: Capacity Planning

Objectives:

· Calculate the space required for your tables

· This is just an estimation exercise. It need not be 100% accurate.

Procedure:

For each table in your ERD, you will calculate the expected amount of space that the table will take. Here's how.

Oracle stores data in blocks. For your databases, the database block size is 8192 bytes (from the db_block_size parameter in the init.ora file). However, not all 8192 bytes of each block is used to store data. Instead, there is a some internal overhead at the beginning of each block. The remaining space after the overhead is used to store a row or rows in the block.

In order to determine how much space each table will take, we need to calcuate how much space is available for data in each block. Then, we determine the average row size, and from that, how many rows can fit in each block. Using the number of rows per block and the number of rows of data we will have, we can determine how many blocks will be necessary to store all the data.

First, calculate the amount of overhead in each block.

overhead = fixed header + variable transaction header + table directory + row directory

fixed header = 57 bytes

transaction header = 23 * the inittrans value for the table

We'll use inittrans = 1, so the transaction header = 23

table directory = 4 for non-cluster databases

row directory = 2 * R where R = number of rows in the block

We don't know the number of rows in the block yet

So far, we get overhead = 57 + 23 + 4 + 2*R = 84 + 2*R

The space remaining in the block after overhead is therefore:

8192 - (84 + 2*R) = 8108 - 2*R

Some of the remaining space will remain unused, based on the pctfree value, while the rest will be used for data. That is, of the non-overhead space in the block, a percentage of that (pctfree) will remain unused in order to keep some space available for future updates that might increase the size of a row. If pctfree is 10% then the space used for data is:

data space = available space - (available space * pctfree)

data space = (8108 - 2*R) - ((8108 - 2*R) * 10/100)

data space = (8108 - 2*R) - (810.8 - .2*R)

data space = 8108 - 2*R - 810.8 + .2*R

data space = 7297.2 - 1.8*R

Next, we need to figure out the average size of a row. The row size consists of a 3 byte row header, plus the sum of the sizes of each column. The column sizes depend on the data type. Specifically:

char = 1 byte per character

date = 7 bytes

varchar2 = estimate the average size based on the data

number = estimate the average value of a number in the column, then the number of bytes to hold that value is log(avg_value) / log(2) / 8 rounded up to the nearest integer

Here's an example. Let's say a row in our table contains these columns:

employee_id number(38)

name varchar(100)

dept_code char(4)

Assume that employee_ids tend to be 7 digits. So the average value of an employee_id might be 5000000. log(5000000) / log(2) / 8 = 2.78, rounded up gives us 3 bytes on average for the employee_id

For the name, let's assume 40 bytes on average

And the dept_code is 4 bytes always

In total, the size of an average row is 3 + 3 + 40 + 4 = 50 bytes

Now that we know the overhead size and the size of an average row, we need to calculate the number of rows per block. This will be the available data space divided by the average row size.

R = (7297.2 - 1.8*R) / 50

R = 145.94 - .036R

1.036R = 145.94

R = 140.87

Round up, and we get 141 rows per 8192 byte block

If the table contains 1000 rows, then it will take:

blocks = ceiling(1000 / 141) = ceiling(7.09) = 8

It will take 8 blocks, or 65536 bytes of storage to store the entire table.