Computer Architecture Reflective Journal Report

Kent Institute Australia Pty. Ltd.

ABN 49 003 577 302 CRICOS Code: 00161E
RTO Code: 90458 TEQSA Provider Number: PRV12051

CARC103 – Computer Architecture

*

*

Prescribed Text

Burd, S. D. (2017), Systems Architecture, 7th ed, Cengage Learning

*

*

Systems Architecture,
Seventh Edition

Chapter 12

Secondary Storage Management

*

*

Systems Architecture, Seventh Edition

Chapter Objectives

  • In this chapter, you will learn to:
  • Describe the components and functions of a file management system
  • Compare the logical and physical organization of files and directories
  • Explain how secondary storage locations are allocated to files and describe the data structures used to record those allocations

*

Chapter Objectives (continued)

  • Describe file manipulation operations
  • List access controls that can be applied to files and directories
  • Describe file migration, backup, and recovery methods
  • Explain methods for ensuring fault tolerance
  • Compare storage consolidation methods, such as storage area networks, network-attached storage, and cloud-based storage services

*

FIGURE 12.1 Topics covered in this chapter

Courtesy of Course Technology/Cengage Learning

*

File Management Overview

  • Users and programs don’t interact directly with secondary storage blocks and devices
  • They interact with files – groups of storage blocks that hold programs and data
  • File management functions such as creation, copying, reading, and writing are implemented within the service layer (a.k.a. file control layer)
  • Application programs interact directly with file management service layer routines
  • Users interact with the command layer which, in turn, uses service layer routines to perform file management tasks

FIGURE 12.2 File management system layers

Courtesy of Course Technology/Cengage Learning

*

Logical and Physical Storage Views

  • Secondary storage devices hold bits, but access those bits in larger units generically called blocks (for example, one sector of a magnetic hard disk)
  • Blocks are too small for users and programs to interact with directly
  • Users and programs need larger and more useful logical abstractions, including:
  • File – a group of blocks that holds one related set of data or instructions
  • Folder (directory) – a container for a related set of files
  • Volume – a container for a hierarchically organized set of folders and files
  • For example, C: on Windows

FIGURE 12.3 Logical and physical secondary storage views for a typical laptop computer

*

File Content and Type

  • Most FMSs directly support only a few file types:
  • Executable programs
  • Operating system commands
  • Textual or unformatted binary data
  • Other file types are indirectly supported by file association: a defined relationship between file types and the programs or OS utilities that manipulate them

*

Volumes, Files, and Folders in Windows Explorer

*

Volumes

  • A single disk can be divided into multiple volumes or a volume can span multiple disks
  • A volume is an independent FMS management unit for purposes of:
  • Drive letter (Windows) or mount point (UNIX/Linux) assignment
  • Error checking, backup/recovery, defragmentation, and recycling
  • Root directory access controls
  • Quotas, versioning, journaling, and other optional FMS features

*

File Content and Organization

  • Files can contain many different kinds of data:
  • Machine or OS instructions (for example, a Windows .exe file or Linux shell script)
  • Textual data (for example, as created by Notepad)
  • Image data (for example, as captured by a camera)
  • Formatted documents (for example, a Web page)
  • Variations in file content lead to variations in internal file organization, for example:
  • Files containing machine language programs are organized to simplify loading the instructions into memory
  • Files containing web pages are organized to simplify transmission across a network and Web browser display
  • Image files are organized to simplify display and editing

*

File Type and Association

  • Operating systems generally support two file-related concepts:
  • File type – A limited set of file types that the OS keeps track of and uses to enable or restrict certain functions, for example:
  • Types – executable instructions, OS commands, other
  • Enable/restrict:
  • Executable files can be loaded into memory and executed by the CPU but can’t be printed or read/written by programs
  • OS command files (scripts) can be executed by an OS utility and can be printed and read/written by programs
  • Other files can’t be executed but can be printed and read/written by programs
  • File type association – Specific file types are associated with specific programs that perform specific actions on those files, for example:
  • Ordinary text file – Notepad used to print/edit
  • JPEG image file – Adobe Illustrator used to print/edit
  • HTML file – Edited by Word and displayed/printed by Internet Explorer
  • Windows provides an extensible set of file types
  • Filename extension determines file type, for example
  • Document.txt is a text file associated with Notepad
  • Document.doc is a document file associated with Microsoft Word
  • Double-clicking a file opens it within the program associated with that file’s type
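
The extension-based lookup described above can be sketched in a few lines. This is only an illustration, not how Windows stores associations: the `ASSOCIATIONS` table and the `program_for` function are hypothetical names, and the two entries are the examples from this slide.

```python
import os

# Hypothetical association table: filename extension -> default program.
ASSOCIATIONS = {
    ".txt": "Notepad",
    ".doc": "Microsoft Word",
}

def program_for(filename):
    """Return the program associated with the file's extension, or None."""
    _, extension = os.path.splitext(filename)
    # Extensions are compared case-insensitively in this sketch.
    return ASSOCIATIONS.get(extension.lower())
```

A call such as `program_for("Document.txt")` looks up `.txt` and returns the associated program name; an unregistered extension returns `None`.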

*

File Type and Association - Exercise

  • Perform the following tasks on a computer running Windows:
  • From the desktop, click Start and select Control Panel
  • Click Programs
  • Click Default Programs
  • Click Associate a file type or protocol with a specific program
  • View the list of file types and their default program associations
  • Select .txt and click Change Program
  • View the list of programs that could be assigned to .txt files (click Cancel when you’re finished)
  • Close all of the open control panel windows
  • Open a folder containing some of your own files, preferably of several different types – note the displayed file names
  • Click Organize in the upper left menu bar
  • Click Folder and search options
  • Click the View tab
  • Uncheck the box labeled Hide extensions for known file types
  • Click OK
  • Reexamine the displayed file names and note that all files with the same extension have the same icon displayed to the left of the filename.

*

FIGURE 12.4 Registered Windows file types and associated programs

Courtesy of Course Technology/Cengage Learning

*

FIGURE 12.5 Context menus and commands for a Windows file type

Courtesy of Course Technology/Cengage Learning

*

Folder Content and Structure

  • Contain information about files and other folders, typically:
  • Name
  • File type
  • Location
  • Size
  • Ownership
  • Access controls
  • Time stamps

*

FIGURE 12.7 Windows folder listing with additional details

Courtesy of Course Technology/Cengage Learning

*

Hierarchical Folder Structure

  • In most OSs, folders can contain other folders, but no folder can be contained within more than one folder
  • Result is a hierarchical (or tree) folder structure as shown in the left frame of the next slide
  • Terminology (refer to next slide for examples)
  • Current (working) directory – folder currently displayed or the folder location within which a program looks for files
  • For example, Chapter08 in the figure
  • Home folder – default current folder for a user or program
  • For example, C:\Users\Burd\My Documents for the user Burd under Windows
  • Complete path or fully qualified reference – filename plus all folder names upward through the volume root
  • For example, T:\Systems Architecture\6e\Chapters\Chapter08\Solutions_08_Au.doc
  • Relative path – file name plus all folder names upward to the current (working) folder
  • For example, if the current folder is 6e then .\Chapters\Chapter08\Solutions_08_Au.doc is a relative path to Solutions_08_Au.doc where . is a shorthand for the current folder
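
The relationship between a complete path and a relative path can be checked with Python's standard `pathlib` module. This is a small sketch using the example path from this slide; the variable names are illustrative only.

```python
from pathlib import PureWindowsPath

# Complete (fully qualified) path from the slide's example.
full = PureWindowsPath(
    r"T:\Systems Architecture\6e\Chapters\Chapter08\Solutions_08_Au.doc"
)
# Assume the current (working) folder is 6e.
current = PureWindowsPath(r"T:\Systems Architecture\6e")

# The relative path is what remains after removing the current folder.
relative = full.relative_to(current)

# Joining the current folder and the relative path gives back the full path.
rebuilt = current / relative
```

`PureWindowsPath` only manipulates path text, so the sketch runs on any operating system and never touches a real T: volume.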

*

Current (working) directory

FIGURE 12.8 A hierarchical directory structure

Courtesy of Course Technology/Cengage Learning

*

Graph Folder Structure

  • Graph folder structures allow a folder to be contained within multiple other folders
  • That introduces the possibilities of loops, which complicate operations such as error checking and enumerating volume contents
  • Many OSs take an “in-between” approach based on links or shortcuts (see dotted line in figure)

FIGURE 12.9 A graph folder structure

Courtesy of Course Technology/Cengage Learning

*

Storage Allocation Overview

  • The user doesn’t “see” hard disk drives, SSDs, and other secondary storage devices
  • The user does see volumes, folders, and files managed by the operating system
  • Storage allocation is the process of managing the relationship(s) between:
  • Storage objects the user sees (a logical view), and
  • Storage devices/locations in hardware (a physical view)

*

Allocation Units

  • Recall from chapter 5 that most disk drives (and devices that simulate them) move data back and forth to memory in 512- or 4096-byte chunks called sectors
  • An operating system manages chunks of secondary storage called allocation units, which may be the same size as sectors or a multiple thereof
  • Allocation units are assigned (or allocated) to logical storage objects such as volumes, directories, and files
  • Allocation unit size is determined when a volume is initialized – size determines:
  • How efficiently storage space is allocated and managed
  • The size of data structures (tables) that track allocation units

*

Allocation Units (continued)

  • Allocation unit size trade-offs
  • Efficient use of secondary storage space for files
  • Size of storage allocation data structures
  • Efficiency of storage allocation procedures
  • Smaller units: more efficient use of storage space
  • Larger units: allow for smaller storage allocation data structures

*

Storage Allocation Tables

*

Storage Allocation Tables - Continued

  • Small allocation unit size (and large storage allocation tables) wastes less storage space when there are many small files (smaller than the allocation unit size)
  • For example, storing a 1 byte file wastes:
  • 511 bytes if allocation unit size is 512 bytes
  • 4095 bytes if allocation unit size is 4096 bytes
  • Larger allocation unit size (and smaller storage allocation tables) speeds tasks that need to search or update the table
  • For example, calculating total free space or copying a large file
  • The proper trade-off between these two effects depends on many factors, including average file size, volume size, disk I/O speed, and available RAM for buffering and caching
  • Allocation unit size can be chosen by the system administrator or automatically determined by the OS when a volume is created
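
The wasted-space arithmetic above can be reproduced with a short calculation. This is a sketch under the assumption that a file always occupies a whole number of allocation units; the function names are illustrative.

```python
def allocated_units(file_size, unit_size):
    """Number of allocation units needed to hold file_size bytes."""
    # Ceiling division: a partly filled unit still counts as a whole unit.
    return -(-file_size // unit_size)

def wasted_bytes(file_size, unit_size):
    """Bytes allocated to the file but not used by its content."""
    return allocated_units(file_size, unit_size) * unit_size - file_size

def table_entries(volume_size, unit_size):
    """Entries a storage allocation table needs, one per allocation unit."""
    return volume_size // unit_size
```

For a 1-byte file, `wasted_bytes(1, 512)` and `wasted_bytes(1, 4096)` give the 511 and 4095 bytes quoted on the slide, while `table_entries` shows the other side of the trade-off: the larger unit size needs one eighth as many table entries for the same volume.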

*

FIGURE 12.8 Storage blocks allocated to three files

Courtesy of Course Technology/Cengage Learning

*

Free allocation units are assigned to a hidden system file called SysFree.

TABLE 12.1 Directory content for the files in Figure 12.8

*

FIGURE 12.9 A storage allocation table matching the storage allocations in Figure 12.8

Courtesy of Course Technology/Cengage Learning

*

Blocking

  • A logical record is the unit of data read from or written to a file by an application program (e.g., all of the data about one customer)
  • A physical record is the unit of data read from or written to a storage device in a single operation
  • Comparing the relative size of logical and physical records, there are three possible relationships:
  • Logical record size = physical record size - unblocked
  • Logical record size < physical record size
  • Logical record size > physical record size

*

Blocking - Continued

If logical and physical records are of different sizes then the term blocking factor describes the number of logical records per physical record

Burd, Systems Architecture, 7th edition, Figure 12.12 Copyright © 2016 Cengage
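
The blocking factor can be computed directly from the two record sizes. This is a minimal sketch that assumes both sizes are fixed and given in bytes; `blocking_factor` is an illustrative name, and an exact fraction is used so that ratios below one stay readable.

```python
from fractions import Fraction

def blocking_factor(physical_record_size, logical_record_size):
    """Logical records per physical record, as an exact ratio."""
    return Fraction(physical_record_size, logical_record_size)
```

With a 4096-byte physical record, 512-byte logical records give a factor of 8 (eight logical records per physical record). A 1024-byte logical record on 512-byte physical records gives 1/2, meaning each logical record spans two physical records.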

*

Blocking and Buffering

  • The FMS or OS uses buffers to support data transfer between secondary storage and an application program
  • Each buffer is the same size as a physical record – 512 or 4096 bytes for most disk drives
  • Multiple buffers can be used to improve performance
  • A physical read operation moves one physical record from a secondary storage device to a buffer
  • A logical read operation moves one logical record from one or more buffers to the application program

FIGURE 12.11 Input from secondary storage to an application program using a buffer

Courtesy of Course Technology/Cengage Learning
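
The difference between physical and logical reads can be sketched with a small reader class. This is an illustration only: `BufferedRecordReader` is a hypothetical name, the "device" is any binary file-like object standing in for a disk, and a single growing byte string stands in for the one or more fixed-size buffers a real FMS would manage.

```python
import io

class BufferedRecordReader:
    """Serve logical records from data fetched one physical record at a time."""

    def __init__(self, device, physical_size, logical_size):
        self.device = device              # stand-in for a storage device
        self.physical_size = physical_size
        self.logical_size = logical_size
        self.buffer = b""                 # stand-in for the FMS buffer(s)
        self.physical_reads = 0

    def read_logical(self):
        # Physical reads: refill until a whole logical record is buffered.
        while len(self.buffer) < self.logical_size:
            block = self.device.read(self.physical_size)
            if not block:
                break                     # end of file
            self.physical_reads += 1
            self.buffer += block
        # Logical read: hand one logical record to the application.
        record = self.buffer[:self.logical_size]
        self.buffer = self.buffer[self.logical_size:]
        return record
```

With 512-byte physical records and 128-byte logical records, four consecutive calls to `read_logical` are served by a single physical read; only the fifth call triggers another one.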

*

File Manipulation

  • Exact set of service layer functions varies among FMSs, but typically includes create, copy, move, delete, read, and write
  • Application programs interact directly with FMS through OS service layer
  • Users interact indirectly with FMS through command layer

*

File Open and Close Operations

  • The OS service layer contains utility functions to open and close files:
  • File open operation

Locate the file in the directory structure and read its directory entry

Search an internal table of open files to see whether the file is already open

Ensure that the process has enough privileges to access the file

Allocate one or more buffers

Update an internal table of open files

  • File close operation

Flush the program’s file I/O buffers to secondary storage

Deallocate buffer memory

Update the file’s directory entry time stamps

Update the open file table
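
The open and close steps listed above can be traced in a small model. This is a sketch under several assumptions that are not from the slides: the `FileManager` class and its field names are hypothetical, a second open of an already open file is simply rejected, a set of allowed users stands in for real privilege checks, and a counter stands in for time stamps.

```python
class FileManager:
    def __init__(self, directory):
        # directory: name -> {"allowed_users": set of user IDs, "modified": int}
        self.directory = directory
        self.open_files = {}   # internal table of open files
        self.clock = 0         # stand-in for a real time stamp source

    def open(self, name, user, buffer_count=1, buffer_size=512):
        entry = self.directory[name]              # locate and read directory entry
        if name in self.open_files:               # is the file already open?
            raise RuntimeError(name + " is already open")
        if user not in entry["allowed_users"]:    # enough privileges?
            raise PermissionError(user + " may not open " + name)
        buffers = [bytearray(buffer_size) for _ in range(buffer_count)]
        self.open_files[name] = {"user": user, "buffers": buffers}

    def close(self, name, flush):
        state = self.open_files[name]
        for buffer in state["buffers"]:
            flush(buffer)                         # flush buffers to storage
        state["buffers"].clear()                  # deallocate buffer memory
        self.clock += 1
        self.directory[name]["modified"] = self.clock   # update time stamp
        del self.open_files[name]                 # update the open file table
```

The `flush` argument is any callable that writes a buffer to secondary storage; passing `list.append` of an empty list is enough to observe what would be written.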

*

File Delete and Undelete Operations

  • There are multiple ways for an OS/FMS to perform a file delete operation – the most common are:
  • Move the file’s directory entry to a directory of deleted files (e.g., a recycle bin)
  • A file undelete operation simply moves the directory information back to its original location
  • Mark the file as deleted in its directory entry and mark its allocation units as free in the storage allocation table
  • File undelete is possible if neither the directory entry nor the storage allocation table has been overwritten with new data
  • Overwrite the data content of the file’s allocation units with zeros, mark the allocation units as free in the storage allocation table, and overwrite the file’s directory entry with zeros
  • File undelete isn’t possible via the OS/FMS, though some advanced forensic techniques can examine individual disk bits and determine earlier values of those bits
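
The second delete method (mark the entry as deleted and free its allocation units) can be sketched to show why undelete only works until the units are reused. The `Volume` class and its fields are hypothetical, and a list of booleans stands in for the storage allocation table.

```python
class Volume:
    def __init__(self, unit_count):
        self.free = [True] * unit_count   # allocation table: True means free
        self.directory = {}               # name -> {"units": [...], "deleted": bool}

    def create(self, name, unit_count):
        units = [i for i, is_free in enumerate(self.free) if is_free][:unit_count]
        if len(units) < unit_count:
            raise OSError("not enough free allocation units")
        for unit in units:
            self.free[unit] = False
        self.directory[name] = {"units": units, "deleted": False}

    def delete(self, name):
        entry = self.directory[name]
        entry["deleted"] = True           # mark the directory entry
        for unit in entry["units"]:
            self.free[unit] = True        # mark its allocation units as free

    def undelete(self, name):
        entry = self.directory[name]
        # Recovery fails if any unit has since been allocated to another file.
        if not all(self.free[unit] for unit in entry["units"]):
            return False
        for unit in entry["units"]:
            self.free[unit] = False
        entry["deleted"] = False
        return True
```

Undeleting immediately after a delete succeeds; once another file has been created in the freed units, `undelete` returns `False`.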

*

Access controls

Each file or folder includes an access control list (ACL) that specifies users or groups and what they’re allowed to do (sometimes described by the word privileges)

The ACL is checked against user ID when a file is opened and manipulated

FIGURE 12.14 Windows access control list content

Courtesy of Course Technology/Cengage Learning
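
An ACL check can be sketched as a lookup of the requesting user's ID and group memberships. This is a simplified illustration, not the Windows ACL format: the `is_allowed` function, the privilege names, and the sample entries are all hypothetical.

```python
# Hypothetical ACL for one file: user or group name -> set of privileges.
REPORT_ACL = {
    "burd": {"read", "write"},
    "students": {"read"},
}

def is_allowed(acl, principals, privilege):
    """True if any of the user's principals (user ID or groups) has the privilege."""
    return any(privilege in acl.get(principal, set()) for principal in principals)
```

A user who belongs only to the `students` group could read the file but not write it, because `"write"` appears only in the entry for `burd`.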

*

File Migration (Version Control)

  • Versioning - As file contents are updated creating new versions, older versions are archived
  • Older versions “migrate” from:
  • Fast local storage
  • Slower local storage
  • Remote storage such as cloud-based backups
  • Older versions can still be “seen” in folder listings and are copied back if needed
  • Balances storage cost of each file version with anticipated user demand for that version

*

File Backup

  • Protects against data loss (file content, folder content, and storage allocation tables)
  • Storing backup copies:
  • Different storage device within local computer
  • Removable storage device attached to local computer
  • A local network-attached backup device or server
  • A remote network-attached backup server

*

Archive Attributes and Timestamps

  • Each file/directory entry includes data used by backup utilities to determine if or when a file was backed up.
  • Windows uses a Boolean archive attribute, which is set if the file has been modified since the last backup.

*

Backup Types

  • Full backup
  • Copies all selected files and directories
  • Clears archive attributes
  • Incremental backup
  • Copies only files and directories created or changed since the last full or incremental backup (files with the archive attribute set)
  • Clears archive attributes
  • Differential backup
  • Copies only files and directories created or changed since the last full (normal) or incremental backup (files with the archive attribute set)
  • Doesn’t clear archive attributes
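
The effect of each backup type on the archive attribute can be simulated with a dictionary of flags. This is a sketch of the rules listed above, not a real backup utility; `run_backup` is an illustrative name and nothing is actually copied.

```python
def run_backup(files, kind):
    """Simulate a backup. files maps name -> archive attribute (True = modified).

    Returns the sorted list of names that would be copied.
    """
    if kind == "full":
        copied = sorted(files)                    # every selected file
    else:                                         # incremental or differential
        copied = sorted(name for name, flag in files.items() if flag)
    if kind in ("full", "incremental"):
        for name in copied:
            files[name] = False                   # clear archive attributes
    return copied
```

Because a differential backup leaves the attributes set, successive differential backups grow to include every file changed since the last full or incremental backup, whereas each incremental backup copies only the changes since the previous backup that cleared the attributes.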

*

Transaction Logging

  • Automatically records all changes to file content and attributes in a separate storage area
  • Also writes them to the file’s I/O buffer
  • Provides high degree of protection against data loss due to program or hardware failure
  • Imposes a performance penalty
  • Used only when costs of data loss are high

*

File Recovery

  • Automated and manual components
  • Can search backup logs for copies of lost or damaged files
  • Can perform consistency checking and repair procedures for crashed system or physically damaged storage device
  • All storage locations appear in the storage allocation table and other data structures.
  • All files have correct folder entries.
  • All storage locations of a file can be accessed through the storage allocation table.
  • All storage locations can be read and/or written.

*

Fault Tolerance

  • What is fault tolerance?
  • The term fault tolerance describes software, hardware, and operating procedure characteristics that ensure:
  • Minimal resource/service unavailability due to faults
  • Minimal data loss due to faults
  • Common fault causes:
  • Power interruption
  • Hardware failure (e.g., disk or network crash)
  • Operating system failure (BSOD - blue screen of death)
  • Service failure (e.g., web service crash, denial of service attack)

*

Fault Prevention and Mitigation

  • Most faults can’t be completely prevented, but their probability of occurrence can be reduced and their negative consequences can be minimized
  • Common fault prevention strategies:
  • Adequate security
  • Power conditioning
  • Reliability testing and configuration
  • General fault mitigation strategies
  • Hardware redundancy (e.g., RAID, backup servers)
  • Service redundancy (e.g., load balancing)
  • Resource redundancy (e.g., file replication and periodic backup to redundant and/or remote storage devices)

*

Mirroring

  • All disk write operations are made concurrently to two different storage devices
  • Provides high degree of protection against data loss with no performance penalty if implemented in hardware
  • Disadvantages
  • Cost of redundant disk drives
  • Higher cost of disk controllers that implement mirroring

*

RAID Overview

  • RAID stands for redundant array of inexpensive disks
  • RAID (most levels) achieves fault tolerance through various types and levels of redundant storage
  • Higher redundancy usually implies greater fault tolerance
  • RAID (most levels) achieves performance improvements through parallelism
  • There are multiple kinds of RAID (a.k.a. RAID levels)
  • Each level is a different combination of parallelism, redundancy amount, and redundancy type
  • “Standard” RAID levels include 0-5
  • RAID 2, 3, and 4 are antiques
  • RAID 10 is widely used but there is no “standard”

*

RAID 0 (Striped Volumes)

  • Striped Volume (RAID-0)
  • Volumes on multiple disks are combined into one volume
  • Data is systematically distributed across all disks
  • For example, a 1 MB file is spread equally across four disks in rotating 64 KB blocks resulting in 256 KB stored on each disk
  • There is some associated CPU overhead.
  • Pros - data access is faster IF all disks can be read/written in parallel (disk controller must support parallelism)
  • Cons - if any disk fails the entire volume is trashed
  • Alternatives:
  • One big VERY fast disk
  • RAID-5 or higher
  • Striping by itself is primarily used to increase read/write performance (e.g., a gaming desktop)
  • Higher RAID levels combine striping with redundancy
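
The striping example above (a 1 MB file in rotating 64 KB blocks across four disks) can be verified with a short calculation. This is a sketch assuming simple round-robin placement starting at the first disk; the function names are illustrative.

```python
def locate(byte_offset, disk_count=4, stripe_size=64 * 1024):
    """Map a byte offset in the volume to (disk index, offset on that disk)."""
    stripe_index = byte_offset // stripe_size
    disk = stripe_index % disk_count          # blocks rotate across the disks
    row = stripe_index // disk_count          # how many blocks deep on that disk
    return disk, row * stripe_size + byte_offset % stripe_size

def bytes_per_disk(file_size, disk_count=4, stripe_size=64 * 1024):
    """Total bytes of a file (starting at offset 0) stored on each disk."""
    totals = [0] * disk_count
    for start in range(0, file_size, stripe_size):
        disk, _ = locate(start, disk_count, stripe_size)
        totals[disk] += min(stripe_size, file_size - start)
    return totals
```

`bytes_per_disk(1024 * 1024)` places 262,144 bytes (256 KB) on each of the four disks, matching the slide. Consecutive 64 KB blocks land on different disks, which is what allows them to be read or written in parallel.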

*

FIGURE 12.16 Data striping across four disks

Courtesy of Course Technology/Cengage Learning

*

RAID 1 (Mirrored Volumes)

  • Mirrored Volumes (RAID-1)
  • Every volume has a duplicate on a different disk
  • All writes go to both disks in parallel
  • Reads come from either disk
  • Pros:
  • High fault tolerance
  • Cons:
  • Double cost or half capacity
  • Write performance is always slower but not by much if using a smart disk controller
  • As disks have become cheaper, mirroring (alone or in combination with striping) has become much more common

*

RAID-5

  • RAID-5 Volume
  • Volume spans multiple disks (at least three)
  • Data content is striped
  • Parity information is added to the striped data in such a way that the parity blocks alternate across the disks
  • If any drive fails, data can be recovered from the remaining drives
  • Lost parity information isn’t a problem (it’s redundant)
  • Lost data can be recomputed from remaining blocks
  • For example, assume 9 disks, 1 byte, even parity
  • 1 1 0 0 1 1 0 0 + parity bit (0)
  • A “lost” bit is zero if there are an even number of remaining 1 bits, one otherwise
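
The even-parity example above can be worked through with exclusive OR, which yields 0 when the number of 1 bits is even and 1 otherwise. This is a bit-level sketch of the idea only; real RAID-5 controllers apply the same operation to whole blocks, and the function names are illustrative.

```python
from functools import reduce
from operator import xor

def parity_bit(data_bits):
    """Even-parity bit: makes the total number of 1 bits even."""
    return reduce(xor, data_bits, 0)

def recover_lost_bit(stripe):
    """Recompute the single lost bit (marked None) from the surviving bits."""
    return reduce(xor, (bit for bit in stripe if bit is not None), 0)

data = [1, 1, 0, 0, 1, 1, 0, 0]          # one bit on each of 8 data disks
stripe = data + [parity_bit(data)]       # ninth disk holds the parity bit

# Simulate losing the first disk, then rebuild its bit.
damaged = [None] + stripe[1:]
rebuilt = recover_lost_bit(damaged)
```

With the first disk lost, three 1 bits remain (an odd number), so the rebuilt bit is 1, which matches the original value.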

*

RAID-5 ― Continued

  • Pros:
  • High fault tolerance - access continues in the event of any single disk failure
  • High read performance - same as striped volume (parity data is ignored when reading)
  • Cons:
  • Slower write performance than striped volume due to computation and storage of parity information
  • Storage capacity is less than striped - one disk worth of data is redundant parity information

*

RAID-5 Failure Recovery

  • If one disk dies the system continues to function but without protection against additional failures
  • To restore failure protection:
  • Install new disk
  • Format new disk
  • Recreate lost data on new disk by doing “parity math”
  • Recovery operation is slow because it’s both
  • CPU intensive
  • I/O intensive
  • System throughput for “real” work typically declines – by how much?

*

FIGURE 12.18 16 KB stored in a four-disk RAID 10 array

Courtesy of Course Technology/Cengage Learning

*

Storage Consolidation

  • Until at least the 1990s:
  • File systems were always integral parts of an operating system
  • Each operating system controlled one set of hardware, including secondary storage devices
  • The term direct-attached storage (DAS) describes any architecture in which software “talks” to secondary storage devices that are directly connected to the computer system that executes the software

*

Direct-Attached Storage Limitations

  • DAS made a lot of sense in the mainframe era
  • Most organizations owned one mainframe
  • All software ran on the mainframe
  • All data was stored on directly-attached disks
  • By the late 1980s it started to be a limitation:
  • Organizations use multiple computer systems
  • Could a file or database stored on disks attached to one computer system be accessed by software running on another computer system?
  • Could two computer systems share a pool of disks or a common file system or database stored on that pool of disks?
  • After multiple evolutionary steps – three main approaches were developed to “answer” the questions above:
  • Storage-area network
  • Network-attached storage
  • Cloud-based storage

*

Storage Area Network

  • A storage area network (SAN) consists of:
  • A “back-end” network connecting a family of servers
  • One server is a storage server and accepts I/O requests from the other servers via the network
  • A pool of secondary storage devices attached to a storage server and managed as one large set of allocation units (disk sectors)

FIGURE 12.19 A server cluster with a storage area network

Courtesy of Course Technology/Cengage Learning

*

Network-Attached Storage

  • Network-attached storage (NAS) uses:
  • A pool of secondary storage devices attached to a storage server
  • A storage server manages the storage pool as one or more file systems and responds to read/write requests from other servers
  • An ordinary WAN or LAN connects all servers

FIGURE 12.20 Network-attached storage

Courtesy of Course Technology/Cengage Learning

*

SAN and NAS Compared

  • Access types
  • SAN: Client servers perform direct disk I/O at the sector level via the storage server
  • NAS: Client servers transfer service layer file I/O requests to the storage server for processing
  • Performance
  • SAN: Supports high-performance disk I/O for a small number of servers in close geographic proximity
  • NAS: Supports lower-performance file I/O but can potentially support a larger number of servers over a wide geographic area
  • Cost
  • SAN: Relatively high – high-capacity back-end switching network
  • NAS: Relatively low – storage server is an “appliance” and existing network connections are used

*

Cloud-Based Storage

  • DAS, SAN, and NAS are holdovers from an era when files were assumed to be stored and accessed from a single device
  • Data was stored on a server and users accessed it via an application
  • Data stored on a personal computing device
  • In the modern world, users have access to many computing devices and they want to access their files from any of them
  • Examples of modern tools that enable user-centric storage:
  • Google Drive
  • Dropbox
  • Microsoft OneDrive

*

Cloud-Based Storage - Continued

  • Common features of cloud-based storage services:
  • Each user has a unique account and user ID
  • A user can associate their cloud storage user ID with one or more computing devices (e.g., their smartphone, tablet, and laptop)
  • User files stored on any associated device are automatically synchronized to and from the cloud and then to and from all other associated devices
  • A Web-based application provides a way for users to access files from devices that haven’t been associated with their user ID
  • The underlying technology is similar to caching
  • A server or server group is the main storage device
  • A volume or folder points to cloud-based storage
  • As files are accessed, they’re cached on the local storage of each computing device from which the user accesses them
  • Changes to cached copies are replicated back to the server(s)
  • When a user accesses a cached file copy on any device, it’s compared to the server copy and a refresh is performed if needed
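
The caching behavior described above can be sketched with a server that keeps a version number per file and devices that compare versions before using a cached copy. This is an illustrative model only, not how any particular service works: the `CloudServer` and `Device` classes and the version-number scheme are assumptions.

```python
class CloudServer:
    """Main storage: file name -> (version, content)."""

    def __init__(self):
        self.files = {}

    def put(self, name, content):
        version = self.files.get(name, (0, None))[0] + 1
        self.files[name] = (version, content)
        return version

class Device:
    """A computing device with a local cache of cloud files."""

    def __init__(self, server):
        self.server = server
        self.cache = {}        # file name -> (version, content)

    def read(self, name):
        server_version, server_content = self.server.files[name]
        cached = self.cache.get(name)
        # Compare the cached copy to the server copy; refresh if needed.
        if cached is None or cached[0] != server_version:
            self.cache[name] = (server_version, server_content)
        return self.cache[name][1]

    def write(self, name, content):
        # Changes to the cached copy are replicated back to the server.
        version = self.server.put(name, content)
        self.cache[name] = (version, content)
```

If a laptop writes a file and a phone later changes it, the laptop's cached copy is stale until its next `read`, which detects the newer server version and refreshes the cache.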

*

Summary

  • File management systems
  • Directory content and structure
  • Storage allocation
  • File manipulation
  • Access controls
  • File migration, backup, and recovery
  • Storage consolidation

*

kent.edu.au


*

*