Software Engineering - Fault Tree

CIS 2341 Lesson 5: Failure Mode and Effect Analysis

References

Failure Mode and Effect Analysis: FMEA from Theory to Execution, D.H. Stamatis

The FMEA Pocket Handbook, Kenneth W. Dailey

Softrel.com

Effects of Failures

If we only assured 99.9% quality in the US, the failure rate would result in the following effects:

Risk reduction via FMEA

The elimination, control or reduction of risk is a total commitment by the entire organization, and it is more often than not the responsibility of the engineering department. FMEA is a specific methodology to evaluate a system, design, process, or service for possible ways in which failures (problems, errors, risks, concerns) can occur.

Why conduct an FMEA? Benefits of executing this method:

Helps to define the most significant opportunity for achieving fundamental differentiation

Improves the quality and reliability and safety of a product or service

Helps to select alternatives with high reliability and high safety potential during early phases of the system development life cycle

Improves the company’s image and competitiveness

Helps to increase customer satisfaction

Helps to determine redundancy of the system

Helps to establish a forum for defect prevention

Helps to define the corrective action

Establishes a priority for design improvement actions

Lists potential failures and identifies the relative magnitude of their effects

Provides the basis for the test program during development and final validation of the system, design, process or service

What is a failure mode?

A failure mode is the effect by which a failure is observed in a system component. It is important that all possible or potential failure modes of a system be listed as this is the essential basis of the FMEA.

For new components, reference can be made to other components with similar functions and structures and tests performed on them

For commonly used components already in service, records on their performance, reported failures, and existing tests can be consulted

Complex components that can be broken down into elements can be analyzed qualitatively, treating each element as a system

Two common ways of classifying failure modes

1. By identifying general failure modes (such as premature operation, time out, failure during operation, failure to cease operation)

2. By listing as completely as possible all generic failure modes (such as erroneous input, loss of output, security issues, communications, etc.)

System evaluation

Evaluation of a system using the FMEA method

Standards methodology

Requirements

Development

Quality Assurance Evaluation

Evaluation Reports

Quality Assurance tracks corrective action

The process of FMEA

To conduct FMEA effectively, one must follow a systematic approach.

Select the team and brainstorm

Functional block diagram and/or process flowchart

Prioritize

Data Collection

Analysis

Results

Confirm, evaluate, measure

Repeat (do it all over again)

What happens after the completion of FMEA?

Is the problem identification specific?

Was the effect, symptom or root cause identified?

Is the corrective action measurable?

Is the corrective action proactive?

Is the corrective action realistic and sustainable?

FMEA Process

The basis of FMEA

Information necessary to perform the FMEA:

The different system elements with their characteristics

The connection between elements, tasks, components

Redundancy level and nature of redundancy

Data pertaining to functions, characteristics and performance
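The inputs listed above could be captured in a small record per system element before the analysis starts. This is only a sketch: the class `SystemElement`, its field names, and the helper `missing_information` are hypothetical and not part of any FMEA standard.

```python
from dataclasses import dataclass, field


@dataclass
class SystemElement:
    """Hypothetical record of the information gathered for one element."""
    name: str
    characteristics: dict = field(default_factory=dict)
    functions: list = field(default_factory=list)
    connects_to: list = field(default_factory=list)   # names of connected elements
    performance: dict = field(default_factory=dict)
    redundancy_level: int = 0        # assumed meaning: number of spare instances
    redundancy_kind: str = "none"    # assumed labels, e.g. "active" or "standby"


def missing_information(element):
    """List which kinds of input data are still empty for this element."""
    gaps = []
    if not element.characteristics:
        gaps.append("characteristics")
    if not element.functions:
        gaps.append("functions")
    if not element.connects_to:
        gaps.append("connections")
    if not element.performance:
        gaps.append("performance")
    return gaps


# Illustrative element; the values are made up.
web_server = SystemElement(
    name="web server",
    functions=["serve pages"],
    connects_to=["load balancer", "database"],
    redundancy_level=1,
    redundancy_kind="standby",
)
print(missing_information(web_server))  # ['characteristics', 'performance']
```

A check like this only shows which inputs are still empty; it says nothing about whether the recorded data is accurate.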

A failure mode is the effect by which a failure is observed in a system component. It is important that all possible or potential failure modes of a system be listed.

GENERAL FAILURE MODES

Premature operation

Failure to operate at a prescribed time

Failure to cease operation at a prescribed time

Failure during operation

Failure to start, stop, switch, close

Loss of input, loss of output, erroneous input/output

Tolerance failures

Code errors

Security issues

Intermittent operation
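One way to make sure every general failure mode is considered for a component is to keep the list as an enumeration and report the modes not yet reviewed. The sketch below assumes Python; the names `GeneralFailureMode` and `unreviewed_modes` are illustrative only.

```python
from enum import Enum


class GeneralFailureMode(Enum):
    """The general failure modes listed above."""
    PREMATURE_OPERATION = "premature operation"
    FAILURE_TO_OPERATE_ON_TIME = "failure to operate at a prescribed time"
    FAILURE_TO_CEASE_ON_TIME = "failure to cease operation at a prescribed time"
    FAILURE_DURING_OPERATION = "failure during operation"
    FAILURE_TO_START_STOP_SWITCH_CLOSE = "failure to start, stop, switch, close"
    INPUT_OUTPUT_FAILURE = "loss of input, loss of output, erroneous input/output"
    TOLERANCE_FAILURE = "tolerance failures"
    CODE_ERROR = "code errors"
    SECURITY_ISSUE = "security issues"
    INTERMITTENT_OPERATION = "intermittent operation"


def unreviewed_modes(reviewed):
    """Return the general failure modes not yet considered for a component."""
    return [mode for mode in GeneralFailureMode if mode not in reviewed]


# Hypothetical review state for one component.
reviewed_so_far = {
    GeneralFailureMode.CODE_ERROR,
    GeneralFailureMode.SECURITY_ISSUE,
}
for mode in unreviewed_modes(reviewed_so_far):
    print("still to review:", mode.value)
```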

Identification of Failure Modes

Identification of failure modes, their causes and effects, their relative importance, and their sequence:

The operation of a successful FMEA is dependent on the performance of critical system elements. The key to evaluation of system performance is the identification of critical elements. The procedures for identifying failure modes, their causes and effects can be effectively enhanced by the preparation of a list of failure modes anticipated in view of:

System usage

Mode of operation

Pertinent operation specifications

Time constraints

Environment

Failure Mode Checklist (example)

Logic Missing

Are all constants defined and used?

Are all defaults checked explicitly (e.g., blanks in an input field)?

If character strings are created, are they complete? Are delimiters used and necessary?

If a keyword has many unique values, are they all checked?

Are all keywords tested in a macro?

Are all keyword related parameters tested in a service routine?

Are all increment counts properly initialized?

After processing a data entry table, should any value be decremented/incremented?

Is provision made for possible processing at logical checkpoints (end-of-file, etc.)?

If a queue is being manipulated, can the execution be interrupted?

After queuing/de-queuing, should any value be decremented or incremented?

Should any registers be saved on entry?

Should any registers be restored on exit?
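Two of the "logic missing" questions above, explicit default checks and testing every keyword value, can be illustrated with a short Python sketch. The function names, the shipping keywords, and the default of 1 are assumptions made up for the example.

```python
def parse_quantity(raw: str, default: int = 1) -> int:
    """Parse a quantity field, handling the blank case explicitly."""
    text = raw.strip()
    if text == "":
        return default          # blank input: a deliberate, documented default
    value = int(text)           # raises ValueError on non-numeric input
    if value < 0:
        raise ValueError("quantity must not be negative")
    return value


# Every keyword value the module claims to support (hypothetical set).
SHIPPING_DAYS = {"standard": 5, "express": 2, "overnight": 1}


def shipping_days(keyword: str) -> int:
    """Look up a keyword; unknown values fail loudly instead of falling through."""
    try:
        return SHIPPING_DAYS[keyword]
    except KeyError:
        raise ValueError(f"unhandled shipping keyword: {keyword!r}") from None


print(parse_quantity("   "))      # blank field falls back to the default
print(shipping_days("express"))
```

Rejecting an unknown keyword is a design choice here; a silent fallback would hide exactly the missing-logic case the checklist asks about.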

Logic Wrong

Are literals used where there should be constant data names?

On comparison of group items, should all fields be compared?

Are internal variables unique?

Logic Extra

Are all data areas necessary?

Does this module contain redundant logic?

Control block definition/usage missing

Are pointers declared as XX bit pointers?

Is the bit configuration for input/output parameters defined?

Is the field property defined in the control block/data area?

If the design depends on building, creating, or deleting various control blocks/data areas, is that provided for in the code?

Bits, Bytes, Reset, etc.

Initialize all variables before usage – never assume zeroes

Initialize all fields of a control block, do not leave garbage

Reserved fields must be initialized to zero

Early termination – pointer values not reset

First buffer released, but not others

Data types, variable lengths

When defining counters, make sure boundaries are sufficient

Make code data independent whenever possible

Permutations and parameter values, labels

Parameters passed in wrong order

Update return code on error conditions

Missing parameters (comma missing, moved/copied code etc.)

Duplicate labels

Made-up labels as coder went along
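Some of the checks above can be enforced by the code itself. The Python sketch below shows keyword-only parameters (so arguments cannot be passed in the wrong order), a return code that is set on every error path, and a counter guarded against its storage boundary. The function names, the return-code values, and the 16-bit field width are assumptions for illustration.

```python
UINT16_MAX = 2**16 - 1  # assumed width of the field that stores the counter


def next_sequence(counter: int) -> int:
    """Increment a counter, failing loudly at the assumed 16-bit boundary."""
    if counter >= UINT16_MAX:
        raise OverflowError("sequence counter would exceed its 16-bit field")
    return counter + 1


def transfer(*, source: str, target: str, amount: int) -> int:
    """Validate a transfer request; returns 0 on success, nonzero on error.

    The bare `*` makes every parameter keyword-only, so callers cannot
    swap `source` and `target` by position.
    """
    return_code = 0
    if amount <= 0:
        return_code = 1      # invalid amount
    elif source == target:
        return_code = 2      # nothing to transfer
    return return_code


print(transfer(source="acct-a", target="acct-b", amount=5))   # 0
print(transfer(source="acct-a", target="acct-a", amount=5))   # 2
```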

Loop logic errors

Consider all flags on each iteration

Consider 3 loop conditions: 1st pass, last pass, middle iteration

Initialize all flags and counters before entering loop

Increment counters on each iteration

Update all pointers on each iteration

Wrong bit checked

DO WHILE instead of DO UNTIL

OR instead of AND on IF statement

Tested OFF instead of ON

X'YY' should have been X'10'

Resetting of Bits in Wrong Place

Flag set in control block at wrong time
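The loop and bit-checking items above lend themselves to small, testable functions. The following Python sketch is illustrative only: the `Status` flags and their hex values are hypothetical, and `join_fields` simply stands in for any loop that must behave correctly on the first, middle, and last pass.

```python
from enum import IntFlag


class Status(IntFlag):
    """Hypothetical status bits; naming them avoids testing the wrong mask."""
    READY = 0x01
    BUSY = 0x02
    ERROR = 0x10


def is_usable(status: Status) -> bool:
    """Usable only when READY is ON and ERROR is OFF (AND, not OR)."""
    return bool(status & Status.READY) and not (status & Status.ERROR)


def join_fields(fields):
    """Join fields with commas, covering first, middle, and last passes."""
    out = ""                      # initialized before entering the loop
    for index, item in enumerate(fields):
        if index > 0:             # first pass: no leading delimiter
            out += ","
        out += item
    return out                    # last pass: no trailing delimiter


print(is_usable(Status.READY))                  # True
print(is_usable(Status.READY | Status.ERROR))   # False
print(join_fields(["a", "b", "c"]))             # a,b,c
```

Testing the empty, single-item, and multi-item cases separately exercises the boundary passes the checklist calls out.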

Easy Steps of FMEA

13 easy steps of FMEA:

1. Create a detailed component list

2. Identify functions

3. Identify failure modes

4. Describe the effects

5. Assign severity ranking

6. Identify root causes

7. Assign occurrence ranking (OR)

8. Identify current design controls

9. Assign detection ranking (DR)

10. Calculate Risk Priority Number (RPN)

11. Sort RPN from high to low

12. Set action items and take corrective action

13. Recalculate the resulting RPN and return to step 10
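The ranking, calculation, and sorting steps above can be sketched in Python. The RPN is conventionally the product of the severity, occurrence, and detection rankings. The 1 to 10 scale, the `FailureMode` class, and the sample rows below are assumptions for illustration and do not come from the lesson.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One row of a hypothetical FMEA worksheet."""
    component: str
    mode: str
    severity: int     # severity ranking, assumed 1 (minor) to 10 (critical)
    occurrence: int   # occurrence ranking (OR), assumed 1 to 10
    detection: int    # detection ranking (DR), assumed 1 (easily detected) to 10

    def rpn(self) -> int:
        """Risk Priority Number: severity x occurrence x detection."""
        for value in (self.severity, self.occurrence, self.detection):
            if not 1 <= value <= 10:
                raise ValueError("rankings are assumed to lie in 1..10")
        return self.severity * self.occurrence * self.detection


def prioritize(rows):
    """Sort worksheet rows by RPN, highest first."""
    return sorted(rows, key=lambda row: row.rpn(), reverse=True)


# Illustrative rows only; the rankings are made up.
worksheet = [
    FailureMode("report job", "failure to cease operation", 4, 3, 2),
    FailureMode("login form", "erroneous input accepted", 8, 4, 3),
    FailureMode("session store", "intermittent operation", 6, 2, 7),
]

for row in prioritize(worksheet):
    print(f"{row.rpn():4d}  {row.component}: {row.mode}")
```

After a corrective action, lowering a row's occurrence or detection ranking and calling `prioritize` again corresponds to recalculating and re-sorting the RPN.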

Example 1 of Fault Tree Analysis

Module/function : a user accessing an application on a web site (any with authorization)
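A fault tree for a module like this can be represented as AND/OR gates over basic events, with the top event evaluated from a set of failed basic events. The sketch below is a minimal Python illustration; the event names and the tree shape are hypothetical and are not taken from the lesson's diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Basic event (leaf) of the fault tree."""
    name: str

    def occurs(self, failed):
        return self.name in failed


@dataclass
class Gate:
    """Intermediate or top event combining its children with AND or OR."""
    name: str
    kind: str                      # "AND" or "OR"
    children: list = field(default_factory=list)

    def occurs(self, failed):
        results = [child.occurs(failed) for child in self.children]
        if self.kind == "AND":
            return all(results)
        if self.kind == "OR":
            return any(results)
        raise ValueError(f"unknown gate kind: {self.kind}")


# Hypothetical tree: the user cannot access the application if
# authorization fails OR both web servers are down.
top = Gate("user cannot access application", "OR", [
    Gate("authorization fails", "OR", [
        Event("credential store unreachable"),
        Event("session token rejected"),
    ]),
    Gate("no web server available", "AND", [
        Event("primary server down"),
        Event("backup server down"),
    ]),
])

print(top.occurs({"primary server down"}))                         # False
print(top.occurs({"primary server down", "backup server down"}))   # True
```

The AND gate models redundancy: a single server failure does not trigger the top event, while any one authorization failure does.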

Example 2 of Fault Tree Analysis

Example 3 of Fault Tree Analysis

Critical Failure modes and effects

Security

Failures: any security breach or penetration is a critical failure. Effect: loss of business, loss of Walmart brand recognition, and potential legal liability. Example: a possible injection attack or email fraud at the footer email sign-up field: <input value="" placeholder="Email address" title="Email address" data-tl-id="footer-GlobalEmailSignup-formInput" class="form-control " data-reactid=".2glvuk2txc0.1.0.3.0.1.0.0.1.0.0.1.2.1">

Data

Failures: any significant data corruption or data transfer failure is critical. Effect: loss of consumer data, loss of advertising data, and ultimately loss of Walmart revenue. Example: source data transferred from advertisers can become corrupted <img src="" alt="" data-triggered="0" data-beacon-src="//beam.hlserve.com/b/I8K39PA7KUCqLL0niGw31Q?hlpt=H&amp;fid=96&amp;pageguid=fbcc557f-1ebf-40ba-8b8e-

Navigation

Failures: any inaccessible link or non-working clickable image is critical. Effect: the user cannot find the item and cannot complete the purchase, ultimately a revenue loss to Walmart. Example: a clickable link does not navigate the user to the correct target page <img src="https://tpc.googlesyndication.com/simgad/17347722438563713569" border="0" width="300" height="250" alt="" class="img_ad">

Infrastructure

Failures: any significant outage, whether a network or server failure, is critical. Effect: the user cannot operate the web site, ultimately a revenue loss to Walmart. Example: the DNS server goes down and becomes unresponsive href="//beacon.walmart.com" rel="dns-prefetch"/><link

Practice

1. Prepare a Fault Tree analysis of the www.zillow.com home page.

2. Identify how many critical failure modes can occur in the feature set you selected and record the effects of each.

Homework

Reading:

The FMEA Pocket Handbook, Kenneth W. Dailey

Writing (individual submission; will be posted in Moodle):

1. Prepare a Fault Tree analysis of the www.ndnu.edu/admissions/request-info-freshman page.

2. Identify how many critical failure modes can occur in the feature set you selected and record the effects of each.