Toastmasters application development

Divya Gorrela
Christopher Hansen

Professor Batchelder

11/16/2020

Software Requirement Specification Document for Toastmaster’s Toolbox System

Table of Contents

Preface
Introduction
Glossary
User Requirements Definition
    Functional Requirements
    Non-Functional Requirements
    Use Case Diagram
System Architecture
    Server and Host Application Logic Module
        Facial Expression Recognition Thread
        Web Server Hosting Thread
        Speech Recognition Thread
        Timer Thread
        File I/O Methods
        Reporting Methods
        Generic Application Methods
    Server and Host Graphical User Interface Module
    Client Ah-Counter Application Logic Module
    Client Ah-Counter Graphical User Interface Module
System Requirements Specification
    Functional Requirements
    Non-functional Requirements
System Models
    System Context Diagram
    Software System Context Diagram
    System Architecture Overview
System Evolution
Appendix A – Hardware Requirements
Appendix B – Diagrams

Preface

This software requirements specification document provides information on the requirements engineering, system architecture, system modeling, and system design of the Toastmaster’s Toolbox system. It is intended to help the current developer and any future developers understand what has been or will be implemented for this system. This is the second version of the document. It incorporates details of the current system as well as corrections to the original revision.

Introduction

This system will serve as an aid in Toastmasters presentations. The system is called the Toastmaster’s Toolbox and will fulfill many of the roles in a traditional Toastmasters meeting. These roles include analyzing facial expressions, giving speech disfluency feedback, displaying timing cues, giving feedback on the speaker’s rate of speech, and providing a report to the user at the end of the speech. The system will be composed of software that runs on a single computer, or PC. It aims to help users improve their public speaking abilities in a stand-alone manner, which is in line with the intention of Toastmasters.

Glossary

1. API: Application Programming Interface; a computing interface that defines interactions between multiple software intermediaries.

2. Emulation: reproduction of the function or action of a different computer, software system, etc.

3. Graphical User Interface (GUI): a visual way of interacting with a computer using items such as windows, icons, and menus; used by most modern operating systems.

4. Linux: an open-source operating system.

5. Lubuntu: a lightweight distribution of Ubuntu, a Linux operating system.

6. Module: a Python file that contains a set of variables, functions, and/or classes.

7. Package: a namespace that can contain multiple packages and modules.

8. PC: Personal Computer.

9. Processor: an electrical unit that executes instructions and computations.

10. Real-time: the actual time during which a process or event occurs.

11. Software Architecture: the fundamental structure of a software system.

12. Virtual Machine: an emulation of a computer system.

User Requirements Definition

This system will fulfill multiple roles that are typically performed by a human observer

during a Toastmasters speech.

Functional Requirements

1. The system shall perform a real-time facial expression analysis of the current speaker.

2. The system shall provide speech disfluency feedback to the current speaker.

3. The system shall display timing cues to the current speaker that indicate how long the speaker has spoken.

4. The system shall provide a report of the overall performance of the speaker once the speech is done.

5. The system will provide real-time feedback on the rate of speech of the current speaker.

Non-Functional Requirements

1. The system shall be composed of software within a single PC as well as one external device for the Ah-Counter.

2. The system should perform in real-time.

3. The system will utilize a virtual machine running a Lubuntu distribution of Linux hosted on a Windows 10 operating system.

4. The software will be developed in a Linux environment running within the virtual machine.

5. The system will utilize a Graphical User Interface to allow the users to provide input and observe outputs from the system.

Use Case Diagram

Figure URD-1

System Architecture

The architecture of the system can be classified as a client-server layered architecture. The system is composed of four main modules, each contained in its own file, for a total of four files. Two higher-level modules provide the GUIs for the server side and the client side of the system: the Server and Host Graphical User Interface Module and the Client Ah-Counter Graphical User Interface Module. These allow users to interact with the system through an intuitive interface using a variety of methods provided by the GUIs. The server and the client each also have a lower-level, event-driven logic module. These logic modules interact with each other and with their respective GUIs, and are called the Server and Host Application Logic Module and the Client Ah-Counter Application Logic Module. They use even lower-level, open-source modules from external libraries, which are imported at the beginning of each module's file. See Figure SM-3 in the System Models section for a graphical overview of the system architecture.

Server and Host Application Logic Module

The Server and Host Application Logic Module provides most of the functionality of the Toastmaster’s Toolbox system. This module is contained within the Python file Final_Project.py and is the most processor-intensive module. It can be broken down into 7 main sections, which provide the classes and methods that fulfill a large portion of the functional and non-functional requirements of the system. The 7 sections consist of 4 threads that run simultaneously and 3 additional sections composed of methods grouped by functionality. These sections are as follows: Facial Expression Recognition Thread, Web Server Hosting Thread, Speech Recognition Thread, Timer Thread, File Input/Output (I/O) Methods, Reporting Methods, and Generic Application Methods.
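The simultaneous threads described above could be started and stopped along the following lines. This is only a minimal sketch using the standard threading module: the worker names, the placeholder loop body, and the start_speech and stop_speech helpers are hypothetical and are not taken from Final_Project.py.

```python
import threading
import time

# Hypothetical worker names mirroring the four threads described above.
WORKER_NAMES = ("facial_expression", "web_server", "speech_recognition", "timer")


def worker_loop(name, stop_event, log):
    """Placeholder loop; a real worker would process frames, audio, and so on."""
    while not stop_event.is_set():
        log.append(name)        # stand-in for one unit of work
        stop_event.wait(0.01)   # pause briefly, but return early if a stop is requested


def start_speech(log):
    """Start one thread per worker and return (threads, stop_event)."""
    stop_event = threading.Event()
    threads = [
        threading.Thread(target=worker_loop, args=(name, stop_event, log), daemon=True)
        for name in WORKER_NAMES
    ]
    for thread in threads:
        thread.start()
    return threads, stop_event


def stop_speech(threads, stop_event):
    """Signal every worker to finish and wait for all of them to exit."""
    stop_event.set()
    for thread in threads:
        thread.join()
```

A shared Event is one simple way to let a single stop action end every loop. The real module may coordinate its threads differently.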

Facial Expression Recognition Thread

This thread utilizes the fer (Facial Expression Recognition) and Video modules from the external FER package. FER is open-source software that uses a neural network to determine which emotions a user is exhibiting by acquiring video from the webcam and processing the feed. The thread loops continuously, analyzing the expressions frame by frame. It also maintains output to the Server and Host GUI: it updates the speaker on what emotion they are exhibiting and the ‘magnitude’ of that emotion, calculates and displays a frames-per-second (FPS) counter, and mirrors the video feed depending on whether a QRadioButton widget on the Server and Host GUI is activated. This thread begins running at the launch of the program and terminates once a user stops the speech. It can resume only if a user begins a speech again after stopping it.
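The per-frame bookkeeping described above (choosing the emotion to display, mirroring the frame, and computing FPS) can be illustrated without the FER package itself. The sketch below assumes that each analyzed frame yields a dictionary mapping emotion names to scores between 0 and 1. That input shape and all three helper names are assumptions made for illustration.

```python
import time


def dominant_emotion(scores):
    """Return (emotion, magnitude) for the highest-scoring emotion.

    `scores` is assumed to map emotion names to values between 0 and 1.
    """
    emotion = max(scores, key=scores.get)
    return emotion, scores[emotion]


def mirror_frame(frame):
    """Flip a frame horizontally; `frame` is a list of pixel rows."""
    return [row[::-1] for row in frame]


class FpsCounter:
    """Estimate frames per second from the time between processed frames."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._last = None

    def tick(self):
        """Call once per frame; returns the FPS estimate, or None on the first call."""
        now = self._clock()
        previous, self._last = self._last, now
        if previous is None or now <= previous:
            return None
        return 1.0 / (now - previous)
```

For example, dominant_emotion({"happy": 0.7, "neutral": 0.2, "sad": 0.1}) returns ("happy", 0.7), which corresponds to the emotion and ‘magnitude’ shown on the GUI.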

Web Server Hosting Thread

This thread utilizes the flask module from the external open-source Flask package. The thread hosts a simple web server. This section of code also routes requests from the client to the methods that handle them. These methods update the GUI with the current Ah-Count, play an audio file when the Ah-Counter dings the speaker, and allow the Ah-Counter to change the background color of the Ah-Counter label on both the Server and Client GUI Modules. This thread begins running at the launch of the program and terminates once a user stops the speech. It can resume only if a user begins a speech again after stopping it.
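A web server thread of this kind might look roughly like the sketch below. It assumes Flask is installed. The route names, the shared _state dictionary, and the start_web_server helper are hypothetical, and the GUI update and audio playback are only indicated by a comment.

```python
import threading

from flask import Flask, jsonify

app = Flask(__name__)

# Shared state guarded by a lock, because requests may be served
# from a different thread than the one that updates the GUI.
_state = {"ah_count": 0}
_lock = threading.Lock()


@app.route("/ah/increment", methods=["POST"])  # hypothetical route name
def increment_ah():
    """Record one 'ding' from the Ah-Counter and return the new total."""
    with _lock:
        _state["ah_count"] += 1
        count = _state["ah_count"]
    # A real handler would also update the GUI label and play the audio cue here.
    return jsonify(ah_count=count)


@app.route("/ah", methods=["GET"])  # hypothetical route name
def current_ah():
    """Return the current Ah-Count so either GUI can display it."""
    with _lock:
        return jsonify(ah_count=_state["ah_count"])


def start_web_server(host="0.0.0.0", port=5000):
    """Run the Flask development server in a background daemon thread."""
    thread = threading.Thread(
        target=app.run,
        kwargs={"host": host, "port": port, "use_reloader": False},
        daemon=True,
    )
    thread.start()
    return thread
```

The reloader is disabled because it expects to run in the main thread. The host and port shown are placeholder defaults, not values from the project.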

Speech Recognition Thread

This thread utilizes the pyaudio module from the PyAudio package, as well as the speech_recognition module from the SpeechRecognition package. The thread loops continuously: it gathers audio data from the microphone using the pyaudio module, sends this audio data to Google’s speech recognition API (application programming interface) using the speech_recognition module, calculates the average words-per-minute, and outputs the result to the Server and Host GUI Module. The thread also outputs a string of the recognized audio to the Server and Host GUI, along with the number of words that it recognized. If the thread does not recognize any audio, it outputs a corresponding message. This thread begins running at the launch of the program and terminates once a user stops the speech. It can resume only if a user begins a speech again after stopping it.
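The words-per-minute calculation mentioned above can be sketched independently of the audio capture and the recognition API. The function and class names below are hypothetical. The sketch assumes that each recognized chunk arrives as a plain string and that words are separated by whitespace.

```python
def words_per_minute(word_count, elapsed_seconds):
    """Average speaking rate over the elapsed time, in words per minute."""
    if elapsed_seconds <= 0:
        return 0.0
    return word_count * 60.0 / elapsed_seconds


class SpeechRateTracker:
    """Accumulate recognized text and report the running average rate."""

    def __init__(self):
        self.total_words = 0

    def add_transcript(self, text):
        """Count the words in one recognized chunk and return that count."""
        count = len(text.split())
        self.total_words += count
        return count

    def average_wpm(self, elapsed_seconds):
        """Average rate across everything recognized so far."""
        return words_per_minute(self.total_words, elapsed_seconds)
```

For instance, 7 recognized words over 30 seconds give 7 × 60 / 30 = 14 words per minute. An empty transcript adds no words, which matches the case where no audio is recognized.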

Timer Thread

This thread utilizes the built-in time module. The thread begins once the user starts the speech via the Server and Host GUI. It first receives the values for the different speech timer settings from the Server and Host GUI, and then begins a timer. On each loop iteration, the thread determines whether the speaker has reached any of the thresholds that were specified earlier and outputs a background color to the Server and Host GUI that represents the timer flags. The color corresponds to the threshold that has been met or surpassed. This thread ends once the speaker reaches the end of the allotted speaking time.
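The threshold check performed on each loop iteration can be expressed as a small pure function. The sketch below assumes the conventional green, yellow, and red cue colors and three thresholds given in seconds. The function name and color strings are illustrative, not the project's own.

```python
def timer_color(elapsed_seconds, green_at, yellow_at, red_at):
    """Return the cue color for the latest threshold met, or None before the first."""
    # Check the latest threshold first so the most recent cue wins.
    if elapsed_seconds >= red_at:
        return "red"
    if elapsed_seconds >= yellow_at:
        return "yellow"
    if elapsed_seconds >= green_at:
        return "green"
    return None
```

With thresholds of 300, 360, and 420 seconds, a speaker who has spoken for 365 seconds would see the yellow cue.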

File I/O Methods

This section is a grouping of two methods that can be called to either save the generated report to a .txt file or import a report from an existing .txt file. The saved file is always named “Toastmaster Report.txt”. If this file already exists, it will be overwritten. The import function reads a file with the same “Toastmaster Report.txt” name and outputs the report to the Server and Host GUI.
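A minimal sketch of the two methods follows. The function names, the optional directory parameter, and the choice to return None when no report exists are assumptions. Only the fixed file name and the overwrite behavior come from the description above.

```python
from pathlib import Path

REPORT_FILENAME = "Toastmaster Report.txt"  # fixed name stated above


def save_report(report_text, directory="."):
    """Write the report, overwriting any existing file of the same name."""
    path = Path(directory) / REPORT_FILENAME
    path.write_text(report_text, encoding="utf-8")
    return path


def import_report(directory="."):
    """Read a previously saved report; returns None if no file exists."""
    path = Path(directory) / REPORT_FILENAME
    if not path.exists():
        return None
    return path.read_text(encoding="utf-8")
```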

Reporting Methods

These methods generate the report after the speaker is done and output it to the Server and Host GUI, navigate to the report page on the GUI, or cancel the report, which simply navigates back to the previous page of the Server and Host GUI.
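Report generation could be sketched as below, using the fields that the System Requirements Specification lists for the report (speech time, Ah-Counter dings, most and least used emotion, and rate of speech). The function name, the layout of the text, and the representation of emotions as a list of per-frame labels are assumptions.

```python
from collections import Counter


def build_report(duration_seconds, ah_count, emotion_samples, average_wpm):
    """Format the end-of-speech report as plain text."""
    minutes, seconds = divmod(int(duration_seconds), 60)
    # most_common() sorts emotions from most to least frequent.
    counts = Counter(emotion_samples).most_common()
    most_used = counts[0][0] if counts else "n/a"
    least_used = counts[-1][0] if counts else "n/a"
    return "\n".join([
        f"Speech time: {minutes}:{seconds:02d}",
        f"Ah-Counter dings: {ah_count}",
        f"Most used emotion: {most_used}",
        f"Least used emotion: {least_used}",
        f"Rate of speech: {average_wpm:.0f} words per minute",
    ])
```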

Generic Application Methods

This section contains methods to begin and stop the speech, to terminate all running threads, and to set the speech settings once the user clicks the ‘Enter’ button on the first page of the Server and Host GUI, as well as a function to quit the application when the user exits the program.

Server and Host Graphical User Interface Module

This is the highest-level module that allows the speaker to interact with the system. It

allows the speaker to provide inputs in the form of button or key presses and provides outputs to

the user that it receives from the lower-level Server and Host Application Logic Module. This

module is composed of a single PyQt5 file called Final_Project.ui.

Client Ah-Counter Application Logic Module

This module is a lower-level module that takes inputs from the Client Ah-Counter GUI and provides outputs to the Client Ah-Counter GUI, as well as to the Server and Host Application Logic Module via a network. This module is not processor-intensive and runs one thread that accesses the web server hosted by the Server and Host Application Logic Module. The module also contains functions that run when the Ah-Counter clicks one of the buttons. These functions execute simple logic, such as incrementing or decrementing the Ah-Count, and then send a POST request to an address on the web server.
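The button handlers described above might be structured as follows. This sketch uses the standard library's urllib for the POST request, which may differ from the HTTP client the project actually uses. The class name, the JSON payload shape, the server URL, and the rule that the count never drops below zero are all assumptions.

```python
import json
import urllib.request


def post_json(url, payload):
    """Send `payload` as a JSON POST request and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


class AhCounterClient:
    """Track the Ah-Count locally and report each change to the server."""

    def __init__(self, server_url, post=post_json):
        self.server_url = server_url  # hypothetical endpoint on the hosted web server
        self.count = 0
        self._post = post

    def increment(self):
        """Handler for the button that dings the speaker."""
        self.count += 1
        self._post(self.server_url, {"ah_count": self.count})

    def decrement(self):
        """Handler for the button that undoes a ding."""
        # Assumption: the count never drops below zero.
        if self.count > 0:
            self.count -= 1
            self._post(self.server_url, {"ah_count": self.count})
```

Passing the posting function in as a parameter keeps the counting logic testable without a running server.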

Client Ah-Counter Graphical User Interface Module

This is the highest-level module that allows the Ah-Counter to interact with the system. It

allows the Ah-Counter to provide inputs in the form of button presses and provides outputs to the

user that it receives from the lower-level Client Ah-Counter Application Logic Module. This

module is composed of a single PyQt5 file called flask_client.ui.

System Requirements Specification

Functional Requirements

1. The system shall output the real-time video of the speaker in the GUI. This is to mimic the function of a mirror.

2. The system shall analyze the expressions of the speaker using the footage captured by the webcam.

3. The system shall determine the mood of the speaker based on the expressions it analyzes.

Non-functional Requirements

1. Video capture, expression analysis, and feedback shall all occur in real-time.

2. The system shall capture the audio of the speaker’s speech via a microphone.

3. The system shall parse through the speech in real-time.

4. The system shall output an audio signal and graphical signal through the GUIs when the

speaker uses a filler word such as “uh” or “um.”

5. The Ah-Counter shall use a device connected to the internet that allows them to ding the

speaker via a button when the speaker uses a filler word.

6. The Ah-Counter’s device shall communicate with the system via a webserver.

7. The webserver shall be hosted on the speaker’s machine.

8. The system shall keep track of how many times the Ah-Counter dings the speaker.

9. The system shall keep track of the number of filler words that it detects.

10. The system shall output a graphical signal that has a color that corresponds to the

appropriate time for the cue.

10.1 There shall be three different cues corresponding to three different times.

10.2 There shall be 3 different colors that correspond to the three different cues.

11. The user shall be able to configure the three different times that correspond to the cues.

12. The system will display a report that contains information about the speech as soon as the

speech is over.

12.1 The report shall display the amount of time that the speech took.

12.2 The report shall display the number of times the Ah-Counter dinged the speaker.

12.3 The report shall display the most used emotion by the speaker.

12.4 The report shall display the least used emotion by the speaker.

12.5 The report shall display the rate of speech of the speaker.

13. The speech ends when the speaker presses a stop button on the GUI, or when the speaker

runs out of time.

14. The system shall save the report to a .txt file.

15. The system’s main program shall be executed in a virtual machine.

16. The virtual machine shall run Lubuntu.

17. The system shall allow an external device to connect to it via a webserver.

18. The external device shall communicate with the system via the webserver.

19. The system’s latency, which determines whether its behavior is classified as real-time, shall be less than 250 milliseconds.

20. The real-time feedback of the rate of speech tracker shall refresh the value every 5

seconds.

21. The virtual machine shall be hosted on a Windows 10 operating system.

22. The GUI shall output information to the speaker in real-time.

22.1 The GUI shall display the number of times the Ah-counter dinged the speaker.

22.2 The GUI shall display the current rate of speech.

22.3 The GUI shall display the current mood it perceives the speaker to have.

23. The GUI shall have buttons to start and stop the speech.

24. The GUI shall have a window for webcam feedback.
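The two numeric limits in the list above (latency under 250 milliseconds and a rate-of-speech refresh every 5 seconds) could be checked with helpers like the ones below. The constant and function names are hypothetical. This is a sketch of one way to verify the limits, not a description of how the system enforces them.

```python
import time

REALTIME_BUDGET_SECONDS = 0.250  # latency limit: less than 250 milliseconds
WPM_REFRESH_SECONDS = 5.0        # rate-of-speech refresh interval: 5 seconds


def timed_call(func, *args, clock=time.perf_counter):
    """Run func(*args) and return (result, elapsed_seconds)."""
    start = clock()
    result = func(*args)
    return result, clock() - start


def is_realtime(elapsed_seconds, budget=REALTIME_BUDGET_SECONDS):
    """True when a processing step finished within the real-time budget."""
    return elapsed_seconds < budget


def refresh_due(now, last_refresh, interval=WPM_REFRESH_SECONDS):
    """True when the rate-of-speech display should be refreshed."""
    return now - last_refresh >= interval
```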

System Models

This chapter includes graphical system models showing the relationships between the

system components and the system and its environment.

System Context Diagram

Figure SM-1

Software System Context Diagram

Figure SM-2

System Architecture Overview

Figure SM-3

System Evolution

This section describes the fundamental assumptions on which the system is based, and any anticipated changes due to hardware evolution, changing user needs, and so on.

This system has been designed around a few hardware requirements and assumptions. The speaker’s computer is assumed to have access to a webcam with a resolution of at least 400x300 pixels, a microphone, an internet connection, and at least two physical processing cores. The Ah-Counter’s machine has lower hardware requirements: at least two physical processing cores and an internet connection. Both machines need two processing cores because at least one core is dedicated to the host’s operating system and a second to the virtual machine’s operating system. It would be preferable for the host machine to have 4 or more cores so that at least 2 could be dedicated to the virtual machine, due to the heavy processing demands of the Server and Host Application Logic Module.

Future design changes should consider limiting the number of threads required to run the software, while keeping enough threads to maintain the core functionality of the system. The number of threads the host machine should execute is proportional to the number of dedicated processing cores the virtual machine has access to. Higher-resolution webcams are welcome, but performance can degrade quickly depending on the resolution and the processing power of the machine. If the host machine’s processor can maintain the required performance goals while the resolution of the webcam is increased, both facial expression recognition and video feed quality will improve.

Appendix A – Hardware Requirements

Hardware requirements define the minimal configurations for the system.

1. At least two physical processing cores on server/speaker and client/ah-counter machines.

2. Server machine must have a webcam with a minimum resolution of 400x300.

3. Server machine must have access to a microphone.

4. Server and client machines must have internet connection of at least 2 Mbps download

and 500 Kbps upload.
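The minimum configuration above can be captured as a simple check. The sketch below assumes the measured values (core count, webcam resolution, and connection speeds) are obtained elsewhere and passed in. The function name and message strings are illustrative only.

```python
MIN_CORES = 2
MIN_WEBCAM = (400, 300)   # width, height in pixels
MIN_DOWNLOAD_MBPS = 2.0
MIN_UPLOAD_KBPS = 500.0


def check_server_hardware(cores, webcam_resolution, download_mbps, upload_kbps):
    """Return a list of unmet minimum requirements (empty when all are met)."""
    problems = []
    if cores < MIN_CORES:
        problems.append("fewer than two physical processing cores")
    width, height = webcam_resolution
    if width < MIN_WEBCAM[0] or height < MIN_WEBCAM[1]:
        problems.append("webcam resolution below 400x300")
    if download_mbps < MIN_DOWNLOAD_MBPS:
        problems.append("download speed below 2 Mbps")
    if upload_kbps < MIN_UPLOAD_KBPS:
        problems.append("upload speed below 500 Kbps")
    return problems
```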

Appendix B – Diagrams

Figure URD-1: Use Case Diagram to show how users will engage in the functionality of the system.

Figure SM-1: System Context Diagram to show the boundaries and interactions of the entire

system, including hardware components.

Figure SM-2: Software System Context Diagram to show the boundaries and interactions of the

software system.

Figure SM-3: System Architecture Overview to show the fundamental structure of the software

system.