Finance

Sam321
InformationRetrievalTheory.pdf

Information Retrieval Theory

Accounting Analytics

1

What is querying?

 Querying is asking questions about the data in the database and manipulating or combining the data in different ways in order to generate “information” from the data. For example:  We can choose specific records and specific attributes in

tables.  We can use multiple tables together.  We can make calculations based on the data.  We can group data together.

2

Requirements for effective querying

1. A database that is well-designed. 2. A query developer who understands the table

structures and the nature of the data in the tables. 3. A query developer who understands the desired query

output. 4. A query developer who has good logic and reasoning

skills. 5. A query developer who knows the querying language

used to retrieve information from the enterprise database.

3

Two query languages

 Structured Query Language (SQL)  The user enters commands according to a pre-defined syntax to

retrieve desired data.

 Query By Example (QBE)  The user starts with a sample of the table(s) columns and marks

the fields he or she wants to include in the answer. Defaults are available for summarizing and manipulating the data.

4

SQL (Structured Query Language)

 Each query statement follows the same structure:

SELECT attribute name(s) FROM table name(s) WHERE criteria is met

5

SQL statements

 SQL’s SELECT component isolates columns  SQL’s FROM component is used for identifying the

table(s) involved  if more than one table is used, the FROM statement also helps

accomplish a join (together with the WHERE statement that specifies how the join occurs)

 SQL’s WHERE component isolates rows  also helps accomplish a join  may be left blank for single-table queries that retrieve all rows

6

Exercise

7

SQL: Question 1

8

Report the sales from Customer 1 (include customer number, sale number, amount, and date).

Select [Cust#], [Sale#], [Amount], [Date] From [tblSale] Where [Cust#] = 1;

tblSale Sale# Cust# SalesClerk# Date Amount

1001 1 SC-1 07/01 1,400.00 1002 2 SC-2 07/21 850.00 1003 3 SC-2 07/22 300.00 1004 1 SC-1 07/22 2,550.00 1005 4 SC-2 07/31 1,650.00 1006 5 SC-1 08/15 200.00 1007 2 SC-3 08/21 4,000.00

Answer: Cust# Sale# Amount Date

1 1001 $1,400.00 07/01 1 1004 $2,550.00 07/22

Join types (joins are required whenever more than one table is specified)  Inner join

 includes only the records from both tables that have the exact same values in the fields that are joined

 Outer join  includes all records from one table, and matches those records

from the other table for which values in the joined fields are equal

9

Left Outer Join Right Outer Join

SQL Inner-Join: Question 2

10

Report only the sales that have been paid (include sale number, amount of sale, customer number, CR#, and amount applied).

Select [Sale#], [Amount], [Cust#], [CR#], [AmtApplied] From [tblSale], [tblCR-Sale] Where [tblSale.Sale#] = [tblCR-Sale.Sale#];

tblSale Sale# Cust# SalesClerk# Date Amount 1001 1 SC-1 07/01 1,400.00 1002 2 SC-2 07/21 850.00 1003 3 SC-2 07/22 300.00 1004 1 SC-1 07/22 2,550.00 1005 4 SC-2 07/31 1,650.00 1006 5 SC-1 08/15 200.00 1007 2 SC-3 08/21 4,000.00

tblCR-Sale CR# Sale# AmtApplied CR-1 1001 $1,400.00 CR-1 1004 $2,550.00 CR-2 1002 $850.00 CR-3 1003 $300.00

Answer: Sale# Amount Cust# CR# AmtApplied 1001 $1,400.00 1 CR-1 $1,400.00 1002 $850.00 2 CR-2 $850.00 1003 $300.00 3 CR-3 $300.00 1004 $2,550.00 1 CR-1 $2,550.00

SQL Outer-Join: Question 3 Report all sales (include sale number, amount of sale, customer number, CR#, and amount applied).

Select [Sale#], [Amount], [Cust#], [CR#], [AmtApplied] From [tblSale] LeftJoin [tblCR-Sale] Where [tblSale.Sale#] = [tblCR-Sale.Sale#];

tblSale Sale# Cust# SalesClerk# Date Amount

1001 1 SC-1 07/01 1,400.00 1002 2 SC-2 07/21 850.00 1003 3 SC-2 07/22 300.00 1004 1 SC-1 07/22 2,550.00 1005 4 SC-2 07/31 1,650.00 1006 5 SC-1 08/15 200.00 1007 2 SC-3 08/21 4,000.00

tblCR-Sale CR# Sale# AmtApplied CR-1 1001 $1,400.00 CR-1 1004 $2,550.00 CR-2 1002 $850.00 CR-3 1003 $300.00

Answer Sale# Amount Cust# CR# AmtApplied 1001 $1,400.00 1 CR-1 $1,400.00 1002 $850.00 2 CR-2 $850.00 1003 $300.00 3 CR-3 $300.00 1004 $2,550.00 1 CR-1 $2,550.00 1005 $1,650.00 4 1006 $200.00 5 1007 $4,000.00 2

Mathematical comparison operators

 SQL Queries may include mathematical comparison operators:  = equal to  < less than  <= less than or equal to  > greater than  >= greater than or equal to  <> not equal to (or != in some software)

 Mathematical comparison operators are typically included in the WHERE clause of the SQL statement, and may be used on all types of fields  For date fields, dates that are earlier in time are “less than” dates that

are later in time.  For text fields, A < B < C, etc.

12

SQL mathematical comparison operators

13

Select [BankAccount#], [BegBalance] From [tblCash] Where [BegBalance] >= 50000;

tblCash

BankAccount# Type Bank Name BegBalance 128100020099 Checking BB&T 28,602.99 198002937850 Checking BOA 7,844.12 198012888621 MoneyMarket BOA 100,000.00 787502946602 Payroll National First 4,930.00

Answer: BankAccount# BegBalance

198012888621 100,000.00

Queries with logical operators

 Queries may include logical operators  AND accomplishes a set intersection – answer includes all

instances that meet BOTH conditions  OR accomplishes a set union – answer includes all instances that

meet one condition and all instances that meet the other condition

 NOT identifies instances that do not meet one or more conditions

14

Queries with special operators

 BETWEEN is used to define the range limits.  The end points of the range are included

15

Select [Sale#], [Date], [Amount] From [tblSale] Where [Date] BETWEEN 7/1 and 7/31;

tblSale Sale# Cust# SalesClerk# Date Amount

1001 1 SC-1 07/01 1,400.00 1002 2 SC-2 07/21 850.00 1003 3 SC-2 07/22 300.00 1004 1 SC-1 07/22 2,550.00 1005 4 SC-2 07/31 1,650.00 1006 5 SC-1 08/15 200.00 1007 2 SC-3 08/21 4,000.00

Answer: Sale# Date Amount

1001 07/01 $1,400.00 1002 07/21 $850.00 1003 07/22 $300.00 1004 07/22 $2,550.00 1005 07/31 $1,650.00

Queries with special operators

 IS NULL is used to retrieve attributes for which the value is null.

 EXISTS is used to retrieve attributes for which the value is not null.

16

Aggregation functions in queries

 An aggregation function summarizes the data values within a field (column)  COUNT summarizes the number of rows that contain a given

value in the field  AVERAGE computes the arithmetic mean value of all rows

included in the answer  SUM computes the arithmetic sum of all rows included in the

answer  MIN identifies the minimum (lowest) attribute value for the field  MAX identifies the maximum (greatest) attribute value for the field

17

Queries with horizontal calculations

 “Horizontal” calculations mathematically combine values from different fields for each row  Horizontal calculations should NOT be included in the same

query as an aggregation function  One query may perform a horizontal calculation and another query

that builds on the first query may perform the aggregation function, or vice versa  The “correct” order for the queries depends on the goal

18

Query by example (QBE)

 I will use the same questions to query this same database in Microsoft Access for instructional purposes.

 We will review the SQL for these same queries in Access.

19

Querying summary

 Querying provides the power of the relational database model  Once you unlock the mystery of query construction, you can tap into the

wealth of information that is at your fingertips in a well-designed relational database

 Querying requires organized thinking and logic  You must understand the structure of the database tables and the nature

of the data in those tables.  You must identify which table(s) are needed for each query, and

determine the appropriate manipulations that need to be made in the appropriate sequence  Some people find it helpful to organize their thinking by considering what the desired

output should look like  Manually calculating the query result using a representative data sample and then

comparing it to the output of the query is also very helpful for identifying query errors  Remember to separate horizontal calculations from vertical aggregations

 Comprehensive testing of queries is crucial before releasing queries for use by general users

20

  • Information Retrieval Theory
  • What is querying?
  • Requirements for effective querying
  • Two query languages
  • SQL (Structured Query Language)
  • SQL statements
  • Exercise
  • SQL: Question 1
  • Join types (joins are required whenever more than one table is specified)
  • SQL Inner-Join: Question 2
  • SQL Outer-Join: Question 3
  • Mathematical comparison operators
  • SQL mathematical comparison operators
  • Queries with logical operators
  • Queries with special operators
  • Queries with special operators
  • Aggregation functions in queries
  • Queries with horizontal calculations
  • Query by example (QBE)
  • Querying summary