Python scripting

Performance Assessment 3.5 – Web scraping and reading PDF files

Task 1 – Pseudocode

Now you are going to create pseudocode for three functions, readwebpage, parsehtml, and outputquotes, along with the main program. These functions should work together to read a web page, parse its HTML, and output the quotes it contains.

Let’s start with the pseudocode:

Function readwebpage

End.

Function parsehtml

End.

Function outputquotes

End.

Program Main

End.

Deliverables for Task 1

· Pseudocode for the three functions and the main program

Task 2 – Writing the program for webscraping

Write a program called <Your Name>_PA32 that reads in a web page, processes the data, and writes the quotes out to the screen. The program should scrape the web page https://quotes.toscrape.com/page/2/ and display the results on your program screen. Make sure to include your student ID in the first print statement of the program, and output the parsed quotes with their authors.

Create a function named readwebpage that opens the URL https://quotes.toscrape.com/page/2/. The data is then parsed using a second function, and the quotes and authors are output using a third function.
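The three-function structure described above might be sketched as follows. This is only one possible layout, not the required solution. It assumes the function names parsehtml and outputquotes from Task 1, and it assumes the page marks each quote with a span of class "text" and each author with a small element of class "author"; check the page source to confirm. The sketch uses the standard library's html.parser rather than Beautiful Soup so that it has no third-party dependency, and the QuoteParser class is a hypothetical helper introduced for illustration.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

URL = "https://quotes.toscrape.com/page/2/"


def readwebpage(url):
    """Open the URL and return the page's HTML as a string."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


class QuoteParser(HTMLParser):
    """Hypothetical helper: collects (quote, author) pairs.

    Assumes quotes sit in <span class="text"> and authors in
    <small class="author">; this markup is an assumption.
    """

    def __init__(self):
        super().__init__()
        self.quotes = []     # finished (quote, author) pairs
        self._field = None   # "text" or "author" while inside such a tag
        self._buffer = []    # text pieces collected for the current field
        self._quote = ""     # most recent quote, waiting for its author

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "text" in classes:
            self._field, self._buffer = "text", []
        elif tag == "small" and "author" in classes:
            self._field, self._buffer = "author", []

    def handle_data(self, data):
        if self._field:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._field == "text" and tag == "span":
            self._quote = "".join(self._buffer).strip()
            self._field = None
        elif self._field == "author" and tag == "small":
            author = "".join(self._buffer).strip()
            self.quotes.append((self._quote, author))
            self._field = None


def parsehtml(html):
    """Return a list of (quote, author) tuples found in the HTML."""
    parser = QuoteParser()
    parser.feed(html)
    parser.close()
    return parser.quotes


def outputquotes(quotes):
    """Print each quote followed by its author."""
    for quote, author in quotes:
        print(f"{quote} - {author}")


def main():
    print("Student ID: <your student ID>")  # placeholder, replace with your own
    html = readwebpage(URL)
    outputquotes(parsehtml(html))
```

Calling main() at the bottom of the file, typically under an `if __name__ == "__main__":` guard, fetches the live page and prints the results. Here main chains the three functions; if you read the task as requiring readwebpage itself to call the other two, the calls can be moved inside it without changing the functions.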

Take a screenshot of your completed program and another of your output.

Deliverables for Task 2

· Screenshot of your completed program and the output

Task 3 – Writing the program to read PDFs

Write a program, made up of two functions, that pulls text information from a PDF document. Pull the data from the PDF file USCensus.pdf into a text file called USCensus_Output.txt.
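A minimal sketch of the two-function layout is shown below, under some assumptions. The function names readpdf and writetext are hypothetical, and the third-party pypdf package (installed with pip install pypdf) is an assumed choice, since the assignment does not name a PDF library. The quality of the extracted text depends on how the PDF was produced.

```python
from pathlib import Path

PDF_FILE = "USCensus.pdf"
OUTPUT_FILE = "USCensus_Output.txt"


def readpdf(pdf_path):
    """Return the text of every page in the PDF, joined by newlines."""
    # Imported here so the rest of the file loads even without pypdf
    # installed; pypdf is an assumed library, not named by the assignment.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() can come back empty for image-only pages.
    pages = [page.extract_text() or "" for page in reader.pages]
    return "\n".join(pages)


def writetext(text, out_path):
    """Write the extracted text to a text file, replacing any old content."""
    Path(out_path).write_text(text, encoding="utf-8")


def main():
    text = readpdf(PDF_FILE)
    writetext(text, OUTPUT_FILE)
    print(f"Wrote {len(text)} characters to {OUTPUT_FILE}")
```

Calling main() with USCensus.pdf in the same folder as the script should produce USCensus_Output.txt next to it.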

Take a screenshot of your completed program and another of your output.

Deliverables for Task 3

· Screenshot of your completed program and the output