Problem Set - OOP
Copyright MIT's 6.0001 Introduction to Computer Science and Programming in Python Fall 2016
Adjusted for the Needs of the Course Introduction to Computer Science by the Chair of Digital Industrial Service Systems FAU
In this week's problem set, we're going to discover some benefits of object-oriented programming. Additional information on the MIT problem sets in general can be found at the MIT OCW site.
Getting started
Here's how to download this problem into your own CS50 IDE. Log into CS50 IDE and then, in a terminal window, execute each of the below.
- Execute cd to ensure that you're in ~/ (i.e., your home directory).
- Execute mkdir MIT_OOP to make (i.e., create) a directory called MIT_OOP in your home directory.
- Execute cd MIT_OOP to change into that directory.
- Execute wget introcs.is.rw.fau.de/assets/pdfs/MIT_OOP.zip to download a (compressed) ZIP file with this problem's distribution.
- Execute unzip MIT_OOP.zip to uncompress that file.
- Execute rm MIT_OOP.zip followed by yes or y to delete that ZIP file.
- Execute ls. You should see this problem's distribution: feedparser.py, project_util.py, ps5.py, ps5_test.py, stories.txt, triggers.txt
Do not change anything in feedparser.py, project_util.py and ps5_test.py.
Background
RSS Feed Filter
RSS Overview
News sites, such as BBC News or Yahoo Top Stories, have content that is updated on an unpredictable schedule. One tedious way to keep track of this changing content is to load the website up in your browser and periodically hit the refresh button. Fortunately, this process can be streamlined and automated by connecting to the website's RSS feed, using an RSS feed reader instead of a web browser. An RSS reader (e.g. Sage) will periodically collect and draw your attention to updated content. RSS stands for "Really Simple Syndication." An RSS feed consists of (periodically changing) data stored in an XML-format file residing on a web server. For this problem set, the details are unimportant. You don't need to know what XML is, nor do you need to know how to access these files over the network. We have taken care of retrieving and parsing the XML file for you.
Data Structure Design
RSS Feed Structure: Yahoo
First, let's talk about one specific RSS feed: Yahoo Top Stories. The URL for the Yahoo Top Stories feed is http://news.yahoo.com/rss/topstories. If you try to load this URL in your browser, you'll probably see your browser's interpretation of the XML code generated by the feed. You can view the XML source with your browser's "View Page Source" function, though it probably will not make much sense to you. Abstractly, whenever you connect to the Yahoo Top Stories RSS feed, you receive a list of items. Each entry in this list represents a single news item. In a Yahoo News feed, every entry has the following fields:
- guid: A globally unique identifier for this news story.
- title: The news story's headline.
- description: A paragraph or so summarizing the news story.
- link: A link to a website with the entire story.
- pubDate: Date the news was published.
- category: News category, such as "Top Stories".
Generalizing the Problem
This is a little trickier than we'd like it to be, because each of these RSS feeds is structured a little bit differently than the others. So, our goal is to come up with a unified, standard representation that we'll use to store a news story. We want to do this because we want an application that aggregates several RSS feeds from various sources and can act on all of them in the exact same way. We should be able to read news stories from various RSS feeds all in one place.
Parsing the Feed
Parsing is the process of turning a data stream into a structured format that is more convenient to work with. We have provided you with code that will retrieve and parse the Google and Yahoo news feeds.
Testing
You might have noticed that this problem set's distribution code contains a file called ps5_test.py. Do not change anything in this file, otherwise your tests will fail!
This file contains the unit tests you will need to check whether your program performs as required at the different stages of your coding. You can of course run the check50 command given to you at the end of this specification. However, this would run all of the unit tests, making it quite confusing to validate individual checks, for instance whether you have implemented a single class such as NewsStory correctly.
By using ps5_test.py you can thus run unit tests to check parts of your program without having to completely implement it!
Have a look at the file ps5_test.py. You will see that it contains two classes that inherit from unittest.TestCase. These are your two Test Classes, which in total include 16 Test Methods!
Each of these tests can be run individually. But how can we run test cases from our command line? If you are generally interested in unit tests and how to work with them in Python, we recommend you have a look at the relevant Python docs. Nevertheless, we will try to give you a brief rundown.
The unittest module can be used from the command line to run tests from modules, classes or even individual test methods:
Running a Test module
A so-called Test module is simply the test file, in our case ps5_test.py.
We can run any Test module by writing the following into our terminal:
python3 -m unittest test_module
In our specific Pset you can run this line in your terminal!
python3 -m unittest ps5_test
Running a Test Class
A Test Class is a class that inherits from unittest.TestCase and is defined in your Test module. In our example we have only two Test Classes, ProblemSet5NewsStory and ProblemSet5. We can run an individual Test Class with the following command:
python3 -m unittest test_module.test_class
In our specific Pset, to run all the tests included in the Test Class ProblemSet5, we can run this command:
python3 -m unittest ps5_test.ProblemSet5
Running a Test Method
This is where it gets interesting: we can even run individual Test Methods that are included in our Test Classes. The general terminal command looks like this:
python3 -m unittest test_module.test_class.test_method
If you want to test whether you have specified the constructor of your class NewsStory correctly in Problem 1, you can run the following terminal command:
python3 -m unittest ps5_test.ProblemSet5NewsStory.testNewsStoryConstructor
As you can see, testing is quite simple! You can find the names of all the test methods by looking into the file ps5_test.py. Then you can simply run individual tests and see your progress by checking your individual tasks one at a time!
Nevertheless, we understand that this might be a bit much at once! You can of course always run the check50 command (which we recommend doing as soon as you have completed the Pset). However, this can lead to frustration as this Pset requires a lot of individual smaller steps before you will pass the majority of checks! You can avoid this by unit testing!
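To make the module / class / method naming concrete, here is a minimal sketch of what a test file can look like. The names ExampleTests, test_upper and test_split are hypothetical and do not appear in ps5_test.py; the sketch only illustrates how Test Classes and Test Methods are laid out and how one Test Class can be run on its own.

```python
import unittest


class ExampleTests(unittest.TestCase):
    """A hypothetical Test Class; every method starting with 'test' is a Test Method."""

    def test_upper(self):
        self.assertEqual("cow".upper(), "COW")

    def test_split(self):
        self.assertEqual("purple cow".split(), ["purple", "cow"])


# Load and run a single Test Class programmatically. On the command line,
# `python3 -m unittest some_module.ExampleTests` would select the same tests.
suite = unittest.TestLoader().loadTestsFromTestCase(ExampleTests)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

With verbosity=2 the runner lists each Test Method by name, which is a handy way to discover the names you can pass on the command line.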
Now let us jump straight into what you need to do!
Specification
Create Class NewsStory
💡 Task
Parsing all of this information from the feeds that Google/Yahoo/etc. gives us is no small feat. So, let's tackle an easy part of the problem first. Pretend that someone has already done the specific parsing, and has left you with variables that contain the following information for a news story:
- globally unique identifier (GUID) - a string
- title - a string
- description - a string
- link to more content - a string
- pubdate - a datetime
We want to store this information in an object that we can then pass around in the rest of our program.
Your task, in this problem, is to write a class, NewsStory, starting with a constructor that takes (guid, title, description, link, pubdate) as arguments and stores them appropriately. Remember how to use the __init__ method to create your constructor. NewsStory also needs to contain the following methods:
- get_guid(self)
- get_title(self)
- get_description(self)
- get_link(self)
- get_pubdate(self)
The solution to this problem should be relatively short and very straightforward (please review what get methods should do if you find yourself writing multiple lines of code for each; they usually only return something). Once you have implemented NewsStory, all the NewsStory test cases should work.
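As a reminder of the constructor-plus-getters pattern, here is a small sketch on a deliberately unrelated, hypothetical class called Book (its attributes and values are made up for illustration). It is not the NewsStory solution, but the shape is the same.

```python
class Book(object):
    """Hypothetical example class, unrelated to the problem set."""

    def __init__(self, title, year):
        # The constructor stores its arguments as instance attributes.
        self.title = title
        self.year = year

    def get_title(self):
        # A getter simply returns the stored value.
        return self.title

    def get_year(self):
        return self.year


book = Book("Example Title", 2016)
print(book.get_title(), book.get_year())
```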
Triggers
Given a set of news stories, your program will generate alerts for a subset of those stories. Stories with alerts will be displayed to the user, and the other stories will be silently discarded. We will represent alerting rules as triggers.
A trigger is a rule that is evaluated over a single news story and may fire to generate an alert. For example, a simple trigger could fire for every news story whose title contained the phrase "Microsoft Office". Another trigger may be set up to fire for all news stories where the description contained the phrase "Boston".
Finally, a more specific trigger could be set up to fire only when a news story contained both the phrases "Microsoft Office" and "Boston" in the description. Essentially a trigger is the filter that determines what is interesting and what is not.
In order to simplify our code, we will use object polymorphism. We will define a trigger interface and then implement a number of different classes that implement that trigger interface in different ways.
Trigger Interface
Each trigger class you define should implement the following interface, either directly or transitively. It must implement the evaluate method that takes a news item (NewsStory object) as an input and returns True if an alert should be generated for that item. We will not directly use the implementation of the Trigger class, which is why it raises an exception should anyone attempt to use it.
The class below implements the Trigger interface (you will not modify this). Any subclass that inherits from it will have an evaluate method. By default, they will use the evaluate method in Trigger, the superclass, unless they define their own evaluate function, which would then be used instead. If some subclass neglects to define its own evaluate() method, calls to it will go to Trigger.evaluate(), which fails (albeit cleanly) with the NotImplementedError:
class Trigger(object):
    def evaluate(self, story):
        """
        Returns True if an alert should be generated
        for the given news item, or False otherwise
        """
        raise NotImplementedError
We will define a number of classes that inherit from Trigger. In the figure below, Trigger is a superclass, from which all other classes inherit. The arrow from PhraseTrigger to Trigger means that PhraseTrigger inherits from Trigger: a PhraseTrigger is a Trigger. Note that other classes inherit from PhraseTrigger.
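The fallback behaviour described above can be seen in a short sketch. AlwaysTrigger and UnfinishedTrigger are hypothetical classes invented for this illustration; they are not part of the problem set's required triggers.

```python
class Trigger(object):
    def evaluate(self, story):
        raise NotImplementedError


class AlwaysTrigger(Trigger):
    """Hypothetical subclass that overrides evaluate."""

    def evaluate(self, story):
        return True


class UnfinishedTrigger(Trigger):
    """Hypothetical subclass that forgets to define evaluate."""
    pass


# The subclass's own evaluate is used when it exists.
print(AlwaysTrigger().evaluate("any story"))

try:
    # No evaluate in the subclass, so the call goes to Trigger.evaluate.
    UnfinishedTrigger().evaluate("any story")
    fell_back = False
except NotImplementedError:
    fell_back = True
print(fell_back)
```

Code that calls evaluate does not need to know which subclass it is holding; that is the polymorphism the rest of this problem set relies on.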
Phrase Triggers
Having a trigger that always fires isn't interesting; let's write some that are interesting! A user may want to be alerted about news items that contain specific phrases. For instance, a simple trigger could fire for every news item whose title contains the phrase "Microsoft Office". In the following problems, you will create a phrase trigger abstract class and implement two classes that implement this phrase trigger. A phrase is one or more words separated by a single space between the words. You may assume that a phrase does not contain any punctuation. Here are some examples of valid phrases:
- "purple cow"
- "PURPLE COW"
- "mOoOoOoO"
- "this is a phrase"
But these are NOT valid phrases:
- "purple cow???" (contains punctuation)
- "purple   cow" (contains multiple spaces between words)
Given some text, the trigger should fire only when each word in the phrase is present in its entirety and appears consecutively in the text, separated only by spaces or punctuation. The trigger should not be case-sensitive. For example, a phrase trigger with the phrase "purple cow" should fire on the following text snippets:
- "PURPLE COW"
- "The purple cow is soft and cuddly."
- "The farmer owns a really PURPLE cow."
- "Purple!!! Cow!!!"
- "purple@#$%cow"
- "Did you see a purple cow?"
Dealing with exclamation marks and other punctuation that appear in the middle of the phrase is a little tricky. For the purpose of your parsing, pretend that a space or any character in string.punctuation is a word separator. If you've never seen string.punctuation before, have a look at its documentation, or go to the Python shell and type:
>>> import string
>>> print(string.punctuation)
Play around with this a bit to get comfortable with what it is. Other functions you might find useful are:
- string.split()
- string.replace()
- string.join()
- string.lower()
- string.upper()
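To get a feel for how these pieces fit together, here is a minimal sketch that turns a text into a list of lowercase words by treating every punctuation character as a separator. The helper name split_into_words is hypothetical, and this is only one possible building block, not the full phrase-matching logic.

```python
import string


def split_into_words(text):
    """Hypothetical helper: lowercase the text and split on spaces/punctuation."""
    for char in string.punctuation:
        # Treat every punctuation character as a word separator.
        text = text.replace(char, " ")
    # split() without arguments collapses runs of whitespace.
    return text.lower().split()


print(split_into_words("Purple!!! Cow!!!"))
print(split_into_words("purple@#$%cow"))
```

Both calls give the same two-word list, which is why normalising the text first makes the later comparison much simpler.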
Create Class Phrase Trigger
💡 Task
Implement a phrase trigger abstract class, PhraseTrigger. In its class constructor it should take in a string "phrase" as an argument. This trigger should not be case-sensitive (it should treat Intel and intel as being equal). This should be familiar to you! Functions such as upper() or lower() are nothing new for you.
PhraseTrigger should be a subclass of Trigger so as to inherit the evaluate method.
It has one new method, is_phrase_in, which takes in one string argument "text". It returns True if the whole phrase self.phrase is present in text, False otherwise, as described in the above examples. This method should not be case-sensitive. Implement this method.
Because this is an abstract class, we will not be directly instantiating any PhraseTrigger. PhraseTrigger should inherit its evaluate method from Trigger. We do this because now we can create subclasses of PhraseTrigger that use its is_phrase_in function.
You are now ready to implement PhraseTrigger's two subclasses: TitleTrigger and DescriptionTrigger.
Create Title Trigger
💡 Task
Implement a phrase trigger subclass, TitleTrigger, that fires when a news item's title contains a given phrase. For example, an instance of this type of trigger could be used to generate an alert whenever the phrase Intel processors occurred in the title of a news item.
As it was in PhraseTrigger, the phrase should be an argument to the class's constructor, and the trigger should not be case-sensitive.
Think carefully about what methods should be defined in TitleTrigger and what methods should be inherited from the superclass. Once you've implemented TitleTrigger, the TitleTrigger unit tests in our test suite should pass. Remember that all subclasses that inherit from the Trigger interface should include a working evaluate method.
If you find that you're not passing the unit tests, keep in mind that FAIL means your code runs but produces the wrong answer, whereas ERROR means that your code crashes due to some error.
Create Description Trigger
💡 Task
Implement a phrase trigger subclass, DescriptionTrigger, that fires when a news item's description contains a given phrase. As it was in PhraseTrigger, the phrase should be an argument to the class's constructor, and the trigger should not be case-sensitive.
Once you've implemented DescriptionTrigger, the DescriptionTrigger unit tests in our test suite should pass. As a hint, your TitleTrigger and DescriptionTrigger should be nearly identical, as they both should use the is_phrase_in method defined in their superclass.
Time Triggers
Let's move on from PhraseTrigger. Now we want to have triggers that are based on when the NewsStory was published, not on its news content. Please check the earlier diagram if you're confused about the inheritance structure of the Triggers in this problem set.
Create Time Trigger
💡 Task
Implement a time trigger abstract class, TimeTrigger, that is a subclass of Trigger. The class's constructor should take in time in EST as a string in the format of "3 Oct 2016 17:00:10". Make sure to convert time from string to a datetime before saving it as an attribute. Some of datetime's methods, strptime and replace, along with an explanation of the string format for time, will come in handy. You can also look at the provided code in the process function for reference. You do not have to implement any other methods.
Because this is an abstract class, we will not be directly instantiating any TimeTrigger.
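As a hedged illustration of strptime and replace, the sketch below parses the example time string. It assumes EST can be modelled as a fixed UTC-5 offset using only the standard library; the provided code may handle time zones with a different helper, so treat the EST constant here as a stand-in.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EST modelled as a fixed UTC-5 offset (standard library only).
EST = timezone(timedelta(hours=-5), "EST")

time_string = "3 Oct 2016 17:00:10"
# %d day of month, %b abbreviated month name, %Y year, %H:%M:%S 24-hour time.
parsed = datetime.strptime(time_string, "%d %b %Y %H:%M:%S")
# replace() returns a new datetime with the time zone attached.
parsed_est = parsed.replace(tzinfo=EST)

print(parsed)
print(parsed_est.isoformat())
```

Note that strptime is strict: extra characters that the format string does not account for, such as a trailing space, raise a ValueError.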
Create Before and After Trigger
💡 Task
Implement BeforeTrigger and AfterTrigger, two subclasses of TimeTrigger.
BeforeTrigger fires when a story is published strictly before the trigger's time, and AfterTrigger fires when a story is published strictly after the trigger's time. Their evaluate methods should not take more than a couple of lines of code. Once you've implemented BeforeTrigger and AfterTrigger, the BeforeAndAfterTrigger unit tests in our test suite should pass.
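Since "strictly before" and "strictly after" map directly onto the < and > operators of datetime, a short sketch may help. It again assumes a fixed UTC-5 stand-in for EST, and it also shows a pitfall worth knowing: Python refuses to order a datetime without time zone information against one that has it.

```python
from datetime import datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5), "EST")  # assumed fixed-offset stand-in

earlier = datetime(2016, 10, 3, 17, 0, 10, tzinfo=EST)
later = earlier + timedelta(seconds=1)

print(earlier < later)    # strictly before
print(earlier < earlier)  # equal times are not "strictly before"

naive = datetime(2016, 10, 3, 17, 0, 10)  # no tzinfo attached
try:
    naive < earlier
    mixed_ok = True
except TypeError:
    # Ordering a naive datetime against an aware one is not allowed.
    mixed_ok = False
print(mixed_ok)
```

If you run into such a TypeError, check whether both datetimes being compared carry time zone information.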
Composite Triggers
So the triggers above are mildly interesting, but we want to do better: we want to "compose" the earlier triggers to set up more powerful alert rules. For instance, we may want to raise an alert only when both "google glass" and "stock" were present in the news item (an idea we can't express with just phrase triggers).
Note that these triggers are not phrase triggers and should not be subclasses of PhraseTrigger. Again, please refer back to the earlier diagram if you're confused about the inheritance structure of Trigger.
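The idea of composing objects can be sketched on a deliberately different, hypothetical example: checks on numbers instead of triggers on news stories. The classes Check, IsEven, IsPositive and Both are invented for this illustration and are not part of the problem set.

```python
class Check(object):
    """Hypothetical interface, analogous in shape to Trigger."""

    def evaluate(self, number):
        raise NotImplementedError


class IsEven(Check):
    def evaluate(self, number):
        return number % 2 == 0


class IsPositive(Check):
    def evaluate(self, number):
        return number > 0


class Both(Check):
    """Composite check built from two other Check objects."""

    def __init__(self, first, second):
        # The wrapped checks arrive through the constructor ...
        self.first = first
        self.second = second

    def evaluate(self, number):
        # ... so evaluate keeps the same one-argument form as every other Check.
        return self.first.evaluate(number) and self.second.evaluate(number)


even_and_positive = Both(IsEven(), IsPositive())
print(even_and_positive.evaluate(4))
print(even_and_positive.evaluate(-4))
```

Because Both is itself a Check, it can be nested inside another composite, which is what makes this pattern powerful.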
Create NOT Trigger
💡 Task
Implement a NOT trigger (NotTrigger).
This trigger should produce its output by inverting the output of another trigger. The NOT trigger should take this other trigger as an argument to its constructor (why its constructor? Because we can't change what parameters evaluate takes in... that'd break our polymorphism). So, given a trigger T and a news item x, the output of the NOT trigger's evaluate method should be equivalent to not T.evaluate(x).
When this is done, the NotTrigger unit tests should pass.
Create AND Trigger
💡 Task
Implement an AND trigger (AndTrigger).
This trigger should take two triggers as arguments to its constructor, and should fire on a news story only if both of the inputted triggers would fire on that item. When this is done, the AndTrigger unit tests should pass.
Create OR Trigger
💡 Task
Implement an OR trigger (OrTrigger).
This trigger should take two triggers as arguments to its constructor, and should fire if either one (or both) of its inputted triggers would fire on that item.
When this is done, the OrTrigger unit tests should pass.
Filtering
At this point, you can run ps5.py, and it will write Google and Yahoo news items into your stories.txt file. How many news items? All of them!
Right now, the code we've given you in ps5.py gets the news from both feeds every minute and displays the result. This is nice, but, remember, the goal here was to filter out only the stories we wanted.
Implement Filter Function
💡 Task
Write a function, filter_stories(stories, triggerlist), that takes in a list of news stories and a list of triggers, and returns a list of only the stories for which a trigger fires.
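The general pattern of "keep an item if at least one rule accepts it" can be sketched with plain functions and words instead of triggers and stories. The names keep_if_any, is_short and starts_with_p are hypothetical and only serve this illustration.

```python
def keep_if_any(items, checks):
    """Hypothetical helper: keep items for which at least one check is true."""
    kept = []
    for item in items:
        # any() stops at the first check that returns True.
        if any(check(item) for check in checks):
            kept.append(item)
    return kept


def is_short(word):
    return len(word) <= 3


def starts_with_p(word):
    return word.startswith("p")


print(keep_if_any(["cow", "purple", "farmer"], [is_short, starts_with_p]))
```

Each item appears at most once in the result, even when several checks accept it.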
After completing Problem 10, you can try running ps5.py, and various RSS news items should pop up, filtered by some hard-coded triggers defined for you in code near the bottom. You may need to change these triggers to reflect what is currently in the news. The code runs an infinite loop, checking the RSS feeds for new stories every 120 seconds.
User-Specified Triggers
Right now, your triggers are specified in your Python code, and to change them, you have to edit your program. This is very user-unfriendly. (Imagine if you had to edit the source code of your web browser every time you wanted to add a bookmark!)
Instead, we want you to read your trigger configuration from a triggers.txt file every time your application starts and use the triggers specified there.
Consider the following example configuration file:
// description trigger named t1
t1,DESCRIPTION,Omicron
// title trigger named t2
t2,TITLE,Covid
// description trigger named t3
t3,DESCRIPTION,variant
// composite trigger named t4
t4,AND,t2,t3
// the trigger list contains t1 and t4
ADD,t1,t4
The example file specifies that four triggers should be created, and that two of those triggers should be added to the trigger list:
- A trigger that fires when the description contains the phrase "Omicron" (t1).
- A trigger that fires when the title contains "Covid" (t2) and the description contains "variant" (t3); this is the composite trigger t4.
The two other triggers (t2 and t3) are created but not added to the trigger list directly. They are used as arguments for the composite AND trigger's definition (t4).
Each line in this file does one of the following:
- is blank
- is a comment (begins with // with no spaces preceding the //)
- defines a named trigger
- adds triggers to the trigger list
Each type of line is described below.
Blank: blank lines are ignored. A line that consists only of whitespace is a blank line.
Comments: Any line that begins with // is ignored.
Trigger definitions: Lines that do not begin with the keyword ADD define named triggers. Elements in these lines are separated by commas. The first element in a trigger definition is the name of the trigger. The name can be any combination of letters/numbers, but it cannot be the word "ADD". The second element of a trigger definition is a keyword (e.g., TITLE, AND, etc.) that specifies the type of trigger being defined. The remaining elements of the definition are the trigger arguments. What arguments are required depends on the trigger type:
- TITLE: one phrase
- DESCRIPTION: one phrase
- AFTER: one correctly formatted time string
- BEFORE: one correctly formatted time string
- NOT: the name of the trigger that will be NOT'd
- AND: the names of the two triggers that will be AND'd
- OR: the names of the two triggers that will be OR'd
Trigger addition: A trigger definition should create a trigger and associate it with a name but should NOT automatically add that trigger to the trigger list. One or more ADD lines in the trigger configuration file will specify which triggers should be in the trigger list. An ADD line begins with the ADD keyword. The elements following ADD are the names of one or more previously defined triggers. The elements in these lines are also separated by commas. These triggers will be added to the trigger list.
Implement Read Trigger Function
💡 Task
Finish implementing read_trigger_config(filename). We've written code to open the file and throw away all blank lines and comments. Your job is to finish the implementation. read_trigger_config should return a list of triggers specified by the configuration file.
Hint: Feel free to define a helper function if you wish! Using a helper function is not required though.
Hint: You will probably find it helpful to use a dictionary where the keys are trigger names.
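The sketch below illustrates the dictionary hint on the example configuration from above. It assumes the lines have already been stripped of blank lines and comments, and it deliberately stores plain (keyword, arguments) tuples rather than building trigger objects; the variable names are hypothetical and that last step is left to you.

```python
# Example configuration lines, with blank lines and comments already removed.
config_lines = [
    "t1,DESCRIPTION,Omicron",
    "t2,TITLE,Covid",
    "t3,DESCRIPTION,variant",
    "t4,AND,t2,t3",
    "ADD,t1,t4",
]

definitions = {}  # trigger name -> (keyword, list of arguments)
added_names = []  # names listed on ADD lines, in order

for line in config_lines:
    fields = line.split(",")
    if fields[0] == "ADD":
        # Everything after the ADD keyword is a trigger name.
        added_names.extend(fields[1:])
    else:
        name, keyword, arguments = fields[0], fields[1], fields[2:]
        definitions[name] = (keyword, arguments)

print(definitions["t4"])
print(added_names)
```

Because composite definitions such as t4 refer to earlier names, looking names up in a dictionary is a natural way to resolve them.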
Once that's done, modify the code within the function main_thread to use the trigger list specified in your configuration file, instead of the one we hard-coded for you:
# TODO: Problem 11
# After implementing read_trigger_config, uncomment this line:
# triggerlist = read_trigger_config('triggers.txt')
After completing Problem 11, you can try running ps5.py, and depending on your triggers.txt file, various RSS news items should pop up. The code runs an infinite loop, checking the RSS feed for new stories every 120 seconds.
Hint: If no stories are popping up, open up triggers.txt and change the triggers to ones that reflect current events (if you don't keep up with the news, just pick some triggers that would fire on the current Yahoo or BBC stories).
Usage
Execute your program per the example below.
$ python ps5.py
Testing
To check your program you can run this line in your terminal.
check50 fau-is/IntroCS/MIT_OOP --local
Execute the below to evaluate the style of your code using style50.
style50 ps5.py
How to submit
Execute the below, logging in with your GitHub username and password when prompted. For security, you'll see asterisks (*) instead of the actual characters in your password.
submit50 fau-is/introcs/MIT_OOP