Yaogang Lian

Muffin Tutorial: Build a Personal Full-Text RSS Reader (Part 2)

In Part 1 we built a working prototype that lists RSS feeds and the recent articles in each feed, and that works well on both desktop and mobile. In this part of the tutorial, we are going to scrape the full text of each article and display it in a Safari Reader-style UI.

Scrape Full Texts

To get the full text of an article, we can follow the article’s link from the RSS feed, fetch the full article page, remove the clutter, and save the text in the datastore. The code changes in this part can be a bit hard to follow, so check the git commits if you run into issues.

We will use Readability to remove the clutter from article pages, so copy its source code to server/vendor. Readability depends on chardet, so copy that into server/vendor too.

Since fetching full article pages can take quite a while, we should put these tasks into a task queue. While we are at it, let’s also refactor UpdateFeedsHandler and use a task queue for fetching and parsing the RSS feeds. These changes make the server app more robust, especially when you have quite a few RSS feeds. The updated aggregator.py looks like this:

from google.appengine.api import urlfetch
from google.appengine.api import taskqueue
from readability.readability import Document
# ...

# Fetch and parse feeds
class UpdateFeedsHandler(webapp.RequestHandler):
    def get(self):
        for feed in Feed.all():
            taskqueue.add(queue_name='update-feeds', url='/aggregator/update-single-feed', params={'id': feed.key().id_or_name()}, method='GET')
        self.response.out.write('update-feeds: done')

# Fetch and parse a single feed
class UpdateSingleFeedHandler(webapp.RequestHandler):
    def get(self):
        feedId = long(self.request.get('id'))
        feed = Feed.get_by_id(feedId)
        res = urlfetch.fetch(feed.link, deadline=10)
        data = feedparser.parse(res.content)
        feedUpdated = data['feed']['updated_parsed']
        feedUpdated = datetime.datetime(*feedUpdated[:6]) # convert `time.struct_time` into a `datetime.datetime` object
        if feedUpdated != feed.updated:
            toSave = []
            toFetch = []
            for entry in data.entries:
                articleUpdated = entry.updated_parsed
                articleUpdated = datetime.datetime(*articleUpdated[:6]) # convert `time.struct_time` into a `datetime.datetime` object
                summary = re.sub(r'<.*?>', '', entry.summary)
                summary = re.sub(r'&nbsp;', '', summary)
                a = Article.all().filter("link =", entry.link).get()
                if a is None:
                    a = Article(title=entry.title, link=entry.link, feed=feed, updated=articleUpdated, summary=summary)
                    toSave.append(a)
                    toFetch.append(a)
                elif a.updated != articleUpdated:
                    a.title = entry.title
                    a.updated = articleUpdated
                    a.summary = summary
                    toSave.append(a)
                    toFetch.append(a)
            feed.updated = feedUpdated
            toSave.append(feed)
            db.put(toSave)

            # Fetch full texts of articles
            for a in toFetch:
                taskqueue.add(queue_name='get-full-article', url='/aggregator/get-full-article', params={'link': a.link, 'id': a.key().id_or_name()}, method='GET')

# Fetch the full article page and save the cleaned-up text
class GetFullArticleHandler(webapp.RequestHandler):
    def get(self):
        articleId = long(self.request.get('id'))
        article = Article.get_by_id(articleId)
        link = self.request.get('link')
        res = urlfetch.fetch(link, deadline=10)
        html = Document(res.content).summary()
        article.body = html
        article.put()
        self.response.out.write('OK')

#
# Application
#
app = webapp.WSGIApplication([
    (r'/aggregator/load-feeds', LoadFeedsHandler),
    (r'/aggregator/update-feeds', UpdateFeedsHandler),
    (r'/aggregator/update-single-feed', UpdateSingleFeedHandler),
    (r'/aggregator/get-full-article', GetFullArticleHandler)
], debug=DEBUG)
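
The two small transformations inside UpdateSingleFeedHandler (cleaning the summary and converting the parsed date) can be tried on their own, outside App Engine. The following is a minimal sketch that uses only the standard library; the function names clean_summary and to_datetime are made up for illustration and are not part of the app.

```python
import datetime
import re
import time


def clean_summary(raw_html):
    """Strip tags and &nbsp; entities, using the same regexes as the handler."""
    text = re.sub(r'<.*?>', '', raw_html)
    return re.sub(r'&nbsp;', '', text)


def to_datetime(parsed):
    """Build a datetime from the first six fields of a time.struct_time."""
    return datetime.datetime(*parsed[:6])


# Hypothetical sample inputs, only for demonstration.
summary = clean_summary('<p>Hello&nbsp;<b>world</b></p>')
updated = to_datetime(time.strptime('2014-03-01 12:30:00', '%Y-%m-%d %H:%M:%S'))
```

Note that replacing &nbsp; with an empty string joins the words on either side of it, so the sample summary above comes out as "Helloworld". If that matters for your feeds, replacing the entity with a space may be a better choice.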

We also need to set up queue.yaml:

queue:
- name: default
  rate: 5/s

- name: update-feeds
  rate: 5/m
  retry_parameters:
    task_retry_limit: 5

- name: get-full-article
  rate: 1/s
  retry_parameters:
    task_retry_limit: 5
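
To get a feel for what task_retry_limit does, here is a rough, stdlib-only simulation. It assumes the limit counts retries after the first attempt, so a task runs at most task_retry_limit + 1 times; the real task queue also waits between retries, which this sketch leaves out. The helpers run_with_retries and make_flaky are hypothetical names, not App Engine APIs.

```python
def run_with_retries(task, retry_limit):
    """Call task() until it succeeds or the retries are used up.

    Assumption: retry_limit counts retries after the first attempt.
    Returns a (succeeded, attempts) tuple.
    """
    attempts = 0
    while attempts <= retry_limit:
        attempts += 1
        try:
            task()
            return True, attempts
        except Exception:
            continue
    return False, attempts


def make_flaky(failures):
    """Build a hypothetical task that fails `failures` times, then succeeds."""
    state = {'left': failures}

    def task():
        if state['left'] > 0:
            state['left'] -= 1
            raise RuntimeError('fetch failed')

    return task


recovered = run_with_retries(make_flaky(2), retry_limit=5)
gave_up = run_with_retries(make_flaky(10), retry_limit=5)
```

Under that assumption, an article page that fails twice is still fetched on the third attempt, while one that keeps failing is dropped after six attempts.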

Now remove all the data from your local datastore using the SDK console, then run load-feeds and update-feeds again. You will see full texts saved in the datastore.



You might have noticed that imageUrl is still empty. This is the thumbnail we will show next to the article summary. Although some RSS feeds include images in the article summary, it’s safer to pick the first image in the full article page after the clutter has been removed. To query the cleaned-up full text, we will use PyQuery, a jQuery-like library for Python. Copy PyQuery’s source code to server/vendor. PyQuery requires cssselect, so copy that to server/vendor too.

Then we just need to add a couple of lines to aggregator.py:

from pyquery import PyQuery

# Inside GetFullArticleHandler, right before "article.body = html"
S = PyQuery(html)
article.imageUrl = S("img:first").attr('src')
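
If you want to experiment with the "first image" idea without installing PyQuery, the same lookup can be approximated with the standard library. This is only a sketch under that assumption: it targets Python 3's html.parser rather than the App Engine runtime, and FirstImageFinder and first_image_url are illustrative names that do not exist in the app.

```python
from html.parser import HTMLParser


class FirstImageFinder(HTMLParser):
    """Remember the src attribute of the first <img> tag encountered."""

    def __init__(self):
        super().__init__()
        self.found = False
        self.src = None

    def handle_starttag(self, tag, attrs):
        # Only the first <img> counts, even if it has no src attribute.
        if tag == 'img' and not self.found:
            self.found = True
            self.src = dict(attrs).get('src')


def first_image_url(html):
    """Return the src of the first image in html, or None if there is none."""
    finder = FirstImageFinder()
    finder.feed(html)
    return finder.src


# Hypothetical cleaned-up article body.
thumb = first_image_url('<div><p>Hi</p><img src="a.png"><img src="b.png"></div>')
```

Articles without any image yield None, so the client should be prepared for an article whose imageUrl is missing.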

Now we have all the fields filled in.



Full-Text Reader

With all these full texts in the datastore, we just need a way to display them nicely. Let’s return to the client side and implement a Safari Reader style UI for reading the full text.

First, generate the view files:

$ muffin generate view ReaderView
21:57:35 [INFO]: * Created client/apps/main/views/ReaderView.coffee
21:57:35 [INFO]: * Created client/apps/main/templates/ReaderView.jade

Add the following to ReaderView.jade:

.reader
  h2.title <%- article.title %>
  <%= article.body %>

We also need to add a lightbox to LayoutView.jade:

#lightbox
  .overlay
  .content

And update ReaderView.coffee to the following:

Backbone = require 'Backbone'

class ReaderView extends Backbone.View
  template: _.tpl(require '../templates/ReaderView.html')

  events: {}

  initialize: (@options) ->
    # Data passed from superview
    @superview = @options.super
    @article = @options.article

    # Render the template
    @$el.html @template({article: @article.toJSON()})

  render: => @

module.exports = ReaderView

To show the ReaderView, we need to add event handlers to ArticleListView:

events:
  'click .article-cell img.article-image': 'showReader'
  'click .article-cell .title a': 'showReader'

showReader: (e) ->
  $current = $(e.target).closest('.article-cell')
  index = $current.index()
  article = @collection.at(index)
  p = new ReaderView {article, super: @, el: $('#lightbox .content')}
  $('#lightbox').fadeIn()
  $('#lightbox .overlay').on 'click', @hideReader

hideReader: (e) ->
  $('#lightbox').fadeOut()
And finally, add these styles to main.less:

#lightbox {
  position: fixed;
  top: 0;
  left: 0;
  bottom: 0;
  right: 0;
  display: none;
  z-index: 99999;

  .overlay {
    position: absolute;
    background: black;
    .opacity(0.8);
    width: 100%;
    height: 100%;
  }

  .content {
    margin: 0 auto;
    width: 62%;
    height: 100%;
  }
}

.reader {
  position: relative;
  height: 100%;
  padding: 0 5em;
  background: white;
  font-family: Georgia, Times, 'Times New Roman', serif;
  .box-shadow(3px 3px 3px 3px #333);
  overflow-y: scroll;
  -webkit-overflow-scrolling: touch;

  h2 {
    font-family: Georgia, Times, 'Times New Roman', serif;
    font-weight: bold;
    font-size: 22px;
    margin-top: 28px;
  }

  img {
    max-width: 100%;
    margin-bottom: 1.6em;
  }

  p {
    font-size: 18px;
    line-height: 1.4;
  }
}

It’s time to see it in action! This is how it looks on desktop:



The main layout also looks much better with thumbnails:



Unfortunately we still have some work to do on mobile:




We can fix the issue with media queries:

#lightbox .content {
  @media (max-width: 1024px) {
    width: 76%;
  }

  @media (max-width: 768px) {
    width: 100%;
  }
}

.reader {
  @media (max-width: 768px) {
    padding: 0 6%;
  }
}
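
To see which width applies at a given screen size, the breakpoints can be written out as a small function. This sketch assumes the media-query rules are placed after the base #lightbox rule in main.less, so that the later rule wins when several match; content_width_percent is a made-up helper used only to illustrate the cascade.

```python
def content_width_percent(viewport_px):
    """Return the width (in percent) of #lightbox .content for a viewport."""
    width = 62              # base rule: width: 62%
    if viewport_px <= 1024:
        width = 76          # @media (max-width: 1024px)
    if viewport_px <= 768:
        width = 100         # @media (max-width: 768px), declared later, so it wins
    return width


# A few sample viewport widths in CSS pixels.
widths = [content_width_percent(px) for px in (1280, 1024, 768, 320)]
```

In other words, wide desktop windows keep the 62% column, tablet-sized viewports get 76%, and anything 768px wide or narrower gets a reader that fills the whole screen.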

Now the app looks much better on iPod Touch:



The app looks better on iPad too:



We’re not done yet. Now that the full-text reader takes over the whole screen on iPod Touch, there is no way to get back to the main screen! To fix that, we are going to add a navigation bar that appears whenever the full-text reader takes over the full screen.

First add the following to the top of ReaderView.jade:

.topbar
  .prev-btn
  .next-btn
  .done-btn Done

In main.less, add styles for the buttons used in the navigation bar:

.prev-btn {
  position: absolute;
  height: 44px;
  width: 44px;
  top: 0;
  left: 8px;
  font-size: 16px;
  background: url(../images/previous-btn.png) center center no-repeat;
  background-size: 44px 44px;
}

.next-btn {
  position: absolute;
  height: 44px;
  width: 44px;
  top: 0;
  left: 50px;
  font-size: 16px;
  background: url(../images/next-btn.png) center center no-repeat;
  background-size: 44px 44px;
}

.done-btn {
  position: absolute;
  line-height: 20px;
  width: 44px;
  top: 8px;
  right: 8px;
  font-size: 13px;
  padding: 3px;
  border: 1px solid gray;
  .border-radius(5px);
  color: #999;
  border-color: #999;
  text-align: center;
}

There are a few other minor changes in main.less. Check git commits for details.

Now we have a navigation bar on mobile devices:



Let’s make those buttons work. Add event handlers to ReaderView.coffee, and use the current article index to decide whether to show or hide the prev and next buttons.

Backbone = require 'Backbone'

class ReaderView extends Backbone.View
  template: _.tpl(require '../templates/ReaderView.html')

  events:
    'click .topbar .prev-btn': 'showPrevArticle'
    'click .topbar .next-btn': 'showNextArticle'
    'click .topbar .done-btn': 'dismiss'

  initialize: (@options) ->
    # Data passed from superview
    @superview = @options.super
    @article = @options.article
    @currentIndex = @superview.collection.indexOf(@article)
    @len = @superview.collection.length
    @render()

  render: =>
    # Render the template
    @$el.html @template({article: @article.toJSON()})

    # Show/hide prev/next buttons
    @$('.topbar .prev-btn').hide()
    @$('.topbar .next-btn').hide()
    if 0 <= @currentIndex - 1 <= @len-1
      @$('.topbar .prev-btn').show()
    if 0 <= @currentIndex + 1 <= @len-1
      @$('.topbar .next-btn').show()
    @

  showPrevArticle: (e) =>
    return false unless 0 <= @currentIndex - 1 <= @len-1
    @currentIndex -= 1
    @article = @superview.collection.at(@currentIndex)
    @render()

  showNextArticle: (e) =>
    return false unless 0 <= @currentIndex + 1 <= @len-1
    @currentIndex += 1
    @article = @superview.collection.at(@currentIndex)
    @render()

  dismiss: =>
    @superview.hideReader()

module.exports = ReaderView

Now we can dismiss the reader view by tapping the “Done” button, and use the “prev” and “next” buttons to jump to other articles.
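
The bounds checks that decide when the prev and next buttons are shown are easy to get wrong by one, so it can help to test them in isolation. The sketch below restates the same chained comparisons in Python; nav_state is a hypothetical helper written for this illustration, not something in the client code.

```python
def nav_state(current_index, length):
    """Return (has_prev, has_next) for the article at current_index."""
    last = length - 1
    has_prev = 0 <= current_index - 1 <= last
    has_next = 0 <= current_index + 1 <= last
    return has_prev, has_next


# Walk through a hypothetical list of three articles.
states = [nav_state(i, 3) for i in range(3)]
```

The first article has only a next button, the last has only a prev button, and a list with a single article shows neither.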

While we are at it, let’s also fix the navigation bar on the main screen. We need to add a “menu” button for mobile devices and a “refresh” button for all devices. Edit LayoutView.jade as below:

.topbar
  .menu-btn.visible-xs
  .title Ars Technica
  .refresh-btn

Add the button styles in main.less:

.title {
  position: absolute;
  top: 0;
  left: 8px;
  font-size: 22px;
  color: #777777;

  @media (max-width: 767px) {
    width: 100%;
    text-align: center;
    font-weight: bold;
  }
}

.menu-btn {
  position: absolute;
  height: 44px;
  width: 44px;
  top: 0;
  left: 8px;
  font-size: 16px;
  background: url(../images/menu-btn.png) center center no-repeat;
  background-size: 44px 44px;
}

.refresh-btn {
  position: absolute;
  height: 44px;
  width: 44px;
  top: 0;
  right: 8px;
  font-size: 16px;
  background: url(../images/refresh-btn.png) center center no-repeat;
  background-size: 44px 44px;
}

Let’s wire up the buttons. Open LayoutView.coffee and make the following edits:

events:
  'click li.feed': 'showArticles'
  'click .topbar .menu-btn': 'showMenu'
  'click .topbar .refresh-btn': 'reloadArticles'
  'click .article-list-view': 'hideMenu'

reloadArticles: =>
  @$('.topbar .refresh-btn').animateRotate(360, 1000, 'linear', null)
  @articleListView.reloadArticles()

showMenu: (e) =>
  @$('aside').removeClass('hidden-xs')
  @$('aside').css
    display: 'block'
    position: 'absolute'
    left: '-220px'
    top: 0
    width: '220px'
    height: '100%'
    background: 'white'
    'z-index': 3
  @$('aside').animate {left: 0}

hideMenu: =>
  return false unless $(window).width() <= 480
  @$('aside').animate {left: '-220px'}, 'normal', ->
    $(this).addClass('hidden-xs')
    $(this).css
      position: 'static'

Again, check git commits for details.

Where we stand

We did great in this part! The app looks better than ever, and the full-text reader is just gorgeous! Here are some more screenshots on iPod Touch and iPad:











In the next part (the last!) of this tutorial, we will tie up some loose ends, optimize the app’s performance, and implement a few more nifty features.

Yaogang Lian

An iOS, Mac and web developer. Focusing on building productivity and educational apps.