Kiritsu: A Humane Content Aggregator
====================================

Motivation
----------

The motivation for this project comes from two separate sources.

The first is the idea that content aggregation should, to whatever
extent possible, be an automated thing.  Despite the fact that RSS has
been around for years, for instance, most users don't know what it is,
and even those who do don't usually go through the additional work of
subscribing to feeds they want to read.  Ideally, a trusted
observer--such as a Mozilla browser--would monitor the browsing habits
of its user and automatically present them with content that it knows
they want, keeping all statistics completely confidential (stored on
the client or in a Weave cloud).  If the user visits Penny Arcade and
xkcd frequently, for instance, the browser can automatically subscribe
to the RSS feeds for those sites and display information about the
latest content from them on, say, the user's start page.  If the user
eventually stops visiting a site, the browser should notice this and
stop subscribing to its feed.  All of this is to say that a user
shouldn't have to even know what RSS or "web syndication" is for it to
help them transparently in a way that doesn't compromise their privacy.

The second motivation for this project is that shortly after joining
Mozilla, I was a bit overloaded with information--a mental state that
some affectionately refer to as "drinking from a fire hose".  While
this was, in some ways, really exciting and invigorating, it
eventually became a bit distracting.  Not only did I have to continue
to consume the information I had taken in before I joined, but I was
also faced with even more sources of it: I now found myself reading
planet.mozilla.org like it was a nervous twitch, checking Mail.app for
my Mozilla email (in addition to checking Gmail for other email), and
constantly checking Mozilla's internal message boards as well.  On top
of that, I had to log on to IRC and instant messenger.

What I really needed was a solution that securely aggregated all of
this information in one place, sorted it by importance, and
compartmentalized different kinds of relevant information into views
that mapped directly to my daily activities (e.g., one view for work,
another for play, another for personal finances/chores).  Furthermore,
I wanted to be able to "turn off" the flow of information so that it
wouldn't distract me and I could get my work done.  Once I was ready
for more information, I could turn the flow back on, consuming it only
when I needed it, rather than having it constantly pushed at me.

In the spirit of this, I decided to call this project "Kiritsu", which
is the Japanese word for "discipline".

Prerequisites
-------------

To use Kiritsu, you currently need:

  * Python 2.5
  * Mark Pilgrim's feedparser module:
    http://www.feedparser.org/

Use
---

Presently, Kiritsu is in a rather nascent and inchoate state, and I
apologize in advance for any frustrations the reader may have in
getting it to work.  In short, this is what has to be done:

  (1) Copy Config.py.sample to Config.py and edit it as
      necessary.
  (2) Copy LocalAuth.py.sample to LocalAuth.py and edit it as
      necessary.
  (3) Run "MakeEverything.py".  If all goes well, a static HTML file
      will be created for each view--e.g., "work.html" is the page
      for the "Work" view.

Implementation
--------------

Kiritsu operates entirely on RSS and Atom feeds; while other sources,
such as IMAP, XMPP, IRC, and so forth, are (or will be) supported by
the framework, they are internally converted into RSS feeds and then
processed as such.  My intent was to use pre-existing standards as
much as possible.
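
To make the conversion step concrete, here is a minimal sketch of how
items from a non-feed source (for instance, messages fetched over
IMAP) might be turned into an RSS 2.0 document.  It is not Kiritsu's
actual code: the function name "items_to_rss", the dictionary keys,
and any URLs passed in are assumptions made for illustration, and it
is written against the Python 3 standard library rather than the
Python 2.5 listed under Prerequisites.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def items_to_rss(title, link, description, items):
    """Build an RSS 2.0 document (as a string) from plain dictionaries.

    Each item is a dict with "title", "link", "description" and a
    "timestamp" (seconds since the epoch).  These field names are
    assumptions made for this sketch, not Kiritsu's internal format.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires these three channel-level elements.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        ET.SubElement(node, "description").text = item["description"]
        # RSS 2.0 expects RFC 822 dates; formatdate() produces one.
        ET.SubElement(node, "pubDate").text = formatdate(
            item["timestamp"], usegmt=True
        )
    return ET.tostring(rss, encoding="unicode")
```

The resulting string can be handed to any feed parser, so the rest of
the pipeline only ever has to understand feeds.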

Improvements
------------

As can plainly be seen, Kiritsu could use a lot of improvement,
especially given all the features mentioned in the "Motivation"
section of this document.  In particular:

  * Kiritsu should eventually be able to automatically figure out what
    feeds the end-user wants to read, to as great an extent as
    possible.  One potentially easy way to do this may be to create a
    simple Firefox extension that integrates with MeeTimer
    (http://getmeetimer.com/) to determine what the end-user is most
    frequently reading, and when they're reading it.  Then it should
    at least be possible to offer different views of information based
    on what time of day (or day of the week) it is.  There's obviously
    lots of room for machine learning algorithms here; for example,
    Bayesian filtering could be used to determine what kinds of
    articles within a feed the user finds interesting, and only
    present them with similar articles.  The general goal is to
    prevent information overload, as opposed to presenting the user
    with more information.

  * Support for more information sources needs to be added, such as
    XMPP and IRC.
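
As a rough illustration of the Bayesian filtering idea in the first
item above, the following sketch trains a tiny naive Bayes classifier
on article titles the user has marked as interesting or boring.
Nothing like this exists in Kiritsu yet: the class name
"InterestFilter", the two labels, and the choice to look only at
titles are all assumptions made for this example, and it targets
Python 3.

```python
import math
import re
from collections import Counter


class InterestFilter:
    """Tiny naive Bayes classifier over article titles (hypothetical)."""

    LABELS = ("interesting", "boring")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    @staticmethod
    def tokenize(text):
        # Lowercase and split into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def score(self, text, label):
        """Return log P(label) + sum of log P(word | label)."""
        vocabulary = set()
        for counts in self.word_counts.values():
            vocabulary.update(counts)
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        # Add-one smoothing keeps an untrained label from yielding log(0).
        result = math.log(
            (self.doc_counts[label] + 1.0) / (total_docs + len(self.LABELS))
        )
        # The extra +1 in the denominator reserves mass for unseen words.
        denominator = total_words + len(vocabulary) + 1.0
        for word in self.tokenize(text):
            result += math.log((self.word_counts[label][word] + 1.0) / denominator)
        return result

    def is_interesting(self, text):
        return self.score(text, "interesting") > self.score(text, "boring")
```

A view could then call "train" each time the user opens or dismisses
an article, and use "is_interesting" to decide which new entries are
worth showing at all.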

Other Considerations
--------------------

It should be noted that there are some interface features that are
traditionally considered to be humane which have intentionally been
left out of Kiritsu.  For example, Humanized History [1] is humane
because it doesn't require the user to think about navigation.
Kiritsu, on the other hand, intentionally displays a very limited
amount of information to the user that should comfortably fit on one
page.  The intent is to give the reader a definitive stopping point at
which they can stop consuming information and go back to doing
whatever it is they need to do.

[1] http://www.humanized.com/weblog/2006/04/28/reading_humanized/
author	Atul Varma <varmaa@toolness.com>
date	Fri, 18 Apr 2008 17:13:02 -0700