This is a poorly-organized list of the features I would like to implement for HURL. It's mostly for my own use, to keep track of what's implemented and what's not, and to make sure ideas are written down somewhere so I don't forget about them.
Items with a ``'' next to them are already implemented; items with a ``'' next to them aren't.
You can also see what's new in this document.
Should be a button to show the original
(unformatted) article.
Browsing icons:
Currently these icons are individual inline images, which makes loading them slow (but only the first time with browsers like Mosaic that cache images); would it be better as one long icon with image mapping? I like the individual icons because if they point to an article you've already seen, you can tell from the border that Mosaic puts on the link; also, this allows for `dimmed' icons when an article is unavailable--this isn't really useful for next/previous by date or author, but for thread and selection browsing it will be useful. For example, if you were using image mapping and clicked on ``next in thread'' and there was no next article in that thread, you'd get an annoying error message. Also, using image mapping will impose a higher load on the server machine (although it shouldn't be much higher).
Should be able to `mark' an article and save a list of marked articles
somewhere: either store the list on the archive (with a size limit) or
download it to add to your home page. This could be used to define a
selection of articles which could later be viewed with the supplied
message-ID list feature.
Also, it would be nice if there was some way to `push' message-IDs onto a list (stored on the server) and `pop' them off again; maybe a different list than the one you're using for marking (or `n' lists, along with a ``select current list'' feature). This would be used to remember your current location when you get distracted and read a bunch of articles that take you away from the ones you had initially wanted to read. This happens all the time: for instance, you enter a query and read some of the returned articles, then see an author you like and start scrolling through his or her posts, then get tired of them and want to resume your position in the query-browsing. Before you digress, you could `push' the current message onto a stack, then click on `pop' to return to that position later. Currently the only way to return to a previous location is to use your browser to `back' out of all the extra messages you've read.
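A minimal sketch of the push/pop idea, assuming the server keeps a small set of named stacks per user. The class and method names here are hypothetical; nothing like this exists in HURL yet.

```python
class LocationStacks:
    """Named stacks of message-IDs for one user (hypothetical server-side store)."""

    def __init__(self):
        self._stacks = {}          # list name -> list of message-IDs
        self.current = "default"   # the ``select current list'' feature

    def select(self, name):
        """Make `name` the list that push and pop operate on."""
        self.current = name

    def push(self, message_id):
        """Remember the article you are about to wander away from."""
        self._stacks.setdefault(self.current, []).append(message_id)

    def pop(self):
        """Return the most recently pushed message-ID, or None if the list is empty."""
        stack = self._stacks.get(self.current, [])
        return stack.pop() if stack else None
```

With this, the browsing interface would call push() before a digression and jump to whatever pop() returns afterwards, instead of relying on the browser's `back' button.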
A button to hide or show buttons to extra
features (that is, the features that don't normally merit a single click
from the article browsing interface).
A button to apply a certain filter to this article (for example, to add
links pointing to a dictionary on the Web for each word in the current
article). Currently this and other filters are implemented as individual
icons, but this will eventually be changed to a generic ``apply a
filter...'' button which brings up a list of 10-20 filters that can be
applied.
A button to view the thread's structure.
A button to rot13-decode the current article (implemented, but not
documented anywhere; append ``&filter=rot13'' to a message-ID? query).
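To illustrate how a generic ``apply a filter...'' feature might dispatch on the filter name, here is a rough sketch. The FILTERS table and apply_filter are hypothetical names; only the ``rot13'' filter name comes from the notes above, and this is not how HURL actually does it.

```python
import codecs


def rot13_filter(body):
    """Rot13 is its own inverse, so this both encodes and decodes."""
    return codecs.decode(body, "rot_13")


# Hypothetical registry: filter name (as in ``&filter=rot13'') -> function.
FILTERS = {
    "rot13": rot13_filter,
}


def apply_filter(name, body):
    """Apply the named filter, or return the body unchanged if it is unknown."""
    return FILTERS.get(name, lambda text: text)(body)
```

New filters (a dictionary linker, say) would then just be more entries in the table.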
Preferences (customizable features):
Mail to an author should be sent to the author's most recently used
e-mail address.
If the author has not been active for a defined
period of time (a year?), a warning should be issued.
This should be made intelligent enough so that
it checks the frequency of use of recently-used e-mail addresses, then sends
mail to the one that seems to be used the most often.
Alternatively, this could be something authors can define themselves.
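One possible reading of the ``intelligent'' address choice, sketched under some assumptions: each post is a (date, address) pair, dates are any sortable values such as ISO date strings, and ``recently-used'' means the last `recent` posts. The function name and the cutoff are made up for illustration.

```python
from collections import Counter


def pick_address(posts, recent=50):
    """Pick the address used most often among the author's latest posts.

    posts: list of (date, address) pairs. Ties go to the address that
    was used most recently. Returns None if there are no posts.
    """
    if not posts:
        return None
    latest = sorted(posts)[-recent:]                 # oldest first
    counts = Counter(address for _, address in latest)
    best = max(counts.values())
    for _, address in reversed(latest):              # newest first
        if counts[address] == best:
            return address
```

Setting recent=1 falls back to the simpler ``most recently-used address'' rule.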
Should be able to search for a large variety of
things, mostly based on information found in the headers of articles.
Good lists of these things can be found here
and here. Queries I think we should implement:
Searches on specific header lines such as
Keywords or Organization (defined at installation, depending on the amount
of disk space available for indexes).
Searches for a certain `rating', as registered
by our voting software.
Searches within the body of articles should
only be allowed after the search has been narrowed to a defined level
(e.g., 100 articles), to reduce the load on the server machine. (No
longer true; HURL will support full-text searches Real Soon Now.)
Return a random selection of articles from the
defined search (e.g., ``give me a random 1% of Gerald's articles'').
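The random-selection query could be as simple as the following sketch, which assumes the search has already produced a list of message-IDs. The function name and the rule of always returning at least one article are my own assumptions.

```python
import random


def random_selection(message_ids, percent, rng=random):
    """Return a random `percent` of the given message-IDs, without repeats."""
    if not message_ids:
        return []
    count = max(1, round(len(message_ids) * percent / 100))
    return rng.sample(message_ids, count)
```

Passing a seeded random.Random as `rng` makes the selection repeatable, which might matter if the result is meant to be bookmarked.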
Most queries should be possible using a regular
URL instead of just a form, so people can make a pointer to query the archive
directly. For example, people can put ``click here to see a list of the top
twenty articles I posted to talk.bizarre, as rated by the voting software''
in their home pages (although ``click here'' is bad document design).
This will be like Mosaic's news interface, which shows 20 articles,
with a link at the top that says ``Earlier articles...'' and one at the
bottom that says ``Later articles...''
The user should be able to set the number of
articles that are displayed on a single page (implemented, but not currently
documented anywhere; append ``&max=100'' or whatever to the browse? URL.)
The format of the list should be customizable,
with intelligent defaults. For example, if you do a search for all of a certain
author's articles, you'd probably like to see a list of Date and Subject.
However, the user should be able to specify a certain return format.
The script will point to a temporary file of
message-IDs that resides on the server machine, in a directory that gets
purged every week or whatever.
The user should be able to supply a URL pointing to a list of message-IDs
on their own machine, and have this list formatted and presented by the
archive itself. This would be useful for people who want to make a pointer
to a list of their favorite articles from their own home pages.
People who have used multiple e-mail addresses
should be grouped together for their author pages, queries, etc.
Should make a submission program that lets people
submit this kind of information about themselves, which will automatically get
included in the next build.
Might be able to build a preliminary database
of this kind of stuff by looking at people's names (e.g., gerald@vnet.ibm.com
(Gerald Oskoboiny) is most likely the same person as gerald@amisk.cs.ualberta.ca
(Gerald Oskoboiny) for a particular newsgroup).
It would be convenient to assume that
someone@machine.network.hostname.com is someone@*.hostname.com, but is that
a valid assumption? If not, how about someone@*.network.hostname.com?
(This is important because of things like user@netcom11.netcom.com vs.
user@netcom8.netcom.com, gerald@amisk.cs.ualberta.ca vs.
gerald@gsb008.cs.ualberta.ca, etc.)
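The wildcard idea could be tried out with something like this sketch. It does not answer whether the assumption is valid; it only shows the two variants side by side. The function name and the keep_labels parameter are hypothetical.

```python
def address_key(address, keep_labels=2):
    """Collapse the machine part of an address into a wildcard.

    keep_labels=2 turns someone@machine.network.hostname.com into
    someone@*.hostname.com; keep_labels=3 gives the more cautious
    someone@*.network.hostname.com. Addresses whose host is already
    that short are returned unchanged (apart from lowercasing).
    """
    user, _, host = address.lower().partition("@")
    labels = host.split(".")
    if len(labels) <= keep_labels:
        return user + "@" + host
    return user + "@*." + ".".join(labels[-keep_labels:])
```

Two addresses would then be grouped under one author when their keys match, e.g. user@netcom11.netcom.com and user@netcom8.netcom.com both give user@*.netcom.com.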
When looking at any article, the user should be
able to jump to articles in the same thread, as defined by the ``References:''
header lines.
A complex problem: we have some good advice on this from Wayne Davison,
graciously forwarded to bizarchive by Pope Clifton.
Should also be able to see a tree of the thread:
Should also have ``next/previous in this
thread'' buttons. Someone once mentioned that next/previous might not make
sense in complex thread structures, but I think if a tree is traversed
depth-first, with the replies to each article sorted by date, it would be
a logical way to view a thread.
Could hopefully steal code from trn for this,
maybe from other places.
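Here is a rough sketch of the depth-first, sorted-by-date ordering, assuming each article records a date and at most one parent (the last entry of its ``References:'' line). This is not trn's algorithm, just the simplest thing that matches the description; all names are hypothetical.

```python
def thread_order(articles):
    """Return message-IDs in depth-first order, replies sorted by date.

    articles: dict of message_id -> (date, parent_id or None).
    An article whose parent is not in the dict is treated as a root.
    Articles only reachable through a circular chain of parents are
    never visited, so they are left out of the result.
    """
    children = {}
    roots = []
    for message_id, (_, parent) in articles.items():
        if parent in articles:
            children.setdefault(parent, []).append(message_id)
        else:
            roots.append(message_id)

    def by_date(message_id):
        return articles[message_id][0]

    order = []
    # The stack is popped from the end, so push the newest first.
    stack = sorted(roots, key=by_date, reverse=True)
    while stack:
        message_id = stack.pop()
        order.append(message_id)
        stack.extend(sorted(children.get(message_id, []), key=by_date, reverse=True))
    return order


def neighbours(order, message_id):
    """Return the (previous, next) message-IDs for the thread buttons."""
    i = order.index(message_id)
    previous_id = order[i - 1] if i > 0 else None
    next_id = order[i + 1] if i + 1 < len(order) else None
    return previous_id, next_id
```

When there is no previous or next article, neighbours() returns None for it, which is where a `dimmed' icon would be shown.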
Should automatically recognize all e-mail
addresses and message-ID references throughout the article, and make a link
to the relevant place in the archive.
This requires a lot of pre-computing, to check if each possible article reference is in the archive. However, it ensures that all links will be resolved, since articles that are not in the archive won't get a link. Otherwise, we have the ``article not found'' syndrome.
This is particularly important to groups like talk.bizarre, which has a lot of stuff crossposted from other groups (whose original articles will not be in the archive). Possibly this can be an option for the installers: whether or not to assume that all article references will be resolved.
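A small sketch of linking only those references that resolve, assuming the set of message-IDs in the archive has been computed in advance. The regular expression is a loose approximation of a message-ID, and the URL prefix is a placeholder rather than HURL's real query format.

```python
import re
from html import escape
from urllib.parse import quote

# Loose pattern for <local-part@host>; an assumption, not a full parser.
MESSAGE_ID = re.compile(r"<[^<>@\s]+@[^<>@\s]+>")


def link_message_ids(body, archive_ids, url_prefix="/message-ID?"):
    """Return the body as HTML, linking only references found in the archive.

    archive_ids: set of message-IDs, angle brackets included.
    url_prefix: hypothetical placeholder for the archive's lookup URL.
    """
    pieces = []
    position = 0
    for match in MESSAGE_ID.finditer(body):
        pieces.append(escape(body[position:match.start()]))
        message_id = match.group(0)
        text = escape(message_id)
        if message_id in archive_ids:
            target = url_prefix + quote(message_id[1:-1])
            pieces.append('<a href="%s">%s</a>' % (target, text))
        else:
            pieces.append(text)      # unresolved: leave as plain text
        position = match.end()
    pieces.append(escape(body[position:]))
    return "".join(pieces)
```

The installer option mentioned above would amount to skipping the archive_ids check and linking every match.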
Should automatically put links on any URLs
within articles.
Should (?) do keyword replacement, for example
to add features such as acronym expansion, sound effects (e.g., *plonk*),
etc. We could get really carried away with this, if we wanted to.
We could define other filters, such as:
Should have some way to make posts anonymous; i.e., strip anything away
that identifies who the author was.
Should only allow people to vote on articles assigned to them at random.
(So you can't bring up all your articles with a query, then vote on them).
Should have a decreasing probability of presenting an article that is
known to be crappy. For example, if, after receiving 20 votes, the average
vote on a scale of 1 to 10 is less than 2, we can probably assume it will
not significantly improve, so we should start to bring this article up for
voting less often, to make the voting experience more enjoyable. We shouldn't
exclude the article from being voted upon forever, of course, because it may
redeem itself somehow in the future (maybe it was based on a sensitive
subject that will wear off or something?); the probability of it coming up
again should just decrease, that's all. I don't know what sort of formula
to use for this... probably just linear, but maybe exponential. This sounds
messy, but will be easy to write.
Votes will be tabulated daily (?), then they can be used as a criterion
of a search query.
For initial testing, we can use the votes that PV's been collecting for
the last few months.
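For the decreasing-probability idea in the voting notes above, here is one way the exponential variant might look. The thresholds of 20 votes and an average below 2 come from the notes; the half-life, the floor, and the function names are assumptions picked for illustration.

```python
import random


def presentation_weight(votes, min_votes=20, bad_mean=2.0, half_life=20, floor=0.01):
    """Relative chance of an article being offered for voting.

    Articles with fewer than min_votes votes, or an average of at least
    bad_mean, keep full weight 1.0. Otherwise the weight is halved at
    min_votes and again for every further half_life votes, but it never
    drops below floor, so no article is excluded forever.
    """
    if len(votes) < min_votes:
        return 1.0
    if sum(votes) / len(votes) >= bad_mean:
        return 1.0
    extra = len(votes) - min_votes
    return max(floor, 0.5 ** (1 + extra / half_life))


def pick_article_for_voting(votes_by_id, rng=random):
    """Choose one message-ID at random, weighted by presentation_weight."""
    ids = list(votes_by_id)
    weights = [presentation_weight(votes_by_id[i]) for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]
```

A linear version would only change the last line of presentation_weight, e.g. subtracting a fixed amount per extra vote instead of halving.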
Not much to say here, just need to have good
help for each page (article pages, author pages, the query page(!)).
Might also be fun to implement
context-sensitive online help: when you click on the CSOH button, it reloads
the current screen with each component replaced with a link to help on that
feature. This would be easy to implement, but reloading the whole page might
make it too slow to be useful.
The main reason I want to do this is so I can
make excuses about lame articles from my past, saying why I posted what I did,
explaining jokes that Didn't Quite Work, etc. It might also come in handy for
some other purposes, but if they're too handy, people might start using them
to discuss things...
There will have to be a limit on the amount
of text that can be contributed, to save disk space. Might also want to
escape '<', '>', and '&' characters so people can't put HTML links in.
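The size limit and the escaping could be combined in one small step, roughly like this. The 2000-character limit and the function name are placeholders, not decisions.

```python
from html import escape

MAX_ANNOTATION_CHARS = 2000   # hypothetical limit, to be tuned to disk space


def clean_annotation(text, limit=MAX_ANNOTATION_CHARS):
    """Truncate contributed text, then escape '<', '>' and '&'.

    Truncating before escaping means an entity such as &amp; is never
    cut in half at the limit.
    """
    return escape(text[:limit], quote=False)
```

With quote=False, quotation marks are left alone, which is fine as long as the text is only ever placed in the body of a page and never inside an attribute.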
This could be generalized so that any arbitrary list of message-IDs becomes just another cached search item. (For instance, lists for each author, etc.)
Images that were generated on-the-fly.
Some possible exceptions are: weird thread
structures (e.g., jfw's circular thread), bad header data such as duplicate
message-IDs within an archive of articles (this will definitely happen),
forged articles (we should flag them as such).
For bad thread structures, could just say
``no thread information is available''.
For bad headers, forged articles, etc., we could
add information to the archive about exceptions, without changing the original
article. For a duplicate message-ID, we could replace the Message-ID line in the
header with a new, unique message-ID (for convenience), then make a note that
we've done this somewhere else... possibly by adding an extra header line such
as ``X-HURL-Info: old-message-ID: stupid-duplicate-message-ID@hostname.com''.
Ugly, but manageable.
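The duplicate message-ID fix might look something like the sketch below, assuming each article's headers are available as a dict and message-IDs are stored with their angle brackets. The ``dupN.'' naming scheme is invented here; only the X-HURL-Info line follows the example above.

```python
def replace_duplicate_ids(articles):
    """Give later duplicates a new unique Message-ID and note the old one.

    articles: list of header dicts, each with a 'Message-ID' entry.
    The first article with a given ID keeps it; the dicts are modified
    in place and the list is returned for convenience.
    """
    taken = {headers["Message-ID"] for headers in articles}
    seen = set()
    for headers in articles:
        old_id = headers["Message-ID"]
        if old_id not in seen:
            seen.add(old_id)
            continue
        number = 1
        while True:
            # Hypothetical scheme: <x@host> becomes <dup1.x@host>, <dup2.x@host>, ...
            new_id = "<dup%d.%s" % (number, old_id[1:])
            if new_id not in taken:
                break
            number += 1
        taken.add(new_id)
        headers["Message-ID"] = new_id
        headers["X-HURL-Info"] = "old-message-ID: " + old_id[1:-1]
    return articles
```

This would run on the archive's index records at build time, so the original article files stay untouched.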