Re: HURL-like project at sourceXchange

<disclaimer>
I'm not a tax accountant, attorney or immigration expert,
but I think there must be some way around this slight obstacle
with your work visa.  If you get booted implementing this
suggestion, remember this limitation of liability statement.
</disclaimer>

Go offshore, young man: incorporate in some tax haven like the Cayman
Islands (where I'm planning on co-locating my online gambling casino).
Get a domain name registered there, web site hosting on a server
there, a PO box (if needed), and a bank account.  Open an E-trade
account (or one with someone who can handle offshore investors online)
and play the market with the money, remembering you're allowed to bring
something like up to $20k/year cash into the US and you won't pay any
capital gains on your winnings.  If Uncle Sam ever gets really pissed,
so what, you're not a citizen anyway.

Put the code there, even finish writing it while ssh'd to the box
and/or take a vacation there with laptop and have a coding frenzy on
the beach while watching bikinis and drinking rum.  The US can't tax you
for work done outside the US; make enough grey area so they can't prove
it wasn't done here (add as many double negatives and rum drinks as
needed until this makes sense).

You could probably find a perl2java converter to get started if you
need to port to Java to collect.  If you want help turning it into Java
servlets, let me know.  I can afford some freelance time, having just
turned down a $150/hour moonlighting job yesterday because I know the
client is a total pain in the ass.  Servlets are by far my favorite web
platform: speed parallels or surpasses mod_perl, and I like staying as
O-O in design as long as possible (until the presentation layer, where
sometimes there are diminishing returns), etc.  Perl is a favorite for
serious text mangling with regexes; Java is lacking there without
someone else's packages like Oroinc's.  I could use a week in the
Caribbean, having just shoveled rain-soaked snow at 40 pounds a
shovelful out of my driveway.  Just pay for my airfare and the rum,
keep the $15k.
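
For what it's worth, the index the RFP quoted below asks for (byte
offsets and lengths for each message in the mbox file) is not much code
in Java.  Here is a minimal sketch, assuming the simplified rule that
any line starting with "From " begins a new message.  The names
MboxIndex, Entry and build are made up for illustration, and real mbox
variants need more care than this:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: records where each message starts in an mbox byte stream. */
final class MboxIndex {

    /** One index entry: byte offset of the "From " separator line and message length in bytes. */
    static final class Entry {
        final long offset;
        final long length;

        Entry(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    private static final byte[] SEPARATOR = "From ".getBytes(StandardCharsets.US_ASCII);

    /** Scans the whole archive once and returns one entry per message, in file order. */
    static List<Entry> build(byte[] mbox) {
        // Pass 1: collect the offset of every line that begins with "From ".
        List<Long> starts = new ArrayList<>();
        for (int i = 0; i < mbox.length; i++) {
            boolean atLineStart = (i == 0) || mbox[i - 1] == '\n';
            if (atLineStart && startsWith(mbox, i, SEPARATOR)) {
                starts.add((long) i);
            }
        }
        // Pass 2: each message runs until the next separator (or end of file).
        List<Entry> entries = new ArrayList<>();
        for (int k = 0; k < starts.size(); k++) {
            long begin = starts.get(k);
            long end = (k + 1 < starts.size()) ? starts.get(k + 1) : mbox.length;
            entries.add(new Entry(begin, end - begin));
        }
        return entries;
    }

    private static boolean startsWith(byte[] data, int pos, byte[] prefix) {
        if (pos + prefix.length > data.length) {
            return false;
        }
        for (int j = 0; j < prefix.length; j++) {
            if (data[pos + j] != prefix[j]) {
                return false;
            }
        }
        return true;
    }
}
```

Rebuilding a lost or stale index is then just a matter of running
build() over the archive again.  Body lines quoted as ">From " are
skipped because the ">" sits between the newline and the "F".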

>From: Gerald Oskoboiny <gerald@impressive.net>
>
>sourceXchange is a site that links open source software developers
>with paying development projects. The blurb on their home page says:
>
>    sourceXchange is a new forum linking Open Source developers
>    worldwide with intensifying commercial interest in Open
>    Source software.  Call it one version of the Bazaar:
>    sourceXchange is a wide-open marketplace to facilitate a
>    dynamic exchange between buyers and sellers, a place where
>    highly skilled Open Source developers supply their expertise
>    to committed buyers with well-defined, financially backed
>    Open Source projects. -- http://www.sourcexchange.com/
>
>I signed up for an account there a while ago out of general
>interest in the open source community; I don't have much interest
>in finding extra work or whatever.
>
>On Friday I got mail from them titled "new RFPs on sourceXchange",
>and one of the projects (included below) is basically identical
>to what my HURL software aspires to be, and pays $15k USD!
>
>HURL is something I wrote in 1994 for putting news/mail archives
>online:
>
>    http://impressive.net/software/hurl/
>
>I recently started rewriting it from scratch, and the new version
>is used in the current fogo list archives:
>
>    http://impressive.net/archives/fogo/
>
>It still needs a fair bit of work, but I don't think it would
>take more than a month of full-time work to meet the needs of
>this proposal (add mbox support, MIME support, finish other stuff),
>assuming I could convince them that it's okay to do it in Perl.
>(Part-time that could be anywhere from 2-6 months depending how
>much my social life is allowed to suffer.)
>
>I don't think W3C or MIT would have a huge problem with my
>working on something else on the side, but it might be a pain to
>get permission from the INS to have income from something besides
>the job for which I was granted my visa. (I have no idea how much
>of a pain that would be -- maybe I'll make some calls this week.)
>
>I really don't want any extra work right now as I hope to travel
>a fair bit this year, but I was planning to write this code in my
>spare time anyway so it seems stupid not to get $15k for it. Hmm.
>
>Here are the project details as proposed:
>
>http://www.sourceXchange.com/RfpBrowse?Button=Details&rfpID=18
>
>> Project Title: Java servlet- and WebMacro-based browser for Unix
>> mbox-format mail archives
>>
>>       RFP:  18
>>   Sponsor:  Collab.Net
>>   Project:  Java servlet- and WebMacro-based browser for Unix
>>             mbox-format mail archives
>>    Skills:  Java servlets, WebMacro, mbox format files
>>    Detail:  Background:
>>
>> There is no decent Web-based mail archive browser out there, at
>> least none I've seen, and definitely none java servlet based.
>> Hypermail is great, but it has three serious drawbacks as do most
>> of the others:
>>
>>  1. the format of the HTML being presented is hardcoded at the time
>>     the processing is done
>>
>>  2. every message is a separate file - nice for speed, but horrible if
>>     we're talking about millions of messages and a potential shortage of
>>     inodes.  The extra seek() is worth the optimization of disk space and
>>     filesystem structure.
>>
>>  3. scalability - lots of manual or scripted work is usually necessary
>>     to make Hypermail and other tools work for the volume suggested
>>     above.
>>
>>
>> I made a small feeble stab at implementing something like this in
>> perl a while ago - see: http://www.apache.org/~brian/glob/
>>
>> There's no documentation there, but you will see an "indexer.pl"
>> which I've used to index the messages from the .mbx files they
>> were in, and then I used "AddHandler" in Apache to establish
>> mbxhandler.cgi as the "handler" for all .mbx files.  The
>> "handler" model would be ideal, rather than forcing people to
>> access URLs under a /servlet/ hierarchy, but that complexity can
>> possibly be hidden using mod_rewrite or even mod_alias.
>>
>> Objective:
>>
>>   Develop a combination of WebMacro templates and Java classes that
>> act as an interface to simple Unix mbox-format mailbox archives for
>> the purposes of rendering them as HTML and attachments.
>>
>> Scope of Work:
>>
>>   Develop a combination of WebMacro templates and Java classes that
>> act as an interface to simple Unix mbox-format mailbox archives for
>> the purposes of rendering them as HTML and attachments.  The code
>> should be tested against a very popular and diverse set of mailing
>> lists.  The software must be capable of interpreting MIME attachments
>> and rendering them as attachments in the message.  The messages in the
>> archive need to be viewable by chron order, by thread, and by author;
>> no need for threading or complex tree-like viewing is necessary, but
>> the ability to step through the list of messages in bunches at a time
>> is desired so that # of messages is never a problem.
>>
>>   The software must protect against any form of "malicious" code
>> snippets by character-escaping all outsider-contributed portions of
>> the resulting HTML pages.  The original message store must be and must
>> remain mbox-format mail archives; an accompanying simple DB file to
>> assist with the indexing is expected as well, to store things like
>> byte offsets for beginnings and lengths of each message, the essential
>> header info like subject/from/date, and thread references. Also
>> implicit in this is that the software must be able to (re)build the
>> indexes from the mail archives.  Not just the updating mentioned
>> below, but the ability to parse through the archive and rebuild the
>> index if it gets lost/corrupted/stale.
>>
>>   This DB would most likely be read into persistent memory by the
>> servlet to avoid having to hit the disk for every entry, with a search
>> to disk for cache misses (i.e., presume data doesn't change, it's just
>> added). It is foreseen that there would be an archiving "script"
>> invoked upon mail delivery that would insert the mail message into the
>> mbox archive and update the archive's index.
>>
>>   In addition, the ability to view a directory of such mbox-format
>> files (organized ideally by /year/month) and navigate between them
>> seamlessly is strongly desired.  This software must be usable at the
>> level of 20K messages per mbox file and an mbox file per month to
>> cover an arbitrary number of years.  Since the interface is
>> template-driven, there should be a per-list configuration file giving
>> some simple intro information and perhaps even allow for some simple
>> look & feel modifications per-list.
>>
>>   Finally, integration with a common Open Source search engine,
>> especially Swish-E, for cross-archive searching, is desired.
>>
>> NOTE: The proposed design may leave lots to be desired.  If you feel
>> that it should be different, feel free to indicate that in your
>> proposal.
>>
>> Remember what it is that we think are shortcomings in the other tools.
>> Yes, we realize that this design requires reparsing of MIME objects
>> during serving instead of at indexing time; that's an acceptable
>> tradeoff, but a scalable way of caching that would be interesting as
>> well.
>>
>>  Deliverables:  Implementation plan
>>                 WebMacro templates
>>                 Java sourcecode
>>                 Indexer script
>>                 Documentation on how to set up and use this toolset
>>
>>    Milestones:  Please propose a set of milestones you feel is
>>                 realistic.  Please include in there steps for
>>                 community feedback and followup development based
>>                 on that feedback.
>>
>>     Cash Compensation:  $15000
>> Non-Cash Compensation:  none
>>       Cash Equivalent:  $0
>>           Review Date:  2000-02-17
>>               License:  BSD
>
>
>--
>Gerald Oskoboiny <gerald@impressive.net>
>http://impressive.net/people/gerald/
>
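
The character-escaping requirement in the RFP above could be sketched
roughly like this.  HtmlEscaper and escape are hypothetical names, not
part of WebMacro or the servlet API, and the sketch only covers the
five characters that are special in HTML text and attribute values:

```java
/** Hypothetical sketch: escapes untrusted message text before it goes into an HTML page. */
final class HtmlEscaper {

    /** Returns a copy of text with &, <, >, double and single quotes replaced by entities. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append("&#39;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

Every header field and body line taken from a message would go through
escape() on its way into a template, so a subject line containing
markup is shown as text rather than interpreted by the browser.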


--
Ted Guild
Software Developer
http://www.guilds.net
ted@guilds.net  

HURL: fogo mailing list archives, maintained by Gerald Oskoboiny