LinuxWorld

Notes from my talk at SCALE

These are my notes for "ifdown -a now! Do more, better - offline" at SCALE.

Why go offline? The Onion said it best: "48-Hour Internet Outage Plunges Nation Into Productivity." Other reasons include privacy, security, speed, and not having to read TechCrunch every day to see if your data is still going to be there next week. This talk will cover a mix of the top offline productivity hacks, including running an MTA on the laptop, using offlineimap to sync up huge archives of old mail, basic web caching, local search tools, and just enough distributed revision control to be dangerous.

Why go offline?

Organizing information

Git

Distributed revision control, as powerful as you want it. No, you don't have to understand how the whole Linux kernel project stays connected in order to use git for basic stuff such as a small software project, a web site, or a wiki.

On the server (xenu.example.com):

cd ~/public_html
git init
git add *.html
git add images/
git commit -a -m "Now keeping my home page in git.  Yay me."

On the laptop:

git clone ssh://xenu.example.com/~ron/public_html/
# you can't "clone" the copy before you "commit"
# on the original
vi resume.html
git status
git commit -a -m "added git experience to my resume"
git push origin

Using git for a simple web site

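If you're just "push"ing into ~/public_html/, the files on the server won't change until something checks out the new version there. You'll need a "hook" in the server repository, for example an executable .git/hooks/post-update script: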

#!/bin/sh
# Based on a script from Ikke's Blog:
# http://blog.eikke.com/index.php/ikke/2007/08/

WORKDIR="/home/ron/public_html"
export GIT_DIR="$WORKDIR/.git"
# pushd/popd are bash builtins and may be missing from /bin/sh,
# so use plain cd and stop if it fails.
cd "$WORKDIR" || exit 1
git reset --hard

Other uses for git hooks

  • Rebuild indexes
  • Run sitemap-o-matic
  • Deploy scripts
  • Run "make"
  • Run validators and other tests
  • Announce your work via mail?
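A couple of the items above can be sketched as a small function for the end of the hook script. This is only an illustration: the function name run_site_tasks and the validate.sh script are made up here, and the Makefile is assumed to live at the top of the site directory.

```shell
#!/bin/sh
# Hypothetical helper for a post-update hook: rebuild and validate a site.
# run_site_tasks and validate.sh are illustrative names, not part of git.

run_site_tasks() (
    # Runs in a subshell, so the cd does not affect the caller.
    cd "$1" || exit 1

    # Run "make" only if the site actually has a Makefile.
    if [ -f Makefile ]; then
        make -s || exit 1
    fi

    # Run a site-specific validator if one exists and is executable.
    if [ -x ./validate.sh ]; then
        ./validate.sh || exit 1
    fi

    echo "site tasks OK in $1"
)
```

In the hook you would call something like run_site_tasks /home/ron/public_html after the "git reset --hard" step. A failing make or validator makes the function return non-zero, so the hook can complain on the pusher's terminal.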

Make some scratch repositories

About as easy as "mkdir" and "scp". Git is fast, and it's easy to install on all your computers, not just your "development box".

Play around with commit, revert, push, branching and merging.
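A scratch session might look like the sketch below. The function name, the file name, and the "experiment" branch are invented for illustration, and everything happens in a throwaway temp directory.

```shell
#!/bin/sh
# Scratch-repository sketch. All names are illustrative.

scratch_demo() (
    # Subshell body: the cd and set -e stay local to this function.
    set -e
    dir=$(mktemp -d)
    cd "$dir"
    git init -q
    # Repository-local identity, so commits work without global git config.
    git config user.name "Scratch User"
    git config user.email "scratch@example.com"

    echo "first draft" > notes.txt
    git add notes.txt
    git commit -q -m "first draft"

    # Branch, change something, and merge it back.
    git checkout -q -b experiment
    echo "an idea" >> notes.txt
    git commit -q -a -m "try an idea"
    git checkout -q -          # back to the branch we started on
    git merge -q experiment

    # Undo the idea with a new commit; history keeps both.
    git revert --no-edit HEAD > /dev/null

    # Count the commits: first draft, the idea, and its revert.
    git log --oneline | wc -l
)
```

Running scratch_demo prints the number of commits, 3. Delete the temp directory when you're done, or keep poking at it.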

It's not just for "git bisect"ing kernel bugs. Yay, bonus link:

http://blog.eikke.com/index.php/ikke/2007/08/09/an_introduction_to_the_git_bisect_featur

Randal Schwartz intro to git http://video.google.com/videoplay?docid=-3999952944619245780

Git home page: http://git.or.cz/

Ikiwiki

Goes with Git like cookies and milk. Keep your Wiki up to date while offline. It's also a blog, an aggregator, and a lightweight task tracker.

It's a "wiki compiler" so it builds out static HTML files, from either Wiki markup or Markdown.

Using ikiwiki on a laptop: http://ikiwiki.info/tips/laptop_wiki_with_git/

Manoj Srivastava's blog (in Arch, not git.)

Ikiwiki's own bug tracker

Portland State Aerospace Society updates their wiki offline, while out in the Black Rock Desert launching rockets -- then merges with any changes that happened on the web site.

Planet/Venus

Very useful blog aggregator. Build yourself a big page of text from a bunch of people's RSS feeds and use either Unison or Wwwoffle to make sure there's a copy on the laptop. As a bonus you can build miscellaneous other files—blogroll, newsletter, whatever.

http://www.intertwingly.net/code/venus/

Blosxom

("blossom") A simple blog engine that works from text files. Write them offline, then run your Unison script and get them onto your site. Easier than ikiwiki to get started with. Needs CGI on the server side.

Nice because you can keep a draft blog entry lying around and just change the name to publish it.

http://www.blosxom.com/

Transferring information

Work offline, then sync everything up.

OK, it's not quite that simple.

ssh

Speed up ssh -- make it set up a socket for connection sharing if possible. This is good when you're tunneling a bunch of stuff at the same time.

host *
ControlMaster auto
ControlPath ~/.ssh/master-%r@%h:%p

(You should see the socket with netstat -a.)
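Another way to look at the shared connection is ssh's own -O control commands. A minimal sketch, assuming the ControlMaster settings above are in ~/.ssh/config; the function names are made up here.

```shell
#!/bin/sh
# Helpers around "ssh -O"; master_status and master_stop are illustrative names.

# Report whether a shared master connection to HOST is running.
master_status() {
    if ssh -O check "$1" 2> /dev/null; then
        echo "shared connection to $1 is up"
    else
        echo "no shared connection to $1"
    fi
}

# Ask the master connection to HOST to exit, e.g. before suspending the laptop.
master_stop() {
    ssh -O exit "$1"
}
```

For example, master_status mail.example.com after the first login to that host should report the connection as up.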

ssh part deux

If you're starting a long-running command such as rsync or offlineimap over an ssh connection, do something like:

ssh mail.example.com true && offlineimap

If there's a network problem, it'll die when it can't ssh, saving time.
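The same trick can be wrapped in a function. This is a sketch, not a standard tool: the name if_reachable is invented, and BatchMode assumes you log in with keys rather than a typed password.

```shell
#!/bin/sh
# if_reachable HOST COMMAND...: run COMMAND only if ssh to HOST works.
# The function name is illustrative.

if_reachable() {
    host=$1
    shift
    # BatchMode=yes fails instead of prompting for a password;
    # ConnectTimeout caps how long we wait on a dead network.
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" true || {
        echo "cannot reach $host, skipping: $*" >&2
        return 1
    }
    "$@"
}
```

Usage looks like: if_reachable mail.example.com offlineimap. The same wrapper works in front of rsync or unison.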

Offlineimap

I keep all my mail in local Maildir folders. Makes it easy to use tools other than the mailer, and my chosen mailer is behind the curve on IMAP support anyway.

http://software.complete.org/offlineimap

Mairix

Nifty local mail search tool.

dmarti@zea:~$ mairix tc:Melinda Kendall
Matched 52 messages
dmarti@zea:~$ mairix tc:Melinda Kendall ~linux
Matched 25 messages

(Make a shell function to run mairix, then start mutt pointing to the results folder)

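m()
{
    # If mutt is already running, don't start a second one.
    ps h -C mutt && return 0
    # "$@" keeps each search term as its own argument.
    mairix -t "$@"
    mutt -f =mairix
}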

http://www.rpcurnow.force9.co.uk/mairix/

Wwwoffle

"World Wide Web Offline Explorer" by Andrew M. Bishop. Since 1997. Surprisingly useful for regular web pages, surprisingly confusing for fancy-pants web apps.

  • Cache a copy of pages you visit (optionally ignoring the servers' no-cache demands)

  • Keep track of pages to visit while offline, and load them when you connect

  • Automatically refresh a list of frequently used pages.
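When you do get a connection, the wwwoffle command-line client can drive the proxy through those steps. A sketch, assuming the wwwoffled daemon is already running with its default configuration; the function name web_sync is made up.

```shell
#!/bin/sh
# Fetch everything requested while offline, then go back offline.
# web_sync is an illustrative name.

web_sync() {
    # Tell the proxy we are connected.
    wwwoffle -online || return 1
    # Fetch the pages that were requested while offline.
    wwwoffle -fetch
    # Return to serving from the cache only.
    wwwoffle -offline
}
```

Run web_sync from whatever script brings your network interface up.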

Add your Wwwoffle directory to Tracker's list to index:

WatchDirectoryRoots=/home/dmarti;/var/cache/wwwoffle/http

For a laptop, don't forget to set:

bind-ipv4         = 127.0.0.1

Many web applications have the naughty habit of trying to "blow out the cache" when possible. Wwwoffle is aggressive about using cached copies where possible.

But some web apps get hopelessly confused.

Use proxybutton (http://proxybutton.mozdev.org/) to conveniently switch the proxy on/off from a button in the Firefox toolbar.

http://www.gedanken.demon.co.uk/wwwoffle/

Postfix

Run a proper MTA on your laptop and you don't just queue up outgoing mail in the MUA -- anything that wants to send mail can send it and it'll all get blasted out at once.

In /etc/postfix/main.cf, set:

relayhost = [127.0.0.1]:10025
defer_transports = smtp

And use "sendmail -q" to run the queue.
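The notes don't say what listens on port 10025. One plausible setup, and it is only an assumption here, is an ssh tunnel to port 25 on your real mail server. Under that assumption, opening the tunnel and running the queue could be sketched like this (flush_mail and mail.example.com are placeholders):

```shell
#!/bin/sh
# Open a short-lived tunnel to the mail server, then run the Postfix queue.
# ASSUMPTION: 127.0.0.1:10025 is meant to be an ssh tunnel to the server's port 25.
# flush_mail and the default host name are illustrative.

flush_mail() {
    relay=${1:-mail.example.com}
    # -f: go to the background after authentication.
    # -L: forward local port 10025 to port 25 on the server.
    # ExitOnForwardFailure: fail rather than run without the tunnel.
    # "sleep 300" keeps the tunnel open for five minutes of deliveries.
    ssh -f -o ExitOnForwardFailure=yes \
        -L 10025:localhost:25 "$relay" sleep 300 || return 1
    # Ask Postfix to attempt delivery of everything in the queue.
    sendmail -q
}
```

If the tunnel can't be set up, the function returns without touching the queue, so mail just stays deferred until next time.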

http://www.postfix.org/

Unison

Good for big media files for which proper revision control is overkill. I use this for my music collection and some Planet stuff.

unison -auto -ui text $HOME/Music ssh://zgp.org/Music

http://www.cis.upenn.edu/~bcpierce/unison/

rsync

Good for files that only get created in one place. Use the "ssh [host] true &&" trick, so it doesn't build a big index, try to connect, and die.

http://samba.anu.edu.au/rsync/

Cutting the Web 2.0 chains

Let's see how the web business survived the last bubble. How many of the news stories linked to from Marc Merlin's Windows Refund day page are still up?

ZDNet: no, redirect to news.zdnet.com home page.

Nando Times: no, "Nando Media has made the decision to shut down The Nando Times and the SportServer sites effective May 27, 2003..."

New York Times: Yes, full story.

Wired: no, "Sorry, we couldn't find the page you were looking for."

Washington Post: no, "We are unable to locate the page you requested."

LinuxWorld: no, "You've reached a page that doesn't exist on the all new LinuxWorld.com."

San Diego Union Tribune: no, redirect to search page

BBC News: Yes, full story.

New York Daily: no, "Page not found"

(And that's content they paid for.)
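You can repeat this kind of survey on your own bookmarks. A rough sketch using curl; the function name check_links and the one-URL-per-line input file are assumptions made for the example.

```shell
#!/bin/sh
# check_links FILE: print "STATUS URL" for each URL listed one per line in FILE.
# check_links is an illustrative name.

check_links() {
    while IFS= read -r url; do
        # Skip blank lines.
        [ -n "$url" ] || continue
        # -s: quiet; -o /dev/null: discard the body;
        # -w '%{http_code}': print only the HTTP status code.
        # curl reports 000 when it gets no HTTP response at all.
        status=$(curl -s -o /dev/null --max-time 20 -w '%{http_code}' "$url")
        echo "$status $url"
    done < "$1"
}
```

Redirects show up as 301 or 302, which catches the "redirect to home page" cases. A site that returns 200 along with a "page not found" message will still look alive, so spot-check by hand.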

theglobe.com

Record-setting 1998 IPO.

"Theglobe.com, based in New York, hosts individuals' Web sites and offers email, chat rooms, and collections of Web sites for people who share the same interests." -- Jennifer Sullivan in Wired

LiveJournal

ljdump by Greg Hewgill. Max Spevack calls it "a simple python script that takes a username and password, and archives all of that account's posts and comments."

http://hewgill.com/software/ljdump/

LinkedIn

Address Book Export http://www.linkedin.com/addressBookExport

And finally

The lost cannot be recovered; but let us save what remains; not by vaults and locks which fence them in from the public eye and use in consigning them to the waste of time, but by such multiplication of copies as shall place them beyond the reach of accident. -- Thomas Jefferson

Related subjects

Joey Hess on slow net connections http://kitenet.net/~joey/blog/entry/dealing_with_dialup/ http://kitenet.net/~joey/blog/entry/slow/

David Heinemeier Hansson: You're not on a fucking plane.


Proxy autoconfig files

Proxy autoconfig files are very useful, as are things like ProxyButton. When written correctly, they can sense the browser's IP address, work out whether you are at home, at work, or on a public hotspot network, and configure proxy settings as needed. They work with browsers such as Firefox and Opera, and make the system more self-configuring.

Unfortunately the main distros don't do much with this - would be good if there was a nice GUI tool to specify your environment and write a suitable proxy autoconfig file for you. However, the moderately techie who can write JavaScript can easily write their own. Firefox is the best environment for testing as it converts the alert() calls into an entry in the Error Console under Tools.

Another tool to be aware of

All good ideas -- it might make the most sense just to ask NetworkManager the current location. Ideally you would have everything use the proxy all the time, except for known broken/naughty sites that can't handle it.
