David Sterry's Blog


Monday, November 28, 2005

Del.icio.us research

I was just wandering around Del.icio.us a bit and I'm sure I gave one or more of their servers a workout. I had an idea: who were the first people to bookmark the web's most popular sites?

On a whim, I started with Technorati. I searched del.icio.us for it and clicked on the red "4124 people" link to see the Technorati : Home url page. After a bit of downloading (that's a big page!) I scrolled to the bottom and found that the first person to bookmark it was a guy named angusf.

Then I looked through some of his earliest bookmarks and began searching for even earlier ones. The earliest I found was from way back on 6-11-2000, by ericbogs: my.yahoo.com. It just may be the earliest bookmark on del.icio.us, and I have a feeling it's a test bookmark since I don't think del.icio.us has been running since 2000.

The Del.icio.us database is amazing...here are some other query ideas:

Of people's first 10 bookmarks, which 100 URLs appear for the most users? (There's a sketch of this query after the list.)

Compare Alexa's top 100 list for daily reach and traffic with the bookmark counts in the del.icio.us database. The question I would be asking is: do bookmarks correlate well with traffic?

Finally, it would be interesting to follow through and build a list of the first users to bookmark each of Alexa's top 100.
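
To make the first idea concrete, here's a rough sketch in PHP with SQL, against a made-up schema (a bookmarks table with user, url, and saved_at columns). I have no idea what del.icio.us actually runs, so treat this as pseudocode that happens to execute:

```php
<?php
// Made-up schema: bookmarks(user, url, saved_at).
// A bookmark is among a user's first 10 if fewer than 10 of that
// user's bookmarks were saved before it.
$db = mysql_connect('localhost', 'user', 'pass');
mysql_select_db('delicious', $db);

$sql = "SELECT b.url, COUNT(DISTINCT b.user) AS users
        FROM bookmarks b
        WHERE (SELECT COUNT(*) FROM bookmarks e
               WHERE e.user = b.user
                 AND e.saved_at < b.saved_at) < 10
        GROUP BY b.url
        ORDER BY users DESC
        LIMIT 100";

$result = mysql_query($sql, $db);
while ($row = mysql_fetch_assoc($result)) {
    echo $row['users'] . "\t" . $row['url'] . "\n";
}
?>
```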

There are literally millions of research ideas in the del.icio.us database, and as long as you take care to use the information responsibly, we can probably all learn quite a bit about ourselves from this large, open data store.

Tuesday, November 22, 2005

Trend Sweet Trend RSS Feed

I've just created my first RSS feed for trend sweet trend. To do it, I scraped my own page and built the file. You've got options: add it to your feed reader, use a site like FeedBurner to add it to your page, or subscribe with Firefox so it's always close at hand.

Creating an RSS feed isn't too hard, but there were a couple of stumbling blocks. The first one I ran into was that Firefox refused to recognize my PHP-generated document as RSS.

This may have been because I had some extra whitespace outside the PHP tags, but at the time I thought it was the web server not sending the output of .php scripts as type application/rss+xml. That's why I'm now generating the page with a remote Perl script.
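
For the record, the header fix I was after looks something like this in PHP (the feed details here are placeholders, not the real file):

```php
<?php
// Force the RSS Content-Type before sending any output. Any
// whitespace outside the PHP tags has already been sent by now,
// which is exactly why stray whitespace breaks the feed.
header('Content-Type: application/rss+xml; charset=utf-8');
echo '<?xml version="1.0" encoding="utf-8"?>' . "\n";
?>
<rss version="2.0">
  <channel>
    <title>trend sweet trend</title>
    <link>http://example.com/trends</link>
    <description>What people are bookmarking right now</description>
  </channel>
</rss>
```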

The other problem I ran into was the encoding of funny characters like double greater-than symbols (»). Why people want these in their page titles I don't know, but I solved it by wrapping each title in a CDATA section.
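
Roughly like this, with an illustrative helper function:

```php
<?php
// Wrap a scraped title in CDATA so characters like » pass through
// without entity-encoding headaches. The one sequence CDATA can't
// contain is ']]>', so split it if it ever appears in a title.
function rss_title($title) {
    $safe = str_replace(']]>', ']]]]><![CDATA[>', $title);
    return '<title><![CDATA[' . $safe . ']]></title>';
}

echo rss_title('Cool links » part two');
// prints: <title><![CDATA[Cool links » part two]]></title>
```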

More trends and feeds to come...

Sunday, November 20, 2005

Micro-disaster plan

Livemarks became slow yesterday for an hour or so, which meant I had to update my database by connecting and downloading links manually. I hadn't considered the possibility that my script might fail to connect, but it's important to plan ahead when you depend on a web service.

You've got to ask: what happens if it's not available for an hour, a day, or a week? What does your site look like then? It turns out trend sweet trend displayed its header only...no links, no trends, nothing.

Today, I fixed it by displaying a message that livemarks is down and showing how many hours the info is delayed. I'll still have to work on the scraping script, since running it manually seemed to work, though curl took 4-5 minutes to get the page.
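
The fallback is simple; here's roughly what it looks like, assuming the scraper logs each successful fetch in an updates table (the schema is illustrative):

```php
<?php
// If the latest fetch is stale, say so instead of rendering an
// empty page. Assumed schema: updates(fetched_at).
$db = mysql_connect('localhost', 'user', 'pass');
mysql_select_db('trends', $db);

$row = mysql_fetch_assoc(mysql_query(
    "SELECT MAX(fetched_at) AS last FROM updates", $db));
$hours = floor((time() - strtotime($row['last'])) / 3600);

if ($hours >= 2) {
    echo '<p>livemarks seems to be down; this data is about '
       . $hours . ' hours old.</p>';
}
?>
```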

Friday, November 18, 2005

Cream of the crop

The del.icio.us crop, that is. I've taken the data I'm gathering for trend sweet trend and made a best-of page. This page shows only the sites that have made it to #1 on livemarks for about an hour or more.
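
Since livemarks is sampled hourly, "#1 for about an hour or more" boils down to hitting rank 1 in at least one hourly reading. The query is a sketch against an illustrative readings table:

```php
<?php
// Assumed schema: readings(url, rank, taken_at), one row per site
// per hourly snapshot. Anything that was rank 1 in a snapshot held
// the top spot for roughly an hour.
$sql = "SELECT url, MIN(taken_at) AS first_topped
        FROM readings
        WHERE rank = 1
        GROUP BY url
        ORDER BY first_topped";

$result = mysql_query($sql);
while ($row = mysql_fetch_assoc($result)) {
    echo $row['url'] . "\n";
}
?>
```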

I expect the list will grow by about a link a day, so if you check the page every week or so you should find a batch of pretty good links. The goal of all this is either to build a list of the most useful web tools ever or to help people waste time as efficiently as humanly possible.

Thursday, November 17, 2005

Copilot.com

I've found something good.

Today, I had the opportunity to use Copilot, Joel on Software's easy remote-control service. It lets one person control another's computer from anywhere on the planet. The key innovation is that it's so simple to use. Both the helper and the helpee go to copilot.com; one pays and gets a code, and the other enters the code on the site. Then they both download the software and connect.

I was the helper today and paid $9.95 for a 24-hour pass. The connection is both encrypted and fast enough to do anything a helper would need to do. With only two buttons on the toolbar - refresh and ctrl-alt-del - the interface is truly idiot-proof.

The only thing I would change about the service is the way it's paid for. I'd prefer to be able to either buy a year's pass or pay a smaller amount for a session shorter than 24 hours...4 hours for $3.95 might work.

At that rate, all I'd have to say is, "Anybody need a hand?"

Wednesday, November 16, 2005

Automatic testing

For a coder, the pinnacle of frustration is testing...finding and fixing the bugs you want to squish. One approach is to find a bug, fix it, and pray that no other bugs appeared in the process. Unfortunately, that doesn't work for an application of any but the most trivial complexity.

You've got to have a plan: a repeatable process for testing your code to make sure nothing broke. I just created a test plan for a site I'm working on, and it came to about 35 actions to test. This is for a site with about 8 real pages and some database code.

Testing a web site is a unique challenge in that it's hard to write a script to drive all your browsers. What's needed is a standard browser-control API. With broad support for such a standard, you could write a test harness that drives IE, Firefox, Netscape, and Opera all at the same time.

You could put that 21" LCD to use tiling the four browsers while setting the zoom in each to simulate high-res surfers. While I'm imagining this, why not have all the tests recorded as macros? Wow, wouldn't life be perfect if I had that?! I doubt it, but there is some hope.

After a bit of searching I found the Mozilla ActiveX control project, which is a step in the right direction. (Tangent: why don't we have open-source ActiveX?) If I get a chance to check out their project, maybe I can automate some of my website testing.

Until then, I'll be repeating my 35 steps by hand and writing test scripts whenever the opportunity arises.
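
A script can at least cover the HTTP-level checks in the meantime, even if it can't drive a real browser. A sketch, with made-up pages and expected strings:

```php
<?php
// Fetch each page with curl and check that a string which should be
// there actually is. The page list and expected text are examples.
$tests = array(
    '/index.php' => 'trend sweet trend',
    '/best.php'  => 'Cream of the crop',
);

foreach ($tests as $page => $expect) {
    $ch = curl_init('http://localhost' . $page);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    curl_close($ch);

    $ok = ($body !== false) && (strpos($body, $expect) !== false);
    echo ($ok ? 'PASS' : 'FAIL') . '  ' . $page . "\n";
}
?>
```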

Tuesday, November 15, 2005

fix sweet fix

Today, I fixed the use of a TINYINT to store the popularity numbers. It turns out more than 127 people per hour bookmark the most popular sites on del.icio.us! Who knew?
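
The fix itself is a one-liner, give or take the real table and column names (these are stand-ins): a signed TINYINT tops out at 127, so the column just needed to be wider.

```php
<?php
// Widen the popularity column; SMALLINT UNSIGNED goes up to 65535.
mysql_query('ALTER TABLE trends MODIFY popularity SMALLINT UNSIGNED');
?>
```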

I also changed the way the trends work...if a popularity number stays constant for a couple of hours, it's more visually appealing to carry the color through from the last reading.
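
In code, the change amounts to only recomputing the color when the number actually moves; something like this, with made-up readings:

```php
<?php
// Carry the last trend color through flat stretches.
$readings = array(40, 52, 52, 52, 38);  // illustrative hourly counts
$color = 'gray';
foreach ($readings as $i => $count) {
    if ($i > 0 && $count != $readings[$i - 1]) {
        $color = ($count > $readings[$i - 1]) ? 'green' : 'red';
    }
    echo '<span style="color:' . $color . '">' . $count . '</span> ';
}
?>
```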

trend sweet trend

What's hot this hour? What sites are people most rapidly saving for later?

I'm introducing here my livemarks trend chart, updated hourly. It and livemarks are based on del.icio.us, so you get to see what people are most vigorously bookmarking at the moment. The chart is here.

Sunday, November 13, 2005

BaySUG '05 - Samba and Google's Summer of Code

Had an awesome time yesterday at the First Bay Area Super User Group Meeting (BaySUG '05) in Mountain View. About a hundred or so people attended, and two talks were given. First up was Jeremy Allison, who laid out the genesis and evolution of Samba. He related a lot of great history, from Andrew Tridgell's early "frank and honest" conversations with Microsoft about the quality of their SMB code to today's race to complete Microsoft's new SMB2 protocol before Vista's release.

I've never worked on a large open source project, but I got the feeling that it really is a small world and that great programmers and documenters can definitely make an impact. As for the race to finish SMB2, it seems to me that while it's important to keep an eye on Microsoft's latest tricks, it's perhaps more important that the Samba team keep innovating in the best interest of their users. Maybe Vista won't be so relevant in a world of Samba servers and strong Linux desktop clients with superior networking. Most interesting was probably Allison's revelation that Microsoft saw helping Samba early on as a way to get Windows clients into UNIX shops!

The second talk was by Chris DiBona, Open Source Programs Manager at Google. He ran Google's Summer of Code and shared some great stats on the project: they received ~9000 applications, accepted 419, and had a success rate of 84%.

With so many applicants, I get the feeling that there is a definite need for more good project leaders like Fyodor of Nmap and Jeremy Allison of Samba. Maybe Google can recruit some past winners as project leaders the next time around. Here's an interview where Fyodor discusses his involvement in the project.

It was cool seeing this picture of 8 or so Sun and BSD boxes that were the genesis of Google at Stanford.

I got a couple of free t-shirts, a free Diet Coke, some coupons, and a little network patch cable. I love free marketing junk, so it's definitely a great time going to these things. I hope they have BaySUG every year!

Thursday, November 10, 2005

A Million Dollar Dream

What can you buy for a million dollars these days? It used to be a fortune, pie in the sky for anyone. To be a millionaire was to be “rich”, akin to being able to have anything and anyone you wanted. But it isn’t now. A million dollars isn’t all that much money, and it never was: there was always a way to spend it and only so much it could buy. I’d like to present a few ideas of how to spend a million dollars to make it clearer just what a million dollars is.

Starting at the cheapest end of the spectrum, I’ll try to put this million dollars into some everyday terms. The cheapest thing I can buy at the grocery store is a package of Top Ramen for $0.10, so I could buy 10,000,000 packages of ramen. I could feed (not well) a 30th of the US for one meal. Conversely, at three meals a day, or about 1,100 packages per person per year, I could feed roughly 9,000 people (again, not well) for a year.

A more moderate way to spend it is to send out DVDs, which can be mass-produced for about $1 apiece, to a million people. I could send them all about 2 hours of good-quality video of whatever I want…I’d even bet that if I gave it a good or mysterious title, a lot of people would watch it.

Putting a million dollars in labor terms, let’s talk minimum wage. In California, the least you can pay someone is $6.75 per hour. With overhead adding about 60%, each employee costs around $10.80 an hour, or roughly $22,500 a year, so I could employ about 44 people for a year doing whatever it is that I want, as long as it’s not illegal or too silly.

If I talk about engineers, who make an average of $60,000 per year, then including overhead each one costs about $96,000, so I’d be able to hold onto 10 of them for that year. I wonder what 10 engineers can do in a year. Quite a bit, I’d imagine…if they’re software engineers, they could build an application that does some cool stuff, as long as it’s fairly focused.

So there are a few ways to spend a million dollars. But you’ve got to have it before you can spend it, so some other time I’ll have to write about how to make a million dollars. How many widgets, software packages, or houses would you have to sell to make a million dollars?