Friday, October 26, 2007

Amazon's Diversification Strategy is Paying Off... Big Time

As you may already know, I am a big fan of Amazon. I do most of my shopping there with my Amazon Prime account (2 day free shipping), I host all of my web applications on their Web Services infrastructure, and I evangelize the hell out of them. And it looks like I'm not alone. This post on Mashable has some interesting financial tidbits:

Amazon's revenue rose 41% versus the same quarter last year, the number of files stored on S3 has doubled in six months to 10 billion, 25,000 developers signed up for AWS in the past quarter bringing the total to 290,000, and sales from Amazon retail affiliates made up 32% of sales.

Now if that's not diversification paying off, I don't know what is. Why I didn't buy a boatload of Amazon stock when I first heard about Amazon Web Services is beyond me, especially because I knew it was going to be huge.

"Cluster Challenge" Competition at SC07 Supercomputing Conference

SC07, an international conference on high performance computing, networking, storage, and analysis, is holding a competition this year called the Cluster Challenge.

During SC07 in Reno, teams will assemble, test and tune their machines until the green flag drops on Monday night as the Exhibit Opening Gala is winding down. The race now begins and teams are given data sets for the contest. With CPUs roaring, teams will be off to analyze and optimize the workload to achieve maximum points over the next two days.
In full view of conference attendees, teams will execute the prescribed workload while showing progress and science visualization output on large displays in their areas. As they race to the finish, the team with the most points will earn the checkered flag – presented at the awards ceremony on Thursday.

What I find interesting about this is that they have a power restriction:

Clusters are to be provided by the team and must consist of a single full-height 19” rack. A monitoring power strip will be available into which all components of the cluster must be plugged. A single 30 amp, 110 volt circuit will be provided with a soft cap at 26 amps. Alarms will be sent electronically if power draw exceeds this amount and penalties may be assessed for excess draw.
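
To put that restriction in watts, here is a minimal sketch of the arithmetic. It assumes power is simply volts times amps (ignoring power factor), and the 250 W per-node draw is a made-up figure for illustration, not something from the contest rules.

```python
# Power budget implied by the Cluster Challenge circuit described above.
# Assumption: watts = volts * amps (power factor ignored).
CIRCUIT_VOLTS = 110
CIRCUIT_AMPS = 30    # rating of the provided circuit
SOFT_CAP_AMPS = 26   # draw above this triggers alarms and possible penalties


def watts(volts: int, amps: int) -> int:
    """Return the power in watts for a given voltage and current."""
    return volts * amps


def max_nodes(budget_watts: int, watts_per_node: int) -> int:
    """Return how many whole nodes fit inside a power budget."""
    return budget_watts // watts_per_node


soft_cap_watts = watts(CIRCUIT_VOLTS, SOFT_CAP_AMPS)
circuit_watts = watts(CIRCUIT_VOLTS, CIRCUIT_AMPS)

# Hypothetical per-node draw under load; not from the contest rules.
HYPOTHETICAL_NODE_WATTS = 250

if __name__ == "__main__":
    print(f"Soft cap: {soft_cap_watts} W of a {circuit_watts} W circuit")
    print(f"Nodes at {HYPOTHETICAL_NODE_WATTS} W each: "
          f"{max_nodes(soft_cap_watts, HYPOTHETICAL_NODE_WATTS)}")
```

Under those assumptions the whole rack, switches included, has to fit in under 3 kW, which is why performance per watt matters so much here.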

I wonder if the server of choice will be Sun's with CoolThreads Technology?

So are you up for the challenge? Does the word "grid" give you butterflies in your stomach? If so, go check it out.

Thursday, October 18, 2007

High Availability and Scale Testing for Microsoft's SharePoint Hosting

This article on Joel Oleson's blog provides amazing insight into Microsoft's scalability testing for their SharePoint hosting services. It is very Microsoft-product-oriented (SQL Server), but the general concepts should apply across the board. There is just too much juicy detail in there to post here, so just go read it. Kudos to Mike Watson for sharing the information (check out Mike's blog, he's got lots of good stuff on there).

Tuesday, October 16, 2007

Two New EC2 Instance Types Now Available: Large and Extra Large

So now you can scale up and out on EC2. Very, very nice. The original size is now called a Small Instance. Here are the specs and prices:

Small Instance (default)*

    1.7 GB memory
    1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
    160 GB instance storage (150 GB plus 10 GB root partition)
    32-bit platform
    I/O Performance: Moderate
    Price: $0.10 per instance hour

Large Instance (~4 small instances)

    7.5 GB memory
    4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
    850 GB instance storage (2 x 420 GB plus 10 GB root partition)
    64-bit platform
    I/O Performance: High
    Price: $0.40 per instance hour

Extra Large Instance (~8 small instances)

    15 GB memory
    8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
    1,690 GB instance storage (4 x 420 GB plus 10 GB root partition)
    64-bit platform
    I/O Performance: High
    Price: $0.80 per instance hour
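
A quick way to compare the three sizes is to normalize by EC2 Compute Unit. The sketch below only uses the numbers listed above; the `InstanceType` class and the helper names are my own, not anything from Amazon's API. Prices are kept in whole cents to avoid floating-point noise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    memory_gb: float
    compute_units: int          # total EC2 Compute Units (cores * units per core)
    storage_gb: int
    price_cents_per_hour: int


# Figures copied from the spec list above.
INSTANCE_TYPES = [
    InstanceType("Small", 1.7, 1 * 1, 160, 10),
    InstanceType("Large", 7.5, 2 * 2, 850, 40),
    InstanceType("Extra Large", 15.0, 4 * 2, 1690, 80),
]


def cents_per_compute_unit_hour(t: InstanceType) -> float:
    """Price of one EC2 Compute Unit for one hour, in cents."""
    return t.price_cents_per_hour / t.compute_units


def memory_gb_per_compute_unit(t: InstanceType) -> float:
    """Memory available per EC2 Compute Unit, in GB."""
    return t.memory_gb / t.compute_units


if __name__ == "__main__":
    for t in INSTANCE_TYPES:
        print(f"{t.name}: {cents_per_compute_unit_hour(t):.1f} cents/ECU-hour, "
              f"{memory_gb_per_compute_unit(t):.3f} GB RAM per ECU")
```

By this measure the price per compute unit is the same across all three sizes, so the bigger instances buy you 64-bit, more memory per box, and better I/O rather than a volume discount.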

If I were them, I probably would have called them Tall, Grande and Venti instances.

Amazon EC2 Now Open to the Public

Amazon announced today that EC2 is now open to all developers. If you were on the waiting list, the wait is now officially over so head on over to Amazon, sign up, and fire up an instance just for fun.

Google And IBM to Bring Cloud Computing to School

Google and IBM are bringing cloud computing to a university near you. I wrote a couple of weeks ago about a professor who is planning to use EC2 in the classroom, but it looks like it will no longer be a lone professor teaching his students something useful.

Six universities will be involved in the initiative. They are Carnegie Mellon, Massachusetts Institute of Technology, Stanford University, the University of California, Berkeley, the University of Maryland and the University of Washington.

Google is building a data center, at an undisclosed location, that will contain more than 1,600 processors by the end of the year. I.B.M. is also setting up a data center for the initiative.

The centers will run an open-source version of Google's data center software, and I.B.M. is contributing open-source tools to help students write Internet programs and data center management software.

I sure wish we had this when I was in school instead of learning how to make a single-threaded Windows clone (circa 1979) in some obscure programming language on a 10-year-old piece'o'**** Sparc.

More at NY Times.

Tuesday, October 9, 2007

Sun's New Pimpin Sparc T5x20 Servers

Just read on Tim Bray's blog about Sun's new T5x20 server line featuring the UltraSPARC T2 Processor. The Sun people seem to be pretty excited about this, as you can see from the raving blog posts and the IBM bashing, and I guess they should be. From what I can tell, these combine low power (as in electricity) consumption with high power (as in processor) output. And if you were holding back on making your app more concurrent, you should really be thinking about it now.

With eight cores and 64 threads on one chip, integrated 10 GbE networking, crypto, and PCI-Express expansion, you have the jump on anything else on the market. The opportunities for system consolidation and virtualization are here like never before. Consumes less power per core and thread than any processor in its class – without compromising on performance. The UltraSPARC T2 processor gives OEMs a massively threaded, multi-core alternative to more power-hungry, less threaded processors from competing vendors.

Intense. It's like the PlayStation 3's big, bad, not fun brother. Amazon should pick up a few of these bad boys and stick them into the EC2 mix.

Vertical Scaling vs Horizontal Scaling

Vertical: (otherwise known as scaling up) means to add more hardware resources to the same machine, generally by adding more processors and memory.
  • Expensive
  • Easy to implement (generally, no change required in your application)
  • Single point of failure (if main server crashes, what do you do?)
Horizontal: (otherwise known as scaling out, hence the name of this blog) means to add more machines into the mix, generally cheap commodity hardware (like that cheap computer sitting under your desk).
  • Cheap(er) - at least more linear expenditures
  • Hard to implement (much harder than vertical)
  • Many points of failure, but no single one, so failures can usually be handled gracefully
Which raises the question: which one is actually more cost effective once you factor in the engineering time required to make horizontal scaling work seamlessly?

And here is a wonderful video by Ted Cahall, CIO of CNET, in which he explains scaling out (horizontally) on a whiteboard with nice colorful markers.

Amazon S3 now has a Service Level Agreement

Jeff Barr says:
I am very happy to announce that, effective October 1, 2007, The Amazon S3 Service Level Agreement is in effect.
Great news! And it's about time.

The general gist:
Basically, we commit to 99.9% uptime, measured on a monthly basis. If an S3 call fails (by returning a ServiceUnavailable or InternalError result) this counts against the uptime. If the resulting uptime is less than 99%, you can apply for a service credit of 25% of your total S3 charges for the month. If the uptime is 99% but less than 99.9%, you can apply for a service credit of 10% of your S3 charges.
A good step in the right direction. Now if we can get this on EC2, we're golden.
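
The credit tiers in the summary quoted above are simple enough to write down as a function. This is just a sketch of the rule as quoted; the function name and the dollar amounts in the usage lines are mine, and the real SLA has conditions (how uptime is measured, how to apply) that are not modeled here.

```python
def s3_service_credit(uptime_percent: float, monthly_charges: float) -> float:
    """Return the service credit you could apply for, per the quoted tiers.

    Below 99% uptime:             25% of the month's S3 charges.
    99% up to (not incl.) 99.9%:  10% of the month's S3 charges.
    99.9% or better:              no credit.
    """
    if uptime_percent < 99.0:
        rate = 0.25
    elif uptime_percent < 99.9:
        rate = 0.10
    else:
        rate = 0.0
    return monthly_charges * rate


if __name__ == "__main__":
    # Hypothetical $200 monthly bill at three different uptime levels.
    for uptime in (98.5, 99.5, 99.95):
        credit = s3_service_credit(uptime, 200.0)
        print(f"{uptime}% uptime -> ${credit:.2f} credit")
```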

Amazon EC2 in Schools

I just saw an article on CNET titled Cloud Computing for Students in which a computer science professor, Phil Windley, writes about how he plans to use EC2 this semester to teach his students more than how to write a simple single-threaded program to learn a language. He will be teaching them how to write a BIG distributed, scalable application.

Students will be able to create as many machines as they need and use SQS to talk to each other.
The cost for EC2 is $0.10/hour of compute time. With some careful management of the EC2 cloud (like making sure machines aren’t left running when they don’t need to be) I’ll be able to do the class for (hopefully, much) less than $100/student. That’s less than the textbooks for many classes.
The cost is pretty low compared to the learning the students will receive. This type of experience will be invaluable in their careers.
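
To see what $100 per student buys at $0.10 per hour, here is a rough back-of-the-envelope sketch. The 15-week semester is my assumption, not something from Phil's post, and storage and bandwidth charges are ignored.

```python
# Amounts are kept in whole cents to avoid floating-point rounding.
EC2_CENTS_PER_INSTANCE_HOUR = 10        # $0.10/hour, from the quote above
BUDGET_CENTS_PER_STUDENT = 100 * 100    # the $100/student ceiling

# Assumption for illustration only; the post does not give a semester length.
HYPOTHETICAL_SEMESTER_WEEKS = 15


def instance_hours_within_budget(budget_cents: int, cents_per_hour: int) -> int:
    """Return how many whole instance-hours a budget covers."""
    return budget_cents // cents_per_hour


def hours_per_week(total_hours: int, weeks: int) -> float:
    """Spread a total number of instance-hours evenly over a number of weeks."""
    return total_hours / weeks


if __name__ == "__main__":
    total = instance_hours_within_budget(
        BUDGET_CENTS_PER_STUDENT, EC2_CENTS_PER_INSTANCE_HOUR
    )
    weekly = hours_per_week(total, HYPOTHETICAL_SEMESTER_WEEKS)
    print(f"{total} instance-hours per student, "
          f"about {weekly:.0f} per week over {HYPOTHETICAL_SEMESTER_WEEKS} weeks")
```

That is instance-hours, not wall-clock hours: ten machines running for one hour count the same as one machine running for ten, which is exactly why shutting idle machines down matters.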

I would love to hear more stories like this. If you know of any, please post a comment. And Phil, if you need some help with this project, don't hesitate to ask.

Sunday, October 7, 2007

UK's Flexiscale to Compete with Amazon EC2/S3

A UK startup, Flexiscale, is now offering on-demand utility computing.

What makes it different? Here are some interesting features:
  • Has Windows (EC2 only has Linux)
  • Static IPs (although I hear Amazon is working on this)
  • Recovers your stuff automatically after a machine failure (very, very nice)
  • Load balances automatically if a box you are on is getting hammered
  • Can scale up on demand (EC2 has fixed specs per machine)
  • Automatic scaling up (will detect overloaded machines and juice them up until the spike is over)
From the looks of it, it appears they are using standard issue Virtual Private Server (VPS) technology that is common in the hosting industry, but charging by the hour instead of by the month. Not a bad idea. The features they offer are pretty amazing if it works as smoothly as they say. The prices are a bit higher than EC2 for an equivalent machine, but still pretty decent, definitely in the "I don't need to ask the CFO to sign off on this" range.

Justin.TV Cuts Costs Drastically by Using Amazon EC2

In this article on NewTeeVee, Michael Seibel, CEO of Justin TV, says:
"by building its own equivalent of Flash Media Servers and using Amazon EC2, the company has cut costs from a market rate of approximately $0.36 per user hour to under a penny per user hour."
Not bad, not bad at all. I actually watched a presentation by Kyle Vogt at the San Francisco Startup Project about their experiences with Amazon Web Services, and it was pretty interesting, largely because they archive massive quantities of video. At the time of the presentation, basically all the video, 24 hours per day, was being archived (50 GB of new archive video per day). Why people want to watch Justin sleep, and why anyone would want to archive Justin sleeping, is a question for another day (archiving Justine makes a bit more sense), but the technological challenges they are dealing with are interesting nonetheless.

Oh, and did I mention they recently got funded and opened up their platform so you can now broadcast your life to the world, 24x7?

Sun Grid vs Amazon EC2

This is kind of like comparing apples to oranges, but I'm sure it's crossed some people's minds:

Amazon EC2
  • $0.10 per CPU hour ($72 / month)
  • An instance is roughly equivalent to a system with a 1.7Ghz x86 processor, 1.75GB of RAM, 160GB of local disk, and 250Mb/s of network bandwidth (bursting to 1Gb)
  • Linux
  • Infinite storage via S3

Sun Grid
  • $1.00 per CPU hour ($720 per month)
  • SunFire dual processor Opteron-based servers with 4 GB of RAM per CPU, Solaris 10 OS, and Sun Grid Engine 6 software.
  • Solaris 10
  • Is there any long term storage option? Doesn't look like it from what I can see.
$72 per month for EC2, which allows you to run anything your heart desires on a fresh Linux install. $720 per month for Sun Grid, which appears to be very restricted (a "new feature" on the Sun Grid is Internet Access! Wow!). Sun Grid really appears to be for real compute jobs, not for your average everyday application. You select an application, upload data, create and run a job, then download the results.
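
The monthly figures above come from running one CPU around the clock for a 30-day month. A small sketch of that arithmetic, with rates in whole cents and a function name of my own choosing:

```python
HOURS_PER_MONTH = 24 * 30  # the 30-day month that the $72 and $720 figures imply

EC2_CENTS_PER_CPU_HOUR = 10        # $0.10
SUN_GRID_CENTS_PER_CPU_HOUR = 100  # $1.00


def monthly_cost_dollars(cents_per_cpu_hour: int) -> float:
    """Cost in dollars of one CPU running nonstop for a 30-day month."""
    return cents_per_cpu_hour * HOURS_PER_MONTH / 100


if __name__ == "__main__":
    ec2 = monthly_cost_dollars(EC2_CENTS_PER_CPU_HOUR)
    sun = monthly_cost_dollars(SUN_GRID_CENTS_PER_CPU_HOUR)
    print(f"EC2: ${ec2:.2f}/month, Sun Grid: ${sun:.2f}/month, "
          f"ratio {sun / ec2:.0f}x")
```

The comparison only holds for always-on workloads. For a batch job that runs a few hours a month, which is what Sun Grid seems built for, the gap in absolute dollars is much smaller.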

Which would you choose?

What is scalability?

A good article on what scalability is... if you were wondering. And if you were wondering where Amazon EC2 falls, it's in the Horizontal Scalability camp: cheap, commodity machines.

A good quote from the article:
While infinite horizontal linear scalability is difficult to achieve, infinite vertical scalability is impossible.
True, true.

Jon Udell on The Fourth Platform: P2P

Jon Udell posts a response to Marc Andreessen's post about "The three kinds of platforms you meet on the Internet" in which he says there is a 4th platform, namely P2P. It's an interesting idea and he makes good points, but very few "legal" apps have been successful in P2P, namely Skype (and I have a feeling Joost will be successful too); the rest have been for somewhat shady purposes. There has to be real value (e.g., free music, free telephony, free TV) for end users to give up their computer's resources to work for others.

Quote from post:
"if we can extend our definition of the cloud to encompass what the Internet originally was: a network of peers. With rare but notable exceptions (e.g. BitTorrent) it hasn’t been that for a long time. I think it will be that again. There’s a level 4 platform waiting in the wings. At level 4, the cloud of storage and computation is partly centralized in a handful of intergalactic clusters, and partly distributed across a network of humble peers. Microsoft’s forthcoming Internet service bus is one example of a level 4 platform. I hope, and expect, we’ll see others."
I would love to see it, I just don't think it's going to happen anytime soon. In fact, I think it's bound to go in the opposite direction, with thin clients everywhere and all of the processing and storage in server clouds like EC2.

Let's Get This Started

I've been collecting stuff for a while now and will be posting here from now on. There will be no order to the madness that will ensue. Purely random.