
Links for the Week

Disclaimer: some of these are a bit old for the obvious reason.

• Apparently there was a recent radio show about columnar databases. Go figure.

• This TDWI article about Aster Data left me scratching my head. Is this a description of Aster's nCluster product or every MPP database ever?

• xkoto "enables commercial off-the-shelf databases to run on a cluster of commodity systems with the same or better reliability and performance as much more expensive proprietary systems". Sounds like a poor man's Oracle RAC to me, not that that's necessarily a bad thing.

• Google search trends for Netezza, DATAllegro, ParAccel, Vertica and Dataupia over the last 12 months. I wonder why Vertica gets so much attention in Canada.

• A great interview with Don Feinberg about hosted database services (both DBaaS and cloud-based). Especially cool is the grid of vendors on the second page.

• Two weeks ago I found this blog post very interesting. I'm no longer sure why.


Hi Tom,

To your point on MPP databases, I wanted to provide a little more information about Aster nCluster since it's not always easy to get across the key differentiators in a short article. There is also a very timely post from David Cheriton (professor of Distributed Systems at Stanford) on our blog addressing the network interconnect in even greater detail, which readers might find interesting.

As you know, the bottleneck in MPP is the network, and this is best overcome in a system built from scratch. To Aster’s knowledge, no one else has built a database from the ground up with the network in mind. Others are extending their DBMS by writing a custom operator for the network (e.g., a query optimizer), alongside the many existing operators for disk, caching, sequential scan, etc. In other words, most systems have addressed the network bottleneck by adding just one more operator.

However, network traffic isn’t just one problem, it is THE problem. You can’t just write one operator and hope that it overcomes the issue.

While many DBMSs focus on distributed storage and I/O, Aster nCluster is different in that it efficiently optimizes network bandwidth for distributed analytics. It’s easy to claim MPP but very difficult to do it seamlessly, as horizontal scale-out and network efficiency are naturally opposing forces.

I hope this helps relieve some of the head scratching and starts some more discussion around the topic of scalable MPP databases. Check out our blog for more details on this topic.

Great blog, BTW - it's great to see some fellow database junkies.