Getting to Know DATAllegro, Part II

In the first part of this series, I covered DATAllegro the company and some information about the product. Here, in the second installment, we'll dive deeper into the architecture of the DATAllegro V3 system, and talk about the other interesting things DATAllegro's doing.

Product Architecture


The architecture of the DATAllegro system is disarmingly simple and subtly elegant. As much as I like it, I can't say all that much about it - I don't need to. It's that straightforward.

The DATAllegro system is a 6-part stack:

  • EMC storage on the bottom layer
  • Dell servers for compute nodes
  • Cisco InfiniBand switches for inter-node communications
  • SuSE Linux running the compute nodes
  • Open source Ingres providing core database functionality
  • DATAllegro's IP on top that ties it all together

(Yes, I realize those are listed upside down relative to where they appear in the stack. Things would make much less sense if you read them in the other order.)

The way that these components work together is equally elegant. Here's what it boils down to: the data on the storage arrays is specifically arranged to make sequential reads fast. And it's compressed. Each compute node has 8 cores, two of which are dedicated to decompression, leaving 6 cores per node to read and process data. These 6 cores have 6 internal drives within the compute node to use as scratch space. (6 cores, 6 drives... coincidence?) Taken together, you get a simple but effective system that can scan large amounts of data at ridiculous speeds, crunch it in parallel and stay out of its own way while doing scratchwork.

I like it.

Fast scanning alone isn't enough, though. Like most other databases, DATAllegro uses aggressive partitioning to split the data into smaller but predictable pieces. To use the Oracle terminology, tables are composite partitioned: they are first hashed across nodes and then hash, range or list sub-partitioned. Designed intelligently, this has a huge positive impact on performance (not that that's specific to DATAllegro).
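
To make the composite scheme concrete, here's a minimal sketch in Python of hash distribution across nodes followed by range sub-partitioning. It's purely illustrative: the function names, the MD5 hash, the 8-node count and the monthly ranges are all assumptions made for the example, not a description of DATAllegro's actual implementation.

```python
import hashlib
from datetime import date

NUM_NODES = 8  # assumed node count, for illustration only


def node_for(key: str, num_nodes: int = NUM_NODES) -> int:
    """First level: hash the distribution key to pick a node."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


def range_subpartition(day: date) -> str:
    """Second level: range sub-partition on a date, one bucket per month."""
    return f"{day.year:04d}-{day.month:02d}"


def place(customer_id: str, order_day: date):
    """Return (node, sub-partition) for a row under the composite scheme."""
    return node_for(customer_id), range_subpartition(order_day)
```

The point of the two levels is that a query filtering on both the distribution key and the date only has to touch one sub-partition on one node, while a query filtering on the date alone still gets to skip most of the data on every node.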

The one additional wrinkle here is replication. Somewhat unusually, DATAllegro replicates tables across nodes unless you specify a distribution key, and in addition they recommend replicating smaller (<1 GB) tables across all nodes. This is a bit dangerous, but allows for some control over data distribution that I find appealing. In short, though I think the purists would disagree with me, I think this is a desirable approach. Nonetheless, automatic distribution is on the roadmap.
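
The replication rule is easy to express as a sketch. The Python below is a hypothetical advisor, not anything DATAllegro ships: the function names are invented, and the only numbers taken from the text are the "<1 GB" guideline and the replicate-unless-keyed default. One way to read the "dangerous" part is the case it flags, where a large table ends up replicated simply because nobody specified a key.

```python
REPLICATION_THRESHOLD_BYTES = 1024 ** 3  # the "<1 GB" guideline from the text


def placement(distribution_key=None) -> str:
    """What the system does: replicate unless a distribution key is given."""
    return "replicate" if distribution_key is None else "distribute"


def recommended(table_bytes: int) -> str:
    """What the guideline suggests, based on table size alone."""
    if table_bytes < REPLICATION_THRESHOLD_BYTES:
        return "replicate"
    return "distribute"


def is_risky(table_bytes: int, distribution_key=None) -> bool:
    """True when a large table would be replicated for lack of a key."""
    return (placement(distribution_key) == "replicate"
            and recommended(table_bytes) == "distribute")


def replicated_footprint(table_bytes: int, num_nodes: int) -> int:
    """Total storage consumed when a table is copied to every node."""
    return table_bytes * num_nodes
```

For example, a 50 GB table replicated across 8 nodes would occupy 400 GB in total, which is why the default deserves a second look on anything big.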

Modularity


Here's what I like most about DATAllegro's product architecture, however: it's modular. Any of the pieces in the stack can theoretically be changed, including the Ingres database layer. This has already been done with compute nodes, in fact; DATAllegro's European partner Bull replaces the Dell compute nodes with their own. I think this makes for some pretty interesting possibilities; more on that later.

High Availability


Each DATAllegro rack normally contains 8 nodes plus a spare, plus 2 master nodes in an active/passive mirror configuration. This is a good foundation for a high-availability system, but the fail-over capabilities feel a bit lacking compared to others. Recovery time was described as "3-5 minutes" with automated query restart probable but not guaranteed. This isn't the best HA story I've heard, but I think that it is more than sufficient for most scenarios nonetheless.

It's worth noting, however, that at the end of that 3-5 minutes the system is back to its former capacity. Other vendors may recover faster, but they often run at reduced speeds until the failed node is replaced. So the trade-off here is recovery time for performance, I think. Which is ultimately better is a matter of opinion.

SQL Dialect


My life as a software developer made me unusually sensitive to variations in SQL dialect, so while I doubt most people care, I always bring it up. :-) DATAllegro describes their SQL dialect as SQL92 with some SQL/99 functionality, particularly date and string manipulation functions as well as some analytic functions.

Interesting Extensions and Applications


Now that we've covered the product, let's talk about some of the other interesting things that DATAllegro is doing. This is where I think the real fun begins, and where DATAllegro really begins to differentiate themselves.

Hub-and-Spoke Grid


A hub-and-spoke grid is, quite simply, an arrangement of database systems with a data warehouse in the middle and many smaller data marts around the outside. This isn't necessarily a technical innovation, but more something people have been doing with varying degrees of success for some time. DATAllegro has embraced this idea, however, and provides the hardware, software and services to allow customers to successfully implement a hub-and-spoke grid.

(Image blatantly stolen from DATAllegro's "Hub-and-Spoke: Getting the Data Warehouse Rolling" whitepaper.)

I don't have a ton of specifics, but here's my high-level understanding of how this works. Multiple DATAllegro systems can be connected via InfiniBand, allowing high-speed data transfer between nodes. So, the data warehouse at the center of the grid (presumably a DATAllegro MRA) handles all the incoming data in the traditional fashion and is ultimately responsible for it. However, it also pre-aggregates some data (if necessary) and pushes pre-defined data models out to the data mart spokes. Departmental users then connect to and query those data marts directly. This has a number of obvious benefits, including:

  • Data warehouse "cleanliness" - The hub owns all the data and can be sized and managed for its own needs (pre-aggregation and data distribution aside)
  • Inter-departmental consistency - Because all data originates at the hub, the data sent to the spokes is consistent across spokes
  • Data marts can be sized and configured according to data volume, query performance needs, user characteristics, etc.
  • New spokes can be added as necessary

As you may have guessed from the above points, the spokes in a grid needn't be homogeneous. This makes adding and upgrading spokes even easier. I also think it makes for some interesting possibilities... more on that later though.
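
Here's a toy model of that flow in Python, just to pin down the moving parts as I understand them. Everything in it (the `Hub` and `Spoke` classes, the subscription idea, the in-memory dictionaries) is made up for illustration; the real thing moves data between systems over InfiniBand and certainly doesn't look like this.

```python
from collections import defaultdict


class Spoke:
    """A departmental data mart that only ever receives data from the hub."""

    def __init__(self, name):
        self.name = name
        self.tables = {}

    def load(self, table_name, rows):
        self.tables[table_name] = rows


class Hub:
    """The central warehouse: owns the detail data and feeds the spokes."""

    def __init__(self):
        self.rows = []           # detail-level facts
        self.subscriptions = []  # (spoke, table_name, group_by, measure)

    def ingest(self, rows):
        self.rows.extend(rows)

    def subscribe(self, spoke, table_name, group_by, measure):
        self.subscriptions.append((spoke, table_name, group_by, measure))

    def publish(self):
        """Pre-aggregate on the hub, then push each result to its spoke."""
        for spoke, table_name, group_by, measure in self.subscriptions:
            totals = defaultdict(float)
            for row in self.rows:
                totals[row[group_by]] += row[measure]
            spoke.load(table_name, dict(totals))
```

Because every spoke is fed from the same detail rows, two marts that aggregate along different dimensions will still agree on the grand total, which is the consistency benefit in a nutshell.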

Altogether, the hub-and-spoke grid approach is a concept that puts DATAllegro on a different playing field than other database vendors. Rather than trying to build the single fastest database system, this approach focuses on building the most effective enterprise data management infrastructure, which is ultimately more important than the single fastest system. It pains me to say that, because database performance is unavoidably my favorite topic, but at the end of the day it's about how effectively you can use your data. Being fast is a part of that, but it isn't necessarily the only part.

Multi-Temperature Systems


Another interesting DATAllegro development is the multi-temperature system. This is almost the inverse of a hub-and-spoke grid; here, data is partitioned across two or more heterogeneous systems that conceptually comprise a single database (vs. replicating data across multiple heterogeneous systems in a hub-and-spoke grid). The different systems can be sized according to how "hot" the data is - newer, more frequently accessed data goes on a big, fast system while older data is stored on systems with more storage capacity but less CPU horsepower. Data is automatically migrated to the appropriate system as it ages, making for a system that can store a wealth of data but still service queries for more recent data acceptably fast.
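
As a rough sketch of the idea, the Python below routes rows to a tier by age and migrates them as they get older. The tier names, the 90-day and 365-day cutoffs, and the dictionary-of-lists "systems" are all assumptions for the sake of the example; I don't know what policy DATAllegro actually uses to decide when data moves.

```python
from datetime import date, timedelta

# Hypothetical tiers, hottest first: (name, maximum age of data kept there).
TIERS = [("hot", timedelta(days=90)), ("warm", timedelta(days=365))]
COLD = "cold"  # everything older than the last cutoff


def tier_for(row_day: date, today: date) -> str:
    """Pick the tier whose age window contains this row."""
    age = today - row_day
    for name, max_age in TIERS:
        if age <= max_age:
            return name
    return COLD


def migrate(systems: dict, today: date) -> int:
    """Move rows whose age no longer matches the system they live on.

    `systems` maps every tier name to a list of rows; each row is a dict
    with a "day" entry. Returns the number of rows moved.
    """
    moved = 0
    for name in list(systems):
        keep = []
        for row in systems[name]:
            target = tier_for(row["day"], today)
            if target == name:
                keep.append(row)
            else:
                systems[target].append(row)
                moved += 1
        systems[name] = keep
    return moved
```

Run on a schedule, something like `migrate` keeps the fast system small and full of recent data while the older rows drift toward the bigger, slower systems.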

In the future, users will be able to query the system as a whole and have the results from the separate systems automatically stitched together. (No release time frame for this functionality was available.) For now, however, one must connect to and execute queries on the system containing the data of interest. This serves the purpose for now, even if it isn't perfect: new data is available on faster systems, old data is still available (and fairly quickly at that), and the migration is automatic. For some applications this alone is sufficient.

Ultimately I think this will be just another application of the IP that DATAllegro uses to tie together the databases that live on the compute nodes in a standard MRA. Here, however, it's applied to databases that live on entirely separate systems. That's certainly the case conceptually anyway, and I think that's downright brilliant. Life as a software developer taught me that the right design will allow you to solve problems you hadn't even thought of yet, and it feels to me like that's what DATAllegro is doing here.

Looking Ahead


That's all for now. As always, if you spot something missing or inaccurate, please let me know.

In the next few days I'll post the third and final article in this series, in which I'll throw out my $0.02 about where DATAllegro is going and what it means for the market as a whole. Fun stuff, I think. Stay tuned.