This is a set of notes from an October briefing given by Overland Storage on its new SnapServer DX1 and DX2 systems with DynamicRAID. Attending were:
- Nick Allen, Wikibon
- David Floyer, Wikibon
- Bert Latamore, Wikibon
- Jillian Mansolf, Overland Storage
- Jeff Ferrell, Overland Storage
JM: Overland started out in tape, as you probably know. We moved into the NAS space with the acquisition of the SnapServer line in 2008. We introduced our first SAN into the lineup in February 2010. Over the last 18-24 months, we have been working hard on the next generation of the SnapServer line. We're excited to be introducing that this week. Jeff Ferrell joined the company about 18 months ago, and he & his team have been working on understanding our customer base and defining the right features to add into the SnapServer's Guardian OS.
SnapServer has been selling into the market for more than a decade. We have more than 300,000 units installed, a very large footprint in SMBs and distributed enterprise offices, and in particular in companies that need to replicate from many locations to one or just replicate in their own environment.
We clearly understand that budgets are under pressure. IT managers need to acquire storage features, functionality, and capacity at a budget affordable to SMBs. Storage requirements are growing in complexity with the explosion of data, and managers need to predict their storage needs up front and over time. So we wanted to address the notion of overbuying storage up front, so customers don't run out, with something very flexible. We also recognize that volume management is difficult, requires lots of monitoring, is time consuming, and typically gives you no ability to manage wasted space. We thought that management capability should be pushed down into this class of product, which is in the under-$10,000 NAS space.
We are adding two new products to the lineup, a 2U 12-drive NAS server and a 1U 4-drive NAS server. Both are expandable with SnapExpansion units, which are 2U 12-drive expansion units. We include the ability to set up tiers between traditional RAID and DynamicRAID. The 2U will come with replication as a standard feature: Snap Enterprise Data Replicator (Snap EDR), which we've been using to manage infrastructures for 6 years across the globe, will be available with the 2U. All SnapServers are both block & file, using iSCSI for block-based use. Obviously we are proud of these new products, which we have been working on for the last 24 months.
Slide 5: The new products are designed to eliminate the need for IT managers to provision storage at all. The idea is to make things simple, whether it's expanding storage in the box by up-leveling drives – replacing an old drive with a larger one & making it part of the RAID array – or beyond the box using Snap Expansion out to 288 Tbytes, or, from the volume management standpoint, the ability to disaggregate the RAID storage set and create flexible volumes in a limitless manner & have them grow & shrink independently. So in setup customers can choose between traditional RAID and DynamicRAID, the technology we're rolling into Guardian OS with this introduction.
JF: Slide 6: This is a completely new product, and at its core it has a new technology called DynamicRAID. The user has the option to use traditional RAID on the product, but can select DynamicRAID instead. The object of DynamicRAID is to automate as many of the provisioning decisions users have to make as possible. A lot of what managers have to deal with typically comes down to provisioning issues – upgrades, storage management, space utilization. On the left side of the slide is a traditional RAID picture. It shows a NAS product with four volumes – marketing, sales, finance, engineering. If the engineering volume fills up, with most products you could add a hard drive to expand that volume. There is plenty of free space on the drives associated with the other volumes, but that space is not available to engineering. And any space I add to the engineering volume isn't available to those other volumes.
With DynamicRAID we have aggregated the drives in the volumes to create a unified storage pool. The volumes live within that pool. So a certain amount of storage is available, & the volumes can grow & shrink within that pool. You can choose to limit the space available to a volume, but if you don't, the volumes can take space as needed. So if the engineering volume needs more space, it just takes it from the pool. If you delete files from a volume, that space becomes available to all volumes. This is completely dynamic allocation of the total available storage among the volumes. If you add drives, that extra storage becomes part of the overall pool and can be used either to create new volumes or to grow your old ones. The pool itself can be regulated by policies, and you can change those over time as well.
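The allocation model described here can be sketched in a few lines of Python. This is purely an editor's illustration of the concept, not Overland's implementation; the `StoragePool` class and all its names are invented for the sketch. Volumes draw from and return space to one shared pool, and adding a drive simply enlarges that pool.

```python
class StoragePool:
    """Illustrative model of a unified pool shared by thin volumes."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.volumes = {}            # name -> (used, quota or None)

    def used(self):
        return sum(u for u, _ in self.volumes.values())

    def create_volume(self, name, quota=None):
        # A volume reserves nothing up front; quota is an optional cap.
        self.volumes[name] = (0, quota)

    def write(self, name, amount):
        used, quota = self.volumes[name]
        if quota is not None and used + amount > quota:
            raise ValueError("volume quota exceeded")
        if self.used() + amount > self.capacity:
            raise ValueError("pool is full")
        self.volumes[name] = (used + amount, quota)

    def delete(self, name, amount):
        # Freed space returns to the pool, available to every volume.
        used, quota = self.volumes[name]
        self.volumes[name] = (used - amount, quota)

    def add_drive(self, size):
        # A new drive's capacity joins the shared pool immediately.
        self.capacity += size
```

In this model, deleting files from the sales volume immediately makes that space available to engineering, and `add_drive` grows every volume's headroom at once, which is the behavior the slide describes.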
Another important part of provisioning is the size of drives. Users traditionally have chosen larger drives so they can be sure the drives will be available looking forward. With DynamicRAID you can mix-&-match drive sizes as much as you wish. So you can add drives of different capacities to the pool over time with no downtime for the product, taking advantage of the cost-per-Gbyte decrease as larger drives become available. You also can remove smaller drives & replace them with larger drives. So for example if you have a four-slot unit and have filled all four slots you can increase storage capacity by replacing the original drives with larger drives with no downtime on the product.
Slide 7: This shows the stack for DynamicRAID. We've taken a lot of proven industry-standard components that are deployed on millions of servers worldwide and unified them under a simplified management interface. So these are standard RAID 5 and RAID 6. On top of that we have an extremely powerful logical volume manager, a powerful virtualization tool that is complicated to use directly. On top of that we have taken a leaf out of VMware's book and used the file system to abstract volumes. On top of that we have our volume abstraction layer, which makes all these storage technologies appear logical and allows them, for instance, to cohabit a pool. You can create iSCSI volumes or NAS volumes. And adding storage or changing storage is very simply done with a few clicks.
DF: What's the structure underneath this? Are you laying down blocks across the drives? How are you spreading the data?
JF: On the base level there's RAID 5 or RAID 6 drives. The logical volume manager blends all those together to make a single unified storage pool. The logical volume manager also handles the extension of those stripes. Then we abstract the whole thing with the file system. Then on the file system we build our volumes that we present on multiple different protocols, depending on what the users want.
DF: I don't have a mental model. At the moment I'm seeing disks of different sizes and the spreading of the data across them to create arrays. And if I run out of space on a small disk, I don't have that model in my mind of how this works.
JF: You're doing this across the whole storage pool, right?
DF: If the only space is on one large disk, that's no use to me, is it? It's not going to give me protection. I have a mental model problem in understanding how to use all that space. What solution do you have for that?
JF: You would have a RAID 5, or a RAID 6 with dual parity. That's standard.
DF: I've got that. Say I put in a new disk. I'm full on the other disks & want to put in a new RAID 6 volume.
JF: You'd have the new drive, and the storage pool would expand to take in the new drive. Because you have more storage you have more blocks available. It's normal RAID expansion, the way RAID expansion would work in a traditional server. The difference is that that RAID group would have been associated with a particular volume, whereas in our environment it's associated with the pool. But it's exactly the same thing you would do with any RAID expansion. If you have a RAID across 4 disks and you add a 5th one, then all you do is re-layout the volumes so they encompass the 5th disk, and because you have more aggregate space you have more space to store things in.
DF: So you're taking all the existing RAIDs and spreading them across this new disk?
JF: That would depend on the capacity of the new drive vs the old drives. We have multiple RAID stripes. So rather than having one RAID stripe that encompasses all of the drives, we have multiple ones so we can take advantage of disk drives of different sizes. So the correct number of stripes would be expanded to encompass the new drive. Then all of these RAID stripes are blended together through the logical volume manager. It can take a bunch of RAID stripes and treat them as one.
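One way to picture the multiple-stripe layout Ferrell describes is to slice the drives at each distinct capacity and build a RAID 5 stripe across every drive that reaches that slice. The function below is an editor's sketch under that assumption, not Overland's actual algorithm, but it makes the mixed-drive-size arithmetic concrete.

```python
def raid5_stripes(drive_sizes):
    """Return (width, depth, usable) for each RAID 5 stripe across mixed drives.

    Sketch of a multi-stripe layout: each stripe spans all drives that still
    have capacity left, sliced at the smallest remaining drive's boundary.
    """
    sizes = sorted(drive_sizes)
    stripes, base = [], 0
    while len(sizes) >= 3:                       # RAID 5 needs at least 3 members
        depth = sizes[0] - base                  # slice up to the smallest remaining drive
        width = len(sizes)                       # every remaining drive joins this stripe
        if depth > 0:
            stripes.append((width, depth, (width - 1) * depth))
        base = sizes[0]
        sizes = [s for s in sizes if s > base]   # fully consumed drives drop out
    return stripes
```

For example, `raid5_stripes([500, 500, 500, 3000])` yields `[(4, 500, 1500)]`: one 4-wide stripe over the first 500 GB of each drive, 1500 GB usable. The remaining 2500 GB on the big drive cannot be protected until more large drives join the pool, which is exactly the constraint Floyer is probing.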
DF: You need some sort of reorganization of this. How long does that take? Disks take a long time to move stuff around. All of it's good, but I just don't understand the constraints or what I should expect as a user for doing this. I love the idea, but I don't understand how it works. I need a diagram that shows me where the blocks are going.
JF: That's something that's in almost every product – RAID expansion.
DF: I put in a nice new 3Tbyte disk and everything else is ½ Tbyte. I can't put all the data on that new disk, I have to spread it, and the others are full. So what happens?
JF: What will happen in that scenario is it will encompass as much of that drive as it can in a redundant fashion. All products in the enterprise have the ability to expand their RAID volumes. Say you have a RAID 5 stripe across 4 disks; the amount of capacity you have is 3 disks' worth. When you add a disk, it expands and moves blocks so you now have a 5-disk RAID stripe. It recalculates parity and moves blocks around behind the scenes, putting blocks onto the new disk. And this is something I think is universal to all products. You end up with 4 drives' worth of capacity, and the equivalent of the 5th disk is used for parity. So you end up with a whole other disk's worth of capacity.
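The capacity arithmetic in that answer is just the standard RAID 5 formula, shown here as a one-line sketch (an editor's illustration, not vendor code): usable space is members minus one, so growing a stripe from 4 to 5 equal disks adds exactly one disk's worth.

```python
def raid5_usable(n_disks, disk_size):
    # RAID 5 keeps one disk's worth of parity, distributed across all members,
    # so usable capacity is (members - 1) * per-disk size.
    return (n_disks - 1) * disk_size
```

With 1-TB disks, `raid5_usable(4, 1)` gives 3 TB before expansion and `raid5_usable(5, 1)` gives 4 TB after: the "whole other disk worth of capacity" Ferrell mentions.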
NA: The question here is moving the stuff around to avail yourself of the added capacity. What is unique to Overland/SnapServer?
JF: The overhead is usually about 20% of your performance. That's the default we configure to, but it is user configurable. What's unique is the drives themselves encompass all your volumes. In a traditional RAID solution you would have one volume, say the engineering volume, that would get expanded. In our solution all these volumes get blended together in one storage pool. So all of the volumes use all of the drives. In a traditional RAID you would hard-partition those drives, so that certain drives are associated with certain volumes.
DF: I get that. And that is a very nice way of doing it. What I'm looking at is the constraint of moving everything around when you make a change. And my experience is that that would be a big change and things would slow down or grind to a halt or add risk of what happens if a drive goes out while you are making those changes. When I look at XIV, for example, they have a certain way of doing it. They spread it across all disks in blocks & keep all drive sizes the same so that when they have to recalculate or move things around, it can be incredibly efficient. This approach seems to have risks and overhead. This might be suitable for someone who is not that wired and has tape backup. That is where I'm coming from in looking at this solution.
You've explained all the pieces. But putting that together and making a big change of that sort seems problematic.
JF: The logic is the same as that used in millions of servers worldwide. The thing that is doing the expansion, and worrying about things like whether the RAID stripes are laid out right, is exactly the same code used in every RAID system. So it has an enormous amount of real-world testing, essentially millions of units deployed globally. So I think that is as robust as it's going to get.
DF: But it's usually applied to one set of disks.
JF: It's designed to do exactly the thing we are doing. It's just that typically this is very complicated to do. We've just made it trivially simple to manage. But we're not breaking it or changing it or the way it lays out blocks. It's just the same as it's always been. So the thing you are worried about, the robustness of the re-layout, is the same. We've virtualized using a file system that again is used in thousands of products and created a block layer on top of that. But that layer has nothing to do with the re-layout spanning multiple drive sizes, all of the areas you're worried about. They are very complicated, and you wouldn't want to design a proprietary technology for them.
DF: I understand that. Each layer I'm comfortable with. It's the putting of the three layers together, & what happens if something goes wrong in the middle, that I don't have a mental model of that says that's really cool. The issue is the three layers together. It's a virtualization layer, a file system layer, and a RAID layer. That's a lot of stuff to be moving around. That's a lot to be dealing with.
JM: This might be better discussed in person. Before we move on, I just want to be clear that we're not changing the way the RAID is being done. We're just giving the user a way to see it and manage it, but we're not changing the way you do RAID expansion.
Slide 8: Folks who are uncomfortable with the technology can use standard RAID; that is an option. We have a lot of customers who will continue to use standard RAID. But a lot of companies want the ability to put in a drive, suck it into the data pool, create volumes and have them in a storage pool, so they can allocate storage that might have been in the sales volume to the engineering volume. It gives them a lot of flexibility to reallocate the storage in a simple way.
And on the traditional RAID side we provide the ability to replicate back easily to a central location with Snap Enterprise Data Replicator, and we've got really great management tools. There are more than 300,000 SnapServer units deployed in standard RAID. These customers will get a new platform that's a lot faster and more flexible in doing basic RAID expansion.
Slide 9: This is an overview of the areas where we focus. We tend to do well in remote & branch offices of distributed enterprises, mostly because we have a small footprint, it's economical, and we have a robust way to replicate data to a centralized location for DR. We have a lot of implementations at pharmacies or retail locations where people use SnapServer in that fashion.
Slide 10: This is a quick overview of the two boxes side-by-side. The 1U scales to 120 Tbytes; the 2U scales to 288 Tbytes using Snap Expansion units. You have your choice of dynamic or traditional RAID, and we include the Enterprise Data Replicator in the price of the 2U.
Slide 11: We are announcing into our channel this week. We sell through a number of resellers across the globe. Prices range from less than $1,600 suggested retail for a 1U up to $7,199 for a single unit with 36 Tbytes using 3-Tbyte drives. So it is certainly economical, and we think from a feature standpoint it offers a lot of cool technology. We have folks trying it in a beta test who are happy with the way it works, and we're excited to get it into the channel this week.
DF: The price points are excellent. And the concept is great, it's the potential robustness that I'm concerned about. I don't have the data from you.
JM: I guess we can say that today it sounds fantastic. To your point, the reality is that new is not necessarily what everybody wants. So the customer has the option to set up traditional RAID and over time play with the DynamicRAID piece. It works. We promise.
DF: It's a great concept.
NA: We didn't talk about the UI & how easy it is or is not. Can you talk to that?
JF: SnapServer is an 11-year-old line. They tried to make it as simple as possible. So it is the same general UI that's been on the system. So when users expand the RAID, all they have to do is start the process, and it automatically adds the drives to the pool.
NA: I asked because I have 3 SnapServers in my home office that were evaluation units from the original line 10 years ago. What I found was that as Java evolved the UI didn't, and I can't administer my servers any more, because they don't work with the newest version of Java.
JM: I'm excited they're still working after 10 years.
NA: I've had a good experience with them. One drive failed, & I had to send it back, & they were able to recover my data. One of the others is a RAID 1; one drive has failed, but the other has been running for 10 years. They are a little hot & noisy.
JM: From a management standpoint we have a remote manager, the SnapServer Manager, with the ability to discover all the servers on a network. You can do global changes, and you can do OS upgrades easily. One of the great things about selling Snap is we have customers who come back over & over. They do last a long time. We have customers with units in operation that are 6, 7, 8, 10 years old. For a $1,500 product, I think that's pretty good. We could take a look at it. It could mean that you just need an OS upgrade.
NA: My other question is about the business model. SnapServer was to me big news, and it's been acquired, spun off, acquired, pretty much every permutation of M&A you can imagine. So what's different today in the business model versus 5 years ago vs 10 years ago?
JM: It's a brand that sells a lot. The difference today is the focus & dedication to the product.
NA: It sold a lot, but apparently it didn't make a lot of money.
JM: That's not necessarily true. It definitely has made money over time. It was profitable multiple times in its history. We are one of the few independent storage firms still standing. We sell through the channel. We've got a ton of customers. It's a great plan from a sales and marketing standpoint. It's a lot better than starting from scratch as a new company getting into the marketplace.
NA: Why has there been so much churn if everything you say is true?
JM: The storage industry in general has a lot of churn. I think that the acquisition by Adaptec was the most unfortunate. It was acquired by a company that had no idea of what to do with it, how to get into the storage business or the NAS business.
We're focused on getting the best product out there that we can. We put a lot of effort into upgrading Guardian OS and put a lot of thought & effort into the design of the platform.
NA: I would suggest that while it is great that you're trying to get the best product out there, you have a reputation problem.
JM: I don't agree as someone who spends all her time visiting partners around the world and talking to customers every day. Customers like the product. We have a huge installed base of happy customers. Most customers have no clue of the back end of how many times the business has been sold. They buy product through a channel partner & we service & support them.
NA: You alluded to VMware I think because your manager is a virtualization layer that has nothing to do with VMware. But do you have any VMware or hypervisor story?
JF: No, I said that, like VMware, we use a file system to virtualize the disks. We are the only ones who use a file system to virtualize whole volumes; they virtualize the files.
NA: NetApp's SCSI devices are actually files.
JF: You're right. I guess it is common.
DF: I like the way you put the whole package together. It's logical, & assuming that it works, it is a very interesting approach. I would like to recommend it. But the caution in me says there are a lot of new moving parts. If I don't understand how it works, there are a lot of other people who won't understand how it works. And they are taking a risk and should proceed with significant caution. So good luck in getting it out there. I think you could do a significantly better job of making crusty old people like me comfortable.