Storage Peer Incite: Notes from Wikibon’s July 27, 2010 Research Meeting
Virtualization and explosive data growth are the realities of data center operations in the early 21st century. But while IT shops boost server utilization from an average of 15% to 85% or higher and simultaneously scramble to manage huge data growth, most have given little thought to the network implications of these changes, which are tremendous. Virtualization centralizes computing in a much smaller number of boxes and shifts entire application compute loads from one box to another across the network, while the data explosion means tremendous growth in the amount of data moving across the data center network.
The result is that the Spanning Tree technology that predominates in data centers today is being pushed to its limits, and in some very large data centers it is already creating bottlenecks that impact performance and therefore service levels for business end-users. And as network speeds increase from 1 Gbit Ethernet to 10 Gbit and then, eventually, to much higher speeds, the cost of ports is going to rise significantly, mandating more efficient use of network resources than Spanning Tree supports.
The good news is that new technologies are available, and the IETF is working on a new standard, TRILL (Transparent Interconnection of Lots of Links). Meanwhile, Cisco Systems has come out with what it describes as a TRILL superset, FabricPath. This and other TRILL-based technologies from the network vendors provide a path into the future for data center and, eventually, office LAN networking that, combined with Fibre Channel over Ethernet (FCoE), will provide a more-than-adequate infrastructure to support computing demands into the foreseeable future.
Obviously, IT shops should start evaluating their projected networking needs and plan for an eventual transition to TRILL technologies. This transition does have costs and requires new skill sets, however. Because of these issues, which include a shutdown of the Spanning Tree network during the cut-over, IT should start the learning process concerning TRILL sooner rather than later. You do not want to find yourself in the position of having to move to an entirely new network technology when your Spanning Tree network hits the wall. Therefore, larger IT shops in particular should create a "sandbox" TRILL network where staff can gain experience in a small, non-production environment so that when the time comes the network staff can manage the move smoothly and with minimum disruption and cost.
This edition of the Wikibon newsletter examines the issues involved in moving to TRILL technologies to help you gain a better idea of what is involved. G. Berton Latamore
Data centers have seen a lot of change in the last five years, moving from static configurations built from physical components to a virtual infrastructure that abstracts the environment and allows for mobility. On July 27th, 2010, the Wikibon community welcomed Brad Hedlund, Technology Solutions Architect from Cisco Systems, to discuss how these changes are impacting data center networking. Data centers today require high-performance, highly scalable designs. The options available to re-architect networks have grown dramatically in the last three years.
Spanning Tree Replacements
Spanning Tree Protocol (STP) is a link layer network protocol that allows only a single active path between any two nodes, blocking any redundant links. This helps ensure a loop-free topology but limits the total bandwidth of the network. Switch architectures were designed with limited bandwidth to support these over-subscribed configurations. Spanning Tree has been enhanced in several ways over the years; see the Wikipedia entry for details.
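To make the bandwidth penalty concrete, here is a minimal Python sketch of the loop-free idea behind Spanning Tree. It is not the real protocol: actual STP elects a root by bridge ID and weighs path costs, whereas this sketch simply builds a breadth-first tree from a chosen root. The topology (two core switches, four dual-homed access switches) and all switch names are hypothetical.

```python
from collections import deque


def spanning_tree_links(links, root):
    """Split links into (active, blocked) using a breadth-first tree.

    Simplified stand-in for STP: one path is kept from the root to every
    switch, and every other (redundant) link is treated as blocked.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    visited = {root}
    active = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in neighbors[node]:
            if peer not in visited:
                visited.add(peer)
                active.add(frozenset((node, peer)))
                queue.append(peer)

    all_links = {frozenset(link) for link in links}
    return active, all_links - active


# Hypothetical topology: every access switch has an uplink to both cores.
links = [("core1", "core2")]
for i in range(1, 5):
    links.append((f"access{i}", "core1"))
    links.append((f"access{i}", "core2"))

active, blocked = spanning_tree_links(links, root="core1")
print(f"{len(active)} links forwarding, {len(blocked)} links blocked")
```

In this assumed topology only five of the nine links forward traffic; the second uplink on each access switch sits idle, which is the unused capacity the replacement technologies aim to recover.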
In 2007, Cisco started offering solutions that took the place of Spanning Tree, first with Virtual Switching System (VSS) for the Catalyst switch line and then Virtual Port Channels (vPC) for the Nexus switch line in 2008. VSS and vPC both make a pair of switches look like a single logical switch. These technologies require new hardware, software, and architectural changes, plus network downtime, to deploy. They are becoming popular for new data center architectures but are a challenge to implement in existing environments, requiring CIOs to find a compelling business value to justify the cost and production downtime to deploy.
Cisco recently announced a solution called FabricPath, which is a “superset” of the IETF proposal TRILL (Transparent Interconnection of Lots of Links), providing an option to replace STP and provide greater scalability and flexibility than VSS and vPC. This will typically be for environments that require thousands of ports. Many use cases for TRILL and FabricPath will be in FCoE environments (where SAN designs rely on multi-path configurations) and HPC configurations, which require non-blocking, high-performance, scalable networks.
As environments migrate from 1Gb to 10Gb Ethernet and eventually to 40Gb and 100Gb, ports become more expensive, dictating full utilization of assets. This cannot be accomplished with traditional STP and will drive the adoption of replacements such as VSS, vPC, and TRILL/FabricPath.
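The arithmetic behind this argument can be sketched in a few lines of Python. The port price and the forwarding fraction below are hypothetical figures chosen only for illustration, not numbers from the Peer Incite discussion.

```python
def cost_per_forwarding_port(price_per_port, forwarding_fraction):
    """Effective price of each port that actually carries traffic.

    price_per_port: purchase price of one port (hypothetical figure).
    forwarding_fraction: share of purchased ports that are forwarding,
        between 0 (exclusive) and 1 (inclusive).
    """
    if not 0 < forwarding_fraction <= 1:
        raise ValueError("forwarding_fraction must be in (0, 1]")
    return price_per_port / forwarding_fraction


# Assumed example: a 1000-unit port with half of the uplinks blocked.
half_blocked = cost_per_forwarding_port(1000.0, 0.5)
all_active = cost_per_forwarding_port(1000.0, 1.0)
print(half_blocked, all_active)
```

Under these assumptions, blocking half the ports doubles the effective price of each port that carries traffic, and the penalty grows in absolute terms as per-port prices rise.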
Action item: Internal IT infrastructure groups must increase their own efficiency in the face of growing competition from cloud offerings. CIOs should pilot the new network technologies discussed to determine the impact of new architectures on their stack and on change control and management practices. Look for solutions that support interoperability and come with a commitment to standards.
Network and security professionals need to be enablers of, not inhibitors to, data center virtualization. The efficiencies, agility, and lower TCO the business gains from data center virtualization are compelling, especially in challenging economic times.
Often network or security professionals construct barriers and silos to facilitate a partitioned IT infrastructure. This is understandable, because networks are burdened with legacy infrastructure and security threats continue to rise. However, the model of physically fencing resources to provide partitioned networks and associated security is changing, most likely forever.
Practitioners need to find ways to preserve security and network flexibility, but at the same time deliver more logical network segmentation. The notion of physically separated networks and VLAN containment is giving way to a virtual world.
Action item: Don't be the "No"-man sitting in your Ivory Tower. Network and security practitioners must endeavor to facilitate new models of securing modern infrastructure. This will require re-thinking legacy notions of how to physically secure infrastructure and architecting more logical and flexible environments. The result will be greater agility with fewer constraints on lines of business across organizations and, importantly, into global partner ecosystems.
Moving to new networks beyond Spanning Tree Protocol (STP)-based architectures will require support from server and application teams. In particular, as the saying goes, applications are the reason infrastructure exists, and this constituency is driving network infrastructure change.
The Wikibon community has identified four main candidates for next-generation network approaches: 1) Organizations that are very large with big, oversubscribed networks and sub-optimal port utilization; 2) Those looking to build new data centers; 3) Organizations that are growth-constrained by lack of network flexibility and poor scalability; and 4) Cloud service providers that can arbitrage infrastructure excellence by reselling IT services. In the view of the Wikibon community, these are the types of firms where investments beyond STP-based networks will deliver tangible ROI.
For organizations outside of this ‘sweet spot,’ the sell to CFOs will be more difficult because the benefits will be much 'softer.' Nonetheless, for those network professionals looking to the future, there is an opportunity to educate the organization on the inevitability of new network architectures, specifically teaming up with the server and application groups to support planning initiatives.
Network professionals need to be proactive in providing a roadmap that syncs with application and server futures. This will involve building proof points through pilot studies that will ensure adoption can move quickly when the time comes. Network engineering teams need to look for ways to bring value to application groups with enabling technologies that deliver Infrastructure 2.0.
Generally, application heads are not focused on infrastructure costs - that is another group's problem. The key for network professionals to sell advanced networking capabilities to the organization is to demonstrate how next gen infrastructure will make applications run better. This approach will create a tailwind from application and server groups and soften cost friction from the CFO's office.
Action item: In 'selling' next-generation network architectures to the organization, network professionals must focus on hard dollar value (e.g. port utilization) to get past CFO hurdle rates. However, beyond hard dollars, practitioners must emphasize the benefits to application value, specifically linking network infrastructure to improved service levels, better performance, and faster time to deploy application function.
Cloud service providers understand the constraints of the current network topologies and are working hard to provide a flexible infrastructure that will allow any-to-any connectivity across multiple high-speed paths, minimizing the number of connections to each component of Infrastructure 2.0.
CIOs will need to position their organizations either to compete with cloud service providers or to offer distinct advantages. For example, if an application runs on a SaaS cloud provider, but the data coming out of that system is critical for downstream applications run in-house, internal services need only be close enough on price; they do not always need to be the least-cost option. CIOs need to be ready to clearly and forcefully articulate the value of tighter integration of applications. However, there will be many Internet-facing or standalone applications where outsourcing could make sense. Internal systems will need to be nearly cost-competitive with external services. To ensure that, CIOs will have to drive the IT organization as a whole towards the lowest-cost IT Infrastructure 2.0, and plan to replace older technologies such as Spanning Tree, and practices such as conservative depreciation factors, before they impede progress.
Action item: IT organizations will have to position themselves to implement leading-edge solutions faster, particularly for virtualized infrastructures and virtual networks. The internal private cloud should be modeled on best-of-breed service providers, and should include capabilities for self-service and charge/show back.
As practitioners transition from 1Gb Ethernet to 10Gb Ethernet, it’s not just about moving to a higher bandwidth, but about taking the opportunity to re-examine the architecture of the entire network. Specifically, older switch architectures that use Spanning Tree are over-subscribed and underutilized. New switch architectures can support full line rate, non-blocking configurations.
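Over-subscription is usually expressed as the ratio of server-facing bandwidth to uplink bandwidth on a switch. The short Python sketch below works through one hypothetical access switch (48 x 1Gb server ports and 4 x 10Gb uplinks, figures assumed purely for illustration) with and without half of its uplinks blocked.

```python
def oversubscription_ratio(downstream_gbps, upstream_gbps):
    """Ratio of server-facing bandwidth to forwarding uplink bandwidth.

    A ratio of 1.0 or lower means the uplinks can carry every server port
    at full line rate; higher values mean the switch is over-subscribed.
    """
    if upstream_gbps <= 0:
        raise ValueError("upstream_gbps must be positive")
    return downstream_gbps / upstream_gbps


# Hypothetical access switch: 48 x 1Gb server ports, 4 x 10Gb uplinks.
server_gbps = 48 * 1.0

# Assumption: half of the uplinks are blocked to keep the topology loop-free.
with_blocking = oversubscription_ratio(server_gbps, 2 * 10.0)

# All four uplinks forwarding, as a multipath design would allow.
all_uplinks = oversubscription_ratio(server_gbps, 4 * 10.0)

print(f"half blocked: {with_blocking:.1f}:1, all forwarding: {all_uplinks:.1f}:1")
```

With these assumed figures the same hardware goes from 2.4:1 to 1.2:1 simply by putting the blocked uplinks to work; reaching a true non-blocking 1:1 would still require more uplink capacity.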
At the same time, practitioners must keep the entire data center in balance across servers, storage, networks, and applications. The job of the network is to enable business capabilities and not become a constraint or a prohibitively expensive line item in the budget.
Action item: Practitioners must exploit changes in technology, not just in networks but associated developments in server and storage domains. These include virtualization trends in servers and storage, which include scale-out architectures that allow for mobile placement of applications. These trends drive the need for higher data rates to deliver the required performance and service levels. The required architecture will include much higher access densities that efficiently utilize the ports and bandwidth of the network by replacing Spanning Tree Protocol with solutions such as VSS, vPC, and FabricPath from Cisco Systems. Storage replication and server mobility technologies such as VMotion create large amounts of data that flow over the network. From a network perspective, this means you need to have sufficient bandwidth and agility to support these applications.
At this week's Wikibon Peer Incite discussing network architectures beyond spanning tree, Gerry Murphy shared that historically, the networking business has learned to live with the phenomenon of over-subscription, and its technology transitions have mainly come about through attrition.
Cisco has brilliantly used this dynamic to extend its dominance in the business by delivering incremental improvements and selling futures. This practice has frustrated competitors and, while incredibly successful, is beginning to open cracks in Cisco's armor in increasingly large niches such as cloud services.
In the view of the Wikibon community, Cisco must put less emphasis on blue-sky visions with partial solutions delivery (e.g. Data Center 3.0) and be more clear and succinct on what customers can do today, providing roadmaps on how to get from Point A (today) to Point B (tomorrow).
At the same time, Cisco competitors need to move beyond Cisco-bashing and articulate a value proposition that specifically relates to reducing the number of network layers and improving utilization. As well, competitors need to lay out a clear vision that enables virtualization. A key issue for such players is differentiation through vision and execution. Vendors need to focus on their respective strengths, which include software tools, automation, and orchestration. The industry needs to tell a story that is more integrated and leverages its technology prowess beyond straight networking (i.e. convergence).
Action item: Convergence is changing the key management points in the network (e.g. servers are doing more and software value add is a huge opportunity). Suppliers need to pick a path (e.g. commodity or value add; small or large, etc.), deliver a vision around virtualization, and provide tangible products that deliver value today. The most useful visions for practitioners are those that provide specific advice and proof points on how to transition to next-gen network architectures, safely.