11. 7.10 STP Example
In this video, we want to get some practice analysing a spanning tree topology and determining which switch is the root bridge, which ports are root ports, which are designated ports, and which ports are blocking. And I wish I had been exposed to this earlier. During my interview at Walt Disney World, they asked me a question very similar to the one we're going to answer here: which ports are blocking in this topology? Now, at the time, I thought I knew about spanning tree. I knew that it would prevent a layer-two topological loop by not sending traffic over certain links. But what I did not understand is that on those links that are not carrying traffic, both ends are not blocking. Only one end is blocked. The other end is still a designated port, because every single link has a designated port. So I missed the question in the interview. Fortunately, I still got the job.
But I wish I had known what we're about to go over right now. Question number one is: which switch is the root bridge? And we know it's the switch with the lowest BID, the lowest bridge ID; that's the priority first, followed by the MAC address as a tiebreaker. And it looks like switches A and B have lower priorities than switches C and D: A and B have priorities of 16,384, while C and D have priorities of 32,768. So it's going to be either switch A or switch B for the root. And since the priorities are tied, we look at the MAC address. Who has the lowest MAC address? Well, if we take a look at the first few hexadecimal digits of each MAC address, it looks like switch A's 0 is less than switch B's 1. And from that, we can conclude that switch A is the root bridge.
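To make that election concrete, here's a minimal Python sketch of the comparison logic. The priorities match the example, but the MAC addresses are made-up placeholders, since the full addresses aren't shown here:

```python
# Root bridge election: the lowest bridge ID wins, where the bridge ID
# is the priority first, then the MAC address as a tiebreaker.
# MAC addresses below are illustrative placeholders, not real values.
switches = {
    "A": (16384, "0000.1111.1111"),
    "B": (16384, "1000.2222.2222"),
    "C": (32768, "2000.3333.3333"),
    "D": (32768, "3000.4444.4444"),
}

def bridge_id(item):
    priority, mac = item[1]
    # Equal-width lowercase hex strings compare correctly as text
    # once the dots are stripped.
    return (priority, mac.replace(".", ""))

root = min(switches.items(), key=bridge_id)[0]
print(f"Root bridge: switch {root}")  # -> switch A
```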
Now we ask, "Which ports are the root ports?" And remember, there are no root ports on the root bridge. A root port is the one and only port on a non-root bridge that's closest to the root in terms of cost. And we're going to use the traditional short path cost method, where a ten-gig link has a cost of 2 and a gig link has a cost of 4. Now, switch B is connected to switch A over a ten-gigabit link (that's what the "Te" in the interface name refers to), and that link has a cost of 2. Is that the root port? Well, let's figure out the cost of the other paths back to the root, and we'll pick the lowest one. If I left Gig 1/0/7, it would cost 4 to get to C, and another 4 to get from C up to A over a gig link. That's a total cost of 8. Same thing if I go down to D: that would cost me 4 to go down to D, and then another 4 to go up to A, for a total of 8. So yes, that ten-gig port, with its cost of 2, is the lowest cost to get back to the root bridge, switch A. That will be our root port. What about switch D? It could go up to B at a cost of 4 over that gig link, and then over to A at a cost of 2 over the ten-gig link. That's 4 plus 2, for a total of 6. Or, for a cost of 4, switch D can take a direct path right up to switch A over that gig link. Well, 4 is less than 6, so we'll say that the root port for switch D is its gigabit port going directly to switch A. What about switch C? Well, it's got three different ways that it might potentially get back to switch A. It could go to switch B for a cost of 4 and then over to switch A for another cost of 2; 4 plus 2 is a total cost of 6. Alternatively, Gig 1/0/10 and Gig 1/0/1 each connect directly to switch A, and they each have a cost of 4. So we've got a tie. What do we do in a case like this? Well, if we have a tie where two ports have equal costs back to the root bridge, we ask who is at the other end of each link, and we use the link whose far-end switch has the lowest BID. But in this case, both links go to switch A. It's the same bridge ID at the far end of each link.
So we have another tie. What's the tiebreaker now? The reason I crossed the connections between switches A and C in the topology is that I've had many students who have heard the rule that we select the lowest port ID as the tiebreaker, which is true, but many students misinterpret that and think it's the lowest port ID on our local switch. Instead, we take a look at the other end of each link and ask, "Which far-end port ID is the lowest?" If we look at the other end of the link coming out of Gig 1/0/10 on switch C, that goes into Gig 1/0/4 on switch A. Looking at the other end of the link coming out of switch C's Gig 1/0/1 port, that goes into Gig 1/0/3 on switch A. So the lower far-end port ID is Gig 1/0/3, which is at the end of the link coming out of Gig 1/0/1 on switch C.
So after a double tiebreaker, Gig 1/0/1 becomes our root port on switch C. The next question is: which ports are the designated ports? We have a designated port on every single segment, and it's the end of that segment that's closest to the root in terms of cost, and we don't get closer to the root than being on the root. So we know that every segment connecting to the root has its designated port on the root bridge itself; all of the ports on the root bridge are designated ports for their segments. And we've only got a couple of segments left: the segment between switches C and B, and the segment between switches D and B. Let's take a look, first of all, at that segment between switches C and B. If I'm on the switch C side of that segment, my cost is 4 to go up to switch A over that gig link. If I'm on the switch B side, my cost is only 2.
So which end of this link is closest to the root? The switch B side. So we're going to say that for that segment, the designated port is switch B's Gig 1/0/7. Let's take a look at the link between switches B and D. If I'm on the switch B side of that link, it only costs 2 to hop over that ten-gig link to switch A. However, if I'm on the switch D side, it costs 4 to take that gig link up to switch A. So which end of this link has the lowest cost? The switch B end. So we're going to say that for the link between switches B and D, the gigabit port on switch B is our designated port. And assuming the ports are all administratively up, all of the remaining ports are going to be blocking, or, as we sometimes call them, "non-designated ports." This gives us the loop-free topology that we see highlighted here in blue. That is the job of the spanning tree protocol: it creates this loop-free topology while still allowing for some backup links. So if any of the links that we see in blue were to fail, we've got a backup path.
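As a recap of the root-port logic, here's a hedged Python sketch of switch C's decision, applying the three tiebreakers in order. The costs and far-end port IDs come from the example above, but the local port name for the path via switch B is hypothetical, since it isn't named in the diagram:

```python
# Root-port selection on a non-root switch: among candidate uplinks,
# prefer (1) lowest total cost to the root, then (2) lowest neighbour
# bridge ID, then (3) lowest far-end port ID.
candidates = [
    # (local port, cost to root, neighbour BID, far-end port ID)
    ("Gig1/0/24", 4 + 2, "B", 7),  # hypothetical port toward switch B
    ("Gig1/0/10", 4,     "A", 4),  # lands on switch A's Gig 1/0/4
    ("Gig1/0/1",  4,     "A", 3),  # lands on switch A's Gig 1/0/3
]

root_port = min(candidates, key=lambda c: (c[1], c[2], c[3]))[0]
print(f"Root port on switch C: {root_port}")  # -> Gig1/0/1
```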
12. 7.11 STP Convergence Times
We know that the spanning tree protocol allows us to have physically redundant links while logically having a loop-free topology. And that's what we have here. On switch three, interface Gig 0/2 is currently blocked; we do not have user traffic over that bottom link. And switch three and switch two are both pointing up to switch one, because switch one is our root bridge; switch three connects to it via interface Gig 0/1. However, let's imagine that there was a failure on that link between switch one and switch three.
After all, that's what spanning tree is built for: to help us recover from something like that. The question we're going to answer in this video is, how long does it take to recover? Do we immediately start forwarding traffic out of Gig 0/2? Actually, no. We're going to remain in that blocking state for 20 seconds. Switch three is waiting for bridge protocol data units (BPDUs) to be received from switch one. If it does not see any BPDUs within 20 seconds, it realises something must have happened to the topology, and it's going to start transitioning its blocking port into a forwarding state. But it does not do it immediately. It next transitions from blocking to listening. In the listening state, it is listening for BPDUs on any other interfaces it may have. Right now it only has one other interface, but it's taking a look at the BPDUs being advertised around the network and trying to determine which of its ports will become its new root port. And who knows, maybe it's now the root bridge, if the old root bridge has been destroyed.
So it examines those BPDUs for 15 seconds, but we're still not forwarding user traffic. We then transition from the listening state into the learning state, where we stay for 15 seconds. We are learning MAC addresses; we're starting to populate our MAC address table with traffic seen on our different interfaces. However, we are still unable to forward traffic on Gig 0/2; we haven't transitioned to the forwarding state yet. But after a total of 50 seconds (20 seconds spent blocking, 15 seconds spent listening, and an additional 15 seconds spent learning), we finally transition to the forwarding state, and the previously blocked port becomes our new root port. And the previous root port is going to be blocked instead. That was a total of 50 seconds of delay, and this is using the standard spanning tree protocol. But be aware that there is another version called the "rapid spanning tree protocol," and depending on the situation, it may be possible to recover from a network failure in as little as one or two seconds. It's very, very rapid. But traditionally, I want you to know the blocking, listening, and learning states that add up to 50 seconds of delay.
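Those timers are worth memorising, and the arithmetic is simple enough to sketch in a few lines of Python:

```python
# Classic 802.1D recovery after a root-port failure: the blocked port
# walks through these states before it can forward user traffic.
stp_states = [
    ("blocking (waiting out the max-age timer)", 20),
    ("listening (evaluating BPDUs, electing new roles)", 15),
    ("learning (populating the MAC address table)", 15),
]

total = sum(seconds for _, seconds in stp_states)
for state, seconds in stp_states:
    print(f"{seconds:>2}s {state}")
print(f"Total delay before forwarding: {total} seconds")  # -> 50
```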
13. 7.12 STP Variants
In this video, we want to consider some of the different flavours of spanning tree protocol, going all the way back to the beginning at DEC, with Radia Perlman. She developed what was known as the DEC standard, and it used what was called a CST, a common spanning tree. In fact, when the IEEE came out with the standards-based version, 802.1D, that was also considered to be a common spanning tree. And what that means is that even if we have multiple VLANs configured on these switches and trunks between the switches, every VLAN is still going to be using the same instance of a spanning tree. In other words, VLANs 100, 200, and 300 are all going to be looking at the same switch as the root bridge.
They're going to have the same forwarding and blocking ports on the various switches. And later, Cisco came along and said, "Okay, let's see if we can improve on that just a bit, because we realise different VLANs by their nature have different traffic patterns." Maybe for one VLAN, it would be better for SW1 to be the root. Maybe for another VLAN, SW2 could be the root. Maybe another VLAN would benefit from having SW3 as the root. A one-size-fits-all spanning tree is not always optimal. So Cisco developed the PVST approach, the per-VLAN spanning tree. And as the name suggests, each VLAN is going to have its own instance of a spanning tree. So for example, let's say VLAN 100 uses switch SW1 as its root; another VLAN could use a different switch. And in the literature, you may see this written a couple of different ways.
You may see it written as PVST or PVST+. It's typically fine to use those terms interchangeably, but I wanted to make sure you knew the difference. If you see a plus sign after PVST, that means the trunks interconnecting these switches are using the IEEE 802.1Q trunk encapsulation type, as opposed to Cisco's ISL, which is a Cisco-proprietary trunk encapsulation type. But typically, we just use those terms interchangeably. Now let's turn to another standards-based approach to spanning tree protocol: MSTP, the multiple spanning tree protocol. You may also see it written as just MST, and it is a standard: IEEE 802.1s.
And what the IEEE did was notice what Cisco noticed: not every VLAN would benefit from having the same switch as the root bridge. But they came up with a different approach. Instead of giving every single VLAN its own instance of a spanning tree, this IEEE standard observes that we may have several VLANs that would all benefit from the same switch being the root. For example, let's say that in this basic topology, switch A would be the optimal root bridge for VLANs 1, 2, 3, and 4, while switch B would be the best root bridge for VLANs 5, 6, 7, and 8. So instead of having eight different instances of spanning tree running, as we would with PVST or PVST+, we can have just two instances.
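Here's a minimal sketch of that grouping, using the VLAN-to-instance assignments from this example (the instance numbers themselves are arbitrary labels chosen for illustration):

```python
# MST: VLANs are grouped into instances, and each instance runs one
# spanning tree with its own root bridge.
mst_instances = {
    1: {"root": "switch A", "vlans": [1, 2, 3, 4]},
    2: {"root": "switch B", "vlans": [5, 6, 7, 8]},
}

for instance, cfg in mst_instances.items():
    print(f"Instance {instance}: root = {cfg['root']}, VLANs = {cfg['vlans']}")
# Eight VLANs, but only two spanning tree instances to maintain.
```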
We can define an instance where switch A is the root, and we can assign to that instance the VLANs that would benefit from having switch A as the root. We set up a second instance with switch B as the root, and then we assign to it those VLANs that would benefit from switch B being the root. However, so far, all of these different options we've talked about still have the same default convergence time. If a link fails in this redundant physical loop, it takes about 50 seconds to start using an alternate path, and that can be a long time. Fortunately, there is another standards-based approach that can speed things up, and that's called RSTP, the rapid spanning tree protocol. And when I say it's going to speed things up, I'm talking on the order of a few milliseconds to typically a maximum of about 6 seconds, which, by the way, is three times the default hello time of 2 seconds. That's where we get the 6 seconds. The few milliseconds can happen when a direct link fails, because with the rapid spanning tree protocol, all of the switches in the topology can participate in the convergence, educate one another, detect when something goes wrong, and send out a message called a TCN, a topology change notification.
And that can really speed things up when it comes to convergence. And I mentioned that this is a standard; specifically, it's IEEE 802.1w. Oh, by the way, we talked about Cisco's PVST and PVST+. They also came out with a variant of the rapid spanning tree protocol, which they call rapid PVST or rapid PVST+. They're simply using their per-VLAN spanning tree approach, but overlaying it with this RSTP standard. And to better understand the rapid spanning tree protocol, we need to define some terms. First, let's consider the different roles that our ports may have. And the good news is, once we understand the port roles of a PVST topology, for example, then we'll understand this pretty well. We still have the concept of a root bridge, which is the switch with the lowest bridge ID, made up of the bridge priority followed by the MAC address. We still have the concept of root ports. Remember, we have a root port, and only one root port, on every non-root bridge. So the root bridge has zero root ports, and the root port on a non-root bridge is the port on that bridge or switch that's closest to the root in terms of cost.
We still have the concept of designated ports. Remember that every segment has a designated port. And you might wonder: hold on, what about that other link from switch three up to the hub, that shared-media hub? Well, that entire path from the switch up to the hub and back down the other side is one segment. So we do not have a designated port on that other link, because it's part of the same segment. We also have ports that are administratively shut down, or disabled. Where it gets a little bit different, though, is with blocked ports. There are two types of blocking ports in the rapid spanning tree protocol.
One is an alternate port. We've got a full-duplex point-to-point connection between switch two and switch three, and if one end of it is blocking thanks to spanning tree protocol decisions, that port is known as an alternate port. Another type of blocking port we might see (hopefully not) involves an Ethernet hub. Hopefully we do not have these in our network, but if we did, and we had this loop up to the hub and back, obviously that's a layer-two loop. So we're going to have one port blocked. But if we're going through a shared-media hub, instead of calling it an alternate port, we call it a backup port. The port states also differ a bit with RSTP. We still have a state where we're throwing packets away.
And by that, I do not mean BPDUs, bridge protocol data units; we're still processing those. But we're discarding user traffic if we're in the discarding state, which is similar to the blocking state in previous versions of the spanning tree protocol. We do not have a listening state, but we do have a learning state, where we're starting to populate our MAC address table. We're still not forwarding user data yet, but then we transition to forwarding. And once again, unless we change the default hello timer to something else, this will usually take a maximum of about 6 seconds. I also want you to know some of the different names for link types. We have point-to-point links that go between switches, because each link only has one device on each end; we call that "point to point." And you might wonder why I did not label the laptop connection into the switch as a point-to-point connection, because physically it is one, but it's a special type: it's called an edge port. An edge port is where we're connecting an end-user device. In other words, there's no switch on the far end, no device capable of sending BPDUs into the network; it could not cause a loop. And as a result, we don't want to have to wait through the spanning tree timers for that port to go active.
So we're going to enable it as an edge port. Cisco uses the term "PortFast" for this: a port that's going to go active almost immediately, because we have given our solemn vow to that switch port that we will not connect it to a switch or anything else that could send it a BPDU. So, despite the fact that the link between the laptop and the switch is physically point to point, we refer to it as an edge port. Now, what about that shared-media hub? Again, I hope we don't have these in our modern networks, but if we did, that would be called a shared link. And you might be wondering how the switch knows whether it's connected to another switch or to a hub on a given port. Well, it makes an assumption: if the port is running full duplex (in other words, we can transmit and receive simultaneously), it assumes a point-to-point link going to another switch. We cannot run full duplex with an Ethernet hub; we have to run half duplex, where we can send or receive at any one time, but not both at once. And that means if the switch sees a port set to half duplex, it's going to assume that it is a shared link connecting out to an Ethernet hub.
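Here's a small Python sketch of that inference. The function is my own illustration of the rule of thumb, not an actual switch API:

```python
# RSTP link-type inference: full duplex implies a point-to-point link
# to another switch; half duplex implies a shared link through a hub.
# An operator can also mark a port as an edge port (Cisco: PortFast)
# when only an end-user device is attached.
def rstp_link_type(duplex: str, edge_configured: bool = False) -> str:
    if edge_configured:
        return "edge"            # end-host port: goes active almost immediately
    if duplex == "full":
        return "point-to-point"  # eligible for the rapid proposal/agreement handshake
    return "shared"              # suspected hub: fall back to slower convergence

print(rstp_link_type("full"))                        # point-to-point
print(rstp_link_type("half"))                        # shared
print(rstp_link_type("full", edge_configured=True))  # edge
```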
Now, I mentioned earlier that switches using the rapid spanning tree protocol participate in the convergence to speed things up. Let me give you an idea of how that works. Let's say we've got a large topology with lots of switches, and we're focused on just a couple of them; in fact, the topology is so big that we don't even see the root bridge, which was at the very bottom of your screen. But something happened, and it went down, and now the newly elected root bridge is at the top of the screen. Well, switch one's top port, which was a designated port, realises, "I just received a BPDU proving to me that the root bridge lives somewhere up this way. So I should not be a designated port; I should be a root port." And we make that transition in step one, from a designated port to a root port. The problem is, the bottom port on SW1 is currently a root port, and we don't want to cause any sort of temporary loop during this time of convergence. So we're going to temporarily set that bottom port to blocking. Now, while we're blocking (and remember, when I say blocking, we're not blocking BPDUs, we're blocking user data), we're going to send a proposal down to our neighbour on the south side, and that proposal says, "Hey, I propose that you come through me to get to the root bridge."
"And as proof that you should come to me, check this out: here's a BPDU showing that the elected root bridge lives at the top of the screen. You need to come through me to get there." Switch two looks at that proposal and says, "You've convinced me. I should not be a designated port on top; I should be a root port." So it changes from a designated port into a root port; that's step four. And then in step five, we let switch one know that, okay, we're convinced, and we have transitioned our end of the link to a root port. In other words, we send an agreement. And once that arrives, switch one can tell that the other end of this link has changed. It knows the coast is clear for it to transition, in step six, from the blocking state to being the designated port on that segment, remembering that every segment has a designated port, and it's the port on that segment that's closest to the root in terms of cost. That is the way switch one educated switch two about this change, and it happens very, very rapidly.
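To keep those steps straight, here's a simple trace of the handshake as described above; the messages are heavily simplified paraphrases, not actual BPDU contents:

```python
# RSTP proposal/agreement sync on a point-to-point link, as a trace.
steps = [
    "SW1 top port: designated -> root port (a superior BPDU arrived)",
    "SW1 bottom port: temporarily blocking, to prevent a transient loop",
    "SW1 -> SW2: proposal ('come through me to reach the root bridge')",
    "SW2 top port: designated -> root port (proposal accepted)",
    "SW2 -> SW1: agreement ('my end of the link is now a root port')",
    "SW1 bottom port: blocking -> designated port, forwarding resumes",
]
for number, step in enumerate(steps, start=1):
    print(f"Step {number}: {step}")
```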
And then the process would continue: switch two would take its bottom root port, put it into a blocking state, send a proposal to its neighbour, and the process would happen all over again. So that's a look at some of the different variants of the spanning tree protocol. We had the original DEC version pioneered by Radia Perlman back in the 1980s, and then in 1990, the IEEE standardised it as 802.1D. We took a look at the subtle difference between a couple of Cisco variants, PVST and PVST+, the plus signifying that the switches are interconnected with trunk links using the IEEE 802.1Q trunk encapsulation type. We took a look at MST, or MSTP as some literature calls it, the multiple spanning tree protocol, where we define different instances of spanning trees, select a root for each instance, and then populate each instance with the appropriate VLANs. And finally, we took a look at the rapid spanning tree protocol, which can speed up that 50-second convergence time to just a few milliseconds, with a maximum of about 6 seconds.
14. 7.13 Link Aggregation
If I want to interconnect these two Ethernet switches, like we see on screen, I could do so with a single link. However, that might be a bottleneck, so I might want to add a second link. But with a spanning tree protocol running, spanning tree will say, "I'm only going to allow user traffic over one of those links," to prevent the formation of a loop and the resulting broadcast storm. Wouldn't it be great if, instead of having one link just standing by and waiting for the other link to fail, we could actively use both links? That's what we can do with a technology called link aggregation, or LAG. We can logically bundle multiple physical links into a single logical link. And you often see that on a topology drawing as an oval around those links, indicating that they are logically bundled together. Sometimes that bundle is referred to as an "EtherChannel."
And with this EtherChannel connection, we can simultaneously send traffic between both switches using both links. That's going to give us more bandwidth between the switches. If I have a couple of gigabit links logically connecting my switches, I suddenly have two gigabits of bandwidth between those switches. And although I'm only showing you two links here (this can vary based on your switch hardware), we could have an EtherChannel bundle containing two links, four links, or a maximum of eight links. This will provide us with load balancing across all of those links, and it also gives us redundancy. With spanning tree, we would have one of these two links standing by, waiting for the other to fail. But an EtherChannel gives us redundancy as well: if one of these two links were to go down, the other link would continue to function. If I had a bundle with eight links and one or two failed, the remaining links would continue to function. So it does give us redundancy. And when we're setting this up, we could just say, "I want to turn on link aggregation," or we could configure the switches to negotiate the formation of an EtherChannel using a protocol such as PAgP.
That's the Port Aggregation Protocol, a Cisco-proprietary protocol that negotiates the formation of an EtherChannel. There is also an industry standard called LACP, the Link Aggregation Control Protocol, and they work similarly. Let's take a look at the logic of how they can negotiate the formation of a channel. First, with PAgP: I mentioned that we could just set both sides to "on," where we're not sending PAgP frames at all. But if we want to use PAgP, we could set our mode to "auto" or "desirable." With "auto," we're willing to become an EtherChannel port, but only if the other side invites us into an EtherChannel; I need to receive a PAgP frame, and I'm not going to initiate it myself. So think about what happens if both sides of this EtherChannel have their ports set to auto: it's not going to work, because even though both sides are willing to form an EtherChannel, nobody is suggesting that they do it. So we don't bring one up. If both sides are desirable, a channel will be brought up, because a desirable side is going to initiate the formation of a channel by sending PAgP frames. And if the other side is set to desirable, it'll say, "Sure, let's do it. Let's bring up an EtherChannel."
Or if the other side is auto, it receives those PAgP frames from the desirable side, and the auto side says, "Yes, I was waiting for somebody to ask; let's definitely form an EtherChannel." So those are the different combinations and permutations of how PAgP can negotiate the formation of a channel. But I mentioned that there is an industry standard as well, and that is LACP, the Link Aggregation Control Protocol. Logically, it works the same, except the names are different: instead of "auto," we have "passive," and instead of "desirable," we have "active." But as you'll see with the checks and Xs on screen, the logic remains the same. So is there an advantage to one LAG protocol versus another? Well, there might be. And again, this can vary based on your switch model, but LACP supports having more than eight ports in a channel. You can identify eight additional ports that can fill in the gaps if individual links start to fail within that eight-port channel. So we can have eight backup links if we want. Suddenly, we're dedicating a lot of switch ports to an EtherChannel; I'm not sure we always want to do that, but we do have that option with LACP, and we do not with PAgP.
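The negotiation logic can be summarised in a few lines of Python. This is a sketch of the truth table only; a real switch would also refuse to mix PAgP and LACP modes on the same bundle, which this toy function doesn't check:

```python
# Will an EtherChannel form? PAgP: desirable initiates, auto responds.
# LACP: active initiates, passive responds. "on" forces the bundle
# without sending any negotiation frames.
INITIATORS = {"desirable", "active"}
RESPONDERS = {"auto", "passive"}

def channel_forms(side_a: str, side_b: str) -> bool:
    if "on" in (side_a, side_b):
        return side_a == "on" and side_b == "on"  # "on" only pairs with "on"
    # At least one side must initiate; the other may initiate or respond.
    return (side_a in INITIATORS or side_b in INITIATORS) and \
           {side_a, side_b} <= INITIATORS | RESPONDERS

print(channel_forms("auto", "auto"))       # False: both waiting to be asked
print(channel_forms("desirable", "auto"))  # True: one side initiates
print(channel_forms("active", "passive"))  # True
print(channel_forms("on", "desirable"))    # False
```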
Now, we said we're going to load balance across this EtherChannel, and a lot of people have a false sense of security about how that works. Let me explain. And again, this is going to vary based on your switch model. Let's say that we've got some servers hanging off of switch B, and we've got this PC that's going to be sending traffic to those different servers. In fact, let's imagine we have lots of PCs off switch A sending traffic to the servers, and I've got four links in this EtherChannel bundle. Many people would guess that we're going to load balance like this: we'll send the first packet over the top link, the next packet over the second link, the next packet over the third link, and the next packet over the fourth link, and then we'll just start over again. That's not the way it works. There are load-balancing algorithms that you need to be aware of. On many switches I've worked on, load balancing is based on the destination MAC address by default.
Now, here's what I mean. If I have four links, how many binary bits does it take to represent four different possibilities? Two bits; two bits will give us a unique identifier for each of those four links. So if we're load balancing on destination MAC addresses (again, the default on many switches I've worked on), we're going to be looking at the last two bits of the destination MAC address, which is 48 bits long. We normally write a MAC address as a hexadecimal number, and I didn't write out the full MAC address for these three servers, but I'm showing you the last hexadecimal digit for each of them, and they all look different. But if we break those hex digits down into binary (remember, four binary bits make up one hex digit, giving us 16 different combinations), and we look at 1, 5, and D in binary, the two rightmost bits are the same: 0001, 0101, and 1101 all end in 01. They all lead to the same place. That means instead of load balancing by taking turns across the links to the servers, the one link identified by the bits 01 is going to be used for all of the traffic going to all three of those servers. As a result, we're not performing the load balancing that we might have expected. So what can you do? See what other load-balancing algorithms are available on your switch platform.
Typically, you can look at the source or destination IP address or MAC address. What I like to do is combine both source and destination; I think that adds an element of randomness. So I might choose something like "source-destination MAC" as my load-balancing algorithm. In that case, I'm considering not just the last two binary bits of the server's MAC address but also the last two binary bits of the PC's MAC address. And what happens with those last two bits from the PC and the last two bits from the server? We perform an exclusive OR (XOR) operation on each pair of bits. That's a Boolean operation. An exclusive OR says: if we're comparing two bits and they're the same, the result is 0; if the two bits differ (one is exclusively something the other is not), the result is 1. So if the last bit of my PC's MAC address happened to be a 1 and the last bit of the server's MAC address happened to be a 0, the XOR result would be 1, because a 1 and a 0 are different. If both were 1, the result would be 0. If both were 0, that would also give 0. We get a 0 when the two bits match; we only get a 1 when they're different. By doing that, we're adding some randomness to the selection of the link that we're using.
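Here's a Python sketch of both hashing flavours, using shortened example MAC addresses. The exact hash a given switch uses varies by platform, so treat this as an illustration of the idea rather than any vendor's algorithm:

```python
# EtherChannel link selection by hashing MAC addresses. With four
# links, only the low-order two bits of the hash matter.
def pick_link(src_mac: str, dst_mac: str, num_links: int = 4) -> int:
    src = int(src_mac.replace(".", ""), 16)
    dst = int(dst_mac.replace(".", ""), 16)
    return (src ^ dst) & (num_links - 1)  # num_links assumed a power of 2

# Destination-only hashing (source held constant at zero): servers
# ending in hex 1, 5, and D all hash to the same link, because
# 0001, 0101, and 1101 all end in binary 01.
for dst in ("0000.0000.0001", "0000.0000.0005", "0000.0000.000d"):
    print(pick_link("0000.0000.0000", dst))  # 1, 1, 1

# Folding in the source MAC spreads different PCs across the links:
print(pick_link("0000.0000.0002", "0000.0000.0001"))  # 3
print(pick_link("0000.0000.0003", "0000.0000.0001"))  # 2
```

And that's a look at link aggregation, also commonly referred to as an "EtherChannel."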
15. 7.14 Port Mirroring
If we want to troubleshoot a network connectivity problem, perhaps a server that is not successfully communicating with a client, a great way to approach it is to capture the packets going between the client and the server so we can analyse them: take a look at what's going on in the layer-two header and in the layer-three header. But the challenge is that a switch has learned the ports off of which these devices live; it's learned their MAC addresses, in other words. So when the server is sending a frame to the client, that frame is only going to go out the port connected to the client. How can I connect a laptop running some sort of sniffing software and get a copy of that frame so that I can analyse it? And by the way, when I say sniffing software, there's a great free piece of software you might want to download. It's called Wireshark, and you can get it from wireshark.org. It is a fantastic troubleshooting utility, and I might be running it on this sniffer laptop. But how do I get a copy? Well, some of your Ethernet switches will support a port mirroring feature.
With port mirroring, we can identify a port and say we want copies of frames coming into that port, going out of that port, or both. Or we could say all frames in a particular VLAN. We can make copies of all of those and send them out the port connected to our sniffer laptop by configuring that port as the destination for our port mirroring configuration. Then, when the server sends a frame to the client, the switch is going to make a copy and also send it down to the sniffer. But a word of caution: you don't want to leave a port in this state, because somebody may come along later and plug a PC into that seemingly empty port. They think, "Well, we need to install a new PC. Here's an empty port; let's use that." And suddenly, that PC, whether the user realises it or not, is receiving copies of frames that it should not be receiving. So you want to be very diligent about unconfiguring port mirroring on that port when you're done. Another option is to designate a port on your switch, such as the very first port or the very last port, and just know that it is reserved: you're not going to connect any end-user station, server, or other network appliance to that port. That port is reserved just for sniffing applications. So that's another approach, but it would be better for security if you simply unconfigured your port mirroring configuration when you were done.
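To make the mechanics concrete, here's a toy Python model of what the switch is doing. The port numbers and frame format are invented for illustration; on a Cisco switch, this feature is configured as SPAN, with a monitor source and a monitor destination:

```python
# Port mirroring in miniature: forward frames normally, but also copy
# anything seen on the monitored port out the mirror destination port.
MONITOR_SOURCE = 3   # port facing the server we want to watch
MONITOR_DEST = 24    # port where the sniffer laptop is attached

def egress_ports(frame: dict, in_port: int, mac_table: dict) -> list:
    out = [mac_table[frame["dst"]]]          # normal forwarding decision
    if in_port == MONITOR_SOURCE or MONITOR_SOURCE in out:
        out.append(MONITOR_DEST)             # mirrored copy for the sniffer
    return out

mac_table = {"client": 7, "server": 3}
# A server-to-client frame arrives on port 3: it's delivered to port 7,
# and a copy also goes to the sniffer on port 24.
print(egress_ports({"dst": "client"}, in_port=3, mac_table=mac_table))  # [7, 24]
```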
16. 7.15 Distributed Switching
If we have a network that contains multiple Ethernet switches, there are some design options, which we'll consider in this video, for how we distribute the switching across our network. What we see on screen is the very common three-tier architecture, where we have our network divided into the access layer, the distribution layer (sometimes called the building distribution layer), and the core layer. Let's talk about each one. The access layer is something that I often refer to as the "wiring closet layer," because this is where you have the Ethernet switches that connect out to your end devices. You might have switches on each floor of a building; the switches in each of those wiring closets would be at the access layer. And then, within a building, there's an aggregation point for all of those access layer switches: they all connect back to a set of distribution layer switches. And notice it's not just a single switch. At the building distribution layer, we've got a couple of switches for redundancy, and those distribution layer switches connect to each of the access layer switches. So we could actually lose a single link, or even a distribution layer switch, and everybody would still be able to get to any destination they wanted.
And if we have a larger campus environment with multiple buildings, we need to connect those buildings, and that can be done at the core layer. The core layer is concerned with speed: it wants to get traffic as quickly as possible from one building distribution layer switch to another. Maybe this is also the layer we use to get out to the Internet, although some designs use the building distribution layer for that; there's no hard and fast rule. But this is a very common three-tier architecture you often see in enterprise networks. And as a side note, let's take a look at the different types of network topologies that we have here. We can see PCs fanning out from the access layer switches as end stations; that is a star topology. But if you take a look at all of those redundant connections between the core and distribution layers, that looks like a partial-mesh topology. And what do we call a network that contains multiple topologies? We call that a hybrid topology, and here we have a hybrid of some partial meshes and some star topologies. And while this design might be great for a large campus environment with multiple buildings, sometimes we just don't need three layers. We might only have one or two buildings.
Do we really need a core layer? Perhaps not. Maybe we could combine the core layer and the distribution layer and have those multilayer switches, as we see here, connect up to our access layer switches. That kind of design is called a collapsed core: we're literally collapsing our core and distribution layers into a single layer. And this might be more appropriate for a smaller installation where we don't have lots and lots of buildings. And while the collapsed core and the three-tier design are common in enterprise networks, there's another design that we might see in data centers. Let's check out the spine-leaf design, or, as some literature calls it, the leaf-spine design. In a data center, we've typically got these big racks of servers. And the servers (we're going to call them nodes) connect redundantly into leaf switches, which live in the rack full of equipment; in fact, they're often located at the top of the rack, which is why they're also known as ToR switches, or top-of-rack switches. So imagine we've got one rack with top-of-rack switches interconnecting all the servers in that rack, and another rack with its own switches interconnecting the servers in that rack. How do we get from one rack of servers to another? Well, that connection is going to be made not through the leaf switches but through spine switches. The spine switches have a full mesh of connectivity out to our leaf switches.
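Here's a toy model of that full mesh, with invented switch names, showing why any leaf-to-leaf path is always exactly two hops:

```python
# Spine-leaf in miniature: every leaf connects to every spine, so any
# leaf-to-leaf path crosses exactly one spine switch.
leaves = ["leaf1", "leaf2", "leaf3", "leaf4"]
spines = ["spine1", "spine2"]
links = {(leaf, spine) for leaf in leaves for spine in spines}  # full mesh

def path(src_leaf: str, dst_leaf: str, spine: str = "spine1") -> list:
    assert (src_leaf, spine) in links and (dst_leaf, spine) in links
    return [src_leaf, spine, dst_leaf]  # always two hops, via one spine

print(path("leaf1", "leaf3"))  # ['leaf1', 'spine1', 'leaf3']
```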
This is going to give us a very predictable delay, because if I want to go from any leaf switch to any other leaf switch, I know that I have to hop through one and only one spine switch. And this two-layer design is common in data centers. And if you think about the design as a whole, where we've got leaf switches connecting out to our servers and spine switches interconnecting our leaf switches, it is logically sort of like one big switch. I mean, think about the ports on all of those leaf switches: those would be analogous to the ports on a really big Ethernet switch connecting out to our servers. And in an Ethernet switch, what gets us from one port to another is the backplane of the switch. So we could think of the spine switches as acting like the backplane, and the leaf switches as acting like the ports connecting out to the servers. That's a look at a few different approaches to distributed switching in our network. In the enterprise, where there could be multiple buildings, we could have a three-tier design, with the access layer in the wiring closet, the distribution layer in a building aggregating those access layer switches, and the core layer aggregating the building distribution layer switches, or a collapsed core for smaller installations. And in a data center, we might have leaf switches connecting out to our servers, interconnected by spine switches.