A fundamental aspect of designing systems with dedicated servers is identifying and improving the system bottlenecks. We extend the concept of a bottleneck to networks with heterogeneous, flexible servers. Unlike in a network with dedicated servers, the bottlenecks are not obvious a priori, but they can be determined by solving a number of linear programming problems. Also in contrast to the dedicated-server case, we find that a bottleneck may span several nodes in the network. We then identify some characteristics of desirable flexibility structures. In particular, the chosen flexibility structure should not only achieve the maximal possible capacity (the capacity attained under full server flexibility), but should also make the entire network the (unique) system bottleneck. The reason is that capacity can then be shifted between arbitrary nodes in the network, allowing the network to cope with demand fluctuations. Finally, we specify when certain flexibility structures (in particular chaining, targeted flexibility, and the “N” and “W” structures from the call center literature) possess these desirable characteristics.
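
For orientation, the kind of linear program involved can be sketched as follows. This is a standard allocation formulation of capacity for networks with flexible servers, given here as an assumption rather than as the exact programs used in this work. All symbols are illustrative and do not appear in the text above: $N$ nodes, $M$ servers, $\mu_{ij}$ the rate at which server $j$ works at node $i$ (taken to be $0$ when server $j$ is not trained for node $i$, which is how a flexibility structure enters), $a_i$ the relative demand at node $i$, and $\delta_{ij}$ the long-run fraction of time server $j$ spends at node $i$.

```latex
% Assumed capacity LP (illustrative notation, not taken from the source text).
\begin{align*}
  \max_{\lambda,\,\delta}\quad & \lambda \\
  \text{subject to}\quad
    & \sum_{j=1}^{M} \delta_{ij}\,\mu_{ij} \;\ge\; \lambda\, a_i,
      && i = 1,\dots,N, \\
    & \sum_{i=1}^{N} \delta_{ij} \;\le\; 1,
      && j = 1,\dots,M, \\
    & \delta_{ij} \;\ge\; 0,
      && i = 1,\dots,N,\; j = 1,\dots,M.
\end{align*}
```

Under this sketch, the optimal value $\lambda^{*}$ plays the role of the network capacity. Comparing $\lambda^{*}$ for a given flexibility structure against the value obtained when every $\mu_{ij}$ is available (full flexibility) is one way to check whether the structure achieves the maximal possible capacity.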