Packet switches are at the heart of modern communication networks. Initially deployed for local- and wide-area computer networking, they are now used in other contexts as well, such as interconnection networks for High-Performance Computing (HPC), Storage Area Networks (SANs) and Systems-on-Chip (SoC) communication. Each application domain, however, has its own requirements in terms of bandwidth, latency, scalability and delivery guarantees.

In this thesis we present two novel switching architectures, aimed at shared-memory supercomputing and storage networking respectively. We describe the general architecture of the two systems and discuss how specific requirements and current technology trends have shaped their design. More importantly, we present architectural innovations that address key issues concerning the performance and scalability of input-queued switches.

We propose techniques that enable the construction of distributed (multi-chip) schedulers for large crossbars, develop a scheme for integrated scheduling of unicast and multicast traffic, and study flow-control mechanisms that enable lossless behavior while providing fine-grained control of active flows.