Identifying the upstream and downstream capabilities of a PCIe device is key to determining the effectiveness of the product. This is especially true for a PCIe NVMe AIC (add-in-card). NVMe SSDs are designed to interface directly with the system CPU via the PCIe bus. Determining whether or not an NVMe AIC can fully utilize the available PCIe bandwidth, and allocate this bandwidth to where it is needed most, allows one to separate the wheat from the chaff. This article explains the functionality and architecture behind the concepts of upstream and downstream, and how they relate to an NVMe-based storage solution.
Deciphering PCIe Terminology: How to identify an NVMe AIC’s Upstream & Downstream Bandwidth
First, let’s start with the basics, a brief overview of what Upstream and Downstream bandwidth refer to, and how they relate to a PCIe add-in-card (port types and bandwidth allocation):
Upstream Port (USP): The PCIe Switch USP is used to interface with the host computing platform’s PCIe root complex (which serves as a sort of bridge between the CPU, memory and PCIe bus). The bandwidth allocated to this port is referred to as the Upstream Bandwidth, and is generally denoted by an “x#” value, such as x8 or x16.
Downstream Port (DSP): DSPs interface with the PCIe endpoint devices. The bandwidth allocated to the DSPs is referred to as Downstream Bandwidth. In the context of an NVMe AIC, this refers to the NVMe SSDs.
Bandwidth Allocation: How the PCIe switch allocates bandwidth (x# electrical lanes) to the NVMe devices (SSDs in the case of an NVMe AIC). To avoid a performance bottleneck when all devices are accessed at once, the total bandwidth allocated to the DSPs should not exceed what is allocated to the USP.
The diagram shown above illustrates HighPoint’s SSD7104/7104F PCIe Gen3 x16 RAID AIC. It allocates x16 lanes of dedicated Upstream Bandwidth, and x4 lanes of Downstream Bandwidth to each of the four M.2 ports. This distribution is ideal; the upstream and downstream bandwidth is perfectly in sync, and x4 lanes are allocated to each SSD (which ensures the product can deliver maximum throughput).
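The lane-balance rule described above can be expressed as a quick check. The following is a minimal Python sketch; the function name `check_lane_balance` and the dictionary keys are hypothetical and chosen only for illustration, and the lane counts are the x16 upstream and four x4 downstream ports quoted for the SSD7104.

```python
def check_lane_balance(upstream_lanes, downstream_lanes):
    """Compare the upstream port's lanes with the sum of all downstream lanes."""
    total_downstream = sum(downstream_lanes)
    return {
        "upstream": upstream_lanes,
        "downstream_total": total_downstream,
        # Balanced: every downstream lane is matched by an upstream lane.
        "balanced": total_downstream == upstream_lanes,
        # Oversubscribed: downstream devices can request more than upstream carries.
        "oversubscribed": total_downstream > upstream_lanes,
    }


# x16 upstream, four M.2 ports at x4 each (figures quoted for the SSD7104)
ssd7104 = check_lane_balance(16, [4, 4, 4, 4])
print(ssd7104)
```

A card that exposes the same four x4 ports behind an x8 upstream port would be reported as oversubscribed by this check, which is the bottleneck scenario described above.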
How does an AIC Distribute Bandwidth?
Ok, so we now understand that the product’s Electrical Lane bandwidth should correspond with its Upstream bandwidth. How do we determine how this bandwidth is distributed Upstream to the system, and Downstream to each NVMe SSD?
Identifying Upstream: In terms of a PCIe NVMe AIC, Upstream refers to the maximum number of electrical lanes the card can present to the system. Most AICs denote this by the “x#” value assigned to the product description, such as the SSD7105’s “Gen3 x16”. HighPoint makes this easy for customers: our products deliver exactly what is stated by the product name. For a non-HighPoint solution, this is not always the case, as the value may simply refer to the card’s mechanical requirement (the type of slot it will fit into). If in doubt, check the published specifications.
Identifying Downstream: As mentioned previously, with regard to an NVMe AIC, “Downstream” refers to how bandwidth is distributed to each of the AIC’s NVMe ports. Ideally, the NVMe AIC solution would be capable of allocating x4 lanes per device port. This applies to both PCIe Gen3 and Gen4 NVMe media, and enables the SSD to reach its theoretical maximum throughput.
The Upstream and Downstream capabilities of a given NVMe AIC are determined by two things: the AIC’s PCIe Switch Chipset, and the AIC’s hardware architecture (how it makes use of the Switch Chipset, if present).
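On a Linux host, the negotiated link width can also be read back from the system once the card is installed. The sketch below assumes the standard sysfs PCI attributes (`current_link_width`, `max_link_width`, `current_link_speed`, `max_link_speed`); the function name `read_link_info` and the PCI address shown are hypothetical, and the address of the card's upstream port has to be looked up first, for example with `lspci`.

```python
from pathlib import Path


def read_link_info(device_dir):
    """Read negotiated and maximum PCIe link width/speed from a sysfs device directory."""
    device_dir = Path(device_dir)
    info = {}
    for name in ("current_link_width", "max_link_width",
                 "current_link_speed", "max_link_speed"):
        attr = device_dir / name
        # Attributes that are not present are reported as None.
        info[name] = attr.read_text().strip() if attr.exists() else None
    return info


if __name__ == "__main__":
    # Hypothetical PCI address; substitute the one reported for your card.
    print(read_link_info("/sys/bus/pci/devices/0000:01:00.0"))
```

If `current_link_width` is lower than the “x#” value in the product description, either the slot is wired with fewer electrical lanes or the card itself does not provide them.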
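To put a number on the theoretical maximum of an x4 link, the per-lane signalling rate can be multiplied out. This is a rough Python sketch: the 8 GT/s (Gen3) and 16 GT/s (Gen4) rates and the 128b/130b encoding are PCIe characteristics, while the function name `theoretical_gb_per_sec` is hypothetical. The result is a one-direction ceiling before packet and protocol overhead, so real SSD throughput will be lower.

```python
# Per-lane signalling rate in GT/s; Gen3 and Gen4 both use 128b/130b encoding.
GT_PER_SEC = {3: 8.0, 4: 16.0}
ENCODING_EFFICIENCY = 128 / 130


def theoretical_gb_per_sec(gen, lanes):
    """Approximate one-direction link bandwidth in GB/s, before protocol overhead."""
    bits_per_sec = GT_PER_SEC[gen] * 1e9 * ENCODING_EFFICIENCY * lanes
    return bits_per_sec / 8 / 1e9


for gen in (3, 4):
    print(f"Gen{gen} x4: {theoretical_gb_per_sec(gen, 4):.2f} GB/s")
```

Under these assumptions a Gen3 x4 link tops out at roughly 3.94 GB/s and a Gen4 x4 link at roughly 7.88 GB/s.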
What is a switch chipset? A PCIe Switch chipset is a chip or set of chips designed to allocate bandwidth (total number of PCIe lanes) to each “port”. Any true high-performance PCIe NVMe AIC will be equipped with a dedicated PCIe Switch.
When discussing PCIe Switch Chipsets, “port” can refer to an individual device port or the device itself (AIC in this case), as switch chipsets are employed by any number of computing devices (such as a motherboard, AIC or backplane). For the purposes of this article, “port” refers to the AIC’s Upstream Port (connection to the computer) and Downstream ports (NVMe device ports).
You can determine much about the capabilities of the AIC if you can identify its PCIe Switch chipset.
The two major players in the PCIe Switch chipset market are ASMedia and Broadcom. The following describes five of their leading PCIe Gen3 switches.
ASMedia ASM2812 – this chipset can deliver 12 total lanes; a maximum of x4 lanes of upstream bandwidth, and x8 lanes of downstream bandwidth that can be distributed to as many as 12 ports in increments of x1, x2, x4 or x8. This chipset is favored by applications that prioritize maximum device support over raw throughput. For example, the x4 lane upstream bandwidth only allows a single NVMe SSD to perform optimally at any one time. It is most commonly employed by entry-level 1-2 port NVMe HBAs and various motherboard applications.
ASMedia ASM2824 – this chipset delivers a total of 24 lanes; a maximum of x8 lanes of upstream bandwidth, and x16 lanes of downstream bandwidth that can be distributed to as many as 12 ports in increments of x1, x2, x4 or x8.
Like the ASM2812, this chipset favors maximum port count over transfer throughput, and is often used for general use NVMe HBAs (1-4 ports) and various motherboard-related PCIe solutions. The x8 upstream bandwidth only allows up to two NVMe devices to operate concurrently at optimal speeds (x4 lanes per SSD).
Broadcom PEX8724 – this chipset can deliver a total of 24 lanes, which can be distributed to as many as 6 ports (1 upstream + 5 downstream), in increments of x4 or x8. The design is flexible; x4 or x8 bandwidth can be allocated to the upstream and downstream ports.
Broadcom PEX8747 – this chipset can deliver a total of 48 lanes; it supports a maximum of 5 ports (1 upstream + 4 downstream), with as many as x16 lanes assigned to each port. This flexible design allocates x16 lanes to the upstream port, and x32 lanes to as many as 4 downstream ports, in increments of x8 or x16. The dedicated x16 lanes of upstream bandwidth are ideal for high-performance applications; for example, up to four NVMe SSDs can operate concurrently at optimal speeds (x4 lanes each).
Broadcom PEX8749 – Broadcom refers to the 48-lane, 18-port PEX8749 as a PCIe switch device designed for fan-out aggregation. It is employed by PCIe devices that require additional, more flexible ways to distribute PCIe bandwidth, and features an integrated DMA engine. The DMA engine is ideal for storage devices, as it enables the switch to offload this task from the system CPU, and optimize data transfer between any storage device hosted by its ports. The switch device boasts a huge number of ports, 18 total, which can be allocated x1 to x16 lanes of bandwidth and be assigned to serve as the upstream port, dynamically. HighPoint’s 8-Channel PCIe Gen3 NVMe AICs, such as the SSD7140A and SSD7180, utilize this switch to dynamically allocate up to x4 lanes to each SSD, as needed.
Industry’s Fastest & Most Flexible PCIe/NVMe Architecture: HighPoint PCIe Gen3 NVMe solutions employ the PEX8747 (often in combination with the PEX8749) to allocate x4 lanes of bandwidth to each device port. They can also assign lanes on the fly, to ensure nothing is wasted. Our SSD7140A 8-port M.2 NVMe RAID AICs, via Broadcom’s PCIe Switch and our unique board design, can allocate up to x4 lanes to each port. Unlike ordinary 8-port controllers, which do not utilize PCIe switches and statically assign bandwidth to each M.2 channel, the SSD7140A automatically adjusts lane assignment depending on the number of hosted M.2 SSDs and how they are being utilized. For example, if only half the ports are occupied (4), the card will assign each of the 4 hosted SSDs x4 lanes of dedicated bandwidth.
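The concurrency figures quoted for the switch chipsets above follow from dividing the upstream lane count by the x4 lanes each NVMe SSD uses. Here is a minimal Python sketch of that arithmetic, assuming the upstream and downstream links run at the same PCIe generation; the function name `ssds_at_full_speed` is hypothetical, and the upstream widths are the ones stated in this article.

```python
LANES_PER_SSD = 4  # each NVMe SSD is assumed to use an x4 link


def ssds_at_full_speed(upstream_lanes, lanes_per_ssd=LANES_PER_SSD):
    """Number of SSDs that can transfer at full link width at the same time."""
    return upstream_lanes // lanes_per_ssd


# Maximum upstream widths as described in this article
upstream_by_switch = {"ASM2812": 4, "ASM2824": 8, "PEX8747": 16}

for name, lanes in upstream_by_switch.items():
    print(f"{name}: x{lanes} upstream -> {ssds_at_full_speed(lanes)} SSD(s) at x4")
```

The same division explains why an 8-port card behind an x16 upstream port can serve four SSDs at full speed concurrently, and must share bandwidth once more than four are active at once.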
Conclusion
By now, it should be clear that to truly determine the Upstream/Downstream capability of an NVMe AIC or AIC drive, you must consider the number of NVMe SSDs the target device can support, how it is able to distribute bandwidth to each of these SSDs (Downstream), and whether or not the device is able to saturate the PCIe lanes it has been allocated (Upstream).
Unlike the majority of NVMe AICs, adapters and HBAs in today’s marketplace, HighPoint PCIe Gen3 NVMe solutions are engineered to take full advantage of x16 lanes of host bandwidth, and actively work to ensure none of it is wasted. Due to our unique hardware architecture, which integrates Broadcom’s leading PEX8747 and PEX8749 Switches, SSD7000 series NVMe AICs and RocketAIC 7000 series NVMe drives allocate the maximum possible host bandwidth to each NVMe device port, and deliver up to 14GB/s (14,000MB/s) of real-world transfer throughput; the maximum possible via a single PCIe 3.0 slot!
Our four-port models (SSD7105/SSD7104/SSD7104F/SSD7120) allocate a dedicated x4 lanes to each port at all times to fully maximize x16 lanes of bandwidth. Our high-density 8-channel models (SSD7140A, SSD7180, SSD7184) can dynamically allocate bandwidth to each SSD as needed, to maximize storage capacity without sacrificing transfer throughput.