Over the past few weeks I've been posting screenshots from CircuitMaker to Twitter for my latest project. In response I've had a bit more interest than I was expecting, so to answer some of the common questions I've been asked, and some that I haven't been asked yet, here we are. For those who aren't interested in the extra detail, there is a TL;DR version here.
This project does not yet have a proper name; "ARM server" is very utilitarian, which is how all of my projects start out before I reveal them. With the interest around this project I have revealed it earlier than I normally would have, so enjoy the placeholder. I would say I am taking no suggestions, but I am terrible with these things, so suggest away.
What is the project?
The project is to build an ARM-powered server board in a mini-ITX form factor, though certainly not limited to server-only applications. I started this project to explore just how powerful an ARM server could be without the limitations of most single board computers, like low RAM and limited/slow storage options.
The SoC that I settled on for the task is an ideal candidate to see just how capable an ARM server node can be. With a DDR4 RAM controller, native PCIe, XFI, and hardware-accelerated networking functions alongside a capable quad core 1.8GHz Cortex-A72, many of the limitations faced with ARM devices can be evaluated. This CPU isn't designed for mobile applications; it is designed instead for a high performance-per-watt ratio at a maximum design power draw of 18 watts. With the configuration I have designed, storage is limited only by what can fit on 4 SATA ports plus 1 NVMe drive, with a maximum of 64GB of RAM.
The current hardware design includes an NXP LS1046A SoC as the main CPU, with a Marvell SATA controller, a Broadcom PCIe packet switch, and the on-chip IP for networking and USB functionality. Implemented are 2 direct-connect native USB 3.0 SuperSpeed ports, 4x 1GbE Ethernet ports, and 1x SFP+ 10Gbps-capable port connected to the on-chip XFI interface. This allows networking to be handled by the DPAA block in the SoC for full network acceleration independent of CPU load.
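To give a sense of why that hardware offload matters, here is a back-of-envelope calculation of the packet rate a 10GbE link can present at minimum frame size. The framing constants come from standard Ethernet, nothing here is specific to this board:

```python
LINK_BPS = 10e9      # SFP+ running at 10GbE line rate
MIN_FRAME = 64       # minimum Ethernet frame size, bytes
OVERHEAD = 8 + 12    # preamble + inter-frame gap, bytes

# Worst-case packets per second the link can deliver.
pps = LINK_BPS / ((MIN_FRAME + OVERHEAD) * 8)
print(f"{pps / 1e6:.2f} Mpps")   # ~14.88 million packets per second
```

At roughly 14.9 Mpps, a 1.8GHz core gets on the order of 120 cycles per packet, which is why pushing classification and distribution into the DPAA block rather than the CPU is the difference between line rate and a saturated core.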
The PCIe switch is an unfortunate side effect of one of the SoC's limitations: PCIe lanes. The CPU has 4 PCIe 3.0 lanes available, but connecting multiple devices (the SATA controller, the PCIe slot, and the M.2 connector) has required this compromise. The bandwidth available over the PCIe bus will still allow exceptional performance for storage operations, but it does limit the bandwidth available for external GPU options. Given that there aren't AAA titles compiled for AArch64, however, this limitation isn't going to be a significant impediment.
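To put rough numbers on that trade-off, here is a quick sketch. The lane and port counts are from the design above; the per-lane rates are the published PCIe 3.0 and SATA 3.0 figures:

```python
# Upstream bandwidth budget for the 4-lane PCIe 3.0 link shared
# (via the packet switch) by the SATA controller, slot, and M.2.
PCIE3_GTS_PER_LANE = 8e9      # 8 GT/s per PCIe 3.0 lane
PCIE3_ENCODING = 128 / 130    # 128b/130b line encoding
LANES = 4

link_bytes_per_s = PCIE3_GTS_PER_LANE * PCIE3_ENCODING * LANES / 8
print(f"PCIe 3.0 x4 link: {link_bytes_per_s / 1e9:.2f} GB/s")  # ~3.94

SATA3_BYTES_PER_S = 600e6     # 6 Gb/s with 8b/10b -> 600 MB/s per port
sata_total = 4 * SATA3_BYTES_PER_S
print(f"4x SATA 3.0 max:  {sata_total / 1e9:.2f} GB/s")        # 2.40

# Even all four SATA ports saturated leave headroom on the link,
# but a GPU in the slot only ever sees a quarter of x16 lanes.
print(f"Headroom: {(link_bytes_per_s - sata_total) / 1e9:.2f} GB/s")
```

In other words, storage can never quite saturate the upstream link on its own, while a GPU is capped at x4 bandwidth regardless of the physical x16-length slot.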
The final major piece of confirmed infrastructure on the board is the BMC, or baseboard management controller. Those in the enterprise segment will be fully aware of how these systems work, and will agree that no server application is truly viable without one. Existing off-the-shelf BMC solutions don't really meet my needs, and the BMC chipsets which include GPUs for better iKVM access are extremely hard for an independent designer to obtain, harder again if you want documentation. To overcome this I designed a system around a Cortex-A5 MPU offering most of the usual services you would expect, minus iKVM. The BMC provides 100Mbps Ethernet on a separate management port, a USB connection to the host CPU for in-band communications, and access to the serial console.
A 2D accelerator may be included in the final design depending on the work required and the BOM cost of doing so. For headless designs an accelerator is not strictly required: the serial console gives full access to the firmware environment, and Linux systems are capable of running without the overhead of a framebuffer. The PCIe slot will also accept a 3D-capable GPU if needed. My preference is to include the 2D accelerator, and initial provisions in the board design do leave room for its implementation.
As the hardware is not yet complete, the software environment, including firmware, is only planned at this stage. The plan is to make as much of the firmware environment open source as possible. Some components cannot be open source at the time of writing because no source materials exist: for example, the Marvell SATA controller has a closed firmware implementation, and the NXP networking block (known as FMan, or frame manager) has closed-source microcode.
The intended boot firmware environment is standard EL3 secure firmware implementing PSCI (Power State Coordination Interface), as required by the ARM spec for ARMv8/AArch64 devices, along with a Tianocore/Linaro-derived EDK2 implementation for UEFI with ACPI support. A source has informed me that a valid ACPI implementation will allow the Windows 10 Enterprise/Server AArch64 editions to run on this board. Broad OS compatibility is something worth working towards; after all, why build a server board if it requires embedded-style work just to boot? The likely path is dual-mode DTB/ACPI, as the preferred method for Linux is device tree. ACPI support, no matter your feelings towards it, does provide a better experience by abstracting away much of the device-specific logic and allowing the high-level OS to work with logical interfaces.
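To make the OS-facing side of that EL3 firmware concrete: PSCI calls are just SMC calls whose function IDs are laid out by the SMC Calling Convention. A minimal sketch of how those IDs are encoded; the constants follow the published SMCCC and PSCI specs and are not specific to this board:

```python
# SMC Calling Convention fast-call function ID layout:
#   bit 31     = 1 (fast call)
#   bit 30     = calling convention (0 = SMC32, 1 = SMC64)
#   bits 29:24 = service owner (4 = Standard Secure Service, i.e. PSCI)
#   bits 15:0  = function number
def smccc_fn_id(fn: int, smc64: bool = False, owner: int = 4) -> int:
    return (1 << 31) | (int(smc64) << 30) | (owner << 24) | fn

PSCI_VERSION = smccc_fn_id(0)              # 0x84000000
CPU_ON       = smccc_fn_id(3, smc64=True)  # 0xC4000003 (SMC64: 64-bit args)
SYSTEM_OFF   = smccc_fn_id(8)              # 0x84000008

print(hex(PSCI_VERSION), hex(CPU_ON), hex(SYSTEM_OFF))
```

The generic PSCI driver in mainline Linux issues exactly these IDs, which is why a spec-compliant EL3 firmware gets you SMP bring-up, shutdown, and reset without any board-specific kernel code.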
While I mentioned the BMC above in the hardware overview, the BMC is really its own sub-project within the overall board. The system is made up of a Microchip Cortex-A5 MPU (originally an Atmel part) with 256MB of DDR3 RAM and 256MB of parallel NOR flash. This allows for a lightweight Linux-based OS with significant capabilities for logging and customisation. Using the 100Mbps Ethernet port combined with the USB 2.0 High Speed device connection to the host CPU, remote media can be used much the same as on existing commercial BMC implementations. The BMC also has a connection to the host CPU's firmware QSPI NOR, allowing out-of-band firmware updates, including the ability to recover from a bad firmware flash remotely.
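As one hypothetical sketch of that out-of-band update path: before touching the host's QSPI NOR, a BMC-side tool would normally verify the image against a checksum shipped alongside it. The `verify_image` helper and the stand-in payload below are illustrative only; `zlib.crc32` is the one real API used:

```python
import zlib

def verify_image(image: bytes, expected_crc: int) -> bool:
    """Check a host firmware image against its published CRC32
    before the BMC writes it to the host QSPI NOR."""
    return (zlib.crc32(image) & 0xFFFFFFFF) == expected_crc

# Stand-in payload, not real firmware; its CRC ships with the image.
image = b"\x7fELF" + b"\x00" * 60
good_crc = zlib.crc32(image) & 0xFFFFFFFF

print(verify_image(image, good_crc))            # intact image accepted
print(verify_image(image + b"\x01", good_crc))  # corrupted image rejected
```

The same check run against the NOR contents after flashing is what makes remote recovery from a bad flash safe: the BMC can always tell a good image from a damaged one without the host CPU's help.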
The BMC also doubles as the system manager, connecting the system management bus to the thermal, fan, PCIe, and DDR subsystems. This allows the BMC to log all FRU changes and respond to over-temperature and other critical events. The BMC runs off the 5V standby rail supplied by a standard ATX power supply and manages the power-on process for both the power supply and the host CPU, including the front panel connections for the reset and power buttons. The BMC is the full package for management of the system.
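To make the thermal side concrete: many SMBus temperature sensors (LM75-class parts, assumed here purely for illustration; the board's actual sensors may differ) report temperature as a 9-bit two's-complement count of 0.5 degC steps packed into the top bits of a 16-bit register. A sketch of the decode the BMC firmware would perform:

```python
def lm75_decode(raw: int) -> float:
    """Decode a 16-bit LM75-style temperature register read over SMBus.
    The top 9 bits are a two's-complement count of 0.5 degC steps."""
    val = raw >> 7          # keep the 9 significant bits
    if val & 0x100:         # sign bit set -> negative temperature
        val -= 0x200
    return val * 0.5

print(lm75_decode(0x1900))  # 50 steps * 0.5 -> 25.0 degC
print(lm75_decode(0xE700))  # -25.0 degC
```

An over-temperature event is then just this reading compared against a configured limit, at which point the BMC logs the event and reacts (fan ramp, alert, or forced power-off).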
Where to next
There is still a lot of work to do before the board can be prototyped and tested. My latest estimate puts approximately 20% of the high speed buses and 15% of the overall routing complete. This board is one of the most complex projects I have designed, and as just one person there is a lot of work ahead. After the board layout is completed I will spend a reasonable amount of time going over every square mm of the design to make sure it complies with design rules and requirements before I finally accept that it is ready for a prototype. At that point I will follow up this post with more technical information for those who are interested, as well as the schematic, gerbers, and a link to the CircuitMaker project.
TL;DR: building an ARM server in a mini-ITX form factor with the following specs:
- LS1046A 1.8GHz quad core Cortex-A72.
- 1 4-lane PCIe 3.0 slot (x16-length for GPUs).
- DDR4 RDIMM supporting up to 64GB quad rank RAM with ECC.
- 4 SATA 3.0 ports.
- 4 1GbE Ethernet ports.
- 2 USB 3.0 SuperSpeed ports.
- 1 SFP+ port (supports 10GbE modules only).
- BMC with dedicated 100Mbps management port.
- USB console with access to BMC debug, Host CPU console and JTAG.
- 1 M.2 port supporting 2280 (22mm wide x 80mm long) NVMe drives.
- Board level power usage monitoring.
- (unconfirmed) VGA 2D GPU.
- Open Source UEFI firmware.
Yes, I am plugging myself on my own blog. Projects like these are expensive, and while I will continue to do them for myself and for the love of PCB design, if you enjoy my work and would like to help support my projects I have a PayPal Tip Jar.