OpenOnload 201109-u2 Release Notes
==================================

Major new features of this release are identified below, along with
limitations that we are aware of.

----------------------------------------

Epoll enhancements
------------------

openonload-201109-u2 includes some enhancements to the default epoll
acceleration. The enhancements apply when EF_UL_EPOLL is unset or set
to 1.

The default epoll acceleration is now expected to be the best choice for
all applications except those that manage very large numbers of
accelerated file descriptors in each epoll set, for which you should set
EF_UL_EPOLL=2.

In previous releases, threads blocking inside epoll_wait() would, while
spinning, exclude invocations of epoll_ctl(). This could cause latency
spikes or even deadlock. For this reason, we advised that for such
applications you set EF_UL_EPOLL=2 and EF_EPOLL_CTL_FAST=0.

From openonload-201109-u2 those workarounds are not necessary. The
default settings should give much better performance. This is of
particular benefit in applications where an epoll_ctl() call that causes
epoll_wait() to wake up is on the latency-critical path (boost::asio,
for example, is known to do this).

CAVEAT: Some kernels with realtime extensions have a bug that causes
error messages to be emitted to the system log when the default epoll
acceleration is used. This issue affects MRG 2, for example. The
work-around is to use EF_UL_EPOLL=2 on such systems.

----------------------------------------

Per-thread control of spinning
------------------------------

openonload-201109-u1 added a new extension API:

  onload_thread_set_spin(api, enable);

This can be used to control spinning on a per-thread as well as per-API
basis. The existing spin-related configuration options set the default
behaviour for threads, and this API overrides the default.

To enable spinning only for certain threads:

1) Set the spin timeout by setting EF_SPIN_USEC, and disable spinning
   by default by setting EF_POLL_USEC=0.
2) In each thread that should spin, invoke onload_thread_set_spin().

To disable spinning only in certain threads:

1) Enable spinning by setting EF_POLL_USEC=.

2) In each thread that should not spin, invoke
   onload_thread_set_spin().

----------------------------------------

Scalable packet buffer mode
---------------------------

Versions of Onload prior to openonload-201109 were only able to support
a limited number of buffers for storing packet data, due to these
buffers consuming a limited hardware resource. The limit was typically
around 120,000 packet buffers in aggregate, shared amongst all Onload
stacks.

The openonload-201109 release introduced a new feature that removes that
limit. The openonload-201109-u1 release includes some significant
improvements to this feature.

Please see the user guide for the details, but briefly the prerequisites
for using scalable packet buffer mode are:

- The system hardware must have an IOMMU and it must be enabled in the
  BIOS [1].

- The kernel must be compiled with support for IOMMUs [2], and kernel
  command line options are needed to select the IOMMU mode used [3].

- SR-IOV must be enabled on the network adapter via the "sfboot"
  utility.

- The kernel must be compiled with support for SR-IOV APIs
  (CONFIG_PCI_IOV). NB. you can often find the config for the running
  kernel in one of:

    /boot/config-$(uname -r)
    /lib/modules/$(uname -r)/build/.config
    /proc/config.gz

- In order to use more than 6 VFs (virtual functions) the system
  hardware and kernel must support PCIe Alternative Routing-ID
  Interpretation (ARI), which is a PCIe genII feature. Current
  Solarflare adapters support up to 127 VFs.

- Onload option EF_PACKET_BUFFER_MODE=1 must be set in the environment.

When "scalable packet buffer mode" is enabled there is no
hardware-imposed limit on the number of packet buffers that can be
allocated, but the per-stack limit enforced by EF_MAX_PACKETS still
applies.

There were several bugs in the Linux kernel's early support for SR-IOV.
We recommend you use a kernel at least as recent as 2.6.35, or an RHEL
kernel at least as recent as 2.6.32-131.0.15. For other distributions we
suggest that you (a) use the latest stable kernel available for that
distribution and (b) perform extensive testing with scalable packet
buffer mode before deploying in production environments.

A bug in the Intel IOMMU driver in RHEL6 kernels means scalable packet
buffer mode does not work if the Onload driver is reloaded. Rebooting
fixes this problem.

[1] To detect whether the BIOS has IOMMU support enabled, check the
    messages printed as the kernel boots. The BIOS presents ACPI tables
    to the kernel. For Intel machines the ACPI table "DMAR" represents
    the IOMMU, so expect a line in the syslog starting:

      kernel: ACPI: DMAR

    For AMD machines the ACPI table to expect is IVRS:

      kernel: ACPI: IVRS

    The BIOS may call this option "IO virtualization" or, for Intel
    systems, "VT-d", or it may enable the IOMMU with a more generically
    named virtualization option. Please consult your system
    documentation.

[2] Most recent kernels are compiled with support for IOMMUs by default,
    but unfortunately the realtime (-rt) kernel patches are not
    currently compatible with IOMMUs.

    It is possible to use scalable packet buffer mode on some systems
    without IOMMU support, but in an insecure mode. In this
    configuration the IOMMU is bypassed, and there is no checking of DMA
    addresses provided by Onload in user-space. Bugs or misbehaviour of
    user-space code can cause system compromise. To enable this insecure
    mode set unsafe_sriov_without_iommu=1 for the sfc_resource kernel
    module.

    Red Hat MRG kernels are compiled with CONFIG_PCI_IOV disabled. It
    should be possible to recompile the kernel with CONFIG_PCI_IOV
    enabled, but note that Solarflare does not currently test this
    configuration.

[3] We recommend you configure the kernel to use the IOMMU in
    pass-through mode.
    Set the following kernel command-line options in your bootloader
    config:

    For an Intel system:

      intel_iommu=on iommu=on,pt

    or for an AMD system:

      amd_iommu=on iommu=on,pt

    In pass-through mode the IOMMU is bypassed for regular devices. This
    gives the best possible performance, and also avoids kernel bugs
    that are exhibited on a variety of kernel versions when the IOMMU is
    fully enabled for all devices.

----------------------------------------

Scalable packet buffer mode on RHEL6
------------------------------------

On RHEL6 scalable packet buffer mode stops working if the sfc_resource
driver is reloaded. At this stage we believe this is a kernel bug, but
we continue to investigate. As a work-around, you will need to reboot
the system if you need to reload the drivers and use scalable packet
buffer mode.

David Riddoch
2012/02/03
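
Two of the scalable packet buffer mode prerequisites described above
(CONFIG_PCI_IOV in the kernel config, and the pass-through IOMMU option
on the kernel command line) can be checked with a few lines of shell.
The following is a minimal sketch only, not part of Onload. The function
names are hypothetical and introduced here for illustration, and the
checks assume that the config file uses the usual "CONFIG_PCI_IOV=y"
form and that pass-through is requested with "pt" in an iommu= option,
as in the examples above.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with Onload) for checking two of the
# scalable packet buffer mode prerequisites.

# Print the first readable kernel config among the locations named in
# these notes; fail if none is readable.
find_kernel_config() {
    for f in "/boot/config-$(uname -r)" \
             "/lib/modules/$(uname -r)/build/.config" \
             /proc/config.gz; do
        if [ -r "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Succeed if the given config file (plain text or gzip-compressed)
# contains the line CONFIG_PCI_IOV=y.
config_has_pci_iov() {
    case "$1" in
        *.gz) gzip -dc "$1" | grep -q '^CONFIG_PCI_IOV=y$' ;;
        *)    grep -q '^CONFIG_PCI_IOV=y$' "$1" ;;
    esac
}

# Succeed if the given kernel command-line string has an iommu= option
# whose comma-separated values include "pt" (for example iommu=on,pt).
cmdline_has_passthrough() {
    printf '%s\n' "$1" | tr ' ' '\n' |
        grep -Eq '^iommu=([^,]*,)*pt(,.*)?$'
}
```

On a live system these might be invoked as
config_has_pci_iov "$(find_kernel_config)" and
cmdline_has_passthrough "$(cat /proc/cmdline)". A successful exit status
only indicates that the option text is present; it does not confirm that
the IOMMU or SR-IOV is actually working.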