Hardware Locality Changelog

What's new in Hardware Locality 2.10.0

Dec 5, 2023
  • Heterogeneous Memory core improvements:
  • Better heuristics to identify the subtype of memory such as HBM, DRAM, NVM, CXL-DRAM, etc.
  • Build memory tiers, i.e. sets of NUMA nodes with the same subtype and similar performance.
  • NUMA node tier ranks are exposed in the new MemoryTier info attribute (starts from 0 for highest bandwidth tier).
  • See the new Heterogeneous Memory section in the documentation.
  • API:
  • Add hwloc_topology_free_group_object() to discard a Group created by hwloc_topology_alloc_group_object().
  • Linux backend:
  • Fix cpukinds on NVIDIA Grace to report identical cores even if they actually have very small frequency differences. Thanks to John C. Linford for the report.
  • Add CXLDevice attributes to CXL DAX objects and NUMA nodes to show which PCI device implements which window.
  • Ignore buggy memory-side caches and memory attributes when fake NUMA emulation is enabled on the Linux kernel command-line.
  • Add more info attributes in MemoryModule Misc objects, thanks to Zubiao Xiong for the patch.
  • Get CPUModel and CPUFamily info attributes on LoongArch platforms.
  • x86 backend:
  • Add support for new AMD CPUID leaf 0x80000026 for better detection of Core Complex and Die on Zen4 processors.
  • Improve Zhaoxin CPU topology detection.
  • Tools:
  • Input locations and many command-line options (e.g. hwloc-calc -I -N -H, lstopo --only) now accept filters such as "NUMA[HBM]" so that only objects of that type and subtype are considered.
  • NUMA[tier=1] is also accepted for selecting NUMA nodes depending on their MemoryTier info attribute.
  • Add --object-output to hwloc-calc to report the type as a prefix to object indexes, e.g. Core:2 instead of 2 in the output of -I.
  • hwloc-info --ancestor and --descendants now accept kinds of objects instead of single types.
  • The new --first option only shows the first matching object.
  • Add --children-of-pid to hwloc-ps to show a hierarchy of processes. Thanks to Antoine Morvan for the suggestion.
  • Add --misc-from to lstopo to add Misc objects described in a file.
  • To be combined with the new hwloc-ps --lstopo-misc for a customizable lstopo --top replacement.
  • Misc:
  • lstopo may now configure the layout of memory objects placed above, for instance with --children-order memory:above:vert.
  • Fix XML import from memory or stdin when using libxml2 2.12.
  • Fix installation failures when configuring with --target, thanks to Clement Foyer for the patch.
  • Fix support for 128bit pointer architectures.
  • Remove Netloc.
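
The memory-tier and filter entries in the 2.10.0 notes above can be pictured with a small model. The sketch below is not hwloc code: the NumaNode class, the 10% tolerance used for "similar performance", and the bandwidth numbers are all assumptions made for illustration. It groups nodes by subtype and bandwidth, ranks tiers from 0 for the highest bandwidth, then selects nodes with NUMA[HBM]-style and NUMA[tier=1]-style filters.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NumaNode:
    """Hypothetical stand-in for a NUMA node object (not an hwloc type)."""
    os_index: int
    subtype: str            # e.g. "HBM", "DRAM", "NVM"
    bandwidth: float        # made-up local bandwidth, arbitrary unit
    tier: Optional[int] = None


def build_tiers(nodes: List[NumaNode], tolerance: float = 0.1) -> List[NumaNode]:
    """Assign a tier rank to each node; rank 0 is the highest-bandwidth tier.

    Two nodes share a tier when they have the same subtype and their
    bandwidths differ by at most `tolerance` (an assumed threshold).
    """
    tiers = []  # each entry: (subtype, reference bandwidth, members)
    for node in sorted(nodes, key=lambda n: n.bandwidth, reverse=True):
        for subtype, ref_bw, members in tiers:
            if subtype == node.subtype and abs(ref_bw - node.bandwidth) <= tolerance * ref_bw:
                members.append(node)
                break
        else:
            tiers.append((node.subtype, node.bandwidth, [node]))
    # Tiers were created in decreasing bandwidth order, so the index is the rank.
    for rank, (_, _, members) in enumerate(tiers):
        for node in members:
            node.tier = rank
    return nodes


_FILTER_RE = re.compile(r"^(?P<type>[A-Za-z0-9]+)(?:\[(?P<arg>[^\]]+)\])?$")


def select(nodes: List[NumaNode], spec: str) -> List[NumaNode]:
    """Select nodes matching "NUMA", "NUMA[<subtype>]" or "NUMA[tier=<n>]"."""
    match = _FILTER_RE.match(spec)
    if match is None or match.group("type").upper() != "NUMA":
        raise ValueError(f"unsupported filter in this sketch: {spec!r}")
    arg = match.group("arg")
    if arg is None:
        return list(nodes)
    if arg.startswith("tier="):
        wanted = int(arg[len("tier="):])
        return [n for n in nodes if n.tier == wanted]
    return [n for n in nodes if n.subtype == arg]


nodes = build_tiers([
    NumaNode(0, "DRAM", 100.0),
    NumaNode(1, "DRAM", 105.0),
    NumaNode(2, "HBM", 400.0),
    NumaNode(3, "NVM", 30.0),
])
hbm_nodes = select(nodes, "NUMA[HBM]")
tier1_nodes = select(nodes, "NUMA[tier=1]")
```

With these made-up values, the HBM node forms tier 0, both DRAM nodes share tier 1, and the NVM node ends up in tier 2.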

New in Hardware Locality 2.9.3 (Sep 12, 2023)

  • Handle Linux glibc allocation errors in binding routines (CVE-2022-47022).
  • Fix hwloc-calc when searching objects on heterogeneous memory platforms, thanks to Antoine Morvan for the report.
  • Fix hwloc_get_next_child() when there are some memory-side caches.
  • Don't crash if the topology is empty because Linux cgroups are wrong.
  • Improve some hwloc-bind warnings in case of command-line parsing errors.
  • Many documentation improvements all over the place, including:
  • hwloc_topology_restrict() and hwloc_topology_insert_group() may reorder children, causing the logical indexes of objects to change.

New in Hardware Locality 2.9.3 RC 1 (Sep 7, 2023)

  • Handle Linux glibc allocation errors in binding routines.
  • Fix hwloc-calc when searching objects on heterogeneous memory platforms.
  • Fix hwloc_get_next_child() when there are some memory-side caches.
  • Don't crash if the topology is empty because Linux cgroups are wrong.
  • Improve some hwloc-bind warnings in case of command-line parsing errors.
  • Many documentation improvements all over the place, including:
  • hwloc_topology_restrict() and hwloc_topology_insert_group() may reorder children, causing the logical indexes of objects to change.

New in Hardware Locality 2.9.2 (Jun 28, 2023)

  • Don't forget L3i when defining filters for multiple levels of caches with hwloc_topology_set_cache/icache_types_filter().
  • Fix object total_memory after hwloc_topology_insert_group_object().
  • Fix the (non-yet) exporting in synthetic description for complex memory hierarchies with memory-side caches, etc.
  • Fix some default size attributes when building synthetic topologies.
  • Fix size units in hwloc-annotate.
  • Improve bitmap reallocation error management in many functions.
  • Documentation improvements:
  • Better document return values of functions.
  • Add "Error reporting" section (in hwloc.h and in the doxygen doc).
  • Add FAQ entry "What may I disable to make hwloc faster?"
  • Improve FAQ entries "Why is lstopo slow?" and "I only need ..., why should I use hwloc?"
  • Clarify how to deal with cpukinds in hwloc-calc and hwloc-bind manpages.

New in Hardware Locality 2.9.1 RC 1 (Mar 28, 2023)

  • Don't forget to apply object type filters to "perflevel" caches detected on recent Mac OS X releases, thanks to Michel Lesoinne for the report.
  • Fix a failed assertion in hwloc_topology_restrict() when some NUMA nodes are removed because of HWLOC_RESTRICT_FLAG_REMOVE_CPULESS but no PUs are.
  • Thanks to Mark Grondona for reporting the issue.
  • Mark HPE Cray Slingshot NICs with subtype "Slingshot".

New in Hardware Locality 2.9.0 (Dec 14, 2022)

  • Backends:
  • Expose the memory size of CXL memory devices (Type 3) on Linux.
  • The LevelZero backend now reports the "XeLinkBandwidth" distance matrix between L0 devices (and subdevices) when available.
  • Add support for CUDA compute capability up to 9.0.
  • Tools:
  • lstopo now switches to console mode when its output is redirected. Graphical window mode may be forced back with --of window.
  • hwloc-calc now accepts "numa" in -H, and I/O subtypes such as "gpu" in -I and -N.

New in Hardware Locality 2.9.0 RC 1 (Dec 8, 2022)

  • Backends:
  • Expose the memory size of CXL memory devices (Type 3) on Linux.
  • The LevelZero backend now reports the "XeLinkBandwidth" distance matrix between L0 devices (and subdevices) when available.
  • Add support for CUDA compute capability up to 9.0.
  • Tools:
  • lstopo now switches to console mode when its output is redirected. Graphical window mode may be forced back with --of window.
  • hwloc-calc now accepts "numa" in -H, and I/O subtypes such as "gpu" in -I and -N.

New in Hardware Locality 2.8.0 (Jul 5, 2022)

  • API:
  • Add HWLOC_TOPOLOGY_FLAG_NO_DISTANCES, _NO_MEMATTRS and _NO_CPUKINDS to reduce the overhead when unneeded.
  • Add separate Read/Write Bandwidth/Latency memory attributes and implement them on Linux.
  • Backends:
  • NUMA nodes may now have a subtype such as DRAM, HBM, SPM, or NVM on heterogeneous memory platforms on Linux.
  • Add DAXType and DAXParent attributes on Linux to tell where a DAX device or its corresponding NUMA node comes from (SPM for Specific-Purpose or NVM for Non-Volatile Memory).
  • Detect heterogeneous caches in hybrid CPUs on Mac OS X, thanks to Paul Bone for the help.
  • Max frequencies are not ignored in Linux cpukinds anymore (they were ignored in hwloc 2.7.0), but they may be slightly adjusted to avoid reporting hybrid CPUs because of Intel Turbo Boost Max 3.0.
  • See the documentation of environment variable HWLOC_CPUKINDS_MAXFREQ.
  • Hardwire the PCI locality of HPE Cray EX235a nodes.
  • Tools:
  • lstopo and other tools may now load Linux and x86 cpuid topology files from a tarball.
  • lstopo may now replace the P# and L# index prefixes with custom strings thanks to --os-index-prefix and --logical-index-prefix options.
  • Misc:
  • Add --disable-readme to avoid regenerating the top-level hwloc README file from the documentation.

New in Hardware Locality 2.7.1 (Mar 24, 2022)

  • Workaround crashes when virtual machines report incoherent x86 CPUID information about numbers of cores and threads.
  • Use setenv() instead of putenv() when trying to force enable oneAPI L0 support, to avoid issues with applications that touch the environment.
  • Add some warnings at the end of configure when GPU libraries are missing on the system or their path is missing in the environment.

New in Hardware Locality 2.6.0 (Oct 28, 2021)

  • Backends:
  • Expose two cpukinds for energy-efficient cores (icestorm) and high-performance cores (firestorm) on Apple M1 on Mac OS X.
  • Use sysfs CPU "capacity" to rank hybrid cores by efficiency on Linux when available (mostly on recent ARM platforms for now).
  • Improve HWLOC_MEMBIND_BIND (without the STRICT flag) on Linux kernel >= 5.15: If more than one node is given, the kernel may now use all of them instead of only the first one before falling back to others.
  • Expose cache os_index when available on Linux. It may be needed when using resctrl to configure cache partitioning, memory bandwidth monitoring, etc.
  • Add a "XGMIHops" distances matrix in the RSMI backend for AMD GPU interconnected through XGMI links.
  • Expose AMD GPU memory information (VRAM and GTT) in the RSMI backend.
  • Add OS devices such as "bxi0" for Atos/Bull BXI HCAs on Linux.
  • Tools:
  • lstopo has a better placement algorithm with respect to I/O objects, see --children-order in the manpage for details.
  • hwloc-annotate may now change object subtypes and cache or memory sizes.
  • Build:
  • Allow specifying the ROCm installation for building the RSMI backend:
  • Use a custom installation path if specified with --with-rocm=<dir>.
  • Use /opt/rocm-<version> if specified with --with-rocm-version=<version> or the ROCM_VERSION environment variable.
  • Try /opt/rocm if it exists.
  • See "How do I enable ROCm SMI and select which version to use?" in the FAQ for details.
  • Documentation:
  • Add FAQ entry "How do I create a custom heterogeneous and asymmetric topology?"
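
The ROCm lookup order listed under "Build" in the 2.6.0 notes above can be read as a small decision procedure. The following is a minimal sketch of that order, not the actual configure logic: the function name and its parameters are hypothetical, and giving --with-rocm-version precedence over the ROCM_VERSION environment variable is an assumption, since the notes do not say which one wins.

```python
import os


def resolve_rocm_dir(with_rocm=None, with_rocm_version=None,
                     environ=None, exists=os.path.isdir):
    """Return the ROCm directory to use, or None if nothing was found.

    with_rocm          value of --with-rocm=<dir>, if given
    with_rocm_version  value of --with-rocm-version=<version>, if given
    environ            mapping used for ROCM_VERSION (defaults to os.environ)
    exists             predicate used to probe the filesystem
    """
    environ = os.environ if environ is None else environ

    # 1. An explicit installation path always wins.
    if with_rocm:
        return with_rocm

    # 2. A version selects /opt/rocm-<version>.
    #    Assumption: the command-line option is checked before the environment.
    version = with_rocm_version or environ.get("ROCM_VERSION")
    if version:
        return f"/opt/rocm-{version}"

    # 3. Fall back to /opt/rocm when it exists.
    if exists("/opt/rocm"):
        return "/opt/rocm"

    return None


# Examples with a fake environment and a fake filesystem probe.
explicit = resolve_rocm_dir(with_rocm="/usr/local/rocm", environ={})
from_env = resolve_rocm_dir(environ={"ROCM_VERSION": "4.3.0"})
fallback = resolve_rocm_dir(environ={}, exists=lambda path: path == "/opt/rocm")
missing = resolve_rocm_dir(environ={}, exists=lambda path: False)
```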

New in Hardware Locality 2.5.0 (Oct 13, 2021)

  • API:
  • Add hwloc/windows.h to query Windows processor groups.
  • Add hwloc_get_obj_with_same_locality() to convert between objects with same locality, for instance NUMA nodes and Packages, or OS devices within a PCI device.
  • Add hwloc_distances_transform() to modify distances structures.
  • hwloc-annotate and lstopo have new distances-transform options.
  • hwloc_distances_add() is replaced with _add_create() followed by _add_values() and _add_commit(). See hwloc/distances.h for details.
  • Add topology flags to mitigate binding modifications during hwloc discovery, especially on Windows:
  • HWLOC_TOPOLOGY_FLAG_RESTRICT_TO_CPUBINDING and _MEMBINDING restrict discovery to PUs and NUMA nodes inside the binding.
  • HWLOC_TOPOLOGY_FLAG_DONT_CHANGE_BINDING prevents hwloc from ever changing the binding during discovery.
  • Backends:
  • Add a levelzero backend for oneAPI L0 devices, exposed as OS devices of subtype "LevelZero" and name such as "ze0".
  • Add hwloc/levelzero.h for converting between L0 API devices and hwloc cpusets or OS devices.
  • Expose NEC Vector Engine cards on Linux as OS devices of subtype "VectorEngine" and name "ve0", etc.
  • Thanks to Anara Kozhokanova, Tim Cramer and Erich Focht for the help.
  • Add a NVLinkBandwidth distances structure between NVIDIA GPUs (and POWER processors or NVSwitches) in the NVML backend, and a XGMIBandwidth distances structure between AMD GPUs in the RSMI backend.
  • See "Topology Attributes: Distances, Memory Attributes and CPU Kinds" in the documentation for details about these new distances.
  • Add support for NUMA node 0 being offline in Linux, thanks to Jirka Hladky.
  • Build:
  • Add --with-cuda-version=<version> or look at the CUDA_VERSION environment variable to find the appropriate CUDA pkg-config files.
  • Thanks to Stephen Herbein for the suggestion.
  • Also add --with-cuda=<dir> to specify the CUDA installation path manually (and its NVML and OpenCL components).
  • Thanks to Andrea Bocci for the suggestion.
  • See "How do I enable CUDA and select which CUDA version to use?" in the FAQ for details.
  • Tools:
  • lstopo now has a --windows-processor-groups option on Windows.
  • hwloc-ps now has a --short-name option to avoid long/truncated command path.
  • hwloc-ps now has a --single-ancestor option to return a single (possibly too large) object where a process is bound.
  • hwloc-ps --pid-cmd may now query environment variables, including MPI-specific variables to find out process ranks.

New in Hardware Locality 2.5.0 RC1 (Jun 8, 2021)

  • API:
  • Add hwloc/windows.h to query Windows processor groups.
  • Add hwloc_get_obj_with_same_locality() to convert between objects with same locality, for instance NUMA nodes and Packages, or OS devices within a PCI device.
  • Add hwloc_distances_transform() to modify distances structures.
  • hwloc-annotate and lstopo have new distances-transform options.
  • hwloc_distances_add() is replaced with _add_create() followed by _add_values() and _add_commit(). See hwloc/distances.h for details.
  • Add topology flags to mitigate binding modifications during hwloc discovery, especially on Windows:
  • HWLOC_TOPOLOGY_FLAG_RESTRICT_TO_CPUBINDING and _MEMBINDING restrict discovery to PUs and NUMA nodes inside the binding.
  • HWLOC_TOPOLOGY_FLAG_DONT_CHANGE_BINDING prevents hwloc from ever changing the binding during discovery.
  • Backends:
  • Add a levelzero backend for oneAPI L0 devices, exposed as OS devices of subtype "LevelZero" and name such as "ze0".
  • Add hwloc/levelzero.h for converting between L0 API devices and hwloc cpusets or OS devices.
  • Expose NEC Vector Engine cards on Linux as OS devices of subtype "VectorEngine" and name "ve0", etc.
  • Thanks to Anara Kozhokanova, Tim Cramer and Erich Focht for the help.
  • Add a NVLinkBandwidth distances structure between NVIDIA GPUs (and POWER processors or NVSwitches) in the NVML backend, and a XGMIBandwidth distances structure between AMD GPUs in the RSMI backend.
  • See "Topology Attributes: Distances, Memory Attributes and CPU Kinds" in the documentation for details about these new distances.
  • Add support for NUMA node 0 being offline in Linux, thanks to Jirka Hladky.
  • Build:
  • Add --with-cuda-version=<version> or look at the CUDA_VERSION environment variable to find the appropriate CUDA pkg-config files.
  • Thanks to Stephen Herbein for the suggestion.
  • Also add --with-cuda=<dir> to specify the CUDA installation path manually (and its NVML and OpenCL components).
  • Thanks to Andrea Bocci for the suggestion.
  • See "How do I enable CUDA and select which CUDA version to use?" in the FAQ for details.
  • Tools:
  • lstopo now has a --windows-processor-groups option on Windows.
  • hwloc-ps now has a --short-name option to avoid long/truncated command path.
  • hwloc-ps now has a --single-ancestor option to return a single (possibly too large) object where a process is bound.
  • hwloc-ps --pid-cmd may now query environment variables, including MPI-specific variables to find out process ranks.

New in Hardware Locality 2.4.1 (Feb 23, 2021)

  • Fix AMD OpenCL device locality when PCI bus or device number >= 128.
  • Applications using any of the following inline functions must be recompiled to get the fix: hwloc_opencl_get_device_pci_busid(), hwloc_opencl_get_device_cpuset(), hwloc_opencl_get_device_osdev().
  • Fix the ranking of cpukinds on non-Windows systems, thanks to Ivan Kochin for the report.
  • Fix the insertion of custom Groups after loading the topology, thanks to Scott Hicks.
  • Add support for CPU0 being offline in Linux, thanks to Garrett Clay.
  • Fix missing x86 Package and Core objects on FreeBSD/NetBSD.
  • Fix the import of very large distances with heterogeneous object types.
  • Fix a memory leak in the Linux backend, thanks to Perceval Anichini.

New in Hardware Locality 2.4.1 RC1 (Jan 26, 2021)

  • Fix AMD OpenCL device locality when PCI bus or device number >= 128.
  • Thanks to Edgar Leon for reporting the issue.
  • Applications using any of the following inline functions must be recompiled to get the fix: hwloc_opencl_get_device_pci_busid(), hwloc_opencl_get_device_cpuset(), hwloc_opencl_get_device_osdev().
  • Fix missing x86 Package and Core objects on FreeBSD/NetBSD.
  • Thanks to Thibault Payet and Yuri Victorovich for the report.
  • Fix the import of very large distances with heterogeneous object types.
  • Fix a memory leak in the Linux backend, thanks to Perceval Anichini.

New in Hardware Locality 2.4.0 (Dec 1, 2020)

  • API:
  • Add hwloc/cpukinds.h for reporting information about hybrid CPUs.
  • Use Linux cpufreq frequencies to rank cores by efficiency.
  • Use x86 CPUID hybrid leaf and future Linux kernels sysfs CPU type files to identify Intel Atom and Core cores.
  • Use the Windows native EfficiencyClass to separate kinds.
  • Backends:
  • Properly handle Linux kernel 5.10+ exposing ACPI HMAT information with knowledge of Generic Initiators.
  • Tools:
  • lstopo has new --cpukinds and --no-cpukinds options for showing CPU kinds or not in textual and graphical modes respectively.
  • hwloc-calc has a new --cpukind option for filtering PUs by kind.
  • hwloc-annotate has a new cpukind command for modifying CPU kinds.
  • Misc:
  • Fix hwloc_bitmap_nr_ulongs(), thanks to Norbert Eicker.
  • Add a documentation section about "Topology Attributes: Distances, Memory Attributes and CPU Kinds".
  • Silence some spurious warnings in the OpenCL backend and when showing process binding with lstopo --ps.

New in Hardware Locality 2.4.0 RC 1 (Nov 24, 2020)

  • API:
  • Add hwloc/cpukinds.h for reporting information about hybrid CPUs.
  • Use Linux cpufreq frequencies to rank cores by efficiency.
  • Use x86 CPUID hybrid leaf and future Linux kernels sysfs CPU type files to identify Intel Atom and Core cores.
  • Use the Windows native EfficiencyClass to separate kinds.
  • Backends:
  • Properly handle Linux kernel 5.10+ exposing ACPI HMAT information with knowledge of Generic Initiators.
  • Tools:
  • lstopo has new --cpukinds and --no-cpukinds options for showing CPU kinds or not in textual and graphical modes respectively.
  • hwloc-calc has a new --cpukind option for filtering PUs by kind.
  • hwloc-annotate has a new cpukind command for modifying CPU kinds.
  • Misc:
  • Fix hwloc_bitmap_nr_ulongs(), thanks to Norbert Eicker.
  • Add a documentation section about "Topology Attributes: Distances, Memory Attributes and CPU Kinds".
  • Silence some spurious warnings in the OpenCL backend and when showing process binding with lstopo --ps.

New in Hardware Locality 2.3.0 (Sep 28, 2020)

  • API:
  • Add hwloc/memattrs.h for exposing latency/bandwidth information between initiators (CPU sets for now) and target NUMA nodes, typically on heterogeneous platforms.
  • When available, bandwidths and latencies are read from the ACPI HMAT table exposed by Linux kernel 5.2+.
  • Attributes may also be customized to expose user-defined performance information.
  • Add hwloc_get_local_numanode_objs() for listing NUMA nodes that are local to some locality.
  • The new topology flag HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT causes support arrays to be loaded from XML exported with hwloc 2.3+.
  • hwloc_topology_get_support() now returns an additional "misc" array with feature "imported_support" set when support was imported.
  • Add hwloc_topology_refresh() to refresh internal caches after modifying the topology and before consulting the topology in a multithread context.
  • Backends:
  • Add a ROCm SMI backend and a hwloc/rsmi.h helper file for getting the locality of AMD GPUs, now exposed as "rsmi" OS devices.
  • Thanks to Mike Li.
  • Remove POWER device-tree-based topology on Linux (it was disabled by default since 2.1).
  • Tools:
  • Command-line options for specifying flags now understand comma-separated lists of flag names (substrings).
  • hwloc-info and hwloc-calc have new --local-memory, --local-memory-flags and --best-memattr options for reporting local memory nodes and filtering by memory attributes.
  • hwloc-bind has a new --best-memattr option for filtering by memory attributes among the memory binding set.
  • Tools that have a --restrict option may now receive a nodeset or some custom flags for restricting the topology.
  • lstopo now has a --thickness option for changing line thickness in the graphical output.
  • Fix lstopo drawing when autoresizing on Windows 10.
  • Pressing the F5 key in lstopo X11 and Windows graphical/interactive outputs now refreshes the display according to the current topology and binding.
  • Add a tikz lstopo graphical backend to generate pictures that are easily included in LaTeX documents. Thanks to Clement Foyer.
  • Misc:
  • The default installation path of the Bash completion file has changed to ${datadir}/bash-completion/completions/hwloc.
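
The --best-memattr options in the 2.3.0 notes above pick, among candidate NUMA nodes, the one with the best value for a memory attribute. A minimal sketch of that selection follows. It is an illustration only: the node names, the values, and the HIGHER_IS_BETTER table are invented here, and the real tools read attribute values from the topology rather than from a dictionary.

```python
# Whether a larger value is better for each attribute (assumed for this
# sketch: more bandwidth is better, less latency is better).
HIGHER_IS_BETTER = {"Bandwidth": True, "Latency": False}


def best_target(candidates, attribute):
    """Return the name of the candidate node with the best `attribute` value.

    `candidates` maps a node name to a dict of attribute values. Nodes that
    do not report the attribute are ignored. Returns None if no node
    reports it.
    """
    if attribute not in HIGHER_IS_BETTER:
        raise ValueError(f"unknown attribute in this sketch: {attribute!r}")
    reported = {name: values[attribute]
                for name, values in candidates.items()
                if attribute in values}
    if not reported:
        return None
    pick = max if HIGHER_IS_BETTER[attribute] else min
    return pick(reported, key=reported.get)


# Made-up local NUMA nodes of some CPU set, with made-up values.
local_nodes = {
    "DRAM#0": {"Bandwidth": 100, "Latency": 80},
    "HBM#1": {"Bandwidth": 400, "Latency": 120},
    "NVM#2": {"Bandwidth": 30},
}
best_bandwidth = best_target(local_nodes, "Bandwidth")
best_latency = best_target(local_nodes, "Latency")
```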

New in Hardware Locality 2.2.0 (Sep 21, 2020)

  • API:
  • Add hwloc_bitmap_singlify_by_core() to remove SMT from a given cpuset, thanks to Florian Reynier for the suggestion.
  • Add --enable-32bits-pci-domain to stop ignoring PCI devices with domain >16bits (e.g. 10000:02:03.4). Enabling this option breaks the library ABI.
  • Thanks to Dylan Simon for the help.
  • Backends:
  • Add support for Linux cgroups v2.
  • Add NUMA support for FreeBSD.
  • Add get_last_cpu_location support for FreeBSD.
  • Remove support for Intel Xeon Phi (MIC, Knights Corner) co-processors.
  • Tools:
  • Add --uid to filter the hwloc-ps output by uid on Linux.
  • Add a GRAPHICAL OUTPUT section in the manpage of lstopo.
  • Misc:
  • Use the native dlopen instead of libltdl, unless --disable-plugin-dlopen is passed at configure time.
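
To illustrate what "removing SMT from a cpuset" means for hwloc_bitmap_singlify_by_core() in the 2.2.0 notes above, here is a small model using plain Python sets. It is not the hwloc implementation: the core-to-PU layout is invented, and keeping the lowest-numbered PU of each core is an assumption chosen for the example.

```python
def singlify_by_core(cpuset, cores):
    """Keep at most one PU per core from `cpuset`.

    cpuset  set of PU indexes currently selected
    cores   list of PU-index lists, one list per core (hypothetical layout)

    For every core that has at least one PU in `cpuset`, only the
    lowest-numbered of those PUs is kept.
    """
    kept = set()
    for core_pus in cores:
        present = sorted(pu for pu in core_pus if pu in cpuset)
        if present:
            kept.add(present[0])
    return kept


# Four cores with two hardware threads each: core 0 holds PUs 0 and 4, etc.
cores = [[0, 4], [1, 5], [2, 6], [3, 7]]
result = singlify_by_core({0, 4, 5, 2}, cores)
```

In this example PUs 0 and 4 belong to the same core, so only PU 0 survives, while PUs 5 and 2 are each the only selected PU of their core and are kept.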

New in Hardware Locality 1.11.0 (Jun 18, 2015)

  • API:
  • Socket objects are renamed into Package to align with the terminology used by processor vendors. The old HWLOC_OBJ_SOCKET type and "Socket" name are still supported for backward compatibility.
  • HWLOC_OBJ_NODE is replaced with HWLOC_OBJ_NUMANODE for clarification. HWLOC_OBJ_NODE is still supported for backward compatibility.
  • "Node" and "NUMANode" strings are supported as in earlier releases.
  • Detection improvements:
  • Add support for Intel Knights Landing Xeon Phi.
  • Add Vendor, Model, Revision, SerialNumber, Type and LinuxDeviceID info attributes to Block OS devices on Linux.
  • Add --disable-libudev to avoid dependency on the libudev library.
  • Add "MemoryDevice" Misc objects with information about DIMMs, on Linux when privileged and when I/O is enabled.
  • Add a PCISlot attribute to PCI devices on Linux when supported to identify the physical PCI slot where the board is plugged.
  • Add CPUStepping info attribute on x86 processors.
  • Ignore the device-tree on non-Power architectures to avoid buggy detection on ARM.
  • Work-around buggy Xeon E5v3 BIOS reporting invalid PCI-NUMA affinity for the PCI links on the second processor.
  • Add support for CUDA compute capability 5.x.
  • Many fixes to the x86 backend:
  • Add L1i and fix L2/L3 type on old AMD processors without topoext support.
  • Fix Intel CPU family and model numbers when basic family isn't 6 or 15.
  • Fix package IDs on recent AMD processors.
  • Fix misc issues due to incomplete APIC IDs on x2APIC processors.
  • Avoid buggy discovery on old SGI Altix UVs with non-unique APIC IDs.
  • Gather total machine memory on NetBSD.
  • Tools:
  • lstopo:
  • Collapse identical PCI devices unless --no-collapse is given.
  • The ASCII art output is now called "ascii", for instance in "lstopo -.ascii".
  • Automatically scales graphical box width to the inner text in Cairo, ASCII and Windows outputs.
  • Add --rect to lstopo to force rectangular layout even for NUMA nodes.
  • Add --restrict-flags to configure the behavior of --restrict.
  • Objects may have a "Type" info attribute to specify a better type name and display it in lstopo.
  • Really export all verbose information to the given output file.
  • hwloc-annotate:
  • May now operate on all types of objects, including I/O.
  • May now insert Misc objects in the topology.
  • Do not drop instruction caches and I/O devices from the output anymore.
  • Fix lstopo path in hwloc-gather-topology after install.
  • Misc:
  • Fix hwloc/cudart.h for machines with multiple PCI domains.
  • Fix PCI Bridge-specific depth attribute.
  • Fix hwloc_bitmap_intersect() for two infinite bitmaps.
  • Fix some corner cases in the building of levels on large NUMA machines with non-uniform NUMA groups and I/Os.
  • Improve the performance of object insertion by cpuset for large topologies.
  • Prefix verbose XML import errors with the source name.
  • Improve pkg-config checks and error messages.
  • Fix excluding after a component with an argument in the HWLOC_COMPONENTS environment variable.
  • Documentation:
  • Fix the recommended way in documentation and examples to allocate memory on some node, it should use HWLOC_MEMBIND_BIND.
  • Add a "Miscellaneous objects" section in the documentation.
  • Add a FAQ entry "What happens to my topology if I disable symmetric multithreading, hyper-threading, etc. ?" to the documentation.

New in Hardware Locality 1.10.1 (Jun 18, 2015)

  • Actually remove disallowed NUMA nodes from nodesets when the whole-system flag isn't enabled.
  • Fix the gathering of PCI domains. Thanks to James Custer for reporting the issue and providing a patch.
  • Fix the merging of identical parent and child in presence of Misc objects.
  • Fix some misordering of children when merging with ignore_keep_structure() in partially allowed topologies.
  • Fix an overzealous assertion in the debug code when running on a single-PU host with I/O. Thanks to Thomas Van Doren for reporting the issue.
  • Don't forget to setup NUMA node object nodesets in x86 backend (for BSDs) and OSF/Tru64 backend.
  • Fix cpuid-x86 build error with gcc -O3 on x86-32.
  • Fix support for future very large caches in the x86 backend.
  • Fix vendor/device names for SR-IOV PCI devices on Linux.
  • Fix an unlikely crash in case of buggy hierarchical distance matrix.
  • Fix PU os_index on some AIX releases.
  • Fix hwloc_bitmap_isincluded() in case of infinite sets.
  • Change hwloc-ls.desktop into a lstopo.desktop and only install it if lstopo is built with Cairo/X11 support. It cannot work with a non-graphical lstopo or hwloc-ls.
  • Add support for the renaming of Socket into Package in future releases.
  • Add support for the replacement of HWLOC_OBJ_NODE with HWLOC_OBJ_NUMANODE in future releases.
  • Clarify the documentation of distance matrices in hwloc.h and in the manpage of hwloc-distances.
  • Improve some error messages by displaying more information about the hwloc library in use.
  • Document how to deal with the ABI break when upgrading to the upcoming 2.0.

New in Hardware Locality 1.10.0 (Jun 18, 2015)

  • API:
  • Add hwloc_topology_export_synthetic() to export a topology to a synthetic string without using lstopo. See the Synthetic topologies section in the documentation.
  • Add hwloc_topology_set/get_userdata() to let the application save a private pointer in the topology whenever it needs a way to find its own object corresponding to a topology.
  • Add hwloc_get_numanode_obj_by_os_index() and document that this function as well as hwloc_get_pu_obj_by_os_index() are good at converting node sets and cpusets into objects.
  • hwloc_distrib() does not ignore any objects anymore when there are too many of them. They get merged with others instead.
  • Tools:
  • hwloc-bind --get now executes the command after displaying the binding instead of ignoring the command entirely.
  • Clarify that memory sizes shown in lstopo are local by default unless specified (total memory added in the root object).
  • Synthetic topologies:
  • Synthetic topology descriptions may now specify attributes such as memory sizes and OS indexes. See the Synthetic topologies section in the documentation.
  • lstopo now exports in this fully-detailed format by default.
  • The new option --export-synthetic-flags may be used to revert to the old format.
  • Documentation:
  • Add the doc/examples/ subdirectory with several real-life examples, including the already existing hwloc-hello.c for basics.
  • Improve the documentation of CPU and memory binding in the API.
  • Add a FAQ entry about operating system errors, especially on AMD platforms with buggy cache information.
  • Add a FAQ entry about loading many topologies in a single program.
  • Misc:
  • Work around buggy Linux kernels reporting 2 sockets instead of 1 socket with 2 NUMA nodes for each Xeon E5 v3 (Haswell) processor.
  • pciutils/libpci support is now removed since libpciaccess works well and there's also a Linux-specific PCI backend. For the record, pciutils was GPL and therefore disabled by default since v1.6.2.
  • Add --disable-cpuid configure flag to work around buggy processor simulators reporting invalid CPUID information.
  • Fix a racy use of libltdl when manipulating multiple topologies in different threads.
  • Fix some build failures in private/misc.h.
  • Fix failures to detect X11/Xutil.h on some Solaris platforms.
  • The plugin ABI has changed, this release will not load plugins built against previous hwloc releases.
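
As a rough picture of the synthetic topology descriptions mentioned in the 1.10.0 notes above, the sketch below parses a string of "Type:count" levels, each optionally followed by attributes in parentheses, and computes how many objects each level contains in total. The exact attribute syntax shown (memory=1GB) is an assumption here, and the parser is deliberately loose: it ignores anything it does not recognize instead of reporting errors as a real parser would.

```python
import re

# One level: a type name, a per-parent count, and optional "(...)" attributes.
_LEVEL_RE = re.compile(r"([A-Za-z][A-Za-z0-9]*):(\d+)(?:\(([^)]*)\))?")


def parse_synthetic(description):
    """Return one dict per level, from the outermost to the innermost level.

    Each dict holds the level type, the number of objects per parent, the
    total number of objects in the whole machine, and the raw attribute
    string (empty when no attributes were given).
    """
    levels = []
    total = 1
    for match in _LEVEL_RE.finditer(description):
        name, count, attrs = match.group(1), int(match.group(2)), match.group(3)
        total *= count  # each parent object contains `count` children
        levels.append({
            "type": name,
            "per_parent": count,
            "total": total,
            "attrs": attrs or "",
        })
    return levels


levels = parse_synthetic("NUMANode:2(memory=1GB) Core:4 PU:2")
totals = [(level["type"], level["total"]) for level in levels]
```

Here two NUMA nodes with four cores each and two PUs per core give 8 cores and 16 PUs in total.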

New in Hardware Locality 1.9.1 (Sep 29, 2014)

  • Fix a crash when the PCI locality is invalid. Attach to the root object instead. Thanks to Nicolas Denoyelle for reporting the issue.
  • Fix -f in lstopo manpage. Thanks to Jirka Hladky for reporting the issue.
  • Fix hwloc_obj_type_sscanf() and others when strncasecmp() is not properly available. Thanks to Nick Papior Andersen for reporting the problem.
  • Mark Linux file descriptors as close-on-exec to avoid leaks on exec.
  • Fix some minor memory leaks.

New in Hardware Locality 1.9.0 (Mar 28, 2014)

  • API:
  • Add hwloc_obj_type_sscanf() to extend hwloc_obj_type_of_string() with type-specific attributes such as Cache/Group depth and Cache type. hwloc_obj_type_of_string() is moved to hwloc/deprecated.h.
  • Add hwloc_linux_get_tid_last_cpu_location() for retrieving the last CPU where a Linux thread given by TID ran.
  • Add hwloc_distrib() to extend the old hwloc_distribute[v]() functions. hwloc_distribute[v]() is moved to hwloc/deprecated.h.
  • Don't mix total and local memory when displaying verbose object attributes with hwloc_obj_attr_snprintf() or in lstopo.
  • Backends:
  • Add CPUVendor, CPUModelNumber and CPUFamilyNumber info attributes for x86, ia64 and Xeon Phi sockets on Linux, to extend the x86-specific support added in v1.8.1.
  • Add many CPU- and Platform-related info attributes on ARM and POWER platforms, in the Machine and Socket objects.
  • Add CUDA info attributes describing the number of multiprocessors and cores and the size of the global, shared and L2 cache memories in CUDA OS devices.
  • Add OpenCL info attributes describing the number of compute units and the global memory size in OpenCL OS devices.
  • The synthetic backend now accepts extended types such as L2Cache, L1i or Group3. lstopo also exports synthetic strings using these extended types.
  • Tools:
  • lstopo:
  • Do not overwrite output files by default anymore. Pass -f or --force to enforce it.
  • Display OpenCL, CUDA and Xeon Phi numbers of cores and memory sizes in the graphical output.
  • Fix export to stdout when specifying a Cairo-based output type with --of.
  • hwloc-ps:
  • Add -e or --get-last-cpu-location to report where processes/threads run instead of where they are bound.
  • Report locations as likely-more-useful objects such as Cores or Sockets instead of Caches when possible.
  • hwloc-bind:
  • Fix failure on Windows when not using --pid.
  • Add -e as a synonym to --get-last-cpu-location.
  • hwloc-distrib:
  • Add --reverse to distribute using last objects first and singlify into last bits first.
  • hwloc-info:
  • Report unified caches when looking for data or instruction cache ancestor objects.
  • Misc:
  • Add experimental Visual Studio support under contrib/windows.
  • Fix some overzealous assertions and warnings about the ordering of objects on a level with respect to cpusets. The ordering is only guaranteed for complete cpusets (based on the first bit in sets).
  • Fix some memory leaks when importing xml diffs and when exporting a "too complex" entry.
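
The extended type strings mentioned above ("L2Cache", "L1i", "Group3") combine a type name with a depth and, for caches, an optional cache type. The following is only an illustrative Python sketch of how such strings could be decomposed. The grammar is an assumption inferred from those three examples, the function name parse_extended_type is hypothetical, and this is not how hwloc_obj_type_sscanf() is implemented.

```python
import re

# Hypothetical grammar, inferred only from the examples "L2Cache", "L1i"
# and "Group3" in the changelog; the real parser may accept more forms.
_CACHE = re.compile(r"^L(\d+)([id]?)(?:Cache)?$", re.IGNORECASE)
_GROUP = re.compile(r"^Group(\d+)$", re.IGNORECASE)


def parse_extended_type(text):
    """Split an extended type string into (kind, depth, cache_type).

    cache_type is "instruction", "data", or None when no suffix is given.
    Returns None for strings this sketch does not recognise.
    """
    match = _CACHE.match(text)
    if match:
        suffix = match.group(2).lower()
        cache_type = {"i": "instruction", "d": "data"}.get(suffix)
        return ("Cache", int(match.group(1)), cache_type)
    match = _GROUP.match(text)
    if match:
        return ("Group", int(match.group(1)), None)
    return None
```

In this sketch a plain type name such as "Socket" returns None, since only the depth-carrying forms are handled.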

New in Hardware Locality 1.8.1 (Mar 28, 2014)

  • Fix the cpuid code on Windows 64bits so that the x86 backend gets enabled as expected and can populate CPU information.
  • Add CPUVendor/CPUModelNumber/CPUFamilyNumber attributes when running on x86 architecture.
  • Work around buggy BIOS reporting duplicate NUMA nodes on Linux.
  • Add a name to the lstopo graphical window.

New in Hardware Locality 1.8.0 (Mar 28, 2014)

  • New components:
  • Add the "linuxpci" component that always works on Linux even when libpciaccess and libpci aren't available (and even with a modified file-system root). By default the old "pci" component runs first because "linuxpci" lacks device names (obj->name is always NULL).
  • API:
  • Add the topology difference API in hwloc/diff.h for manipulating many similar topologies.
  • Add hwloc_topology_dup() for duplicating an entire topology.
  • hwloc.h and hwloc/helper.h have been reorganized to clarify the documentation sections. The actual inline code has moved out of hwloc.h into the new hwloc/inlines.h.
  • Deprecated functions are now in hwloc/deprecated.h, and not in the official documentation anymore.
  • Tools:
  • Add hwloc-diff and hwloc-patch tools together with the new diff API.
  • Add hwloc-compress-dir to (de)compress an entire directory of XML files using hwloc-diff and hwloc-patch.
  • Object colors in the graphical output of lstopo may be changed by adding a "lstopoStyle" info attribute. See CUSTOM COLORS in the lstopo(1) manpage for details.
  • hwloc-gather-topology may now gather I/O-related files on Linux when --io is given. Only the linuxpci component supports discovering I/O objects from these extended tarballs.
  • hwloc-annotate now supports --ri to remove/replace info attributes with a given name.
  • hwloc-info supports "root" and "all" special locations for dumping information about the root object.
  • lstopo now supports --append-legend to append custom lines of text to the legend in the graphical output. Thanks to Jirka Hladky for discussing the idea.
  • hwloc-calc and friends have a more robust parsing of locations given on the command-line and they report useful error messages about it.
  • Add --whole-system to hwloc-bind, hwloc-calc, hwloc-distances and hwloc-distrib, and add --restrict to hwloc-bind for uniformity among tools.
  • Misc:
  • Calling hwloc_topology_load() or hwloc_topology_set_*() on an already loaded topology now returns an error (deprecated since release 1.6.1).
  • Fix the initialisation of cpusets and nodesets in Group objects added when inserting PCI hostbridges.
  • Never merge Group objects that were added explicitly by the user with hwloc_custom_insert_group_object_by_parent().
  • Add a sanity check during dynamic plugin loading to prevent some crashes when hwloc is dynamically loaded by another plugin mechanism.
  • Add --with-hwloc-plugins-path to specify the install/load directories of plugins.
  • Add the MICSerialNumber info attribute to the root object when running hwloc inside a Xeon Phi to match the same attribute in the MIC OS device when running in the host.

New in Hardware Locality 1.7.2 (Mar 28, 2014)

  • Do not create invalid block OS devices on very old Linux kernel such as RHEL4 2.6.9.
  • Fix PCI subvendor/device IDs.
  • Fix the management of Misc objects inserted by parent.
  • Add a PortState info attribute to OpenFabrics OS devices.
  • Add a MICSerialNumber info attribute to Xeon Phi/MIC OS devices.
  • Improve verbose error messages when failing to load from XML.

New in Hardware Locality 1.7.1 (Mar 28, 2014)

  • Fix a failed assertion in the distance grouping code when loading an XML file that already contains some groups.
  • Remove unexpected Group objects when loading XML topologies with I/O objects and NUMA distances.
  • Fix PCI link speed discovery when using libpciaccess.
  • Fix invalid libpciaccess virtual function device/vendor IDs when using SR-IOV PCI devices on Linux.
  • Fix GL component build with old NVCtrl releases.
  • Fix embedding breakage caused by libltdl.
  • Always use the system-wide libltdl instead of shipping one inside hwloc.
  • Document issues when enabling plugins while embedding hwloc in another project, in the documentation section Embedding hwloc in Other Software.
  • Add a FAQ entry "How to get useful topology information on NetBSD?" in the documentation.
  • Some fixes in the renaming code for embedding.
  • Miscellaneous minor build fixes.

New in Hardware Locality 1.7.0 (Mar 28, 2014)

  • New operating system backends:
  • Add BlueGene/Q compute node kernel (CNK) support.
  • Add NetBSD support, thanks to Aleksej Saushev.
  • New I/O device discovery:
  • Add co-processor OS devices such as "mic0" for Intel Xeon Phi (MIC) on Linux.
  • Add co-processor OS devices such as "cuda0" for NVIDIA CUDA-capable GPUs.
  • Add co-processor OS devices such as "opencl0d0" for OpenCL GPU devices on the AMD OpenCL implementation.
  • Add GPU OS devices such as ":0.0" for NVIDIA X11 displays.
  • Add GPU OS devices such as "nvml0" for NVIDIA GPUs.
  • These new OS devices have some string info attributes such as CoProcType, GPUModel, etc. to better identify them.
  • See the I/O Devices and Attributes documentation sections for details.
  • New components:
  • Add the "opencl", "cuda", "nvml" and "gl" components for I/O device discovery.
  • "nvml" also improves the discovery of NVIDIA GPU PCIe link speed.
  • All of these new components may be built as plugins. They may also be disabled entirely by passing --disable-opencl/cuda/nvml/gl to configure.
  • API:
  • Add hwloc_topology_get_flags().
  • Add hwloc/plugins.h for building external plugins.
  • Interoperability:
  • Add hwloc/opencl.h, hwloc/nvml.h, hwloc/gl.h and hwloc/intel-mic.h to retrieve the locality of OS devices that correspond to AMD OpenCL GPU devices or indexes, to NVML devices or indexes, to NVIDIA X11 displays, or to Intel Xeon Phi (MIC) device indexes.
  • Add new helpers in hwloc/cuda.h and hwloc/cudart.h to convert between CUDA devices or indexes and hwloc OS devices.
  • Add hwloc_ibv_get_device_osdev() and clarify the requirements of the OpenFabrics Verbs helpers in hwloc/openfabrics-verbs.h.
  • Tools:
  • hwloc-info is not only a synonym of lstopo -s anymore, it also dumps information about objects given on the command-line.
  • Documentation:
  • Add a section "Existing components and plugins".
  • Add a list of common OS devices in section "Software devices".
  • Add a new FAQ entry "Why is lstopo slow?" about lstopo slowness issues because of GPUs.
  • Clarify the documentation of inline helpers in hwloc/myriexpress.h and hwloc/openfabrics-verbs.h.
  • Misc:
  • Improve cache detection on AIX.
  • The HWLOC_COMPONENTS variable now excludes the components whose names are prefixed with '-'.
  • lstopo --ignore PU now works when displaying the topology in graphical and textual mode (not when exporting to XML).
  • Make sure I/O options always appear in lstopo usage, not only when using pciutils/libpci.
  • Remove some unneeded Linux specific includes from some interoperability headers.
  • Fix some inconsistencies in hwloc-distrib and hwloc-assembler-remote manpages.
  • Fix a memory leak on AIX when getting memory binding.
  • Fix many small memory leaks on Linux.
  • The `libpci' component is now called `pci' but the old name is still accepted in the HWLOC_COMPONENTS variable for backward compatibility.
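
The two HWLOC_COMPONENTS rules in this release (a '-' prefix excludes a component, and `libpci' remains an alias of `pci') can be pictured with a small Python sketch. It assumes a comma-separated value, and the function name split_components is hypothetical. It only illustrates the described behavior and is not hwloc's own parsing code.

```python
# Old component names that are still accepted, per the changelog entry above.
_ALIASES = {"libpci": "pci"}


def split_components(value):
    """Split a HWLOC_COMPONENTS-style string into (requested, excluded) lists.

    Assumes names are separated by commas. A name prefixed with '-' is
    treated as an exclusion; aliases are mapped to their current names.
    """
    requested, excluded = [], []
    for name in value.split(","):
        name = name.strip()
        if not name:
            continue  # tolerate empty fields such as a trailing comma
        target = requested
        if name.startswith("-"):
            target = excluded
            name = name[1:]
        target.append(_ALIASES.get(name, name))
    return requested, excluded
```

For example, the string "linuxpci,-cuda,libpci" would yield the requested names linuxpci and pci, with cuda excluded.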

New in Hardware Locality 1.6.2 (Mar 28, 2014)

  • Use libpciaccess instead of pciutils/libpci by default for I/O discovery. pciutils/libpci is only used if --enable-libpci is given to configure because its GPL license may taint hwloc. See the Installation section in the documentation for details.
  • Fix get_cpubind on Solaris when bound to a single PU with processor_bind(). Thanks to Eugene Loh for reporting the problem and providing a patch.

New in Hardware Locality 1.6.1 (Mar 28, 2014)

  • Fix some crash or buggy detection in the x86 backend when Linux cgroups/cpusets restrict the available CPUs.
  • Fix the pkg-config output with --libs --static.
  • Fix the output of hwloc-calc -H --hierarchical when using logical indexes in the output.
  • Calling hwloc_topology_load() multiple times on the same topology is officially deprecated. hwloc will warn in such cases.
  • Add some documentation about existing plugins/components, package dependencies, and I/O devices specification on the command-line.

New in Hardware Locality 1.6.0 (Mar 28, 2014)

  • Major changes:
  • Reorganize the backend infrastructure to support dynamic selection of components and dynamic loading of plugins. For details, see the new documentation section Components and plugins.
  • The HWLOC_COMPONENTS variable lets one replace the default discovery components.
  • Dynamic loading of plugins may be enabled with --enable-plugins (except on AIX and Windows). It will build libxml2 and libpci support as separated modules. This helps reducing the dependencies of the core hwloc library when distributed as a binary package.
  • Backends:
  • Add CPUModel detection on Darwin and x86/FreeBSD.
  • The x86 backend now adds CPUModel info attributes to socket objects created by other backends that do not natively support this attribute.
  • Fix detection on FreeBSD in case of cpuset restriction.
  • XML:
  • Add hwloc_topology_set_userdata_import/export_callback(), hwloc_export_obj_userdata() and _userdata_base64() to let applications specify how to save/restore the custom data they placed in the userdata private pointer field of hwloc objects.
  • Tools:
  • Add hwloc-annotate program to add string info attributes to XML topologies.
  • Add --pid-cmd to hwloc-ps to append the output of a command to each PID line. May be used for showing Open MPI process ranks, see the hwloc-ps(1) manpage for details.
  • hwloc-bind now exits with an error if binding fails; the executable is not launched unless binding succeeded or --force was given.
  • Add --quiet to hwloc-calc and hwloc-bind to hide non-fatal error messages.
  • Fix command-line pid support in Windows tools.
  • All programs accept --verbose as a synonym to -v.
  • Misc:
  • Fix some DIR descriptor leaks on Linux.
  • Fix I/O device lists when some were filtered out after a XML import.
  • Fix the removal of I/O objects when importing a I/O-enabled XML topology without any I/O topology flag.
  • When merging objects with HWLOC_IGNORE_TYPE_KEEP_STRUCTURE or lstopo --merge, compare object types before deciding which one of two identical objects to remove (e.g. keep sockets in favor of caches).
  • Add some GUID- and LID-related info attributes to OpenFabrics OS devices.
  • Only add CPUType socket attributes on Solaris/Sparc. Other cases don't report reliable information (Solaris/x86), and a replacement is available as the Architecture string info in the Machine object.
  • Add missing Backend string info on Solaris in most cases.
  • Document object attributes and string infos in a new Attributes section in the documentation.
  • Add a section about Synthetic topologies in the documentation.

New in Hardware Locality 1.5.2 (Mar 28, 2014)

  • Use libpciaccess instead of pciutils/libpci by default for I/O discovery. pciutils/libpci is only used if --enable-libpci is given to configure because its GPL license may taint hwloc. See the Installation section in the documentation for details.
  • Fix get_cpubind on Solaris when bound to a single PU with processor_bind(). Thanks to Eugene Loh for reporting the problem and providing a patch.
  • Fix some DIR descriptor leaks on Linux.
  • Fix I/O device lists when some were filtered out after a XML import.
  • Add missing Backend string info on Solaris in most cases.
  • Fix the removal of I/O objects when importing a I/O-enabled XML topology without any I/O topology flag.
  • Fix the output of hwloc-calc -H --hierarchical when using logical indexes in the output.
  • Fix the pkg-config output with --libs --static.

New in Hardware Locality 1.4.1 (Mar 7, 2012)

  • This release contains all changes from v1.3.2.
  • Fix hwloc_alloc_membind, thanks Karl Napf for reporting the issue.
  • Fix memory leaks in some get_membind() functions.
  • Fix helpers converting from Linux libnuma to hwloc (hwloc/linux-libnuma.h) in case of out-of-order NUMA node ids.
  • Fix some overzealous assertions in the distance grouping code.
  • Work around BIOS reporting empty I/O locality in cuda and openfabrics helpers on Linux. Thanks to Albert Solernou for reporting the problem.
  • Install a valgrind suppressions file hwloc-valgrind.supp (see the FAQ).
  • Fix memory binding documentation. Thanks to Karl Napf for reporting the issues.

New in Hardware Locality 1.4.0 (Jan 31, 2012)

  • Major features
  • Add "custom" interface and "assembler" tools to build multi-node topology. See the Multi-node Topologies section in the documentation for details.
  • Interface improvements
  • Add symmetric_subtree object attribute to ease assumptions when consulting regular symmetric topologies.
  • Add a CPUModel and CPUType info attribute to Socket objects on Linux and Solaris.
  • Add hwloc_get_obj_index_inside_cpuset() to retrieve the "logical" index of an object within a subtree of the topology.
  • Add more NVIDIA CUDA helpers in cuda.h and cudart.h to find hwloc objects corresponding to CUDA devices.
  • Discovery improvements
  • Add a group object above partial distance matrices to make sure the matrices are available in the final topology, except when this new object would contradict the existing hierarchy.
  • Grouping by distances now also works when loading from XML.
  • Fix some corner cases in object insertion, for instance when dealing with NUMA nodes without any CPU.
  • Backends
  • Implement hwloc_get_area_membind() on Linux.
  • Honor I/O topology flags when importing from XML.
  • Further improve XML-related error checking and reporting.
  • Hide synthetic topology error messages unless HWLOC_SYNTHETIC_VERBOSE=1.
  • Tools
  • Add synthetic exporting of symmetric topologies to lstopo.
  • lstopo --horiz and --vert can now be applied to some specific object types.
  • lstopo -v -p now displays distance matrices with physical indexes.
  • Add hwloc-distances utility to list distances.
  • Documentation
  • Fix and/or document the behavior of most inline functions in hwloc/helper.h when the topology contains some I/O or Misc objects.
  • Backend documentation enhancements.
  • Bug fixes
  • Fix missing last bit in hwloc_linux_get_thread_cpubind(). Thanks to Carolina Gómez-Tostón Gutiérrez for reporting the issue.
  • Fix FreeBSD build without cpuid support.
  • Fix several Windows build issues.
  • Fix inline keyword definition in public headers.
  • Fix dependencies in the embedded library.
  • Improve visibility support detection. Thanks to Dave Love for providing the patch.
  • Remove references to internal symbols in the tools.

New in Hardware Locality 1.3.1 (Jan 31, 2012)

  • Fix pciutils detection with pkg-config when not installed in standard directories.
  • Fix visibility options detection with the Solaris Studio compiler. Thanks to Igor Galic and Terry Dontje for reporting the problems.
  • Fix support for old Linux sched.h headers such as those found on Red Hat 8. Thanks to Paul H. Hargrove for reporting the problems.
  • Fix inline and attribute support for Solaris compilers. Thanks to Dave Love for reporting the problems.
  • Print a short summary at the end of the configure output. Thanks to Stefan Eilemann for the suggestion.
  • Add --disable-libnuma configure option to disable libnuma-based memory binding support on Linux. Thanks to Rayson Ho for the suggestion.
  • Make hwloc's configure script properly obey $PKG_CONFIG. Thanks to Nathan Phillip Brink for raising the issue.
  • Silence some harmless pciutils warnings, thanks to Paul H. Hargrove for reporting the problem.
  • Fix the documentation with respect to hwloc_pid_t and hwloc_thread_t being either pid_t and pthread_t on Unix, or HANDLE on Windows.

New in Hardware Locality 1.3.0 (Jan 31, 2012)

  • Major features
  • Add I/O devices and bridges to the topology using the pciutils library. Only enabled after setting the relevant flag with hwloc_topology_set_flags() before hwloc_topology_load(). See the I/O Devices section in the documentation for details.
  • Discovery improvements
  • Add associativity to the cache attributes.
  • Add support for s390/z11 "books" on Linux.
  • Add the HWLOC_GROUPING_ACCURACY environment variable to relax distance-based grouping constraints. See the Environment Variables section in the documentation for details about grouping behavior and configuration.
  • Allow user-given distance matrices to remove or replace those discovered by the OS backend.
  • XML improvements
  • XML is now always supported: a minimalistic custom import/export code is used when libxml2 is not available. It is only guaranteed to read XML files generated by hwloc.
  • hwloc_topology_export_xml() and export_xmlbuffer() now return an integer.
  • Add hwloc_free_xmlbuffer() to free the buffer allocated by hwloc_topology_export_xmlbuffer().
  • Hide XML topology error messages unless HWLOC_XML_VERBOSE=1.
  • Minor API updates
  • Add hwloc_obj_add_info to customize object info attributes.
  • Tools
  • lstopo now displays I/O devices by default. Several options are added to configure the I/O discovery.
  • hwloc-calc and hwloc-bind now accept I/O devices as input.
  • Add --restrict option to hwloc-calc and hwloc-distribute.
  • Add --sep option to change the output field separator in hwloc-calc.
  • Add --whole-system option to hwloc-ps.

New in Hardware Locality 1.2.2 (Jan 31, 2012)

  • Fix build on AIX 5.2, thanks to Utpal Kumar Ray for the report.
  • Fix XML import of very large page sizes or counts on 32-bit platforms, thanks to Karsten Hopp for the RedHat ticket.
  • Fix crash when administrator limitations such as Linux cgroup require restricting distance matrices. Thanks to Ake Sandgren for reporting the problem.
  • Fix the removal of objects such as AMD Magny-Cours dual-node sockets in case of administrator restrictions.
  • Improve error reporting and messages in case of wrong synthetic topology description.
  • Several other minor internal fixes and documentation improvements.

New in Hardware Locality 1.2.1 (Jan 31, 2012)

  • Improve support of AMD Bulldozer "Compute-Unit" modules by detecting logical processors with different core IDs on Linux.
  • Fix hwloc-ps crash when listing processes from another Linux cpuset. Thanks to Carl Smith for reporting the problem.
  • Fix build on AIX and Solaris. Thanks to Carl Smith and Andreas Kupries for reporting the problems.
  • Fix cache size detection on Darwin. Thanks to Erkcan Özcan for reporting the problem.
  • Make configure fail if --enable-xml or --enable-cairo is given and proper support cannot be found. Thanks to Andreas Kupries for reporting the XML problem.
  • Fix spurious L1 cache detection on AIX. Thanks to Hendryk Bockelmann for reporting the problem.
  • Fix hwloc_get_last_cpu_location(THREAD) on Linux. Thanks to Gabriele Fatigati for reporting the problem.
  • Fix object distance detection on Solaris.
  • Add pthread_self weak symbol to ease static linking.
  • Minor documentation fixes.

New in Hardware Locality 1.2.0 (Jan 31, 2012)

  • Major features
  • Expose latency matrices in the API as an array of distance structures within objects. Add several helpers to find distances.
  • Add hwloc_topology_set_distance_matrix() and environment variables to provide a matrix of distances between a given set of objects.
  • Add hwloc_get_last_cpu_location() and hwloc_get_proc_last_cpu_location() to retrieve the processors where a process or thread recently ran.
  • Add the corresponding --get-last-cpu-location option to hwloc-bind.
  • Add hwloc_topology_restrict() to restrict an existing topology to a given cpuset.
  • Add the corresponding --restrict option to lstopo.
  • Minor API updates
  • Add hwloc_bitmap_list_sscanf/snprintf/asprintf to convert between bitmaps and strings such as 4-5,7-9,12,15-
  • hwloc_bitmap_set/clr_range() now support infinite ranges.
  • Clarify the difference between inserting Misc objects by cpuset or by parent.
  • hwloc_insert_misc_object_by_cpuset() now returns NULL in case of error.
  • Discovery improvements
  • x86 backend (for freebsd): add x2APIC support.
  • Support standard device-tree phandle, to get better support on e.g. ARM systems providing it.
  • Detect cache size on AIX. Thanks Christopher and IBM.
  • Improve grouping to support asymmetric topologies.
  • Tools
  • Command-line tools now support "all" and "root" special locations consisting of the entire topology, as well as type names with depth attributes such as L2 or Group4.
  • hwloc-calc improvements
  • Add --number-of/-N option to report the number of objects of a given type or depth.
  • -I is now equivalent to --intersect for listing the indexes of objects of a given type or depth that intersect the input.
  • Add -H to report the output as a hierarchical combination of types and depths.
  • Add --thissystem to lstopo.
  • Add lstopo-win, a console-less lstopo variant on Windows.
  • Miscellaneous
  • Remove C99 usage from code base.
  • Rename hwloc-gather-topology.sh into hwloc-gather-topology.
  • Fix AMD cache discovery on freebsd when there is no L3 cache, thanks Andriy Gapon for the fix.
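
The list string format shown above (4-5,7-9,12,15-) mixes single indexes, bounded ranges and a trailing open-ended range. As a rough illustration of that format only, here is a Python sketch that decodes such a string. The function name parse_bitmap_list and the choice to report the infinite part separately are assumptions made for the example. The actual conversion is done in C by hwloc_bitmap_list_sscanf().

```python
def parse_bitmap_list(text):
    """Parse a list string into (finite_indexes, infinite_from).

    finite_indexes is a sorted list of the explicitly bounded indexes.
    infinite_from is the first index of an open-ended range such as "15-",
    or None when every range is bounded.
    """
    indexes = set()
    infinite_from = None
    for chunk in text.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        begin, dash, end = chunk.partition("-")
        if not dash:
            # Single index such as "12".
            indexes.add(int(begin))
        elif end == "":
            # Open-ended range such as "15-": every index from 15 upwards.
            start = int(begin)
            if infinite_from is None or start < infinite_from:
                infinite_from = start
        else:
            # Bounded range such as "7-9", inclusive on both ends.
            first, last = int(begin), int(end)
            if last < first:
                raise ValueError("descending range: " + chunk)
            indexes.update(range(first, last + 1))
    return sorted(indexes), infinite_from
```

Keeping the open-ended range as a separate start index avoids materialising an infinite set, which mirrors the idea of "infinite ranges" mentioned for hwloc_bitmap_set/clr_range() without claiming anything about how hwloc stores them.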

New in Hardware Locality 1.1.2 (Jan 31, 2012)

  • Fix a segfault in the distance-based grouping code when some objects are not placed in any group. Thanks to Bernd Kallies for reporting the problem and providing a patch.
  • Fix the command-line parsing of hwloc-bind --mempolicy interleave. Thanks to Guy Streeter for reporting the problem.
  • Stop truncating the output in hwloc_obj_attr_snprintf() and in the corresponding lstopo output. Thanks to Guy Streeter for reporting the problem.
  • Fix object levels ordering in synthetic topologies.
  • Fix potential incoherency between device tree and kernel information, when SMT is disabled on Power machines.
  • Fix and document the behavior of hwloc_topology_set_synthetic() in case of invalid argument. Thanks to Guy Streeter for reporting the problem.
  • Add some verbose error message reporting when it looks like the OS gives erroneous information.
  • Do not include unistd.h and stdint.h in public headers on Windows.
  • Move config.h files into their own subdirectories to avoid name conflicts when AC_CONFIG_HEADERS adds -I's for them.
  • Remove the use of declaring variables inside "for" loops.
  • Some other minor fixes.
  • Many minor documentation fixes.

New in Hardware Locality 1.1.1 (Jan 31, 2012)

  • Add hwloc_get_api_version() which returns the version of hwloc used at runtime. Thanks to Guy Streeter for the suggestion.
  • Fix the number of hugepages reported for NUMA nodes on Linux.
  • Fix hwloc_bitmap_to_ulong() right after allocating the bitmap. Thanks to Bernd Kallies for reporting the problem.
  • Fix hwloc_bitmap_from_ith_ulong() to properly zero the first ulong. Thanks to Guy Streeter for reporting the problem.
  • Fix hwloc_get_membind_nodeset() on Linux. Thanks to Bernd Kallies for reporting the problem and providing a patch.
  • Fix some file descriptor leaks in the Linux discovery.
  • Fix the minimum width of NUMA nodes, caches and the legend in the graphical lstopo output. Thanks to Jirka Hladky for reporting the problem.
  • Various fixes to bitmap conversion from/to taskset-strings.
  • Fix and document snprintf functions behavior when the buffer size is too small or zero. Thanks to Guy Streeter for reporting the problem.
  • Fix configure to avoid spurious enabling of the cpuid backend. Thanks to Tim Anderson for reporting the problem.
  • Cleanup error management in hwloc-gather-topology.sh. Thanks to Jirka Hladky for reporting the problem and providing a patch.
  • Add a manpage and usage for hwloc-gather-topology.sh on Linux. Thanks to Jirka Hladky for providing a patch.
  • Memory binding documentation enhancements.

New in Hardware Locality 1.1.0 (Dec 17, 2010)

  • API:
  • Increase HWLOC_API_VERSION to 0x00010100 so that API changes may be detected at build-time.
  • Add a memory binding interface.
  • The cpuset API (hwloc/cpuset.h) is now deprecated. It is replaced by the bitmap API (hwloc/bitmap.h) which offers the same features with more generic names since it applies to CPU sets, node sets and more.
  • Backward compatibility with the cpuset API and ABI is still provided but it will be removed in a future release.
  • Old types (hwloc_cpuset_t, ...) are still available as a way to clarify what kind of hwloc_bitmap_t each API function manipulates.
  • Upgrading to the new API only requires replacing hwloc_cpuset_ function calls with the corresponding hwloc_bitmap_ calls, with the following renaming exceptions:
  • hwloc_cpuset_cpu -> hwloc_bitmap_only
  • hwloc_cpuset_all_but_cpu -> hwloc_bitmap_allbut
  • hwloc_cpuset_from_string -> hwloc_bitmap_sscanf
  • Add an `infos' array in each object to store couples of info names and values. It enables generic storage of things like the old dmi board infos that were previously stored in machine specific attributes.
  • Add linesize cache attribute.
  • Features:
  • Bitmaps (and thus CPU sets and node sets) are dynamically (re-)allocated, the maximal number of CPUs (HWLOC_NBMAXCPUS) has been removed.
  • Improve the distance-based grouping code to better support irregular distance matrices.
  • Add support for device-tree to get cache information (useful on Power architectures).
  • Helpers:
  • Add NVIDIA CUDA helpers in cuda.h and cudart.h to ease interoperability with CUDA Runtime and Driver APIs.
  • Add Myrinet Express helper in myriexpress.h to ease interoperability.
  • Tools
  • lstopo now displays physical/OS indexes by default in graphical mode (use -l to switch back to logical indexes). The textual output still uses logical by default (use -p to switch to physical indexes).
  • lstopo prefixes logical indexes with `L#' and physical indexes with `P#'. Physical indexes are also printed as `P#N' instead of `phys=N' within object attributes (in parentheses).
  • Add a legend at the bottom of the lstopo graphical output, use --no-legend to remove it.
  • Add hwloc-ps to list process bindings.
  • Add --membind and --mempolicy options to hwloc-bind.
  • Improve tools command-line options by adding a generic --input option (and more) which replaces the old --xml, --synthetic and --fsys-root.
  • Cleanup lstopo output configuration by adding --output-format.
  • Add --intersect in hwloc-calc, and replace --objects with --largest.
  • Add the ability to work on standard input in hwloc-calc.
  • Add --from, --to and --at in hwloc-distrib.
  • Add taskset-specific functions and command-line tools options to manipulate CPU set strings in the format of the taskset program.
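
The cpuset-to-bitmap renaming rule from the API notes above (replace the hwloc_cpuset_ prefix with hwloc_bitmap_, except for three listed functions) lends itself to a mechanical rewrite. The Python sketch below is one possible way to apply it to source text. The helper name port_to_bitmap_api is hypothetical, and the choice to rewrite only identifiers followed by an opening parenthesis is an assumption made so that type names such as hwloc_cpuset_t are left alone. Any real port should still be reviewed by hand.

```python
import re

# The three renaming exceptions listed in the changelog.
_EXCEPTIONS = {
    "hwloc_cpuset_cpu": "hwloc_bitmap_only",
    "hwloc_cpuset_all_but_cpu": "hwloc_bitmap_allbut",
    "hwloc_cpuset_from_string": "hwloc_bitmap_sscanf",
}

# Match hwloc_cpuset_<name> only when it is used as a call, i.e. followed
# by "(" (assumption: this keeps types like hwloc_cpuset_t untouched).
_CALL = re.compile(r"\bhwloc_cpuset_(\w+)(?=\s*\()")


def port_to_bitmap_api(source):
    """Rewrite hwloc_cpuset_*() calls in source text to hwloc_bitmap_*()."""

    def rename(match):
        old_name = match.group(0)
        default = "hwloc_bitmap_" + match.group(1)
        return _EXCEPTIONS.get(old_name, default)

    return _CALL.sub(rename, source)
```

Applied to a line such as "hwloc_cpuset_t s = hwloc_cpuset_alloc();", the sketch keeps the type and rewrites only the call.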

New in Hardware Locality 1.0.3 (Dec 17, 2010)

  • Remove unneeded runtime dependency on libibverbs.so in the library and all utils programs.
  • Fix hwloc_cpuset_to_linux_libnuma_ulongs in case of non-linear OS-indexes for NUMA nodes.
  • lstopo now displays physical/OS indexes by default in graphical mode (use -l to switch back to logical indexes). The textual output still uses logical by default (use -p to switch to physical indexes).

New in Hardware Locality 1.0.2 (Dec 17, 2010)

  • Public headers can now be included directly from C++ programs.
  • Solaris fix for non-contiguous cpu numbers.
  • Darwin 10.4 fix. Thanks to Olivier Cessenat for reporting the issue.
  • Revert 1.0.1 patch that ignored sockets with unknown ID values since it only slightly helped POWER7 machines with old Linux kernels while it prevents recent kernels from getting the complete POWER7 topology.
  • Fix hwloc_get_common_ancestor_obj().
  • Remove arch-specific bits in public headers.
  • Some fixes in the lstopo graphical output.
  • Various man page clarifications and minor updates.

New in Hardware Locality 1.0.1 (Dec 17, 2010)

  • Various Solaris fixes.
  • Fix "non-native" builds on x86 platforms (e.g., when building 32 bit executables with compilers that natively build 64 bit).
  • Ignore sockets with unknown ID values (which fixes issues on POWER7 machines).
  • Various man page clarifications and minor updates.
  • Fixed memory leaks in hwloc_setup_group_from_min_distance_clique().
  • Fix cache type filtering on MS Windows 7.
  • Fixed warnings when compiling with -DNDEBUG.

New in Hardware Locality 1.0.0 (Dec 17, 2010)

  • The ABI of the library has changed.
  • Backend updates:
  • Add FreeBSD support.
  • Add x86 cpuid based backend.
  • Add Linux cgroup support to the Linux cpuset code.
  • Support binding of entire multithreaded process on Linux.
  • Fix and enable Group support in Windows.
  • Cleanup XML export/import.
  • Objects:
  • HWLOC_OBJ_PROC is renamed into HWLOC_OBJ_PU for "Processing Unit", its stringified type name is now "PU".
  • Use new HWLOC_OBJ_GROUP objects instead of MISC when grouping objects according to NUMA distances or arbitrary OS aggregation.
  • Rework memory attributes.
  • Add different cpusets in each object to specify processors that are offline, unavailable, ...
  • Cleanup the storage of object names and DMI infos.
  • Features:
  • Add support for looking up specific PID topology information.
  • Add hwloc_topology_export_xml() to export the topology in an XML file.
  • Add hwloc_topology_get_support() to retrieve the supported features for the current topology context.
  • Support a non-SYSTEM object as the root of the tree; use MACHINE in most common cases.
  • Add hwloc_get_*cpubind() routines to retrieve the current binding of processes and threads.
  • API:
  • Add HWLOC_API_VERSION to help detect the currently used API version.
  • Add missing ending "e" to *compare* functions.
  • Add several routines to emulate PLPA functions.
  • Rename and rework the cpuset and/or/xor/not/clear operators to output their result in a dedicated argument instead of modifying one input.
  • Deprecate hwloc_obj_snprintf() in favor of hwloc_obj_type/attr_snprintf().
  • Clarify the use of parent and ancestor in the API, do not use father.
  • Replace hwloc_get_system_obj() with hwloc_get_root_obj().
  • Return -1 instead of HWLOC_OBJ_TYPE_MAX in the API since the latter isn't public.
  • Relax constraints in hwloc_obj_type_of_string().
  • Improve displaying of memory sizes.
  • Add 0x prefix to cpuset strings.
  • Tools:
  • lstopo now displays logical indexes by default; use --physical to revert to OS/physical indexes.
  • Add colors in the lstopo graphical outputs to distinguish between online, offline, reserved, ... objects.
  • Extend lstopo to show cpusets, filter objects by type, ...
  • Renamed hwloc-mask into hwloc-calc which supports many new options.
  • Documentation:
  • Add a hwloc(7) manpage containing general information.
  • Add documentation about how to switch from PLPA to hwloc.
  • Cleanup the distributed documentation files.
  • Miscellaneous:
  • Many compiler warning fixes.
  • Cleanup the ABI by using the visibility attribute.
  • Add project embedding support.

New in Hardware Locality 0.9.4 (Dec 17, 2010)

  • Fix resetting colors to normal in lstopo -.txt output.

New in Hardware Locality 0.9.3 (Dec 17, 2010)

  • Fix autogen.sh to work with Autoconf 2.63.
  • Fix various crashes in particular conditions:
  • xml files with root attributes
  • offline CPUs
  • partial sysfs support
  • unparseable /proc/cpuinfo
  • ignoring NUMA level while Misc levels have been generated
  • Tweak documentation a bit
  • Do not require the pthread library for binding the current thread on Linux
  • Do not erroneously assume that the sched_setaffinity prototype is the old version when there is actually none.
  • Fix _syscall3 compilation on archs for which we do not have the sched_setaffinity system call number.
  • Fix AIX binding.
  • Fix library dependencies: only lstopo now depends on libtermcap; fix binutils-gold link.
  • Have make check always build and run hwloc-hello.c
  • Do not limit size of a cpuset.

New in Hardware Locality 0.9.2 (Dec 17, 2010)

  • Trivial documentation changes.

New in Hardware Locality 0.9.1 (Dec 17, 2010)

  • Re-branded to "hwloc" and moved to the Open MPI project, relicensed under the BSD license.
  • The prefix of all functions and tools is now hwloc, and some public functions were also renamed for real.
  • Group NUMA nodes into Misc objects according to their physical distance that may be reported by the OS/BIOS. This may be ignored by setting HWLOC_IGNORE_DISTANCES=1 in the environment.
  • Ignore offline CPUs on Solaris.
  • Improved binding support on AIX.
  • Add HP-UX support.
  • CPU sets are now allocated/freed dynamically.
  • Add command line options to tune the lstopo graphical output, add semi-graphical textual output.
  • Extend topobind to support multiple cpusets or objects on the command line as topomask does.
  • Add an Infiniband-specific helper hwloc/openfabrics-verbs.h to retrieve the physical location of IB devices.