.TH "hwlocality_memattrs" 3 "Wed Dec 14 2022" "Version 2.9.0" "Hardware Locality (hwloc)" \" -*- nroff -*-
.ad l
.nh
.SH NAME
hwlocality_memattrs \- Comparing memory node attributes for finding where to allocate on
.SH SYNOPSIS
.br
.PP
.SS "Data Structures"

.in +1c
.ti -1c
.RI "struct \fBhwloc_location\fP"
.br
.in -1c
.SS "Typedefs"

.in +1c
.ti -1c
.RI "typedef unsigned \fBhwloc_memattr_id_t\fP"
.br
.in -1c
.SS "Enumerations"

.in +1c
.ti -1c
.RI "enum \fBhwloc_memattr_id_e\fP { \fBHWLOC_MEMATTR_ID_CAPACITY\fP, \fBHWLOC_MEMATTR_ID_LOCALITY\fP, \fBHWLOC_MEMATTR_ID_BANDWIDTH\fP, \fBHWLOC_MEMATTR_ID_READ_BANDWIDTH\fP, \fBHWLOC_MEMATTR_ID_WRITE_BANDWIDTH\fP, \fBHWLOC_MEMATTR_ID_LATENCY\fP, \fBHWLOC_MEMATTR_ID_READ_LATENCY\fP, \fBHWLOC_MEMATTR_ID_WRITE_LATENCY\fP, \fBHWLOC_MEMATTR_ID_MAX\fP }"
.br
.ti -1c
.RI "enum \fBhwloc_location_type_e\fP { \fBHWLOC_LOCATION_TYPE_CPUSET\fP, \fBHWLOC_LOCATION_TYPE_OBJECT\fP }"
.br
.ti -1c
.RI "enum \fBhwloc_local_numanode_flag_e\fP { \fBHWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY\fP, \fBHWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY\fP, \fBHWLOC_LOCAL_NUMANODE_FLAG_ALL\fP }"
.br
.in -1c
.SS "Functions"

.in +1c
.ti -1c
.RI "int \fBhwloc_memattr_get_by_name\fP (\fBhwloc_topology_t\fP topology, const char *name, \fBhwloc_memattr_id_t\fP *id)"
.br
.ti -1c
.RI "int \fBhwloc_get_local_numanode_objs\fP (\fBhwloc_topology_t\fP topology, struct \fBhwloc_location\fP *location, unsigned *nr, \fBhwloc_obj_t\fP *nodes, unsigned long flags)"
.br
.ti -1c
.RI "int \fBhwloc_memattr_get_value\fP (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, \fBhwloc_obj_t\fP target_node, struct \fBhwloc_location\fP *initiator, unsigned long flags, hwloc_uint64_t *value)"
.br
.ti -1c
.RI "int \fBhwloc_memattr_get_best_target\fP (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, struct \fBhwloc_location\fP *initiator, unsigned long flags, \fBhwloc_obj_t\fP *best_target, hwloc_uint64_t *value)"
.br
.ti -1c
.RI "int \fBhwloc_memattr_get_best_initiator\fP (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, \fBhwloc_obj_t\fP target, unsigned long flags, struct \fBhwloc_location\fP *best_initiator, hwloc_uint64_t *value)"
.br
.in -1c
.SH "Detailed Description"
.PP 
Platforms with heterogeneous memory require ways to decide whether a buffer should be allocated on 'fast' memory (such as HBM), 'normal' memory (DDR) or even 'slow' but large-capacity memory (non-volatile memory)\&. These memory nodes are called 'Targets' while the CPU accessing them is called the 'Initiator'\&. Access performance depends on their locality (NUMA platforms) as well as the intrinsic performance of the targets (heterogeneous platforms)\&.
.PP
The following attributes describe the performance of memory accesses from an Initiator to a memory Target, for instance their latency or bandwidth\&. Initiators performing these memory accesses are usually some PUs or Cores (described as a CPU set)\&. Hence a Core may choose where to allocate a memory buffer by comparing the attributes of different target memory nodes nearby\&.
.PP
There are also some attributes that are system-wide\&. Their value does not depend on a specific initiator performing an access\&. The memory node Capacity is an example of such attribute without initiator\&.
.PP
One way to use this API is to start with a cpuset describing the Cores where a program is bound\&. The best target NUMA node for allocating memory in this program on these Cores may be obtained by passing this cpuset as an initiator to \fBhwloc_memattr_get_best_target()\fP with the relevant memory attribute\&. For instance, if the code is latency limited, use the Latency attribute\&.
.PP
A more flexible approach consists in getting the list of local NUMA nodes by passing this cpuset to \fBhwloc_get_local_numanode_objs()\fP\&. Attribute values for these nodes, if any, may then be obtained with \fBhwloc_memattr_get_value()\fP and manually compared with the desired criteria\&.
.PP
\fBSee also\fP
.RS 4
An example is available in doc/examples/memory-attributes\&.c in the source tree\&.
.RE
.PP
\fBNote\fP
.RS 4
The API also supports specific objects as initiator, but it is currently not used internally by hwloc\&. Users may for instance use it to provide custom performance values for host memory accesses performed by GPUs\&.
.PP
The interface actually also accepts targets that are not NUMA nodes\&. 
.RE
.PP

.SH "Typedef Documentation"
.PP 
.SS "typedef unsigned \fBhwloc_memattr_id_t\fP"

.PP
A memory attribute identifier\&. May be either one of \fBhwloc_memattr_id_e\fP or a new id returned by \fBhwloc_memattr_register()\fP\&. 
.SH "Enumeration Type Documentation"
.PP 
.SS "enum \fBhwloc_local_numanode_flag_e\fP"

.PP
Flags for selecting target NUMA nodes\&. 
.PP
\fBEnumerator\fP
.in +1c
.TP
\fB\fIHWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY \fP\fP
Select NUMA nodes whose locality is larger than the given cpuset\&. For instance, if a single PU (or its cpuset) is given in \fCinitiator\fP, select all nodes close to the package that contains this PU\&. 
.TP
\fB\fIHWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY \fP\fP
Select NUMA nodes whose locality is smaller than the given cpuset\&. For instance, if a package (or its cpuset) is given in \fCinitiator\fP, also select nodes that are attached to only a half of that package\&. 
.TP
\fB\fIHWLOC_LOCAL_NUMANODE_FLAG_ALL \fP\fP
Select all NUMA nodes in the topology\&. The initiator \fCinitiator\fP is ignored\&. 
.SS "enum \fBhwloc_location_type_e\fP"

.PP
Type of location\&. 
.PP
\fBEnumerator\fP
.in +1c
.TP
\fB\fIHWLOC_LOCATION_TYPE_CPUSET \fP\fP
Location is given as a cpuset, in the location cpuset union field\&. 
.TP
\fB\fIHWLOC_LOCATION_TYPE_OBJECT \fP\fP
Location is given as an object, in the location object union field\&. 
.SS "enum \fBhwloc_memattr_id_e\fP"

.PP
Memory node attributes\&. 
.PP
\fBEnumerator\fP
.in +1c
.TP
\fB\fIHWLOC_MEMATTR_ID_CAPACITY \fP\fP
The "Capacity" is returned in bytes (local_memory attribute in objects)\&. Best capacity nodes are nodes with \fBhigher capacity\fP\&.
.PP
No initiator is involved when looking at this attribute\&. The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_HIGHER_FIRST\fP\&. 
.TP
\fB\fIHWLOC_MEMATTR_ID_LOCALITY \fP\fP
The "Locality" is returned as the number of PUs in that locality (e\&.g\&. the weight of its cpuset)\&. Best locality nodes are nodes with \fBsmaller locality\fP (nodes that are local to very few PUs)\&. Poor locality nodes are nodes with larger locality (nodes that are local to the entire machine)\&.
.PP
No initiator is involved when looking at this attribute\&. The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_HIGHER_FIRST\fP\&. 
.TP
\fB\fIHWLOC_MEMATTR_ID_BANDWIDTH \fP\fP
The "Bandwidth" is returned in MiB/s, as seen from the given initiator location\&. Best bandwidth nodes are nodes with \fBhigher bandwidth\fP\&.
.PP
The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_HIGHER_FIRST\fP and \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP\&.
.PP
This is the average bandwidth for read and write accesses\&. If the platform provides individual read and write bandwidths but no explicit average value, hwloc computes and returns the average\&. 
.TP
\fB\fIHWLOC_MEMATTR_ID_READ_BANDWIDTH \fP\fP
The "ReadBandwidth" is returned in MiB/s, as seen from the given initiator location\&. Best bandwidth nodes are nodes with \fBhigher bandwidth\fP\&.
.PP
The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_HIGHER_FIRST\fP and \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP\&. 
.TP
\fB\fIHWLOC_MEMATTR_ID_WRITE_BANDWIDTH \fP\fP
The "WriteBandwidth" is returned in MiB/s, as seen from the given initiator location\&. Best bandwidth nodes are nodes with \fBhigher bandwidth\fP\&.
.PP
The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_HIGHER_FIRST\fP and \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP\&. 
.TP
\fB\fIHWLOC_MEMATTR_ID_LATENCY \fP\fP
The "Latency" is returned as nanoseconds, as seen from the given initiator location\&. Best latency nodes are nodes with \fBsmaller latency\fP\&.
.PP
The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_LOWER_FIRST\fP and \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP\&.
.PP
This is the average latency for read and write accesses\&. If the platform provides individual read and write latencies but no explicit average value, hwloc computes and returns the average\&. 
.TP
\fB\fIHWLOC_MEMATTR_ID_READ_LATENCY \fP\fP
The "ReadLatency" is returned as nanoseconds, as seen from the given initiator location\&. Best latency nodes are nodes with \fBsmaller latency\fP\&.
.PP
The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_LOWER_FIRST\fP and \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP\&. 
.TP
\fB\fIHWLOC_MEMATTR_ID_WRITE_LATENCY \fP\fP
The "WriteLatency" is returned as nanoseconds, as seen from the given initiator location\&. Best latency nodes are nodes with \fBsmaller latency\fP\&.
.PP
The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_LOWER_FIRST\fP and \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP\&. 
.SH "Function Documentation"
.PP 
.SS "int hwloc_get_local_numanode_objs (\fBhwloc_topology_t\fP topology, struct \fBhwloc_location\fP * location, unsigned * nr, \fBhwloc_obj_t\fP * nodes, unsigned long flags)"

.PP
Return an array of local NUMA nodes\&. By default only select the NUMA nodes whose locality is exactly the given \fClocation\fP\&. More nodes may be selected if additional flags are given as a OR'ed set of \fBhwloc_local_numanode_flag_e\fP\&.
.PP
If \fClocation\fP is given as an explicit object, its CPU set is used to find NUMA nodes with the corresponding locality\&. If the object does not have a CPU set (e\&.g\&. I/O object), the CPU parent (where the I/O object is attached) is used\&.
.PP
On input, \fCnr\fP points to the number of nodes that may be stored in the \fCnodes\fP array\&. On output, \fCnr\fP will be changed to the number of stored nodes, or the number of nodes that would have been stored if there were enough room\&.
.PP
\fBNote\fP
.RS 4
Some of these NUMA nodes may not have any memory attribute values and hence not be reported as actual targets in other functions\&.
.PP
The number of NUMA nodes in the topology (obtained by \fBhwloc_bitmap_weight()\fP on the root object nodeset) may be used to allocate the \fCnodes\fP array\&.
.PP
When an object CPU set is given as locality, for instance a Package, and when flags contain both \fBHWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY\fP and \fBHWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY\fP, the returned array corresponds to the nodeset of that object\&. 
.RE
.PP

.SS "int hwloc_memattr_get_best_initiator (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, \fBhwloc_obj_t\fP target, unsigned long flags, struct \fBhwloc_location\fP * best_initiator, hwloc_uint64_t * value)"

.PP
Return the best initiator for the given attribute and target NUMA node\&. If the attribute does not relate to a specific initiator (it does not have the flag \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP), \fC-1\fP is returned and \fCerrno\fP is set to \fCEINVAL\fP\&.
.PP
If \fCvalue\fP is non \fCNULL\fP, the corresponding value is returned there\&.
.PP
If multiple initiators have the same attribute values, only one is returned (and there is no way to clarify how that one is chosen)\&. Applications that want to detect initiators with identical/similar values, or that want to look at values for multiple attributes, should rather get all values using \fBhwloc_memattr_get_value()\fP and manually select the initiator they consider the best\&.
.PP
The returned initiator should not be modified or freed, it belongs to the topology\&.
.PP
\fCflags\fP must be \fC0\fP for now\&.
.PP
If there are no matching initiators, \fC-1\fP is returned with \fCerrno\fP set to \fCENOENT\fP; 
.SS "int hwloc_memattr_get_best_target (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, struct \fBhwloc_location\fP * initiator, unsigned long flags, \fBhwloc_obj_t\fP * best_target, hwloc_uint64_t * value)"

.PP
Return the best target NUMA node for the given attribute and initiator\&. If the attribute does not relate to a specific initiator (it does not have the flag \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP), location \fCinitiator\fP is ignored and may be \fCNULL\fP\&.
.PP
If \fCvalue\fP is non \fCNULL\fP, the corresponding value is returned there\&.
.PP
If multiple targets have the same attribute values, only one is returned (and there is no way to clarify how that one is chosen)\&. Applications that want to detect targets with identical/similar values, or that want to look at values for multiple attributes, should rather get all values using \fBhwloc_memattr_get_value()\fP and manually select the target they consider the best\&.
.PP
\fCflags\fP must be \fC0\fP for now\&.
.PP
If there are no matching targets, \fC-1\fP is returned with \fCerrno\fP set to \fCENOENT\fP;
.PP
\fBNote\fP
.RS 4
The initiator \fCinitiator\fP should be of type \fBHWLOC_LOCATION_TYPE_CPUSET\fP when refering to accesses performed by CPU cores\&. \fBHWLOC_LOCATION_TYPE_OBJECT\fP is currently unused internally by hwloc, but users may for instance use it to provide custom information about host memory accesses performed by GPUs\&. 
.RE
.PP

.SS "int hwloc_memattr_get_by_name (\fBhwloc_topology_t\fP topology, const char * name, \fBhwloc_memattr_id_t\fP * id)"

.PP
Return the identifier of the memory attribute with the given name\&. 
.SS "int hwloc_memattr_get_value (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, \fBhwloc_obj_t\fP target_node, struct \fBhwloc_location\fP * initiator, unsigned long flags, hwloc_uint64_t * value)"

.PP
Return an attribute value for a specific target NUMA node\&. If the attribute does not relate to a specific initiator (it does not have the flag \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP), location \fCinitiator\fP is ignored and may be \fCNULL\fP\&.
.PP
\fCflags\fP must be \fC0\fP for now\&.
.PP
\fBNote\fP
.RS 4
The initiator \fCinitiator\fP should be of type \fBHWLOC_LOCATION_TYPE_CPUSET\fP when refering to accesses performed by CPU cores\&. \fBHWLOC_LOCATION_TYPE_OBJECT\fP is currently unused internally by hwloc, but users may for instance use it to provide custom information about host memory accesses performed by GPUs\&. 
.RE
.PP

.SH "Author"
.PP 
Generated automatically by Doxygen for Hardware Locality (hwloc) from the source code\&.