7.2. Per-App-Context Mapping
By default, every application context (prte_app_context_t) within a job
is placed using the same mapping, ranking, and binding policy — the one
specified at the job level via --map-by, --rank-by, and --bind-to.
Per-app-context mapping allows each application context in a multi-program
multiple-data (MPMD) job to carry its own independent set of placement
directives.
7.2.1. When to Use It
Per-app-context mapping is useful when different components of a coupled application have meaningfully different hardware affinity requirements. For example:
A compute kernel that should be mapped by core and bound tightly to those cores.
A communication or I/O helper that should be mapped by node and left unbound.
A utility process that must not run on the head node (NOLOCAL), while the rest of the job can use all nodes.
Without per-app mapping, satisfying these requirements would require launching
multiple separate jobs with separate prun invocations, losing the ability
to use shared memory and direct PMIx communication between the components.
7.2.2. Command-Line Syntax
Per-app directives are specified using the standard MPMD separator (:) on
the prun command line. Each --map-by, --rank-by, and --bind-to
option that appears after a : separator applies only to the application
context that follows it. Options that appear before the first : separator
continue to apply at the job level.
# app1 mapped by core, app2 mapped by node and ranked by fill
prun -n 4 app1 --map-by core : -n 2 app2 --map-by node --rank-by fill
# app1 avoids the head node; app2 can use all nodes
prun -n 8 app1 --map-by slot:nolocal : -n 2 app2 --map-by slot
# app1 uses a rankfile for precise placement; app2 uses default slot mapping
prun -n 3 app1 --map-by rankfile:file=/path/to/rfile : -n 2 app2
Any --map-by qualifier that is valid at the job level is also valid per
app, except for the job-level-only directives described in the next section.
7.2.3. Job-Level-Only Directives
Some directives are properties of the job as a whole and cannot be applied per app context:
OVERSUBSCRIBE / NOOVERSUBSCRIBE
Oversubscription governs whether the job as a whole may exceed node slot counts. Because multiple app contexts share the same nodes, this decision must be consistent across all apps. Specifying OVERSUBSCRIBE or NOOVERSUBSCRIBE in a per-app --map-by string is an error and will cause the job to abort with PRTE_JOB_STATE_MAP_FAILED.
INHERIT / NOINHERIT
These modifiers control whether a spawned child job copies its parent’s placement policies. This is a job-level property. If the PMIx spawn path supplies INHERIT or NOINHERIT in per-app info[] arrays, PRRTE will attempt to promote the directive to the job level. If different app contexts carry conflicting directives (one INHERIT and another NOINHERIT), the job will abort with PRTE_JOB_STATE_MAP_FAILED.
--display-map / --display-devel-map
The job map is displayed once after all app contexts have been placed. A display-map directive found on any individual app context is promoted to the job level automatically; displaying a partial mid-loop map is not supported.
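The two rejection rules above can be illustrated with a small self-contained model. This is a sketch only, not PRRTE source: the names has_modifier(), per_app_mapby_is_valid(), promote_inherit(), and the inherit_t enum are hypothetical, and the sketch assumes that modifiers follow the policy name after a colon, are separated by commas, and are matched case-insensitively.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical result of promoting INHERIT/NOINHERIT to the job level. */
typedef enum { INHERIT_UNSET, INHERIT_ON, INHERIT_OFF, INHERIT_CONFLICT } inherit_t;

/* Return true if the part of a map-by string after the first ':' contains
 * the token `name` (whole-token, case-insensitive match). */
static bool has_modifier(const char *mapby, const char *name)
{
    const char *p = strchr(mapby, ':');
    size_t nlen = strlen(name);

    if (p == NULL) {
        return false;               /* no modifiers at all */
    }
    p++;
    while (*p != '\0') {
        size_t len = strcspn(p, ",:");
        if (len == nlen && strncasecmp(p, name, len) == 0) {
            return true;
        }
        p += len;
        if (*p != '\0') {
            p++;                    /* skip the separator */
        }
    }
    return false;
}

/* A per-app string must not carry an oversubscription modifier. */
static bool per_app_mapby_is_valid(const char *mapby)
{
    return !has_modifier(mapby, "OVERSUBSCRIBE")
        && !has_modifier(mapby, "NOOVERSUBSCRIBE");
}

/* Fold the per-app INHERIT/NOINHERIT modifiers into one job-level value,
 * reporting a conflict if the apps disagree. */
static inherit_t promote_inherit(const char *const mapby[], size_t napps)
{
    inherit_t job = INHERIT_UNSET;

    for (size_t i = 0; i < napps; i++) {
        bool on = has_modifier(mapby[i], "INHERIT");
        bool off = has_modifier(mapby[i], "NOINHERIT");
        inherit_t app;

        if (on && off) {
            return INHERIT_CONFLICT;
        }
        if (!on && !off) {
            continue;               /* this app expresses no preference */
        }
        app = on ? INHERIT_ON : INHERIT_OFF;
        if (job != INHERIT_UNSET && job != app) {
            return INHERIT_CONFLICT;
        }
        job = app;
    }
    return job;
}
```

In this model, a false return from per_app_mapby_is_valid() or an INHERIT_CONFLICT result corresponds to the case in which the job is moved to PRTE_JOB_STATE_MAP_FAILED.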
7.2.4. Per-App NOLOCAL
The NOLOCAL modifier (PRTE_MAPPING_NO_USE_LOCAL) prevents an app’s
processes from being placed on the head node (HNP). Unlike the job-level-only
directives above, NOLOCAL is permitted per app context and takes effect
only for the app that carries it.
This means one app in a job can avoid the head node while other apps in the same job can use it:
# app1 will not run on the head node; app2 may
prun -n 8 app1 --map-by slot:nolocal : -n 1 app2 --map-by slot
Internally, NOLOCAL is stored as a directive bit within the
PRTE_APP_MAPBY attribute on the prte_app_context_t. The node-list
construction performed by prte_rmaps_base_get_target_nodes() reads this
bit for each app independently, so the exclusion of the head node does not
affect subsequent app contexts that do not carry the bit.
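The per-app effect of the NOLOCAL bit can be pictured with the following simplified model. It is not the body of prte_rmaps_base_get_target_nodes(); the node_t type, the is_head flag, and the target_nodes() helper are hypothetical names chosen for illustration. The point it shows is that the candidate node list is rebuilt for each app from the full node pool, so one app's exclusion does not carry over to the next.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal node description. */
typedef struct {
    const char *name;
    bool is_head;       /* true for the node hosting the HNP */
} node_t;

/* Build the candidate node list for ONE app context.
 * `nolocal` models the directive bit carried by that app only.
 * `out` must have room for `n` pointers; returns the number written. */
static size_t target_nodes(const node_t *all, size_t n, bool nolocal,
                           const node_t **out)
{
    size_t k = 0;

    for (size_t i = 0; i < n; i++) {
        if (nolocal && all[i].is_head) {
            continue;   /* skip the head node for this app only */
        }
        out[k++] = &all[i];
    }
    return k;
}
```

Calling target_nodes() once with nolocal set and once without, over the same pool, yields a list without the head node for the first app and the full list for the second.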
7.2.5. PMIx Spawn Path
Per-app placement directives can also be supplied via the PMIx_Spawn API
using the per-app info[] array on each pmix_app_t. The relevant PMIx
keys are:
PMIX_MAPBY: equivalent to --map-by
PMIX_RANKBY: equivalent to --rank-by
PMIX_BINDTO: equivalent to --bind-to
When these keys appear in a per-app info[] array (rather than in the
job-level info[] array), PRRTE stores them as per-app attributes on the
corresponding prte_app_context_t and routes them through the same per-app
dispatch path as the command-line case. When the same keys appear in the
job-level info[] array, they continue to set the job-level policy as
before.
7.2.6. Inheritance and Fallback
An app context that carries no per-app directives inherits the job-level
policy without modification. Partial overrides are supported: if an app
specifies only --map-by, it inherits the job-level --rank-by and
--bind-to.
The inheritance chain for each field is:
1. Per-app attribute on prte_app_context_t (highest priority)
2. Job-level value from jdata->map / jdata->attributes
3. PRRTE system default
This resolution is performed by prte_rmaps_base_resolve_app_options()
immediately before each app context is dispatched to a mapping component.
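The three-level fallback can be sketched as follows. This is a simplified model rather than the actual prte_rmaps_base_resolve_app_options() code: placement_t, first_set(), and resolve_app_placement() are hypothetical names, and the sketch assumes that a value of 0 means "not set at this level".

```c
/* Hypothetical stand-in for the placement fields of the options struct.
 * A value of 0 means "not set at this level". */
typedef struct {
    int map;
    int rank;
    int bind;
} placement_t;

/* Return the first value that is set, in priority order. */
static int first_set(int app, int job, int sys)
{
    if (app != 0) {
        return app;
    }
    if (job != 0) {
        return job;
    }
    return sys;
}

/* Resolve each field independently, so an app that overrides only the
 * mapping policy still inherits the job-level ranking and binding. */
static placement_t resolve_app_placement(placement_t app, placement_t job,
                                         placement_t sys)
{
    placement_t out;

    out.map = first_set(app.map, job.map, sys.map);
    out.rank = first_set(app.rank, job.rank, sys.rank);
    out.bind = first_set(app.bind, job.bind, sys.bind);
    return out;
}
```

Resolving field by field is what makes partial overrides work: each of the three policies falls back on its own, independent of the other two.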
7.2.7. How the Dispatch Works
The standard single-dispatch path (in which one mapping component processes
all app contexts in a single map_job() call) is preserved unchanged for
jobs that carry no per-app directives.
When at least one app context carries a per-app PRTE_APP_MAPBY,
PRTE_APP_RANKBY, or PRTE_APP_BINDTO attribute, prte_rmaps_base_map_job()
switches to a per-app loop:
1. Resolve options: prte_rmaps_base_resolve_app_options() builds a per-app copy of the prte_rmaps_options_t struct, starting from the job-level defaults and overriding with any per-app attributes. The field app_options.app_idx is set to the index of the current app context.
2. Select component: the same component selection loop is used as in the single-dispatch path. Each component’s map_job() is called with app_options. Because app_options.app_idx >= 0, each component skips any app context whose index does not match, returning PRTE_ERR_TAKE_NEXT_OPTION for those it cannot handle.
3. Rank assignment: prte_rmaps_base_compute_vpids() is called once per app context after placement, with the app index and a running vpid counter so that global rank values remain contiguous and non-overlapping across the whole job. Per-app ranking controls only the order in which processes within that app are assigned ranks relative to each other; the starting rank for each app is always the first rank not yet assigned by any previous app.
4. Binding: no structural changes are required. Because prte_rmaps_base_setup_proc() is called from within each component’s inner loop with the current opts in scope, per-app binding is automatically derived from the opts->bind value set by prte_rmaps_base_resolve_app_options().
The complete job map is the union of nodes used by all app contexts.
prte_rmaps_base_display_map() is called once at the end, after all app
contexts have been placed, and displays this complete map.
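The running vpid counter used during rank assignment can be modeled in a few lines. This sketch is not the prte_rmaps_base_compute_vpids() implementation; assign_start_ranks() is a hypothetical helper that shows only how each app's starting rank follows from the process counts of the apps before it.

```c
#include <stddef.h>

/* For each app context, record the first global rank it receives.
 * nprocs[i] is the number of processes in app i. Returns the total
 * number of ranks in the job. */
static unsigned assign_start_ranks(const unsigned nprocs[], size_t napps,
                                   unsigned start[])
{
    unsigned next = 0;              /* running vpid counter */

    for (size_t i = 0; i < napps; i++) {
        start[i] = next;            /* first rank not yet assigned */
        next += nprocs[i];          /* app i owns [start[i], next) */
    }
    return next;
}
```

For a job with three app contexts of 4, 2, and 3 processes, the starting ranks are 0, 4, and 6. A per-app ranking policy only reorders processes inside each of those ranges.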
7.2.8. Attribute Storage
Per-app directives are stored as attributes on prte_app_context_t using
the following keys (defined in src/util/attr.h):
| Attribute key | PMIx type | Meaning |
|---|---|---|
|  |  | Parsed mapping policy enum value; directive bits (e.g., |
|  |  | Parsed ranking policy enum value |
|  |  | Parsed binding policy enum value |
|  |  | Path to the sequential or rankfile for this app; takes precedence over the job-level |
|  |  | Device name for distance-based mapping (e.g., |
|  |  | Use hardware threads as CPUs for this app |
|  |  | Use cores as CPUs for this app |
|  |  | Comma-delimited CPU ranges for |
|  |  | Maximum number of processes to bind to a single target object before moving to the next |
The existing PRTE_APP_PPR (25) and PRTE_APP_PES_PER_PROC (24)
attributes are unchanged. When a per-app --map-by string contains a
ppr:N:obj specification, the parsed N value is written to PRTE_APP_PPR
in addition to setting PRTE_APP_MAPBY = PRTE_MAPPING_PPR, so that the
ppr mapping component can read it through the standard path.
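Extracting N from a ppr:N:obj specification might look like the sketch below. It is an illustration under stated assumptions, not the PRRTE parser: parse_ppr() is a hypothetical helper, and the sketch assumes that the prefix is matched case-insensitively and that the object name ends at the next colon, comma, or end of string.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Parse "ppr:N:obj" (optionally followed by more ':' or ',' separated
 * tokens). On success, store N in *n, copy the object name into obj,
 * and return 0. Return -1 on any malformed input. */
static int parse_ppr(const char *spec, long *n, char *obj, size_t objlen)
{
    char *end;
    size_t len;

    if (strncasecmp(spec, "ppr:", 4) != 0) {
        return -1;                  /* not a ppr specification */
    }
    errno = 0;
    *n = strtol(spec + 4, &end, 10);
    if (end == spec + 4 || *end != ':' || errno != 0 || *n <= 0) {
        return -1;                  /* missing or invalid count */
    }
    end++;                          /* step over the ':' before obj */
    len = strcspn(end, ":,");
    if (len == 0 || len >= objlen) {
        return -1;                  /* empty or oversized object name */
    }
    memcpy(obj, end, len);
    obj[len] = '\0';
    return 0;
}
```

In terms of the attributes above, the parsed count is the value that would be written to PRTE_APP_PPR.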
7.2.9. Framework Version
The addition of app_idx to prte_rmaps_options_t is a breaking interface
change for any mapping component. All components must now honour the
options->app_idx field: when it is >= 0, the component must process
only the app context at that index. The rmaps framework version was therefore
incremented from 4.0.0 to 5.0.0
(PRTE_RMAPS_BASE_VERSION_5_0_0). Out-of-tree components built against
the older headers will produce a version mismatch at load time rather than
silently exhibiting incorrect behavior.
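The contract that a component must honour can be sketched as follows. The options_t type and component_map_job() function are simplified, hypothetical stand-ins for prte_rmaps_options_t and a component's map_job() entry point; only the app_idx test reflects the requirement described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced form of the options struct. */
typedef struct {
    int app_idx;    /* >= 0: process only this app; < 0: process all */
} options_t;

/* Model of a component's per-job loop. Marks each app context it
 * processes in placed[] and returns how many it processed. */
static size_t component_map_job(const options_t *opts, size_t napps,
                                bool placed[])
{
    size_t count = 0;

    for (size_t i = 0; i < napps; i++) {
        if (opts->app_idx >= 0 && (size_t) opts->app_idx != i) {
            continue;               /* not the app selected for this call */
        }
        placed[i] = true;           /* real code would map the app here */
        count++;
    }
    return count;
}
```

A component built against the older headers has no such check and would place every app context on every call, which is the incorrect behavior the version bump is meant to rule out.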