7.2. Per-App-Context Mapping

By default, every application context (prte_app_context_t) within a job is placed using the same mapping, ranking, and binding policy — the one specified at the job level via --map-by, --rank-by, and --bind-to. Per-app-context mapping allows each application context in a multi-program multiple-data (MPMD) job to carry its own independent set of placement directives.

7.2.1. When to Use It

Per-app-context mapping is useful when different components of a coupled application have meaningfully different hardware affinity requirements. For example:

  • A compute kernel that should be mapped by core and bound tightly to those cores.

  • A communication or I/O helper that should be mapped by node and left unbound.

  • A utility process that must not run on the head node (NOLOCAL), while the rest of the job can use all nodes.

Without per-app mapping, satisfying these requirements would require launching multiple separate jobs with separate prun invocations, losing the ability to use shared memory and direct PMIx communication between the components.

7.2.2. Command-Line Syntax

Per-app directives are specified using the standard MPMD separator (:) on the prun command line. Each --map-by, --rank-by, and --bind-to option that appears after a : separator applies only to the application context that follows it. Options that appear before the first : separator continue to apply at the job level.

# app1 mapped by core, app2 mapped by node and ranked by fill
prun -n 4 app1 --map-by core : -n 2 app2 --map-by node --rank-by fill

# app1 avoids the head node; app2 can use all nodes
prun -n 8 app1 --map-by slot:nolocal : -n 2 app2 --map-by slot

# app1 uses a rankfile for precise placement; app2 uses default slot mapping
prun -n 3 app1 --map-by rankfile:file=/path/to/rfile : -n 2 app2

Any --map-by qualifier that is valid at the job level is also valid per app, except for the job-level-only directives described in the next section.

7.2.3. Job-Level-Only Directives

Some directives are properties of the job as a whole and cannot be applied per app context:

OVERSUBSCRIBE / NOOVERSUBSCRIBE

Oversubscription governs whether the job as a whole may exceed node slot counts. Because multiple app contexts share the same nodes, this decision must be consistent across all apps. Specifying OVERSUBSCRIBE or NOOVERSUBSCRIBE in a per-app --map-by string is an error and will cause the job to abort with PRTE_JOB_STATE_MAP_FAILED.

INHERIT / NOINHERIT

These modifiers control whether a spawned child job copies its parent’s placement policies. This is a job-level property. If the PMIx spawn path supplies INHERIT or NOINHERIT in per-app info[] arrays, PRRTE will attempt to promote the directive to the job level. If different app contexts carry conflicting directives (one INHERIT and another NOINHERIT), the job will abort with PRTE_JOB_STATE_MAP_FAILED.

--display-map / --display-devel-map

The job map is displayed once after all app contexts have been placed. A display-map directive found on any individual app context is promoted to the job level automatically; displaying a partial mid-loop map is not supported.
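The INHERIT / NOINHERIT promotion rule above can be illustrated with a small, self-contained model. This is only a sketch of the rule as described, not PRRTE's code: the inherit_t type and the promote_inherit() function are hypothetical names introduced for this example, and the -1 return merely stands in for the abort with PRTE_JOB_STATE_MAP_FAILED.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tri-state for a per-app inherit directive (not a PRRTE type). */
typedef enum { INHERIT_UNSET = 0, INHERIT_ON, INHERIT_OFF } inherit_t;

/*
 * Promote per-app inherit directives to a single job-level value.
 * Returns 0 on success and stores the result in *job_value;
 * returns -1 if two apps disagree (the case the text says aborts the job).
 */
static int promote_inherit(const inherit_t *per_app, size_t napps,
                           inherit_t *job_value)
{
    assert(per_app != NULL || napps == 0);
    assert(job_value != NULL);

    inherit_t result = INHERIT_UNSET;
    for (size_t i = 0; i < napps; i++) {
        if (per_app[i] == INHERIT_UNSET) {
            continue;               /* this app expressed no preference */
        }
        if (result == INHERIT_UNSET) {
            result = per_app[i];    /* first explicit directive seen */
        } else if (result != per_app[i]) {
            return -1;              /* INHERIT vs NOINHERIT conflict */
        }
    }
    *job_value = result;
    return 0;
}
```

In this model, apps that say nothing do not take part in the decision; a conflict arises only when two apps give opposite explicit directives.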

7.2.4. Per-App NOLOCAL

The NOLOCAL modifier (PRTE_MAPPING_NO_USE_LOCAL) prevents an app’s processes from being placed on the head node (HNP). Unlike the job-level-only directives above, NOLOCAL is permitted per app context and takes effect only for the app that carries it.

This means one app in a job can avoid the head node while other apps in the same job can use it:

# app1 will not run on the head node; app2 may
prun -n 8 app1 --map-by slot:nolocal : -n 1 app2 --map-by slot

Internally, NOLOCAL is stored as a directive bit within the PRTE_APP_MAPBY attribute on the prte_app_context_t. The node-list construction performed by prte_rmaps_base_get_target_nodes() reads this bit for each app independently, so the exclusion of the head node does not affect subsequent app contexts that do not carry the bit.
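The per-app effect of the bit can be modelled as a filter that is applied each time a node list is built for one app. The sketch below is an illustration under stated assumptions, not the body of prte_rmaps_base_get_target_nodes(): demo_node_t, demo_target_nodes(), and the DEMO_NO_USE_LOCAL value are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical directive bit; the real PRTE_MAPPING_NO_USE_LOCAL value may differ. */
#define DEMO_NO_USE_LOCAL ((uint16_t)0x0100)

typedef struct {
    const char *name;
    bool is_head;       /* true for the head node (HNP) */
} demo_node_t;

/*
 * Copy into `out` the nodes that an app with the given mapby value may use.
 * Returns the number of nodes written. `out` must have room for n entries.
 */
static size_t demo_target_nodes(const demo_node_t *all, size_t n,
                                uint16_t app_mapby, const demo_node_t **out)
{
    assert(all != NULL && out != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if ((app_mapby & DEMO_NO_USE_LOCAL) && all[i].is_head) {
            continue;   /* this app asked to avoid the head node */
        }
        out[kept++] = &all[i];
    }
    return kept;
}
```

Because the filter reads only the mapby value passed in for the current app, calling it again for the next app with the bit clear yields the full node list, head node included.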

7.2.5. PMIx Spawn Path

Per-app placement directives can also be supplied via the PMIx_Spawn API using the per-app info[] array on each pmix_app_t. The relevant PMIx keys are:

  • PMIX_MAPBY — equivalent to --map-by

  • PMIX_RANKBY — equivalent to --rank-by

  • PMIX_BINDTO — equivalent to --bind-to

When these keys appear in a per-app info[] array (rather than in the job-level info[] array), PRRTE stores them as per-app attributes on the corresponding prte_app_context_t and routes them through the same per-app dispatch path as the command-line case. When the same keys appear in the job-level info[] array, they continue to set the job-level policy as before.

7.2.6. Inheritance and Fallback

An app context that carries no per-app directives inherits the job-level policy without modification. Partial overrides are supported: if an app specifies only --map-by, it inherits the job-level --rank-by and --bind-to.

The inheritance chain for each field is:

  1. Per-app attribute on prte_app_context_t (highest priority)

  2. Job-level value from jdata->map / jdata->attributes

  3. PRRTE system default

This resolution is performed by prte_rmaps_base_resolve_app_options() immediately before each app context is dispatched to a mapping component.

7.2.7. How the Dispatch Works

The standard single-dispatch path (in which one mapping component processes all app contexts in a single map_job() call) is preserved unchanged for jobs that carry no per-app directives.

When at least one app context carries a per-app PRTE_APP_MAPBY, PRTE_APP_RANKBY, or PRTE_APP_BINDTO attribute, prte_rmaps_base_map_job() switches to a per-app loop:

  1. Resolve options: prte_rmaps_base_resolve_app_options() builds a per-app copy of the prte_rmaps_options_t struct, starting from the job-level defaults and overriding with any per-app attributes. The field app_options.app_idx is set to the index of the current app context.

  2. Select component: the same component selection loop is used as in the single-dispatch path. Each component’s map_job() is called with app_options. Because app_options.app_idx >= 0, each component skips any app context whose index does not match, returning PRTE_ERR_TAKE_NEXT_OPTION for those it cannot handle.

  3. Rank assignment: prte_rmaps_base_compute_vpids() is called once per app context after placement, with the app index and a running vpid counter so that global rank values remain contiguous and non-overlapping across the whole job. Per-app ranking controls only the order in which processes within that app are assigned ranks relative to each other; the starting rank for each app is always the first rank not yet assigned by any previous app.

  4. Binding: no structural changes are required. Because prte_rmaps_base_setup_proc() is called from within each component’s inner loop with the current opts in scope, per-app binding is automatically derived from the opts->bind value set by prte_rmaps_base_resolve_app_options().

The complete job map is the union of nodes used by all app contexts. prte_rmaps_base_display_map() is called once at the end, after all app contexts have been placed, and displays this complete map.
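The running vpid counter used during rank assignment can be illustrated in isolation. The following is a minimal sketch, assuming each app simply receives one contiguous block of ranks; demo_app_t and demo_assign_vpids() are hypothetical names and do not reflect the signature of prte_rmaps_base_compute_vpids().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified app record: only the process count matters here. */
typedef struct {
    uint32_t num_procs;
    uint32_t first_vpid;   /* filled in by demo_assign_vpids() */
} demo_app_t;

/*
 * Walk the apps in order and give each one a contiguous block of ranks,
 * starting at the first rank not used by an earlier app.
 * Returns the total number of ranks assigned.
 */
static uint32_t demo_assign_vpids(demo_app_t *apps, size_t napps)
{
    assert(apps != NULL || napps == 0);
    uint32_t next_vpid = 0;              /* running counter across the job */
    for (size_t i = 0; i < napps; i++) {
        apps[i].first_vpid = next_vpid;
        /* Per-app ranking only reorders ranks inside this block. */
        next_vpid += apps[i].num_procs;
    }
    return next_vpid;
}
```

For three apps with 4, 2, and 3 processes, the blocks start at ranks 0, 4, and 6, and 9 ranks are assigned in total.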

7.2.8. Attribute Storage

Per-app directives are stored as attributes on prte_app_context_t using the following keys (defined in src/util/attr.h):

Attribute key                PMIx type    Meaning
---------------------------  -----------  -------
PRTE_APP_MAPBY (26)          PMIX_UINT16  Parsed mapping policy enum value; directive bits (e.g., NOLOCAL) are encoded in the upper bits using PRTE_SET_MAPPING_DIRECTIVE
PRTE_APP_RANKBY (27)         PMIX_UINT16  Parsed ranking policy enum value
PRTE_APP_BINDTO (28)         PMIX_UINT16  Parsed binding policy enum value
PRTE_APP_MAP_FILE (29)       PMIX_STRING  Path to the sequential or rankfile for this app; takes precedence over the job-level PRTE_JOB_FILE in the seq and rank_file components
PRTE_APP_DIST_DEVICE (30)    PMIX_STRING  Device name for distance-based mapping (e.g., mlx5_0)
PRTE_APP_HWT_CPUS (31)       PMIX_BOOL    Use hardware threads as CPUs for this app
PRTE_APP_CORE_CPUS (32)      PMIX_BOOL    Use cores as CPUs for this app
PRTE_APP_CPUSET (33)         PMIX_STRING  Comma-delimited CPU ranges for PE-LIST mapping
PRTE_APP_BINDING_LIMIT (34)  PMIX_UINT16  Maximum number of processes to bind to a single target object before moving to the next

The existing PRTE_APP_PPR (25) and PRTE_APP_PES_PER_PROC (24) attributes are unchanged. When a per-app --map-by string contains a ppr:N:obj specification, the parsed N value is written to PRTE_APP_PPR in addition to setting PRTE_APP_MAPBY = PRTE_MAPPING_PPR, so that the ppr mapping component can read it through the standard path.
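Packing a policy value and directive bits into one 16-bit attribute, as described for PRTE_APP_MAPBY, can be sketched as follows. The bit layout and every DEMO_ name here are assumptions made for illustration; the actual masks and values used by PRTE_SET_MAPPING_DIRECTIVE and the mapping policy enum are defined in the PRRTE headers and may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: low byte = policy, high byte = directive flags. */
#define DEMO_POLICY_MASK     ((uint16_t)0x00ff)
#define DEMO_DIRECTIVE_MASK  ((uint16_t)0xff00)
#define DEMO_NO_USE_LOCAL    ((uint16_t)0x0100)  /* illustrative value only */
#define DEMO_POLICY_BYCORE   ((uint16_t)0x0005)  /* illustrative value only */

/* Set a directive flag without disturbing the policy in the low bits. */
static uint16_t demo_set_directive(uint16_t mapping, uint16_t directive)
{
    assert((directive & DEMO_POLICY_MASK) == 0);
    return (uint16_t)(mapping | directive);
}

/* Extract the policy value, ignoring any directive flags. */
static uint16_t demo_get_policy(uint16_t mapping)
{
    return (uint16_t)(mapping & DEMO_POLICY_MASK);
}

/* Test whether a given directive flag is present. */
static bool demo_has_directive(uint16_t mapping, uint16_t directive)
{
    return (mapping & directive & DEMO_DIRECTIVE_MASK) != 0;
}
```

Keeping the policy and the flags in disjoint bit ranges lets a single PMIX_UINT16 attribute carry both, so a reader of the attribute can mask out whichever part it needs.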

7.2.9. Framework Version

The addition of app_idx to prte_rmaps_options_t is a breaking interface change for any mapping component. All components must now honour the options->app_idx field: when it is >= 0, the component must process only the app context at that index. The rmaps framework version was therefore incremented from 4.0.0 to 5.0.0 (PRTE_RMAPS_BASE_VERSION_5_0_0). Out-of-tree components built against the older headers will produce a version mismatch at load time rather than silently exhibiting incorrect behavior.
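For authors of out-of-tree components, the app_idx requirement amounts to one extra check at the top of the component's loop over app contexts. The skeleton below is a hedged illustration, not a real rmaps component: demo_options_t and demo_map_job() are hypothetical, and treating a negative app_idx as "map every app" is an assumption inferred from the text rather than a documented sentinel value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of prte_rmaps_options_t. */
typedef struct {
    int app_idx;    /* >= 0: map only this app; < 0: map every app (assumed) */
} demo_options_t;

/*
 * Skeleton of a component's app loop. Instead of placing processes, it
 * records which app indices it would have handled in `handled`.
 * Returns the number of apps handled. `handled` must hold napps entries.
 */
static size_t demo_map_job(size_t napps, const demo_options_t *opts,
                           int *handled)
{
    assert(opts != NULL && handled != NULL);
    size_t count = 0;
    for (size_t i = 0; i < napps; i++) {
        if (opts->app_idx >= 0 && (size_t)opts->app_idx != i) {
            continue;   /* per-app dispatch: not the app we were asked to map */
        }
        handled[count++] = (int)i;
    }
    return count;
}
```

A component written this way behaves identically on the single-dispatch path and handles exactly one app context per call on the per-app path.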