m5c contract mappings
Contract mappings convert raw source records into contract-shaped records. They are the bridge between vendor fields and stable app semantics.
Use a mapping when a module has provider-specific data, such as Okta System Log JSON, GitHub repository event JSON, or a custom log record, and an app expects a stable DataContract.
Where mapping files live
m5c discovers any *.yaml file under a directory named mappings.
A typical module layout is:
packages/modules/okta-system-log/
  module.yaml
  mappings/
    system-log-auth.yaml
  tests/
    fixtures/
      okta-successful-login.raw.json
The mapping file can have any filename ending in .yaml. The surrounding DataModule should reference it from spec.mappings so humans and reports can understand which module provides which contract.
Build a mapping file
Build a mapping in this order:
- Pick the target contract.
- Add a representative raw fixture.
- Name the raw source shape with spec.source.kind.
- Map required contract fields first.
- Normalize enums and IDs.
- Add tests that prove important examples produce contract-shaped output.
- Run m5c validate and m5c test.
- Run m5c lower and inspect generated SQL.
Mapping document
kind: ContractMapping
apiVersion: semantic-catalog.mach5.io/v1alpha1
metadata:
  name: okta-system-log-auth
spec:
  source_language: mach5_mapping_ir
  source:
    kind: okta.system_log
    format: json
  target:
    contract: identity.authentication_event.v1
  fields:
    event_uid:
      path: /uuid
      required: true
    time:
      path: /published
      transform: rfc3339_timestamp
      required: true
    actor.user.email_addr:
      path: /actor/alternateId
      required: true
    actor.user.uid:
      path: /actor/id
    src_endpoint.ip:
      path: /client/ipAddress
      transform: ip_string
    status_id:
      enum:
        path: /outcome/result
        values:
          SUCCESS: 1
          FAILURE: 2
        default: 0
      required: true
    raw_event_ref:
      template: raw-okta-system-log/{/uuid}
  tests:
    - name: okta-successful-login
      input: ../tests/fixtures/okta-successful-login.raw.json
      expect:
        event_uid: okta-evt-001
        status_id: 1
        actor.user.email_addr: alice@example.com
Top-level fields
| Field | Required | Meaning |
|---|---|---|
| kind | Yes | Must be ContractMapping. |
| apiVersion | Recommended | Current examples use semantic-catalog.mach5.io/v1alpha1. |
| metadata.name | Yes | Stable mapping name. Use a source-to-contract name such as okta-system-log-auth. |
| spec.source_language | Yes | Use mach5_mapping_ir for native mappings that m5c test can execute. |
| spec.source | Recommended | Describes the raw source shape. Used during lowering. |
| spec.target.contract | Yes | Must match a local DataContract.metadata.name. |
| spec.fields | Yes | Target field mappings. Keys are target contract field names. |
| spec.tests | Recommended | Fixture-backed examples run by m5c test. |
Source metadata
spec.source identifies the raw shape this mapping expects.
source:
  kind: okta.system_log
  format: json
| Field | Meaning |
|---|---|
| source.kind | Source relation hint used by m5c lower when generating SQL. Dots are converted to underscores, so okta.system_log becomes FROM okta_system_log. If omitted, generated SQL uses FROM raw_payloads. |
| source.format | Describes the input format. V1 mappings expect JSON payloads; the field is descriptive today and reserved for future validation/import behavior. |
source.kind is not a connector selector. Connector/provider coverage is described by DataModule manifests; mappings describe how one raw source shape is normalized into a target contract.
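The naming rule for source.kind can be sketched in a few lines of Python. This only illustrates the documented behavior (dots become underscores, with raw_payloads as the fallback); it is not m5c's implementation, and the function name source_relation is hypothetical.

```python
def source_relation(kind=None):
    """Return the SQL relation name implied by a source.kind value.

    Hypothetical helper: mirrors the documented rule, not m5c's code.
    """
    if not kind:
        # Documented fallback when source.kind is omitted.
        return "raw_payloads"
    # Documented rule: dots are converted to underscores.
    return kind.replace(".", "_")


print(source_relation("okta.system_log"))
print(source_relation())
```

Under this rule, a specific kind such as github.repository_event yields a relation name that is readable in generated SQL, which is one reason to avoid vague names.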
Good source kind names describe the raw record shape, not the product category:
| Better | Too vague |
|---|---|
| okta.system_log | okta |
| github.repository_event | github |
| gcp.audit_log | gcp |
| custom.security_event | json |
JSON Pointer paths
path values use JSON Pointer, not JSONPath.
Use:
path: /actor/alternateId
Do not use:
path: $.actor.alternateId
For this raw fixture:
{
  "uuid": "okta-evt-001",
  "published": "2026-05-26T00:00:00Z",
  "actor": {
    "id": "00u123",
    "alternateId": "alice@example.com"
  },
  "client": {
    "ipAddress": "203.0.113.10"
  },
  "outcome": {
    "result": "SUCCESS"
  }
}
These pointers resolve as follows:
| Pointer | Value |
|---|---|
| /uuid | okta-evt-001 |
| /actor/id | 00u123 |
| /actor/alternateId | alice@example.com |
| /client/ipAddress | 203.0.113.10 |
| /outcome/result | SUCCESS |
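To make the pointer semantics concrete, here is a minimal Python sketch of JSON Pointer (RFC 6901) resolution against the fixture above. It is an illustration, not m5c's resolver; resolve_pointer is a hypothetical name, and returning None for a missing value is a simplification that does not distinguish "absent" from an explicit JSON null.

```python
import json


def resolve_pointer(doc, pointer):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON (sketch only)."""
    if pointer == "":
        return doc  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        # JSONPath such as "$.actor.alternateId" is rejected here.
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    current = doc
    for token in pointer[1:].split("/"):
        # RFC 6901 escapes: "~1" means "/", "~0" means "~" (decode in this order).
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, dict) and token in current:
            current = current[token]
        elif (isinstance(current, list) and token.isascii()
              and token.isdigit() and int(token) < len(current)):
            current = current[int(token)]
        else:
            return None  # simplification: missing values resolve to None
    return current


raw = json.loads("""
{
  "uuid": "okta-evt-001",
  "actor": {"id": "00u123", "alternateId": "alice@example.com"},
  "outcome": {"result": "SUCCESS"}
}
""")

for pointer in ("/uuid", "/actor/alternateId", "/outcome/result"):
    print(pointer, "->", resolve_pointer(raw, pointer))
```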
Field mapping forms
The keys under spec.fields are target contract field names. The value describes how to create each output field.
Path
Read a value from the raw JSON record.
actor.user.email_addr:
  path: /actor/alternateId
  required: true
Const
Emit a fixed value.
metadata.vendor:
  const: okta
Template
Compose a string from literal text and JSON Pointer placeholders.
raw_event_ref:
  template: raw-okta-system-log/{/uuid}
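As a rough illustration of template expansion, the Python sketch below replaces each {/pointer} placeholder with the value found at that pointer. The function name render_template is hypothetical, the pointer walk handles nested objects only, and what m5c does when a placeholder cannot be resolved is not covered here (this sketch simply raises KeyError).

```python
import re


def render_template(template, raw):
    """Expand {/json/pointer} placeholders in a template string (sketch only)."""

    def substitute(match):
        value = raw
        # Walk the pointer one object key at a time; KeyError if a key is absent.
        for key in match.group(1)[1:].split("/"):
            value = value[key]
        return str(value)

    # A placeholder is a brace pair whose content starts with "/".
    return re.sub(r"\{(/[^{}]*)\}", substitute, template)


print(render_template("raw-okta-system-log/{/uuid}", {"uuid": "okta-evt-001"}))
```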
Enum
Map source values to contract values.
status_id:
  enum:
    path: /outcome/result
    values:
      SUCCESS: 1
      FAILURE: 2
    default: 0
The source value is converted to a string key before lookup. If no key matches, default is used when provided.
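The lookup rule can be sketched as follows. This is an assumption-laden illustration rather than m5c's code: map_enum is a hypothetical name, and Python's str() stands in for whatever string conversion m5c actually applies.

```python
def map_enum(source_value, values, default=None):
    """Look up a source value in an enum table using its string form (sketch only)."""
    # Normalize table keys to strings so a YAML key parsed as 200 still matches "200".
    table = {str(key): mapped for key, mapped in values.items()}
    # The source value is converted to a string key before lookup;
    # the default is returned when no key matches.
    return table.get(str(source_value), default)


values = {"SUCCESS": 1, "FAILURE": 2}
print(map_enum("SUCCESS", values, default=0))
print(map_enum("SKIPPED", values, default=0))
```

Providing a default keeps unexpected vendor values from producing a missing field; omitting it makes such drift visible instead.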
Coalesce
Try multiple mappings in order and use the first non-null value.
actor.user.email_addr:
  coalesce:
    - path: /actor/alternateId
    - path: /actor/displayName
    - const: unknown@example.invalid
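A minimal Python sketch of the "first non-null wins" rule, assuming each candidate is either a path or a const mapping. The names lookup and coalesce are hypothetical, and the pointer walk handles nested objects only.

```python
def lookup(raw, pointer):
    """Walk a JSON Pointer through nested objects; None when any key is absent."""
    value = raw
    for key in pointer[1:].split("/"):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def coalesce(raw, candidates):
    """Return the first candidate that yields a non-null value (sketch only)."""
    for candidate in candidates:
        if "path" in candidate:
            value = lookup(raw, candidate["path"])
        else:
            value = candidate.get("const")
        if value is not None:
            return value
    return None


candidates = [
    {"path": "/actor/alternateId"},
    {"path": "/actor/displayName"},
    {"const": "unknown@example.invalid"},
]
print(coalesce({"actor": {"displayName": "Alice"}}, candidates))
```

Ending the list with a const, as in the example above, guarantees that the field always receives a value.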
Raw passthrough
Preserve the entire raw payload.
raw_payload:
  raw: true
Missing values
Use required: true when the target contract or app cannot safely operate without the value.
event_uid:
  path: /uuid
  required: true
Optional fields can be omitted. To emit null when missing, use:
src_endpoint.ip:
  path: /client/ipAddress
  on_missing: null
To fail when missing without using required, use:
src_endpoint.ip:
  path: /client/ipAddress
  on_missing: error
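The three outcomes for a missing value (omit, emit null, fail) can be sketched as below. This is a guess at the decision logic based only on the descriptions above, not m5c's implementation: emit_field, MissingFieldError, and the "omit" label for the default behavior are hypothetical, and how m5c resolves a field that sets both required and on_missing is not specified here.

```python
class MissingFieldError(Exception):
    """Raised when a value that must be present is missing (hypothetical)."""


def emit_field(output, name, value, found, required=False, on_missing="omit"):
    """Write one mapped field into output, applying the missing-value policy."""
    if found:
        output[name] = value
    elif required or on_missing == "error":
        # required: true and on_missing: error both fail the record in this sketch.
        raise MissingFieldError(name)
    elif on_missing == "null":
        output[name] = None  # the field is present with an explicit null
    # Otherwise the optional field is simply left out of the output.
    return output


print(emit_field({}, "src_endpoint.ip", None, found=False))
print(emit_field({}, "src_endpoint.ip", None, found=False, on_missing="null"))
```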
Transforms
Transforms apply after a value is extracted.
| Transform | Current behavior |
|---|---|
| string | Converts the value to a string. |
| rfc3339_timestamp | Converts the value to a string for timestamp-shaped fields. |
| ip_string | Converts the value to a string for IP-shaped fields. |
| int | Keeps numeric values or parses integer strings. |
| lower | Converts string form to lowercase. |
| upper | Converts string form to uppercase. |
Example:
time:
  path: /published
  transform: rfc3339_timestamp
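The transform table can be read as a small dispatch table. The Python sketch below mirrors the listed current behavior only; it is not m5c's implementation, the names TRANSFORMS, to_int, and apply_transform are hypothetical, and Python's str() stands in for m5c's string conversion.

```python
def to_int(value):
    """Keep numeric values as they are; parse integer strings."""
    if isinstance(value, (int, float)):
        return value
    return int(str(value).strip())


# Per the table, the timestamp and IP transforms currently only stringify the value.
TRANSFORMS = {
    "string": str,
    "rfc3339_timestamp": str,
    "ip_string": str,
    "int": to_int,
    "lower": lambda value: str(value).lower(),
    "upper": lambda value: str(value).upper(),
}


def apply_transform(name, value):
    """Apply a named transform to an already extracted value."""
    return TRANSFORMS[name](value)


print(apply_transform("rfc3339_timestamp", "2026-05-26T00:00:00Z"))
print(apply_transform("int", "42"))
```

Because rfc3339_timestamp and ip_string do not validate their input in this reading, fixture tests remain the place to catch malformed timestamps or addresses.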
Tests
Mapping tests are fixture-backed. Each test points to a raw JSON file relative to the mapping file.
tests:
  - name: okta-successful-login
    input: ../tests/fixtures/okta-successful-login.raw.json
    expect:
      event_uid: okta-evt-001
      status_id: 1
      actor.user.email_addr: alice@example.com
m5c test does four things for mapping tests:
- Reads the fixture JSON.
- Applies the mapping.
- Validates the output against the target contract when that contract is present.
- Checks every field listed under expect.
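The final step, comparing mapped output with expect, can be sketched like this. It assumes the mapped output is a flat dictionary keyed by dotted contract field names, which this page does not confirm; check_expectations is a hypothetical name and not part of m5c.

```python
def check_expectations(output, expect):
    """Compare only the fields listed under expect; return failure messages."""
    failures = []
    for field, wanted in expect.items():
        if field not in output:
            failures.append(f"{field}: missing from output")
        elif output[field] != wanted:
            failures.append(f"{field}: expected {wanted!r}, got {output[field]!r}")
    return failures


output = {
    "event_uid": "okta-evt-001",
    "status_id": 1,
    "actor.user.email_addr": "alice@example.com",
    "src_endpoint.ip": "203.0.113.10",  # not listed under expect, so not compared
}
expect = {"event_uid": "okta-evt-001", "status_id": 1}
print(check_expectations(output, expect))
```

In this reading, fields absent from expect are not compared, so list every field whose value matters to the app.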
Run tests with:
m5c test apps/security-analytics --workspace --format json
The report includes mapping test counts and failures. Use mapping tests to catch source drift before package lowering or deployment.
Lowering
During m5c lower, mappings are rendered into inspectable SQL projections and generated bundle resources. For the example above, source.kind: okta.system_log produces generated SQL over the relation okta_system_log.
SELECT
json_value(raw_payload, '$.uuid') AS "event_uid",
json_value(raw_payload, '$.published') AS "time",
json_value(raw_payload, '$.actor.alternateId') AS "actor.user.email_addr"
FROM okta_system_log
If source.kind is omitted, generated SQL uses raw_payloads:
FROM raw_payloads
Generated SQL assumes the raw relation has a JSON column named raw_payload.
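To show how a path mapping relates to the projection above, here is a small Python sketch that rewrites JSON Pointers as the JSONPath strings seen in the generated SQL and renders a SELECT. It is a simplified illustration, not the m5c lowering code: it handles plain path fields only (no enums, templates, transforms, or escaped pointer tokens), and both function names are hypothetical.

```python
def pointer_to_json_path(pointer):
    """Rewrite a simple JSON Pointer such as /actor/id as the JSONPath $.actor.id."""
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    return "$." + ".".join(pointer[1:].split("/"))


def render_projection(path_fields, relation="raw_payloads"):
    """Render a SELECT over the raw_payload JSON column for path-only fields."""
    columns = []
    for target, pointer in path_fields.items():
        json_path = pointer_to_json_path(pointer)
        # Target names are double-quoted because they may contain dots.
        columns.append(f"json_value(raw_payload, '{json_path}') AS \"{target}\"")
    return "SELECT\n" + ",\n".join(columns) + f"\nFROM {relation}"


fields = {
    "event_uid": "/uuid",
    "time": "/published",
    "actor.user.email_addr": "/actor/alternateId",
}
print(render_projection(fields, relation="okta_system_log"))
```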
Run lowering with:
m5c lower apps/security-analytics --dev --out lowered
Then inspect:
lowered/generated/sql/
lowered/generated/bundles/
Validate and test workflow
Use this loop while authoring:
m5c validate apps/security-analytics --workspace --offline
m5c test apps/security-analytics --workspace
m5c lower apps/security-analytics --dev --out /tmp/m5c-lowered
If a mapping test fails, inspect:
- the fixture path;
- JSON Pointer spelling;
- required fields;
- enum source values and defaults;
- target contract field names and types.
Common mistakes
| Mistake | Fix |
|---|---|
| Using $.field JSONPath syntax | Use JSON Pointer: /field. |
| Setting source.kind: okta | Use a record-shape name such as okta.system_log. |
| Assuming source.kind selects a connector | Put connector/provider information in DataModule; mappings normalize raw records. |
| Mapping to fields not present in the target contract | Update the target contract or correct the field name. |
| Marking too many fields as required | Require only fields the app needs to function correctly. |
| Forgetting fixture tests | Add tests for success, failure, missing-field, and source-drift examples. |
| Committing generated SQL as source of truth | Commit mapping source; treat lowered SQL as generated output unless deliberately promoted. |
Best practices
- Start from the contract and map only fields the app actually uses.
- Keep one mapping focused on one raw record shape.
- Use stable target field names from the contract.
- Prefer contract enum values over raw vendor strings.
- Use source.kind values that will make generated SQL readable.
- Add at least one fixture per important source variant.
- Run validate, test, and lower before publishing a package.