m5c contract mappings

Contract mappings convert raw source records into contract-shaped records. They are the bridge between vendor fields and stable app semantics.

Use a mapping when a module has provider-specific data, such as Okta System Log JSON, GitHub repository event JSON, or a custom log record, and an app expects a stable DataContract.

Where mapping files live

m5c discovers any *.yaml file under a directory named mappings.

A typical module layout is:

packages/modules/okta-system-log/
  module.yaml
  mappings/
    system-log-auth.yaml
  tests/
    fixtures/
      okta-successful-login.raw.json

The mapping file can have any filename ending in .yaml. The surrounding DataModule should reference it from spec.mappings so humans and reports can understand which module provides which contract.
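
The discovery rule can be approximated in a few lines of Python. This is only a sketch of the rule as stated above, not m5c's implementation: the function name find_mapping_files is hypothetical, and treating nested subdirectories of mappings as included is an assumption.

```python
from pathlib import Path


def find_mapping_files(root):
    """Return *.yaml files located under any directory named 'mappings' below root."""
    root_path = Path(root)
    found = []
    for path in root_path.rglob("*.yaml"):
        if not path.is_file():
            continue
        # Directory components between root and the file itself.
        parent_dirs = path.relative_to(root_path).parts[:-1]
        if "mappings" in parent_dirs:
            found.append(path)
    return sorted(found)
```

Under this reading, module.yaml in the layout above is not picked up because it does not sit under a mappings directory.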

Build a mapping file

Build a mapping in this order:

  1. Pick the target contract.
  2. Add a representative raw fixture.
  3. Name the raw source shape with spec.source.kind.
  4. Map required contract fields first.
  5. Normalize enums and IDs.
  6. Add tests that prove important examples produce contract-shaped output.
  7. Run m5c validate and m5c test.
  8. Run m5c lower and inspect generated SQL.

Mapping document

kind: ContractMapping
apiVersion: semantic-catalog.mach5.io/v1alpha1
metadata:
  name: okta-system-log-auth
spec:
  source_language: mach5_mapping_ir
  source:
    kind: okta.system_log
    format: json
  target:
    contract: identity.authentication_event.v1
  fields:
    event_uid:
      path: /uuid
      required: true
    time:
      path: /published
      transform: rfc3339_timestamp
      required: true
    actor.user.email_addr:
      path: /actor/alternateId
      required: true
    actor.user.uid:
      path: /actor/id
    src_endpoint.ip:
      path: /client/ipAddress
      transform: ip_string
    status_id:
      enum:
        path: /outcome/result
        values:
          SUCCESS: 1
          FAILURE: 2
        default: 0
      required: true
    raw_event_ref:
      template: raw-okta-system-log/{/uuid}
  tests:
    - name: okta-successful-login
      input: ../tests/fixtures/okta-successful-login.raw.json
      expect:
        event_uid: okta-evt-001
        status_id: 1
        actor.user.email_addr: alice@example.com

Top-level fields

| Field | Required | Meaning |
| --- | --- | --- |
| kind | Yes | Must be ContractMapping. |
| apiVersion | Recommended | Current examples use semantic-catalog.mach5.io/v1alpha1. |
| metadata.name | Yes | Stable mapping name. Use a source-to-contract name such as okta-system-log-auth. |
| spec.source_language | Yes | Use mach5_mapping_ir for native mappings that m5c test can execute. |
| spec.source | Recommended | Describes the raw source shape. Used during lowering. |
| spec.target.contract | Yes | Must match a local DataContract.metadata.name. |
| spec.fields | Yes | Target field mappings. Keys are target contract field names. |
| spec.tests | Recommended | Fixture-backed examples run by m5c test. |

Source metadata

spec.source identifies the raw shape this mapping expects.

source:
  kind: okta.system_log
  format: json

| Field | Meaning |
| --- | --- |
| source.kind | Source relation hint used by m5c lower when generating SQL. Dots are converted to underscores, so okta.system_log becomes FROM okta_system_log. If omitted, generated SQL uses FROM raw_payloads. |
| source.format | Describes the input format. V1 mappings expect JSON payloads; the field is descriptive today and reserved for future validation/import behavior. |

source.kind is not a connector selector. Connector/provider coverage is described by DataModule manifests; mappings describe how one raw source shape is normalized into a target contract.

Good source kind names describe the raw record shape, not the product category:

| Better | Too vague |
| --- | --- |
| okta.system_log | okta |
| github.repository_event | github |
| gcp.audit_log | gcp |
| custom.security_event | json |

JSON Pointer paths

path values use JSON Pointer, not JSONPath.

Use:

path: /actor/alternateId

Do not use:

path: $.actor.alternateId

For this raw fixture:

{
  "uuid": "okta-evt-001",
  "published": "2026-05-26T00:00:00Z",
  "actor": {
    "id": "00u123",
    "alternateId": "alice@example.com"
  },
  "client": {
    "ipAddress": "203.0.113.10"
  },
  "outcome": {
    "result": "SUCCESS"
  }
}

These pointers resolve as follows:

| Pointer | Value |
| --- | --- |
| /uuid | okta-evt-001 |
| /actor/id | 00u123 |
| /actor/alternateId | alice@example.com |
| /client/ipAddress | 203.0.113.10 |
| /outcome/result | SUCCESS |
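
To make the resolution rules concrete, here is a minimal JSON Pointer resolver following RFC 6901, applied to the fixture above. It is an illustrative sketch, not the resolver m5c uses; resolve_pointer and RAW_FIXTURE are hypothetical names, and raising KeyError for an absent segment is a choice made for this example.

```python
import json

RAW_FIXTURE = json.loads("""
{
  "uuid": "okta-evt-001",
  "published": "2026-05-26T00:00:00Z",
  "actor": {"id": "00u123", "alternateId": "alice@example.com"},
  "client": {"ipAddress": "203.0.113.10"},
  "outcome": {"result": "SUCCESS"}
}
""")


def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer; raise KeyError if any segment is absent."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape per RFC 6901: "~1" means "/" and "~0" means "~", in that order.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, dict) and token in current:
            current = current[token]
        elif isinstance(current, list) and token.isdecimal() and int(token) < len(current):
            current = current[int(token)]
        else:
            raise KeyError(pointer)
    return current


print(resolve_pointer(RAW_FIXTURE, "/actor/alternateId"))  # prints: alice@example.com
```

A JSONPath string such as $.actor.alternateId does not start with /, so this sketch rejects it with ValueError instead of silently returning nothing.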

Field mapping forms

The keys under spec.fields are target contract field names. The value describes how to create each output field.

Path

Read a value from the raw JSON record.

actor.user.email_addr:
  path: /actor/alternateId
  required: true

Const

Emit a fixed value.

metadata.vendor:
  const: okta

Template

Compose a string from literal text and JSON Pointer placeholders.

raw_event_ref:
  template: raw-okta-system-log/{/uuid}
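
One way to picture template expansion is as a regular-expression substitution over {/pointer} placeholders. The sketch below is not m5c's implementation: render_template and lookup are hypothetical helpers, lookup only walks nested objects, and rendering a missing value as an empty string is an assumption, since the behavior for missing placeholders is not documented here.

```python
import re

# A placeholder is a JSON Pointer wrapped in braces, for example {/uuid}.
PLACEHOLDER = re.compile(r"\{(/[^{}]*)\}")


def lookup(record, pointer):
    """Follow a pointer through nested dicts; return None if any key is absent."""
    current = record
    for key in pointer[1:].split("/"):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def render_template(template, record):
    """Replace each {/pointer} placeholder with the value found in record."""
    def substitute(match):
        value = lookup(record, match.group(1))
        # Assumption of this sketch: a missing value renders as an empty string.
        return "" if value is None else str(value)

    return PLACEHOLDER.sub(substitute, template)


print(render_template("raw-okta-system-log/{/uuid}", {"uuid": "okta-evt-001"}))
# prints: raw-okta-system-log/okta-evt-001
```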

Enum

Map source values to contract values.

status_id:
  enum:
    path: /outcome/result
    values:
      SUCCESS: 1
      FAILURE: 2
    default: 0

The source value is converted to a string key before lookup. If no key matches, default is used when provided.
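
The lookup rule amounts to a dictionary access on the stringified source value. The following sketch illustrates that rule only; map_enum is a hypothetical name, Python's str() stands in for whatever string conversion m5c applies, and returning None when there is no match and no default is an assumption.

```python
def map_enum(source_value, values, default=None):
    """Look up the string form of source_value in values, falling back to default."""
    key = str(source_value)
    if key in values:
        return values[key]
    # Assumption of this sketch: with no match and no default, the result is None.
    return default


status_values = {"SUCCESS": 1, "FAILURE": 2}

print(map_enum("SUCCESS", status_values, default=0))  # prints: 1
print(map_enum("SKIPPED", status_values, default=0))  # prints: 0
```

Because the key is a string, a numeric source value such as 200 would need an entry keyed "200" in values to match.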

Coalesce

Try multiple mappings in order and use the first non-null value.

actor.user.email_addr:
  coalesce:
    - path: /actor/alternateId
    - path: /actor/displayName
    - const: unknown@example.invalid
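
The evaluation order can be sketched as a loop that stops at the first candidate producing a non-null value. This is an illustration under assumptions, not m5c code: coalesce, evaluate, and lookup are hypothetical helpers, only the path and const forms are handled, and an explicit JSON null is treated the same as a missing key.

```python
def lookup(record, pointer):
    """Follow a pointer through nested dicts; return None if any key is absent."""
    current = record
    for key in pointer[1:].split("/"):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def evaluate(form, record):
    """Evaluate one candidate mapping form (only 'path' and 'const' in this sketch)."""
    if "const" in form:
        return form["const"]
    if "path" in form:
        return lookup(record, form["path"])
    raise ValueError(f"unsupported mapping form: {form!r}")


def coalesce(forms, record):
    """Return the first non-null value produced by forms, tried in order."""
    for form in forms:
        value = evaluate(form, record)
        if value is not None:
            return value
    return None


forms = [
    {"path": "/actor/alternateId"},
    {"path": "/actor/displayName"},
    {"const": "unknown@example.invalid"},
]
print(coalesce(forms, {"actor": {"displayName": "Alice"}}))  # prints: Alice
```

Ending the list with a const, as in the example above, guarantees the field always receives a value.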

Raw passthrough

Preserve the entire raw payload.

raw_payload:
  raw: true

Missing values

Use required: true when the target contract or app cannot safely operate without the value.

event_uid:
  path: /uuid
  required: true

Optional fields are omitted from the output when the source value is missing. To emit null instead, use:

src_endpoint.ip:
  path: /client/ipAddress
  on_missing: null

To fail when missing without using required, use:

src_endpoint.ip:
  path: /client/ipAddress
  on_missing: error
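
The three outcomes (omit, emit null, fail) can be summarized as a small decision function. This sketch reflects one reading of the options above and is not m5c's implementation: emit_field and MissingFieldError are hypothetical names, and a value of None stands for "the source path did not resolve".

```python
class MissingFieldError(Exception):
    """Raised when a field that must be present has no source value."""


def emit_field(name, spec, value, output):
    """Write one mapped field into output, applying the missing-value policy.

    value is None when the source path did not resolve (an assumption of this sketch).
    """
    if value is not None:
        output[name] = value
        return
    if spec.get("required") or spec.get("on_missing") == "error":
        raise MissingFieldError(f"{name}: no value at {spec.get('path')}")
    if "on_missing" in spec and spec["on_missing"] is None:
        # YAML `on_missing: null` parses to None, so test for the key itself.
        output[name] = None
    # Otherwise the optional field is simply left out of the output.


output = {}
emit_field("src_endpoint.ip", {"path": "/client/ipAddress", "on_missing": None}, None, output)
print(output)  # prints: {'src_endpoint.ip': None}
```

Checking for the key instead of its value matters here because a parsed `on_missing: null` and an absent on_missing both read as None through a plain dictionary get.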

Transforms

Transforms apply after a value is extracted.

| Transform | Current behavior |
| --- | --- |
| string | Converts the value to a string. |
| rfc3339_timestamp | Converts the value to a string for timestamp-shaped fields. |
| ip_string | Converts the value to a string for IP-shaped fields. |
| int | Keeps numeric values or parses integer strings. |
| lower | Converts string form to lowercase. |
| upper | Converts string form to uppercase. |

Example:

time:
  path: /published
  transform: rfc3339_timestamp
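
Read as code, the transform table is a lookup from transform name to a small function. The sketch below mirrors the documented current behavior and nothing more; TRANSFORMS, to_int, and apply_transform are hypothetical names, and Python's str() is an assumed stand-in for m5c's string conversion.

```python
def to_int(value):
    """Keep numbers as they are; parse anything else as an integer string."""
    if isinstance(value, (int, float)):
        return value
    return int(str(value).strip())


# Per the table, the timestamp and IP transforms currently only stringify.
TRANSFORMS = {
    "string": str,
    "rfc3339_timestamp": str,
    "ip_string": str,
    "int": to_int,
    "lower": lambda value: str(value).lower(),
    "upper": lambda value: str(value).upper(),
}


def apply_transform(name, value):
    """Apply the named transform to an already extracted value."""
    if name not in TRANSFORMS:
        raise ValueError(f"unknown transform: {name!r}")
    return TRANSFORMS[name](value)


print(apply_transform("rfc3339_timestamp", "2026-05-26T00:00:00Z"))
# prints: 2026-05-26T00:00:00Z
```

Note that, per the table, rfc3339_timestamp and ip_string do not validate their input today, so a fixture test is the place to catch malformed timestamps or addresses.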

Tests

Mapping tests are fixture-backed. Each test points to a raw JSON file relative to the mapping file.

tests:
  - name: okta-successful-login
    input: ../tests/fixtures/okta-successful-login.raw.json
    expect:
      event_uid: okta-evt-001
      status_id: 1
      actor.user.email_addr: alice@example.com

m5c test does four things for mapping tests:

  1. Reads the fixture JSON.
  2. Applies the mapping.
  3. Validates the output against the target contract when that contract is present.
  4. Checks every field listed under expect.

Run tests with:

m5c test apps/security-analytics --workspace --format json

The report includes mapping test counts and failures. Use mapping tests to catch source drift before package lowering or deployment.

Lowering

During m5c lower, mappings are rendered into inspectable SQL projections and generated bundle resources. For the example above, source.kind: okta.system_log produces generated SQL over the relation okta_system_log.

SELECT
  json_value(raw_payload, '$.uuid') AS "event_uid",
  json_value(raw_payload, '$.published') AS "time",
  json_value(raw_payload, '$.actor.alternateId') AS "actor.user.email_addr"
FROM okta_system_log

If source.kind is omitted, generated SQL uses raw_payloads:

FROM raw_payloads

Generated SQL assumes the raw relation has a JSON column named raw_payload.
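
The shape of the generated projection can be reproduced with a short script, which helps when reading lowered output. This is a sketch inferred from the SQL shown above, not the m5c lowering code: render_projection and pointer_to_jsonpath are hypothetical names, only path fields are handled, and transforms, enums, and key escaping are ignored.

```python
def pointer_to_jsonpath(pointer):
    """Turn '/actor/alternateId' into '$.actor.alternateId' (simple keys only)."""
    return "$" + "".join("." + token for token in pointer[1:].split("/"))


def render_projection(fields, source_kind=None):
    """Render a SELECT over raw_payload for the path-based fields of a mapping."""
    # Dots in source.kind become underscores; without a kind, use raw_payloads.
    relation = source_kind.replace(".", "_") if source_kind else "raw_payloads"
    columns = []
    for name, spec in fields.items():
        if "path" not in spec:
            continue  # other mapping forms are out of scope for this sketch
        json_path = pointer_to_jsonpath(spec["path"])
        columns.append(f"  json_value(raw_payload, '{json_path}') AS \"{name}\"")
    return "SELECT\n" + ",\n".join(columns) + f"\nFROM {relation}"


fields = {
    "event_uid": {"path": "/uuid"},
    "time": {"path": "/published"},
    "actor.user.email_addr": {"path": "/actor/alternateId"},
}
print(render_projection(fields, "okta.system_log"))
```

Mapping files use JSON Pointer, while the json_value calls in the generated SQL take JSONPath-style strings; the conversion happens during lowering and never appears in the mapping source.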

Run lowering with:

m5c lower apps/security-analytics --dev --out lowered

Then inspect:

lowered/generated/sql/
lowered/generated/bundles/

Validate and test workflow

Use this loop while authoring:

m5c validate apps/security-analytics --workspace --offline
m5c test apps/security-analytics --workspace
m5c lower apps/security-analytics --dev --out /tmp/m5c-lowered

If a mapping test fails, inspect:

  • the fixture path;
  • JSON Pointer spelling;
  • required fields;
  • enum source values and defaults;
  • target contract field names and types.

Common mistakes

| Mistake | Fix |
| --- | --- |
| Using $.field JSONPath syntax | Use JSON Pointer: /field. |
| Setting source.kind: okta | Use a record-shape name such as okta.system_log. |
| Assuming source.kind selects a connector | Put connector/provider information in DataModule; mappings normalize raw records. |
| Mapping to fields not present in the target contract | Update the target contract or correct the field name. |
| Marking too many fields as required | Require only fields the app needs to function correctly. |
| Forgetting fixture tests | Add tests for success, failure, missing-field, and source-drift examples. |
| Committing generated SQL as source of truth | Commit mapping source; treat lowered SQL as generated output unless deliberately promoted. |

Best practices

  • Start from the contract and map only fields the app actually uses.
  • Keep one mapping focused on one raw record shape.
  • Use stable target field names from the contract.
  • Prefer contract enum values over raw vendor strings.
  • Use source.kind values that will make generated SQL readable.
  • Add at least one fixture per important source variant.
  • Run validate, test, and lower before publishing a package.
