Overview
Position: Data Architect - Mainframe Migration & Modernization
Location: London, UK (hybrid, 2-3 days per week on-site)
6-month contract position
Responsibilities
- Design and implement Change Data Capture (CDC) pipelines using IBM CDC tools or equivalents, including subscription management, bookmarks, and replay strategies.
- Handle complex data encoding transformations, such as EBCDIC to UTF-8 and packed decimal conversions, with validation test suites (see the conversion sketch after this list).
- Utilize migration tooling for schema conversion and downstream analytics (Glue, Athena, Redshift), with infrastructure-as-code (Terraform) and CI/CD (GitLab).
- Plan and execute cutovers with dual-run validation, reconciliation, rollback strategies, and data governance controls (masking, encryption, IAM).
- Develop observability dashboards for lag, throughput, error rates, and cost using CloudWatch/Grafana, with operational runbooks and alerting.
- Ensure data quality through pre-migration validation tests and reconciliation against golden sources.
- Apply domain-driven design principles to model bounded contexts and aggregate roots.
- Architect event-driven systems using CDC as event streams, with replay and orchestration patterns.
- Translate Db2 schemas to the target platform (Aurora PostgreSQL preferred), including logical and physical data modelling, referential integrity, and denormalization decisions.
- Build integration pipelines from Db2 to the target (Aurora PostgreSQL preferred) via Kafka/S3, ensuring idempotency, ordering, and reliable delivery semantics (see the UPSERT sketch after this list).
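For the encoding transformations mentioned above, a minimal Python sketch of EBCDIC-to-UTF-8 decoding and COMP-3 (packed decimal) unpacking, with validation-style assertions. The cp037 code page and the sample values are assumptions for illustration; real field layouts depend on the copybook.

```python
def ebcdic_to_utf8(raw: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes (here code page 037) into a Python str, ready for UTF-8 output."""
    return raw.decode(codepage)

def unpack_comp3(raw: bytes, scale: int = 0):
    """Unpack a COMP-3 field: two decimal digits per byte, with the sign
    in the low nibble of the final byte (0xD means negative)."""
    digits = "".join(f"{b >> 4}{b & 0x0F}" for b in raw[:-1]) + str(raw[-1] >> 4)
    sign = -1 if (raw[-1] & 0x0F) == 0x0D else 1
    value = sign * int(digits)
    return value / 10**scale if scale else value

# Validation-suite style checks (sample bytes invented for illustration):
assert ebcdic_to_utf8(b"\xC8\xC5\xD3\xD3\xD6") == "HELLO"
assert unpack_comp3(b"\x12\x34\x5C") == 12345             # 0xC sign nibble: positive
assert unpack_comp3(b"\x12\x34\x5D", scale=2) == -123.45  # implied two decimal places
```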
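And for idempotent, ordered delivery, a sketch of CDC apply logic with UPSERT semantics and an LSN guard. sqlite3 stands in for Aurora PostgreSQL only so the snippet is self-contained (the Postgres `INSERT ... ON CONFLICT` syntax is near-identical); the table and event shapes are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL, src_lsn INTEGER)")

def apply_cdc_event(conn, event):
    """Apply an insert/update CDC event. The UPSERT makes replays a no-op
    (idempotency); the LSN guard rejects stale events (ordering)."""
    conn.execute(
        """INSERT INTO account (id, balance, src_lsn)
           VALUES (:id, :balance, :lsn)
           ON CONFLICT(id) DO UPDATE SET
               balance = excluded.balance,
               src_lsn = excluded.src_lsn
           WHERE excluded.src_lsn > account.src_lsn""",
        event,
    )

event = {"id": 1, "balance": 100.0, "lsn": 42}
apply_cdc_event(conn, event)
apply_cdc_event(conn, event)  # replayed event: no duplicate row, no regression
assert conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0] == 100.0
```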
Your Profile
Essential skills/knowledge/experience:
- Change Data Capture: CDC design and operations (IBM, Precisely, or equivalent); subscription management, bookmarks, replay, backfill.
- Db2 & z/OS knowledge: Db2 catalog, z/OS fundamentals, batch windows, performance considerations.
- Relational modelling: Data modelling; normalization, indexing, partitioning; OLTP vs. analytics trade-offs.
- Integration patterns: Hands-on Kafka experience; CDC-to-target pipelines; UPSERT/MERGE logic; Python/SQL; strong troubleshooting.
- Data quality mindset: Write validation tests before migration; golden-source reconciliation.
- Logical data modelling: Entity-relationship diagrams, normalization (1NF through Boyce-Codd/BCNF), denormalization trade-offs; identify functional dependencies and anomalies.
- Physical data modelling: Table design, partitioning strategies, indexes; SCD types; dimensional vs. transactional schemas; storage patterns for OLTP vs. analytics.
- Normalization & design: Normalize to 3NF/BCNF for transactional systems; understand when to denormalize for queries; trade-offs between 3NF, Data Vault, and star schemas.
- Domain-Driven Design: Bounded contexts and subdomains; aggregates and aggregate roots; entities vs. value objects; repository patterns; ubiquitous language.
- Event-driven architecture: Domain events and contracts; CDC as event streams; idempotency and replay patterns; mapping Db2 transactions to event-driven architectures; saga orchestration.
- CQRS patterns: Command/query separation; event sourcing and state reconstruction; eventual consistency; when CQRS is justified for a mainframe migration vs. when it is overkill.
- Database internals: Index structures (B-tree, bitmap, etc.), query planning, partitioning strategies; how Db2 vs. PostgreSQL differ in storage and execution.
- Data quality & validation: Designing test suites for schema conformance; referential integrity checks; sampling and reconciliation strategies (a reconciliation sketch follows this list).
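As one illustration of the validation points above, a sketch of golden-source reconciliation via an order-insensitive table fingerprint: row count plus a sum of per-row SHA-256 digests. The scheme is an assumption, not a prescribed method; a production suite would add per-column checks and sampling.

```python
import hashlib

def table_fingerprint(rows):
    """Row count plus an order-insensitive digest: summing per-row SHA-256
    hashes (mod 2**256) lets source and target be compared without sorting."""
    count, acc = 0, 0
    for row in rows:
        h = hashlib.sha256("|".join(map(str, row)).encode()).digest()
        acc = (acc + int.from_bytes(h, "big")) % (1 << 256)
        count += 1
    return count, acc

source = [(1, "ACME", 100.0), (2, "GLOBEX", -42.5)]  # e.g. rows extracted from Db2
target = [(2, "GLOBEX", -42.5), (1, "ACME", 100.0)]  # e.g. rows landed in Aurora

assert table_fingerprint(source) == table_fingerprint(target)
```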
Desirable skills/knowledge/experience:
- IBM zDIH patterns, zIIP tuning.
- COBOL copybook/VSAM ingestion experience (see the record-layout sketch below).
- PostgreSQL/Aurora data modelling.
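On the copybook point, a minimal sketch of slicing a fixed-width EBCDIC record according to a copybook layout. The copybook itself is invented for illustration, and unpack_comp3 repeats the helper from the conversion sketch above; real pipelines would parse the copybook grammar rather than hard-coding offsets.

```python
# Hypothetical copybook, hard-coded for illustration:
#   01 CUSTOMER-REC.
#      05 CUST-ID    PIC 9(6).
#      05 CUST-NAME  PIC X(10).
#      05 CUST-BAL   PIC S9(5)V99 COMP-3.

def unpack_comp3(raw: bytes, scale: int = 0):
    # Same COMP-3 helper as in the conversion sketch above.
    digits = "".join(f"{b >> 4}{b & 0x0F}" for b in raw[:-1]) + str(raw[-1] >> 4)
    return (-1 if (raw[-1] & 0x0F) == 0x0D else 1) * int(digits) / 10**scale

def parse_record(raw: bytes) -> dict:
    """Slice one 20-byte record: 6 zoned digits, 10 text bytes, 4 packed bytes."""
    return {
        "cust_id": int(raw[0:6].decode("cp037")),
        "cust_name": raw[6:16].decode("cp037").rstrip(),
        "cust_bal": unpack_comp3(raw[16:20], scale=2),
    }

raw = "000042".encode("cp037") + "ACME LTD  ".encode("cp037") + b"\x00\x12\x34\x5C"
assert parse_record(raw) == {"cust_id": 42, "cust_name": "ACME LTD", "cust_bal": 123.45}
```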