Data types, data structures, schemas (both JSON and Spark), schema management, etc.
Strong understanding of complex JSON manipulation
Experience working with Data Pipelines using custom Python/PySpark frameworks
Strong understanding of the four core data categories (Reference, Master, Transactional, Freeform) and the implications of each, particularly the management and handling of Reference Data.
Strong understanding of Data Security principles
Data owners, access controls
Row- and column-level access, GDPR, etc., including experience of handling sensitive datasets
Strong problem-solving and analytical skills, with the ability to demonstrate them intuitively
Requirements
Required Skills:
Languages / Frameworks:
JSON
YAML
Python (as a full programming language, not just the ability to write basic scripts; Pydantic experience would be a bonus)