Data Engineer (m/f/d)

Cyber Insight

Leipzig

Hybrid

EUR 50,000 - 70,000

Full-time

Today

Summary

A forward-thinking tech startup in Leipzig is seeking a hands-on Data Engineer to design and build reliable data systems for its AI-driven security platforms. In this role, you'll develop data pipelines, process cybersecurity data, and collaborate with teams on risk assessment models. Candidates should have at least three years of experience in data engineering, strong Python skills, and familiarity with tools such as BigQuery and Docker. The position offers flexible hours and a modern work environment.

Benefits

Flexible working hours
Freedom to design a modern, secure data platform
Collaborative startup environment

Qualifications

  • 3+ years of experience in data engineering or cybersecurity data processing.
  • Proven experience with data orchestration frameworks like Airflow.
  • Understanding of data testing, validation, and observability practices.

Tasks

  • Design and build data pipelines and ETL workflows across various environments.
  • Ingest and process cybersecurity-related data sources.
  • Implement monitoring and observability for data workflows.

Skills

Strong Python skills
Data orchestration frameworks
Data modeling and warehousing
Familiarity with CVE data
GCP data tools knowledge
Data testing and validation practices

Tools

BigQuery
PostgreSQL
Docker
Kubernetes
Terraform

Job Description

At Cyber Insight, we are building the next generation of AI-driven platforms for IT security and risk management. Our mission is to empower companies to gain deep insights into their IT landscapes and proactively mitigate risks in an increasingly complex digital world.

As a fast-growing startup, we combine expertise in cybersecurity, data engineering, and artificial intelligence to deliver solutions that automate risk assessments, predict potential threats, and help organizations stay ahead of evolving cyber risks. Our team thrives on innovation, collaboration, and a shared passion for making a real impact in the cybersecurity space.

We are looking for a hands-on Data Engineer who is passionate about building reliable, scalable, and secure data systems. You’ll help shape our data architecture and pipelines that feed our AI models and risk assessment engines — including the crucial task of mapping vulnerabilities (CVEs) to specific software and system components.

Tasks
  • Design, build, and maintain data pipelines and ETL/ELT workflows across GCP and on-prem environments.
  • Ingest and process cybersecurity-relevant data sources such as CVE feeds, software inventories, vulnerability databases, and event logs.
  • Develop and maintain transformation logic and data models linking vulnerabilities (CVEs) to affected software and assets.
  • Implement and automate data validation, consistency checks, and quality assurance using tools like Great Expectations or Deequ.
  • Collaborate with AI and graph modeling teams to structure and prepare data for threat intelligence and risk quantification models.
  • Manage and optimize data storage using BigQuery, PostgreSQL, and Cloud Storage, ensuring scalability and performance.
  • Automate data workflows and testing through CI/CD pipelines (GitHub Actions, GCP Cloud Build, Jenkins).
  • Implement monitoring and observability for pipelines using Prometheus, Grafana, and OpenTelemetry.
  • Apply a security-focused mindset in data handling, ensuring safe ingestion, processing, and access control of sensitive datasets.
Requirements
  • 3+ years of experience in data engineering, backend data systems, or cybersecurity data processing.

  • Strong Python skills and experience with pandas, PySpark, or Dask for large-scale data manipulation.
  • Proven experience with data orchestration and transformation frameworks (Airflow, dbt, or Dagster).
  • Solid understanding of data modeling, data warehousing, SQL optimization, and ETL pipelines (e.g. Kafka).
  • Familiarity with CVE data structures, vulnerability databases and classification schemes (e.g. NVD, CPE, CWE), or security telemetry.
  • Experience integrating heterogeneous data sources (APIs, CSV, JSON, XML, or event streams).
  • Knowledge of GCP data tools (BigQuery, Pub/Sub, Dataflow, Cloud Functions) or equivalent in Azure/AWS.
  • Experience with containerized environments (Docker, Kubernetes) and infrastructure automation (Terraform or Pulumi).
  • Understanding of data testing, validation, and observability practices in production pipelines.
  • A structured and security-aware approach to building data products that support AI-driven risk analysis.
Nice to Have
  • Experience working with graph databases (Neo4j, ArangoDB) or ontology-based data modeling.
  • Familiarity with ML pipelines (Vertex AI Pipelines, MLflow, or Kubeflow).
  • Understanding of software composition analysis (SCA) or vulnerability scanning outputs (e.g. Trivy, Syft).
  • Background in threat intelligence, risk scoring, or cyber risk quantification.
  • Experience in multi-cloud or hybrid setups (GCP, Azure, on-prem).
Benefits
  • Freedom to design and shape a modern, secure data platform from the ground up.
  • A collaborative startup environment where your work directly supports AI and cybersecurity products.
  • Flexible working hours and remote-friendly setup.
  • Exposure to cutting-edge technologies in AI, data engineering, and cyber risk analytics.
  • Competitive salary and benefits tailored to your experience.

We look forward to meeting you!
