Kraken
Site Reliability Engineer - Data Platform
$110k - $176k
Remote, Europe, United States, Canada
Terraform
Kubernetes
Docker

Overview

Kraken is a mission-focused company rooted in crypto values. As a Krakenite, you’ll join us on our mission to accelerate the global adoption of crypto, so that everyone can achieve financial freedom and inclusion.

Job Description

Kraken is a world-class team with crypto conviction, united by the desire to discover and unlock the potential of crypto and blockchain technology. We are a fully remote company, and Krakenites are industry pioneers who develop premium crypto products for experienced traders, institutions, and newcomers to the space.

Responsibilities

  • Design the data governance mechanisms that keep our lakehouse easy to interact with, secure, and compliant with all applicable regulations
  • Implement the infrastructure we use to ingest our data, store it, catalog it with the right metadata, and capture its lineage
  • Provide a state-of-the-art suite of BI tools for multiple teams within the company
  • Guarantee the availability, high performance, scalability, and cost efficiency of our data platform
  • Implement self-service data infrastructure solutions that support the needs of 10+ business units and over 100 engineers and data analysts
  • Apply Infrastructure as Code (IaC) principles to design, provision, and manage both on-premises and cloud (AWS) infrastructure components using tools such as Terraform
  • Develop and maintain bash/shell scripts to automate operational tasks and deployments
  • Enhance and manage CI/CD pipelines to facilitate consistent software deployments across the data infrastructure
  • Implement robust data monitoring and alerting solutions to proactively detect anomalies and performance issues
  • Implement and manage role-based access control (RBAC) and permissions for many user groups and machine workflows across different environments
  • Manage and maintain the real-time streaming data architecture using technologies like Kafka and Debezium Change Data Capture (CDC)
  • Use Kubernetes to manage containerized applications within the data infrastructure, ensuring efficient deployment, scaling, and orchestration
  • Implement effective incident response procedures and participate in on-call rotations
  • Collaborate with data analysts, engineers, and cross-functional teams to understand requirements and implement appropriate solutions
  • Document architecture, processes, and best practices to enable knowledge sharing and support continuous improvement
  • Support AI/ML teams with their infrastructure requests

Required Skills

  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience)
  • Proven experience (5+ years) working as a Site Reliability Engineer, Infrastructure Engineer, or in a similar role, with a focus on data infrastructure and security
  • Experience with real-time data processing technologies, such as Kafka and Debezium
  • Working experience managing hybrid systems, particularly on AWS (HashiCorp experience is nice to have)
  • Experience with Infrastructure as Code tools such as Terraform, Terragrunt, and Atlantis
  • Experience with containerization and orchestration tools, particularly Kubernetes and Docker
  • Solid understanding of bash/shell scripting and proficiency in at least one programming language (preferably Python or Rust)
  • Familiarity with CI/CD deployment pipelines and related tools
  • Strong problem-solving skills and the ability to troubleshoot complex systems
  • Experience with data-related technologies (databases, data lakes, Airflow, Spark) is a plus

Benefits

  • Bonus program
  • Equity program
  • Wellness allowance
  • Medical, dental, vision, and 401(k) [US only]

About the company

Buy, sell, trade and learn about crypto on Kraken — the simple, powerful crypto platform that grows with you.

