We are building the company's data platform, where all of the company's analytical data will be consolidated. This is a great opportunity to take part in launching and operating a large Kubernetes/Spark/S3 cluster and one of the most interesting BI practices in Eastern Europe.

Responsibilities:
- Support and active development of the Data Platform and Hybrid Data Platform (on-prem + Azure Fabric, in progress)
- Support for a team of data engineers and analysts

Skills:
- Understanding of the advantages of GitOps/IaC over manual work
- Kubernetes, Helm, Argo CD, Prometheus, Grafana, Loki, HashiCorp Vault
- Apache Spark on Kubernetes, Apache Kafka, MinIO/S3, Apache Airflow
- Docker (BuildKit), GitLab, GitLab CI
- Experience with at least one widely used programming language (Python, Go, Java, Scala, etc.) and the ability to write code

Will be a plus:
- Kerberos, Active Directory
- ClickHouse
- DataHub
- Elasticsearch
- Experience supporting and optimizing any OLAP database
- Security in Kubernetes: HashiCorp Vault, OAuth, OpenID, Keycloak

Technologies that we use:
- Kubernetes RKE2 1.31, Cilium 1.17; GitOps with Argo CD, Helm, Kustomize, Kyverno
- GitLab, GitLab CI, GitLab Kubernetes Runner, Docker, BuildKit
- Apache Airflow
- Apache Spark, Apache Kyuubi, Hive Metastore
- MinIO, Redis, PostgreSQL (CloudNativePG), Elasticsearch, Apache Kafka, ClickHouse
- DataHub
- Prometheus Stack, Grafana, Grafana Loki
- Python and Golang metrics exporters, Vector (by Datadog), Fluent Bit
- Power BI, Azure Fabric
- Ansible

We offer:
- The opportunity to work on a large-scale project from scratch
- No office requirement; remote work is welcome
- Health insurance
- Compensation for gym memberships and foreign-language courses
- Internal training (IT and beyond)