gridscale GmbH

Site Reliability Engineer (m/f/d)

Köln
Full-Time

Posted 1 hour ago • Via www.arbeitnow.com

Description

Job Overview

  • Source: Arbeitnow
  • Location: Köln
  • Job Type: Full-Time

Job Description

At our company, it’s all about #OneTeam! Join gridscale and help shape the future of the cloud together with OVH.

As a leading tech company, we’ve been working for over two decades to reduce our environmental footprint - with innovative solutions and an open cloud designed to be sustainable from the ground up: #SustainableByDesign.

Our Tech Stack 🚀

OpenStack · Kubernetes · KVM · Linux · Bare-metal

Ansible · Terraform · FluxCD / ArgoCD · Git · Go · Python

Claude Code / Cursor / agentic coding tooling

Your Role 💻

You'll help build, operate, and industrialize OVHcloud's on-premise cloud platform (OPCP). You'll join a small, senior team that owns the OpenStack-based infrastructure and the Kubernetes / GitOps stack our customer-facing cloud runs on, and that treats AI-assisted engineering as a first-class part of how we work.

The platform is actively in build mode, so joining now means real influence on the architecture, the automation strategy, and how we adopt AI in platform engineering. As a Senior, you shape the focus of your role around your strengths and interests: there's a clear backbone of automation, compute-lifecycle, and platform work, plus an explicit AI-substrate workstream. You're at home in a security-oriented, highly automated (GitOps) environment, keep an overview in ambiguous situations, and make well-founded decisions on that basis.

Your Tasks

  • Design and build OpenStack-based on-prem infrastructure that deploys itself autonomously - discovering available hardware and bringing up a functional datacenter in minutes.

  • Develop Infrastructure as Code with Ansible and Terraform - typically spec-first with LLM assistance, then human-validated; push this further via custom agent / sub-agent setups, agentic test generation, and prompt-engineered review loops.

  • Drive the ongoing development of our Kubernetes stack and GitOps workflows (FluxCD / ArgoCD).

  • Own the full lifecycle of our compute infrastructure - from bare-metal (firmware, provisioning, hardware health) through hypervisors to virtual compute nodes - and build the automation that keeps capacity healthy and rolls out updates without disturbing tenant workloads.

  • Build and extend the AI substrate that compounds our output: Markdown knowledge bases as retrieval substrate, agentic prototypes for incident triage and capacity planning, and deeper integration of agentic coding tools into daily work.

  • Contribute to the self-healing direction, turning today's manual runbooks into tomorrow's reasoning agents. Auto-remediation isn't a separate team here - it's how platform work is meant to land.

  • Design and implement test suites aligned with functional and technical specs (non-regression, performance, security).

  • Document and package the solution so users can deploy and operate it without friction, and keep improving the platform based on telemetry and user feedback.

  • Act as a technical reference and mentor across automation, platform engineering, and AI-tooling topics.

What we offer you 💼

  • A platform that is genuinely in build mode - your architectural decisions stick.

  • A senior team where seniority means autonomy, not just a title.

  • AI-augmented engineering as a first-class workflow - Claude Code and comparable agentic tooling, Markdown-KB-as-substrate, and room to push the practice further. Modern tooling that compounds your work instead of just sitting next to it.

  • Exceptional team spirit across all departments and national borders - we live #OneTeam

  • Exciting work in a highly innovative, international environment with cutting-edge technologies

  • 32 vacation days, increasing with length of service

  • Flexible working hours, home-office options, and a secure permanent position with market- and performance-based compensation

  • Employer-funded pension plan and an attractive insurance package

  • OVHcloud covers 50% of public transportation costs

  • Up to €400 per year toward sports activities (gym membership, classes, etc.)

  • Attractive discounts at numerous shops and companies through Corporate Benefits

  • A contribution toward leasing your cargo bike

  • Regular company events and free cold and hot beverages

Your Profile

  • Several years of hands-on experience running production infrastructure (SRE, Platform, or DevOps).

  • Solid OpenStack experience - deployed, operated, and debugged it in production.

  • End-to-end compute infrastructure management, from bare-metal lifecycle through hypervisor and virtual compute node operations (migration, host evacuation, graceful drains, capacity rebalancing). The skill matters more than the specific tooling - what counts is having done it at scale and automated it.

  • Strong with Infrastructure as Code (Ansible, Terraform) and GitOps (FluxCD or ArgoCD), plus solid Linux administration including on bare-metal.

  • Active, daily practice of AI-assisted engineering, with opinions formed from real use. You can describe a workflow where an LLM saved you half a day, and one where you should have skipped it. Theoretical interest doesn't count.

  • Fluent English, written and spoken - our team is distributed, and this is the working language.

  • Nice to Have

    • Production experience with Kubernetes and the cloud-native ecosystem.

    • Production-quality Go and/or Python.

    • Deeper agentic tooling craft (Claude Code, Cursor, Aider): custom agent / sub-agent setups, hooks, prompt engineering, your own workflows or skills, and experience managing a Markdown-first knowledge base as a substrate for AI workflows.

    • Advanced compute-node tuning (CPU pinning, NUMA, hugepages, SR-IOV / PCI passthrough) and basic network debugging (VLANs, BGP).

    • Observability tooling (Prometheus, Loki, Grafana, etc.) and auto-remediation / self-healing systems (StackStorm, Event-Driven Ansible, or similar).

    • Experience in security-critical environments and with edge or multi-site deployments.

  • Soft Skills

    • A continuous-improvement mindset and ownership for what you build.

    • You see AI tooling as a structural shift in how engineering gets done - not a trend, not a threat - and you want to shape how the team adopts it.

    • You enjoy sharing knowledge and learning from peers, and you can synthesize ideas clearly.
