Site Reliability Engineer
LucidLink
Location: Sofia Office
Employment Type: Full time
Location Type: Hybrid
Department: Engineering
Company overview
LucidLink is a fast-growing startup on a mission to make data instantly and securely accessible from everywhere. As remote and hybrid work has become the new normal, our cloud-based technology enables teams to instantly access files and collaborate from anywhere in a familiar format that works like a local hard drive.
LucidLink’s solution is designed for workflows involving huge files, massive data sets and real-time collaboration. Our customers include some of the world’s most creative companies, such as Paramount, Warner Brothers, Epic Games, Spotify, A+E and Netflix. We were founded in 2016 by storage industry experts and support over one billion customer files across more than 40 countries. LucidLink is headquartered in San Francisco, California, has an engineering office in Sofia, Bulgaria, and has remote employees across North America, Europe, and Australia.
Reasons to join LucidLink:
Tackle big challenges: You’ll have the chance to solve complex, high-stakes problems that redefine how teams collaborate globally. As LucidLink starts with the Media & Entertainment industry and expands into data-intensive sectors, you’ll gain deep insight into cutting-edge technologies and play a role in shaping the future of global workflows.
Values-led culture: Our values don’t just exist on paper—they guide every decision and interaction. You’ll thrive in an environment where integrity, innovation, and empathy are at the core of how we operate, empowering you to grow personally and professionally.
Hypergrowth journey: Joining a company with triple-digit growth rates means unparalleled opportunities for advancement, learning, and being part of an exciting journey toward unicorn status. You’ll experience the adrenaline of startup speed combined with the satisfaction of building something truly impactful.
Immediate impact: At LucidLink, your work will matter—immediately. You’ll be part of a tight-knit team of 170+ builders working at startup speed, where your ideas and actions will create tangible, exponential results that contribute to our collective success.
Comprehensive benefits: We believe in investing in our people. With unlimited PTO, a competitive salary, stock options, and full health coverage, you’ll feel supported both professionally and personally while enjoying a strong work-life balance.
Job Description
As a Site Reliability Engineer at LucidLink, you will solve complex and broad business problems with simple solutions. Through code and cloud infrastructure, you will configure, maintain, monitor, and optimize applications and their associated infrastructure to iteratively improve on existing solutions. A LucidLink SRE's work includes automation, optimization, incident response, and consultative support for other engineering teams.
Your skills and qualifications:
3 years of experience in the field.
Experience with the major DevOps tools: Ansible, Terraform.
Experience with container infrastructures, e.g. Docker/Kubernetes.
Coding skills: Python, Bash.
Ability to navigate complex network topologies and troubleshoot systems with multiple interconnected components.
Experience with various cloud providers (AWS, DigitalOcean) is a plus.
Experience administering Linux servers. Windows and macOS experience is a plus.
Software security skills (security and vulnerability management, compliance monitoring) will be considered a plus.
Good English, both spoken and written.
Your responsibilities:
Deliver operational readiness and provide post-release production support.
Ensure the stability, scalability, and availability of the product infrastructure behind LucidLink’s customer-facing production environment, and provide support for it.
Perform routine environment maintenance to ensure everything runs smoothly and efficiently.
Develop maintenance requirements and procedures.
Handle outages: resolve incidents, perform root cause analysis (including post-mortems), and follow up with mitigation and prevention actions.
Analyze the technology currently in use and identify ways to improve LucidLink’s infrastructure.
Enhance LucidLink’s monitoring system to “know about each problem before the customer does”.
Create and improve automation related to LucidLink production systems.
Participate in our on-call and incident response teams to solve critical problems in production.