Lead Site Reliability Engineer – Remote

April 19, 2019

Company Description

At Tutuka, we think everyone should have access to user-friendly payment services. We make connecting easy, by making simple, safe payments happen for people around the globe. We enable payments via virtual and physical cards for partners like banks, telcos, retailers, developers and fintechs across the world.

Job Description

As the Lead Site Reliability Engineer (SRE) at Tutuka, you’ll work closely with the entire technical team to ensure the reliability of the enterprise-level, highly scalable, highly secure financial processing systems that power tens of millions of transactions. You’ll also help tie those systems to the web, mobile and API interfaces that make it easy for people to issue, redeem and reconcile prepaid cards all over the world.

We already have a team of amazing developers who work out of our local offices in Johannesburg, South Africa, as well as remotely across Europe and Southeast Asia. Now we need you to drive improvements in our reliability, scalability and efficiency.

What you will be doing

You’ll find every day an exciting challenge as you help our technical team transform a monolithic enterprise processing environment with bank-level security and 99.95% uptime into a sleek, nimble, serverless microservice processing environment with better-than-bank-level security and 99.99% uptime.

If it were easy, we would already have done it! This role may or may not involve the following:

Work closely with software engineering teams to improve availability, latency, performance, efficiency, monitoring, emergency response, and capacity planning
Work across a hybrid cloud environment spanning a hosted data centre and AWS
Handle upgrades of infrastructure and services through automation
Identify, gather, document and automate responses to key performance metrics, logs and alerts
Find optimizations and other efficiencies to scale the application
Develop playbooks and tools to streamline processes and shorten problem resolution time
Perform periodic on-call duties
Maintain an infrastructure-as-code management process

Qualifications

We love taking on team members with a variety of skill levels, from intern to PhD. But there’s no getting around the fact that we need this person to know what they’re doing and to hit the ground running.

You should already be an SRE guru with:

Solid understanding of operational principles, such as capacity planning, monitoring and incident handling
Experience automating manual processes, leveraging cloud (preferably AWS) platforms
Knowledge of telemetry, tracing, logging and alerting best practices
Experience implementing monitored and seamless deployment pipelines
Internet fundamentals: HTTP/HTTPS, DNS, TCP/IP, security by design, caching

Extra kudos are awarded for:

JVM performance tuning
Experience monitoring cloud-based systems
Knowledge of automated testing frameworks and methodologies
Experience with some scripting and compiled/VM-based languages (for example, JavaScript and Go/Java)

Additional Information

Lots of space to challenge yourself:

Learning about how the payments industry works
Working with global clients and partners
Working across multiple teams
Helping to grow our technology by understanding our customers’ needs and translating them into tangible applications

What’s in it for you:

Working at the cutting edge of payment innovation
International exposure and experience

If you can see yourself in this role and feel you can add to the ongoing success of Tutuka, then please get in touch and apply. (If you do not have Site Reliability Engineering experience, your application cannot be considered!)
