Site Reliability Engineer

Toronto, Ontario
Site Reliability Engineer - Kingston or Toronto

We have partnered with a recognized leader in Canadian fintech to search for a Site Reliability Engineer. This role will focus on working directly with the engineering, platform, production and dev-ops teams to own reliability, monitoring, production environments, capacity and performance. The right person for this role is passionate about challenging the status quo and is interested in creating innovative solutions to next-gen, data-related obstacles.

This is a great role for a developer, dev-ops engineer or platform architect to make a huge impact and work directly alongside senior leadership within a national company. 

**COVID Update: This role will be 100% remote until our client deems it safe for their workforce to return to the office. All interviews will be done via video. Work will resume in either the Kingston or Toronto office post-COVID.
 
Responsibilities Include:
  • Ownership of reliability, monitoring, operations, capacity and performance analysis
  • Improving service observability/monitoring to reduce MTTD/MTTR
  • Creation and maintenance of monitoring, alerting and dashboard solutions related to performance, metrics and operational workload
  • Automation to ensure repeatability and reduced time to action
  • Partner with development, security, operations, QA and business teams to improve availability, performance, security and maintainability
  • Assist in capacity planning for production environments and participate in setting SLAs/SLOs/SLIs
  • Proactive testing relating to flexibility and resilience of systems and production environment
  • Work with development, security and audit teams to ensure adherence to compliance standards
  • Troubleshoot reliability and performance issues
  • On-call in the event of critical incidents
  • Documentation of "Tribal/Collective" knowledge

Requirements:
  • Bachelor's degree related to computer science, engineering, or systems development.
  • 3 years or more experience working in software development (Java, C# or Python - experience with JS frameworks/libraries is a plus)
  • 3 years or more experience working professionally in a role related to application development, dev-ops, or some mix of the two (leadership experience is not required, but natural leadership skills are)
  • Knowledge of microservices or microservice architecture is a must-have (candidates with only monolithic architecture experience will not be considered for the role)
  • Experience with cloud architectures.  GCP is preferred but not required (other cloud flavours will be considered) 
  • Experience troubleshooting n-tier architectures with diverse sets of technologies (e.g. load balancers, web/app/caching/database servers, queues, threading, memory, CPU, heap, storage, network, OS)
  • Experience using application and infrastructure monitoring systems (e.g. Splunk, CloudWatch, Datadog, New Relic, Sumo Logic, ELK)
  • Experience with software development lifecycles based on continuous deployment (e.g. CI/CD)
  • Experience with SQL databases (e.g., MSSQL, PostgreSQL, MySQL, DB2, Sybase)
  • Strong leadership skills are a must:
    • Active listener / Excellent communication skills (all media / all audiences)
    • Courage / Initiative / Action bias
    • Creative problem solver / Data-based root cause solution bias
    • Emotional intelligence / Self-aware / Open mind
    • Cognitive Adaptability / Quick Learner / Systemic Thinker
 
How to apply?
You can apply directly to gord.marriage@talentlab.com or on our website at www.talentlab.com. We want to thank all applicants for their interest, but only those under consideration will be contacted.
 
