SRE

Keyway

Operations
Buenos Aires, Argentina
Posted on Thursday, August 29, 2024

About Keyway: Keyway is a Series A stage PropTech company revolutionizing the real estate industry through digital solutions. By leveraging cutting-edge artificial intelligence and machine learning, we've developed Keypilot, the first AI-powered real estate copilot designed to optimize the entirety of a real estate transaction. Our leadership includes a serial tech entrepreneur with a history of successful ventures and substantial venture capital backing from industry-leading investors.

Role Overview: As a Site Reliability Engineer (SRE), you will play a crucial role in ensuring the availability, reliability, and performance of Keyway's services. This position requires an understanding of system dynamics, a passion for automation, and a drive for continuous improvement in a high-stakes environment.

Responsibilities:

  • System Reliability & Availability: Monitor, maintain, and enhance the reliability and performance of Keyway's critical services.
  • Automation & Efficiency: Design and implement automation tools to minimize repetitive tasks and manual errors, thereby improving operational efficiency.
  • Incident Management & Problem-Solving: Respond to critical incidents, ensuring swift resolution, and conduct thorough postmortem analyses to prevent recurrence.
  • Monitoring & Alerts: Develop and maintain effective monitoring systems, setting alert thresholds to minimize false alarms and improve incident response.
  • Resource Optimization: Analyze resource usage and optimize infrastructure to improve efficiency and reduce costs, ensuring proper utilization of cloud and on-premises tools.
  • Security & Compliance: Work closely with engineering teams to ensure infrastructure compliance with security policies and regulations (e.g., SOC2, GDPR).
  • Infrastructure as Code (IaC): Manage infrastructure using IaC tools like Terraform, promoting consistent practices.
  • Interdisciplinary Collaboration: Collaborate closely with development, operations, and security teams to integrate SRE best practices into the software development lifecycle (SDLC).
  • Mentorship & Team Development: Mentor junior team members, fostering a culture of continuous learning and knowledge sharing.
  • Capacity Planning & Scalability Strategies: Anticipate demand growth and plan necessary infrastructure capacity to ensure efficient and uninterrupted scaling.

About You:

  • Education: Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
  • Experience: Minimum of 5 years in a site reliability engineering role or similar.
  • Technical Proficiency: Strong command of infrastructure as code (IaC) tools (e.g., Terraform), automation tools, cloud services (AWS, Azure, Google Cloud), and robust monitoring solutions.
  • Skills: Proficient in programming languages relevant to automation and infrastructure management (e.g., Python, Ruby, Bash).
  • Certifications: Certifications in cloud architecture or security (e.g., AWS Certified, Microsoft Azure Certified) are highly desirable.
  • Communication: Strong communication skills and proficiency in English.

Keyway’s Commitment to Diversity: At Keyway, we celebrate diversity and recognize the value it brings to our customers and employees.