Site Reliability Engineer (SRE)

Boston, Massachusetts, United States | Full-time

Quantopian is looking for Site Reliability Engineers to join our team! 

Quantopian empowers amateur and professional quants to find alpha in trading markets. We provide a hosted platform for free-form research of market data, an open-source backtesting engine, and a large base of educational material for quants of all experience levels.

We're looking for Site Reliability Engineers to support our rapidly expanding user base and build towards our ambitious product roadmap. The SRE team at Quantopian manages the full cloud infrastructure platform that all of our products and services run on. We oversee code deployments, monitoring and alarm systems, databases, servers, containers, test infrastructure, and more.

We work on interesting problems, such as:

  • Running arbitrary user code on our own servers, with all the associated security implications
  • Building intuitive, powerful research and development tools for our users
  • Designing data stores for real-world financial data and optimizing them for high throughput when running trading simulations
  • Measuring and autoscaling our cloud infrastructure to respond to varying load
  • Creating infrastructure to power a productive user experience for both internal and external users

We've built Quantopian with Python and Ruby on Rails on AWS and Heroku. We depend heavily on Docker, Kubernetes, Ansible, and Postgres. Most new services are targeted for Kubernetes and we are steadily migrating our existing services. We love Apache Airflow and have integrated it into our core business processes.

We also try to give back to the technology community on which we rely. Our core backtest simulation engine, Zipline, is open-source. We set aside engineering time for contributions to other open-source projects like Apache Airflow. We encourage attending (and speaking at) conferences.

We're well-financed by highly reputable venture capital investors, including Spark Capital, Khosla Ventures, Bessemer Venture Partners, and Andreessen Horowitz.

We've assembled a top-notch product and engineering team here in Boston, and we're still growing. Our engineering team comes from a wide variety of backgrounds, so domain expertise in finance is not required.

Our small team size and ambitious goals dictate our approach to talent acquisition and retention: we believe in hiring friendly, motivated engineers, putting them in a positive, collaborative environment, and giving them hard problems to solve and the autonomy to solve them.

Our SREs are embedded in an Application Engineering team where they help implement deployment mechanisms, ensure services are designed for scale, and provide guidance on cloud technologies. SREs write and review code using the same GitHub Pull Request-based workflow as Application Engineers. SREs are also members of a weekly on-call rotation. We have weekly incident review meetings to ensure we are continuously reducing interruptions.

Ideally, you:

  • have some experience managing and operating systems, whether in an SRE capacity, a “DevOps” or Operations role, IT, or a similar field;
  • have some prior programming experience, either as coursework or in industry;
  • have good written communication skills and an interest in collaborating with our entire organization; and
  • thrive on designing, building, and shipping incredible features that help our users and developers find market insights.

You do not need to have:

  • A background in finance, quantitative finance, Wall Street, or financial markets of any kind
  • A Computer Science degree (though prior programming experience of some kind is expected)

Current SRE Initiatives

SREs at Quantopian work on core infrastructure projects and on application engineering projects, both of which are essential to maintaining our business. Our current major initiatives focus on increasing developer velocity and assisting our investment team with the operation of our asset management business.

Transitioning from Ansible to Kubernetes

We’re eagerly adopting Kubernetes across our infrastructure. We already run containers as part of our critical-path nightly data pipelines, and new engineering projects start from a container-first development strategy. Our next step is to seamlessly migrate our existing internal services, and eventually even our user-facing services. We’re looking for SREs who are interested not just in repackaging and redeploying applications but in redesigning them to take full advantage of the flexibility of Kubernetes. There is plenty of opportunity to get hands-on experience building and managing both containerized applications and a production Kubernetes cluster.

Supporting our Investment Research team

We work closely with the team responsible for managing Quantopian’s Asset Management business. This team engages with our Community to discover and combine novel insights about the market and uses those ideas to systematically invest institutional capital. Community authors are paid royalties based on their contributions. In support of the investment team, we are currently collaborating on the delivery of the Quantopian Alpha Model (QAM): an alpha model developed in-house and built on top of the best ideas from the Quantopian Community. QAM will allow our investment team to incorporate ideas from many more authors in our community. For the SRE team, QAM will improve our ability to manage our fund-related infrastructure, drastically reducing operational risk. The SRE team has been heavily involved in using tools like Airflow and Kubernetes to create an automated daily workflow for the QAM project while meeting our required service levels. We’re preparing to launch the first version of QAM to our production trading environment by the end of 2018.