This job is no longer active - but you can still view the details below.

Site Reliability Engineer (Big Data / Hadoop)

Greater NYC Area
About Foursquare:
Since our inception in 2009, Foursquare has been a leading force in changing how location information enriches our real-world and digital lives. As a location intelligence company, Foursquare comprises two well-known consumer apps, Foursquare and Swarm, as well as thriving media and enterprise products. Our B2B offerings include Places (for developers), Pinpoint and Attribution (for marketers), and Place Insights (for analysts, based on the world's largest foot traffic panel). With more than 200 people across our offices in New York and San Francisco and in sales offices around the globe, we’re dedicated to our trailblazing mission: enriching consumer experiences and informing business decisions with location intelligence.
About our Engineering Team:
We want members of Foursquare’s engineering team to bring experience building real products from the ground up. We're passionate about tackling tough challenges in the location space and look for others who like to dive deep into code and help solve hard problems. You should be comfortable running with your own ideas and eager to learn new skills on a bleeding-edge platform. We use a variety of tools, technologies, and languages to build software (Scala, Thrift, MongoDB, Memcached, JS/jQuery, Kafka, Pants, Hadoop, MapReduce, Spark), but experience with equivalent ones will do just fine.
Join us and help bring our feature ideas (and your own!) off the whiteboard and into reality. Here are some high-level areas where you could help in our NY or SF office:
- Improve the accuracy and efficiency of one of the premier Place search APIs in existence. Our API powers Foursquare City Guide and Swarm as well as apps from Microsoft, Uber, Samsung, Twitter and more
- Leverage machine learning techniques to build systems which process and derive insights from billions of location data points every day
- Launch features that make cities easier to explore and continue to push the Foursquare City Guide and Swarm apps forward
- Build resilient services and tooling which drive all of our offline processing of petabytes of data

About the Role:
At Foursquare, our production systems run on an innovative hybrid cloud-and-colocation installation. We embrace open source and home-grown tools in the belief that what works best is best. We're looking for a seasoned site reliability engineer to help us grow, automate, and monitor our footprint in the datacenter and in the cloud.

The Big Data SRE will focus on the operation and optimization of our large (7000+ cores, 4 petabytes of storage and growing!) Hadoop cluster. You will work closely with the rest of the engineering org to ensure a stable and scalable platform is available to support our extensive data analytics and machine learning efforts. You will cross-train with the rest of the SRE team to share your Hadoop expertise and to acquire skills relevant to maintaining and scaling the rest of our infrastructure.

You should have a proven track record of writing automation tools, a solid understanding of operating system fundamentals, and familiarity with common production environment services. You should be comfortable running with your own ideas and eager to learn new skills on a bleeding-edge platform. We use a variety of tools, technologies, and languages to build software (e.g., Scala, Hadoop, Python, Thrift, MongoDB, Memcached, Redis, Kafka, Chef, Aurora, Mesos, RocksDB, Luigi, Pants, Nginx, HAProxy, Logstash, Grafana), but experience with equivalent ones will do just fine.

    • 5+ years of proven industry experience.
    • Strong written and verbal communication skills.
    • Solid background using Linux and other *nix operating systems.
    • Experience with deployment automation tools like Ambari, Chef, Puppet or similar systems.
    • Familiarity with a breadth of projects in the Hadoop ecosystem, and expertise in at least a few of them. We primarily use HDFS, YARN, Hive, MapReduce, Cascading, Scalding, Presto, Spark, PySpark, Jupyter, and Zeppelin.
    • Familiarity with using and supporting analytics systems like Hive, Redshift, Presto, Athena, Tableau and similar tools.
    • Familiarity with performance debugging and tuning at the OS, JVM and cluster (MapReduce, Hive, Spark jobs) levels.
    • Bonus points for deploying/operating large-ish Hadoop clusters in AWS/GCP and use of EMR, Terraform, DC/OS, Dataproc.
    • Bachelor's degree or higher in Computer Science, Electrical Engineering, or a related field.
Foursquare is proud to foster an inclusive environment that is free from discrimination. We strongly believe that in order to build the best products, we need a diversity of perspectives and backgrounds. This leads to a more delightful experience for our users and team members. We value listening to every voice, and we encourage everyone to come be a part of building a company and products we love.
Foursquare is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected Veteran status, or any other characteristic protected by law.