DataRobot offers a machine learning platform for data scientists of all skill levels to build and deploy accurate predictive models in a fraction of the time it used to take. The technology addresses the critical shortage of data scientists by changing the speed and economics of predictive analytics. The DataRobot platform uses massively parallel processing to train and evaluate thousands of models in R, Python, Spark MLlib, H2O and other open source libraries. It searches through millions of possible combinations of algorithms, pre-processing steps, features, transformations and tuning parameters to deliver the best models for your dataset and prediction target.
As a senior backend developer, you turn data science techniques into scalable new features, with a focus on robust distributed architecture. You will engage across disciplines to design and implement cutting-edge machine learning enhancements and infrastructure in DataRobot. Whether optimizing database query patterns, working on predictions tech, improving model tuning, designing storage structures necessary for new features, improving coding practices, or parallelizing code bottlenecks, you get it done while helping those you work with to be better developers.
The ideal candidate should bring new ideas from concept to implementation, write quality code, participate in design/development discussions, then translate architectural specs into working application design.
Main Requirements
- 5+ years of Python experience working in a large software system (not just web-dev)
- ~3 years of experience developing distributed systems, ideally with some kind of architecture responsibilities, or designing component interfaces
- In the interview process you will be evaluated on your performance in a number of coding test and design test scenarios - be prepared to think!
- Some experience with data processing
- MLlib (Spark.ml)
- Experience in some/all of these:
  - Messaging like ZMQ or RabbitMQ
  - API interface design and construction
  - Microservice/distributed systems design and construction
  - Persistent storage like Redis and MongoDB
  - Parallel Computing
- Experience with or understanding of resource management service workflows (Hadoop/YARN, Mesos, Kubernetes, AWS, OpenStack, Docker or any other).
- Experience working on the JVM (Java, Scala) is a plus
- System/performance engineering: profiling process memory/CPU/IO/network usage, system calls, flame graphs, and JVM/Python-specific debugging tools (pdb, VisualVM, etc.)