As a Python Backend Engineer on our team, you will accelerate feature engineering to get the most out of multiple data sources and data types. The team you will join owns the entire feature learning experience for DataRobot and is responsible for making our feature discovery automation, and the resulting models, the best in the world.
You will build out Data Management and ETL systems inside DataRobot using distributed frameworks such as Spark, Hadoop, and Kubernetes. You will build scalable solutions to process high data volumes, with a focus on crafting robust components for data intake, cleanup, and a full variety of data transformations. You will also design and implement features from start to finish, including clean, easy-to-use APIs, automated tests, and deployment infrastructure.
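To give a flavor of the intake → cleanup → transform work described above, here is a minimal sketch in plain Python. All function names and record fields are illustrative only, not DataRobot's actual API; a production pipeline here would run on a distributed framework such as Spark.

```python
# Illustrative intake -> cleanup -> transform pipeline (hypothetical names).

def intake(rows):
    """Accept raw records from any source as a list of dicts."""
    return list(rows)

def cleanup(rows):
    """Drop records missing a required 'id' and strip stray whitespace."""
    cleaned = []
    for row in rows:
        if row.get("id") is None:
            continue
        cleaned.append({k: v.strip() if isinstance(v, str) else v
                        for k, v in row.items()})
    return cleaned

def transform(rows):
    """Derive a simple numeric feature from an existing column."""
    for row in rows:
        row["name_length"] = len(row.get("name", ""))
    return rows

raw = [{"id": 1, "name": "  Alice "}, {"id": None, "name": "ghost"}]
result = transform(cleanup(intake(raw)))
# The record with a missing id is dropped; the survivor is trimmed
# and gains a derived feature.
```

In a Spark-based version, each stage would map onto a DataFrame transformation, but the shape of the work, robust intake, principled cleanup, and composable transforms, is the same.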
The ideal candidate can bring new ideas from concept to implementation, write quality, testable code, and participate in design/development discussions.
• 5+ years of experience in Python
• 3+ years of experience in architecting and developing distributed systems
• Experience with distributed data processing tools such as Spark
• Experience with the Hadoop ecosystem
In the interview process, you will be evaluated on your performance in a number of coding and design scenarios, so be prepared to think!
• Ability to communicate about technical topics
• Willingness to learn about new technologies
• Experience in some or all of the following:
◦ System/performance evaluation, such as profiling a process's memory/CPU/IO/network usage, and language-specific debugging tools
◦ Document-oriented databases, ideally MongoDB
◦ Distributed search engines such as Elasticsearch or Solr
◦ Messaging services such as RabbitMQ
◦ Hadoop services such as YARN, Spark, and HDFS