Data Engineer, Codecademy
Codecademy is one of the biggest online educators in the world. We have one of the largest data sets in learning - billions of lines of code, hundreds of millions of submissions, and, we think, the ability to find out how all of that can make the best learning experience in the world. We're not simply using data to solve the usual problems - recommendation engines, search rankings, etc. - but instead are using it to improve the ability of our students to learn skills that can change their lives. Our team started with a data scientist who, after working on the aforementioned problems at Google, Square, and LinkedIn, knew he could use his skills to create a world where learning is once again fun, accessible, and relevant. Will you be the next to make the same decision?
At Codecademy, we believe in the value of fast iteration and being data driven. We're looking for a data engineer who is passionate about building ETL systems that handle large data sets and facilitate analytics-driven experimentation across the Codecademy team.
Problems we work on:
Data infrastructure: Consolidate the ETL processes across various internal and external data sources, and proactively improve the quality of our data and make it easier to access.
Learner session analysis: Analyze learner session data to provide actionable insight that improves our user experience. For example, you'll dive deep into where learners succeed and where they drop off, and use those insights to help people learn faster and stay motivated to achieve their goals.
Business intelligence visualization: Visualize and provide insight into our key growth, retention, and user engagement metrics. You will help define the metrics that matter for Codecademy, present them to the team, and guide product development by showcasing the path to growth.
Be lean, be data driven: Work with internal team members to form testable hypotheses, creatively design the cheapest experiment that tests them, and draw valuable insights from experiment results.
Requirements:
Mastery of large scale ETL systems.
Mastery of at least one of the following languages: Ruby, Python or Java.
Experience working with distributed NoSQL database systems in a production environment.
Analytical problem-solving skills.
Ability to make pragmatic engineering decisions in a short amount of time.
Location: New York, NY