Infinity Consulting Solutions
Job Description
– Lead Machine Learning Engineer, Data Platform and Pipelines

Our client is a well-known entertainment company seeking a Lead Machine Learning Engineer for their Direct-to-Consumer Group.
We are building a global streaming video (OTT) platform covering search, recommendation, personalization, catalogue, video transcoding, global subscriptions, and much more.
We build user experiences ranging from classic lean-back viewing to interactive learning applications.
We build for connected TVs, web, mobile phones, tablets, and consoles across a large footprint of their owned networks.
We are hiring Senior Software Engineers to join the Personalization, Recommendation and Search team.
As part of a rapidly growing team, you will own complex systems that provide a personalized, unique experience for millions of users in more than 200 countries, spanning all of their brands.
You will be responsible for building scalable and distributed data pipelines and will contribute to the design of our data platform and infrastructure.
You will handle big data, both structured and unstructured, at the scale of millions of users.
You will lead by example, define best practices, and set high standards for the entire team and the rest of the organization.
You have a successful track record of delivering ambitious projects with cross-functional teams.
You are passionate and results-oriented.
You strive for technical excellence and are very hands-on.
Your co-workers love working with you.
You have built respect in your career through concrete accomplishments.
What the ideal candidate looks like:
5+ years of experience designing, building, deploying, testing, maintaining, monitoring, and owning scalable, resilient, and distributed data pipelines.
High proficiency in at least two of Scala, Python, Spark, or Flink, applied to large-scale data sets.
Strong understanding of workflow management platforms (Airflow or similar).
Familiarity with advanced SQL.
Expertise with big data technologies (Spark, Flink, Data Lake, Presto, Hive, Apache Beam, NoSQL, …).
Knowledge of batch and streaming data processing techniques.
Obsession with service observability, instrumentation, monitoring, and alerting.
Understanding of the Data Lifecycle Management process to collect, access, use, store, transfer, and delete data.
Strong knowledge of AWS or similar cloud platforms.
Expertise with CI/CD tools (CircleCI, Jenkins or similar) to automate building, testing and deployment of data pipelines and to manage the infrastructure (Pulumi, Terraform or CloudFormation).
Understanding of relational databases (e.g., MySQL, PostgreSQL), NoSQL databases (e.g., key-value stores like Redis, DynamoDB, RocksDB), and Search Engines (e.g., Elasticsearch).
Ability to decide, based on the use case, when to use one over the other.
Familiarity with recommendation and search techniques for personalizing the experience for millions of users across millions of items.
Master's degree in Computer Science or a related discipline.
– provided by Dice