Responsible for rapidly designing, architecting, prototyping, and implementing solutions that address Big Data and Data Science needs.
Architect and develop highly scalable distributed data pipelines using the Hadoop and/or Spark ecosystems.
Be a hands-on developer, coding pipelines in Python/Java.
Design and build scalable data models for large-scale analytics datasets in both SMP and MPP architectures.
Tune SQL query performance, drawing on a thorough understanding of database internals.
Work with Business Intelligence and Data Science teams to understand data needs and ingest rich data sources such as external claims data feeds, Electronic Health Record data, financial data, operational data, public data sets, social media feeds, and real-time streaming data.
Research, experiment with, and apply bleeding-edge technologies to make data engineering efficient and resilient.
Perform code reviews and mentor other developers on best practices.
Translate advanced business analytics problems into technical approaches that yield actionable recommendations; communicate results and educate others by designing and building insightful visualizations, reports, and presentations.
Minimum 10 years of software development and architecture experience with multiple programming languages, technologies, and frameworks.
Heavy development experience on the data processing side of software development, and adeptness at choosing the right data structures and algorithms for data processing.
4 years of experience designing and delivering solutions using Hadoop MapReduce, Spark, Hive, and Sqoop.
Experience working with NoSQL databases such as HBase and Cassandra.
Experience working with real-time data streams using Kafka and Storm.
Working knowledge of Linux/Unix operating systems.
Strong background in data warehousing and ETL principles, architecture, and their implementation in large environments. Experience processing terabytes of data using databases such as SQL Server and Azure SQL Data Warehouse, or other MPP databases such as Vertica/Redshift.
Strong industry experience in programming languages such as Python, C#, or Java, with the ability to pick up new languages and technologies quickly; understanding of cloud and distributed systems principles and experience with large-scale big data methods.
At least 2 years of experience working on leading cloud platforms such as Azure/AWS/GCP; Microsoft Azure experience is preferred.
Experience working with Microsoft Azure tools and technologies like Azure Data Factory, Azure SQL Data Warehouse, IoT Hub, and Stream Analytics is highly preferred.
Ability to evaluate multiple technologies and platforms and propose the right solution for a given problem is highly desired.
Good understanding of object-oriented and functional programming paradigms.
Familiarity with DevOps methodologies: CI/CD, Docker, containers, Kubernetes, Jenkins.
Work with team members and clients to assess needs, provide assistance, and resolve problems, applying excellent problem-solving skills, verbal/written communication, and the ability to explain technical concepts to leadership and business audiences.
Bachelor’s degree in Computer Science, Computer Engineering, or related field from an accredited college or university; Master’s Degree preferred.