Senior Data Engineer

Durham, NC

Company Name : IBA Infotech LLC

Type : Contract

Primary Skills : Hadoop, Spark, Hive, Presto, Flink, Samza

Location : Durham

CTC : DOE

Job Description:

Summary:

  • Develop and deploy highly available, fault-tolerant software that improves the features, reliability, performance, and efficiency of the Cloud Analytics platform.
  • Actively review code, mentor, and provide peer feedback.
  • Collaborate with engineering teams to identify and resolve pain points as well as evangelize best practices.
  • Partner with various teams to transform concepts into requirements and requirements into services and tools.
  • Engineer efficient, adaptable, and scalable architecture for all stages of the data lifecycle (ingest, streaming, structured and unstructured storage, search, aggregation) in support of a variety of data applications.
  • Build abstractions and reusable developer tooling that let other engineers quickly build self-service streaming and batch pipelines.
  • Build, deploy, maintain, and automate large global deployments in AWS.
  • Troubleshoot production issues and develop solutions as required.

This may be the perfect job for you if:

  • You have a strong engineering background with the ability to design software systems from the ground up.
  • You have expertise in Java, Python or similar programming languages.
  • You have experience in web-scale data and large-scale distributed systems, ideally on cloud infrastructure.
  • You have a product mindset. You are energized by building things that will be heavily used.
  • You have engineered scalable software using big data technologies (e.g., Hadoop, Spark, Hive, Presto, Flink, Samza, Storm, Elasticsearch, Druid, Cassandra).
  • You have experience building data pipelines (real-time or batch) on large, complex datasets.
  • You have worked on, and understand, messaging, queueing, and stream processing systems.
  • You design not just to solve the problem at hand, but also with maintainability, testability, monitorability, and automation as top concerns.