Big Data

Troy, MI

Company Name : IBA Infotech LLC

Type : Contract

Primary Skills : Spark, Scala, Python, Java, NoSQL, SparkSQL, and ANSI SQL

Location : Troy

CTC : DOE

Job Description:

Responsibilities

  • Create data integration pipelines to extract, cleanse, and integrate data from a variety of sources and formats for analysis and use across use cases.
  • Perform data profiling, discovery, and analysis to determine the location, suitability, and coverage of data, and to identify the data types, formats, and data quality that exist within a given data source.
  • Work with source system and business SMEs to develop an understanding of the data requirements and the options available within customer sources to meet the data and business requirements.
  • Create reusable data extraction/ingestion pipelines and templates to demonstrate the logical flow and manipulation of data required to move data from customer source systems into the target data lake, warehouse, and/or sandbox.
  • Perform hands-on data development to build the data extraction, movement, and integration, leveraging state-of-the-art tools and practices, including both streaming and batched data ingestion techniques.
  • Provide elbow-to-elbow style mentoring of customer resources and other consultants.
  • Assist in creation of data requirements and data model design as necessary and appropriate.

Qualifications

  • Minimum of 3 years of experience working with the Apache Hadoop Ecosystem of tools and technologies to extract, integrate, cleanse and organize data, including experience with either the Hortonworks or Cloudera distributions.
  • Key Tools and Technologies:

o Spark
o Scala
o Python
o Java

  • Experience working with the following types of workloads and data pipelines:

o Enterprise-scale ETL and ELT batched workloads
o Near real-time micro-batches
o Streaming data

  • Experience working with Data Governance frameworks
  • Some experience performing conceptual and logical data model design
  • Experience in the Financial Services, Retail, or Healthcare (Payor or Provider) industries is a plus.
  • Strong NoSQL, SparkSQL, and ANSI SQL query language skills
  • Strong verbal and written communication skills and English language proficiency
  • Strong consulting skills; consulting experience strongly desired