· 8+ years of data analysis/architecture experience using Waterfall and Agile methodologies in a data warehouse environment, across various domains (healthcare preferred).
· Good knowledge of relational databases, the Hadoop big data platform and tools, and data vault and dimensional model design.
· Strong SQL experience writing DDL and DML statements (Oracle, Hive, and Impala preferred).
· Experience in analysis, design, development, support, and enhancement in a data warehouse environment with Cloudera big data technologies (Hadoop, MapReduce, Sqoop, PySpark, Spark, HDFS, Hive, Impala, StreamSets, Kudu, Oozie, Hue, Kafka, YARN, Python, Flume, ZooKeeper, Sentry, Cloudera Navigator) along with Informatica.
· Experience working with Sqoop scripts, PySpark programs, HDFS commands, HDFS file formats (Parquet, Avro, ORC, etc.), StreamSets pipelines, job scheduling, Hive/Impala queries, Unix commands, and shell scripting.
· Experience migrating data from a relational database (Oracle preferred) to the Hadoop big data platform is a plus.
· Experience eliciting, analyzing, and documenting functional and non-functional requirements.
· Ability to document business, functional and non-functional requirements, meeting minutes, and key decisions/actions.
· Experience in identifying data anomalies.
· Experience building data sets and familiarity with PHI and PII data.
· Ability to establish priorities and follow through on projects with minimal supervision, paying close attention to detail.
· Effective communication, presentation, and organizational skills.
· Solid experience working with Visio, Excel, PowerPoint, and Word.
· Effective team player in a fast-paced, quick-delivery environment.
· Participate in team activities, design discussions, stand-up meetings, and planning reviews with the team.
· Perform data analysis, data profiling, data cleansing, and data quality analysis in various layers using database queries on both Oracle and big data platforms.
· Elicit, analyze, and document functional and non-functional requirements.
· Document business requirements, meeting minutes, and key decisions/actions.
· Lead client meetings and sessions with data-driven analysis to clarify requirements and design decisions.
· Perform data gap and impact analysis for new data additions and changes to existing data arising from new business requirements and enhancements.
· Follow the organization's design standards document; create data mapping specifications, pseudocode for the development team(s), and design documents.
· Create logical and physical data models.
· Review and understand the existing business logic used in the Oracle and Hadoop ETL platforms, and verify it against business user needs.
· Review PySpark programs that are used to ingest historical and incremental data.
· Review Sqoop scripts that ingest historical data from the Oracle database into Hadoop IOP, along with Hive table and Impala view creation scripts for dimension tables.
· Assist the Business Analyst in creating test plans, designing test scenarios, writing SQL scripts (Oracle and Hadoop preferred), preparing test or mock-up data, and executing test scripts.
· Validate test results and records; log and research defects.
· Analyze production data issues, report problems, and identify fixes.
· Create incidents and tickets to fix production issues; create support requests to deploy the development team's code to the UAT environment.
· Participate in meetings to continuously build functional and technical expertise.
· Establish priorities and follow through on projects with minimal supervision, paying close attention to detail.
· Create and present project plans, project status reports, and other dashboards as necessary.
· Perform other duties as assigned.