Job Description Summary
The primary responsibility of this role is to architect, design, and develop the Hadoop-based Enterprise Data Platform (data lake). Responsibilities include onboarding and implementing new data sources in the data lake and provisioning data from the data lake to downstream systems.
Reporting to the Head of Enterprise Data Delivery, the Enterprise Big Data Specialist has a number of responsibilities, including the architecture, design, and development of a Hadoop-based Enterprise Data Platform, data warehousing, and Advanced Analytics platform. The candidate must have deep skills in big data, data warehousing, and advanced analytics, and be at home with high-scale complexity and deep, wide data stores. The candidate must also be highly capable of identifying capability gaps and defining strategic solutions to support a variety of business needs, and be able to deliver quick successes through strong relationships with senior members in both business and technology. This role also works closely with emerging technologies (e.g. AWS Cloud, Zeppelin notebook) for the analytics platform.
The architecture of the big data, data warehouse, and analytics platform is backed by a collection of modern technologies, and the successful applicant will demonstrate a keen understanding of, and experience with, similar technology sets. The candidate should also have a history of hands-on technical work, a critical-thinking approach to problem-solving, a sense of urgency and commercial acumen regarding delivery, and the initiative to identify challenges and solutions independently. Experience in building support and rapport with business and technology communities to drive substantial outcomes will be highly regarded.
- Develop and manage a broad range of data systems, architecture, and application components and processes across the entire data value chain, from data sourcing and ingestion to consumption, for structured and unstructured data sets.
- Evaluate new technologies and open-source or third-party products
- Work closely with Data Governance, Architecture, Cloud Infrastructure & Applications teams to develop, implement and manage solutions for the Enterprise Data Platform and Advanced Analytics platform.
- Collaborate with business unit stakeholders to ensure data quality metrics are met while maintaining compliance with data policies, standards, roles & responsibilities, and adoption requirements.
- Collaborate with business unit stakeholders in selecting, introducing and integrating data consumption solutions, tools and methodologies into the organization.
- Coordinate and manage vendor work assignments
- Collaborate with external clients' technical teams to set up new clients on the Advanced Analytics platform.
- Oversee operations of the data intake and analytics platform functions from a technical development and data services standpoint.
- Practice established development disciplines such as good code management, including branching and merging of code in a Git repository
- Mentor and train junior big data developers on the team.
- Perform code reviews and proactively guard against exposure to platform vulnerabilities.
- Lead moderately to highly complex assignments and projects, and/or manage projects and teams across different departments.
- Advise on department, division, or business-level strategy, while typically contributing to the execution of that strategy
- Relevant and highly developed professional and technical skills; experienced-level knowledge in the field of expertise and strong knowledge of the business context
- Ability to multi-task
- Strong analysis / design experience
- Meticulous attention to detail
- Must be a self-starter with ability to follow through on projects assigned
- Technical thought leader with hands-on experience in data technologies
- Overall IT experience in an enterprise environment – 10+ years
- Minimum 3 years in Big Data development/design, including the following Hadoop ecosystem components: HDFS, Hive, Sqoop, Flume, Pig, Kafka, Spark / Spark SQL, Oozie and Hue, as well as Java programming
- Strong knowledge of various DBMS platforms, including NoSQL architectures and design principles.
- Good understanding of AWS cloud technology and Hadoop implementation on AWS including S3, EC2 and EMR
- Proven experience in performance tuning Hive tables
- Experience with development using Agile methodologies
- Undergraduate degree in MIS or Computer Science
- Excellent interpersonal and communication skills
- Strategic thinker with excellent analytical and problem-solving skills
Nice to have:
- Previous experience in the financial and/or securities industry
- Experience in Python, Scala, Zeppelin Notebook, Ranger
- Experience designing a technology stack for machine learning
- Experience configuring YARN and MapReduce for performance and security
- Experience working with big data development tools such as Zaloni, Podium Data and Informatica