Role: Big Data Software Engineer, supporting several different applications and projects in the Data Science department. One application builds video session records from a Data Science feed and creates a daily report, broken out by service provider, from those records. Another application receives and scans data, publishes it to one of the data lakes, and maintains statistics on ingested and published files.
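As an illustrative sketch only (the record fields, provider names, and `daily_report` function below are hypothetical, not taken from the posting), the per-provider daily report over video session records reduces to aggregation logic like this:

```python
from collections import defaultdict

# Hypothetical session records as they might arrive from the feed;
# field names are illustrative, not the actual schema.
sessions = [
    {"provider": "acme", "duration_sec": 120},
    {"provider": "acme", "duration_sec": 300},
    {"provider": "globex", "duration_sec": 45},
]

def daily_report(records):
    """Aggregate session count and total watch time per service provider."""
    report = defaultdict(lambda: {"sessions": 0, "total_sec": 0})
    for rec in records:
        entry = report[rec["provider"]]
        entry["sessions"] += 1
        entry["total_sec"] += rec["duration_sec"]
    return dict(report)

print(daily_report(sessions))
# {'acme': {'sessions': 2, 'total_sec': 420}, 'globex': {'sessions': 1, 'total_sec': 45}}
```

In production this aggregation would more likely run as a PySpark job over the feed (e.g. a `groupBy("provider").agg(...)` on a DataFrame) rather than in local memory, but the shape of the work is the same.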
This person will need to be able to:
• be plugged into any project in a developer capacity
• be able to translate business requirements into software architecture and design
• and execute - develop code, build and run unit tests, and work with product owners, other developers and leads on the team, QA, and support personnel to architect, design, build, test, deploy, fix, and support their code.
Mandatory Skills Description:
• Python programming experience
• PySpark - using Python to access a Hadoop environment through Spark
• SQL (querying and working with major SQL databases - e.g. Oracle, Postgres, MySQL)
• Extensive experience working in a Linux and Hadoop environment
• Shell scripting
• HDFS command-line interface (hdfs dfs commands)
• Experience with processing large amounts of data
   o data ingestion, data movement, data cleansing, data quality
• Excellent communication skills, team player
• Knowledge of Hive, Postgres
• Solid computer science fundamentals - algorithms and optimization
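To make the SQL expectation concrete, here is a minimal, self-contained sketch of the kind of grouped, aggregated query the role involves (SQLite is used purely so the example runs standalone; the table, columns, and data are made up, and production work would target Oracle, Postgres, or MySQL):

```python
import sqlite3

# In-memory database standing in for a production SQL store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ingested_files (name TEXT, provider TEXT, rows_loaded INTEGER)"
)
conn.executemany(
    "INSERT INTO ingested_files VALUES (?, ?, ?)",
    [("a.csv", "acme", 100), ("b.csv", "acme", 250), ("c.csv", "globex", 80)],
)

# Per-provider ingestion statistics: file count and total rows loaded.
rows = conn.execute(
    """
    SELECT provider, COUNT(*) AS files, SUM(rows_loaded) AS total_rows
    FROM ingested_files
    GROUP BY provider
    ORDER BY provider
    """
).fetchall()
print(rows)  # [('acme', 2, 350), ('globex', 1, 80)]
```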