Software Engineer (Dataiku/Hadoop)

New York, NY

Our client is seeking a Software Engineer with expertise in the design, development and implementation of statistical modeling databases and in implementing statistical models in Python/R on a Dataiku/Hadoop platform. The position requires working closely with the WM Strategists and Modeling Group, which is responsible for developing and implementing statistical models covering a wide range of financial products such as bank deposits, mortgage lending and retail lending. We have an opening for a qualified individual to join our fast-paced work environment.

The right candidate will have a background in data engineering and experience working with statistical modeling. The candidate should be well versed in the Hadoop ecosystem and in the details of Hadoop application design. Experience with Hadoop/Spark performance tuning, including but not limited to data partitioning and indexing, is also required.
Responsibilities:
  • Work closely with members of the WM Strats and Modeling team on the design, development and implementation of large statistical databases in a Dataiku/Hadoop environment
  • Work closely with members of the WM Strats and Modeling team on the implementation of statistical and econometric models in Python/PySpark/R on the Dataiku platform
  • Work closely with members of the WM Strats and Modeling team to facilitate processing of large data sets in a Hadoop environment using Spark/PySpark/RSpark
  • Ensure data integrity through data quality, validation, governance and transparency
  • Support production deployment and model monitoring to ensure stable performance and adherence to standards
Skills required:
  • Experienced professional with 10-12 years of experience developing and implementing statistical models in a Big Data ecosystem, e.g., Hadoop, Spark, HBase, Hive/Impala or any other similar distributed computing technology
  • Proficiency with Python/R and basic libraries for statistical/econometric modeling such as scikit-learn and pandas
  • Experienced in Hadoop, Spark, HDFS, Python, R, PySpark and other leading technologies
  • Proficiency with Dataiku or similar tools
  • Proficiency in data analysis using complex and optimized SQL and/or above-mentioned technologies
  • Understanding of data architecture, structures, data modeling and database design and performance management
  • Good written and verbal communication skills
Proficiency/experience with the following is a plus:
  • In-depth understanding of Statistics
  • Finance, Mortgages, Bank Deposit Products
