Senior Data Engineer

Development Team | Buffalo, NY, United States

We are seeking a highly motivated team member with solid expertise in utilizing “Big Data” techniques to develop and optimize internal processes for collecting, warehousing and transforming ever-growing data pipelines. Our continually-growing data repositories include large (100 TB+) amounts of structured and unstructured data from external sources (customers, public datasets) and internally generated structured data.

By taking ownership of our company-wide ability to ingest, QA, and access data, you will directly enable development of new products and increase the productivity of our mechanical engineers, business analysts, software engineers, and delivery team.

Responsibilities

  • Take ownership of improving our company-wide ability to ingest, QA, and access data through efficient deployment of a variety of “Big Data” and conventional techniques/technologies.
  • Manage the architecture and workflow of various data repositories/databases (Time Series DB, Alarms DB, Met Tower data DB, and Work Order DB).
  • Develop, cloud-deploy and maintain automated pipelines to clean and load large amounts of time series data and alarm data received from customers into our databases.
  • Maintain current legacy pipelines to clean and load large amounts of time series data received from customers into our databases and lead efforts to assimilate these legacy pipelines into new processes.
  • Support individual teams on tight deadlines with custom data loading and extraction needs, while keeping an eye out for how to handle these new use cases more swiftly by improving existing automated processes.
  • Work closely with mechanical engineers, business analysts, developers, UI teams and our delivery teams to integrate or replace manual data QA systems with automated, UI-enabled platforms.
  • Communicate data workflows through wiki documentation and team training.
  • Provide regular and consistent mentoring to other software engineers on the development, deployment, and maintenance of big data systems.

Requirements

  • Clear communication skills with people from different backgrounds (software, other engineering, business).
  • Strong problem-solving and troubleshooting skills, with the ability to provide examples.
  • Experience maintaining data and metadata architecture (bonus points for designing it).
  • Experience maintaining data pipelines (bonus points for designing them).
  • Experience cleaning structured and unstructured data.
  • Experience deploying production applications or workflows to a public cloud (bonus points for AWS).
  • Recent experience with one or more “big data” storage solutions (Hadoop, Kafka, MongoDB, Cassandra; bonus points for Redshift).
  • Experience with data security management.
  • Experience working on short-term data management assignments (for example, data QA or business intelligence).
  • Experience using ETL processes.
  • Experience with SQL programming and data warehousing systems in an enterprise setting.
  • Experience with shell scripts (bash, csh, etc.) in a Linux cloud environment.

Preferred

  • Experience with the OPC SCADA communications protocol and SCADA/HMI historians (e.g. PI).
  • Proficiency in analytics-driven languages/platforms (e.g. Python, Perl, R).
  • Experience working with wind farm SCADA systems.
  • Experience managing AWS infrastructure for Redshift, RDS, EC2.
  • Familiarity writing AWS Lambda functions.
  • Familiarity with time-series and NoSQL databases.
  • Proficiency in as many of the following as possible: Clojure, Elixir, C++, and Python.
  • Git proficiency: branching, merging, pull requests, etc.

About the Company

Sentient Science Corporation is a leader in predicting the durability and reliability of complex rotating equipment. Sentient Science’s cloud-based software, DigitalClone Live, manages the health and life extension of rotating mechanical equipment using a materials science-based approach. The technology models and simulates components and full systems within the aerospace, rail and wind energy industries. DigitalClone applies materials science and physics-based modeling techniques to simulate rotating equipment under representative operational loads and conditions. Operators optimize their maintenance cycles and lower the pre- and post-installation costs of “rotating” systems through life extension actions. Equipment manufacturers use the software for design tradeoff and sensitivity analysis and to prove their life claims to the market. Sentient Science was recognized by the White House in 2014 with the Tibbetts Award and received the Bloomberg New Energy Finance Pioneers Award in 2016.

Please submit your résumé and cover letter to [email protected].

(We DO NOT accept résumés from third-party recruitment agencies.)