Information Technology

Remote Senior Data Engineer

Permanent

Friedman Williams

Remote Senior Data Engineer (NY/NJ)

Job ID: 20400

We’re seeking a Senior Data Engineer to design, build, and operate mission-critical data pipelines and platforms. You will lead development of Apache Airflow (Astronomer) orchestration, including DAG creation with custom Python code (operators, hooks, plugins) and triggering Boomi integrations to coordinate and monitor data workflows across systems. You’ll design dbt models and tests aligned to a Medallion Architecture (Bronze/Silver/Gold), and ensure high levels of quality, reliability, observability, and performance in a Snowflake environment. This role partners closely with analytics, product, and business stakeholders and includes ownership of L1/production support, proactive monitoring, and rapid remediation.

U.S. citizens or Green Card holders only.

Key Responsibilities
Data Pipeline Architecture & Design
· Design scalable, testable ELT pipelines structured around the Medallion Architecture with clear layer contracts.
· Model normalized schemas in dbt (staging/3NF → marts) and standardized transformation patterns (macros, packages).
· Define data contracts, SLAs/SLOs, and versioned interfaces between layers (see the contract sketch after this list).
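As a flavor of what a versioned interface between layers can look like, here is a minimal Python sketch of a layer contract; the table name, columns, and SLA value are illustrative assumptions, not details of this role's actual platform.

from dataclasses import dataclass


@dataclass(frozen=True)
class LayerContract:
    """A versioned promise a Medallion layer makes to downstream consumers.

    Everything below (silver.orders, the columns, the SLA) is a placeholder
    used for illustration only.
    """
    name: str                          # e.g. "silver.orders"
    version: str                       # semantic version of the contract
    required_columns: tuple[str, ...]  # columns downstream models may rely on
    freshness_sla_minutes: int         # max allowed staleness before breach


# Hypothetical contract for a Silver-layer table feeding Gold marts.
orders_contract = LayerContract(
    name="silver.orders",
    version="1.2.0",
    required_columns=("order_id", "customer_id", "order_ts", "amount_usd"),
    freshness_sla_minutes=60,
)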
Workflow Orchestration (Airflow/Astronomer)
· Author, schedule, and operate complex Airflow DAGs; implement custom operators, sensors, and hooks for REST/SaaS and databases.
· Integrate with Boomi (iPaaS) to trigger pipelines and exchange run states (e.g., webhooks, APIs); build robust retries, idempotency, and backfills (see the DAG sketch after this list).
· Deploy and manage Airflow via Astronomer, including environment promotion, connections, secrets, and runtime upgrades.
· Instrument alerting, logging, SLAs, and on-failure callbacks; drive incident response and root cause analysis.
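A minimal sketch of the kind of DAG this work involves: a scheduled Airflow task that triggers a Boomi process over REST, with retries and a timeout so failures surface through normal alerting. The Boomi URL, schedule, and DAG id are assumptions made up for illustration.

from datetime import datetime, timedelta

import requests
from airflow import DAG
from airflow.operators.python import PythonOperator

# Placeholder endpoint -- a real deployment would pull this from an Airflow
# connection or secrets backend, not a module-level constant.
BOOMI_TRIGGER_URL = "https://boomi.example.invalid/ws/rest/trigger/nightly_load"


def trigger_boomi_process(**context):
    """Fire the (hypothetical) Boomi execution endpoint and raise on failure so
    Airflow's retry and alerting machinery takes over."""
    resp = requests.post(
        BOOMI_TRIGGER_URL,
        json={"airflow_run_id": context["run_id"]},
        timeout=30,
    )
    resp.raise_for_status()


default_args = {
    "owner": "data-engineering",
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="boomi_nightly_load",
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",   # assumed nightly run
    catchup=False,
    default_args=default_args,
):
    PythonOperator(
        task_id="trigger_boomi_process",
        python_callable=trigger_boomi_process,
    )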
Data Modeling, Quality & Governance
· Build and orchestrate dbt models; enforce standards (naming, sources, snapshots, incremental strategies, tests).
· Implement data quality at multiple layers: dbt tests (schema, singular/custom), Great Expectations/dbt-utils/dbt-audit-helper suites, anomaly checks, and audit trails (see the check sketch after this list).
· Partner with data governance on ownership, lineage, RBAC, PII handling, and compliance.
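Alongside dbt and Great Expectations suites, a lightweight custom check can run as an Airflow task. This sketch assumes the snowflake-connector-python client; the table, column, and connection details are placeholders.

import snowflake.connector


def check_no_null_keys(table: str = "SILVER.ORDERS", key_column: str = "ORDER_ID") -> None:
    """Fail the calling task if the (hypothetical) table has NULL business keys."""
    # In practice, credentials come from an Airflow connection or secrets backend.
    conn = snowflake.connector.connect(
        account="my_account",       # placeholder
        user="dq_service_user",     # placeholder
        password="***",             # placeholder
        warehouse="TRANSFORM_WH",   # placeholder
    )
    try:
        cur = conn.cursor()
        cur.execute(f"SELECT COUNT(*) FROM {table} WHERE {key_column} IS NULL")
        null_count = cur.fetchone()[0]
        if null_count:
            # Raising surfaces the breach through normal Airflow alerting.
            raise ValueError(f"{table}.{key_column} has {null_count} NULL values")
    finally:
        conn.close()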

Performance & Cost Optimization
· Tune Snowflake performance (warehouse sizing, micro-partitioning, clustering, caching), optimize queries, and enforce cost controls (see the cost-report sketch after this list).
· Benchmark pipelines, identify bottlenecks, and implement parallelism, partitioning, and efficient file/volume strategies.
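One way the cost-controls item can be made concrete is a recurring report over Snowflake's ACCOUNT_USAGE metering view. Connection details and the credit threshold below are illustrative assumptions.

import snowflake.connector

# Credits consumed per warehouse over the trailing week, most expensive first.
COST_QUERY = """
    SELECT warehouse_name, SUM(credits_used) AS credits
    FROM snowflake.account_usage.warehouse_metering_history
    WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
    GROUP BY warehouse_name
    ORDER BY credits DESC
"""


def report_weekly_credits(threshold: float = 100.0) -> None:
    conn = snowflake.connector.connect(
        account="my_account", user="cost_monitor", password="***"  # placeholders
    )
    try:
        for warehouse_name, credits in conn.cursor().execute(COST_QUERY):
            flag = "  <-- review" if credits > threshold else ""
            print(f"{warehouse_name}: {credits:.1f} credits{flag}")
    finally:
        conn.close()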
CI/CD & DevEx
· Build GitHub-based CI/CD for Airflow DAGs and dbt projects (linting, unit/integration tests, environment promotion, artifact versioning); see the DAG import test sketch after this list.
· Champion engineering hygiene: code reviews, testing pyramids, IaC for data where applicable, and high-quality documentation.
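A common CI pattern for the unit/integration tests mentioned above is importing the DAG bag inside pytest so broken DAGs fail the build; the dags/ path and owner convention are assumptions about repo layout.

from airflow.models import DagBag


def test_dags_import_without_errors():
    # Parse the DAGs folder the same way the scheduler would.
    dag_bag = DagBag(dag_folder="dags/", include_examples=False)
    assert not dag_bag.import_errors, f"DAG import errors: {dag_bag.import_errors}"


def test_every_dag_declares_an_owner():
    dag_bag = DagBag(dag_folder="dags/", include_examples=False)
    for dag_id, dag in dag_bag.dags.items():
        assert dag.default_args.get("owner"), f"{dag_id} is missing an owner"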
Observability & Support
· Establish end-to-end observability (task metrics, data freshness, row-level validations, lineage); see the freshness-check sketch after this list.
· Own L1 & production support with clear escalation paths and on-call participation; track incidents and preventive actions.
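For the data-freshness piece of observability, a scheduled check like the one below can back an SLA alert; the table, timestamp column (assumed TIMESTAMP_NTZ stored in UTC), and two-hour SLA are illustrative assumptions.

from datetime import datetime, timedelta

import snowflake.connector


def check_freshness(table: str = "GOLD.FCT_ORDERS",
                    ts_column: str = "LOADED_AT",
                    sla: timedelta = timedelta(hours=2)) -> None:
    """Fail if the (hypothetical) table has not loaded within its freshness SLA."""
    conn = snowflake.connector.connect(
        account="my_account", user="observability", password="***"  # placeholders
    )
    try:
        cur = conn.cursor()
        cur.execute(f"SELECT MAX({ts_column}) FROM {table}")
        last_loaded = cur.fetchone()[0]   # assumed TIMESTAMP_NTZ in UTC
        lag = datetime.utcnow() - last_loaded
        if lag > sla:
            # Raising here fails the monitoring task and pages on-call.
            raise RuntimeError(f"{table} is {lag} stale; freshness SLA is {sla}")
    finally:
        conn.close()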
Collaboration & Leadership
· Work in Agile ceremonies (backlog grooming, sprint planning, demos, retros).
· Mentor peers, drive design reviews, and communicate complex topics to non-technical audiences.

Required Skills & Experience
· 8+ years in data engineering / data platform roles, with 5+ years of building and operating production pipelines.
· Expert in Apache Airflow (DAG design, custom operators/hooks, backfills, SLAs, retries, failure handling) and Astronomer deployment/ops.
· Proven Python expertise in Airflow (plugins, packaging, unit tests, typing, logging) and integration work.
· Hands-on Boomi experience integrating with orchestration (triggering runs, callbacks, API/webhook patterns).
· Deep dbt knowledge (modeling, macros, packages, dbt tests & audits, sources/snapshots, exposures).
· Strong Snowflake skills (warehouse sizing, performance tuning, security/RBAC, tasks/streams, cost governance).
· Solid Git/GitHub workflows and CI/CD for data (GitHub Actions or similar) including automated testing and promotions.
· Track record implementing observability and data quality at scale (freshness SLAs, test coverage, lineage, run health).
· Excellent communication, documentation, and stakeholder partnership; self-starter and systematic problem solver.
· Industry experience in legal, professional services, healthcare, or finance (compliance-heavy environments).

Preferred Qualifications
· Experience with data mesh/domain-driven data ownership and product thinking.
· Familiarity with data cataloging/metadata tools (e.g., OpenLineage, DataHub, Collibra, Alation, Monte Carlo).
· Cloud experience with Snowflake and at least one major cloud (Azure, AWS, or GCP).
· Knowledge of regulatory frameworks and data security best practices (e.g., HIPAA, SOC 2, SOX) relevant to legal/healthcare/finance.
· Experience with Linux server setup and configuration is a plus.

What Success Looks Like (90–180 Days)
· 90 days: Stable Astronomer environments, standardized DAG patterns, dbt conventions in place, and baseline DQ tests covering critical tables.
· 180 days: Boomi-to-Airflow orchestration fully operational, Gold-layer marts powering BI with documented SLAs, and measurable reductions in failed runs and compute costs.

For immediate consideration, please submit your resume and cover letter to Todd Grossman at tgrossman@friedmanwilliams.com.
