Senior Data Engineer (Python)
Proxify
Job Description
<h2><strong>About us:</strong></h2>
<p> </p>
<p>Talent has no borders. Proxify's mission is to connect top developers around the world with the opportunities they deserve. So, it doesn't matter where you are; we are here to help you fast-track your independent career in the right direction. 🙂</p>
<p>Since our launch, Proxify's developers have successfully worked with 1200+ happy clients to build their products and growth features. 5000+ talented developers trust Proxify and its network to fulfill their dreams and objectives.</p>
<p>Proxify is shaped by a global network of supportive, talented developers interested in remote full-time jobs. Our Glassdoor (4.5/5) and Trustpilot (4.8/5) ratings reflect the trust developers place in us and our commitment to our members' success.</p>
<p> </p>
<h3><strong>The Role:</strong></h3>
<p> </p>
<p>We are looking for a Senior Data Engineer to architect and scale the data foundations for one of our high-growth client products. The ideal candidate is a Python expert who treats data infrastructure as software, building CI/CD, unit testing, and observability into every layer of the modern data stack. You are a perfect candidate if you are growth-oriented, you love what you do, and you enjoy working on new ideas to develop exciting products.</p>
<p> </p>
<h3><strong>What we’re looking for:</strong></h3>
<p> </p>
<ul>
<li>
<p>5+ years of experience building complex data processing applications using Python (Pandas, PySpark, or Dask).</p>
</li>
<li>
<p>Advanced SQL skills for complex transformations, window functions, and query optimization in cloud warehouses.</p>
</li>
<li>
<p>Deep experience with dbt (data build tool) for managing the T in ELT, including documentation and testing.</p>
</li>
<li>
<p>Proven experience with Apache Airflow, Prefect, or Dagster for managing complex dependency graphs.</p>
</li>
<li>
<p>Hands-on experience with Snowflake, BigQuery, or AWS Redshift.</p>
</li>
<li>
<p>Strong understanding of Dimensional Modeling (Star/Snowflake schema) and Data Vault 2.0.</p>
</li>
<li>
<p>Experience with Git, Docker, and implementing CI/CD for data pipelines.</p>
</li>
</ul>
<p> </p>
<h3><strong>Nice-to-Have:</strong></h3>
<p> </p>
<ul>
<li>
<p>Experience building Real-time Pipelines using Kafka or Flink.</p>
</li>
<li>
<p>Familiarity with Data Contracts and Data Quality frameworks (Great Expectations, Monte Carlo).</p>
</li>
<li>
<p>Knowledge of Vector Databases (Pinecone, Milvus) for AI/LLM applications.</p>
</li>
<li>
<p>Infrastructure as Code (Terraform) experience.</p>
</li>
</ul>
<p> </p>
<h3><strong>Responsibilities:</strong></h3>
<p> </p>
<ul>
<li>
<p>Build and maintain scalable, automated ELT/ETL pipelines that provide a "single source of truth" for the organization.</p>
</li>
<li>
<p>Implement rigorous automated testing and monitoring to ensure data integrity and reliability.</p>
</li>
<li>
<p>Optimize warehouse storage and compute costs while reducing pipeline latency.</p>
</li>
<li>
<p>Pa