Data Acquisition Engineer
8 hours ago
Job description

Contractor role; US-based company. We operate remotely; most of the Engineering team is in CET.

About Walkway
Walkway builds AI-driven revenue intelligence for tours and activities. Operators use our platform for real-time analytics, competitive benchmarks, and dynamic pricing. Our data team collects large-scale web and API data to power these insights.

The Role
We have a small, focused group that owns source coverage and freshness. The Data Acquisition Lead sets priorities and reviews complex fixes; the Data Engineer maintains schemas, pipelines, and SLAs. You’ll own day-to-day spider health and QA.

Your focus is 80 percent web data collection and spider reliability, and 20 percent light transformations when formats change so downstream tables stay consistent. You will keep pipelines healthy, support internal users, and run QA checks so data stays accurate at all times. This is an early-career role with significant room for growth.

What you will do

80 percent - Spiders and data collection
Build and maintain spiders and API collectors in Python/JavaScript; adapt quickly when sites change.
Handle HTTP basics: headers, cookies, sessions, pagination, rate limits, retries with backoff.
Use browser automation when needed: Playwright or Puppeteer for dynamic pages.
Triage and fix breakages: selectors, auth flows, captcha or antibot responses, proxy rotation.
Monitor runs and freshness; create alerts and simple dashboards; escalate when SLAs are at risk.
Write validation checks and source-level QA to prevent bad data from entering the warehouse.
Document playbooks so fixes are repeatable.

20 percent - Transformations, QA, and support
Adjust small Python or SQL transformations when a source output changes.
Reconcile row counts and key fields against benchmarks; raise and resolve data quality issues.
Collaborate with Data Engineers on schemas and idempotent loads into the warehouse.
Update DAGs or jobs when source formats change so downstream tasks run idempotently and on schedule.
Provide lightweight technical support to internal consumers.

Always
Follow legal and ethical guidelines for data collection; respect terms, privacy, and access controls.
Communicate clearly in English with engineers and non-technical stakeholders.

Our stack (you do not need all of it)
Node.js in JavaScript or TypeScript; async and await fundamentals.
Crawlee framework: PlaywrightCrawler, PuppeteerCrawler, HttpCrawler.
Browser automation: Playwright or Puppeteer.
HTTP-based crawling and DOM parsing: Cheerio.
Large-scale crawling: request queues, autoscaled concurrency, session pools.
Proxy providers: integration and rotation, residential or datacenter, country targeting, session stickiness.
GCP basics: Cloud Run or Cloud Functions, Pub/Sub, Cloud Storage, Cloud Scheduler.
Data: BigQuery or Postgres fundamentals, CSV or Parquet handling.

What you bring
Some hands-on scraping experience; personal projects or internships are fine.
Core web fundamentals: HTTP, headers and cookies, session handling, JSON APIs, simple auth flows.
Comfortable in Node.js and TypeScript or JavaScript; willing to learn browser automation and concurrency patterns.
Curiosity and high energy; you like chasing down failures and making things work again.
Adaptable in a fast-changing environment; comfortable prioritizing under guidance.
Experience with other web crawling frameworks, for example Scrapy, is a plus.
Schedule and orchestrate runs reliably using Cloud Scheduler and Airflow or Mage where appropriate, with clear SLAs and alerting.

Nice to have
Familiarity with antibot tactics and safe bypass strategies; rotating proxies; headless browsers.
Basic SQL; comfort reading or writing simple queries for QA.
Experience with GitHub Actions, Docker, and simple cost-aware choices on GCP.
Exposure to data quality checks or anomaly detection.

Your first 90 days
30 days: ship your first spider, add monitoring and a QA checklist, fix a real breakage end to end.
60 days: own a set of sources; reduce failure rate and mean time to repair; document playbooks.
90 days: propose a reliability or cost improvement; automate a repeat QA step.

Why Walkway
Real impact on a data product used by operators.
Ship quickly with a pragmatic, low-ego team; see your work move from concept to production fast.
Fully remote with EU and US overlap; a few team gatherings per year; travel covered.
Learn from senior engineers and grow toward data engineering or platform paths.

How to apply
Apply to this job offer and include in your resume links to a repo or code sample; if possible, add one example of a scraper you built and what it collected. If you are based in Europe, we would love to hear from you.