Data Acquisition Engineer
4 weeks ago
Contractor role; US-based company. We operate remotely - most of the Engineering team is CET.

About Walkway
Walkway builds AI-driven revenue intelligence for tours and activities. Operators use our platform for real-time analytics, competitive benchmarks, and dynamic pricing. Our data team collects large-scale web and API data to power these insights.

The Role
We have a small, focused group that owns source coverage and freshness. The Data Acquisition Lead sets priorities and reviews complex fixes; the Data Engineer maintains schemas, pipelines, and SLAs. You'll own day-to-day spider health and QA. Your focus is 80 percent web data collection and spider reliability; 20 percent light transformations when formats change so downstream tables stay consistent. You will keep pipelines healthy, support internal users, and run QA checks so data stays accurate at all times. This is an early-career role with significant growth.

What you will do

80 percent - Spiders and data collection
- Build and maintain spiders and API collectors in Python/JavaScript; adapt quickly when sites change.
- Handle HTTP basics: headers, cookies, sessions, pagination, rate limits, retries with backoff.
- Use browser automation when needed: Playwright or Puppeteer for dynamic pages.
- Triage and fix breakages: selectors, auth flows, captcha or antibot responses, proxy rotation.
- Monitor runs and freshness; create alerts and simple dashboards; escalate when SLAs are at risk.
- Write validation checks and source-level QA to prevent bad data from entering the warehouse.
- Document playbooks so fixes are repeatable.

20 percent - Transformations, QA, and support
- Adjust small Python or SQL transformations when a source output changes.
- Reconcile row counts and key fields against benchmarks; raise and resolve data quality issues.
- Collaborate with Data Engineers on schemas and idempotent loads into the warehouse.
- Update DAGs or jobs when source formats change so downstream tasks run idempotently and on schedule.
- Provide lightweight technical support to internal consumers.

Always
- Follow legal and ethical guidelines for data collection; respect terms, privacy, and access controls.
- Communicate clearly in English with engineers and non-technical stakeholders.

Our stack (you do not need all of it)
- Node.js in JavaScript or TypeScript; async and await fundamentals.
- Crawlee framework: PlaywrightCrawler, PuppeteerCrawler, HttpCrawler.
- Browser automation: Playwright or Puppeteer.
- HTTP-based crawling and DOM parsing: Cheerio.
- Large-scale crawling: request queues, autoscaled concurrency, session pools.
- Proxy providers: integration and rotation, residential or datacenter, country targeting, session stickiness.
- GCP basics: Cloud Run or Cloud Functions, Pub/Sub, Cloud Storage, Cloud Scheduler.
- Data: BigQuery or Postgres fundamentals, CSV or Parquet handling.

What you bring
- Some hands-on scraping experience; personal projects or internships are fine.
- Core web fundamentals: HTTP, headers and cookies, session handling, JSON APIs, simple auth flows.
- Comfortable in Node.js and TypeScript or JavaScript; willing to learn browser automation and concurrency patterns.
- Curiosity and high energy; you like chasing down failures and making things work again.
- Adaptable in a fast-changing environment; comfortable prioritizing under guidance.
- Experience with other web crawling frameworks, for example Scrapy, is valued and a plus.
- Schedule and orchestrate runs reliably using Cloud Scheduler and Airflow or Mage where appropriate, with clear SLAs and alerting.

Nice to have
- Familiarity with antibot tactics and safe bypass strategies; rotating proxies; headless browsers.
- Basic SQL; comfort reading or writing simple queries for QA.
- Experience with GitHub Actions, Docker, and simple cost-aware choices on GCP.
- Exposure to data quality checks or anomaly detection.

Your first 90 days
- 30 days: ship your first spider, add monitoring and a QA checklist, fix a real breakage end to end.
- 60 days: own a set of sources; reduce failure rate and mean time to repair; document playbooks.
- 90 days: propose a reliability or cost improvement; automate a repeat QA step.

Why Walkway
- Real impact on a data product used by operators.
- Ship quickly with a pragmatic, low-ego team; see your work move from concept to production fast.
- Fully remote with EU and US overlap; a few team gatherings per year; travel covered.
- Learn from senior engineers and grow toward data engineering or platform paths.

How to apply
Apply to this job offer and add in your resume links to a repo or code sample; if possible one example of a scraper you built and what it collected. If you are based in Europe, we would love to hear from you.
-
Data Acquisition Engineer
4 weeks ago
Madrid, Spain - Walkway - Full-time
Contractor role; US-based company. We operate remotely - most of the Engineering team is CET. About Walkway: Walkway builds AI-driven revenue intelligence for tours and activities. Operators use our platform for real-time analytics, competitive benchmarks, and dynamic pricing. Our data team collects large-scale web and API data to power these insights. The...
-
Senior Software Engineer
1 week ago
Madrid, Spain - Clarity AI - Full-time
Senior Software Engineer - Data Platform role at Clarity AI.
-
Data Engineer
2 days ago
Madrid, Spain - CIVIR - Full-time
**Outcome**: implementation of an end-to-end data solution
Tasks
**Main responsibilities**:
- Data acquisition and transformation.
- Develop, construct, test, and maintain data architectures.
- Prepare data for modelling.
- Discover tasks that can be automated.
- Identify ways to improve data reliability, efficiency, and quality.
**Requirements**:
- ...
-
Software Engineer for data acquisition
2 weeks ago
Madrid, Spain - DNV GL - Full-time
Data Acquisition of production data from different wind farms and other renewable assets. All prospective candidates should read the following details of this job carefully before applying. Execute IT and SCADA projects for all Wind OEMs. Managing, improving and organizing enterprise data. Combine raw information from different...
-
Data Engineer
2 weeks ago
Madrid, Spain - Page Personnel España - Full-time
3 years of experience as a Data Engineer. CI/CD and GCP certification are a plus. Fully remote | Career growth. Madrid-based consultancy dedicated to developing technology products. Career and professional development opportunities. 3 years of...
-
Software Engineer for data acquisition
2 days ago
Madrid, Spain - DNV GL - Full-time
Data Acquisition of production data from different wind farms and other renewable assets. Execute IT and SCADA projects for all Wind OEMs. Managing, improving and organizing enterprise data. Combine raw information from different sources to create consistent and machine-readable datasets. Translate requirements and designs into functional data pipelines...
-
Talent Acquisition
4 days ago
Madrid, Spain - Elevate| Agence Data & Technologies Marketing - Full-time
# Talent Acquisition - Madrid - VIE
## Elevate is hiring!
* Audience measurement: digital analytics, tracking, ...
* Customer experience personalization: A/B testing, personalization, e-merchandising,...
* Advertising campaign optimization: MMM, marketing attribution/contribution...
* Customer segmentation and scoring: RFM, clustering, machine...
-
GCP Data Engineer
2 weeks ago
Madrid, Spain - NTT DATA, Inc. - Full-time
NTT DATA is all of the people who make it up: a team of more than 190,000 professionals, as diverse as the more than 50 countries where we operate and the sectors in which we work: telecommunications, financial institutions, industry, utilities, energy, public administration, and healthcare....