Data Collection Engineer

3 days ago


Remote Spain Centric Software Full-time

LOCATION
Candidates must be legally based in Spain due to employment and compliance regulations.

ABOUT US:
In today's complex retail landscape, characterized by economic fluctuations and supply chain challenges, consumers are more discerning, often comparing prices and seeking compelling products. Centric Pricing addresses this by enabling retailers and brands to deeply understand the competitive landscape post-product launch. By leveraging AI-driven insights, businesses can make informed decisions quickly, aligning product development, sourcing, costing, and pricing strategies with real-time market demands. 

The integration of Centric Pricing into Centric Software's platform provides an end-to-end solution that combines intelligence and execution capabilities. This empowers brands and retailers to optimize product availability, reduce time to market, and enhance product quality, ultimately improving the consumer experience and driving profitability. 

We are a key innovation partner for iconic and emerging brands across the world. Our platform can analyze information from more than 1,000 retailers, processing data from more than brands, tracking millions of products.
 

INTRODUCTION:

As a Data Collection Engineer, you will be instrumental in building scalable and high-quality data collection systems, collaborating across teams to drive innovation and maintain the robustness of our data pipeline. 


WHAT YOU'LL DO:

Design and Build Robust Web Crawlers:

Develop and maintain spiders for high-scale data extraction using Scrapy. Ensure spiders are modular, reusable, and easy to maintain with components such as loaders, middlewares, and pipelines. Apply advanced techniques to bypass anti-bot mechanisms, including rotating proxies, CAPTCHA-solving strategies, and fingerprinting.

Enhance and Maintain Infrastructure:

Build scalable CI/CD pipelines for automated testing, deployment, and monitoring of spiders. Leverage tools like Scrapyd for centralized spider scheduling and lifecycle management. Ensure efficient parallelization and cloud deployment for high-throughput crawling. 

Code Quality and Consistency:

Uphold coding standards and implement consistent practices across teams. Conduct thorough code reviews and mentor junior engineers on clean code principles. Maintain version control and detailed change logs for spider development. 

Monitoring, Maintenance & Reliability:

Integrate performance monitoring systems to ensure real-time alerts and health checks. Schedule periodic spider audits to handle structure changes and improve reliability. Troubleshoot failures and optimize resource usage (CPU/network) for crawling efficiency. 

Data Integrity and Accuracy:

Build robust data validation mechanisms to guarantee quality outputs. Collaborate with internal consumers to ensure data collected aligns with business requirements. Continuously track data anomalies and automate recovery strategies. 

Collaboration and Knowledge Sharing:

Work cross-functionally with product, engineering, and other data teams. Promote a culture of documentation, onboarding tools, and internal knowledge bases. Contribute to training initiatives, helping the team stay current on scraping techniques and technologies.



DESIRED TECHNICAL SKILLS AND EXPERIENCE: 

Core requirements:

Comfort with Git workflows, code reviews, and CI/CD pipelines. Experience with cloud infrastructure like AWS. Experience with monitoring/observability systems like Grafana and Sentry. Knowledge of the Web environment (model, standards, DOM, Request-Response, Cookies, JavaScript, Browsers, Headers, XHR, etc.). Familiarity with TLS/SSL, TCP/IP stack, and low-level web networking is a strong plus.

Bonus / Senior-Level expectations:

Proficient in designing fault-tolerant systems and deploying them at scale. Familiarity with containerized deployments. Proficient in developing scalable web crawlers and data pipelines using Python and Scrapy. Experience building resilient scraping systems across diverse web architectures. Prior experience mentoring or leading junior developers.

Soft Skills and Work Ethic: 

Excellent communication skills in English, both written and spoken. A collaborative mindset with a proactive approach to knowledge sharing. Strong analytical thinking and problem-solving abilities. Commitment to continuous improvement, mentoring, and agile team dynamics. Remain up-to-date with technology trends to keep our software as innovative as possible. 

Centric Software provides equal employment opportunities to all qualified applicants without regard to race, sex, sexual orientation, gender identity, national origin, color, age, religion, protected veteran or disability status or genetic information.

