























We are looking for a hands-on Data Automation Engineer. You will own the entire process, from large-scale acquisition of external data to automated workflows, ensuring that cleaned, structured data is delivered to our internal systems and Agents. Your work will directly inform the company’s decision-making by providing reliable access to critical business data from internal and third-party sources.
Data Acquisition (~30%):
• Design, develop, and maintain robust, scalable data pipelines. Sources are both internal and external, and include real-time feeds and batch loads as well as structured and unstructured data.
• Parse and extract structured data from HTML, JS-rendered pages (using headless browsers), APIs, PDFs, and other unstructured formats.
• Ensure data quality and reliability by building comprehensive monitoring, logging, and data freshness alerting mechanisms.
Workflow Automation & Orchestration (~70%):
• Use OpenClaw or a similar framework as the core orchestration layer to define, schedule, and manage chat-based and automated agentic workflows.
• Build automated pipelines that integrate databases, data warehouses, internal APIs, and reporting tools.
• Implement fault-tolerant workflow logic, including error handling, retry mechanisms, conditional branching, and graceful failure notification.
• Collaborate with data analysts to understand requirements and automate manual data extraction and processing tasks to improve operational efficiency.
• Create data architecture diagrams (schemas), data dictionaries, pipeline designs, and operational manuals for the systems under your responsibility.
• 1-5 years of hands-on experience in data engineering or automation-related fields.
• Hands-on experience in building, scheduling, and monitoring automation tasks using OpenClaw, Hermes Agent, or similar RPA/workflow orchestration tools (e.g., Apache Airflow, n8n, UI.Vision).
• In-depth understanding of the Chinese internet ecosystem, including the structure and access patterns of mainstream platforms.
• Experience with SQL databases (e.g., PostgreSQL, MySQL) and cloud storage (e.g., S3, OSS) for data storage and pipeline construction.
• Experience with Docker in containerized/cloud environments.
• Good English reading and writing skills, with the ability to communicate effectively with stakeholders.
• Knowledge of proxy service providers.
• English speaking ability is a plus, but not a hard requirement.
• Experience working in startups or small and medium-sized enterprises (SMEs), with the ability to work with high autonomy and solve problems pragmatically.
• Being based in Shenzhen is a plus.
• Primary Languages: Python / Go
• Crawler & Browser Automation: Playwright, Puppeteer, Selenium, BeautifulSoup, Scrapy
• Workflow Orchestration: OpenClaw, Apache Airflow
• Data Storage: PostgreSQL, S3 or similar products / Alibaba Cloud OSS
• Infrastructure: Docker, Git, Linux
• Proxy & Basic Services: Various residential/datacenter proxy networks
• Competitive compensation
• Flexible work arrangements (remote work policy: must be located in Shanghai, Shenzhen, or Beijing)
• Learning budget and time for personal skill development
Favorite Medium is a digital product design and engineering consulting firm with years of experience. Headquartered overseas, its team is distributed globally, with primary operations in South Korea, Japan, Hong Kong, and other regions. We specialize in helping businesses design, develop, and launch digital products from 0 to 1, covering product strategy, UI/UX design, software development, data engineering, AI, and web3.
About submitting resumes, inquiries, and interview details:
The entire interview process is conducted online. Once your resume is approved, we will email you to schedule an online interview.
If you are interested in this position, please email your resume as an attachment to [email protected]. For any other questions, feel free to reach us on WeChat: atomkwk.