Data Platform Engineer | Fully Remote Work

Tokyo
Full-time
Full Remote
Famous Start-up
Own Products/Services
Industry
AI/NLP Company
IT Skills
Terraform, Node.js, Typescript, Python
Working hours
Salary
5 Million yen〜9 Million yen
Job Description
【Company Overview】
The company supports corporate transformation by leveraging cutting-edge Generative AI and Large Language Model (LLM) technologies. Its core strength is its technical capability to develop domestic LLMs from scratch. In May 2024, it released a Japanese-specialized LLM with 100 billion parameters, one of the largest in Japan. While many companies limit themselves to fine-tuning overseas models or using open-source software (OSS), the company designs and builds its models from the ground up. The model's performance, optimized for the Japanese language and business domains, stands at the highest level in the country. Furthermore, it has significantly reduced hallucinations and can be operated safely under the company's own management. Rather than being a mere "user of Generative AI," the company aims to be an "architect of AI itself," supporting Japan's industrial competitiveness through technology.

The company is expanding its business on the strength of this technology. It currently operates an AI agent for the manufacturing industry as a SaaS business and a business AI implementation support platform as a new PaaS venture. Since its founding, it has targeted enterprise companies and has already been adopted by over 300 organizations, including 30% of the Nikkei 225. Moving forward, the company plans to expand from manufacturing R&D to domestic enterprises, the Japanese business community as a whole, and eventually to global corporations.

In October 2024, it completed a Series D funding round of 4.5 billion yen, bringing its total funding to 8.8 billion yen. With its workforce reaching approximately 150 employees, the company is now seeking new talent to drive further growth.

【Current Challenges】
・Developing and operating applications and infrastructure capable of crawling, extracting content from, and distributing hundreds of millions of web-based documents.
・Building and maintaining data pipelines to analyze and distribute web-based literature (news, academic papers, patents) as well as clients' internal documents.
・Establishing a monitoring foundation to ensure the stable collection and continuous expansion of web-based document repositories.

【Responsibilities】
The company provides multiple products that support corporate decision-making and business transformation by collecting and structuring vast amounts of public information relevant to business activities (such as news, patents, academic papers, and web information), leveraging cutting-edge Generative AI technology.
In this position, you will be responsible for the design, development, and operation of web crawlers and document distribution pipelines, which serve as the common data distribution system underlying all these products. Because this system is the data foundation supporting the core value of the products, the role requires development with a strong focus on stability, scalability, and extensibility. While driving development as an engineer to solve the challenges listed above, you are also expected to lead the data platform team to maximize overall productivity.

■Specifically
・Developing and operating a data platform that distributes web news, patents, and academic papers using TypeScript and Python.
・Improving scalability and designing monitoring for distributed processing within a serverless environment.
*[Scope of Change] All duties related to development.

【Team Composition】
・Business Head (CEO): 1
・Head of Engineering: 1
・Product Engineers: 7
・AI Agent Engineers: 6
・Structuring Engineers: 3
・Data Engineers: 3
・Product Manager: 1
・Researcher: 1
・Customer Success (CS): 2
・Business Development (Biz): 8

【Development Environment】
[Programming Languages] Data Pipeline: Python / Web Crawler: TypeScript (Node.js)
[Containers] Docker
[IaC] Terraform
[Cloud] AWS
[Libraries] PySpark, Puppeteer
[AI Tools] Cursor, CodeRabbit, Devin

【Position Highlights】
By engaging in the development and operation of the document distribution system shared across all products, you can gain the following experience:
・Designing, developing, and operating large-scale distributed processing in a serverless environment, handling document data on a scale of hundreds of millions of records.
・Acquiring expertise in monitoring and operational design tailored for unstable external environments, while leveraging both web front-end and back-end knowledge through the development of web crawlers covering tens of thousands of sites.
・Committing to the business from a technical perspective by proactively proposing new development and improvements based on product requirements and customer challenges, in collaboration with PdMs and Customer Success teams.
Required Skills
【Required】
※Must meet all of the following
3+ years of engineering experience
Experience in application development using Python
End-to-end experience spanning requirements definition, design, implementation, and operations
At least one of the following experiences:
・Full-stack development/operations using serverless technologies (Lambda, ECS, Fargate, Step Functions, etc.), along with leadership experience
・Development/operations experience building data pipelines/ETL, workflow engines, or using distributed processing frameworks (e.g., Spark)

【Preferred】
Master's or doctoral degree in a technical field related to computer science
Experience leading team development
Experience developing and operating in-house services for customers
Experience developing and operating services for document search and recommendation
Broad knowledge and operational experience with serverless computing
Development/operational experience using Terraform

【Ideal Applicants】
Enjoys computer science and distributed processing
Enjoys following and verifying the latest theories and case studies
Enjoys not only tackling assigned tasks but also formulating questions independently and devising necessary solutions
Thinks about how to contribute to the product while communicating with the business side
Has a positive attitude and embraces challenges
Required Language Skills
Japanese Level
Business Level
English Level
None
Other Language Skills