AI Engineer (Document Structuring / LLM New Business) | Flextime & Hybrid Work
Tokyo
Fulltime
Full Remote
Famous Start-up
Own Products/Services
Industry
AI/NLP Company
IT Skills
Python
Working hours
Salary
7 Million yen〜11 Million yen
Job Description
【Company Overview】
The company supports corporate transformation by leveraging cutting-edge Generative AI and LLM technology.
Its core strength lies in the technical prowess required to develop domestic LLMs from scratch. In May 2024, the company released a Japanese-specialized LLM with 100 billion parameters, one of the largest in Japan.
While many companies limit themselves to fine-tuning overseas models or utilizing OSS, the company designs and builds its models from the ground up. The resulting model is optimized for the Japanese language and business domains, and its performance ranks among the highest of domestic models. It also significantly reduces hallucinations and can be operated safely under in-house management.
Rather than being a mere "user of Generative AI," the company aims to be an "AI creator" that supports Japan's industrial competitiveness through technology.
Leveraging this technical strength, the company currently operates an AI agent for the manufacturing industry as its SaaS business, and a business AI implementation support platform as its new PaaS venture.
Targeting enterprise clients since its founding, the company’s solutions have already been adopted by over 300 companies, including 30% of Nikkei 225 companies. Moving forward, starting with R&D in the manufacturing sector, the company plans to expand its reach to domestic enterprises, the broader Japanese market, and global corporations.
In October 2024, the company completed a Series D funding round of 4.5 billion yen, bringing its cumulative funding to 8.8 billion yen. With its workforce now at 150 employees, the company is recruiting new talent, including you, to drive further growth!
【Current Challenges】
In the new LLM business launched in 2024, there is an urgent need to utilize the vast amounts of "Excel documents" held by enterprise clients. However, converting Excel’s distinctive and complex features, such as merged cells, multi-tier headers, and visual layouts, into a structured format that LLMs can interpret with high precision is technically demanding. Standard parsers cannot handle these complexities.
Currently, the company lacks sufficient engineering resources to focus exclusively on this critical and challenging theme of "Excel Structuring." We need someone who can implement parsing logic based on a deep understanding of these specifications and take responsibility for improving accuracy. This is where you come in.
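To make the challenge above concrete, here is a minimal Python sketch of two of the named problems: merged cells and multi-tier headers. It is an illustration only, not the company's actual method. It assumes the sheet has already been loaded into a list of rows, and that merged ranges are given as A1-style references (the form stored in the `ref` attribute of `mergeCell` elements in Office Open XML worksheets). All function names (`fill_merged`, `flatten_headers`, `to_records`) are hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

Grid = List[List[Optional[str]]]


def col_to_index(col: str) -> int:
    """Convert a column label such as 'A' or 'AB' to a zero-based index."""
    idx = 0
    for ch in col:
        idx = idx * 26 + (ord(ch) - ord("A") + 1)
    return idx - 1


def parse_range(ref: str) -> Tuple[int, int, int, int]:
    """Parse 'A1:B2' into zero-based (top, left, bottom, right).

    Assumption: plain relative references only (no '$' or sheet names).
    """
    m = re.fullmatch(r"([A-Z]+)(\d+):([A-Z]+)(\d+)", ref)
    if m is None:
        raise ValueError(f"unsupported range reference: {ref!r}")
    c1, r1, c2, r2 = m.groups()
    return int(r1) - 1, col_to_index(c1), int(r2) - 1, col_to_index(c2)


def fill_merged(grid: Grid, merged_refs: List[str]) -> Grid:
    """Copy the top-left value of each merged range into every cell it covers."""
    out = [row[:] for row in grid]
    for ref in merged_refs:
        top, left, bottom, right = parse_range(ref)
        value = out[top][left]
        for r in range(top, bottom + 1):
            for c in range(left, right + 1):
                out[r][c] = value
    return out


def flatten_headers(grid: Grid, header_rows: int, sep: str = " / ") -> List[str]:
    """Join the header tiers of each column into one name, skipping repeats."""
    names = []
    for c in range(len(grid[0])):
        parts: List[str] = []
        for r in range(header_rows):
            v = grid[r][c]
            if v is not None and (not parts or parts[-1] != v):
                parts.append(v)
        names.append(sep.join(parts))
    return names


def to_records(grid: Grid, merged_refs: List[str], header_rows: int) -> List[Dict[str, Optional[str]]]:
    """Turn a sheet with merged, multi-tier headers into flat records."""
    filled = fill_merged(grid, merged_refs)
    names = flatten_headers(filled, header_rows)
    return [dict(zip(names, row)) for row in filled[header_rows:]]


if __name__ == "__main__":
    sheet = [
        ["Product", "Sales", None],
        [None, "Q1", "Q2"],
        ["Widget", "10", "12"],
    ]
    print(to_records(sheet, ["A1:A2", "B1:C1"], header_rows=2))
```

Flat records with fully qualified column names such as "Sales / Q1" keep the header context attached to each value, which is one plausible way to hand tabular data to an LLM. Real workbooks also involve visual layout cues that this sketch ignores.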
【Responsibilities】
As a Backend Engineer for the company’s new SaaS business leveraging LLMs, you will lead the development and implementation of a document structuring engine, with a particular focus on Excel files.
In this position, you will build robust, production-level data processing pipelines while utilizing your expertise in Machine Learning (ML) and Data Science.
■Specifically
・You will develop parsing logic that structures complex table layouts logically, grounded in a deep understanding of Excel specifications (such as Office Open XML).
・You will build accuracy evaluation environments for structured data and drive continuous logic improvements.
・You will implement and operate backend APIs and data processing pipelines using Python.
・You will lead the data generation process for RAG (Retrieval-Augmented Generation) in collaboration with ML engineers and data scientists.
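As a rough illustration of what an accuracy evaluation for structured data could look like, the following Python sketch scores predicted records against hand-labeled gold records at the cell level. The metric (precision, recall, and F1 over (row, header, value) triples) and the names `cell_triples` and `cell_f1` are assumptions chosen for this example, not the company's evaluation method.

```python
from collections import Counter
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def cell_triples(records: List[Record]) -> Counter:
    """Count (row_index, header, value) triples across all records."""
    triples: Counter = Counter()
    for i, rec in enumerate(records):
        for header, value in rec.items():
            triples[(i, header, value)] += 1
    return triples


def cell_f1(predicted: List[Record], gold: List[Record]) -> Dict[str, float]:
    """Cell-level precision, recall, and F1 of predicted vs. gold records."""
    p = cell_triples(predicted)
    g = cell_triples(gold)
    # Multiset intersection: cells whose row, header, and value all match.
    tp = sum((p & g).values())
    precision = tp / sum(p.values()) if p else 0.0
    recall = tp / sum(g.values()) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    gold = [{"Product": "Widget", "Sales / Q1": "10", "Sales / Q2": "12"}]
    # A parser that lost the second header tier for the last column.
    pred = [{"Product": "Widget", "Sales / Q1": "10", "Sales": "12"}]
    print(cell_f1(pred, gold))
```

Because a header mistake counts as a wrong cell, a metric like this penalizes exactly the failures that merged cells and multi-tier headers tend to cause, which makes it a reasonable signal to track while iterating on parsing logic.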
【Team Composition】
<PaaS Unit>
Business Lead (CEO) 1
Development Lead 1
Product Engineers 7
RAG Engineers 5
Machine Learning Engineers 3
Product Manager 1
Researcher 1
Customer Support 1
Business Development 3
New Business Planning 5
【Development Environment】
[Languages/Frameworks]
Python, TypeScript, Vue.js, Node.js
[Containers]
Docker
[Infrastructure as Code]
Terraform
[Cloud Platforms]
AWS, Azure
【Position Highlights】
・Intellectual Challenges in the Depths of File Specifications: You can immerse yourself in "Deep Technical Development" by diving into specifications like Office Open XML at a fundamental level, going far beyond the mere use of existing libraries.
・Integrating Academic ML Expertise with Real-World Implementation: You can elevate mathematical thinking into robust, production-level code, making full use of both your ML research background and your practical engineering skills.
・Solving the "Last Mile" of Enterprise RAG: You will drive corporate DX by providing solutions for Excel data that current LLMs cannot fully utilize, effectively bridging the final gap in enterprise-grade Retrieval-Augmented Generation.
Required Skills
【Required】
※Meet all of the following
Completion of a graduate program in science or engineering, or equivalent knowledge of mathematics and algorithms
Experience in backend development for web applications or data processing platforms using Python (3+ years preferred)
Experience implementing logic that handles complex data structures
e.g., experience generating internal or experimental data and shaping it into a usable format
【Preferred】
Experience researching and applying machine learning (ML) and data science during undergraduate or graduate studies
Advanced data processing experience using libraries such as pandas and openpyxl
Basic knowledge of Computer Vision (CV) (useful for understanding layout analysis)
Experience implementing ETL processes as a data engineer
Deep knowledge of file specifications such as Office Open XML
【Ideal Applicants】
Has an ML or mathematical background and is committed to an engineering (implementation) role
Finds enjoyment in unraveling and hacking complex data and document structures
Able to translate research and development tasks into highly maintainable product code
Seeks to collaborate with the business side to pursue data structuring that delivers customer value
Required Language Skills
Japanese Level
Business Level
English Level
Business Level
Other Language Skills