Data Infrastructure Engineer / Product Div / Remote & Flextime

Contract type
Full-time (Permanent)
Industry
Web / Internet Service
Company Type
Japanese Company
Location
Tokyo
Salary
8–12 million yen
Japanese Level
Business Level
English Level
None
Other Language Skills
None

Job Description

In order to transform the sales process from one based on intuition and perseverance to one that is data-driven, you will be involved in building a data lake and data warehouse to make use of the tens of thousands of hours of video, audio, and transcribed text data accumulated every month, as well as building the production infrastructure that follows each PoC (Proof of Concept). Because their online sales system has been adopted by more than 3,000 companies in Japan, they have obtained raw data from the sales field that did not previously exist anywhere. Imagine how much value companies could create if they could analyze recordings of their salespeople talking directly to customers and use them to build data-driven organizations. As a pioneer, you can create value that will surprise society by applying big data to the sales domain, the last mile the world has overlooked.

Specific Job Description

Construction of data lakes and data warehouses
You will build a data lake that accumulates unstructured data such as video, audio, sales materials, and text to make hypothesis testing more efficient during the PoC phase, as well as a data warehouse that makes negotiation logs and unstructured data easier to handle and analyze. In a multi-cloud environment spanning AWS and GCP, you can actively improve the data infrastructure: optimizing it from an analytical perspective, refactoring for operational efficiency, and introducing new technologies and services to solve problems.

Development of new product versions
After a PoC is completed, you will build a stable data pipeline and database to commercialize the algorithms and mechanisms implemented by the data scientists.

ML Ops promotion
The data analysis team you will join is building workflows so that each specialist can focus on their area of expertise. They currently use AI Platform and manage resources with Notebooks. To raise the team's performance, they would like you to drive the adoption of ML Ops.

Business Unit Members
The team includes a data scientist who has published several books on data analysis and AI programming, a machine learning engineer who has presented at conferences and written papers in the field of deep learning, and a data engineer who built the data analysis infrastructure for a major Japanese mobility company.

Development Policy
The product manager, and sometimes the engineers, write specifications by gathering requests from customers and from within the company. Tasks are assigned on a schedule that takes preferences into account; engineers then make their own estimates against the requirements, set the time frame, and proceed with development. Development is pull-request based on GitHub, and essentially all code is reviewed.

Development Environment
Data types: video, audio, transcribed conversation text, business meeting logs, sales materials
Language: Python
Infrastructure: AWS, GCP
Main services used (AWS): S3, Data Pipeline, RDS, Elasticsearch
Main services used (GCP): BigQuery, Composer, GAE, GCE, GCS, AI Platform, Kubernetes (K8s)
Source code management: GitHub
Information sharing tool: Slack
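To make the day-to-day work concrete, here is a minimal sketch of the kind of pipeline this role would own: a Cloud Composer (Airflow) DAG that appends newly transcribed meeting text from GCS into BigQuery. This is an illustration only, not the company's actual pipeline; the DAG, bucket, dataset, and table names are all hypothetical placeholders.

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

# Daily job: load transcribed meeting text landed in GCS into a BigQuery table.
# All names below are hypothetical placeholders.
with DAG(
    dag_id="load_meeting_transcripts",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    GCSToBigQueryOperator(
        task_id="gcs_transcripts_to_bq",
        bucket="example-sales-transcripts",      # hypothetical bucket
        source_objects=["transcripts/*.json"],   # hypothetical object prefix
        destination_project_dataset_table="example_dataset.transcripts",
        source_format="NEWLINE_DELIMITED_JSON",
        write_disposition="WRITE_APPEND",        # keep history, append new rows
    )

A managed-Airflow setup like this matches the GCP services listed above (Composer, GCS, BigQuery); the real pipelines would of course be shaped by the actual transcript schema and SLAs.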

Required skills

Required
- Python development experience
- Practical experience writing SQL
- Experience building data infrastructure in on-premise or cloud environments

Welcome
- Experience building and operating data integration processes with ETL tools such as AWS Glue, GCP Cloud Dataflow, or Talend
- Experience with workflow management tools such as Digdag, Airflow, or Luigi
- Experience building databases such as BigQuery or Redshift
- Experience implementing in managed service environments
- Development experience using Docker

What they are looking for
A person who shares their mission: "To liberate salespeople from their intuition and guts with technology and bring new business opportunities to companies."
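As a rough illustration of the Python-plus-SQL combination required above, the sketch below runs an analytical query through the google-cloud-bigquery client. The dataset, table, and column names are hypothetical, not the company's actual schema.

from google.cloud import bigquery

client = bigquery.Client()  # uses Application Default Credentials

# Hypothetical query: meetings recorded per month from a negotiation-log table.
query = """
SELECT DATE_TRUNC(DATE(meeting_at), MONTH) AS month,
       COUNT(*) AS meetings
FROM `example_dataset.negotiation_logs`
GROUP BY month
ORDER BY month
"""

for row in client.query(query).result():
    print(row.month, row.meetings)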