Founding Applied AI Scientist

San Francisco, CA
Mid level
Generative AI
The Role
The role involves designing and optimizing AI models for document understanding and multimodal learning, integrating features into the platform, and engaging with users for practical applications.

Founding Applied AI Research Scientist

Tensorlake is building a distributed data processing platform for developers of Generative AI applications. Our product, Indexify (https://getindexify.ai), lets developers build continuously evolving knowledge bases and indexes for Large Language Model applications by running structured-data or embedding extraction algorithms on any unstructured data.

We are building a serverless product on top of Indexify that allows users to build real-time extraction pipelines for unstructured data. The extracted data and indexes are consumed directly by AI applications and LLMs to power business and consumer applications.

We are looking for a Founding Applied AI Research Scientist who thrives on tackling complex challenges at the intersection of document understanding, multimodal learning, and cutting-edge AI research. Working closely with the founding team, you'll influence Tensorlake’s technical strategy and contribute to advancing the capabilities of our AI products.

Responsibilities

As a Founding Applied AI Research Scientist, you will:

- Design, train, and evaluate document understanding models for extracting complex data, such as tables, forms, and structured text, from documents.

- Develop and optimize multimodal visual Q&A models, enabling our platform to understand and answer questions based on both textual and visual information.

- Collaborate with the team to integrate AI-driven features into Tensorlake’s platform, helping turn research insights into practical, real-world solutions.

- Work closely with users and customers to understand their needs, ensuring that AI solutions provide real, measurable value in business applications.

Qualifications

- 4+ years of experience working with AI/ML models, specifically in document understanding, computer vision, and multimodal learning.

- Proven expertise in training and evaluating models for complex document extraction, including structured data like tables and forms.

- Deep NLP Expertise: Experience with transformer-based models such as BERT, LayoutLM, T5, or DocFormer.

- OCR Integration: Proficiency in integrating OCR technologies for extracting text from scanned documents and PDFs.

- Model Pretraining and Fine-tuning: Experience with pretraining large models and fine-tuning them for document understanding tasks.

- Layout Analysis: Understanding of document layout and structure for effective table detection and hierarchy extraction.

- Benchmarking and Evaluation: Experience with document-specific datasets and evaluation techniques.

- Vision-Language Models: Familiarity with models that integrate visual and textual data for document understanding.

- Solid programming skills in Python and proficiency in at least one deep learning framework (e.g., TensorFlow, PyTorch).

- Ph.D. or Bachelor's degree in a quantitative field such as Computer Science or Mathematics, or equivalent industry experience.

Benefits

- Access to a 401(k) savings plan

- Comprehensive Healthcare and Dental Benefits

If you’re passionate about research in document understanding and multimodal learning, and enjoy tackling ambitious technical challenges, we’d love to hear from you. Even if you only fit some of the criteria but have relevant experience, we encourage you to apply and share a project that showcases your expertise. Bonus points if it’s open-source!

Top Skills

BERT
DocFormer
LayoutLM
OCR Technologies
Python
PyTorch
T5
TensorFlow

The Company
12 Employees
Remote Workplace

What We Do

Tensorlake is building Indexify, a streaming ETL engine for building LLM applications that use Retrieval-Augmented Generation.
