About the Role
We are building the next-generation AI-powered platform and web application for easy and fast creation of audio and video content. Building a revolutionary way to record, transcribe, edit, and mix audio and video on the web comes with a series of unique technical challenges and requires solving hard and complex problems.
We are looking for an experienced software engineer to build a best-in-class video editing and streaming platform to power the core editing and AI functionality of the app.
What You’ll Do
- Design, build, and optimize scalable backend infrastructure that efficiently processes, stores, and serves media for real-time processing pipelines and AI inference workloads.
- Collaborate closely with Product, AI Research, and Engineering teams to ensure our platform meets the evolving needs of advanced audio and video workflows.
- Build cutting-edge model training and evaluation infrastructure.
- Evaluate, integrate, and optimize third-party AI vendor solutions, collaborating closely with Product Engineering and AI Research teams to enhance our AI capabilities and accelerate feature delivery.
- Make high-impact architectural decisions that balance short-term goals with long-term scalability, while mentoring teammates and championing engineering best practices.
What You Bring
- 5+ years of professional backend software engineering experience.
- Experience with cloud infrastructure (GCP, AWS), modern data storage systems (S3, GCS, PostgreSQL, Dynamo/Bigtable), Linux, and data infrastructure (Airflow, Beam, Spark).
- Expertise in modern backend languages and technologies, such as Python, TypeScript/Node.js, Go, or similar.
- Deep understanding of system design principles, performance optimization, and infrastructure automation (e.g., Temporal, Kubernetes, Docker).
- Solid CS fundamentals, including data structures, algorithms, databases (Postgres, BigQuery), and familiarity with monitoring and observability tools (metrics and distributed tracing).
- Strong written and verbal communication skills, along with good judgment in technical decision-making.
Nice to Have
- Experience in AI and ML technologies (PyTorch, CUDA, distributed training).
- Expertise in digital media processing, codecs, and streaming technologies (FFmpeg, WebRTC, HLS, MPEG-DASH).
- Familiarity with GPU profiling, performance tuning, and optimization techniques.
- Experience contributing to or engaging with open-source communities.
At our current size and stage, we embrace a flat organizational structure and value the expertise and contributions of every team member. As such, we have a unified job title for our engineering roles where everyone, including those with Staff-level scope, is considered a Software Engineer. While titles may not change, we are actively seeking Software Engineers with senior-or-higher-equivalent experience to join our team.
The base salary range for this role is $160,000-$240,000/year. Final offer amounts take multiple factors into account, including prior experience, expertise, and location, and may vary from the range above.
Descript is building a simple, intuitive, fully powered editing tool for video and audio, built for the age of AI. We are a team of 150 with a proven CEO and the backing of some of the world's greatest investors (OpenAI, Andreessen Horowitz, Redpoint Ventures, Spark Capital).
Descript is the rare company that has both product-market fit and the raw materials for growth (passionate user community, great product, large market), but is still early enough that each new employee has a measurable influence on the direction of the company.
Benefits include a generous healthcare package, catered lunches, and flexible vacation time. Our headquarters are located in the Mission District of San Francisco, CA. We're looking to hire people who are local and able to join us at the office when needed. We're flexible, and you're an adult—we don't expect or mandate that you're in the office every day. But we do believe there are valuable and serendipitous moments of discovery and collaboration that come from working together in person.
Descript is an equal opportunity workplace—we are dedicated to equal employment opportunities regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or Veteran status. We believe in actively building a team rich in diverse backgrounds, experiences, and opinions to better allow our employees, products, and community to thrive.
What We Do
Descript builds simple and powerful collaborative tools for new media creators. We strive to eliminate the tedious work that often stands between an idea and its expression, so that creators can focus on developing their craft instead of their usage of tools.
Descript is a next-generation digital media platform that offers a new "engine" that lets you edit audio by editing text (instead of waveforms).
That means it does transcription (both automated and human-powered, whichever you prefer), but more interestingly, it lets you move the audio around by simply editing the transcript.
If you're new to editing audio, you'll find it far easier: it's basically like using a word processor. If you're experienced, Descript is faster than waveform editing, and more fun.
If you work with voice audio (as opposed to music), Descript is what the future looks like. Join us and change the world.