Join our small and nimble engineering team! We’re looking for an experienced engineer who enjoys working on a variety of projects and large(ish) data pipelines (100M+ data points collected every month).
- Work in our downtown San Francisco office a few times per week
- Provide mentorship to your teammates via code and architecture review
- Help improve, maintain, and expand our end-to-end data pipeline
- Create and manage data collection/crawling scripts that use everything from basic HTTP requests to browser and mobile app automation (deep experience in this area is a huge plus)
- Build and maintain automation/scheduling tooling to ensure data is collected on a timely, consistent basis (we’re currently using self-hosted Dagster for scheduling)
- Create data cleaning and normalization scripts (with some opportunity to integrate ML/LLMs for labeling/cleaning)
- Handle miscellaneous DevOps tasks to manage the infrastructure running all of the above
- Our codebase is mostly Python, with TypeScript and Go used where they make sense. We rely almost entirely on self-hosted open-source/open-core services running on bare VMs via Docker Compose.
About you
- Proficient (intermediate or higher) in:
  - Python (e.g. can write a decorator; understand async/await, MRO, and context managers)
  - SQL (e.g. familiar with complex joins, efficient bulk updates of 10k+ rows, schema migrations)
- Basic understanding of Linux and Docker
- Excited to work on a variety of projects, sometimes several at once
- Able to execute with minimal supervision
- Bonus skills: Web Crawling, Kubernetes, LLM Pipelines, DB Administration, HomeLab/Server Administration, Background in Statistics, Full Stack Web Development, Mobile App Development
Benefits
- Lunch provided when in office
- Unlimited PTO
- 401k
- Company-paid Platinum PPO health insurance, with comparable dental & vision coverage
- $160K - $220K salary, 0.5% - 3% equity