Essential Duties & Responsibilities
The essential functions include, but are not limited to, the following:
- End-to-end GenAI solutions: Scope problems, choose the right approach (prompt engineering, fine-tuning, agents), implement, evaluate, and deploy.
- Data & SQL: Write efficient SQL for analytics and data preparation; manage schemas and pipelines for model training and inference.
- Model training & fine-tuning: Run supervised fine-tuning (PEFT/LoRA/QLoRA), optimize prompts, and manage experiment tracking and evaluation.
- Agentic systems: Build agent workflows with tool use, memory, and safety/guardrails.
- Inference & deployment: Package services with Docker, optimize latency/cost (batching, caching, quantization), and deploy on AWS (ECS, EKS, SageMaker, Lambda with GPU acceleration).
- MLOps & observability: Set up CI/CD for models/prompts, maintain offline/online evaluation pipelines, monitoring, and rollback strategies.
- Security & compliance: Implement data governance, PHI/PII protections, and guardrails against prompt injection and unsafe outputs.
- Cross-functional work: Collaborate with product managers and engineers to align GenAI capabilities with product goals; document clearly and communicate trade-offs.
- Production readiness: Lead conversations around scaling, monitoring, and maintaining GenAI systems in real-world environments.
Minimum Qualifications (Knowledge, Skills, and Abilities)
Education:
- Bachelor’s Degree or equivalent experience required; Master’s degree preferred.
Experience:
- 5+ years of Software/ML engineering experience, including 2+ years building and deploying GenAI/LLM systems.
- MS/PhD in Computer Science or equivalent experience.
- Strong SQL and Python skills with solid software engineering fundamentals.
- Experience with agent frameworks (LangGraph, AutoGen, CrewAI) and building tool-driven agents.
- Hands-on with deep learning (PyTorch or TensorFlow) and LLM fine-tuning (SFT/PEFT like LoRA/QLoRA).
- Production experience with Docker and deploying on AWS (ECS, EKS, SageMaker, Lambda, or GPU services).
- Experience creating data and model pipelines for model training and deployment at scale.
- Familiarity with prompt engineering, evaluation frameworks (LLM-as-judge, metrics), and offline test harnesses.
- Understanding of security & compliance for sensitive data (e.g., PHI/PII) and safe deployment of AI systems.
- Excellent problem-solving, communication, and documentation skills.
- Inference optimization: quantization (bitsandbytes, GPTQ/AWQ), batching, caching, or vLLM.
- Healthcare experience: familiarity with HIPAA, medical data handling, or working in health tech.
- Experiment tracking (MLflow, W&B), CI/CD for ML, and monitoring (Prometheus, Grafana).
- Familiarity with major LLM APIs and OSS models (OpenAI, Anthropic, Llama, Mistral).
Tech Stack
- Languages: Python, SQL
- DL/LLM: PyTorch, TensorFlow, Hugging Face, PEFT/TRL, vLLM
- Data: Snowflake, Postgres
- Cloud: AWS (ECS, EKS, SageMaker, Lambda)
- MLOps: Docker, CI/CD, MLflow or W&B
Supervisory Responsibilities
This role does not have any direct reports and is an individual contributor role.
Working Environment and Travel Requirements
Work is typically in a normal office administrative environment involving minimal exposure to physical risks. Position requires little to moderate physical activity. Mostly sedentary work exerting up to 10 pounds of force occasionally or a negligible amount of force to lift, carry, push, pull, or otherwise move objects. Work involves sitting most of the time, but may involve walking or standing for brief periods of time. No significant stooping is usually required.