AI OPERATIONS PLATFORM CONSULTANT (640) - NORTH CAROLINA, NEW JERSEY - URGENT
Job Number: 640
Job Title: AI OPERATIONS PLATFORM CONSULTANT (640) - NORTH CAROLINA
Job Type: Full-time
Clearance Level: None
Work Arrangement: On-site
Job Location: Charlotte, NC; Jersey City, NJ
Salary: 110k-135k
Background
- Managing, operating, and supporting MLOps/LLMOps pipelines, using TensorRT-LLM and Triton Inference Server to deploy inference services in production
- Setup and operation of AI inference service monitoring for performance and availability
- Deploying models in production environments, including containerization, microservices, and API design
- Triton Inference Server, including its architecture, configuration, and deployment
Requirements
- LLM and Kubernetes experience
- Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes (OpenShift) for mission-critical applications
- Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server
- Experience deploying and troubleshooting LLMs on a containerized platform, including monitoring and load balancing
Preferred
- Model optimization techniques using Triton with TensorRT-LLM
- Model optimization techniques, including pruning, quantization, and knowledge distillation