CAGE CODE: 9VM30 | UEID: FJVRV3P6BTG7 / NAICS: 561311, 561312, 561320, 541612, 541519 / PSC: R431, R408

A Member of the American Staffing Association

AI OPERATIONS PLATFORM CONSULTANT (640) – NORTH CAROLINA, NEW JERSEY – URGENT

Job Number: 640
Job Title: AI OPERATIONS PLATFORM CONSULTANT (640) - NORTH CAROLINA
Job Type: Full-time
Clearance Level: None
Work Arrangement: On-site
Job Location: Charlotte, NC; Jersey City, NJ
Salary: 110k-135k

Background

  • Managing, operating, and supporting MLOps/LLMOps pipelines, using TensorRT-LLM and Triton Inference Server to deploy inference services in production
  • Setup and operation of AI inference service monitoring for performance and availability
  • Deploying models in production environments, including containerization, microservices, and API design
  • Triton Inference Server, including its architecture, configuration, and deployment

Requirements

  • LLM and Kubernetes experience
  • Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes for mission-critical applications (OpenShift)
  • Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server
  • Experience deploying and troubleshooting LLMs on a containerized platform, including monitoring and load balancing

Preferred

  • Model optimization techniques using Triton Inference Server with TensorRT-LLM
  • Model optimization techniques, including pruning, quantization, and knowledge distillation
