AI Architect

LNT/AA/1750862

Chennai

Posted On

22/05/2026

End Date

18/11/2026

Required Experience

15 - 20 years

Skills

Architecture, Artificial Intelligence, Python, Kubernetes, Infrastructure Management

Minimum Qualification

Bachelor of Technology (BTech)

Job Description

Job Purpose

Design and architect end-to-end AI Cloud platforms with a focus on security, cost-efficiency, and performance. This position involves direct client engagement to translate requirements into technical solutions, encompassing GPU infrastructure right-sizing and optimal model selection. We are looking for a cloud expert with a demonstrated ability to take complex AI models from concept to large-scale production. The ideal candidate brings extensive experience in AI/Cloud ecosystems and a successful track record of architecting and managing production-grade, large-scale AI platforms.

Role Summary

Key Responsibilities

Translate business requirements into scalable, high-performance AI/GenAI architectures featuring NVIDIA GPU clusters.
Design end-to-end AI Cloud and next-generation platforms optimized for deep learning workloads and distributed training.
Architect HPC cluster topologies utilizing high-speed InfiniBand (NDR/HDR) and RoCE v2 interconnects for low-latency communication.
Right-size platform components, including GPUs, CPUs, memory, and NVMe storage, for comprehensive client proposals.
Architect distributed training and inference environments optimized for MPI frameworks and workload scheduling via Slurm.
Design scalable container orchestration platforms using Kubernetes and Kubeflow to manage AI workloads.
Propose optimized inference strategies using vLLM, Triton, and TensorRT-LLM to meet specific latency and throughput KPIs.
Bring experience with RAG systems, multi-agent orchestration frameworks such as LangGraph, and agentic ecosystems.
Develop private AI cloud environments focused on data sovereignty and regulatory compliance, such as the India DPDP Act.
Define integration strategies for LLMs and open-source models within existing enterprise data systems, APIs, and knowledge graphs.
Establish reference architectures for CI/CD/CT pipelines and automated model retraining workflows to ensure reproducibility.
Implement automation and observability frameworks for monitoring GPU utilization, performance tuning, and failure handling.
Drive technical validation through Proof of Concept (PoC) engagements, focusing on scalability and performance benchmarks for LLM training.
Establish Infrastructure-as-Code (IaC) practices to ensure reproducible and reliable cluster deployments.
Collaborate with C-suite stakeholders and cross-functional teams to drive technical decision-making, innovation, and roadmap alignment.

Experience & Educational Requirements

Qualifications and Experience

EDUCATIONAL QUALIFICATIONS (degree, training, or certification required): BE/B.Tech or equivalent in Computer Science or Electronics & Communication
RELEVANT EXPERIENCE: 15-20 years of IT experience, with a minimum of 5 years in AI platforms

Required Technical Skills

Core AI/ML Expertise

Strong experience in NVIDIA, Intel, and Google GPU architecture and InfiniBand
Strong expertise in Kubernetes, Slurm, and OpenShift
Good experience in Python, PyTorch, and TensorFlow
Good knowledge of LangChain and LangGraph
Deep understanding of Transformers, attention mechanisms, diffusion, and MoE
Knowledge of RLHF, Pinecone, FAISS, Chroma, OpenAI, and vLLM
Expertise in RAG and agentic AI workflows
Knowledge of high-performance storage (Lustre, PFS, Object NVMe)
Good knowledge of NVIDIA architectures (Hopper, Blackwell)

Soft Skills

Strong problem-solving and analytical thinking
Excellent communication and stakeholder management
Ability to influence leadership and drive strategic decisions
Innovation mindset with a focus on enterprise impact

Preferred Experience

Currently working in an AI/Cloud presales team
Able to right-size infrastructure and choose the right GPU model for each client requirement
Hands-on with Python, vector DBs (Pinecone, FAISS, Chroma), and LLM APIs (OpenAI, Anthropic)
Solid understanding of cloud-native architecture: OpenStack, KVM, public clouds (Azure/AWS/GCP), microservices, Kubernetes, serverless, and API gateways
Good knowledge of deep learning: CNNs, RNNs/LSTMs, Transformers, and attention mechanisms
Proficiency in Python for ML: NumPy, pandas, scikit-learn, and frameworks such as PyTorch or TensorFlow
Experience integrating LLMs (GPT, Claude, Gemini, LLaMA, Mistral) into applications
Prompt engineering skills: zero-shot, few-shot, chain-of-thought, ReAct, and structured output patterns
Experience building RAG systems: document chunking, embedding models, vector search, and retrieval optimization
Understanding of AI agent patterns, tool use, and agentic workflows
Familiarity with Docker, CI/CD pipelines, and Git-based workflows
Strong communication, stakeholder management, and solution design skills

The ak vertex group