Gen AI Job Description:
o Experience Level: 3 to 5 Years
o Design, implement, and manage workflows for integrating and deploying GenAI applications on Azure, Amazon, or Snowflake. Analyse systems and applications, provide recommendations for design, enhancement, and development, and play an active part in their execution.
o Platform engineering: Collaborate with other teams to integrate AI solutions into existing workflows and systems, keeping the platform running and available. Configure and manage the underlying infrastructure that supports the platform, ensuring scalability, reliability, and high availability.
o Develop and implement best practices for managing the lifecycle of large language AI models, including version control, testing, and validation.
o Troubleshoot and resolve issues related to the performance and deployment of large language AI models.
o Stay up to date with the latest advancements in large language AI models and operations technologies to continuously improve our AI infrastructure.
o Develop and recommend best practices for designing infrastructure that supports model fine-tuning to improve performance and efficiency, and troubleshoot any issues that arise during development or deployment.
o Creating and maintaining documentation: Ensure clear and comprehensive documentation of AI/ML/LLM systems and workflows.
o Security integration: GenAI platform engineers weave security best practices throughout the development lifecycle to safeguard the platform from vulnerabilities and data breaches.
o Monitoring and logging: Implement robust monitoring and logging systems and LLMOps best practices that allow for proactive identification and resolution of potential issues.
o Responsible AI Guardrails: GenAI platform engineers are responsible for ensuring all Responsible AI metrics are governed through proper system infrastructure and monitoring.
o Data privacy and governance: Ensuring user data privacy and adhering to data governance regulations are paramount considerations for GenAI platform engineers.
Requirements:
• Bachelor’s or master’s degree in statistics / economics / operations research / data science / computer science / related field.
• 2 years of relevant experience in managing Gen AI applications, model monitoring, model validation, implementing I/O guardrails & FinOps monitoring
• Strong cross-cultural communication and negotiation skills, including the demonstrated ability to solicit opinions and accept feedback and the ability to effectively manage collaboration across time zones.
• Understanding of OpenAI, Llama, Claude, Arctic, and Mistral large language models, including how to deploy them in the cloud or on-premises and use their APIs to build industry solutions.
• Experience with AI/ML frameworks and tools (e.g., LangChain, Semantic Kernel, TensorFlow, PyTorch).
• Experience using LLMs in the cloud, e.g., Azure OpenAI, Amazon Bedrock, Snowflake Cortex AI
• Familiarity with cloud platforms (e.g., AWS, Azure, Snowflake) and containerization technologies (e.g., Docker, Kubernetes).
• Advanced & secure coding experience in at least one language (Python, PySpark, TypeScript)
• Exposure to Vector/Graph/SQL Databases, non-deterministic automated testing, workflow platforms
• Excellent problem-solving skills and attention to detail.
• Strong communication and collaboration skills and experience in operating effectively as part of cross-functional teams.
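As an illustration of the I/O guardrails mentioned in the requirements above, the sketch below wraps a model call in minimal input and output checks. All names here (`call_llm`, the blocklist, the redaction regex) are hypothetical stand-ins, not tied to any specific provider SDK such as Azure OpenAI, Bedrock, or Cortex.

```python
import re

# Hypothetical, deliberately minimal guardrail layer around an LLM call.
# A real deployment would use a policy engine or provider-side content
# filters; this only illustrates the input-check / output-check shape.

BLOCKED_TOPICS = ("credit card number", "social security number")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def input_guardrail(prompt: str) -> str:
    """Reject prompts that request blocked content."""
    lowered = prompt.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        raise ValueError("prompt blocked by input guardrail")
    return prompt

def output_guardrail(completion: str) -> str:
    """Redact email addresses before returning model output."""
    return EMAIL_RE.sub("[REDACTED]", completion)

def guarded_call(prompt: str, call_llm) -> str:
    """Apply the input check, call the model, then apply the output check."""
    return output_guardrail(call_llm(input_guardrail(prompt)))
```

In practice the same wrapper pattern also hosts FinOps hooks (token counting, per-tenant budgets) alongside the content checks.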
Salary: As per industry standard.
Industry: IT-Software / Software Services
Functional Area: IT Software - Application Programming, Maintenance
Role Overview
We are hiring Senior AWS Engineers (14+ years) to join a mature production application running on AWS. This platform has been live for several years, serves enterprise customers on a multi-tenant real-time data pipeline, and is now facing the kinds of problems that only show up at scale: database performance under load, observability gaps, disaster recovery maturity, and streaming pipeline reliability.
This is a hands-on individual contributor role. It is not an architect role, not a greenfield build, and not a DevOps pipeline rebuild. We are looking for senior engineers who still write code every day and who can take a hard problem off a busy leader's plate and own it end-to-end. Candidates should be prepared to engage with problems across the AWS stack as they arise.
Key Responsibilities
• Diagnose and resolve production performance and scalability issues on Amazon RDS, including query-level tuning, schema and index optimization, and scaling strategy
• Design and harden the observability stack across metrics, logs, traces, and alerting, bringing it to production-grade maturity
• Design and execute disaster recovery strategy, including RPO/RTO planning, failover testing, and runbook ownership
• Troubleshoot and optimize a multi-tenant real-time data pipeline running on AWS
• Work hands-on with Amazon EKS to diagnose orchestration and workload issues in production
• Work hands-on with Amazon MSK at production scale, resolving performance and scalability challenges in Kafka-based streaming
• Engage with adjacent AWS services as the work demands, including compute (EC2, Lambda, ECS), networking (VPC, ALB/NLB, Route 53), storage (S3, EBS), security and identity (IAM, KMS, Secrets Manager), and data/messaging services (DynamoDB, SQS, SNS, Kinesis)
• Drive automation of recurring operational work so engineering time is spent on real problems, not toil
• Collaborate directly with the product engineering team and communicate problem, root cause, and solution clearly and efficiently
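The RPO side of the disaster-recovery work above reduces to simple arithmetic, sketched below with hypothetical figures: under periodic snapshots, worst-case data loss is the snapshot interval plus the time to copy the snapshot off-site.

```python
from datetime import timedelta

# Illustrative RPO check; all figures are hypothetical examples, not a
# recommendation for any particular backup cadence.

def worst_case_data_loss(snapshot_interval: timedelta,
                         replication_delay: timedelta) -> timedelta:
    # Data written just after a snapshot is exposed until the next snapshot
    # lands off-site, so the worst case is interval + copy lag.
    return snapshot_interval + replication_delay

def meets_rpo(snapshot_interval: timedelta,
              replication_delay: timedelta,
              rpo_target: timedelta) -> bool:
    return worst_case_data_loss(snapshot_interval, replication_delay) <= rpo_target
```

For example, a 15-minute snapshot cadence with 5 minutes of copy lag meets a 30-minute RPO target but not a 15-minute one; failover testing then validates the RTO side empirically.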
Required Skills and Expertise
• 14+ years of hands-on AWS engineering experience in production environments, with a current track record of writing code rather than purely architecture or advisory work
• Deep Amazon RDS expertise, including production database performance tuning, scaling patterns, and troubleshooting under load
• Strong observability engineering experience with tools such as Prometheus, Grafana, CloudWatch, and ELK, and a demonstrated history of taking a weak observability stack and making it production-grade
• Proven experience designing and executing disaster recovery for production systems
• Hands-on Amazon EKS experience in production, not only familiarity with Kubernetes concepts
• Hands-on Amazon MSK or Kafka experience at production scale, including performance troubleshooting
• Experience operating multi-tenant real-time data pipelines on AWS
• Strong scripting skills in Python, Bash, or equivalent, and comfort with Docker, Kubernetes and container-based workloads
• Exposure to Infrastructure as Code (Terraform or CloudFormation) and CI/CD tooling
• Strong general AWS fluency across compute (EC2, Lambda, ECS, Auto Scaling), networking (VPC design, security groups, ALB/NLB, Route 53, VPC endpoints), storage (S3 lifecycle and access patterns, EBS, EFS), security and identity (IAM, KMS, Secrets Manager, least-privilege design, encryption at rest and in transit), adjacent data and messaging services (DynamoDB, SQS, SNS, Kinesis), edge delivery (CloudFront), and cost governance (tagging, rightsizing, cloud cost optimization).
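Consumer lag, the central metric in the Kafka/MSK performance troubleshooting listed above, is simply the gap between log-end and committed offsets per partition. The sketch below uses hypothetical offset maps; in practice these values come from the Kafka admin client or MSK CloudWatch metrics.

```python
# Illustrative lag computation over (topic, partition) -> offset maps.
# The offsets below are made-up example data.

def total_lag(end_offsets: dict, committed: dict) -> int:
    """Sum, over all partitions, of log-end offset minus committed offset."""
    # A partition with no committed offset is treated as fully unread (0).
    return sum(end_offsets[tp] - committed.get(tp, 0) for tp in end_offsets)

end = {("orders", 0): 1200, ("orders", 1): 980}
done = {("orders", 0): 1150, ("orders", 1): 980}
```

A steadily growing total across polls indicates consumers falling behind producers, which is usually where partition-count, batching, and consumer-scaling work starts.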
Desired Competencies
• Ability to take a problem off a leader's plate and own it end-to-end with minimal supervision
• Clear verbal communication of problem, root cause, and proposed solution in the most efficient way possible; able to articulate complex technical issues without requiring extended context-setting
• Strong leadership principles mindset and ownership orientation
• Cultural fit and attitude for a collaborative, ownership-driven team
Salary: As per industry standard.
Industry: IT-Software / Software Services
Functional Area: IT Software - Application Programming, Maintenance
Key Responsibilities
• Design, build, and maintain cloud infrastructure on AWS, implementing scalable and resilient architectures using services such as EC2, ECS, Lambda, DynamoDB, S3, CloudFront, VPC, ALB/NLB, and IAM
• Optimize cloud cost, performance, and reliability across environments
• Configure monitoring, logging, metrics, and alerting using CloudWatch and complementary observability tools
• Write and manage Terraform modules for reusable infrastructure components, enforce Infrastructure as Code best practices, and drive code reviews and compliance around IaC
• Automate provisioning, scaling, and configuration of cloud environments end-to-end
• Build and maintain CI/CD pipelines using GitLab CI, automating application build, test, security scanning, and deployment workflows
• Implement appropriate deployment strategies based on service design and risk profile
• Build and manage containerized applications using Docker
• Implement secure-by-default infrastructure and DevOps practices, including management of secrets, encryption, IAM roles, and least-privilege access
• Partner with security teams to maintain compliance frameworks and uphold governance standards
• Develop automation scripts, build internal tooling, and automate recurring operational tasks and configuration management
• Work closely with development, QA, and product teams to streamline release processes and improve overall delivery flow
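As a sketch of choosing "appropriate deployment strategies based on service design and risk profile", the example policy below maps two service traits to a strategy. The rules are illustrative only, not a prescription.

```python
# Hypothetical example policy: map service traits to a deployment strategy.

def choose_strategy(stateful: bool, high_risk: bool) -> str:
    if stateful:
        return "rolling"     # avoid double-running stateful workloads
    if high_risk:
        return "canary"      # shift a small traffic slice first
    return "blue-green"      # fast cutover with instant rollback
```

In a GitLab CI pipeline this kind of decision typically lives as a pipeline variable or rule set per service rather than as runtime code.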
Required Skills and Expertise
• 5+ years of hands-on DevOps engineering experience in production AWS environments
• Strong general AWS fluency across compute (EC2, ECS, Lambda, Auto Scaling), networking (VPC design, security groups, ALB/NLB, Route 53, VPC endpoints), storage (S3 lifecycle and access patterns, EBS, EFS), security and identity (IAM, KMS, Secrets Manager, least-privilege design, encryption at rest and in transit), adjacent data and messaging services (DynamoDB, SQS, SNS), edge delivery (CloudFront), and cost governance (tagging, rightsizing, cloud cost optimization)
• Deep hands-on experience with Terraform, including writing reusable modules, applying IaC best practices, and managing state and environments at production scale
• Strong experience building and maintaining CI/CD pipelines, preferably with GitLab CI, including build, test, security scanning, and automated deployment workflows
• Solid experience with Docker and container-based workloads; familiarity with Kubernetes or ECS in production is valued
• Working knowledge of observability tooling beyond CloudWatch, such as Prometheus, Grafana, or ELK
• Strong scripting skills in Python, Bash, or equivalent for automation and tooling
• Practical experience implementing DevOps security practices, including secrets management, IAM design, and compliance-aware operations
• Familiarity with configuration management tooling and modern operational tooling patterns
Desired Competencies
• Strong ownership mindset with the ability to take a problem end-to-end, from diagnosis through automation and documentation
• Clear verbal communication of problem, root cause, and proposed solution in the most efficient way possible; able to articulate complex operational issues without requiring extended context-setting
• Collaborative working style with development, QA, product, and security teams, and the ability to drive cross-functional release improvements
• Cultural fit for a collaborative, ownership-driven team
Salary: As per industry standard.
Industry: IT-Software / Software Services
Functional Area: IT Software - Application Programming, Maintenance
We are seeking a highly skilled AWS Cloud Operations Engineer to join our dynamic Cloud Operations team. The ideal candidate will have a deep understanding of AWS EKS, microservices architecture, container technologies, and Infrastructure as Code (IaC) principles. You will play a critical role in ensuring the reliability, performance, and security of our cloud-native applications.
Salary: Rs. 0.0 - Rs. 1,00,000.0
Industry: IT-Software / Software Services
Functional Area: IT Software - Application Programming, Maintenance