
Senior Splunk Engineer at AvePoint
Singapore | Full-time | Technology | Posted 17 days ago
About the Role
<p><strong>Senior Splunk Engineer for Automation and Reliability Engineering Project</strong></p>
<p><strong>Project Summary</strong></p>
<ul>
<li>Support the Automation and Reliability Engineering project and its operations.</li>
</ul>
<p><strong>Responsibilities</strong></p>
<p>Observability Engineering and Governance:</p>
<ul>
<li>Architect and maintain enterprise SIEM solutions aligned with operational resilience mandates (e.g., MAS TRM, DORA, APRA CPS 230).</li>
<li>Lead deployment, configuration, and optimization of Splunk for full-stack visibility across infrastructure, applications, networks, and user experience.</li>
<li>Define and enforce telemetry data governance standards for metrics, logs, and traces, ensuring consistency, retention compliance, and security.</li>
<li>Integrate Splunk with incident management, ITSM, and AIOps systems to enable predictive alerting and anomaly detection.</li>
<li>Act as the SIEM/Splunk subject matter expert (SME) for architecture reviews, platform upgrades, and performance tuning.</li>
</ul>
<p>Reliability Engineering and Automation:</p>
<ul>
<li>Implement and champion SRE frameworks and reliability practices for mission-critical systems.</li>
<li>Design and automate runbooks, alerts, and self-healing workflows using Python, Ansible, and Terraform.</li>
<li>Collaborate with Application, Infrastructure, and Cyber teams to embed reliability principles into the delivery lifecycle.</li>
<li>Conduct resilience, chaos, and capacity testing aligned with business continuity and disaster recovery standards.</li>
<li>Define and track error budgets, reliability scorecards, and service health indicators for production workloads.</li>
</ul>
<p>Cloud and Platform Integration:</p>
<ul>
<li>Engineer SIEM for cloud-native workloads in AWS and Azure, ensuring visibility across compute, storage, and network layers.</li>
<li>Integrate Splunk and cloud observability tools into CI/CD pipelines and landing zones to ensure continuous compliance.</li>
<li>Implement infrastructure-as-code (IaC) models using Terraform and Ansible for consistent, auditable provisioning.</li>
<li>Collaborate with Cloud, DevOps, and Security teams to ensure telemetry aligns with audit, compliance, and operational risk requirements.</li>
</ul>
<p>Operational Excellence and Collaboration:</p>
<ul>
<li>Drive reduction in incident recurrence, MTTR, and manual intervention through observability-led automation.</li>
<li>Partner with Service Delivery, Cyber, and Application teams to enable predictive incident prevention and root cause transparency.</li>
<li>Develop and maintain executive dashboards and reports showcasing availability, reliability KPIs, and operational risk indicators.</li>
<li>Provide technical leadership during major incidents, post-incident reviews, and audits, ensuring lessons learned are codified into automation and process improvements.</li>
</ul>
<p><strong>Skillset (Must have)</strong></p>
<ul>
<li>Possess a degree in Computer Science, Engineering, or related discipline.</li>
<li>Minimum of 8 years of experience in Infrastructure, Cloud, or Site Reliability Engineering roles, including at least 5 years specializing in SIEM/Splunk engineering or observability in financial or other regulated environments.</li>
<li>Proven hands-on expertise in the following technical areas:</li>
</ul>
<ul>
<li>SIEM Platforms: Splunk (must), ELK/Elastic</li>
<li>Automation/IaC: Terraform, Ansible, Python, CI/CD tools</li>
<li>Cloud and other platforms and integrations: AWS (CloudWatch, X-Ray, CloudTrail), Azure (Monitor, Log Analytics, App Insights), Datadog, ServiceNow</li>
</ul>
<ul>
<li>Deep understanding of SRE principles, service health modelling, error budgets, and auto-remediation design.</li>
<li>Strong analytical and troubleshooting skills, with the ability to perform deep-dive investigations and develop long-term preventive solutions.</li>
<li>Familiarity with financial sector operational resilience frameworks, regulatory compliance, and incident governance.</li>
<li>Excellent written and verbal communication skills.</li>
<li>Strong interpersonal skills for engaging with diverse stakeholders.</li>
<li>Agile, a fast learner, and able to adapt to change.</li>
</ul>
<p><strong>Skillset (Good to have)</strong></p>
<p>Preferred Certifications:</p>
<ul>
<li>Splunk Certified Power User / Splunk Certified Admin / Splunk Certified Architect</li>
<li>Terraform / Ansible / Python Certified Expert</li>
<li>AWS Certified DevOps Engineer / Azure DevOps Expert</li>
<li>SRE Foundation / Practitioner (DevOps Institute)</li>
<li>ITIL v4 Managing Professional</li>
</ul><div class="content-conclusion"><p>Any personal data you share with us during the application process will be processed strictly in compliance with applicable data protection laws and our <a id="menur12n" href="https://www.avepoint.com/company/privacy-notice" target="_blank">Privacy Notice</a>.</p></div>