
Site Reliability Engineering
Over the years, the title for the Site Reliability Engineer role has changed.
Initially the team was called the Middleware team, then the DevOps/WebOps team, and now the Site Reliability team.
Per Microsoft's definition:
“Middleware is software which lies between an operating system and the applications running on it. Common middleware examples include database middleware, application server middleware, message-oriented middleware, web middleware, and transaction-processing monitors.”
A middleware engineer, then, is the person who works on these middleware systems and maintains the software layer between the operating system and the applications.
The Middleware team later transitioned into the DevOps/WebOps team.
DevOps stands for development and operations. In addition to middleware activities, a DevOps engineer has a strong focus on continuous integration and continuous delivery.
Per Wikipedia: “In software engineering, CI/CD or CICD generally refers to the combined practices of continuous integration and either continuous delivery or continuous deployment. CI/CD bridges the gaps between development and operation activities and teams by enforcing automation in building, testing, and deployment of applications. Modern-day DevOps practices involve continuous development, continuous testing, continuous integration, continuous deployment, and continuous monitoring of software applications throughout their development life cycle. The CI/CD practice or CI/CD pipeline forms the backbone of modern-day DevOps operations.”
A key emphasis of DevOps is building systems that ship a project to the live website (production) as early as possible.
DevOps then transitioned into SRE, which stands for Site Reliability Engineering.
Keeping the website running without downtime, recovering it when an issue occurs, taking preventive measures against foreseeable problems, and fixing potential security issues are some of the important tasks performed by the Site Reliability team.
Per Red Hat:
“SRE takes the tasks that have historically been done by operations teams, often manually, and instead gives them to engineers or ops teams who use software and automation to solve problems and manage production systems.
SRE is a valuable practice when creating scalable and highly reliable software systems. It helps you manage large systems through code, which is more scalable and sustainable for sysadmins managing thousands or hundreds of thousands of machines.
The concept of site reliability engineering comes from the Google engineering team and is credited to Ben Treynor Sloss.
SRE helps teams find a balance between releasing new features and making sure that they are reliable for users.”
Expertise
What I do
Lead the Site Reliability Engineering team in the e-commerce domain.
Partner with Engineering stakeholders to design and deliver a reliable, scalable, secure, and performant platform.
Bridge development and operations by applying a software engineering mindset to system administration topics.
What You Do
Represent a real-world problem using computer modelling and write a program to solve it.
Experience
Walmart
2014 – present
Automation engineer, tools developer, performance analyst, service architect, system administrator, performance monitoring expert
Plan capacity and productivity to improve overall efficiency.
Collaborate on RCAs and execute on the gaps identified to prevent future occurrences.
Build software and systems to manage platform infrastructure and applications.
Wipro Technologies
2010 – 2014
Manage systems and projects with little direction
Assist with creating and maintaining an automation and monitoring framework
Implement and manage infrastructure as code through automation tools
Develop a continuous delivery pipeline in a cloud environment
Implement and manage continuous code build and deployment
UST Global
2006 – 2010
R&D on improving the ranking algorithm for a proprietary search engine
Ensure designs are in compliance with specifications.
Detect relevance from text
Support continuous improvement by investigating alternatives and technologies for improving the performance of crawling
Let’s make something together.