
Operational Excellence

We must return to the basics and refocus on our mission: building IT products and services for VA and the Veterans we serve, while being effective stewards of the resources we have and vocal advocates for the resources we need.

Our number one priority in achieving operational excellence is implementing a discipline of engineering excellence at VA. We need to get the basics right.

This is the value proposition we bring to VA; it is the service we provide. When we lose sight of that primary responsibility, we lose the faith of our customers.

As a result, offices pursue their own information and technology services, often referred to as “shadow IT,” which creates additional complexities in the engineering landscape, exponentially increases technical debt, introduces security vulnerabilities, and ultimately creates a dysfunctional experience for VA employees, and for the Veterans who rely on a seamless, secure technological backbone at VA to access the care and services they have earned.

The foundation of true operational excellence includes three components: Engineering Excellence, Security Excellence, and Effective Resource Allocation.

Engineering Excellence

To institute an environment of Engineering Excellence, we will establish a process of continuous improvement in how we execute our engineering processes, including redundancy, failover, monitoring, and modern design patterns. Our goals for implementing the discipline of Engineering Excellence at VA include:

We will drive toward reducing avoidable major incidents to zero.

More than 17 percent of the major incidents we experience are the result of software and configuration changes. We can reduce this to near-zero through more careful testing, release management, and adherence to change management processes.

We will target 100 percent of incidents or outages to be caught via monitoring and alerting rather than end user reports of an outage.

While OIT has established monitoring for many services and applications, the percentage of incidents that are currently caught via alerting is in the single digits. We will strive to catch 100 percent of system issues through early-warning monitoring, so that we can identify a problem before our users do.
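As a rough illustration of how this target could be tracked, the sketch below computes the share of incidents first detected by alerting rather than by a user report. The `Incident` record, the `DetectionSource` values, and the sample data are all hypothetical; they are not drawn from any VA system.

```python
from dataclasses import dataclass
from enum import Enum


class DetectionSource(Enum):
    ALERT = "alert"              # caught by monitoring and alerting
    USER_REPORT = "user_report"  # first reported by an end user


@dataclass(frozen=True)
class Incident:
    incident_id: str              # hypothetical identifier format
    detected_by: DetectionSource


def alert_detection_rate(incidents: list[Incident]) -> float:
    """Return the percentage of incidents first detected by alerting."""
    if not incidents:
        return 0.0
    caught = sum(1 for i in incidents if i.detected_by is DetectionSource.ALERT)
    return 100.0 * caught / len(incidents)


# Made-up sample: one of four incidents was caught by an alert.
sample = [
    Incident("INC-1", DetectionSource.USER_REPORT),
    Incident("INC-2", DetectionSource.ALERT),
    Incident("INC-3", DetectionSource.USER_REPORT),
    Incident("INC-4", DetectionSource.USER_REPORT),
]
print(f"{alert_detection_rate(sample):.1f}% caught via alerting")  # 25.0% caught via alerting
```

Reporting this figure over time would show whether monitoring coverage is moving from single digits toward the 100 percent target.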

We will establish a world-class level of systems and infrastructure uptime by:

  • Identifying our key systems and infrastructure components, identifying those that aren’t meeting resiliency needs, and driving action plans to improve.
  • Ensuring the highest priority systems and infrastructure components have tested failover capabilities.
  • Upgrading key infrastructure and systems where current configurations are negatively impacting our resiliency and reliability.
  • Improving our change management and adherence to Standard Operating Procedures (SOPs) to aggressively reduce avoidable downtime.

We will invest in critical infrastructure capabilities that enable a more agile, resilient IT environment at VA, including:

  • Modernization and consistent enforcement of a technical refresh schedule for the endpoints and devices our VA customers use to perform their work and provide care and services to Veterans.
  • Modernization of our network infrastructure to improve the reliability, availability, and capacity of network resources for on-premises traffic, with more emphasis on moving customer traffic to cloud infrastructure to reduce VA network congestion and streamline the customer’s experience.

Customers also appreciate proactive, timely communication.

It is therefore imperative that we tell our VA customers when an outage affects them. This includes off-premises dashboards where customers can check the availability of VA applications and services.

Security Excellence

Today’s computing environment is increasingly complex, with more sophisticated bad actors attempting to take over an organization’s computing resources and compromise sensitive information.

Guarding against this necessitates developing a holistic approach that can protect the organization on an ongoing basis.

Being excellent across all the dimensions of this approach is essential to increasing VA’s security preparedness and minimizing the risk of a major security incident.

We will establish a clear security strategy founded around the Zero Trust Architecture and will identify and execute on a clear, risk-based roadmap that implements the key pillars of that strategy.

We will further strengthen our procedures that enable us to identify and respond rapidly and decisively to new security threats and incidents.

We will also demonstrate the security of our systems through compliance with federal requirements such as the Federal Information Security Modernization Act (FISMA) and the Federal Information System Controls Audit Manual (FISCAM), as they represent a sound, broad set of improvement opportunities for our security discipline.

We must articulate and implement a clear model of roles and responsibilities across the organization, so we can work in unison to secure our environment and respond to threats.

At the heart of our strategy is Zero Trust. This increasingly popular approach to security provides a powerful way to frame the broad set of investments needed to secure the organization. It starts with a simple premise: there are so many different ways that a bad actor can compromise a network that we must assume our network will be breached.

If we assume this, then the next line of defense is to ensure that the bad actor cannot profit from breaching our network, since no vital information or resources will be available to them if they do.

In short, there is “zero trust” conferred to the bad actor from having breached the network boundary, and because of this, they cannot access the resources they breached the network to steal.

In a Zero Trust environment, security comes not solely from protecting the perimeter but rather from the following core capabilities and investments.

Enforcing strong identity verification

  • Multifactor authentication using the PIV card infrastructure
  • Multifactor authentication using a secondary device
  • Access to VA resources from “non-human” endpoints (e.g., machine-to-machine communication) is strictly controlled through:
      • Restriction of server access to known machine endpoints
      • Trusted Internet Connections (TIC) as the only gateway from outside the network boundary to inside the network boundary

Ensuring all connecting devices are healthy

  • Up-to-date end point protection on all devices accessing VA computing resources, enforced before access is granted
  • End-user computing devices are restricted from operating with administrator privileges, so that unsafe software cannot be installed
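The identity and device-health checks described so far can be sketched as a single access decision. This is a minimal illustration under stated assumptions, not a description of VA's implementation: `AccessRequest`, its fields, and `evaluate` are hypothetical names. The point it shows is that being inside the network boundary confers no trust; only verified identity and device health count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    user_id: str
    mfa_verified: bool                 # e.g. PIV card or secondary device
    endpoint_protection_current: bool  # endpoint protection is up to date
    running_as_admin: bool             # device operates with administrator privileges
    inside_network_boundary: bool      # recorded, but deliberately ignored below


def evaluate(request: AccessRequest) -> tuple[bool, list[str]]:
    """Return (allowed, reasons for denial). Network location is never consulted."""
    reasons: list[str] = []
    if not request.mfa_verified:
        reasons.append("multifactor authentication not completed")
    if not request.endpoint_protection_current:
        reasons.append("endpoint protection out of date")
    if request.running_as_admin:
        reasons.append("device operating with administrator privileges")
    return (not reasons, reasons)


# A request from inside the boundary is still denied without MFA.
insider = AccessRequest(
    user_id="user-1",
    mfa_verified=False,
    endpoint_protection_current=True,
    running_as_admin=False,
    inside_network_boundary=True,
)
allowed, reasons = evaluate(insider)
print(allowed, reasons)
```

A real policy engine would evaluate many more signals; the sketch only shows that every check is applied before access is granted, regardless of where the request originates.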

Using rich telemetry and advanced algorithms to detect attacks and isolate affected systems

  • Monitoring and alerting that rapidly identifies new attack methods
  • Impacted resources are isolated to restrict further damage
  • Impacted resources must be remediated to save vital information and return the resources to production.

Enforcing least-privilege access

  • Implementing role-based permissions, granting access permissions based on a person’s role in the organization and their associated access requirements
  • Enforcing strong workflows that remove privileges when an employee’s role changes
  • Because doing this is complex and error-prone, implementing an auditing capability that validates that role-based access provisioning is being enforced
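One way to picture the auditing capability is a check that compares the permissions a person actually holds against those their current role allows. The sketch below assumes a simple role-to-permission mapping; the role names, permission names, and `audit_access` function are hypothetical and chosen only for illustration.

```python
# Hypothetical role-to-permission mapping, for illustration only.
ROLE_PERMISSIONS: dict[str, set[str]] = {
    "clinician": {"read_records", "write_notes"},
    "scheduler": {"read_schedule", "write_schedule"},
}


def audit_access(current_role: str, granted: set[str]) -> set[str]:
    """Return permissions held that the current role does not allow."""
    allowed = ROLE_PERMISSIONS.get(current_role, set())
    return granted - allowed


# An employee who moved from clinician to scheduler but kept an old permission.
excess = audit_access("scheduler", {"read_schedule", "write_notes"})
print(sorted(excess))  # ['write_notes']
```

Any non-empty result indicates a privilege that the role-change workflow failed to remove and that should be revoked.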

Protecting sensitive VA information as an alternative line of defense

  • Encrypting all information at rest
  • Enabling the encryption of sensitive shared information
  • Enforcing policies that restrict the sharing of sensitive information

Assuring the health of our IT supply chain by enforcing strict security requirements on our third-party software and service providers

  • Ensure that only approved IT products and services can be introduced into our estate
  • Proactively manage the set of IT solution providers including desktop, cloud services, and outsourced software developers to ensure they have strong security practices
  • Deploy a security testing process for all new updates delivered to the systems and devices in the estate

Effective Resource Allocation

We will institute a culture of relentless prioritization so we allocate resources effectively.

Assigning resources against our top priorities and transparently reporting our resource usage to stakeholders will engender the trust of our customers while directing organization focus and speed to the building and supporting of IT services most critical to VA’s mission.

This includes:

  • Analysis of where our resources are allocated, to ensure that they are applied against our top priorities
  • Analysis of efficiencies we can drive across the organization with existing resources, such as a more effective use of government employees and contractors and a tight embrace of our Federal Information Technology Acquisition Reform Act (FITARA) responsibilities
  • Aggressive prioritization and re-prioritization of how we spend those dollars, including decisions on what we are funding versus not funding, what the portfolio-level “cut” lines are, and where funding gaps exist for critical requirements — all while promoting continuous transparency and candor with our business partners.

Page last updated on April 21, 2022


An official website of the U.S. Department of Veterans Affairs
