Infrastructure

Why Infrastructure Security Must Include OT and AI

October 2024 • 10 min read • Imane E.

Critical infrastructure—power grids, water systems, transportation networks—operates on two parallel technology stacks that have historically remained separate. Operational Technology (OT) controls physical processes; Information Technology (IT) manages data and communications. As artificial intelligence increasingly integrates into both OT and IT, the boundary between them dissolves, creating security challenges that neither cybersecurity nor engineering disciplines alone can address.

The OT-IT Convergence

Traditionally, critical infrastructure operated with strict separation:

OT Systems: SCADA controllers, programmable logic controllers (PLCs), and industrial sensors managing physical processes. Designed for reliability and safety, OT systems prioritize uptime over security. Many were deployed decades ago and run proprietary software that receives no security updates.

IT Systems: Corporate networks, databases, and administrative systems managing business operations. These systems are security-conscious, regularly updated, and designed to prevent unauthorized access.

The convergence is driven by operational benefits: remote monitoring, predictive maintenance using AI, and optimization of resource usage through machine learning. But connecting OT to IT-style networks exposes control systems that were designed for isolation to the remote attack paths those networks carry.

AI as Force Multiplier for Both OT and IT Attacks

Artificial intelligence complicates the security picture in three ways:

Enhanced Attack Detection: AI systems monitor OT infrastructure for anomalies such as unusual vibration patterns in machinery, unexpected temperature changes, and abnormal power consumption. The same systems can be poisoned so that they miss actual attacks or raise false alarms during normal operation.

Autonomous Decision Making: AI systems increasingly make operational decisions in critical infrastructure—adjusting power grid frequency, controlling water flow rates, managing traffic signals. Unlike human operators who understand context, AI systems follow learned patterns. Adversaries can exploit these patterns to cause physical damage.

Attack Optimization: Adversaries use machine learning to optimize attacks against critical infrastructure. AI can identify the most efficient targets (which infrastructure disruption maximizes cascading failures), the best attack timing (when human operators are least alert), and the most difficult-to-detect attack methods (anomalies that machine learning models have been trained to ignore).

The Security Integration Challenge

Cybersecurity professionals understand how to prevent network intrusions and detect data theft. Control system engineers understand how to maintain physical reliability and prevent equipment damage. But OT-IT security requires integration of both disciplines.

IT Security Limitations for OT:

Frequent security patches and software updates can destabilize OT systems that require continuous operation
Encryption consumes computational resources that real-time control loops also need
Intrusion detection systems tuned for IT traffic generate false positives on normal OT communication patterns
Zero-trust architecture (authentication required for every connection) can add latency that conflicts with real-time control requirements measured in milliseconds

OT Reliability Conflicts with IT Security:

OT engineers resist network monitoring and access logging that might affect performance
Safety-critical interlocks and failsafes sometimes conflict with audit trails and accountability measures
Segmentation of networks (reducing OT exposure) limits visibility for security monitoring

AI Integration Amplifies These Challenges

Machine learning systems deployed in critical infrastructure must satisfy contradictory requirements:

Accuracy vs. Safety: AI systems must be highly accurate (to avoid false shutdowns), yet conservative enough to shut down operations when genuine anomalies indicate compromise.
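A rough calculation shows why false shutdowns dominate this trade-off. The sketch below estimates how many false alarms per day a detector would raise at a given per-sample false-positive rate. The one-sample-per-second rate and the three false-positive rates are illustrative assumptions, not figures from any real plant, and the calculation assumes samples are independent.

```python
def expected_false_alarms_per_day(samples_per_second: float,
                                  false_positive_rate: float) -> float:
    """Expected false alarms in 24 hours, assuming independent samples."""
    seconds_per_day = 24 * 60 * 60
    return samples_per_second * seconds_per_day * false_positive_rate


# Hypothetical sensor sampled once per second.
for fpr in (1e-3, 1e-5, 1e-7):
    alarms = expected_false_alarms_per_day(1.0, fpr)
    print(f"per-sample false-positive rate {fpr:.0e}: {alarms:.4f} false alarms/day")
```

Under these assumptions, a detector that is wrong on one sample in a thousand would still raise dozens of false alarms per day on a single sensor. Automatic shutdown therefore usually needs a far stricter threshold than operator alerting.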

Performance vs. Explainability: Fast AI inference for real-time control often means sacrificing interpretability—operators cannot understand why AI made specific decisions.

Detection vs. Adaptation: AI systems trained on historical data may miss novel attack methods. Adversaries continuously adapt attacks; static AI models become ineffective.

Implementation Framework

Step 1: Unified Security Governance

Critical infrastructure operators must establish security governance structures integrating cybersecurity and control system engineering:

Joint incident response teams with both IT and OT expertise
Security requirements specified in terms of physical outcomes, not technical controls
Decision authority shared between cybersecurity and engineering leadership

Step 2: Manipulation-Resistant AI Monitoring

Deploy machine learning systems that detect anomalies while minimizing false positives and resisting adversary manipulation:

Ensemble methods (multiple independent AI models) reduce the likelihood that all models are poisoned simultaneously
Anomaly detection based on physical relationships (e.g., pressure/flow rate correlation) is harder to manipulate than single-parameter detection
Continuous retraining on newly observed normal operations maintains model accuracy, provided the new data is vetted for poisoning first
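The first two points can be sketched together. In the example below, three simple rule-based checks stand in for independently trained models, and one of them tests a physical relationship between pressure and flow. Everything here is a hypothetical illustration: the `Reading` fields, the constants, and the simplified model in which pressure drop is proportional to flow squared are assumptions that a real site would replace with commissioning data.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    pressure_drop_kpa: float  # measured pressure drop across a pipe segment
    flow_lps: float           # reported flow rate, litres per second
    prev_flow_lps: float      # reported flow rate at the previous sample


# Hypothetical constants; real values would come from commissioning data.
K_KPA_PER_LPS2 = 2.0        # assumed model: pressure_drop ~= K * flow**2
RESIDUAL_TOL_KPA = 15.0     # allowed mismatch between model and measurement
FLOW_RANGE_LPS = (0.0, 12.0)
MAX_FLOW_STEP_LPS = 1.5


def range_check(r: Reading) -> bool:
    """Single-parameter check: is the flow outside its normal range?"""
    low, high = FLOW_RANGE_LPS
    return not (low <= r.flow_lps <= high)


def rate_check(r: Reading) -> bool:
    """Single-parameter check: did the flow change too fast?"""
    return abs(r.flow_lps - r.prev_flow_lps) > MAX_FLOW_STEP_LPS


def physics_check(r: Reading) -> bool:
    """Relationship check: does pressure agree with the reported flow?"""
    expected_kpa = K_KPA_PER_LPS2 * r.flow_lps ** 2
    return abs(r.pressure_drop_kpa - expected_kpa) > RESIDUAL_TOL_KPA


DETECTORS = (range_check, rate_check, physics_check)


def detectors_fired(r: Reading) -> list[str]:
    """Names of every detector that flags this reading."""
    return [d.__name__ for d in DETECTORS if d(r)]


normal = Reading(pressure_drop_kpa=50.0, flow_lps=5.0, prev_flow_lps=4.8)
# Flow value replayed by an attacker while the real pressure has changed.
spoofed = Reading(pressure_drop_kpa=120.0, flow_lps=5.0, prev_flow_lps=4.8)

print(detectors_fired(normal))
print(detectors_fired(spoofed))
```

The spoofed reading passes both single-parameter checks because the replayed flow value looks normal on its own. Only the relationship check notices that pressure and flow no longer agree. How many detectors must fire before the system acts automatically is a site-specific policy decision.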

Step 3: Explainable AI for Operator Confidence

AI systems must explain their decisions so operators understand whether alerts indicate genuine threats:

Feature importance analysis showing which data points triggered anomaly detection
Baseline comparisons (this measurement deviates X% from normal range)
Temporal analysis (is this anomaly transient or persistent?)
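The three explanation elements above can be combined into one alert report. The following is a minimal sketch in which per-signal z-scores stand in for feature importance. The function name, signal names, and thresholds are all hypothetical, and a production system would more likely use a model-specific attribution method.

```python
from statistics import mean, pstdev


def explain_alert(history, recent, persist_n=3, z_threshold=3.0):
    """Rank signals by how far their latest value sits from its baseline.

    history: dict of signal name -> list of values from normal operation
    recent:  dict of signal name -> list of latest values, oldest first
    """
    report = []
    for name, values in recent.items():
        base_mean = mean(history[name])
        base_std = pstdev(history[name]) or 1e-9  # avoid division by zero
        latest = values[-1]
        z_score = (latest - base_mean) / base_std
        # Baseline comparison: percentage deviation from the normal mean.
        percent = (100.0 * (latest - base_mean) / base_mean
                   if base_mean else float("nan"))
        # Temporal analysis: persistent only if the last persist_n samples
        # all exceed the threshold.
        tail = values[-persist_n:]
        persistent = len(tail) == persist_n and all(
            abs((v - base_mean) / base_std) > z_threshold for v in tail
        )
        report.append({
            "signal": name,
            "z_score": z_score,
            "percent_from_baseline": percent,
            "persistent": persistent,
        })
    # Largest deviation first, so the likely trigger appears at the top.
    report.sort(key=lambda row: abs(row["z_score"]), reverse=True)
    return report


history = {
    "vibration_mm_s": [1.0, 1.1, 0.9, 1.0, 1.0],
    "temperature_c": [60.0, 61.0, 59.0, 60.0, 60.0],
}
recent = {
    "vibration_mm_s": [1.0, 1.6, 1.7, 1.8],
    "temperature_c": [60.0, 60.0, 60.5],
}
for row in explain_alert(history, recent):
    print(row)
```

With this made-up data, vibration is listed first and marked persistent, while temperature stays within its normal spread. An operator can see at a glance which signal drove the alert and whether it is a lasting shift or a momentary blip.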

Step 4: AI Security Hardening

Machine learning systems themselves become attack targets:

Test AI models against adversarial inputs designed to fool systems
Monitor for data poisoning (adversaries injecting malicious training data)
Implement output validation that rejects obviously wrong decisions (for example, an AI commanding a physically impossible state)
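The output-validation point can be illustrated with a guard that sits between the model and the actuator. This is a sketch under stated assumptions: the `Envelope` type, the `validate_setpoint` function, and the limits are hypothetical, and real limits would come from the equipment's engineering specifications.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    min_value: float
    max_value: float
    max_step: float  # largest change allowed per control cycle


def validate_setpoint(proposed: float, current: float, env: Envelope):
    """Return (value_to_apply, reasons) for a setpoint proposed by a model.

    Non-finite commands are rejected outright. Out-of-range commands are
    clamped, and changes that are too large are rate-limited.
    """
    if not math.isfinite(proposed):
        return current, ["non-finite command rejected; holding current value"]

    reasons = []
    value = proposed
    if value < env.min_value or value > env.max_value:
        value = min(max(value, env.min_value), env.max_value)
        reasons.append("clamped to allowed range")

    step = value - current
    if abs(step) > env.max_step:
        value = current + math.copysign(env.max_step, step)
        reasons.append("rate-limited")

    return value, reasons


# Hypothetical valve position in percent, at most 5 points per cycle.
valve = Envelope(min_value=0.0, max_value=100.0, max_step=5.0)
print(validate_setpoint(42.0, 40.0, valve))   # accepted unchanged
print(validate_setpoint(250.0, 40.0, valve))  # clamped, then rate-limited
print(validate_setpoint(float("nan"), 40.0, valve))
```

The guard is deliberately independent of the model. A poisoned or fooled model then has to defeat a second, much simpler mechanism before it can drive equipment outside its envelope. Logging the returned reasons also gives the anomaly detection side a useful signal.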

Real-World Integration Scenarios

Power Grid Optimization with Security:

AI systems optimize power grid frequency, voltage, and load balancing. Simultaneously, AI detects anomalies indicating attacks. These systems must communicate—if anomaly detection flags suspicious activity, optimization AI must adjust to safer states pending investigation.

Water Treatment with Compromise Detection:

AI monitors treatment chemicals, adjusting dosing for optimal water quality. Simultaneously, AI detects anomalies indicating treatment system compromise. If AI detects unusual chemical concentration changes, treatment optimization AI must pause pending human authorization.

Transportation Signal Control with Attack Detection:

AI optimizes traffic signals and vehicle routing for efficiency. Simultaneously, AI detects attacks on signal systems. If anomalies are detected, AI must transition to a safe state (all-red signals, default behaviors) rather than continue optimizing.
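The three scenarios above share one pattern: an anomaly flag overrides the optimizer, and normal operation resumes only after a person signs off. A minimal sketch of that pattern follows. The `Supervisor` class, its method names, and the safe setpoint are hypothetical, and a real deployment would add authentication, audit logging, and a site-specific definition of "safe".

```python
from enum import Enum


class Mode(Enum):
    OPTIMIZING = "optimizing"
    SAFE_HOLD = "safe_hold"


class Supervisor:
    """Gates an optimizer's output behind an anomaly flag."""

    def __init__(self, safe_setpoint: float):
        self.mode = Mode.OPTIMIZING
        self.safe_setpoint = safe_setpoint

    def on_anomaly(self) -> None:
        # Any anomaly report latches the system into the safe state.
        self.mode = Mode.SAFE_HOLD

    def authorize_resume(self, operator_id: str, anomaly_cleared: bool) -> bool:
        # The latch releases only with a named operator and a cleared anomaly.
        if self.mode is Mode.SAFE_HOLD and operator_id and anomaly_cleared:
            self.mode = Mode.OPTIMIZING
            return True
        return False

    def select_setpoint(self, optimizer_output: float) -> float:
        # While held, the optimizer's suggestion is ignored entirely.
        if self.mode is Mode.OPTIMIZING:
            return optimizer_output
        return self.safe_setpoint


# Hypothetical chemical dosing rate, with a conservative default of 1.0.
dosing = Supervisor(safe_setpoint=1.0)
print(dosing.select_setpoint(1.4))  # optimizer in control
dosing.on_anomaly()
print(dosing.select_setpoint(1.4))  # held at the safe default
print(dosing.authorize_resume("", anomaly_cleared=True))  # no operator named
print(dosing.authorize_resume("operator-17", anomaly_cleared=True))
print(dosing.select_setpoint(1.4))  # optimizer back in control
```

The latch is one-directional by design. The detection side can force the safe state at any time, but nothing automated can release it.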

Workforce Development

Integrating OT and IT security requires a new category of professional: the OT-IT security specialist. Critical infrastructure operators need staff who understand:

Control system dynamics and physical safety requirements
Cybersecurity attack methods and detection mechanisms
Machine learning capabilities and limitations
How to diagnose failures arising from OT-IT integration conflicts

This requires cross-training existing staff or hiring specialists with a background in both domains.

Policy and Regulatory Framework

Regulatory standards for critical infrastructure must evolve to address OT-IT convergence:

Specify safety outcomes rather than specific technical controls
Require integrated risk assessment combining cybersecurity and physical safety
Mandate joint governance structures ensuring both disciplines have authority
Support research and education in OT-IT security integration

Conclusion

Critical infrastructure security in the AI era cannot remain siloed in separate IT and OT disciplines. As AI systems integrate into both technology stacks, security must be unified. This requires structural changes in how infrastructure operators organize security functions, how they train personnel, and how regulators assess and enforce security standards.

Organizations succeeding in this integration will enjoy superior security, improved operational efficiency, and confidence that AI systems enhance rather than complicate safety. Organizations failing to integrate IT and OT security will face escalating vulnerabilities as convergence proceeds without coordinated defense.
