Step 1 – A Well-Organized Folder Structure
Before configuring networking or access zones, we need to establish a well-organized directory structure. This foundation supports our entire MLOps infrastructure, from data storage to Kubernetes integration. The workflow typically runs in the following order:
- Folder structure
- Networking and DNS
- SmartConnect
- Access zones
Understanding the Structure
Base Path Convention
We will follow PowerScale best practices for Kubernetes CSI implementation:
/ifs/data/<cluster_name>/
This structure simplifies:
- Multi-cluster management
- CSI driver configuration
- Storage class implementation
- Future scalability
Our Implementation
/ifs/data/cls-01/          # Cluster root
└── ml-lab/                # ML access zone root
    ├── artifacts/         # Training outputs & metrics
    ├── datasets/          # ML training data
    ├── logs/              # System & application logs
    ├── models/            # Trained models
    └── rke2-mlops/        # Kubernetes PVC storage

Directory Purposes
- artifacts/ – Training outputs, metrics, and results
  - Experiment results
  - Model performance data
  - Visualization artifacts
- datasets/ – Training and validation data
  - Raw datasets
  - Processed data
  - Test datasets
- logs/ – Operational logging
  - Training logs
  - System logs
  - Application logs
- models/ – Model storage
  - Trained models
  - Model checkpoints
  - Production models
- rke2-mlops/ – Kubernetes storage
  - Persistent Volume Claims
  - RKE2 cluster storage
  - Container persistent storage
# Create base structure
mkdir -p /ifs/data/cls-01/ml-lab
# Create ML directories
cd /ifs/data/cls-01/ml-lab
mkdir -p artifacts datasets logs models rke2-mlops
# Set appropriate permissions
chmod 755 /ifs/data/cls-01/ml-lab
chmod 750 /ifs/data/cls-01/ml-lab/{artifacts,datasets,logs,models,rke2-mlops}
# Verify structure
tree /ifs/data/cls-01/ml-lab
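The commands above can also be wrapped in a small script so the layout is repeatable and can be rehearsed outside /ifs first. This is a minimal sketch, not part of the original procedure: the function names create_ml_layout and verify_ml_layout are made up for illustration, and the root path is passed as an argument so you can point it at a scratch directory.

```shell
#!/bin/sh
# Sketch: create and verify the ML lab directory layout.
# The root path is an argument; it defaults to the path used in this post.
# Function names are illustrative, not part of OneFS.

create_ml_layout() {
    root="${1:-/ifs/data/cls-01/ml-lab}"
    mkdir -p "$root" || return 1
    chmod 755 "$root" || return 1
    for dir in artifacts datasets logs models rke2-mlops; do
        # mkdir -p is safe to repeat, so the function can be re-run.
        mkdir -p "$root/$dir" || return 1
        chmod 750 "$root/$dir" || return 1
    done
}

verify_ml_layout() {
    root="${1:-/ifs/data/cls-01/ml-lab}"
    for dir in artifacts datasets logs models rke2-mlops; do
        [ -d "$root/$dir" ] || { echo "missing: $root/$dir" >&2; return 1; }
    done
    echo "layout OK: $root"
}
```

For example, create_ml_layout /tmp/ml-lab followed by verify_ml_layout /tmp/ml-lab exercises the layout in a throwaway location before touching the cluster.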

Good. Now that we have our folder structure, we can configure networking, which is a prerequisite for creating the access zone.
Step 2 – Networking
Before diving into ML workflows, containers, and AI tools, we need a solid foundation, and that starts with the network: data cannot flow until networking is in place. Our MLOps workloads need:
- Dedicated bandwidth for large datasets
- Predictable performance for training
- Isolated traffic for security
We will use two key OneFS features: Access Zones and SmartConnect. I am referencing the following publicly available design and best practice guide: Dell PowerScale: Network Design Considerations | Dell Technologies Info Hub

We will implement and test the following:

The network we will use for all of this is 192.168.30.x. We want everything to follow best practice and work via DNS (which is also a requirement of SmartConnect). Our DNS entry point to the ML lab will be ml-lab.lab.local, so our SmartConnect zone needs to be configured accordingly.

Our network config will be as follows:
- PowerScale Interface – ext-2
- Subnet(30) – 192.168.30.x – This is our IP subnet for the ml-lab and access zone
- SmartConnect Service IP – 192.168.30.30
- Node Pool IP Range – 192.168.30.31-35
- SmartConnect DNS Entry Point – ml-lab.lab.local
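Before typing these values into OneFS, it can help to sanity-check that the service IP and the pool range really sit inside the subnet. The following is a minimal sketch in POSIX shell; it assumes a /24 mask (matching the 255.255.255.0 netmask used later in this post), and the helper names ip_to_int, in_subnet, and check_plan are hypothetical.

```shell
#!/bin/sh
# Sketch: check that planned addresses fall inside the planned subnet.
# Helper names are illustrative; nothing here talks to the cluster.

ip_to_int() {
    # Convert a dotted-quad IPv4 address to an integer.
    old_ifs="$IFS"; IFS=.
    set -- $1
    IFS="$old_ifs"
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

in_subnet() {
    # in_subnet IP NETWORK PREFIXLEN: exit 0 when IP is inside NETWORK/PREFIXLEN.
    _ip=$(ip_to_int "$1")
    _net=$(ip_to_int "$2")
    _mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    [ $(( _ip & _mask )) -eq $(( _net & _mask )) ]
}

check_plan() {
    # check_plan NETWORK PREFIXLEN ADDR...: print one verdict per address.
    plan_net="$1"; plan_prefix="$2"; shift 2
    plan_rc=0
    for addr in "$@"; do
        if in_subnet "$addr" "$plan_net" "$plan_prefix"; then
            echo "ok $addr"
        else
            echo "FAIL $addr is outside $plan_net/$plan_prefix"
            plan_rc=1
        fi
    done
    return $plan_rc
}
```

With the values from this post, check_plan 192.168.30.0 24 192.168.30.30 192.168.30.31 192.168.30.35 should report every address as ok.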

SmartConnect: Network Load Balancing and DNS Resolution
SmartConnect manages client connections to your PowerScale cluster through DNS-based load balancing, but there is an important distinction in how network traffic actually flows.
How SmartConnect Really Works (Hint, it’s a DNS Server!)
- The client requests a connection to the SmartConnect service name, in our case ml-lab.lab.local
- DNS resolution occurs:
  - The SmartConnect service IP does NOT serve data traffic. In our configuration, 192.168.30.30 is only the DNS endpoint that answers queries for the zone name
  - Instead, it returns an IP address from the pool of available node IP addresses
  - The returned IP belongs to a specific node in the cluster
- Actual data traffic flows directly through node IP addresses, not through a service IP
- If a node becomes unavailable, DNS will resolve to different node IP addresses
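The flow above can be illustrated with a toy model. This is only a sketch of the round-robin idea, assuming a fixed pool and a simple list of unavailable nodes. It is not how OneFS implements SmartConnect internally, and the function name pick_node is made up for illustration.

```shell
#!/bin/sh
# Toy model: which node IP would a round-robin DNS responder hand out?
# pick_node QUERY_NUMBER "POOL_IPS" "DOWN_IPS"
# POOL_IPS and DOWN_IPS are space-separated lists.

pick_node() {
    query="$1"; pool="$2"; down="$3"
    # Build the list of IPs whose node is still available.
    avail=""
    count=0
    for ip in $pool; do
        case " $down " in
            *" $ip "*) ;;    # node is down: never hand out its IP
            *) avail="$avail $ip"; count=$((count + 1)) ;;
        esac
    done
    [ "$count" -gt 0 ] || return 1    # nothing left to hand out
    # Round robin: query N gets available IP number (N mod count).
    index=$(( query % count ))
    i=0
    for ip in $avail; do
        if [ "$i" -eq "$index" ]; then echo "$ip"; return 0; fi
        i=$((i + 1))
    done
}
```

Note that the service IP 192.168.30.30 never appears in the answers: the responder only ever returns addresses from the node pool, and skips any node marked as down.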

Key Technical Points
- The SmartConnect service IP is defined at the subnet level
- SmartConnect service name is purely a DNS entry point
- No traffic passes through a SmartConnect “Service IP”
- Real data transfer occurs directly with node IP addresses
- IP pool configuration is critical for proper load distribution
- DNS TTL settings affect how quickly connection changes propagate

Configuring Networking, Access Zone and SmartConnect
First, we need to create our subnet within the groupnet.
Step 1: PowerScale Network Pool Configuration
# Create the subnet with the SmartConnect service IP (command line)
# Flag names differ between OneFS releases; confirm with: isi network subnets create --help
isi network subnets create subnet-mlops \
  --addr=192.168.30.0 \
  --gateway=192.168.30.1 \
  --netmask=255.255.255.0 \
  --sc-service-addr=192.168.30.30 \
  --sc-service-name=ml-lab.lab
#Configure the network pool with SmartConnect (GUI)

Before creating the network pool, we have to create our Access Zone.

Step 2: Configure Access Zone
Let’s create the Access Zone and assign it to our ml-lab network before we configure DNS and test the setup. (I always prefer to do this step through the GUI for some reason!)
In the GUI, go to Access -> Access Zones.
Here we will create our “ml-lab” access zone and configure the zone base directory to be /ifs/data/cls-01/ml-lab

Next, let’s assign this access zone to the “ml-ops” pool. (Remember, Access Zones are assigned at the pool level.)

OK, we’re good to finish our network config.
# Create network pool
isi network pools create mlpool \
  --ranges=192.168.30.31-192.168.30.35 \
  --ifaces=ext-2 \
  --sc-dns-zone=ml-lab.lab.local \
  --sc-connect-policy=round_robin \
  --sc-dns-zone-aliases=mlops.lab.local \
  --description="ML Lab Network Pool"
# Create network pool (GUI)



OK, now we’re ready to configure and test DNS for SmartConnect.
PowerScale DNS Configuration Best Practices for ML-Ops
After setting up our network pool and access zones, proper DNS configuration is crucial for a robust ML-Ops environment. Let’s dive into DNS delegation best practices and implementation.
DNS Architecture Overview
In our ML-Ops setup, we’re implementing the following DNS structure:
- Primary Zone: lab.local
- SmartConnect Zone: ml-lab.lab.local
- Service IP (SSIP): 192.168.30.30
DNS Delegation Best Practices
Again, I’m referring to the following best practices and implementation guide: DNS delegation best practices | Dell PowerScale: Network Design Considerations | Dell Technologies Info Hub

Use Address (A) Records, Not Direct IP Delegation
Always delegate to Address (A) records rather than IP addresses directly. This approach simplifies:
- Business continuity management
- Maintenance operations
- Disaster recovery scenarios
First things first: let’s create an A record pointing to our SmartConnect Service IP (SSIP). We’ll then delegate to this record for our DNS entry.

Next, right-click the lab.local folder object and choose “New Delegation”.

The FQDN I’ve chosen is ml-lab.lab.local. It is delegated to the SmartConnect service address A record, cls01-ssip.lab.local (192.168.30.30). So what are we doing here? We’re saying that any lookup for ml-lab.lab.local should be handed to cls01-ssip.lab.local (192.168.30.30). As we’ve already discussed, SmartConnect is a DNS server, so it takes this query (passed on by our local DNS server) and returns an IP address for ml-lab.lab.local from the available address pool. If you think about it, at no stage have we defined an actual storage IP address for ml-lab.lab.local anywhere in our lab setup, nor should we!
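One way to see the two halves of this delegation is to ask the SSIP directly and then ask through the normal resolver. The sketch below assumes the dig utility is installed on the client; the function names ask_ssip, ask_resolver, and check_delegation are hypothetical helpers, not part of any tool.

```shell
#!/bin/sh
# Sketch: compare a direct SSIP query with a query through the delegation.
# Assumes dig is available on the client.

ask_ssip() {
    # ask_ssip NAME SSIP: query the SmartConnect service IP directly.
    dig +short "@$2" "$1" A
}

ask_resolver() {
    # ask_resolver NAME: query via the host's configured DNS server,
    # which exercises the delegation.
    dig +short "$1" A
}

check_delegation() {
    name="$1"; ssip="$2"
    direct=$(ask_ssip "$name" "$ssip")
    via_dns=$(ask_resolver "$name")
    if [ -z "$direct" ]; then
        echo "SSIP $ssip returned no answer for $name"
        return 1
    fi
    if [ -z "$via_dns" ]; then
        echo "SSIP answers, but the delegation does not resolve $name"
        return 2
    fi
    echo "direct=$direct delegated=$via_dns"
}
```

Running check_delegation ml-lab.lab.local 192.168.30.30 should print two node addresses. They do not have to match, since each query may be answered with a different IP from the pool.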

Implement One Name Server Record Per Zone
We are only creating one delegation for our MLOps lab, but in general you should create an individual delegation for each SmartConnect zone or alias. This enables:
- Granular failover control
- Independent zone management
- Workflow isolation
Implementation Steps (PowerShell)
1. Configure Windows DNS Server
# Create Primary Forward Lookup Zone
Add-DnsServerPrimaryZone -Name "lab.local" -ZoneFile "lab.local.dns"
# Create SmartConnect Service IP A Record
Add-DnsServerResourceRecordA -Name "ml-lab-ssip" -ZoneName "lab.local" -IPv4Address "192.168.30.30"
# Add SmartConnect Zone Delegation
Add-DnsServerZoneDelegation -Name "lab.local" -ChildZoneName "ml-lab" -NameServer "ml-lab-ssip.lab.local" -IPAddress "192.168.30.30"
2. Verify DNS Configuration
Windows DNS config

Let’s test DNS resolution for ml-lab.lab.local from our Linux host that will run the ClearML Agent. Remember, we expect an IP address to be returned from the node pool range, 192.168.30.31-35.
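The same check can be scripted so that several lookups are made and each answer is compared against the pool range. This is a minimal sketch that assumes dig is installed on the Linux host; in_pool and sample_smartconnect are illustrative names, and the range is hard-coded to the 192.168.30.31-35 pool used in this post.

```shell
#!/bin/sh
# Sketch: resolve the SmartConnect name several times and confirm that
# every answer falls inside the node pool range 192.168.30.31-35.

in_pool() {
    # in_pool IP: exit 0 when IP is one of the five pool addresses.
    case "$1" in
        192.168.30.3[1-5]) return 0 ;;
        *) return 1 ;;
    esac
}

sample_smartconnect() {
    # sample_smartconnect NAME [COUNT]: COUNT defaults to 5 lookups.
    name="$1"; n="${2:-5}"
    i=0
    while [ "$i" -lt "$n" ]; do
        answer=$(dig +short "$name" A | head -n 1)
        if in_pool "$answer"; then
            echo "query $i -> $answer"
        else
            echo "query $i -> unexpected answer '$answer'"
            return 1
        fi
        i=$((i + 1))
    done
}
```

Call it as sample_smartconnect ml-lab.lab.local 5. If a caching resolver sits between the host and the DNS server, repeated lookups may return the same address until the record's TTL expires.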

192.168.30.31 returned, success! We’re ready to test an export and confirm zone access.
mount -t nfs ml-lab.lab.local:/ifs/data/cls-01/ml-lab/datasets /mnt/datasets
We can verify with the df -h command. As a first step, we can see our datasets folder is mounted. We’ll make this persistent by configuring /etc/fstab (and adding the other folders), but for now we know our environment is functioning as expected.
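For the persistent mounts, a small helper can generate the fstab entries for review rather than typing them by hand. This is a sketch under a few assumptions: each directory has its own NFS export, the mount points mirror the directory names under /mnt, and defaults,_netdev is an acceptable starting set of mount options. The function name fstab_lines is made up for illustration.

```shell
#!/bin/sh
# Sketch: print /etc/fstab lines for the ML lab exports.
# fstab_lines SERVER EXPORT_ROOT MOUNT_ROOT DIR...
# Output is printed only; nothing is written to /etc/fstab.

fstab_lines() {
    server="$1"; export_root="$2"; mount_root="$3"; shift 3
    for dir in "$@"; do
        # _netdev delays the mount until the network is up.
        printf '%s:%s/%s %s/%s nfs defaults,_netdev 0 0\n' \
            "$server" "$export_root" "$dir" "$mount_root" "$dir"
    done
}
```

For example, fstab_lines ml-lab.lab.local /ifs/data/cls-01/ml-lab /mnt datasets models artifacts logs prints one line per folder. After reviewing the output, append it to /etc/fstab, create the mount points, and run mount -a.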

Next up, Part 3: Container Orchestration, “Orchestrating ML: Rancher and PowerScale Integration”. It covers persistent volume claims for ML workloads, PowerScale CSI driver setup, storage class configuration, and dynamic volume provisioning for ClearML Server.

