At Pivotal Labs we use DeployStudio to rapidly image machines over the network. It was an excellent solution when the DeployStudio server and the client were on the same subnet. It did not work when they were on different subnets.
We found that, with a combination of clever use of tcpdump, a carefully-crafted dhcpd configuration file, and a judicious set of firewall exceptions, we were able to extend DeployStudio so that it worked across subnets.
Unfortunately, it was an epic fail: every third install would cause our firewall (m0n0wall 1.8.0b512) to lock up. We have put the project on ice until we get a new firewall.
This blog post is intended for IT organizations with the following characteristics
See Ryan’s comments below. With a few lines of Cisco configuration (assuming you have a Cisco router), you can easily configure DeployStudio boots across subnets.
The rest of this blog post is the much more difficult path that I took, and I don’t recommend it unless you really enjoy doing things the hard way.
To make DeployStudio work across subnets, you first need to use tcpdump to capture how it works within a subnet. In this case, we used a laptop (kate-enet), and our DeployStudio server (deploystudio).
First, we started the capture. We captured to a file so that we could examine the output at our leisure. We ran the following command on our deploystudio server:
sudo tcpdump -w /tmp/kate.tcp -s 1536 host kate-enet
Next, we started a network install:
Then we examined the tcpdump file using the following command:
sudo tcpdump -r /tmp/kate.tcp -vvv | less
There were two packets we were particularly interested in:
deploystudio.sf.pivotallabs.com.bootps > kate-enet.sf.pivotallabs.com.bootpc: [bad udp cksum 2b5a!] BOOTP/DHCP, Reply, length 319, Flags [none] (0x0000)
Client-IP kate-enet.sf.pivotallabs.com
Client-Ethernet-Address 40:6c:8f:3d:e6:b4 (oui Unknown)
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: ACK
Server-ID Option 54, length 4: deploystudio.sf.pivotallabs.com
Vendor-Class Option 60, length 9: "AAPLBSDPC"
Vendor-Option Option 43, length 56: 1.1.1.4.2.127.209.7.4.130.0.4.56.8.4.130.0.4.56.9.35.130.0.4.56.30.49.48.46.56.95.109.97.99.95.109.105.110.105.95.115.101.114.118.101.114.45.50.48.49.50.45.48.56.48.54
END Option 255, length 0
And
deploystudio.sf.pivotallabs.com.bootps > kate-enet.sf.pivotallabs.com.bootpc: [bad udp cksum 254b!] BOOTP/DHCP, Reply, length 379, Flags [none] (0x0000)
Client-IP kate-enet.sf.pivotallabs.com
Server-IP deploystudio.sf.pivotallabs.com
Client-Ethernet-Address 40:6c:8f:3d:e6:b4 (oui Unknown)
sname "deploystudio.sf.pivotallabs.com"
file "/private/tftpboot/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/i386/booter"
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: ACK
Server-ID Option 54, length 4: deploystudio.sf.pivotallabs.com
Vendor-Class Option 60, length 9: "AAPLBSDPC"
RP Option 17, length 93: "nfs:10.80.28.64:/Library/NetBoot/NetBootSP0:10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg"
Vendor-Option Option 43, length 21: 1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48
END Option 255, length 0
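The Vendor-Option (option 43) bytes in these replies appear to be a run of code/length/value triples. Here is a minimal Ruby sketch that splits tcpdump's dotted-decimal string along those lines. The `parse_tlv` helper is our own illustration (it is not part of DeployStudio or tcpdump), and the triple layout is an assumption based on how the bytes in the capture line up.

```ruby
# Split a dotted-decimal option-43 payload (as printed by tcpdump) into
# [code, value_bytes] pairs, ASSUMING a code/length/value layout.
def parse_tlv(dotted)
  bytes = dotted.split(".").map { |n| Integer(n, 10) }
  fields = []
  until bytes.empty?
    code, length = bytes.shift(2)
    if length.nil? || bytes.length < length
      raise ArgumentError, "truncated field with code #{code}"
    end
    fields << [code, bytes.shift(length)]
  end
  fields
end

# Payload copied from the second packet above.
payload = "1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48"
parse_tlv(payload).each do |code, value|
  hex = value.map { |b| format("%02x", b) }.join(":")
  puts "code #{code}, length #{value.length}: #{hex}"
end
```

For the second packet this gives three fields: code 1 (one byte), code 8 (four bytes), and code 130 (ten bytes that spell "NetBoot050" in ASCII).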
Note:
There are 4 crucial pieces of data that you must capture.
We then added the information we had culled from the tcpdump to our dhcpd.conf file (special thanks to Pepijn Oomen and Bennett Perkins; see bibliography):
class "netboot" {
match if substring (option vendor-class-identifier, 0, 9) = "AAPLBSDPC";
option dhcp-parameter-request-list 1,3,17,43,60;
if (option dhcp-message-type = 1) {
option vendor-class-identifier "AAPLBSDPC";
option vendor-encapsulated-options
08:04:81:00:00:89; # bsdp option 8 (length 04) -- selected image id;
} elsif (option dhcp-message-type = 8) {
option vendor-class-identifier "AAPLBSDPC";
if (substring(option vendor-encapsulated-options, 0, 3) = 01:01:01) {
log(debug, "bsdp_msgtype_list");
# bsdp image list message:
# one image, plus one default image (both are the same)
option vendor-encapsulated-options
01:01:01:04:02:7f:d2:07:04:82:00:04:38:09:23:82:00:04:38:1e:31:30:2e:38:5f:6d:61:63:5f:6d:69:6e:69:5f:73:65:72:76:65:72:2d:32:30:31:32:2d:30:38:30:36;
} else {
log(debug, "bsdp_msgtype_select");
# details about the selected image
#
option vendor-encapsulated-options
01:01:02:08:04:82:00:04:38:82:0a:4e:65:74:42:6f:6f:74:30:35:30;
next-server deploystudio.sf.pivotallabs.com;
filename "/private/tftpboot/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/i386/booter";
option root-path = "nfs:10.0.0.64:/Library/NetBoot/NetBootSP0:10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg";
}
}
}
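If your image has a different name or id, the long image-list string has to be rebuilt by hand. The following Ruby sketch shows one way to assemble it from its parts. The `tlv` and `colon_hex` helpers are hypothetical names of our own, and the meaning we attach to each field (message type, priority, default image, image list) is our reading of the capture rather than something confirmed by the DeployStudio documentation.

```ruby
# Encode one code/length/value triple as an array of byte values.
def tlv(code, value_bytes)
  [code, value_bytes.length] + value_bytes
end

# Render bytes in the colon-separated hex form dhcpd.conf expects.
def colon_hex(bytes)
  bytes.map { |b| format("%02x", b) }.join(":")
end

image_id   = [0x82, 0x00, 0x04, 0x38]            # id bytes taken from the capture
image_name = "10.8_mac_mini_server-2012-0806"    # name as it appears in the capture

# One image-list entry: 4-byte id, 1-byte name length, then the name.
image_entry = image_id + [image_name.bytesize] + image_name.bytes

payload = tlv(1, [1]) +             # assumed: message type 1 (list)
          tlv(4, [0x7f, 0xd2]) +    # assumed: server priority
          tlv(7, image_id) +        # assumed: default image id
          tlv(9, image_entry)       # assumed: image list with one entry

puts colon_hex(payload)
```

With the values above, the output matches the image-list string in the dhcpd.conf excerpt byte for byte.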
Resist the temptation to substitute a hostname for the NFS server’s IP address (i.e., leave it “nfs:10.0.0.64”; do not put “nfs:deploystudio.sf.pivotallabs.com”). IP addresses will work; hostnames won’t.
We used Ruby (irb) to convert the dotted-decimal strings in the tcpdump output to the colon-separated hexadecimal that dhcpd.conf expects. In the following example, we convert “1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48”:
bc$ irb
1.9.3p194 :001 > string="1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48"
=> "1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48"
1.9.3p194 :002 > string.split(".").each { |n| printf("%02x:",n) }; p
01:01:02:08:04:82:00:04:38:82:0a:4e:65:74:42:6f:6f:74:30:35:30: => nil
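The irb one-liner leaves a trailing colon that has to be trimmed before pasting. If you do this conversion often, a small script version may be handier. This is only a sketch; `dotted_to_colon_hex` is a name we made up for illustration.

```ruby
# Convert tcpdump's dotted-decimal byte string into colon-separated hex,
# without a trailing colon, rejecting anything that is not a byte value.
def dotted_to_colon_hex(dotted)
  dotted.split(".").map do |n|
    byte = Integer(n, 10)
    raise ArgumentError, "#{n} is not a byte value" unless (0..255).cover?(byte)
    format("%02x", byte)
  end.join(":")
end

puts dotted_to_colon_hex("1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48")
```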
If you have a firewall arbitrating traffic between the subnets, you’ll need to allow all inbound traffic to your DeployStudio server. Additionally, if your firewall can’t snoop TFTP traffic, you’ll need to allow outbound UDP traffic on unreserved ports (1024 – 65535).
If you’re having problems, check that TFTP and NFS are working, preferably from a machine that’s on the same subnet as the client you’re trying to image.
In our example, we know that our tftp server is deploystudio.sf.pivotallabs.com, and the file we’re downloading is /private/tftpboot/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/i386/booter. Let’s try from the command line:
bc $ tftp deploystudio.sf.pivotallabs.com
tftp> get /private/tftpboot/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/i386/booter
Received 993680 bytes in 18.3 seconds
Testing NFS is a little tricky because the NFS path is slightly mangled. Specifically, a “:” is substituted for the second-to-last “/” in the pathname. For example, the DHCP root-path directive “nfs:10.80.28.64:/Library/NetBoot/NetBootSP0:10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg” translates to a pathname of “/net/10.80.28.64/Library/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg” for testing purposes on a client machine. We take advantage of automount running on a typical OS X client: first do an ls to make sure we can see the file, then do a cp to make sure we can read the file:
ls /net/10.80.28.64/Library/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg
cp /net/10.80.28.64/Library/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg /dev/null
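The path translation is easy to get wrong by hand, so here is a small Ruby sketch of it. The `automount_path` helper is hypothetical, and it assumes the root-path always has the four colon-separated parts seen in our capture (scheme, server IP, share path, image path).

```ruby
# Translate a NetBoot root-path string into the /net automount path
# used for testing from an OS X client.
def automount_path(root_path)
  scheme, host, share, image = root_path.split(":", 4)
  unless scheme == "nfs" && host && share && image
    raise ArgumentError, "unexpected root-path: #{root_path}"
  end
  "/net/#{host}#{share}/#{image}"
end

puts automount_path(
  "nfs:10.80.28.64:/Library/NetBoot/NetBootSP0:10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg"
)
```

The printed path is the one used in the ls and cp commands above.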
The time required to image a machine will more than double. A typical install will take 40 minutes or more.
Certain operations are much slower. Specifically, the time between selecting the NetBoot server and being presented with the DeployStudio runtime screen is approximately 7 minutes. We have studied that lag, and over 4 minutes of it is due to abysmal (3.8kBps) TFTP throughput. We are unclear why there is such a gross lag; running the same tftp on the command line completes 20x faster (74.7kBps).
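As a rough sanity check on those numbers, the Ruby sketch below works out how long the booter file from the tftp test (993,680 bytes) would take at each rate. It assumes that “kBps” means 1,000 bytes per second and that the booter is the file being measured; neither assumption is confirmed by our measurements.

```ruby
BOOTER_BYTES = 993_680 # size reported by the command-line tftp test

# Seconds needed to move `bytes` at a rate given in kB/s (1 kB = 1000 bytes).
def transfer_seconds(bytes, kbytes_per_second)
  bytes / (kbytes_per_second * 1000.0)
end

slow = transfer_seconds(BOOTER_BYTES, 3.8)   # rate seen during netboot
fast = transfer_seconds(BOOTER_BYTES, 74.7)  # rate seen on the command line

puts format("netboot:      %.0f s (%.1f min)", slow, slow / 60)
puts format("command line: %.0f s", fast)
puts format("ratio:        %.1fx", 74.7 / 3.8)
```

Under those assumptions the booter alone takes roughly 261 seconds (about 4.4 minutes) at 3.8kBps, which lines up with the lag of over 4 minutes.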
We have a firewall that negotiates traffic between our subnets, and we are aware that TFTP poses challenges for firewalls because it re-negotiates its destination port. (Cisco firewalls have special directives to handle TFTP traffic appropriately.)
Hello,
I’m not certain how your network is set up, but I’ve been able to use DeployStudio across subnets for some time now. The way that I’ve set it up is to have the DeployStudio server configured as a “DHCP Helper” (AKA DHCP Relay) in addition to your actual DHCP server.
This is a setting done on the router interface for each subnet. Most routers will allow you to enter multiple DHCP servers into the list of “helpers”. I know at least Cisco, HP, and Juniper do.
Just log into your router and add a DHCP helper to each router interface for the subnets that you want to be able to use DeployStudio in. If your DeployStudio server IP is 192.168.1.200, a DHCP helper entry on a Cisco router would look something like:
config t
interface Fast 0/0 # This is the subnet that you want to use DeployStudio in but currently can’t
ip helper-address 192.168.1.200
end
write mem
What happens is every helper in the list on the router is forwarded a copy of any DHCP request that happens in that subnet. DeployStudio is configured to respond with special information when certain “options” are requested by a system. Your DHCP server will still be used to get an IP address, but DeployStudio will be used if any network bootable drives are requested by the computer. This happens during the bootp process if you hold down the “N” key on a Mac while booting. So it is pretty seamless once you add the helper information to the router.
Hope this info helps.
Ryan.
Thanks again Ryan. I’ve updated my blog post to say, “Read Ryan’s comment: he has a much better way of doing it”.
Ryan, thanks for the tip. Once we install a router to manage our inter-subnet traffic (currently our firewall does it, and the firewall does not handle the tftp traffic gracefully), we’ll most likely take the path that you suggest—it’s much easier than crafting custom DHCP records.