Wednesday, January 31, 2007

My 3 Parallel Systems...

The way I look at things, parallel computing falls into three basic areas:

  • Shared Memory Systems (SMP)

  • Distributed Systems (Clusters)

  • Hybrids (NUMA)

I plan to have one of each running at my house, and I got the first one up this week. I wanted something that was fairly massively multi-processor, and I just so happened to have a bunch of Sun parts in the garage and a spare SPARCserver 1000E. I loaded it up with 4x 4.5GB drives, 8x 85MHz SuperSPARCs, and 2GB RAM. After a failed attempt at SMP Linux, I've installed Solaris 8. So far so good. Now, to get pkgsrc running...

Monday, January 22, 2007

Setting up a VMware HA Cluster using RedHat Cluster

Once upon a time a company I worked for was migrating their systems to VMware in a consolidation effort. They had a SAN, they had a BladeCenter, and they had VMware. And we were all pretty pumped about the VMware, let me tell you. It was like the second coming.

But there was a problem, you see. We were using RedHat Cluster on Debian Sarge. And RedHat Cluster didn't support fencing machines through VMware. What's fencing, you ask? Well, in cluster-land, when a node becomes unreachable or out of sync with the others, it is 'fenced'. This is a mechanism to ensure data integrity by forcibly removing the node from the cluster and the cluster storage. In RedHat Cluster, this can be achieved by either removing the node from the storage in the switch, or by rebooting the node via a LOM card or an addressable power supply. If we did either of those in VMware, we'd take everyone else on the same host down with us, and that would be very, very bad.

Well, luckily for us, VMware provides a Perl API for connecting to the host machine. I used this API to modify the fence_apc script, which normally telnets to an APC MasterSwitch to fence a node, into a version that would work for VMware Server or ESX.

First you must set up a fence.ccs file.

fence_devices {
esxblade1 {
agent = "fence_apc"
ipaddr = "10.0.0.1"
login = "fence"
passwd = "password"
}
esxblade2 {
agent = "fence_apc"
ipaddr = "10.0.0.2"
login = "fence"
passwd = "password"
}
}

This defines two fence devices, esxblade1 and esxblade2. It lists their IP addresses, a login ID and a password, and the agent to use, which in this case is fence_apc. I placed my script in /sbin/fence_apc and moved the original out of the way to make sure there wouldn't be a problem. The login ID and password should belong to an account on the VMware host system that is allowed to reboot, start up, and shut down the VMs in question.

Next, we create a nodes.ccs file.

nodes {
node1 {
ip_interfaces {
eth0 = "10.1.0.1"
}
fence {
vmware {
esxblade2 {
port = "/home/vmware/node1.vmx"
}
}
}
}
node2 {
ip_interfaces {
eth0 = "10.1.0.2"
}
fence {
vmware {
esxblade1 {
port = "/home/vmware/node2.vmx"
}
}
}
}
}

This specifies the public interfaces for two machines, known as node1 and node2. It also defines a fencing mechanism for node1 and node2, vmware. The vmware fence script takes as its argument a 'port', which is really the path on the host system's filesystem to the .vmx file for the virtual machine.
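To give a feel for the first thing an agent like this has to do: as far as I know, the fence daemon hands an agent its options as name=value lines on standard input rather than as command-line switches. Here is a minimal sketch of that parsing step in Python. The real script is Perl, and the function name `parse_fence_options` and the sample values are purely my own illustration, not part of RedHat Cluster.

```python
def parse_fence_options(text):
    """Parse name=value lines into a dict, skipping blanks and # comments."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines without an '=' sign
        options[name.strip()] = value.strip().strip('"')
    return options


# Illustrative input, mirroring the fence.ccs and nodes.ccs entries for node1.
sample = """
agent=fence_apc
ipaddr=10.0.0.2
login=fence
passwd=password
port=/home/vmware/node1.vmx
"""

opts = parse_fence_options(sample)
print(opts["ipaddr"], opts["port"])
```

With the options in a dict, the agent knows which host to log in to (`ipaddr`) and which virtual machine to power-cycle (`port`).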

The script for fence_vmware is available here.

You should be able to use this, in conjunction with the documentation for RedHat Cluster, to set yourself up a test VMware cluster. If not, feel free to shoot me an email and I'll try and help you out.

Sun announces partnership with Intel

Sun and Intel announced today that Sun will begin shipping Intel-based servers and workstations in Q1 2007. This is pretty exciting. Intel has some nice looking stuff coming down the pipe, and I'm glad to see that Sun isn't tying itself too closely to a single CPU vendor, unlike some *cough* companies.


The 'official' announcement is available here, and Jonathan's blog posting is here.

Thursday, January 18, 2007

Mac OS X... it will remind you of Unix.

OS X may be the greatest Unix GUI system ever invented, but it still leaves me irked at times.

A friend of mine asked me a few weeks back about a way on OS X to redirect sound from one system to another. I proposed this solution, thinking it rather clever:

dd if=/dev/audio | ssh remotehost dd bs=1k of=/dev/audio

Seems like it should work, right? Wrong. For you see, there's no /dev/audio on OS X. Or /dev/mixer. Or anything, really. How annoying. So how does one use audio in a standard Unix-y way on OS X?

I had another friend ask me about making a frequency generator application on her Mac to show her Physics students. In the 'old days', this would have been no trouble, but it seems times have changed. It's now insanely difficult to output any sort of sound on any computer, much less a Mac. Since there's no /dev/audio device, none of the examples I found would work. Luckily for her, I found an application that should do the trick, but I'm still upset that I couldn't successfully make it work using Python.
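One way around the missing /dev/audio, at least for a simple frequency generator, is to skip the sound device entirely and write the tone to a WAV file that any audio player can open. This is only a sketch using Python's standard `wave`, `math`, and `struct` modules; the function names, the output filename, and the 440 Hz default are my own choices for illustration.

```python
import math
import struct
import wave


def sine_samples(freq_hz, seconds, rate=44100, amplitude=0.5):
    """Return a list of 16-bit signed samples for a sine tone."""
    peak = int(32767 * amplitude)
    count = int(rate * seconds)
    return [int(peak * math.sin(2 * math.pi * freq_hz * i / rate))
            for i in range(count)]


def write_tone(path, freq_hz, seconds, rate=44100):
    """Write a mono, 16-bit WAV file containing a sine tone."""
    samples = sine_samples(freq_hz, seconds, rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


if __name__ == "__main__":
    # Two seconds of concert A; the filename is arbitrary.
    write_tone("tone_440.wav", 440, 2.0)
```

It isn't real-time output, but for showing students what different frequencies sound like, generating a handful of files at different pitches gets most of the way there.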

If anyone's interested, the link for the program I found, called AudioTest, is located here. It looks to be pretty cool.

Tuesday, January 16, 2007

Setting up an Enterprise Home Network Part 1

I'm a System Administrator. A lot of times I wonder how it was exactly that I got to where I am. You can't really learn to do what I do from a book, or a series of books, or from a school. And yet, I have read tons of books and spent a major portion of my life in school. And all of that has helped tremendously, but it wouldn't have been enough.

When I got my first admin job, it was because I knew a thing or two about Linux. I had been running Linux since the early 90's, and had several machines set up at my house running various things. That's really what started it all. I was just playing. Years and years of playing and doing stupid things at my house that I had no reason to be doing, but I did anyhow. I made my hobby of computers relevant to the real world.

So, to help others who might be interested in doing the same, I'll post some advice I often give to people who are wondering what hardware to buy. This will be a multi-part series, with the first part dealing mainly with the Ethernet network infrastructure.

Wireless / Router

Everyone has wireless already. It's usually provided by a small black and purple box made by Linksys, which does triple-duty as an access point, a switch, and a NAT router. Real world relevance: 0.

If you want something a bit more serious, you could use what I've used for the past several years, a Soekris box running m0n0wall. It's a nice little system that has worked flawlessly, and can expose you to a variety of higher-level networking things that your Linksys just won't do. People do run these systems in production as well, and I can see why. But it's not what you're likely to encounter if you walk onto a job where they need a firewall worked on.

If you really want something to get you familiar, I'd recommend a Cisco PIX 501. They're relatively inexpensive, easy to find on eBay, and run the same OS that the larger Cisco firewalls do, like the ASA. Of course, neither the PIX nor the ASA has any sort of wireless support, so you might want to pick up a Cisco Aironet 1100 as well. The PIX will give you VPN access, IDS, URL filtering, and hardware failover. The Aironet device is generally found with an 802.11b card, but is upgradable to G and other standards.

Switches

When I needed a new switch for my office, I picked up a Cisco Catalyst 5000. It was about $100 on eBay, and had 2x 24-port 10/100 blades and a single 48-port 10bT blade. You might not need something this crazy, but it has its advantages. For one, you can configure EtherChannel (or 802.3ad aggregation) between the blades and increase a single host's bandwidth to the switch. Also, you can create 802.1q VLANs to make your large switch investment into the last switch you'll ever need to buy. If you want something smaller, pick up something from the Cisco 2900 series, like the 2924 or the 2950. The 2924 won't run the latest IOS and won't do SSH, but it's adequate and will do VLANs and aggregation for you. The 2950 is just a continuation of the 2924, and is a bit more recent.

With all of this equipment, you can segment your network in a much more palatable way than usual. You know how every time you have a friend over that wants to use your wireless network, you have to give them your key to access it? No need to do that anymore. You can create a separate VLAN for wireless guests, on a separate wireless network, and leave it wide open. Restrict it to web access only, and deny it access to any of your internal machines.

Perhaps you want to run some externally facing servers, but you're concerned that they might get hacked and access your sensitive internal machines. No problem, just create a DMZ VLAN, assign your servers into it, and set up the PIX to use a 1:1 NAT to those systems on a port-by-port basis.
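Before touching the switch, it helps to write the VLAN plan down. Here's a small sketch using Python's standard `ipaddress` module that gives each VLAN its own /24 and tells you which VLAN an address belongs to. The VLAN IDs, the names, and the 10.10.0.0/16 block are all made up for the example; use whatever fits your network.

```python
import ipaddress

# Hypothetical plan: the address block, VLAN IDs, and names are illustrative.
SITE = ipaddress.ip_network("10.10.0.0/16")
VLANS = {10: "internal", 20: "guest-wireless", 30: "dmz"}


def plan_subnets(site, vlans):
    """Give each VLAN the /24 whose third octet matches its VLAN ID."""
    subnets = list(site.subnets(new_prefix=24))
    return {vid: subnets[vid] for vid in sorted(vlans)}


def vlan_for(address, plan):
    """Return the VLAN ID whose subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for vid, net in plan.items():
        if ip in net:
            return vid
    return None


plan = plan_subnets(SITE, VLANS)
for vid, net in plan.items():
    print(vid, VLANS[vid], net)
print(vlan_for("10.10.20.57", plan))
```

Matching the third octet to the VLAN ID is just a convention that makes addresses easy to read at a glance; the firewall rules between the subnets still have to be written on the PIX.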

Hey, look at that, your entire network is Cisco now, more secure, and you're learning! Get yourself an IOS router of some description, and you can train at home for your CCNA. Stay tuned for the next part, where I will discuss Storage for your home network.

Making QEMU work on NetBSD/macppc

Because we can't be happy with running alternative OSs unless we can run other alternative OSs inside them!

I've uploaded my patch that I used to make QEMU build from pkgsrc on NetBSD/macppc. It's available in files, if someone has the same insane desire that I do.

Free Solaris 10 and Sun Studio 11 Media Kits

Sun is giving away free Solaris 10 and Sun Studio 11 media kits, this time for both x86 and SPARC, on DVD. Head on over and get yours ordered!

The link is here: Free Solaris 10 and Sun Studio Software Media Kit

Setting up an Enterprise Home Network Part 2

If you didn't realize by now, storage is all about the price you pay for what you get. Well, I suppose that's true for just about anything, but especially for storage. It sounded profound when I said it in my head earlier. Essentially, there are several things that I think about when looking at storage options in any case, and even when buying something for my home:

Capacity
How much usable storage do I need, and how much will a product give me?
Speed
This is less important for a home system, but still important. What are the drive interfaces? What is the host interface?
Redundancy
Yes, even with home systems, you need redundancy. It might not ruin my business if I have a multi-drive failure at home, but it sure will ruin my month.
Expandability
Think about your needs on down the road. Is what you buy going to last you?
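Capacity and redundancy pull against each other, so it's worth doing the arithmetic before you buy. Here's a quick sketch in Python that works out usable space for a few common RAID levels, assuming all drives are the same size. The function name and the 4x 80GB example are just for illustration, and it ignores filesystem overhead.

```python
def usable_capacity_gb(level, drives, size_gb):
    """Return usable capacity in GB for equal-size drives at a given RAID level."""
    minimum = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum:
        raise ValueError("unsupported RAID level: %r" % level)
    if drives < minimum[level]:
        raise ValueError("RAID %d needs at least %d drives" % (level, minimum[level]))
    if level == 0:
        return drives * size_gb        # striping only, no redundancy
    if level == 1:
        return size_gb                 # every drive holds a full copy
    if level == 5:
        return (drives - 1) * size_gb  # one drive's worth of parity
    return (drives - 2) * size_gb      # RAID 6: two drives' worth of parity


for level in (0, 1, 5, 6):
    print("RAID %d: %d GB usable" % (level, usable_capacity_gb(level, 4, 80)))
```

With four 80GB drives, RAID 5 leaves you 240GB usable and survives one drive failure, which is the trade-off I usually end up making at home.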

NAS

At the bottom end of home storage, there are devices like the LaCie Ethernet Disk and the Buffalo TeraStation. I generally don't care for any of these devices because they're usually very Windows-centric and run proprietary OSs. What happens if I want to connect my Ethernet Disk to my LDAP system, or use it in a different way than the manufacturer intended? Too bad. Plus, for any sort of RAID5-level redundancy, you're going to be paying a bit of money.

Let's instead look at the bottom end of enterprise storage, where we have a few NAS devices that work quite nicely. The two that come to mind off-hand are the Dell PowerVault 715N and the Iomega NAS p4XX. Both of these are pretty similar: 4x PATA drives in the front, 2 10/100 NICs, standard PC guts. Are the drives very big? No, but they're easily upgradable. Is the OS very good? No, but it's just a PC, so load whatever you want.

This brings us to another point: hardware RAID or software RAID? Years ago, hardware RAID had its place. Systems were too slow to keep up with all of the RAID calculations, so a dedicated hardware RAID controller took care of that by offloading the RAID calculations from your CPU. They also added large amounts of cache to speed up transactions. Now, for small setups like what we're talking about, hardware RAID would be a waste of time and money. If your controller fails in some way, how will you get your data back? (Hint: Unless you can find a compatible replacement controller, you won't. It's gone. Bye-Bye.) Software RAID transcends hardware, so just pull those drives and mount them in another system.
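If you've never looked at what those "RAID calculations" are, the core of RAID 5 is just XOR. Here's a toy sketch in Python, with three tiny made-up data blocks standing in for three drives: it computes a parity block, throws one data block away, and rebuilds it from the survivors. Real implementations rotate parity across drives and work on much larger stripes, so treat this as an illustration of the idea only.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three made-up data blocks, one per "drive".
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the second drive died: rebuild its block from the rest plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

A modern CPU chews through this kind of work without noticing, which is why a dedicated controller buys you so little on a four-drive box.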

But what OS? In a previous article, I discussed my decision making process for what to run on my Iomega unit. ZFS is certainly the best thing going, if you can get it to run. Barring that, FreeNAS or OpenFiler would make nice alternatives, or just using FreeBSD with vinum, which I did for some time as well. Linux and EVMS would also work nicely.

SAN

Oh, but you want more, huh? You say, "Zach, this just isn't enough, I crave real enterprise hardware!" Well, the best game in town is Fibre drives. Yes, that's right, Fibre. Think about it, who wants a Fibre drive? Who do you know that needs a Fibre drive? More importantly, who in their right mind would buy a Fibre drive and array off of eBay? We would, that's who.

It won't cost you much to get started with Fibre. First you'll need a cabinet. Being a Sun guy myself, I like the A5200 or the T3/T3+. They're available pretty cheaply on eBay; I've seen A5200s for around $50. You need an HBA, which is the part that connects your computer to your cabinet. A QLogic 2100F is about $10 on eBay. Now you need drives. A 73.4GB drive sells on eBay for about $50. Oh, and don't forget, you don't just have a big array now, you have a SAN. The A5200 will accept 2 HBAs, and you can plug them into a Fibre switch and connect even more systems. Need a little extra space on your server? No problem, carve out a LUN on the SAN and map it to the server in question. Want to learn about HA Clustering and Failover? Again, not a problem. Map a LUN to 2 hosts, install Sun Cluster, and go to town. Maybe you just want to run filesystem performance tests? You have a wad of drives at your disposal. And expandable? Hey, you've set everything up now, just buy another cabinet.

Sunday, January 14, 2007

Solaris on Iomega p405u

Some time ago, I was looking for a NAS device to increase the storage on my home network. The prosumer offerings at the time were a bit lacking, and the price was a bit too high for my tastes. Most of them used an embedded SBC running a proprietary OS of some sort, and were mainly catering to Windows/CIFS sharing. They also generally included only a single NIC and one or two drives. I wanted something with a minimum of RAID5, and preferably 4 ATA drives. I settled on the Iomega p405u.

The p405u is a standard PC for all intents and purposes. It contains an Asus TP20 motherboard with 2 Intel 10/100 NICs, 2 onboard IDE channels, and a HighPoint 370 RAID card. In this particular configuration, it's outfitted with half a gigabyte of RAM and a 1.0GHz PIII. There are 4 hot-plug ATA drives in caddies in the front; mine are currently 80GB. From the factory, it ran a stripped down version of FreeBSD with a Web UI tacked on. It works, and I used it like this for some time.

However, I decided it was time to replace the built-in OS with something a bit more functional. I wanted to upgrade the drives and was unsure how the p405u software would handle that, plus there are just better things out there. I couldn't even SSH into the appliance as it was, so that Iomega software had to go. You won't find a floppy or a CD-ROM on this sucker, or even a USB port, so I set up my PXE environment to boot a couple of alternative OS distributions catered to storage. First was FreeNAS, which is based on m0n0wall. I've run m0n0wall since the first release for the Soekris 4501, and I've been very pleased with it. After some reading, though, it seemed that FreeNAS was lagging a bit in development, and that it would require a lot of work just to get it PXE booted. Since I had no idea if the hardware would even work with their kernel, I decided to skip it and move on to OpenFiler. OpenFiler aspires to great things, but it can't escape the fact that it's running Linux and LVM underneath, and I found it a pain to set up and administer volumes and shares.

I set up Solaris 6/06 to PXE boot on the system and installed it. It was able to find all of the devices in the system, except for the 10/100/1000 NIC on the PCI riser card and the HighPoint 370 RAID controller. This presented a problem, since I couldn't see the other 2 drives in the NAS, which left it somewhat limited for storage. After some searching, I found that you can override Solaris' pci-ide driver and have it bind to a device it doesn't detect. To test, simply add the following to the GRUB kernel stanza:

-B pci-ide="pci1103,4"

The 1103,4 comes from a listing of `prtconf -pv`. The relevant sections here are the vendor-id (00001103) and the device-id (00000004). If you wish to make this change permanent, you can run:

# eeprom pci-ide="pci1103,4"
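If you need to do the same for a different controller, the name is just "pci" followed by the vendor-id and device-id in hex with the leading zeros dropped. A tiny Python sketch of that conversion, assuming the IDs are given as the zero-padded hex strings that `prtconf -pv` prints (the function name is my own):

```python
def pci_binding_name(vendor_id, device_id):
    """Build a 'pciVVVV,D' name from zero-padded hex ID strings."""
    vendor = int(vendor_id, 16)
    device = int(device_id, 16)
    # %x prints lowercase hex without leading zeros.
    return "pci%x,%x" % (vendor, device)


print(pci_binding_name("00001103", "00000004"))
```

For the HighPoint 370 values above, this produces the same pci1103,4 string used in the boot flag.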

Once booted with this flag, the system hung after detecting the first drive. After a post on the OpenSolaris forums, it was suggested to me that this could be the result of bugid 6414472, which is not currently fixed. I started to build a new kernel using the OpenSolaris sources, but the amazing OpenSolaris team had a better solution. Simply boot with '-kd' to drop into kmdb before booting the OS. Then, run the following:

::bp ata`ata_id_common
:c
::delete 1
ata_id_common+0x39?w a6a
:c

Yes, that's right, we've just changed the value in the debugger, and it worked beautifully. To make this change persistent across reboots:

# adb -w /platform/i86pc/kernel/drv/ata
ata_id_common+0x39?w a6a
$q

My hat goes off to the amazing OpenSolaris team. This is by far the most amazing fix I've ever done to a system. Bravo!

Welcome!

I've finally decided that I should update my site with a modern CMS and start actually posting things to it. I'm using Plone, which is based on Python. For those of you who know me, that should come as no surprise. Stay tuned for more updates, I'm working with Plone to add a Photo library and a few other goodies. I hope to update this regularly with decent stuff.