
s3420gplx saga - fakeraid for ESXi 4.1
Hi folks,
I am a newcomer to ESXi with no prior experience in this area.
I am currently looking to rebuild my home server with ESXi to allow more flexibility in supporting guest OS environments. Hardware spec as follows:
Intel Server Board S3420GPLX
Intel Xeon X3440 CPU
16GB RAM - KVR1066D3Q8R7SK2/16GI
Intended ESXi OS drives: 2 x Hitachi 500GB Travelstar 7K500 drives
HBA card 1: LSI SAS 9201-16i
HBA card 2: IBM ServeRAID M1015
Non-OS disks: 10 x Hitachi 2TB Deskstar 5K3000 drives
I have a friend with identical hardware, so the plan is to do this twice over.
A bit of research was done to try to ensure everything is on the HCL. My first experiment installing ESXi 4.1 revealed it's not that simple. Firstly, one of the onboard Intel NICs (the 82578DM) is not supported.

The second challenge was the Intel ESRT2 fakeraid - with more research I would have realized that ESXi does not support any "fakeraid" drivers.
For the first challenge, following guidance from the forum here, I found a modified e1000 driver which was suggested to support this chipset, identified the PCI IDs and built my pci.ids and simple.map files. Wanting to understand how this all works, I used ESXi to spin up an Ubuntu environment to build oem.tgz and deconstructed the ESXi 4.1 ISO. I don't plan to use a USB stick to run ESXi, and that seemed to be what most of the 'guides' were aimed at. To rebuild a new ISO with the NIC driver, I did the following:
mount the iso
edit isolinux.cfg at the root of the iso to include oem.tgz after the append vmkboot text
copy oem.tgz into the root of the iso
extract imagedd (using bunzip2)
mount imagedd iso
copy oem.tgz into the root of the imagedd iso
unmount imagedd iso
re-bzip imagedd
re-create the iso using mkisofs
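For anyone wanting to repeat the steps above, here is a rough shell sketch of them. It is not the exact set of commands from the post: the file names imagedd.bz2, isolinux.bin and boot.cat, the byte offset needed to loop-mount the image, and appending oem.tgz at the end of the "append vmkboot" line are all assumptions to check against your own ISO (fdisk -ul on the extracted imagedd should show where the partition starts).

```shell
#!/bin/bash
# Sketch of the ISO rebuild steps. Assumptions (not from the original post):
# file names imagedd.bz2 / isolinux.bin / boot.cat, the partition offset,
# and appending oem.tgz at the END of the "append vmkboot..." line.

# Add oem.tgz to the module list in isolinux.cfg (no-op if already there).
add_oem_to_cfg() {
    local cfg="$1"
    if grep -q 'oem\.tgz' "$cfg"; then
        return 0
    fi
    sed -i '/^[[:space:]]*append .*vmkboot/ s/[[:space:]]*$/ --- oem.tgz/' "$cfg"
}

# Usage: rebuild_iso <source.iso> <oem.tgz> <output.iso> <offset-in-bytes>
# Needs root for the loop mounts. Runs in a subshell so "set -e" stays local.
rebuild_iso() (
    set -e
    src_iso="$1"; oem="$2"; out_iso="$3"; offset="$4"
    work="$(mktemp -d)"
    mkdir "$work/iso" "$work/root" "$work/dd"

    # The mounted ISO is read-only, so work on a copy of its contents.
    mount -o loop,ro "$src_iso" "$work/iso"
    cp -a "$work/iso/." "$work/root/"
    umount "$work/iso"
    chmod -R u+w "$work/root"

    add_oem_to_cfg "$work/root/isolinux.cfg"
    cp "$oem" "$work/root/oem.tgz"

    # Unpack the install image, drop oem.tgz inside it, and repack.
    bunzip2 "$work/root/imagedd.bz2"
    mount -o loop,offset="$offset" "$work/root/imagedd" "$work/dd"
    cp "$oem" "$work/dd/oem.tgz"
    umount "$work/dd"
    bzip2 "$work/root/imagedd"

    # Recreate a bootable isolinux ISO from the modified tree.
    mkisofs -o "$out_iso" -R -J \
        -b isolinux.bin -c boot.cat \
        -no-emul-boot -boot-load-size 4 -boot-info-table \
        "$work/root"
)
```

Nothing runs until rebuild_iso is called, e.g. rebuild_iso esxi41.iso oem.tgz esxi41-oem.iso <offset>, where the ISO names here are only placeholders.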
I then burnt a new CD with this ISO and now have a re-usable installer with the missing NIC driver - and it works.
The next challenge: fakeraid. For my purposes, I am looking to run only RAID 1 (mirror) with no write-back cache in a home environment. I consider this to be safer than running on a single disk with no mirror. I don't have any PCI slots readily available to host any more cards, so a dedicated RAID solution is a challenge. I have looked at the 'cheat' 2 SATA in, 1 SATA out RAID solutions but don't see these as ideal either.
So.. part 2: drivers for "fakeraid". First, I looked to see if anyone else had success here - from what I can see, nobody has. I then looked at two options: 1) attempt to compile the generic Linux dmraid driver, OR 2) attempt to compile the LSI / Intel drivers.
A lot of mystery seems to surround compiling drivers for ESXi. Most guides are for prior versions of ESXi (a bit has changed) or assume more knowledge than I can claim to have. Credit where it is due though, the guide from kernelcrash was probably the most helpful in getting some ideas on where to start.
http://www.kernelcrash.com/blog/using-a-marvell-lan-card-with-esxi-4/2009/08/22/
The first step was to spin up a CentOS 5.5 environment in another VM. My limited understanding is that this is considered the "better" environment to attempt ESXi 4.1 driver compilation. The ISO download and install is a bit slow after Ubuntu, and I only went with 'standard' options...
Next we need the ESXi OSS - for 4.1 I found this hosted in 3 parts. These were downloaded, moved onto the CentOS environment and extracted. Inside these packages there is a folder vmkdrivers-4.1 with a file vmkdrivers-gpl.tgz - this is where all the 'stock' ESXi driver code resides. This too was then extracted into a new folder.
Inside this package there is a kernel source package we apparently need. I found it sitting at the same level as the drivers, named kernel-sourcecode-410.2.6.18-164.0.0.253625.x86_64.rpm
Not really knowing what else was required, and wishing there was a "yum install every possible thing I could need", I pretty much installed every package mentioned in every guide I could find, mostly as a result of repeated failures to compile the stock drivers (and the new drivers). Packages I installed include: qt-devel gtk2-devel gcc kernel-devel readline-devel ncurses-devel libevent-devel, plus the 'Development Tools' group.
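As a sketch, that package list boils down to two yum calls on the CentOS VM. The package names are the ones listed above; whether every one of them is really needed is unknown, and the function name is just for illustration.

```shell
#!/bin/bash
# Package list taken from the post; whether all of them are needed is unknown.
BUILD_PKGS="qt-devel gtk2-devel gcc kernel-devel readline-devel ncurses-devel libevent-devel"

# Run as root on the CentOS build VM.
install_build_deps() {
    # 'Development Tools' is a yum package group, so it needs groupinstall.
    yum -y groupinstall 'Development Tools' &&
    yum -y install $BUILD_PKGS
}
```

Calling install_build_deps once as root should pull everything in; the individual packages can be trimmed later once the build is known to work.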
The good news was that I could then compile the standard drivers without any errors. There is a build script in the drivers folder called build-vmkdrivers.sh which does this.
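A minimal sketch of the extract-and-build step, assuming the tarball name from above. The script's location inside the tarball is looked up with find rather than assumed, and the build.log name is my own choice.

```shell
#!/bin/bash
# Usage: build_stock_drivers <vmkdrivers-gpl.tgz> <destination-dir>
# Runs in a subshell so the "cd" and "set -e" do not leak into the caller.
build_stock_drivers() (
    set -e
    tgz="$1"; dest="$2"
    mkdir -p "$dest"
    tar -xzf "$tgz" -C "$dest"

    # Locate the build script instead of hard-coding its path.
    script="$(find "$dest" -name build-vmkdrivers.sh | head -n 1)"
    [ -n "$script" ] || { echo "build-vmkdrivers.sh not found" >&2; exit 1; }
    cd "$(dirname "$script")"

    # Keep a log so the first compile error is easy to find afterwards.
    bash ./build-vmkdrivers.sh > build.log 2>&1 || {
        tail -n 20 build.log >&2
        exit 1
    }
)
```

Getting a clean run of the stock script first gives a known-good baseline before any new driver code is added.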
I had a quick attempt at the dmraid drivers but found a few conflicts - the "old" dmraid drivers won't compile on the new 2.6 kernel (at least from what I can understand). I grabbed the latest dmraid driver and found that, while it was happy with the kernel, it requires a library that appears to be available only in a later revision of the LVM2 package than the one CentOS ships. I threw in the towel at this point on the dmraid driver.

Now the fun part - compiling "new" stuff. I then went on to the Intel drivers from their site (which are really LSI drivers). It's a smaller package than dmraid and I hoped it would be a bit easier. These were downloaded and extracted into the drivers/vmkdrivers/src_current/drivers/scsi/<your folder> location. I then made a copy of the build script, picked a driver I hoped would be close to mine - another LSI RAID driver - and cut out all the lines not relating to that driver. Once done, I cloned the per-.c-file records to match the number of .c files I had (3) and updated the driver number, name, paths and filenames. For the link command I also added in the precompiled archive file for Red Hat 5 64-bit (closest match I could guess) - unfortunately Intel and LSI don't provide "all" the source code.
Running the cut-down script, two things were apparent. First, there was a missing header, ioctl32.h, and second, one of the .c files had major issues compiling.
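To make the shape of that cut-down script concrete, here is a hypothetical sketch that prints one compile line per .c file followed by a single link line, instead of cloning the records by hand. CC_FLAGS is only a stand-in for the long flag list the stock build-vmkdrivers.sh passes to gcc, and a relocatable link (ld -r) is my assumption about the link step; copy the real flags and link command from the stock script.

```shell
#!/bin/bash
# Hypothetical sketch: CC_FLAGS stands in for the real gcc flags used by the
# stock build script. The default below is a placeholder, not the real list.
CC_FLAGS="${CC_FLAGS:--O2}"

# Usage: emit_build_cmds <module.o> <prebuilt.a> <file1.c> [file2.c ...]
# Prints one compile line per source file, then a single relocatable link.
emit_build_cmds() {
    local module="$1" archive="$2" objs="" src obj
    shift 2
    for src in "$@"; do
        obj="${src%.c}.o"
        echo "gcc $CC_FLAGS -c -o $obj $src"
        objs="$objs $obj"
    done
    # The archive goes last so it can satisfy symbols the objects reference.
    echo "ld -r -o $module$objs $archive"
}
```

The output can be reviewed first and then piped to sh once it looks right. The file names passed in are whatever your driver package contains.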
ioctl32.h is not included in ESXi - looking through the code, it actually referenced two different versions, a linux and an asm version. Based on the #if checks, the asm one was the one we wanted. It looked like this was an old reference, since ioctl32.h no longer exists in CentOS. I found that the asm version effectively pointed to a compatibility-support ioctl.h, so I put this into the include path for compilation. This resolved the references to this header with no further issues.
As for the major compile error, it looks like the driver did not anticipate an environment where we could have a Linux kernel > 2.6 and a VM Module define at the same time. The code seemed to suggest that the 2.6 path would give us what was needed, so I made a small edit to the build script to remove the define for the VM Module reference.
Running the build again, there is now only a single warning, and we get a 4.6MB megasr.o file - most of the size comes from the archive I added while experimenting.
To test this out, I followed advice on the forum here and simply dropped the .o file into /usr/lib/vmware/vmkmod/ and ran vmkload_mod megasr.o debug=10
Now the issue is that I am getting "Unresolved symbols". I have looked through the message log and every single one relates to a subset of the custom functions from the driver.
At this point Google is finally failing me, so I am hoping someone can offer some guidance on how to work through this. If nothing else it has been a great learning experience, and I am still hoping to find a solution.
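One way to narrow this down on the build VM, sketched below, is to compare what the module still needs with what the precompiled archive defines. This assumes the module came from a relocatable link (ld -r), which does not report undefined symbols, so they would only surface at load time. The function names are made up for illustration; nm, sort and comm are the standard tools.

```shell
#!/bin/bash
# Print names listed in file $1 but not in file $2 (one symbol per line).
missing_symbols() {
    local a b
    a="$(mktemp)"; b="$(mktemp)"
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage: check_module <module.o> <prebuilt.a>
# Lists symbols the module still needs that the archive does not define.
check_module() {
    local need have
    need="$(mktemp)"; have="$(mktemp)"
    # "nm -u" prints undefined symbols; the name is the last field.
    nm -u "$1" | awk '{print $NF}' > "$need"
    # Defined symbols print as "address type name"; skip member headers.
    nm --defined-only "$2" 2>/dev/null | awk 'NF >= 3 {print $3}' > "$have"
    missing_symbols "$need" "$have"
    rm -f "$need" "$have"
}
```

Names printed by check_module have to come from somewhere other than the archive (the vmkernel itself, or a source file or archive that was left out of the link). An unresolved name from the log that does not show up in this output is defined in the archive, which would point at link order or at a compile-time #if path instead.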
Thoughts / suggestions most welcome!
Cheers,
Gmk2