temp.vm-help.com
http://www.vm-help.com/forum/

ESXI 5.x Drivers Part 4: Finishing the compilation
http://www.vm-help.com/forum/viewtopic.php?f=34&t=4365

Author:  trickstarter [ Sun Apr 14, 2013 12:49 pm ]
Post subject:  ESXI 5.x Drivers Part 4: Finishing the compilation

This post walks through the later stages of getting Linux-based drivers to compile, load, and function as part of an ESXi 5.x kernel. In part 1 of the guide we created the initial build environment. In part 2 we prepared a build script to compile our code and produce a suitable kernel module. In part 3 we ran the build script against our Linux code, and saw at the end that getting through a compilation without any error messages does not necessarily leave you with a working module. Here we are going to address the remaining issues, which should hopefully produce a functioning driver.

Log in to your build host as the build user and move to the top-level directory where most of the work will occur.

Code:
build@esx-build:~$ cd ~/vsphere/vmkdrivers-gpl/
build@esx-build:~/vsphere/vmkdrivers-gpl$


On your target ESXi host, here's how we left it last time:

Code:
~ # vmkload_mod /tmp/atl1c
vmkload_mod: Can not load module /tmp/atl1c: Unresolved symbol
~ #


What we've done here is call the VMware equivalent of the Linux command insmod. As the kernel opens and loads your module, it parses the binary file for the list of "symbols" referenced within. In the context we are working in right now, symbols can best be thought of as the names of program functions, where each function is a discrete block of (most probably) C code. When we see a message that says "Unresolved symbol", we're being told that the module calls a function by a certain name but the kernel has no function with that name, so the module can't be used.

Linux provides lots of tools for poking around in binary files. Let's take a look at what our build environment makes of the module we just compiled, as far as any symbols that might be "unresolved":

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl$ nm -u BLD/build/vmkdriver-atl1c-CUR/release/vmkernel64/atl1c
                 U alloc_etherdev
                 U atl1c_suspend
                 U copy_from_user
                 U copy_to_user
                 U crc32_le
                 U csum_ipv6_magic
                 U del_timer_sync
                 U dev_driver_string
                 U __dev_get_by_name
                 U disable_irq
                 U dma_alloc_coherent
                 U dma_free_coherent
                 U dma_map_page
                 U dma_map_single
                 U dma_unmap_single
                 U enable_irq
                 U ethtool_op_get_link
                 U ethtool_op_get_perm_addr
                 U ethtool_op_get_sg
                 U ethtool_op_get_tso
                 U ethtool_op_set_sg
                 U flush_scheduled_work
                 U free_irq
                 U in_interrupt
                 U in_irq
                 U init_timer
                 U init_waitqueue_head
                 U __ioremap
                 U iounmap
                 U __kfree_skb
                 U __mod_timer
                 U msleep
                 U __netdev_alloc_skb
                 U netif_device_detach
                 U netif_rx
                 U __netif_schedule
                 U pci_choose_state
                 U pci_disable_device
                 U pci_disable_msi
                 U pci_enable_device
                 U pci_enable_msi
                 U pci_enable_wake
                 U pci_iomap
                 U pci_read_config_byte
                 U pci_read_config_word
                 U __pci_register_driver
                 U pci_release_regions
                 U pci_request_regions
                 U pci_set_master
                 U pci_set_power_state
                 U pci_unregister_driver
                 U pci_write_config_dword
                 U pci_write_config_word
                 U printk
                 U ___pskb_trim
                 U raw_smp_processor_id
                 U __raw_spin_failed
                 U register_netdev
                 U request_irq
                 U schedule_work
                 U skb_over_panic
                 U synchronize_irq
                 U unregister_netdev
                 U vmk_AtomicUseFence
                 U vmk_DelayUsecs
                 U vmk_HeapCreate
                 U vmk_HeapDestroy
                 U vmk_jiffies
                 U vmklnx_free_netdev
                 U vmklnx_kfree
                 U vmklnx_kmalloc
                 U vmklnx_kmem_cache_create
                 U vmklnx_kmem_cache_destroy
                 U vmklnx_mem_pool_get_parent
                 U vmklnx_net_alloc_skb
                 U vmklnx_netif_start_tx_queue
                 U vmklnx_netif_stop_tx_queue
                 U vmklnx_pci_set_consistent_dma_mask
                 U vmklnx_pci_set_dma_mask
                 U vmklnx_rcu_cleanup
                 U vmklnx_rcu_init
                 U vmklnx_register_module
                 U vmklnx_skb_real_size
                 U vmklnx_unregister_module
                 U vmk_LogBacktraceMessage
                 U vmk_LogNoLevel
                 U vmk_MachMemMaxAddr
                 U vmk_Memcpy
                 U vmk_MemPoolCreate
                 U vmk_MemPoolDestroy
                 U vmk_Memset
                 U vmk_ModuleCurrentID
                 U vmk_ModuleGetName
                 U vmk_ModuleRegister
                 U vmk_ModuleSetHeapID
                 U vmk_NameFormat
                 U vmk_NameInitialize
                 U vmk_PanicWithModuleID
                 U vmk_ScsiGetSystemLimits
                 U vmk_Snprintf
                 U vmk_StatusToString
                 U vmk_Strcpy
                 U vmk_Strncpy
                 U vmk_VA2MA




That is a lot of unresolved symbols, and the missing symbols which are stopping our module from working are certainly in this list. Interestingly, even though nm is reporting all these symbols as unresolved, when the module is loaded on our target the majority of them will in fact be resolved. Why? nm only inspects the file in front of it and has no idea what our target system looks like. It marks as undefined (U) every symbol the module references but does not define itself. Whether the target kernel can supply each one is only decided at load time.

Since we're logged in to the target, we can just look and see what exactly it didn't like:

Code:
~ # egrep "atl1c|Unresolved" /scratch/log/vmkernel.log
2013-04-01T23:42:29.606Z cpu3:3794)Loading module atl1c ...
2013-04-01T23:42:29.607Z cpu3:3794)Elf: 1852: module atl1c has license GPL
2013-04-01T23:42:29.608Z cpu3:3794)WARNING: Elf: 1508: Relocation of symbol <atl1c_suspend> failed: Unresolved symbol
2013-04-01T23:42:29.608Z cpu3:3794)WARNING: Elf: 1508: Relocation of symbol <alloc_etherdev> failed: Unresolved symbol
2013-04-01T23:42:29.608Z cpu3:3794)WARNING: Elf: 1508: Relocation of symbol <alloc_etherdev> failed: Unresolved symbol
2013-04-01T23:42:29.608Z cpu3:3794)WARNING: Elf: 2767: Kernel based module load of atl1c failed: Unresolved symbol <ElfRelocateFile failed>



So we have 2 symbols that it is complaining about: atl1c_suspend and alloc_etherdev. The first thing to do is to take a look at where those symbols are referenced in our source. We'll start with atl1c_suspend:

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl$ grep -r atl1c_suspend vmkdrivers/src_9/drivers/net/atl1c
vmkdrivers/src_9/drivers/net/atl1c/at_common_main.c:      return atl1c_suspend(pdev, state);
vmkdrivers/src_9/drivers/net/atl1c/at_common.h:extern int atl1c_suspend(struct pci_dev *pdev, pm_message_t state);
vmkdrivers/src_9/drivers/net/atl1c/atl1c_main.c:int atl1c_suspend(struct pci_dev *pdev, pm_message_t state)
vmkdrivers/src_9/drivers/net/atl1c/atl1c_main.c:   atl1c_suspend(pdev, PMSG_SUSPEND);



The first match is in vmkdrivers/src_9/drivers/net/atl1c/at_common_main.c. The reference to atl1c_suspend is at line 40 and it's part of a generic suspend function which seeks to call out the correct actual suspend code depending on the exact card you have.

The second match for atl1c_suspend is in vmkdrivers/src_9/drivers/net/atl1c/at_common.h at line 96. This line is the function prototype for the atl1c_suspend function, so it's starting to look like we might actually have this function even though it's coming up as unresolved.

The third and fourth matches are both in vmkdrivers/src_9/drivers/net/atl1c/atl1c_main.c. atl1c_suspend first appears in this file at line 2608. And, strangely, this is in fact the first line of the function (symbol) that our ESXi target reports as being unresolved. What is going on? If you look around the function the problem becomes apparent. We saw in part 3 how one compile error was caused by an undefined variable, because its declaration sat inside a preprocessor conditional and was only compiled if CONFIG_PM was defined. It's the same problem here. In fact, it's exactly the same directive and macro. CONFIG_PM is checked at line 2576 and the entire block, including atl1c_suspend, is skipped if it isn't defined. As in part 3, we can infer that CONFIG_PM isn't defined, since our function isn't being compiled into the module. The call in at_common_main.c is compiled regardless, which is why the module still references a symbol that nothing provides.

Since CONFIG_PM came up as a problem once already, why don't we define it somewhere and see what happens? It could solve all our problems. Open at_common.h and add it to the start of the file. Here are the first few lines of what the file should look like:

Code:
#define CONFIG_PM

#ifndef _AT_COMMON_H_
#define _AT_COMMON_H_

#include "kcompat.h"
#include "at_osdep.h"

#define ATHEROS_ETHERNET_DEVICE(device_id) {\
        PCI_DEVICE(0x1969, device_id)}

#define DEV_ID_ATL1E            0x1026
#define DEV_ID_ATL1C            0x1067   /* TODO change */
#define DEV_ID_ATL2C            0x1066
#define DEV_ID_ATL1C_2_0        0x1063
#define DEV_ID_ATL2C_2_0        0x1062
#define DEV_ID_ATL2C_B          0x2060
#define DEV_ID_ATL2C_B_2        0x2062
#define DEV_ID_ATL1D            0x1073
#define DEV_ID_ATL1D_2_0        0x1083


Now that we've defined CONFIG_PM, a bunch of new code will be compiled in. Let's give the build script another try:

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl$ ./build-atl1c.sh
vmkdrivers/src_9/drivers/net/atl1c/atl1c_main.c: In function 'atl1c_probe':
vmkdrivers/src_9/drivers/net/atl1c/atl1c_main.c:2871: warning: implicit declaration of function 'alloc_etherdev'
vmkdrivers/src_9/drivers/net/atl1c/atl1c_main.c:2871: warning: assignment makes pointer from integer without a cast
vmkdrivers/src_9/drivers/net/atl1c/at_common_main.c:95:14: error: #if with no expression
vmkdrivers/src_9/drivers/net/atl1c/at_common_main.c:258: error: 'at_common_resume' undeclared here (not in a function)
vmkdrivers/src_9/drivers/net/atl1c/atl1e_main.c: In function 'atl1e_probe':
vmkdrivers/src_9/drivers/net/atl1c/atl1e_main.c:269: warning: implicit declaration of function 'alloc_etherdev'
vmkdrivers/src_9/drivers/net/atl1c/atl1e_main.c:269: warning: assignment makes pointer from integer without a cast
GNU ld (Linux/GNU Binutils) 2.17.50.0.15.20070418
/home/build/toolchain/lin32/binutils-2.17.50.0.15-modcall/bin/x86_64-linux-ld: BLD/build/vmkdriver-atl1c-CUR/release/vmkernel64/SUBDIRS/vmkdrivers/src_9/drivers/net/atl1c/at_common_main.o: No such file: No such file or directory
build@esx-build:~/vsphere/vmkdrivers-gpl$


Looks like we've started going backwards. We're now failing to get through the compilation again. Let's filter the output down to just the errors so we can focus on the real problems:

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl$ ./build-atl1c.sh 2>&1 | grep error
vmkdrivers/src_9/drivers/net/atl1c/at_common_main.c:95:14: error: #if with no expression
vmkdrivers/src_9/drivers/net/atl1c/at_common_main.c:258: error: 'at_common_resume' undeclared here (not in a function)
build@esx-build:~/vsphere/vmkdrivers-gpl$



Okay - the first error is apparently at line 95, and here it is in context:

Code:
#if CONFIG_PM
static int at_common_resume(struct pci_dev *pdev)
{
        switch (pdev->device) {
        case DEV_ID_ATL1E:
        case DEV_ID_ATL1C:
        case DEV_ID_ATL2C:
                return atl1e_resume(pdev);
                break;
        case DEV_ID_ATL1C_2_0:
        case DEV_ID_ATL2C_2_0:
        case DEV_ID_ATL2C_B:
        case DEV_ID_ATL1D:
        case DEV_ID_ATL1D_2_0:
        case DEV_ID_ATL2C_B_2:
                return atl1c_resume(pdev);
                break;
        default:
                return -1;
                 ;
        }
        return 0;
}
#endif



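The error itself comes from our bare #define: CONFIG_PM now exists but has no value, so #if CONFIG_PM expands to an #if with nothing after it. I think the directive is a mistake on the part of the original developer. Based on the liberal use of #ifdef CONFIG_PM elsewhere in the source, I'm willing to bet this one should also have been an #ifdef. Edit the line so it looks like this: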

Code:
#ifdef CONFIG_PM
static int at_common_resume(struct pci_dev *pdev)
{
        switch (pdev->device) {
        case DEV_ID_ATL1E:
        case DEV_ID_ATL1C:
        case DEV_ID_ATL2C:


Now the second error we saw was at_common_resume undeclared, at line 258. So line 258 is referring to function at_common_resume and it is saying there is no declaration. Did you notice anything familiar with that function name? It's the same function that's within the fixed #ifdef we saw in the first error message. This makes sense. The #if statement was incorrect and caused the function declaration not to be included in the compilation. By fixing that #if and converting it to #ifdef that's almost certainly going to clean up both problems. Let's see:

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl$ ./build-atl1c.sh 2>&1 | grep error
build@esx-build:~/vsphere/vmkdrivers-gpl$


No errors.

Going back to the original reason we got on to this path, recall we were dealing with two unresolved symbols: atl1c_suspend and alloc_etherdev. We've figured out why atl1c_suspend probably wasn't working and we've made some modifications. At this point it's tempting to jump straight into researching what the issue may be with alloc_etherdev, but I've found the best approach to working through this kind of issue is to make incremental changes and to test them before proceeding. Each change you make can introduce subtle new problems, and changing multiple things at once can make it harder to pin down which pieces need attention.

With that in mind, it's time to test our freshly compiled module to see if the error we think we've dealt with is gone. First I'll copy the module to the target, and then we'll log in and try to insert it again.

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl$ scp BLD/build/vmkdriver-atl1c-CUR/release/vmkernel64/atl1c root@YOURHOST:/tmp
Password:
atl1c                                                                                                                100% 7857KB   7.7MB/s   00:00   
build@esx-build:~/vsphere/vmkdrivers-gpl$



And on the host:

Code:
~ # vmkload_mod /tmp/atl1c
vmkload_mod: Can not load module /tmp/atl1c: Unresolved symbol
~ # egrep "atl1c|Unresolved" /scratch/log/vmkernel.log
2013-04-02T23:51:14.502Z cpu1:64992)Loading module atl1c ...
2013-04-02T23:51:14.503Z cpu1:64992)Elf: 1852: module atl1c has license GPL
2013-04-02T23:51:14.503Z cpu1:64992)WARNING: Elf: 1508: Relocation of symbol <alloc_etherdev> failed: Unresolved symbol
2013-04-02T23:51:14.503Z cpu1:64992)WARNING: Elf: 1508: Relocation of symbol <alloc_etherdev> failed: Unresolved symbol
2013-04-02T23:51:14.504Z cpu1:64992)WARNING: Elf: 2767: Kernel based module load of atl1c failed: Unresolved symbol <ElfRelocateFile failed>


Looking good. There is no mention of atl1c_suspend being unresolved. On to dealing with alloc_etherdev.

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl$ grep -r "\salloc_etherdev" vmkdrivers/src_9/drivers/net/atl1c/
vmkdrivers/src_9/drivers/net/atl1c/atl1e_main.c:    netdev = alloc_etherdev(sizeof(struct atl1e_adapter));
vmkdrivers/src_9/drivers/net/atl1c/atl1c_main.c:   netdev = alloc_etherdev(sizeof(struct atl1c_adapter));
vmkdrivers/src_9/drivers/net/atl1c/kcompat.h:#ifndef alloc_etherdev
vmkdrivers/src_9/drivers/net/atl1c/kcompat.h:#define alloc_etherdev _kc_alloc_etherdev
vmkdrivers/src_9/drivers/net/atl1c/kcompat.h:#ifndef alloc_etherdev_mq
vmkdrivers/src_9/drivers/net/atl1c/kcompat.h:#define alloc_etherdev_mq(_a, _b) alloc_etherdev(_a)



This is interesting. At first glance the search results show a call to alloc_etherdev in the atl1e_main.c source file, and one in atl1c_main.c. Right after those 2 matches, though, is a section which looks like it ought to be defining the function for us. Let's open kcompat.h and take a look. The first match on alloc_etherdev is at line 688 in the file. It's the start of a block which looks like this:

Code:
#ifndef alloc_etherdev
#define alloc_etherdev _kc_alloc_etherdev
extern struct net_device * _kc_alloc_etherdev(int sizeof_priv);
#endif



It looks like this code is mapping any calls to alloc_etherdev to a function by the name of _kc_alloc_etherdev. If we take a look at vmkdrivers/src_9/drivers/net/atl1c/kcompat.c we can see that _kc_alloc_etherdev is implemented at line 129. So what is going on; why isn't this working? The answer lies in the context the #define alloc_etherdev statements are in, back in the kcompat.h file. Once again there is a preprocessor directive, this time at line 665, stating:

Code:
 #if ( LINUX_VERSION_CODE < KERNEL_VERSION(2,4,3) )


Our declaration isn't going to be compiled in. At this point it's tempting to tweak that #if directive to say 2.6.19, which would mean we'd get the provided code included. That may work (I didn't try it) but it's not the right way to solve this. alloc_etherdev has to be a function that all the network drivers call, so it simply must be provided somewhere. This brings us to a key technique for getting these Linux drivers to work in ESXi. Basically, the question is: what did the other drivers do?

I previously used the sky2 driver as an example and I'll use that again here. Change in to the directory where the sky2 files are located and do a quick search for alloc_etherdev:

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl$ cd vmkdrivers/src_9/drivers/net/sky2/
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/drivers/net/sky2$ ls -l
total 208
-r--r--r-- 1 build build   2482 Nov 23 13:46 bitrev.c
-r--r--r-- 1 build build 122046 Nov 23 13:46 sky2.c
-r--r--r-- 1 build build  83285 Nov 23 13:46 sky2.h
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/drivers/net/sky2$ grep alloc_etherdev *
sky2.c:   struct net_device *dev = alloc_etherdev(sizeof(*sky2));
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/drivers/net/sky2$



We can see that sky2 uses the exact same function. How does it manage to work, then? We know that the sky2 module loads and works, therefore the alloc_etherdev code must be valid and implemented somewhere. Let's try to find it. Move over to where most of the relevant headers are, in our ESXi source tree, and search for alloc_etherdev:

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/drivers/net/sky2$ cd ~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/include/
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/include$ grep -r alloc_etherdev *
linux/etherdevice.h:extern struct net_device *alloc_etherdev(int sizeof_priv);
linux/etherdevice.h:static inline struct net_device *alloc_etherdev_mq(int sizeof_priv,
linux/etherdevice.h: *  alloc_etherdev - Allocates and sets up an ethernet device
linux/etherdevice.h: *  #define alloc_etherdev(sizeof_priv)
linux/etherdevice.h:/* _VMKLNX_CODECHECK_: alloc_etherdev */
linux/etherdevice.h:#define alloc_etherdev(sizeof_priv) alloc_etherdev_mq(sizeof_priv, 1)



So the header file we need is linux/etherdevice.h. You may see unresolved symbols due to a missing header file at compile time, so a worthy avenue of investigation is to check we're including that header.

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/include/linux$ grep etherdevice ../../drivers/net/atl1c/*
../../drivers/net/atl1c/kcompat.h:#include <linux/etherdevice.h>


We are. kcompat.h is included by atl1c.h, which is in turn included by atl1c_main.c. How about the sky2 module?

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/include/linux$ grep etherdevice ../../drivers/net/sky2/*
../../drivers/net/sky2/sky2.c:#include <linux/etherdevice.h>


Yep, that has it too. Now we're going to focus entirely on the sky2 driver's use of alloc_etherdev and build up a detailed understanding of how it works. This will hopefully shed light on why it's not working for us. Since the sky2 driver makes the alloc_etherdev call and it loads without an "unresolved symbol" error, we'd expect to see alloc_etherdev as a symbol listed in the binary. We can check this by listing all symbols in the compiled sky2 module, which should be left over from the build-vmkdrivers.sh you ran in Part 1.

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/include/linux$ nm ~/vsphere/vmkdrivers-gpl/BLD/build/vmkdriver-sky2-CUR/release/vmkernel64/sky2  | grep alloc
                 U alloc_pages
                 U dma_alloc_coherent
                 U __netdev_alloc_skb
0000000000001d5c t sky2_rx_alloc
                 U vmklnx_alloc_netdev_mq
                 U vmklnx_kmalloc
                 U vmklnx_kzmalloc




Wait, we call the alloc_etherdev function, it's not in the symbols list, yet the module loads without an unresolved symbol error? Huh? Time to look more closely at the function in the etherdevice.h file. Here's the relevant code:

Code:
#if !defined(__VMKLNX__)
extern struct net_device *alloc_etherdev(int sizeof_priv);
#else
static inline struct net_device *alloc_etherdev_mq(int sizeof_priv,
                                                   unsigned int queue_count)
{
        return vmklnx_alloc_netdev_mq(THIS_MODULE,
                                      sizeof_priv,
                                      "vmnic%d",
                                      ether_setup,
                                      queue_count);
}
/* snipped some comments for brevity */
#define alloc_etherdev(sizeof_priv) alloc_etherdev_mq(sizeof_priv, 1)
#endif



Ah, now things are starting to make sense. If __VMKLNX__ is defined, the #define at the bottom tells the preprocessor to replace each reference to alloc_etherdev(whatever) with alloc_etherdev_mq(whatever, 1), so we were never going to see alloc_etherdev in the symbols list. More than that, alloc_etherdev_mq is a static inline function, so its body will be injected wherever alloc_etherdev_mq appears. This means we also won't see alloc_etherdev_mq in the symbols list, but what we should see is vmklnx_alloc_netdev_mq and sure enough we did just see that:

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/include/linux$ nm ~/vsphere/vmkdrivers-gpl/BLD/build/vmkdriver-sky2-CUR/release/vmkernel64/sky2  | grep alloc
                 U alloc_pages
                 U dma_alloc_coherent
                 U __netdev_alloc_skb
0000000000001d5c t sky2_rx_alloc
                 U vmklnx_alloc_netdev_mq
                 U vmklnx_kmalloc
                 U vmklnx_kzmalloc



Moving our attention back to atl1c, we know that we are including the right header file, so this ought to be working. There must be something unique about our code that is somehow getting in the way. We'll look again for references to alloc_etherdev in our source:

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/include/linux$ cd ../../drivers/net/atl1c/
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/drivers/net/atl1c$ grep alloc_etherdev *

atl1c_main.c:   netdev = alloc_etherdev(sizeof(struct atl1c_adapter));
atl1c_main.c:      goto err_alloc_etherdev;
atl1c_main.c:err_alloc_etherdev:

atl1e_main.c:    netdev = alloc_etherdev(sizeof(struct atl1e_adapter));
atl1e_main.c:        goto err_alloc_etherdev;
atl1e_main.c:err_alloc_etherdev:

kcompat.c:_kc_alloc_etherdev(int sizeof_priv)
kcompat.h:#ifndef alloc_etherdev
kcompat.h:#define alloc_etherdev _kc_alloc_etherdev
kcompat.h:extern struct net_device * _kc_alloc_etherdev(int sizeof_priv);

kcompat.h:#ifndef alloc_etherdev_mq
kcompat.h:#define alloc_etherdev_mq(_a, _b) alloc_etherdev(_a)


The answer is actually in this output, which I've grouped to make it easier to follow. The first group, in atl1c_main.c, all pertains to the call or use of the alloc_etherdev function. The same goes for the matches in the atl1e_main.c file. The third group is what we looked at earlier: the definition and implementation of alloc_etherdev for kernels < KERNEL_VERSION(2,4,3). The fourth group is the important bit. This is the code:

Code:
#ifndef alloc_etherdev_mq
#define alloc_etherdev_mq(_a, _b) alloc_etherdev(_a)
#endif



It's within an #if section for kernel versions below 2.6.22, which would apply to us. The name alloc_etherdev_mq should look familiar. We saw this earlier in etherdevice.h:

Code:
#define alloc_etherdev(sizeof_priv) alloc_etherdev_mq(sizeof_priv, 1)


So alloc_etherdev_mq aliases alloc_etherdev and alloc_etherdev aliases alloc_etherdev_mq. With this type of circular reference it's sort of surprising something doesn't catch fire. Nothing does, because the C preprocessor never re-expands a macro inside its own expansion: alloc_etherdev(x) becomes alloc_etherdev_mq(x, 1), the kcompat.h macro turns that back into alloc_etherdev(x), and expansion stops there. The compiler is left with a call to a plain function named alloc_etherdev that nothing declares or defines, which matches both the "implicit declaration of function 'alloc_etherdev'" warnings we saw at build time and the unresolved symbol at load time. After all this work, the answer to this problem is pretty simple: just comment out those 3 lines in kcompat.h:

Code:
/*
#ifndef alloc_etherdev_mq
#define alloc_etherdev_mq(_a, _b) alloc_etherdev(_a)
#endif
*/

Time to try another compilation:

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl/vmkdrivers/src_9/drivers/net/atl1c$ cd ~/vsphere/vmkdrivers-gpl/
build@esx-build:~/vsphere/vmkdrivers-gpl$ ./build-atl1c.sh
GNU ld (Linux/GNU Binutils) 2.17.50.0.15.20070418


That's it. No errors and interestingly, no warnings either. We've got a perfectly clean compilation. We'll copy it to our target one more time and try to load it again:

Code:
build@esx-build:~/vsphere/vmkdrivers-gpl$ scp BLD/build/vmkdriver-atl1c-CUR/release/vmkernel64/atl1c root@YOURHOST:/tmp/atl1c
Password:
atl1c                                                                                                                                                          100% 7857KB   7.7MB/s   00:00   
build@esx-build:~/vsphere/vmkdrivers-gpl$


And on the host:

Code:
/tmp # vmkload_mod /tmp/atl1c
Module /tmp/atl1c loaded successfully


And taking a closer look at what happened:

Code:
/tmp # cat /scratch//log/vmkernel.log | tail -20

2013-04-14T20:09:54.516Z cpu1:785358)Loading module atl1c ...
2013-04-14T20:09:54.517Z cpu1:785358)Elf: 1852: module atl1c has license GPL
2013-04-14T20:09:54.519Z cpu1:785358)module heap: Initial heap size: 102400, max heap size: 5562368
2013-04-14T20:09:54.519Z cpu1:785358)vmklnx_module_mempool_init: Mempool max 5562368 being used for module: 75

2013-04-14T20:09:54.519Z cpu1:785358)vmk_MemPoolCreate passed for 25 pages

2013-04-14T20:09:54.519Z cpu1:785358)skb_mem_info mempool for module atl1c created - max size 23068672
2013-04-14T20:09:54.519Z cpu1:785358)module heap: using memType 0
2013-04-14T20:09:54.519Z cpu1:785358)module heap vmklnx_atl1c: creation succeeded. id = 0x41001ad80000
2013-04-14T20:09:54.519Z cpu1:785358)<6>Atheros(R) AR8121/AR8113/AR8114/AR8131/AR8132/AR8152 PCI-E Ethernet Network Driver - version 1.0.1.14
2013-04-14T20:09:54.519Z cpu1:785358)<6>Copyright (c) 2007 - 2009 Atheros Corporation.
2013-04-14T20:09:54.519Z cpu1:785358)PCI: driver atheros_eth is looking for devices
2013-04-14T20:09:54.519Z cpu1:785358)PCI: driver atheros_eth claimed 0 device
2013-04-14T20:09:54.519Z cpu1:785358)Mod: 4485: Initialization of atl1c succeeded with module ID 75.
2013-04-14T20:09:54.519Z cpu1:785358)atl1c loaded successfully.




Success! We've compiled a module that's loading and executing, seemingly, properly. I've been writing this guide and testing against a host that doesn't actually have the hardware we're targeting, which is why you can see it say "atheros_eth claimed 0 device". The output on a host with the right hardware will be slightly different.

The next step, assuming you have the hardware, is to start testing. Just because you managed to get the module compiled and loading without errors does not necessarily mean it'll work properly, as we'll see in the next guide.

Author:  sahilvarma987 [ Sun Aug 11, 2013 12:09 pm ]
Post subject:  Re: ESXI 5.x Drivers Part 4: Finishing the compilation

Hi trickstarter,

Great series of posts.... U must be a esxi wizard.... I tried ur steps and was successfully able to create a module and insert in esxi... ;)

Now, i have this problem, in my real module, i deal with pages and the flags inside that.
like lock_page, unlock_page etc.
How can i port that to esxi.....
Because, from the headers i can see that esxi has defined page as
struct page {}

There are no member variables inside it......

Can u help me out with ur knowledge as to how i can get access to the page flags .....

All times are UTC - 8 hours