In the previous section, we saw what a typical drm_driver structure might look like. One of the more important fields in the structure is the hook for the load function.
      static struct drm_driver driver = {
      	...
      	.load = i915_driver_load,
        ...
      };
    The load function has many responsibilities: allocating a driver private structure, specifying supported performance counters, configuring the device (e.g. mapping registers & command buffers), initializing the memory manager, and setting up the initial output configuration.
If compatibility is a concern (e.g. with drivers converted over to the new interfaces from the old ones), care must be taken to prevent device initialization and control that is incompatible with currently active userspace drivers. For instance, if user level mode setting drivers are in use, it would be problematic to perform output discovery & configuration at load time. Likewise, if user-level drivers unaware of memory management are in use, memory management and command buffer setup may need to be omitted. These requirements are driver-specific, and care needs to be taken to keep both old and new applications and libraries working. The i915 driver supports the "modeset" module parameter to control whether advanced features are enabled at load time or in legacy fashion.
The driver private hangs off the main drm_device structure and can be used for tracking various device-specific bits of information, like register offsets, command buffer status, register state for suspend/resume, etc. At load time, a driver may simply allocate one and set drm_device.dev_priv appropriately; it should be freed and drm_device.dev_priv set to NULL when the driver is unloaded.
The DRM supports several counters which may be used for rough performance characterization. Note that the DRM stat counter system is not often used by applications, and supporting additional counters is completely optional.
These interfaces are deprecated and should not be used. If performance monitoring is desired, the developer should investigate and potentially enhance the kernel perf and tracing infrastructure to export GPU related performance information for consumption by performance monitoring tools and applications.
Obviously, device configuration is device-specific. However, there are several common operations: finding a device's PCI resources, mapping them, and potentially setting up an IRQ handler.
Finding & mapping resources is fairly straightforward. The DRM wrapper functions, drm_get_resource_start() and drm_get_resource_len(), may be used to find BARs on the given drm_device struct. Once those values have been retrieved, the driver load function can call drm_addmap() to create a new mapping for the BAR in question. Note that you probably want a drm_local_map_t in your driver private structure to track any mappings you create.
if compatibility with other operating systems isn't a concern (DRM drivers can run under various BSD variants and OpenSolaris), native Linux calls may be used for the above, e.g. pci_resource_* and iomap*/iounmap. See the Linux device driver book for more info.
Once you have a register map, you may use the DRM_READn() and DRM_WRITEn() macros to access the registers on your device, or use driver-specific versions to offset into your MMIO space relative to a driver-specific base pointer (see I915_READ for an example).
If your device supports interrupt generation, you may want to set up an interrupt handler when the driver is loaded. This is done using the drm_irq_install() function. If your device supports vertical blank interrupts, it should call drm_vblank_init() to initialize the core vblank handling code before enabling interrupts on your device. This ensures the vblank related structures are allocated and allows the core to handle vblank events.
Once your interrupt handler is registered (it uses your drm_driver.irq_handler as the actual interrupt handling function), you can safely enable interrupts on your device, assuming any other state your interrupt handler uses is also initialized.
Another task that may be necessary during configuration is mapping the video BIOS. On many devices, the VBIOS describes device configuration, LCD panel timings (if any), and contains flags indicating device state. Mapping the BIOS can be done using the pci_map_rom() call, a convenience function that takes care of mapping the actual ROM, whether it has been shadowed into memory (typically at address 0xc0000) or exists on the PCI device in the ROM BAR. Note that after the ROM has been mapped and any necessary information has been extracted, it should be unmapped; on many devices, the ROM address decoder is shared with other BARs, so leaving it mapped could cause undesired behavior like hangs or memory corruption.
In order to allocate command buffers, cursor memory, scanout buffers, etc., as well as support the latest features provided by packages like Mesa and the X.Org X server, your driver should support a memory manager.
If your driver supports memory management (it should!), you need to set that up at load time as well. How you initialize it depends on which memory manager you're using: TTM or GEM.
TTM (for Translation Table Manager) manages video memory and aperture space for graphics devices. TTM supports both UMA devices and devices with dedicated video RAM (VRAM), i.e. most discrete graphics devices. If your device has dedicated RAM, supporting TTM is desirable. TTM also integrates tightly with your driver-specific buffer execution function. See the radeon driver for examples.
The core TTM structure is the ttm_bo_driver struct. It contains several fields with function pointers for initializing the TTM, allocating and freeing memory, waiting for command completion and fence synchronization, and memory migration. See the radeon_ttm.c file for an example of usage.
The ttm_global_reference structure is made up of several fields:
	  struct ttm_global_reference {
	  	enum ttm_global_types global_type;
	  	size_t size;
	  	void *object;
	  	int (*init) (struct ttm_global_reference *);
	  	void (*release) (struct ttm_global_reference *);
	  };
	There should be one global reference structure for your memory manager as a whole, and there will be others for each object created by the memory manager at runtime. Your global TTM should have a type of TTM_GLOBAL_TTM_MEM. The size field for the global object should be sizeof(struct ttm_mem_global), and the init and release hooks should point at your driver-specific init and release routines, which probably eventually call ttm_mem_global_init and ttm_mem_global_release, respectively.
Once your global TTM accounting structure is set up and initialized by calling ttm_global_item_ref() on it, you need to create a buffer object TTM to provide a pool for buffer object allocation by clients and the kernel itself. The type of this object should be TTM_GLOBAL_TTM_BO, and its size should be sizeof(struct ttm_bo_global). Again, driver-specific init and release functions may be provided, likely eventually calling ttm_bo_global_init() and ttm_bo_global_release(), respectively. Also, like the previous object, ttm_global_item_ref() is used to create an initial reference count for the TTM, which will call your initialization function.
GEM is an alternative to TTM, designed specifically for UMA devices. It has simpler initialization and execution requirements than TTM, but has no VRAM management capability. Core GEM is initialized by calling drm_mm_init() to create a GTT DRM MM object, which provides an address space pool for object allocation. In a KMS configuration, the driver needs to allocate and initialize a command ring buffer following core GEM initialization. A UMA device usually has what is called a "stolen" memory region, which provides space for the initial framebuffer and large, contiguous memory regions required by the device. This space is not typically managed by GEM, and it must be initialized separately into its own DRM MM object.
Initialization is driver-specific. In the case of Intel integrated graphics chips like 965GM, GEM initialization can be done by calling the internal GEM init function, i915_gem_do_init(). Since the 965GM is a UMA device (i.e. it doesn't have dedicated VRAM), GEM manages making regular RAM available for GPU operations. Memory set aside by the BIOS (called "stolen" memory by the i915 driver) is managed by the DRM memrange allocator; the rest of the aperture is managed by GEM.
/* Basic memrange allocator for stolen space (aka vram) */ drm_memrange_init(&dev_priv->vram, 0, prealloc_size); /* Let GEM Manage from end of prealloc space to end of aperture */ i915_gem_do_init(dev, prealloc_size, agp_size);
Once the memory manager has been set up, we may allocate the command buffer. In the i915 case, this is also done with a GEM function, i915_gem_init_ringbuffer().
The final initialization task is output configuration. This involves:
Several core functions exist to create CRTCs, encoders, and connectors, namely: drm_crtc_init(), drm_connector_init(), and drm_encoder_init(), along with several "helper" functions to perform common tasks.
Connectors should be registered with sysfs once they've been detected and initialized, using the drm_sysfs_connector_add() function. Likewise, when they're removed from the system, they should be destroyed with drm_sysfs_connector_remove().
void intel_crt_init(struct drm_device *dev)
{
	struct drm_connector *connector;
	struct intel_output *intel_output;
	intel_output = kzalloc(sizeof(struct intel_output), GFP_KERNEL);
	if (!intel_output)
		return;
	connector = &intel_output->base;
	drm_connector_init(dev, &intel_output->base,
			   &intel_crt_connector_funcs, DRM_MODE_CONNECTOR_VGA);
	drm_encoder_init(dev, &intel_output->enc, &intel_crt_enc_funcs,
			 DRM_MODE_ENCODER_DAC);
	drm_mode_connector_attach_encoder(&intel_output->base,
					  &intel_output->enc);
	/* Set up the DDC bus. */
	intel_output->ddc_bus = intel_i2c_create(dev, GPIOA, "CRTDDC_A");
	if (!intel_output->ddc_bus) {
		dev_printk(KERN_ERR, &dev->pdev->dev, "DDC bus registration "
			   "failed.\n");
		return;
	}
	intel_output->type = INTEL_OUTPUT_ANALOG;
	connector->interlace_allowed = 0;
	connector->doublescan_allowed = 0;
	drm_encoder_helper_add(&intel_output->enc, &intel_crt_helper_funcs);
	drm_connector_helper_add(connector, &intel_crt_connector_helper_funcs);
	drm_sysfs_connector_add(connector);
}
	In the example above (again, taken from the i915 driver), a CRT connector and encoder combination is created. A device-specific i2c bus is also created for fetching EDID data and performing monitor detection. Once the process is complete, the new connector is registered with sysfs to make its properties available to applications.
Since many PC-class graphics devices have similar display output designs, the DRM provides a set of helper functions to make output management easier. The core helper routines handle encoder re-routing and the disabling of unused functions following mode setting. Using the helpers is optional, but recommended for devices with PC-style architectures (i.e. a set of display planes for feeding pixels to encoders which are in turn routed to connectors). Devices with more complex requirements needing finer grained management may opt to use the core callbacks directly.
[Insert typical diagram here.] [Insert OMAP style config here.]
Each encoder object needs to provide:
Connector helpers need to provide functions (mode-fetch, validity, and encoder-matching) for returning an ideal encoder for a given connector. The core connector functions include a DPMS callback, save/restore routines (deprecated), detection, mode probing, property handling, and cleanup functions.