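Hacking initrd.img: Adding a NIC Driver for Network Installation of Linux

How to add a driver to initrd.img


September 20, 2007


This article adds a network card (NIC) driver to initrd.img so that the Linux kernel can recognize the card and load its driver during boot, which makes a network installation possible.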

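Preface


Installing Linux over the network is not a new topic, and it is not an easy experience either. To install Linux on a machine over the network, and especially to automate the installation with kickstart as well, you have to do a great deal of configuration. As is well known, you pick one machine as the server and then configure DHCP, TFTP, NFS/HTTP/FTP, pxelinux, kickstart, and so on, on that machine.


All of this depends on at least one precondition: the Linux we are installing must correctly recognize and drive the NIC of every client. If the NIC cannot be driven, the client cannot fetch anything it needs from the server, and a network installation is out of the question.


This article adds the NIC driver to initrd.img so that the Linux kernel can recognize the card and load its driver during boot. It does not cover the background of network installation (why it is needed, what its benefits are) or the configuration steps themselves (DHCP, TFTP, pxelinux, and so on). It assumes that you are experienced with Linux, know the basics of shell programming, and have a basic understanding of the Linux boot process and device drivers.


Note: every machine being installed is called a client, and the machine that provides the network installation service is called the server.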





















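Advice before you start


Advice: if you have run into the problem described in the preface, the best solution is to find a Linux distribution that can drive the client's NIC. That saves a lot of trouble.


In practice, however, there are many reasons why switching to a different Linux distribution is not an option. For example:



  • The customer does not agree to a different Linux version, because a large number of their applications are already compiled for, and running well on, a particular version. Changing the distribution would mean porting those applications.
  • The customer has special hardware whose drivers exist only for a particular Linux distribution. Changing the distribution would stop that hardware from working.
  • No Linux distribution can drive the client's NIC. The NIC vendor only supplies a driver for one particular distribution, and everything else is DIY.
  • You have a strong DIY streak and prefer to overcome problems yourself rather than look for someone else's solution. In that case you are, without doubt, the ideal reader of this article.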




















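Approach


If you are familiar with the Linux boot process and device drivers, there are essentially two ways to solve this problem. The first is to compile the NIC driver into the kernel (statically linked). The second is to build the NIC driver as a module and arrange for the kernel to find and load it during boot. Of the two, the second is more practical and more extensible. First, some NIC drivers cannot be statically linked into the kernel at all and can only be built as modules; the e1000 driver used in the example below is one of them. Second, the module approach adapts to multiple kernel versions, whereas with method 1 the kernel has to be recompiled every time the kernel version changes. Finally, as we will see shortly, method 2 is simpler and easier to carry out than compiling a kernel.


Method 2 is implemented by customizing initrd.img and adding our NIC driver to it. initrd.img is a small root file system that is expanded in memory before the kernel has mounted the root partition on disk. It normally contains a few essential commands and drivers, such as the insmod command and the disk drivers. insmod is needed to load the disk driver into the kernel, and the disk driver is needed before the kernel can mount the root file system on disk.


Most Linux distributions ship an initrd.img intended for network installation, usually in the images/pxeboot directory of the first installation CD. On a machine where Linux is already installed, there is also an initrd.img under /boot. Compare the two and you will find that the one under pxeboot is much larger. During a network installation Linux does not try to mount a root partition on disk (on a machine without Linux installed, the disk may contain no data at all at that point), so this initrd.img has to carry a large number of drivers in order to recognize a wide range of hardware. The initrd.img under /boot needs little more than the disk driver: once the kernel can access the disk, everything else can be read from the disk without relying on initrd.img.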





















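Procedure and worked example


Once you have taken initrd.img from the installation CD, you can start customizing it. Thanks go to Jeremy Mates, whose initrd-util.sh script does a good job of unpacking and building an initrd.img. The script can be downloaded from http://sial.org/howto/linux/initrd/initrd-util.


The steps below use Red Hat Enterprise Linux Advanced Server 4 Update 2 x86_64 and the Intel e1000 NIC driver as the example. (In this example the server and the clients have the same Intel e1000 NIC, and the correct e1000 driver has already been installed by hand on the server.)


First copy initrd.img from the CD, log in to the server, and unpack it with initrd-util.sh:


Command output 1. Unpacking initrd.img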





[root@ericvm ~]# cd `./initrd-util.sh unpack initrd.img |tail -1`
info: initrd unpack expanded into: /var/tmp/initrd-util.workdir.DA29317
[root@ericvm initrd-util.workdir.DA29317]# pwd
/var/tmp/initrd-util.workdir.DA29317
[root@ericvm initrd-util.workdir.DA29317]# ls
2.6.9-22.EL bin dev etc linuxrc lost+found modules
proc sbin selinux sys tmp var


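initrd-util.sh is simple: it uses tools such as gunzip, mount, and cpio to unpack initrd.img. The driver bundle is in the modules directory and is named modules.cgz. Unpacking that file produces the 2.6.9-22.EL directory, which holds the drivers contained in initrd.img. In this example Red Hat already includes an e1000 driver, but it cannot drive our newer Intel e1000 NIC. So we download the newer driver from the e1000 website, compile it on the server to produce the .ko module file, and copy it into the 2.6.9-22.EL directory, overwriting the original file.


With the driver updated, the 2.6.9-22.EL directory has to be turned back into modules.cgz. initrd-util.sh cannot do this for us, so we do it by hand:


Command output 2. Adding the driver and repacking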





[root@ericvm initrd-util.workdir.DA29317]# find 2.6.9-22.EL | cpio -o -H crc > newmodules
16582 blocks
[root@ericvm initrd-util.workdir.DA29317]# gzip -n -9 newmodules
[root@ericvm initrd-util.workdir.DA29317]# mv newmodules.gz modules
[root@ericvm initrd-util.workdir.DA29317]# cd modules
[root@ericvm modules]# rm -f modules.cgz
[root@ericvm modules]# mv newmodules.gz modules.cgz
[root@ericvm modules]# pwd
/var/tmp/initrd-util.workdir.DA29317/modules


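Rebuilding the driver bundle does not yet mean that Linux can recognize the NIC, because Linux relies on a set of mappings to match hardware devices to driver module files. Those mappings are defined in the files in the modules directory other than modules.cgz:


Command output 3. Device and driver identification files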





[root@ericvm modules]# ls
module-info modules.cgz modules.dep modules.pcimap modules.usbmap pci.ids pcitable


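As shown above, pcitable and modules.pcimap define the mapping between PCI devices and driver modules, modules.dep defines the dependencies between modules (for example, the various SCSI device drivers all depend on a base SCSI driver module), and module-info holds static descriptive information about the drivers.


Filling in these text files is straightforward. First we need the PCI device information of the e1000 NIC. Because the card is already installed and working on the server, we can read the information there:


Command output 4. Viewing the NIC hardware information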





[root@ericvm ~]# lspci
………… ignore some outputs
04:00.0 Ethernet controller: Intel Corporation Enterprise Southbridge DPT LAN Copper
04:00.1 Ethernet controller: Intel Corporation Enterprise Southbridge DPT LAN Copper
………… ignore some outputs


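lspci lists the device information of the two NICs on the server. Using the device addresses (04:00.0 and 04:00.1), we can find the vendor code and device code of each card in the output of lspci -n (see the lspci manual page for details):


Command output 5. Viewing the NIC codes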





[root@ericvm ~]# lspci -n
………… ignore some outputs
04:00.0 Class 0200: 8086:1096 (rev 01)
04:00.1 Class 0200: 8086:1096 (rev 01)
………… ignore some outputs


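The output of lspci -n gives the vendor code and device code of both NICs: 8086 and 1096. With these two codes we can update pcitable, modules.pcimap, and the other files in the modules directory of initrd.img. For example, searching pcitable for e1000 shows many devices associated with the e1000 driver, but no entry for the combination 8086:1096. That is why Linux cannot drive this e1000 NIC. We have to add the codes 8086 and 1096 to pcitable by hand and map the device to the e1000 driver. Update the remaining files, such as module-info and modules.pcimap, in the same way.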
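The lookup described above can be sketched in Java. This is only an illustration, not part of initrd-util.sh or of the original procedure: the class and method names are hypothetical, and the two line layouts it prints (a tab-separated pcitable entry and a modules.pcimap entry with wildcard subvendor and subdevice fields) are assumptions based on the layout commonly seen in these files. Compare them with the existing e1000 entries in your own files before copying anything in.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns one line of `lspci -n` output into candidate
// pcitable / modules.pcimap entries. The output layouts are assumptions.
class PciEntrySketch {

    // Matches lines such as "04:00.0 Class 0200: 8086:1096 (rev 01)".
    private static final Pattern LSPCI_N = Pattern.compile(
        "^\\S+\\s+(?:Class\\s+)?[0-9a-fA-F]{4}:\\s+([0-9a-fA-F]{4}):([0-9a-fA-F]{4})");

    // Returns {vendor, device} as lower-case four-digit hex strings.
    static String[] parseIds(String lspciLine) {
        Matcher m = LSPCI_N.matcher(lspciLine.trim());
        if (!m.find()) {
            throw new IllegalArgumentException("not an lspci -n line: " + lspciLine);
        }
        return new String[] { m.group(1).toLowerCase(), m.group(2).toLowerCase() };
    }

    // Assumed pcitable layout: 0xVENDOR <tab> 0xDEVICE <tab> "driver" <tab> "description"
    static String pcitableLine(String vendor, String device, String driver, String description) {
        return "0x" + vendor + "\t0x" + device + "\t\"" + driver + "\"\t\"" + description + "\"";
    }

    // Assumed modules.pcimap layout:
    // module vendor device subvendor subdevice class class_mask driver_data
    static String pcimapLine(String vendor, String device, String driver) {
        return String.format(
            "%-20s 0x0000%s 0x0000%s 0xffffffff 0xffffffff 0x00000000 0x00000000 0x0",
            driver, vendor, device);
    }

    public static void main(String[] args) {
        String[] ids = parseIds("04:00.0 Class 0200: 8086:1096 (rev 01)");
        // The description text is a placeholder; use whatever wording your file uses.
        System.out.println(pcitableLine(ids[0], ids[1], "e1000", "placeholder description"));
        System.out.println(pcimapLine(ids[0], ids[1], "e1000"));
    }
}
```

The generated lines are meant to be reviewed and pasted by hand; the sketch deliberately does not edit the files itself.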


That completes the changes to initrd.img. Use initrd-util.sh to pack the directory again and produce a new initrd.img:


Command output 6. Rebuilding initrd.img





[root@ericvm ~]# ./initrd-util.sh pack /var/tmp/initrd-util.workdir.DA29317/
notice: new initrd size: 6144K
6144+0 records in
6144+0 records out
mke2fs 1.35 (28-Feb-2004)
info: initrd packed into: /var/tmp/initrd-util.initrd-new.IV29439.gz
/var/tmp/initrd-util.initrd-new.IV29439.gz
[root@ericvm ~]# ls -lh /var/tmp
total 3.7M
-rw-r--r-- 1 root root 3.7M Jun 20 17:10 initrd-util.initrd-new.IV29439.gz
drwxr-xr-x 12 root root 4.0K Jun 20 17:10 initrd-util.workdir.DA29317
drwxr-xr-x 13 root root 4.0K Jun 20 15:53 initrd-util.workdir.ID29288


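initrd-util.sh first creates an empty ("hole") file, builds an ext2 file system inside it, and mounts the file on a directory. It then uses rsync to copy our updated files into the mounted directory, so that the empty file now has content. Finally it compresses the file to produce the final image.


Rename /var/tmp/initrd-util.initrd-new.IV29439.gz to initrd.img and place it in the directory configured for TFTP. Clients that boot from the network will then fetch the new initrd.img, recognize the NIC, and start the network installation.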





















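Is that all?


So far everything looks fine. The client boots from the network, brings up the NIC, fetches everything it needs from the server, and starts installing. But unless something special is done, once the installation has finished and the client boots into the installed Linux, the NIC still does not work. Typical error messages are along the lines of "failed to load driver XXX" or "the MAC address of ethX is different from what was expected". The reason is simple: the correct NIC driver exists only in the initrd.img on the server and never reaches the client's hard disk. The client receives the server's initrd.img when it boots from the network, but Linux is not smart enough to unpack that initrd.img and copy the drivers in it to the client's disk. Once the client has finished installing and reboots from its hard disk, all driver files and information are read from the disk.


Take the same example again. Red Hat ships its own e1000 driver, which does not work with our Intel e1000 NIC. The rpm command shows which RPM package owns the e1000 driver installed on disk:


Command output 7. Finding the RPM package that contains the driver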





# rpm -qf /lib/modules/2.6.9-42.ELsmp/kernel/drivers/net/e1000/e1000.ko
kernel-smp-2.6.9-42.EL


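So the obvious way to solve this problem is to rebuild the kernel RPM package. Replacing or adding a file inside an RPM package, however, is not as simple as dragging a file into a RAR archive with the mouse. If you are interested, consult the RPM documentation.


Besides rebuilding the RPM, there are some simpler approaches that also work, although they are less orthodox than rebuilding the RPM. Interested readers are welcome to get in touch with me; I will not go into them here.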







 

resin vs jetty


Server: Red Hat AS 4, kernel 2.6.9-22.ELsmp
           Intel(R) Pentium(R) D CPU 2.80GHz
           2 GB RAM
           160 GB SATA


Client: Windows XP SP2
           Intel(R) Pentium(R) 4 CPU 2.93GHz
           1 GB RAM
           80 GB IDE hard disk


Test software: LoadRunner 7.8


Concurrent users: 500


Test code:


<%@ page language="java" import="java.util.*" pageEncoding="ISO-8859-1"%>
<%
String path = request.getContextPath();
String basePath = request.getScheme()+"://"+request.getServerName()+":"+request.getServerPort()+path+"/";


HashMap m = new HashMap();
for(int i=0;i<10000;i++)
 m.put(i,i);
m.clear();
m = null;
%>


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <base href="<%=basePath%>">

    <title>My JSP 'test.jsp' starting page</title>

 <meta http-equiv="pragma" content="no-cache">
 <meta http-equiv="cache-control" content="no-cache">
 <meta http-equiv="expires" content="0">
 <meta http-equiv="keywords" content="keyword1,keyword2,keyword3">
 <meta http-equiv="description" content="This is my page">
 <!--
 <link rel="stylesheet" type="text/css" href="styles.css">
 -->


  </head>

  <body>
    This is my JSP page. <%=basePath%><br>
  </body>
</html>


Test results:


jetty 6.1.5







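 Analysis summary    Period: 30-09-2007 10:20:45 - 30-09-2007 10:21:09

Scenario name: Scenario1
Results in session: e:\Temp\res\res.lrr
Duration: 24 seconds

 Statistics summary

  Maximum running Vusers: 500
  Total throughput (bytes): 439,733
  Average throughput (bytes/second): 17,589
  Total hits: 500
  Average hits per second: 20

 Transaction summary

  Transactions: Total passed: 2,000   Total failed: 0   Total stopped: 0

Transaction name   Minimum   Average   Maximum   Std. deviation   90%      Pass   Fail   Stop
index              0.634     2.209     3.847     0.856            3.283    500    0      0

 HTTP response summary

HTTP response   Total   Per second
HTTP_200        500     20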


 

 

 

Resin pro 3.0.23







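 Analysis summary    Period: 30-09-2007 10:23:44 - 30-09-2007 10:24:15

Scenario name: Scenario1
Results in session: e:\Temp\res\res.lrr
Duration: 31 seconds

 Statistics summary

  Maximum running Vusers: 500
  Total throughput (bytes): 428,380
  Average throughput (bytes/second): 13,387
  Total hits: 500
  Average hits per second: 15.625

 Transaction summary

  Transactions: Total passed: 2,000   Total failed: 0   Total stopped: 0

Transaction name   Minimum   Average   Maximum   Std. deviation   90%      Pass   Fail   Stop
index              0.652     6.722     11.39     4.065            10.882   500    0      0

 HTTP response summary

HTTP response   Total   Per second
HTTP_200        500     15.625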


 

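Summary:

      This test was fairly simple, but it should still show that Jetty performs somewhat better than Resin, even though a commercial license had been purchased for Resin. Tomcat was not included in this test; I will test Tomcat when I have time.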

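Notes on using Jetty

Jetty, Resin, Tomcat test report: http://www.strongd.net/blog/show/255


I have used Jetty as my web development server for a long time. Like every beginner, I started out with Tomcat as my development server, but the longer I used it, the more cumbersome and bloated it felt. Later I used JBoss, learned that JBoss used Jetty as its web application server, and started trying Jetty myself. From then on Jetty became my development server, and I have used it continuously from the original 4.0 up to the current 6.0.


I like Jetty for its convenience: a simple configuration file, a simple startup script, and easy debugging and running from Eclipse or other IDEs.


I will not say much more; let the facts speak. Before starting, download Jetty. The latest version at the moment is 6.1:
   http://docs.codehaus.org/display/JETTY/Downloading+and+Installing#download


Earlier releases were fairly small. The current release adds a lot, mainly sample applications and source code. After trimming, the whole installation is still fairly small, a little over 10 MB. If you only need to run web applications and only need the JSP 2.1 specification, it is just over 7 MB. Back in 4.0 it was only a bit over 2 MB. It has grown a lot since then, but times move on and there is no helping it.

The main Jetty JARs are jetty-6.1.1.jar, servlet-api-2.5-6.1.1.jar, and jetty-util-6.1.1.jar. The startup JAR is start.jar. There are also the JARs for the JSP 2.1 specification; quite a few seem to have been dropped, leaving only four files: core-3.1.1.jar, ant-1.6.5.jar, jsp-2.1.jar, and jsp-api-2.1.jar. core uses the Eclipse JDT to compile JSPs.

The main Jetty configuration file is etc/jetty.xml, although you can specify a different file. Inside start.jar there is a start.config file that holds the default environment configuration and names the default configuration file. It can be replaced by hand.


Starting Jetty is simple. On the command line: java -jar start.jar

To specify your own start.config, use: java -DSTART=start.config -jar start.jar

Configuring a web application is also very simple:

Just edit jetty.xml. You can add a web application either by placing it directly under webapps or by configuring a context like the following: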

  



<New id="Mywork" class="org.mortbay.jetty.webapp.WebAppContext">
  <Arg><Ref id="Contexts"/></Arg>
  <!-- Absolute path. A relative path also works if you prefix it with <SystemProperty name="jetty.home" default="."/> -->
  <Arg>d:/workspace/strong/web/</Arg>
  <Arg>/mywork</Arg>
  <Set name="defaultsDescriptor"><SystemProperty name="jetty.home" default="."/>/etc/webdefault.xml</Set>
  <Set name="virtualHosts">
    <Array type="java.lang.String">
      <Item>www.strongd.net</Item>
    </Array>
  </Set>
</New>




To change the original main webapps deployment, modify the following configuration:


<Call class="org.mortbay.jetty.webapp.WebAppContext" name="addWebApplications">
  <Arg><Ref id="Contexts"/></Arg>
  <Arg><SystemProperty name="jetty.home" default="."/>/webapps</Arg>
  <Arg><SystemProperty name="jetty.home" default="."/>/etc/webdefault.xml</Arg>
  <Arg type="boolean">True</Arg>  <!-- extract -->
  <Arg type="boolean">False</Arg> <!-- parent priority class loading -->
</Call>




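The default web.xml configuration file is webdefault.xml.
To change the corresponding web parameters, edit that file.

The default port is 8080. To change it, set the jetty.port property: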



    
<Call name="addConnector">
  <Arg>
    <New class="org.mortbay.jetty.nio.SelectChannelConnector">
      <Set name="port"><SystemProperty name="jetty.port" default="8080"/></Set>
      <Set name="maxIdleTime">30000</Set>
      <Set name="Acceptors">2</Set>
      <Set name="confidentialPort">8443</Set>
    </New>
  </Arg>
</Call>





Simple configuration, simple startup. In the next post I will show how to use Jetty from Eclipse.




Jetty vs. Tomcat vs. Resin: A Performance Comparison

This morning, I did some comparisons between Jetty 5.1.5rc1, Tomcat 5.5.9 and Resin 3.0.14 (OS version). I ran AppFuse’s “test-canoo” target, which tests all the JSPs using Canoo WebTest. I did this as a Servlet 2.4 application, and had to tweak some stuff in my web.xml to make it work on Jetty and Resin. Nothing big, just stuff that Tomcat let pass through and these servers didn’t. One interesting thing to note is that Resin requires you to use “http://java.sun.com/jstl/fmt” for JSTL’s “fmt” tag URI, while Jetty and Tomcat require “http://java.sun.com/jstl/fmt_rt”. This is with Resin’s “fast-jstl” turned off, because everything blows up if it’s turned on (I don’t feel like coding my JSTL to Resin’s standards, that’s why I turn it off).

Below is a list of the average time it took to run “test-canoo” after I ran it once to compile all the JSPs.


  • Jetty: 19 seconds
  • Tomcat: 19 seconds
  • Resin: 29 seconds

In addition, I tested how long it took each server to start up, including the initialization of AppFuse.



  • Jetty: 7 seconds
  • Tomcat: 8 seconds
  • Resin: 13 seconds

So what does all this mean? A number of things:



  • I need to clean up AppFuse’s web.xml a bit for 2.4 applications.
  • Putting the database connection pool configuration in a Spring context file (vs. JNDI) makes AppFuse much more portable.
  • Jetty isn’t as fast as Jetty-lovers say it is (or maybe Tomcat just caught up).
  • The open source version of Resin is much slower than the other open source servlet containers.
  • I should restructure the build.xml to pull out Tomcat stuff and allow users to specify server deployment settings (i.e. in a ${servername}.xml file).
  • Orion still doesn’t support the Servlet 2.4 or JSP 2.0 specifications.

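Free disc-burning software with Blu-ray burner support: AVS Disc Creator

 Software: AVS Disc Creator (version 2.1.5.100)
Category: Disc-burning program
License: Freeware (5.6 MB)

[Editor: 高啟唐]

AVS Disc Creator is a free disc-burning program. It supports not only CD and DVD burners but also the latest Blu-ray burners, so users need not worry about compatibility.

AVS Disc Creator offers practically all the commonly used burning functions. It can burn data discs, video discs, and MP3 discs, and it can create disc image files, so it is no less capable than paid burning software.

In addition, AVS Disc Creator supports auxiliary functions such as erasing rewritable discs and project scheduling. Still struggling to find burning software that is both suitable and free? Give AVS Disc Creator a try!


Download: http://www.avsmedia.com/download/AVSDiscCreator.exe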

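Generate non-repeating random numbers


// Requires: import java.util.ArrayList;

// Numbers picked so far; declared outside the loop so earlier picks are remembered.
ArrayList<Integer> tmp_l = new ArrayList<Integer>();

int pos = 100;

for(int i=0;i<100;i++){

// (int)(Math.random()*pos) is uniform over 0..pos-1.
int serial = (int)(Math.random()*pos);

// Redraw until the number has not been used yet. There is no retry limit:
// giving up after a fixed number of tries would let a duplicate through.
while(tmp_l.contains(serial)){
    serial = (int)(Math.random()*pos);
}

tmp_l.add(serial);

System.out.println("The random number for "+i+" is:"+serial);

}

Only a code fragment is given.

Draw one random number and store it in the ArrayList. Then draw a second number and check whether it is already in the ArrayList; if it is, draw the second number again. Continue in the same way until 100 random numbers have been produced.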
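The draw-and-retry approach above gets slow toward the end, because almost every draw collides with a number already taken. As an alternative sketch (a different technique from the snippet above, not a restatement of it), you can fill a list with every candidate value once and shuffle it. `Collections.shuffle` is part of the standard library; the class and method names here are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example: n distinct random numbers in the range 0..n-1.
class UniqueRandomSketch {

    // Returns the numbers 0..n-1 in random order, each exactly once.
    static List<Integer> permutation(int n) {
        List<Integer> values = new ArrayList<Integer>(n);
        for (int i = 0; i < n; i++) {
            values.add(i);
        }
        Collections.shuffle(values); // one pass over the list, no retries
        return values;
    }

    public static void main(String[] args) {
        List<Integer> numbers = permutation(100);
        for (int i = 0; i < numbers.size(); i++) {
            System.out.println("The random number for " + i + " is:" + numbers.get(i));
        }
    }
}
```

If you need fewer numbers than the size of the range, take the first k elements of the shuffled list.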

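Free image-burning software that supports many formats: ImgBurn



As disc burners have become common, more and more large files on the Internet are distributed as disc images. An image has to be burned to a disc before it can be read, though. Is there image-burning software that supports many formats and is also free?

Try ImgBurn! ImgBurn is a free image-burning program. It can burn the most common image types on the market, including ISO, MDS, BIN, DI, DVD, GI, IMG, NRG, PDI, CDI, CDR, GCM, IBQ, LST, and VDI, and it also supports CD, DVD, HD DVD, and Blu-ray media.

Besides burning images, ImgBurn can also create them. It can only create ISO images, but that is already far better than a lot of paid software of the same kind. ImgBurn's interface is clear and easy to understand, and the program is small and light on system resources. For anyone who needs to burn disc images, it is a rare and valuable helper.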

 

Speeding up Linux Using hdparm

Are you running an Intel Linux system with at least one (E)IDE hard drive?

Wouldn’t it be neat if there were a magical command to instantly double the I/O performance of your disks? Or, in some cases, show 6 to 10 times your existing throughput?


Did you ever just wonder how to tell what kind of performance you’re getting on your “tricked-out” Linux box?


Don’t overlook hdparm(8). If you’ve never heard of it, don’t worry. Most people I’ve talked to haven’t either. But if you’re running an IDE/Linux system (as many folks are,) you’ll wonder how you ever got this far without it. I know I did.


What’s the big deal?


So, you’ve got your brand-new UltraATA/66 EIDE drive with a screaming brand-new controller chipset that supports multiple PIO modes and DMA and the leather seat option and extra chrome… But is your system actually taking advantage of these snazzy features? The hdparm(8) command will not only tell you how your drives are performing, but will let you tweak them out to your heart’s content.


Now before you get too excited, it is worth pointing out that under some circumstances, these commands CAN CAUSE UNEXPECTED DATA CORRUPTION! Use them at your own risk! At the very least, back up your box and bring it down to single-user mode before proceeding.


With the usual disclaimer out of the way, I’d like to point out that if you are using current hardware (i.e. your drive AND controller AND motherboard were manufactured in the last two or three years), you are at considerably lower risk. I’ve used these commands on several boxes with various hardware configurations, and the worst I’ve seen happen is the occasional hang, with no data problems on reboot. And no matter how much you might whine at me and the world in general for your personal misfortune, we all know who is ultimately responsible for the well-being of YOUR box: YOU ARE. Caveat Fair Reader.


Now, then. If I haven’t scared you away yet, try this (as root, preferably in single-user mode):

hdparm -Tt /dev/hda

You’ll see something like:

/dev/hda:
Timing buffer-cache reads: 128 MB in 1.34 seconds =95.52 MB/sec
Timing buffered disk reads: 64 MB in 17.86 seconds = 3.58 MB/sec

What does this tell us? The -T means to test the cache system (i.e., the memory, CPU, and buffer cache). The -t means to report stats on the disk in question, reading data not in the cache. The two together, run a couple of times in a row in single-user mode, will give you an idea of the performance of your disk I/O system. (These are actual numbers from a PII/350 / 128M Ram / newish EIDE HD; your numbers will vary.)


But even with varying numbers, 3.58 MB/sec is PATHETIC for the above hardware. I thought the ad for the HD said something about 66MB per second!!?!? What gives?


Well, let’s find out more about how Linux is addressing your drive:

hdparm /dev/hda

/dev/hda:
multcount = 0 (off)
I/O support = 0 (default 16-bit)
unmaskirq = 0 (off)
using_dma = 0 (off)
keepsettings = 0 (off)
nowerr = 0 (off)
readonly = 0 (off)
readahead = 8 (on)
geometry = 1870/255/63, sectors = 30043440, start = 0


These are the defaults. Nice, safe, but not necessarily optimal. What’s all this about 16-bit mode? I thought that went out with the 386! And why are most of the other options turned off?


Well, it’s generally considered a good idea for any self-respecting distribution to install itself in the kewlest, slickest, but SAFEST way it possibly can. The above settings are virtually guaranteed to work on any hardware you might throw at it. But since we know we’re throwing something more than a dusty, 8-year-old, 16-bit multi-IO card at it, let’s talk about the interesting options:



  • multcount: Short for multiple sector count. This controls how many sectors are fetched from the disk in a single I/O interrupt. Almost all modern IDE drives support this. The man page claims:


    When this feature is enabled, it typically reduces operating system overhead for disk I/O by 30-50%. On many systems, it also provides increased data throughput of anywhere from 5% to 50%.

  • I/O support: This is a big one. This flag controls how data is passed from the PCI bus to the controller. Almost all modern controller chipsets support mode 3, or 32-bit mode w/sync. Some even support 32-bit async. Turning this on will almost certainly double your throughput (see below.)


  • unmaskirq: Turning this on will allow Linux to unmask other interrupts while processing a disk interrupt. What does that mean? It lets Linux attend to other interrupt-related tasks (i.e., network traffic) while waiting for your disk to return with the data it asked for. It should improve overall system response time, but be warned: Not all hardware configurations will be able to handle it. See the manpage.


  • using_dma: DMA can be a tricky business. If you can get your controller and drive using a DMA mode, do it. But I have seen more than one machine hang while playing with this option. Again, see the manpage (and the example below)!

 


Turbocharged


So, since we have our system in single-user mode like a good little admin, let’s try out some turbo settings:



hdparm -c3 -m16 /dev/hda

/dev/hda:
setting 32-bit I/O support flag to 3
setting multcount to 16
multcount = 16 (on)
I/O support = 3 (32-bit w/sync)


Great! 32-bit sounds nice. And some multi-reads might work. Let’s re-run the benchmark:

hdparm -tT /dev/hda


/dev/hda:
Timing buffer-cache reads: 128 MB in 1.41 seconds =90.78 MB/sec
Timing buffered disk reads: 64 MB in 9.84 seconds = 6.50 MB/sec


WOW! Almost double the disk throughput without really trying! Incredible.


But wait, there’s more: We’re still not unmasking interrupts, using DMA, or even using a decent PIO mode! Of course, enabling these gets riskier. (Why is it always a trade-off between freedom and security?) The man page mentions trying Multiword DMA mode2, so:

hdparm -X34 -d1 -u1 /dev/hda

…Unfortunately this seems to be unsupported on this particular box (it hung like an NT box running a Java app.) So, after rebooting it (again in single-user mode), I went with this:

hdparm -X66 -d1 -u1 -m16 -c3 /dev/hda

/dev/hda:
setting 32-bit I/O support flag to 3
setting multcount to 16
setting unmaskirq to 1 (on)
setting using_dma to 1 (on)
setting xfermode to 66 (UltraDMA mode2)
multcount = 16 (on)
I/O support = 3 (32-bit w/sync)
unmaskirq = 1 (on)
using_dma = 1 (on)


And then checked:

hdparm -tT /dev/hda

/dev/hda:
Timing buffer-cache reads: 128 MB in 1.43 seconds =89.51 MB/sec
Timing buffered disk reads: 64 MB in 3.18 seconds =20.13 MB/sec


20.13 MB/sec. A far cry from the minuscule 3.58 we started with…
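The figures hdparm reports are simply megabytes read divided by elapsed seconds, so the three runs are easy to compare. The following sketch redoes that arithmetic with the numbers quoted in this article; the class and method names are made up for illustration.

```java
// Hypothetical example: recompute the buffered disk read rates quoted above.
class HdparmThroughputSketch {

    // Throughput in MB/sec for a read of the given size and duration.
    static double mbPerSec(double megabytes, double seconds) {
        return megabytes / seconds;
    }

    public static void main(String[] args) {
        double defaults = mbPerSec(64, 17.86); // default settings
        double pio32    = mbPerSec(64, 9.84);  // after -c3 -m16
        double udma     = mbPerSec(64, 3.18);  // after -X66 -d1 -u1 -m16 -c3

        System.out.printf("defaults:   %.2f MB/sec%n", defaults);
        System.out.printf("-c3 -m16:   %.2f MB/sec (%.1fx)%n", pio32, pio32 / defaults);
        System.out.printf("with UDMA2: %.2f MB/sec (%.1fx)%n", udma, udma / defaults);
    }
}
```

On this particular box the final settings work out to roughly 5.6 times the original disk throughput.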


By the way, notice how we specified the -m16 and -c3 switch again? That’s because it doesn’t remember your hdparm settings between reboots. Be sure to add the above line (not the test line with -tT flags!) to your /etc/rc.d/* scripts once you’re sure the system is stable (and preferably after your fsck runs; having an extensive fs check run with your controller in a flaky mode may be a good way to generate vast quantities of entropy, but it’s no way to administer a system. At least not with a straight face…)


Now, after running the benchmark a few more times, reboot in multi-user mode and fire up X. Load Netscape. And try not to fall out of your chair.


In conclusion


This is one of those interesting little tidbits that escapes many “seasoned” Linux veterans, especially since one never sees any indication that the system isn’t using the most optimal settings. (Gee, all my kernel messages have looked fine….) And using hdparm isn’t completely without risk, but is well worth investigating.


And it doesn’t stop at performance: hdparm lets you adjust various power saving modes as well. See hdparm(8) for the final word.


Many thanks to Mark Lord for putting together this nifty utility. If your particular distribution doesn’t include hdparm (usually in /sbin or /usr/sbin), get it from the source at http://metalab.unc.edu/pub/Linux/system/hardware/


Happy hacking!

Beyond Preferences API Basics

The Preferences API was first covered here shortly after it was introduced with the 1.4 version of the standard platform: the July 15, 2003 article, the Preferences API.


That article described how to get and set user specific preferences. There is more to the Preferences API than just getting and setting user specific settings. There are system preferences, import and export preferences, and event notifications associated with preferences. There is even a way to provide your own custom location for storage of preferences. The first three options mentioned will be described here. Creating a custom preferences factory will be left to a later tip.


System Preferences


The Preferences API provides for two separate sets of preferences. The first set is for the individual user and allows multiple users on the same machine to have different settings defined. These are called user preferences. Each user who shares the same machine can have his or her own unique set of values associated with a group of preferences. Examples would be a user password or a starting directory. You don’t want every person on the same machine to have the same password and home directory. Well, I would hope you don’t want that.


The other form of preferences is the system type. All users of a machine share the same set of system preferences. For instance, the location of an installed printer would typically be a system preference. You wouldn’t necessarily have a different set of printers installed for different users. Everyone running on one machine would know about all printers known by that machine.


Another example of a system preference would be the high score of a game. There should only be one overall high score. That’s what a system preference would be used for. In the previous tip you saw how userNodeForPackage() — and subsequently userRoot() — was used to acquire the user’s preference node. The following example shows how to get the appropriate part of the system preferences tree with systemNodeForPackage() — or systemRoot() for the root. Other than the method call to get the right preference node, the API usage is identical.


The example is a simple game, using the game term loosely here. It picks a random number from 0 to 99. If the number is higher than the previously saved number, it updates the “high score.” The example also shows the current high score. The Preferences API usage is rather simple. The example just gets the saved value with getSavedHighScore() , providing a default of -1 if no high score had been saved yet, and updateHighScore(int value) to store the new high score. The HIGH_SCORE key is a constant shared by the new Preferences API accesses.


  private static int getSavedHighScore() {
    Preferences systemNode = Preferences.systemNodeForPackage(High.class);
    return systemNode.getInt(HIGH_SCORE, -1);
  }

  private static void updateHighScore(int value) {
    Preferences systemNode = Preferences.systemNodeForPackage(High.class);
    systemNode.putInt(HIGH_SCORE, value);
 }

Here’s what the whole program looks like:

import java.util.*;
import java.util.prefs.*;
import javax.swing.*;
import java.awt.*;
import java.awt.event.*;

public class High {
  static JLabel highScore = new JLabel();
  static JLabel score = new JLabel();
  static Random random = new Random(new Date().getTime());
  private static final String HIGH_SCORE = "High.highScore";

  public static void main (String args[]) {
    /* -- Uncomment these lines to clear saved score
    Preferences systemNode = Preferences.systemNodeForPackage(High.class);
    systemNode.remove(HIGH_SCORE);
    */

    EventQueue.invokeLater(
      new Runnable() {
        public void run() {
          JFrame frame = new JFrame("High Score");
          frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
          updateHighScoreLabel(getSavedHighScore());
          frame.add(highScore, BorderLayout.NORTH);
          frame.add(score, BorderLayout.CENTER);
          JButton button = new JButton("Play");
          ActionListener listener = new ActionListener() {
            public void actionPerformed(ActionEvent e) {
              int next = random.nextInt(100);
              score.setText(Integer.toString(next));
              int old = getSavedHighScore();
              if (next > old) {
                Toolkit.getDefaultToolkit().beep();
                updateHighScore(next);
                updateHighScoreLabel(next);
              }
            }
          };
          button.addActionListener(listener);
          frame.add(button, BorderLayout.SOUTH);
          frame.setSize(200, 200);
          frame.setVisible(true);
        }
      }
    );
  }

  private static void updateHighScoreLabel(int value) {
    if (value == -1) {
      highScore.setText("");
    } else {
      highScore.setText(Integer.toString(value));
    }
  }

  private static int getSavedHighScore() {
    Preferences systemNode = Preferences.systemNodeForPackage(High.class);
    return systemNode.getInt(HIGH_SCORE, -1);
  }

  private static void updateHighScore(int value) {
    Preferences systemNode = Preferences.systemNodeForPackage(High.class);
    systemNode.putInt(HIGH_SCORE, value);
 }
}

And, here’s what the screen looks like after a few runs. The 61 score is not apt to be your high score, but it certainly could be.




You can try running the application as different users to see that they all share the same high score.


Import and Export


In the event that you wish to transfer preferences from one user to another or from one system to another, you can export the preferences from that one user/system, and then import them to the other side. When preferences are exported, they are exported into an XML formatted document whose DTD is specified by http://java.sun.com/dtd/preferences.dtd , though you don’t really need to know that. You can export either a whole subtree with the exportSubtree() method or just a single node with the exportNode() method. Both methods accept an OutputStream argument to specify where to store things. The XML document will be UTF-8 character encoded. Importing of the data then happens via the importPreferences() method, which takes an InputStream argument. From an API perspective, there is no difference in importing a system node/tree or a user node.


Adding a few lines of code to the previous example will export the newly updated high score to the file high.xml. Much of the added code is responsible for launching a new thread to save the file and for handling exceptions. There are only three lines to export the single node:

    Thread runner = new Thread(new Runnable() {
      public void run() {
        try {
          FileOutputStream fis = new FileOutputStream("high.xml");
          systemNode.exportNode(fis);
          fis.close();
        } catch (Exception e) {
          Toolkit.getDefaultToolkit().beep();
          Toolkit.getDefaultToolkit().beep();
          Toolkit.getDefaultToolkit().beep();
        }
      }
    });
    runner.start();

When exported, the file will look something like the following:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE preferences SYSTEM "http://java.sun.com/dtd/preferences.dtd">
<preferences EXTERNAL_XML_VERSION="1.0">
  <root type="system">
    <map/>
    <node name="<unnamed>">
      <map>
        <entry key="High.highScore" value="95"/>
      </map>
    </node>
  </root>
</preferences>

Notice the root element has a type attribute that says “system”. This states the type of node it is. The node also has a name attribute valued at “<unnamed>”. Since the High class was not placed in a package, you get to work in the unnamed system node area. The entry element provides the current high score value, 95 in the example here, though your value could differ.


While we won’t include any import code in the example here, the way to import is just a static method call on Preferences, passing in the appropriate input stream:

  FileInputStream fis = new FileInputStream("high.xml");
  Preferences.importPreferences(fis);
  fis.close();

Since the XML file includes information about whether the preferences are system or user type, the import call doesn’t have to explicitly include this bit of information. Besides the typical IOExceptions that can happen, the import call will throw an InvalidPreferencesFormatException if the file format is invalid. Exporting can also throw a BackingStoreException if the data to export can’t be read correctly from the backing store.
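To see export and import working together without touching the high-score file, here is a small round-trip sketch. It is an illustration under stated assumptions, not part of the High program: it uses a user node (so no administrator rights are needed), a made-up node name "prefs-roundtrip-sketch", and in-memory streams instead of high.xml. The Preferences calls themselves (exportNode, importPreferences, putInt, getInt, remove) are the standard java.util.prefs API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.prefs.BackingStoreException;
import java.util.prefs.InvalidPreferencesFormatException;
import java.util.prefs.Preferences;

// Hypothetical example: export one node to XML in memory, then import it again.
class PrefsRoundTripSketch {
    static final String KEY = "High.highScore";

    // Stores the score, exports the node, deletes the key, re-imports the XML,
    // and returns whatever value is readable afterwards.
    static int roundTrip(int score) {
        // Node name is an assumption made up for this sketch.
        Preferences node = Preferences.userRoot().node("prefs-roundtrip-sketch");
        try {
            node.putInt(KEY, score);

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            node.exportNode(out);            // XML for this single node
            byte[] xml = out.toByteArray();

            node.remove(KEY);                // the value is gone now...
            Preferences.importPreferences(new ByteArrayInputStream(xml));
            return node.getInt(KEY, -1);     // ...and restored by the import
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (BackingStoreException e) {
            throw new IllegalStateException(e);
        } catch (InvalidPreferencesFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            node.remove(KEY);                // leave no stray value behind
        }
    }

    public static void main(String[] args) {
        System.out.println("restored value: " + roundTrip(95));
    }
}
```

The import call never names the node or the tree: the exported XML records both, which is why a single static method is enough.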


Event Notifications


The original version of the High game updated the high score preference, then explicitly made a call to update the label on the screen. A better way to perform this action would be to add a listener to the preferences node, then a value change can automatically trigger the label to update its value. That way, if the high score is ever updated from multiple places, you won’t need to remember to add code to update the label after saving the updated value.


The two lines:

  updateHighScore(next);
  updateHighScoreLabel(next);

can become one with the addition of the right listeners.

  updateHighScore(next);

There is a PreferenceChangeListener and its associated PreferenceChangeEvent for just such a task. The listener will be notified for all changes to the associated node, so you need to check for which key-value pair was modified, as shown here.

    PreferenceChangeListener changeListener =
        new PreferenceChangeListener() {

      public void preferenceChange(PreferenceChangeEvent e) {
        if (HIGH_SCORE.equals(e.getKey())) {
          String newValue = e.getNewValue();
          int value = Integer.parseInt(newValue);
          updateHighScoreLabel(value);
        }
      }
    };
    systemNode.addPreferenceChangeListener(changeListener);

The PreferenceChangeEvent has three important properties: the key, the new value, and the node itself. Unlike Preferences, the event offers no typed convenience methods for the new value: getNewValue() returns a String, so you can’t retrieve the value as an int directly and must convert it yourself. Here’s what the modified High class looks like:

import java.awt.*;
import java.awt.event.*;
import java.io.*;
import java.util.*;
import java.util.prefs.*;
import javax.swing.*;

public class High {
  static JLabel highScore = new JLabel();
  static JLabel score = new JLabel();
  static Random random = new Random(new Date().getTime());
  private static final String HIGH_SCORE = "High.highScore";
  static Preferences systemNode =
      Preferences.systemNodeForPackage(High.class);

  public static void main (String args[]) {
    /* -- Uncomment these lines to clear saved score
    systemNode.remove(HIGH_SCORE);
    */

    PreferenceChangeListener changeListener =
        new PreferenceChangeListener() {

      public void preferenceChange(PreferenceChangeEvent e) {
        if (HIGH_SCORE.equals(e.getKey())) {
          String newValue = e.getNewValue();
          int value = Integer.parseInt(newValue);
          updateHighScoreLabel(value);
        }
      }
    };
    systemNode.addPreferenceChangeListener(changeListener);

    EventQueue.invokeLater(
      new Runnable() {
        public void run() {
          JFrame frame = new JFrame("High Score");
          frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
          updateHighScoreLabel(getSavedHighScore());
          frame.add(highScore, BorderLayout.NORTH);
          frame.add(score, BorderLayout.CENTER);
          JButton button = new JButton(“Play”);
          ActionListener listener = new ActionListener() {
            public void actionPerformed(ActionEvent e) {
              int next = random.nextInt(100);
              score.setText(Integer.toString(next));
              int old = getSavedHighScore();
              if (next > old) {
                Toolkit.getDefaultToolkit().beep();
                updateHighScore(next);
              }
            }
          };
          button.addActionListener(listener);
          frame.add(button, BorderLayout.SOUTH);
          frame.setSize(200, 200);
          frame.setVisible(true);
        }
      }
    );
  }

  private static void updateHighScoreLabel(int value) {
    if (value == -1) {
      highScore.setText("");
    } else {
      highScore.setText(Integer.toString(value));
    }
  }

  private static int getSavedHighScore() {
    return systemNode.getInt(HIGH_SCORE, -1);
  }

  private static void updateHighScore(int value) {
    systemNode.putInt(HIGH_SCORE, value);
    // Save XML in separate thread
    Thread runner = new Thread(new Runnable() {
      public void run() {
        try {
          FileOutputStream fos = new FileOutputStream("high.xml");
          systemNode.exportNode(fos);
          fos.close();
        } catch (Exception e) {
          Toolkit.getDefaultToolkit().beep();
          Toolkit.getDefaultToolkit().beep();
          Toolkit.getDefaultToolkit().beep();
        }
      }
    });
    runner.start();
  }
}
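
The listener in the class above converts the new value by hand and assumes it is always a valid number. A slightly more defensive conversion might look like the sketch below. The class name ScoreParser and the method parseScore are hypothetical. The sketch relies on the documented behavior that getNewValue() returns null when a preference is removed.

```java
// Hypothetical helper; the name ScoreParser is not from the High example.
public class ScoreParser {

  // Converts the String carried by a PreferenceChangeEvent to an int.
  // Falls back to the given default when the value is null (the key was
  // removed) or is not a valid integer.
  static int parseScore(String newValue, int fallback) {
    if (newValue == null) {
      return fallback;
    }
    try {
      return Integer.parseInt(newValue.trim());
    } catch (NumberFormatException e) {
      return fallback;
    }
  }

  public static void main(String[] args) {
    System.out.println(parseScore("95", -1));    // valid number
    System.out.println(parseScore(null, -1));    // key was removed
    System.out.println(parseScore("oops", -1));  // not a number
  }
}
```

Inside the listener, the call would then be something like updateHighScoreLabel(ScoreParser.parseScore(e.getNewValue(), -1)), which matches the -1 convention the High class already uses for "no score".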

In addition to the PreferenceChangeListener/PreferenceChangeEvent pair, there is a NodeChangeListener and NodeChangeEvent combination. These report the addition and removal of child nodes, not changes to the values stored in a node. If you are writing something like a Preferences viewer, you’d want to know when nodes appear and disappear, so these classes may be of interest, too.
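
A rough sketch of a NodeChangeListener in action follows. To avoid touching the real system or user backing store, it uses a hypothetical in-memory node, MemoryNode, built on the AbstractPreferences base class. The names NodeWatch, MemoryNode and watch are assumptions made for this illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.prefs.AbstractPreferences;
import java.util.prefs.NodeChangeEvent;
import java.util.prefs.NodeChangeListener;
import java.util.prefs.Preferences;

public class NodeWatch {

  // Hypothetical in-memory node, so the sketch never uses a real backing store.
  static class MemoryNode extends AbstractPreferences {
    private final Map<String, String> values =
        new ConcurrentHashMap<String, String>();

    MemoryNode(AbstractPreferences parent, String name) {
      super(parent, name);
      newNode = true;  // tells AbstractPreferences to fire childAdded
    }
    protected void putSpi(String key, String value) { values.put(key, value); }
    protected String getSpi(String key) { return values.get(key); }
    protected void removeSpi(String key) { values.remove(key); }
    protected void removeNodeSpi() { values.clear(); }
    protected String[] keysSpi() { return values.keySet().toArray(new String[0]); }
    // Children live only in the superclass cache, so none are reported here.
    protected String[] childrenNamesSpi() { return new String[0]; }
    protected AbstractPreferences childSpi(String name) {
      return new MemoryNode(this, name);
    }
    protected void syncSpi() { }
    protected void flushSpi() { }
  }

  // Adds and then removes one child node, returning the events observed.
  static List<String> watch() throws Exception {
    final List<String> seen =
        Collections.synchronizedList(new ArrayList<String>());
    final CountDownLatch done = new CountDownLatch(2);
    Preferences root = new MemoryNode(null, "");
    root.addNodeChangeListener(new NodeChangeListener() {
      public void childAdded(NodeChangeEvent e) {
        seen.add("added " + e.getChild().name());
        done.countDown();
      }
      public void childRemoved(NodeChangeEvent e) {
        seen.add("removed " + e.getChild().name());
        done.countDown();
      }
    });
    Preferences scores = root.node("scores");
    scores.removeNode();
    // Events are delivered on a separate thread, so wait for both.
    done.await(5, TimeUnit.SECONDS);
    return seen;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(watch());
  }
}
```

With a real node such as the systemNode used in the High class, only the addNodeChangeListener call and the listener body would be needed; the in-memory node is there purely to keep the sketch self-contained.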


The whole Preferences API can be quite handy to store data beyond the life of your application without having to rely on a database system. For more information on the API, see the article “Sir, What is Your Preference?”


Using Enhanced For-Loops with Your Classes

The enhanced for-loop is a popular feature introduced with the Java SE platform in version 5.0. Its simple structure allows one to simplify code by presenting for-loops that visit each element of an array/collection without explicitly expressing how one goes from element to element.


Because the old style of coding didn’t become invalid with the new for-loop syntax, you don’t have to use an enhanced for-loop when visiting each element of an array/collection. However, with the new style, one’s code would typically change from something like the following:


for (int i=0; i<array.length; i++) {
    System.out.println("Element: " + array[i]);
}

to the newer form:

for (String element : array) {
    System.out.println("Element: " + element);
}


Assuming “array” is defined to be an array of String objects, each element is assigned to the element variable as it loops through the array. These basics of the enhanced for-loop were covered in an earlier Tech Tip: The Enhanced For Loop, from May 5, 2005.


Suppose you have a class called Colony that contains a group of Penguin objects. Without doing anything extra to support the enhanced for-loop, one way to loop through each penguin would be to have the colony return an Iterator and iterate through it. Unfortunately, the enhanced for-loop does not work with an Iterator, so the following won’t even compile:


// Does not compile
import java.util.*;
public class BadColony {
  static class Penguin {
    String name;
    Penguin(String name) {
      this.name = name;
    }
    public String toString() {
      return "Penguin{" + name + "}";
    }
  }

  Set<Penguin> set = new HashSet<Penguin>();

  public void addPenguin(Penguin p) {
    set.add(p);
  }

  public Iterator<Penguin> getPenguins() {
    return set.iterator();
  }

  public static void main(String args[]) {
    BadColony colony = new BadColony();
    Penguin opus = new Penguin("Opus");
    Penguin chilly = new Penguin("Chilly Willy");
    Penguin mumble = new Penguin("Mumble");
    Penguin emperor = new Penguin("Emperor");
    colony.addPenguin(opus);
    colony.addPenguin(chilly);
    colony.addPenguin(mumble);
    colony.addPenguin(emperor);
    Iterator<Penguin> it = colony.getPenguins();
// The bad line of code:
    for (Penguin p : it) {
      System.out.println(p);
    }
  }
}

You cannot just pass an Iterator into the enhanced for-loop. The second line of the following will generate a compilation error:

    Iterator<Penguin> it = colony.getPenguins();
    for (Penguin p : it) {

The error:

BadColony.java:36: foreach not applicable to expression type
    for (Penguin p : it) {
                     ^
1 error

To use your class with an enhanced for-loop, it does need an Iterator, but that Iterator must be provided via the Iterable interface:


public interface java.lang.Iterable {
    public java.util.Iterator iterator();
}

Actually, to be more correct, the interface is generic. The type parameter T lets the enhanced for-loop hand back elements of the designated type without casting, instead of just a plain old Object.

public interface java.lang.Iterable<T> {
    public java.util.Iterator<T> iterator();
}

It is this Iterable object which is then provided to the enhanced for-loop. By making the Colony class implement Iterable, and having its new iterator() method return the Iterator that getPenguins() provides, you’ll be able to loop through the penguins in the colony via an enhanced for-loop.


By adding the proper implements clause:

public class Colony implements Iterable<Colony.Penguin> {


You then get your enhanced for-loop for the colony:

    for (Penguin p : colony) {

Here’s the updated Colony class with the corrected code:

import java.util.*;

public class Colony implements Iterable<Colony.Penguin> {

  static class Penguin {
    String name;
    Penguin(String name) {
      this.name = name;
    }
    public String toString() {
      return "Penguin{" + name + "}";
    }
  }

  Set<Penguin> set = new HashSet<Penguin>();

  public void addPenguin(Penguin p) {
    set.add(p);
  }

  public Iterator<Penguin> getPenguins() {
    return set.iterator();
  }

  public Iterator<Penguin> iterator() {
    return getPenguins();
  }

  public static void main(String args[]) {
    Colony colony = new Colony();
    Penguin opus = new Penguin("Opus");
    Penguin chilly = new Penguin("Chilly Willy");
    Penguin mumble = new Penguin("Mumble");
    Penguin emperor = new Penguin("Emperor");
    colony.addPenguin(opus);
    colony.addPenguin(chilly);
    colony.addPenguin(mumble);
    colony.addPenguin(emperor);
    for (Penguin p : colony) {
      System.out.println(p);
    }
  }
}

Running the code produces the following output:

  > java Colony

  Penguin{Chilly Willy}
  Penguin{Mumble}
  Penguin{Opus}
  Penguin{Emperor}

Keep in mind that the individual penguins are internally kept in a Set type collection so the returned order doesn’t necessarily match the insertion order, which in this case it doesn’t.
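
If insertion order matters, one option is to back the collection with a LinkedHashSet, which iterates in insertion order. The sketch below is a stand-alone illustration rather than a change to the Colony class: the class name OrderedColony and the method inIterationOrder are hypothetical, and it stores plain names instead of Penguin objects to stay short.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone sketch; it stores names rather than Penguin objects.
public class OrderedColony {

  // Adds the names to a LinkedHashSet and returns them in iteration order.
  static List<String> inIterationOrder(String... names) {
    Set<String> set = new LinkedHashSet<String>();
    for (String name : names) {
      set.add(name);
    }
    return new ArrayList<String>(set);
  }

  public static void main(String[] args) {
    for (String name
        : inIterationOrder("Opus", "Chilly Willy", "Mumble", "Emperor")) {
      System.out.println(name);
    }
  }
}
```

In the Colony class itself, the equivalent change would be to construct the set field as a LinkedHashSet<Penguin> instead of a HashSet<Penguin>.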


Remember to genericize the implements clause for the class: write “implements Iterable<T>” and not just “implements Iterable”. With the latter, the enhanced for-loop will only give you an Object for each element.
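
The difference can be seen in a small side-by-side sketch. The classes RawBag and TypedBag are hypothetical and exist only for this comparison. Both hold the same two strings, but only the generic version lets the loop variable be declared as String.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RawVersusGeneric {

  // Raw form: the loop variable can only be declared as Object.
  @SuppressWarnings("rawtypes")
  static class RawBag implements Iterable {
    private final List<String> items = Arrays.asList("Opus", "Mumble");
    public Iterator iterator() {
      return items.iterator();
    }
  }

  // Generic form: the loop variable can be declared as String.
  static class TypedBag implements Iterable<String> {
    private final List<String> items = Arrays.asList("Opus", "Mumble");
    public Iterator<String> iterator() {
      return items.iterator();
    }
  }

  static int totalLengthRaw(RawBag bag) {
    int total = 0;
    for (Object o : bag) {
      total += ((String) o).length();  // explicit cast required
    }
    return total;
  }

  static int totalLengthTyped(TypedBag bag) {
    int total = 0;
    for (String s : bag) {
      total += s.length();             // no cast needed
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(totalLengthRaw(new RawBag()));
    System.out.println(totalLengthTyped(new TypedBag()));
  }
}
```

Both methods compute the same total; the raw version simply pushes the type check from compile time to a runtime cast.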


For more information on the enhanced for-loop, please see the Java Programming Language guide from JDK 1.5.