Compiling Kernel Modules for the Raspberry Pi 2

Update 2015-04-21: The original rpi-source project has been taken over by PeterOGB, so there is no longer any need for the modified rpi-source script I describe below; just use the original. All the background information in this article still applies.

That is, change the first command to:
$ wget https://raw.githubusercontent.com/notro/rpi-source/master/rpi-source && chmod a+x rpi-source

Update 2015-07-29: Raspbian's kernel has been upgraded to 4.x, which rpi-source cannot yet handle correctly, so two extra steps are needed:

1. The /proc/config.gz file that rpi-source reads no longer exists by default; load a module to provide it:

$ sudo modprobe configs

2. rpi-source cannot detect the gcc version correctly under 4.x kernels; pass the --skip-gcc option when running it.


In the comments on my earlier post, Using the "360 Portable WiFi 2" under Linux, someone asked why the module he had built reported "Exec format error" on insmod/modprobe. Without much thought I replied that he should check whether the kernel headers used to build the module exactly matched the running kernel. That answer is not wrong, but it does not actually solve the problem: anyone hitting this error has usually already built the module the "correct" way, and repeating the procedure just reproduces the error.

I recently ran into the same problem while building kernel modules for a Raspberry Pi 2, and it took a long time to solve properly, so here is a summary. Everything below applies to both the Raspberry Pi (A/A+/B/B+) and the Raspberry Pi 2.

Preparing the kernel tree needed for module builds (on Raspbian):

1. Download my modified rpi-source script
$ wget https://raw.githubusercontent.com/lifanxi/rpi-source/master/rpi-source && chmod a+x rpi-source

2. Run rpi-source
$ ./rpi-source

3. That's it; you can now enter your module's source directory and build it.

Troubleshooting:

1. rpi-source reports a gcc version mismatch

As of 2015-03-12, the latest Raspbian kernel is built with gcc 4.8 (check /proc/version to confirm), while Raspbian ships gcc 4.6, so you need to upgrade. Since gcc 4.8 has been backported, simply run sudo apt-get install gcc-4.8 g++-4.8 and then set the priorities with update-alternatives [1].
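A sketch of that upgrade, following the approach in [1] (the priority numbers are arbitrary values of my own choosing):

```shell
# Install the backported compilers, then make update-alternatives prefer 4.8.
# The priorities 20 and 50 are arbitrary; the higher one wins in auto mode.
sudo apt-get install gcc-4.8 g++-4.8
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.6 20
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.8 50
gcc --version
```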

2. What happens if you use rpi-source --skip-gcc to skip the version check and force a build with gcc 4.6?

In my experiments the module compiles, but loading it causes a kernel oops, after which insmod/modprobe/rmmod/lsmod all hang and only a reboot helps. If the module you built is loaded automatically, delete it before rebooting, or the system will hang at boot.

3. rpi-source fails to download the kernel source or the Module.symvers file

Your kernel may be too old: rpi-source only supports kernels 3.10.37 and above on the Raspberry Pi, and 3.18.6 and above on the Raspberry Pi 2. The fix is to run sudo rpi-update to update the kernel and firmware, reboot, and then run rpi-source again.

4. The module build fails because the arch/armv6l or arch/armv7l directory cannot be found

Try adding ARCH=arm to the make command, or create a symlink named armv6l or armv7l pointing at arm inside /lib/modules/`uname -r`/build/arch, then build again.
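Both workarounds can be sketched like this (the module directory is illustrative, and armv7l is for a Pi 2; use armv6l on a first-generation Pi):

```shell
# Workaround 1: tell make explicitly that the kernel architecture is arm.
make ARCH=arm -C /lib/modules/$(uname -r)/build M=$PWD modules

# Workaround 2: give the build tree an armv7l alias pointing at arm.
cd /lib/modules/$(uname -r)/build/arch
sudo ln -s arm armv7l
```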

Background:

1. Raspbian's kernel package

Don't go looking for linux-image or linux-source packages out of Debian habit: Raspbian's kernel package is raspberrypi-bootloader, which contains the kernel, the modules and some boot files, but neither Module.symvers nor the kernel headers.

2. What is rpi-update?

rpi-update is the script Raspbian ships for updating the kernel and related firmware. It downloads the latest kernel and firmware from the https://github.com/Hexxeh/rpi-firmware repository and replaces the installed versions. After updating, it records the Git hash of the new version in /boot/.firmware_revision; subsequent runs of rpi-update or rpi-source use this hash to locate the matching files on GitHub.

3. Where is the official Raspberry Pi kernel?

At http://github.com/raspberrypi: the linux repository holds the kernel source, and firmware holds the compiled kernel and related files. The https://github.com/Hexxeh/rpi-firmware repository used by rpi-update is actually a mirror of part of firmware; splitting out a mirror repository keeps the implementation of the rpi-update script fairly simple [2].

4. What does rpi-source do?

Using the kernel Git hash that rpi-update recorded in /boot/.firmware_revision (or, if the kernel has never been updated with rpi-update, the hash parsed out of the raspberrypi-bootloader package's changelog), it fetches the matching source from the raspberrypi/linux repository, creates the /lib/modules/`uname -r`/build and /lib/modules/`uname -r`/source symlinks, reads the current kernel configuration from /proc/config.gz, fetches the matching Module.symvers from the raspberrypi/firmware repository and places it next to the source, and finally runs make modules_prepare to get the kernel tree ready for module builds.
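Assuming the matching kernel source has already been unpacked to ~/linux (fetching the right commit is exactly what the script automates through the recorded hash), the remaining steps could be done by hand roughly like this. This is an untested sketch, and the Module.symvers source path is a placeholder:

```shell
# Hypothetical manual equivalent of rpi-source's final steps.
cd ~/linux
sudo ln -sfn "$PWD" /lib/modules/$(uname -r)/build
sudo ln -sfn "$PWD" /lib/modules/$(uname -r)/source
zcat /proc/config.gz > .config        # needs "sudo modprobe configs" on 4.x
cp /path/to/Module.symvers .          # fetched from raspberrypi/firmware
make modules_prepare
```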

5. What did you change in rpi-source?

The author of rpi-source had announced that he would no longer maintain the script, and it did not support the Raspberry Pi 2, so I forked it on GitHub and made the following changes:

  • Pointed the script's self-update URL at my fork;
  • Check /proc/cpuinfo to determine whether the hardware is a Raspberry Pi or a Raspberry Pi 2;
  • Allow forcing the hardware version with the -b option;
  • Download the matching Module.symvers for each hardware version;
  • When an option requests configuring the kernel tree with the default config, invoke the appropriate command for each hardware version [3].

6. How do the Raspberry Pi and Raspberry Pi 2 kernels differ?

The Raspberry Pi 2's SoC is the BCM2709, based on ARMv7 (armv7l), while the first generation uses the BCM2708, based on ARMv6 (armv6l), so the Pi 2 kernel uses some ARMv7-only features. The two kernels are currently shipped in the same package, distinguished by a 7 or v7 suffix in the file names, and the right one is selected at boot according to the actual hardware.

7. What is Module.symvers for?

It's hard to explain in one sentence; see [4] if you are interested. In short, a missing or mismatched Module.symvers may cause "Exec format error" when you load a module. If you hit this, check whether any step of the rpi-source run failed. The armv7l and armv6l kernels use incompatible Module.symvers files; in raspberrypi/firmware they are named Module.symvers and Module7.symvers respectively, but whichever one you use must be renamed to Module.symvers when placed into the kernel tree. If you prepare the kernel tree yourself, be very careful here: I made this mistake myself and wasted a lot of time. If you use my modified rpi-source, it already handles this for you.
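For example, preparing a Pi 2 tree by hand might look like this (the raspberrypi/firmware path is my assumption of where the file is published; the essential point is the rename):

```shell
# Pi 2 (armv7l): the file ships as Module7.symvers, but the kernel tree
# expects it under the name Module.symvers.
wget https://raw.githubusercontent.com/raspberrypi/firmware/master/extra/Module7.symvers
sudo mv Module7.symvers /lib/modules/$(uname -r)/build/Module.symvers
```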

8. I used rpi-update and rpi-source but the module I built still won't load.

With the method described here I have built the drivers for two wireless adapters, the Tmall Magic Disk (rtl8192eu) and the 360 Portable WiFi 2 (mt7601u), and both work fine. If you run into something else, feel free to leave a comment and we can discuss it.

Of course, the ultimate fallback is always to rebuild the entire kernel, but if you want to do that on the Raspberry Pi itself you will need plenty of patience; it is far better to cross-compile on a PC [3].

[1] https://github.com/notro/rpi-source/wiki

[2] https://github.com/Hexxeh/rpi-firmware/blob/master/README.md

[3] https://github.com/raspberrypi/documentation/blob/master/linux/kernel/building.md

[4] http://www.ibm.com/developerworks/cn/linux/l-cn-kernelmodules/

A Light-hearted Look at Process Scheduling Improvements in Linux 2.6.38

2.6.38? 2.6.37 is still in RC, so where does 38 come from? Well, precisely because 37 has entered RC, no new features may be added to it, so whatever comes next has to land in 38. And what will be the highlight of 2.6.38? For the process scheduler subsystem, it is very likely autogroup.

Although the feature is still under discussion, Linus and Ingo have pretty much bought in, so its entry into the mainline tree is only a matter of time. It traces back to an observation of Linus's that many of us share: while compiling a kernel, the desktop becomes sluggish and even typing in vi stutters; or after emule finishes downloading a movie, the heavy IO of the sync drags the desktop down. This happens because, when picking the next process to run, the scheduler does not take sufficient account of processes that need fast response. That said, anyone who used Red Hat 9 will agree that desktop and tty responsiveness on today's Linux is already vastly better. That is entirely thanks to the tireless efforts of Con Kolivas and Ingo that led to CFS (Con, after losing the technical argument, withdrew dejected and went back to his day job as an anaesthetist, while Ingo, who won it, has maintained sched splendidly). CFS's completely fair nature greatly lowers the standing of non-interactive processes such as compilers when the scheduler makes its pick: the more CPU time a process consumes, the further toward the edge of the RB tree it sits, while interactive processes that have run little drift toward the bottom-left corner of the tree, prime real estate, since that is where the scheduler likes to pick its next process. So CFS already does well. But in this age of prolific hackers, Mike Galbraith stood up and declared: I can make interactive processes respond even faster!

On what grounds? cgroups! cgroup is an old friend of mine, or rather an old friend of an old friend. Zefan Li, one of the new generation of Chinese kernel hackers, is an old comrade of mine, and he happens to be the cgroup maintainer, one of the few kernel maintainers in China back in 2008. What is cgroup for? Google it and anyone with some background will work it out, but here is a quick intuition. Imagine containers into which you can put processes, each bearing a label such as "core business processes", "entertainment processes" or "processes for my wife". You can then take a bigger container, throw the "entertainment" and "wife" containers into it, and label it "household processes". cgroup is the framework for defining such containers. With it, the processes in the "core business" container can be granted privileges such as guaranteed network bandwidth or higher IO throughput, while the container of "processes for my wife" gets only a sliver of system resources. Ha, I may not win the battles at home, but I can still fight back with my professional skills! ("I" here stands for long-suffering male software engineers everywhere.)

Enough digression; here comes the point. Having read this far, even if you are not Mike G, you can probably guess the trick (though guessing it and producing working code are two different things; still, at least we thought of it). Mike's approach is indeed what you would expect: whenever a process acquires a tty, say vi, gedit, or World of Warcraft... (you can play WoW on Linux?), it is placed into a special container labelled "I want fast response". When the scheduler picks the next process to get the CPU, it obediently looks there first. That simple. Simple? You go write it!

Mike did it in a patch of only about 220 lines, code of exquisite economy, and Linus praised it over and over. Best of all, the numbers speak for themselves: average scheduling latency dropped to roughly one sixteenth of what it was. Impressive? Very. But...

Right, everything comes with a "but"; dialectical materialism tells us so. cgroup imposes a penalty on the system. Think about it: the processes used to sit peacefully in the red-black tree waiting to be scheduled, and now you insist on sorting them into containers and sticking labels on them, so when the scheduler comes recruiting for the CPU or other resources, it must search not only the tree but also a pile of jars, and, worst of all, jars nested inside jars. Linus of course spotted this at a glance, but Mike, true to form, had his answer ready: for servers with little human interaction this is admittedly questionable (though not that bad; cgroup costs less than 5%), whereas desktop users running Fedora, Ubuntu and the like simply cannot perceive the cgroup overhead, and will happily accept the snappy response that the cgroup-based autogroup brings. Well played!

Well played indeed, but the community is full of sharp (read: perfectionist) people who love to nitpick, so comrade Mike is still diligently making refinements according to everyone's suggestions, small but important ones. Getting into Ingo's sched tree should be no problem; I eagerly await its formal arrival in 2.6.38 and the judgment of the masses.

Receive Packet Steering

The best feature in Linux kernel 2.6.35 is RPS.

Modern NICs can already receive and send packets faster than the host CPUs can keep up with, and since single-CPU performance has become hard to improve, spreading packet processing across multiple CPUs has become necessary. In the 2.6.35 kernel, Tom Herbert of Google submitted the RPS patch series implementing exactly that. (Outgoing packets pose no such problem, since the sending processes are themselves already spread across the CPUs.)

Tom's implementation is quite simple. In netif_rx() and netif_receive_skb(), the two paths every received packet must pass through, it does the following:

  1. Compute a hash for each packet (from the source IP, destination IP and ports)
  2. Use the hash to put the packet on the backlog queue of the corresponding CPU
    • Through /sys/class/net/<device>/queues/rx-<n>/rps_cpus the user can set a mask specifying which CPUs may process packets from a particular NIC, or from one of its receive queues if it supports several.
    • The packet's hash and the rps_cpus value together determine which CPU's backlog queue the packet is placed on.
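The rps_cpus file is simply a hexadecimal CPU bitmask; for example (the device and queue names are illustrative):

```shell
# Allow CPUs 0-3 (bitmask 0xf) to process packets from eth0's first RX queue.
echo f | sudo tee /sys/class/net/eth0/queues/rx-0/rps_cpus
# A mask of 0 (the default) disables RPS for that queue.
cat /sys/class/net/eth0/queues/rx-0/rps_cpus
```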

With these changes the kernel can:

  1. Specify, per NIC or per receive queue, which CPUs packets may be balanced across.
  2. Avoid the traffic of one NIC being limited by the compute power of a single CPU; reception is distributed over multiple CPUs.
  3. Keep all packets of one transaction on the same CPU, since the hash is derived from the source IP, destination IP and ports; this yields good cache hit rates and therefore better performance.

RPS is already in use on Google's production systems with good results. Here are some of Tom's benchmark numbers, showing clear improvements in CPU utilization and in network PPS and TPS.

e1000e on 8 core Intel
Without RPS: 108K tps at 33% CPU
With RPS:    311K tps at 64% CPU

forcedeth on 16 core AMD
Without RPS: 156K tps at 15% CPU
With RPS:    404K tps at 49% CPU

bnx2x on 16 core AMD
Without RPS  567K tps at 61% CPU (4 HW RX queues)
Without RPS  738K tps at 96% CPU (8 HW RX queues)
With RPS:    854K tps at 76% CPU (4 HW RX queues)

At the same time, there are some limitations.

  1. When a packet arrives, the CPU that takes the interrupt computes a hash from the SKB's contents. Since this is the first read of the SKB, a cache miss here is normal. But the packet may then be assigned to another CPU, for which the SKB is again unseen data, so a second cache miss occurs; this extra miss is new overhead introduced by RPS. Fortunately, many NICs can compute such a hash in hardware, so the driver only needs to store that value into the SKB, improving the cache hit rate and overall performance.
  2. rps_cpus can be changed at runtime, but doing so may reorder packets: before the change all packets of a transaction are handled by one CPU, and after it they may go to another; if a later packet is processed on the second CPU faster than an earlier packet on the first, the packets end up out of order. So avoid changing rps_cpus frequently.

Which areas kernel contributors focused on

I've done a new kind of analysis of Linux kernel contributions. It shows which areas/subsystems of the kernel contributors such as IBM, Intel, Oracle and Fujitsu focused their contributions on.

As we know, the Linux kernel has many kinds of active contributors, both non-profit and corporate. Corporations also come in many kinds and contribute to different kernel subsystems. We can roughly classify them into distro vendors, hardware vendors, software vendors and IT vendors, and pick some representative corporations from each class to see what these giants are interested in and what drives them to contribute.

1) Distro vendors, which sell Linux distributions and provide support services

a) Red Hat

  • kernel/trace/: No surprise that an OS development company puts a lot of effort into how to monitor and debug the kernel.
  • arch/[x86|sparc*|ia64|powerpc|x86_64]/: As we will see, although other companies also cover parts of some architectures, such as x86 (Intel), PowerPC (IBM) and ia64 (SGI), no other contributor covers as many architectures so completely. A distro vendor must take arch code as seriously as a platform vendor does.
  • drivers/[net|char|scsi|media|ata|md|...]/: Also no surprise: although some hardware vendors provide Linux drivers for their products, distro vendors still need to take care of orphaned devices and integration bugs.
  • fs/gfs2/: Red Hat is not only a distro vendor but also plays the role of an IT vendor, providing integrated solutions to customers. Red Hat's product GFS is the file system used to manage server clusters in enterprise deployments.
  • Red Hat is involved in many other areas whose contributions are less prominent than those above, yet still huge compared with the companies outside the top 10 contributors. Details can be found in the KPS statistics.

b) Novell

  • sound/pci/: Novell claims that "SUSE Linux Enterprise Desktop is the market's only enterprise-quality Linux desktop", so SUSE clearly aims at the desktop market, and providing a great sound subsystem is consistent with that goal.
  • drivers/*/: Also consistent with the desktop focus, Novell puts more effort into driver development than Red Hat. Greg KH, a Novell employee, spends a large part of his time on staging drivers, helping the kernel support more and more cool new devices.
  • fs/fuse/: I can't see an obvious reason why Novell supports FUSE so dedicatedly.

2) Hardware vendors, which mainly sell hardware for profit.

a) Intel

  • drivers/net/: Who are the biggest NIC and wireless NIC vendors in the world? Intel must be one of them.
  • drivers/acpi/: As the biggest platform vendor, Intel leads the hardware and software development of power management.
  • arch/[i386|ia64|x86_64]/: Who makes the CPUs? As the biggest CPU manufacturer, Intel takes care of the architectures its CPUs support.

b) Renesas

  • arch/sh/: The biggest SuperH manufacturer supports the sh architecture, to no one's surprise.

c) Analog Devices

  • arch/blackfin/: Most of its effort goes into its own platform, Blackfin.

3) Software vendors, which mainly sell software (other than operating systems) and services for profit.

a) Oracle (before acquiring Sun, Oracle was more like a software vendor)

  • fs/[btrfs|ocfs2|nfs]/: As the biggest OSS contributor among software vendors, Oracle carries on many OSS projects, and the kernel filesystems are a typical area of focus. As a database maker, Oracle strengthens some of the kernel's filesystems and keeps creating new ones to fit its database and middleware products.
  • block/: No surprise: Oracle needs to spend effort not only on the fs subsystems but also on the block IO subsystem to support its database solutions.

b) Parallels

  • net/[ipv4|ipv6|core]/: As a virtualization software maker, Parallels' engineers did a lot of work refining network namespaces, which support network virtualization better.

4) IT vendors, which sell whole solutions: hardware, OS, middleware and applications, and of course services, for profit.

a) IBM

  • arch/[powerpc|s390|ppc64]/: IBM created these architectures and provides the kernel support to run on them.
  • kernel/: As the No. 2 kernel contributor after Red Hat, IBM did a lot of work on the core kernel: synchronization, CPU control, kprobes and so on.
  • fs/[cifs|ext3|ext4]/: IBM also hires active community engineers working on filesystems. As we all know, Ted works as the ext3/4 maintainer while being an IBM employee and a TLF consultant.
  • Linux Test Project: Although my statistics cover only kernel source code, the great work of LTP deserves a mention. As an independent project kept in sync with the kernel, LTP continually updates its test cases and gives many Linux-related IT vendors an easier time doing kernel QA. LTP is a special kind of contribution to the Linux kernel.

b) SGI

  • fs/xfs/: As an SGI product, XFS gets wonderful support from SGI. A storage and HPC vendor putting effort into filesystems is entirely unsurprising.
  • mm/: Former SGI employee Christoph Lameter contributed a great deal to the memory management subsystem.
  • arch/ia64/: SGI and HP were the initiators of IA64, so SGI did a lot of work on it, as did HP and Intel.

c) Fujitsu (my former employer :) )

  • kernel/trace/: As a whole-solution IT vendor, Fujitsu does a lot of work enhancing the kernel's tracing and debugging features, to give customers a more robust, maintainable and highly available IT environment.
  • drivers/pci/: During integration, Fujitsu enhances drivers to ensure a high-quality server hardware environment.
  • mm/: The memory controller is maintained by Fujitsu and Google engineers to give users a more flexible IT environment.

As we can see, corporations dedicate themselves to kernel contribution for the sake of their product lines and services, and try to give back to the community as they gain from it. They take responsibility for keeping the kernel components their enterprise products rely on as healthy as customers expect.

Now let's see what areas the non-profit contributors focus on, which may differ from the corporations.

Hobbyists (no one pays them for kernel contributions)

  • drivers/media/: As far as I can tell, no commercial company focuses on this area, yet hobbyists committed 3818 patches (2.4% of all kernel patches since Linux 2.6.12) to drivers/media/. It is an amazing phenomenon that desktop users do such great work for kernel development.
  • drivers/[net|ide|staging|usb|video]: Many hobbyists take care of the stuff that enterprise users may not want to care about.
  • arch/[x86|arm]/: The hobbyists' favorite platforms are x86 and arm :)

That's a brief overview of who is interested in which kernel subsystems.

For details on other corporations and the non-profit population not mentioned here, such as Google, HP, academics etc., please refer to the statistics.

Will the Android code be evicted from the Linux kernel?

Following the earlier Microsoft Hyper-V incident, Linux kernel driver maintainer Greg KH announced on October 9 that the Android drivers would be removed starting with the 2.6.33 kernel.

Due to no support from Google, I've dropped the android code from the staging tree for 2.6.33.

Google's staff responded quickly, though. David (who does not work on Android development) challenged Greg's "no support from Google" accusation, saying he had no opinion on whether the Android code should be removed but was unhappy with the stated reason. Google evidently does give Android some support, but because its development process does not mesh well with the open-source community's, fixes to the drivers have come slowly.

So what will become of the Android drivers in the Linux kernel? That probably depends on whether the Android developers can cooperate better with the community; for now, Greg has not withdrawn his plan to remove them.

Kernel Defects Before and After Release

I've tried to analyze Linux kernel defect data to answer the following questions.

1. How much effort should the release-candidate period receive?

2. Is the quality of stable kernels (stable kernels without updates) getting better or worse?

To answer these questions, I gathered the following data.

1. Pre-RC changed lines: the lines of source code changed between 2.6.(n-1) and 2.6.n-rc1. This code flows in during each kernel version's merge window and mostly implements new features, which need a long period of testing and fixing after the merge window closes.

2. Defects found during RC: because mainline accepts almost only bug-fix patches during the RC period, and one patch per bug is encouraged, the number of RC patches approximates the number of defects found during RC, which I call the internal (inside the development community) testing period. This data comes from the git log between 2.6.n-rc1 and 2.6.n.

3. Defects found after release: after a stable kernel is released, users and developers find and fix further bugs. To eliminate the unfairness caused by different release intervals, I only count the stable-kernel defects fixed before the next 2.6.x stable release. For example, if 2.6.n was released on 2009.A.B and 2.6.(n+1) on 2009.C.D, then "defects found after release" for 2.6.n means the fixes made before 2009.C.D. I dug this data out of Greg KH's stable git trees.

Putting the RC defect ratio ("defects found during RC" / "pre-RC changed lines" [kstep]) on the X axis and the stable defect ratio ("defects found after release" / "pre-RC changed lines" [kstep]) on the Y axis, I got some interesting graphs.

Wait... what did you expect to see? That more RC fixes bring a higher-quality stable kernel?

1. Picture 1 may disappoint you. What we see is: the more bugs killed during RC, the more complaints from stable users and developers. Why? I don't remember who said it, but I remember the words: "Finding MORE defects means MORE defects remain unrevealed." The RC defect rate does not mean that the more bugs are fixed, the higher the quality of the release; rather, it tells us whether the new features had trustworthy quality before being sent into the merge window. Developers should test their new-feature patches as thoroughly as their RC patches. Which raises another question: we can't keep doing RC forever, so when should we stop the RC phase and release?

Linux Kernel defects before and after release

2. Picture 2 shows that RC fixes increase in a nearly linear trend with the amount of pre-RC code. But a trend is only a trend: some kernel releases did not have enough defects found during RC, and some had an above-average RC defect ratio. Going back to Picture 1, the densest region of RC defect ratios is about 2.5-3.0 defects/kstep, and the average RC defect ratio is about 2.87 defects/kstep, marked by the red ordinate. Among the releases below the average RC defect ratio, four out of ten have an above-average stable defect ratio. If we set a rule that RC may not end while the RC defect ratio is still below average, we could make "the four" releases more stable.

Now we have seen two pictures of the whole-kernel defect ratio, but what about the subsystems?

RC fixes versus merge-window volume

X-axis: Pre-RC Changed Lines, Y-axis: Defects found during RC

3. Not only should the whole kernel have an RC defect ratio gate; each subsystem should also not be allowed too low an RC defect ratio. Let's look at a few.

The network subsystem behaves much like the whole kernel: the stable defect ratio increases as the RC defect ratio increases. "Core kernel", "fs", "arch" and "block" look much the same, so I won't paste them here, to save my host's disk space ;)

network subsys, X-axis: RC defect ratio, Y-axis: Stable defect ratio

The sound subsystem is somewhat strange. Once the RC defect ratio exceeds 3.0 defects/kstep, the stable defect ratio begins to descend. In my superficial opinion, a very strong RC effort (seven times the average) yields higher stable quality. At that point, a very high RC defect ratio no longer means "finding more defects means more defects unrevealed", but rather "strict quality control reduces unrevealed defects".

sound subsys, X-axis: RC defect ratio, Y-axis: Stable defect ratio

Memory management would look like an ideal descending trend if we ignored 2.6.16, 2.6.20 and 2.6.21. In mm, the RC defect ratio is much higher than in any other subsystem, because mm is an essential part of the core kernel and rarely changes. I don't recall what happened in those three releases; maybe something like virtualization support could cause such a defect boom.

mm subsys, X-axis: RC defect ratio, Y-axis: Stable defect ratio

4. The last graph answers my question "Is the quality of stable kernels getting better or worse?". From 2.6.14 to 2.6.30, stable kernels show an average defect ratio of about 0.32 defects/kstep. Since 2.6.23 the kernel has tended to keep a lower stable defect ratio, although 2.6.27 and 2.6.28 departed from the trend. I'd like to say that perhaps we are releasing kernels of better quality.

Stable Defect Ratio Trend

As Linus said at LinuxCon this week, "Linux is bloated": it means not only that the kernel has grown bigger and bigger, but also that it grows faster and faster. So when to release a new kernel depends not only on time and the number of RC rounds, but also on the defect ratio and the scope of influence of the new features.

How far can Microsoft's code contribution to Linux go?

Microsoft's virtualization code may be removed from Linux!

While tidying the staging directory in preparation for the Linux 2.6.32 merge window, Linux driver maintainer Greg KH made the following comment about the Hyper-V (hv) driver code contributed by Microsoft:

hv (Microsoft Hyper-V) drivers. Over 200 patches make up the massive cleanup effort needed to just get this code into a semi-sane kernel coding style (someone owes me a big bottle of rum for that work!) Unfortunately the Microsoft developers seem to have disappeared, and no one is answering my emails. If they do not show back up to claim this driver soon, it will be removed in the 2.6.33 release. So sad…

hv has not yet entered the mainline, but Greg's patch queue shows that the code did indeed come from Microsoft. Around July and August 2009, Microsoft contributed the driver code for its Hyper-V technology to the Linux kernel community under the GPLv2 license, mainly so that Linux could run on Windows Server 2008 and its Hyper-V hypervisor.

Greg was naturally happy to accept such a large contribution, but after he raised a series of TODO items, Microsoft's developers responded very slowly, which prompted the comment above. hv may still enter the mainline during the 2.6.32 merge window, but if Microsoft's engineers keep ignoring Greg's requests, it may well be removed during the 2.6.33 development cycle.

Fortunately Microsoft has responded again, so this round of Microsoft-Linux cooperation can continue. We will wait and see whether this significant collaboration brings a win-win for both Microsoft and Linux.

Submitting patches to the Linux kernel community

This is wiki material I put together two years ago to teach colleagues how to submit Linux kernel patches. I'm sharing it here; it mainly covers points to watch when creating a patch and when dealing with the community.

1. Creating a patch

1.1 Fixing the bug

When fixing a bug, check the following:

  • No superfluous comments.
  • Comments follow the kernel-doc format.
  • No unnecessary header files are included.
  • No #ifdef around function prototypes.
  • When an #ifdef and its #endif don't fit on one screen, a /* CONFIG_NAME */ comment is added at the #endif.
  • EXPORT_SYMBOL_GPL() is used where it should be, not EXPORT_SYMBOL().
  • The patch has been checked with scripts/checkpatch.pl.
  • The fix has been rebuilt and retested.
  • The rebuilt code compiles without warnings.

1.2 Generating the patch

When generating a patch, check the following:

  • The patch does not exceed the mail server's single-message size limit.
  • One patch fixes exactly one problem; similar patches are made into a series and sent as [n/m].
  • The purpose of the patch is clearly described; when a patch is accepted, this description goes straight into the changelog. It is especially easy to forget when resubmitting after incorporating community feedback.
  • The patch is made against the correct version.
  • The working directory when creating the patch is the parent directory of both the old and the new trees.
  • The patch has been applied and retested after being generated.
  • Kernel patches are best generated with git.
  • When using diff, prefer the -Nurp options.
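The git route from the last bullets can be sketched as follows; the repository, file and commit message are throwaway examples:

```shell
# Make one logical change in a throwaway repo, then emit it as a
# mail-ready patch: git format-patch supplies the [PATCH] subject line.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "hello" > README
git add README
git -c user.name=dev -c user.email=dev@example.com \
    commit -qm "README: add greeting"
git format-patch -1 --stdout HEAD | head -n 6
```

The output of git format-patch applies cleanly with git am, which is also how many maintainers consume emailed patches.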

2. Pushing the patch

If someone in the community is already pushing related work, stop submitting your own patch and join the discussion instead.

2.1 Reproducing the problem

  • Assume that the maintainer you will face cannot follow your source-level explanation; he can only follow a plainer description.
  • The reproduction should ideally be a sequence of commands.
  • If several scenarios reproduce the same bug, pick the most natural one. Suppose a bug breaks every command starting with "l". There are many ways to reproduce it; of the following three, which is best?
a. # lamnotacmd
b. # lshal
c. # ls

All three do reproduce the bug, but a uses a command that does not exist, and b uses a real but uncommon one, while c is clearly the most frequent and natural in practice. As the reproduction, c is the easiest for maintainers to understand and accept.

  • If the bug sometimes causes a kernel panic and sometimes does not, don't forget to tell the community about the panic; it will attract attention.

2.2 Mail format

  • Is the subject of the mail appropriate?
    • When submitting several unrelated patches, each must have its own subject. (The subject is the patch name, e.g. "Subject: [PATCH] mm: fix a bug in container driver" → fix-a-bug-in-container-driver.patch)
    • When submitting a patch series (patches made for one purpose and split up because they were too big), the subjects must be numbered, e.g. "Subject: [PATCH 1/4] ..."
  • Use plain-text mail, keep lines under 72 characters, and use no characters outside English.
  • Don't use your company's elaborate standard signature.
  • After composing the mail, save it as *.patch and verify that it still applies, because maintainers usually save the mail directly and use it as the patch.
  • Add the necessary Signed-off-by, Reviewed-by and Tested-by lines.
  • Thunderbird configuration for sending patches (to stop tabs being converted to spaces):
    • 1. Tools -> Account Settings, select the account and untick "Compose messages in HTML format".
    • 2. In the C:\Documents and Settings\YourName\Application Data\Thunderbird\Profiles\xxxxxx.default directory, create a user.js file containing:
      • user_pref("mailnews.wraplength", 0);
      • user_pref("mailnews.send_plaintext_flowed", false);
    • 3. Restart Thunderbird.

2.3 Discussing with the community

When submitting to the community, mind the following:

  • The addresses are correct (look up the relevant people in MAINTAINERS, then put them in To or CC).
  • In discussions, never top-post; insert your replies inline in the body of the other person's mail.

2.4 After the patch is accepted

Once the community review finds no problems, the patch is accepted. But problems may still surface afterwards. For example, when a server manufacturer applies the patch on its own servers, hardware differences may trigger new failures, and people will then discuss whether the patch should be reverted. React quickly when this happens, or the accepted patch may well be backed out.

Linux-2.6.31 was released

Linus released Linux 2.6.31 on 2009-09-09, US time.

What I focus on is the development status of the kernel, and with the KPS tool I found some interesting points.

1. The number of newly joining engineers has increased since 2.6.29, after a long continuous decline that began with 2.6.25, one and a half years ago.

Linux Kernel New Joiner

2. Although more engineers joined kernel development, the number of involved companies declined in 2.6.29.

3. Looking at the Report, Review and Test credits, we can see that more and more people contribute to Linux by doing QA work.

Linux Kernel QA

4. Unlike in the USA or Europe, Chinese engineers have joined kernel development only in recent years, and their contribution has been growing.

Chinese Patch contribution

Using git through an HTTP proxy on a LAN

How do you use git to reach remote servers from a LAN that only provides an HTTP proxy?

git uses curl, so configuring curl's proxy lets git use it too. There are two ways:

1. export http_proxy="10.167.129.20:8080"

2. Edit curl's configuration file, ~/.curlrc, and put the proxy address in it.
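A minimal ~/.curlrc for this, using the example proxy address from method 1:

```shell
# ~/.curlrc — curl reads this at startup; git's HTTP transport goes
# through curl, so this proxies git's http:// operations too.
proxy = 10.167.129.20:8080
```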

Then try the following approaches, taking util-linux-ng as the example:

1. git-clone git://git.kernel.org/pub/scm/utils/util-linux-ng/util-linux-ng.git

Result: failure, because this does not connect over HTTP (note the git:// prefix).

2. git-clone http://git.kernel.org/?p=utils/util-linux-ng/util-linux-ng.git

Result: it connects, but the download aborts after a short while.

3. git-clone http://www.kernel.org/pub/scm/utils/util-linux-ng/util-linux-ng.git

Result: same as 2.

4. git-clone http://www1.kernel.org/pub/scm/utils/util-linux-ng/util-linux-ng.git

Result: success.

Changing www to www1 does the trick. This is presumably related to mirroring: skipping the mirror and going straight to a real server succeeds. The cause of 2's failure is presumably the same as 3's.

To clone Linus's kernel tree, you can then do:

git-clone http://www1.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git