# An introduction to the Linux kernel

## telex4

I'm currently writing a new tutorial for a GNU/Linux newbie web site I help maintain, NewToLinux.org.uk. Its aim is, like that of many of our tutorials, unconventional: it's not designed for any particular practical end (like telling people how to compile a kernel), but instead aims to give people a general understanding of what the kernel is, what it does, how it does it, and a little practical experience of modules. In my experience, most tutorials, often without realising it, assume a certain amount of knowledge on the part of the newbie which they simply don't have. NewToLinux addresses this problem by offering lots of general tutorials, and I'm looking to expand our range beyond the really new-newbie topics like "how to use the shell" to more advanced topics like kernels.

I've written a first draft, and have received a few comments from members of my LUG, but I'd really appreciate some more input, as I'm not exactly an expert on the Linux kernel, and so I'm not sure if it's all technically correct, if I've missed any obvious points, etc. I'd also appreciate even more comments from newbies to the subject who find bits difficult to understand, so I can clarify the points.

You can find the tutorial here:

http://www.newtolinux.org.uk/tutorials/linuxkernel.html

Thanks in advance for any comments.

p.s. I realise this isn't strictly speaking a posted, complete piece of documentation, but it seemed like the most relevant board to stick it in, and I'd really appreciate the technical expertise of the Gentoo crowd on this one  :Smile:  Move/delete if this is inappropriate!

----------

## rac

 *Quote:*   

> The kernel is the first piece of code executed when booting your system.

 Doesn't the bootloader run first?

 *Quote:*   

> What does the kernel do?

 I think this section could use a piece on networking.

 *Quote:*   

> The alternative is a mach kernel which works as lots of separate processes that each handle different functions, and interoperate when necessary. An example of a mach kernel is HURD, being developed by the GNU Project since the 1980s, but still not useable on any modern system. The mach design is more UNIX-like, and much more advanced, but therefore much harder and slower to develop.

 The alternative is a microkernel.  Mach is just a well-known example of a microkernel architecture.  I don't agree with the characterization of it here, either.  The key difference between microkernel and monolithic kernel designs is that drivers operate in the same address space under a monolithic kernel, like Linux.  With a microkernel architecture, drivers operate in a different address space, and can only communicate with the microkernel with message passing.  And the business about "more UNIX-like and much more advanced" is a value judgement, and a rather dubious one at that, IMHO.  They're both viable design choices, and different people will argue about the merits and demerits of each, but it is by no means universally agreed that microkernels are more UNIX-like (whatever that means) or more advanced.

 *Quote:*   

> The Linux kernel is also now almost completely modular, in that all of its functions are allocated to modular chunks of code, that can be integrated into the compiled kernel, or compiled separately as modules which the kernel can then load and use. 

 I suppose it depends on your definition of "almost completely", but I think there are still large parts of the kernel that must be either in or out, and cannot be compiled as modules.

 *Quote:*   

> Having a modular system also makes maintaining the kernel much easier. If I want to write a driver for a device not yet supported by the Linux kernel, I can write and then maintain it almost independent of developments in the kernel itself. So long as it can still be loaded by the kernel, and fits the specifications required of modules, it will be a valid kernel module, and will be included in kernel sources if I submit it. 

 I don't see how compilation as a module has anything to do with kernel maintenance or submitting patches.  The source layout is the same: you're contributing a file or patches against a file.  The compilation as a module part is normally done with some #ifdefs and such: it doesn't have anything to do with source layout.  And just because something is a module doesn't mean that it's immune to changes in kernel data structures and/or calling conventions.  That's why module version numbering exists.

 *Quote:*   

> First of all it's handy to know what kernel version you have installed. To do that, (as root) run "uname -a"

 All you need for the kernel version is (and many guitarists will have no trouble remembering this) "uname -srv".  And it doesn't require root privileges.
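For anyone who would rather see the same fields from a script, here is a minimal Python sketch. It relies on the standard `os.uname()` call, which, like `uname` itself, works for an ordinary user. The helper name `kernel_summary` is made up for this illustration and is not part of the tutorial.

```python
import os


def kernel_summary():
    """Return the fields that `uname -s`, `-r` and `-v` print, as a dict."""
    info = os.uname()  # available to any user; no root privileges needed
    return {
        "sysname": info.sysname,  # uname -s, the kernel name
        "release": info.release,  # uname -r, the kernel release
        "version": info.version,  # uname -v, build details
    }


if __name__ == "__main__":
    for field, value in kernel_summary().items():
        print(f"{field}: {value}")
```

Running it as a normal user prints the three fields one per line, which is roughly the information `uname -srv` gives.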

----------

## lx

I thought that the first sector on the hard drive (the master boot record) is loaded and executed by the BIOS code. This small boot code will load the first sector of a partition (the boot sector), which contains the first ......

Well this is too detailed anyways and I'm too tired....

Cya lX.

----------

## telex4

 *rac wrote:*   

> > The kernel is the first piece of code executed when booting your system.
>
> Doesn't the bootloader run first?

 

Oops, that was unclear: the kernel is the first piece of code in the operating system executed once the bootloader starts up the operating system.

 *Quote:*   

> > What does the kernel do?
>
> I think this section could use a piece on networking.

 

Any suggested information on this subject? I didn't notice anything about it whilst crawling documents I could find on the subject, so I'd really appreciate a headstart  :Smile: 

 *Quote:*   

> The alternative is a microkernel.  Mach is just a well-known example of a microkernel architecture.  I don't agree with the characterization of it here, either.  The key difference between microkernel and monolithic kernel designs is that drivers operate in the same address space under a monolithic kernel, like Linux.  With a microkernel architecture, drivers operate in a different address space, and can only communicate with the microkernel with message passing.

 

OK, but what does this mean? What you wrote makes very little sense to me, as I've never encountered "address space". In other words, how can this be explained to somebody who doesn't understand what "address space" and "message passing" mean?

 *Quote:*   

> I don't see how compilation as a module has anything to do with kernel maintenance or submitting patches.  The source layout is the same: you're contributing a file or patches against a file.  The compilation as a module part is normally done with some #ifdefs and such: it doesn't have anything to do with source layout.  And just because something is a module doesn't mean that it's immune to changes in kernel data structures and/or calling conventions.  That's why module version numbering exists.

 

Well in writing that, I simply elaborated on some points written in various documents which seemed to suggest that modules were not only easier to use but also to code. But if this simply isn't the case (so perhaps if others corroborate your disagreement) I'll remove the point entirely.

 *Quote:*   

> > First of all it's handy to know what kernel version you have installed. To do that, (as root) run "uname -a"
>
> All you need for the kernel version is (and many guitarists will have no trouble remembering this) "uname -srv".  And it doesn't require root privileges.

Ah, I'll remove the root bit then, but I'll leave the "-a" flag, as one thing I've found when writing tutorials is that people like to find out things they can do that are a slight diversion from the subject in hand. So I like to give people more info than they need: in this example, telling them to use "-a" gives them lots of info about their system, which is interesting. They should be at the level whereby they know how to find out about other runtime options, so in future they can always simply use "uname -sr" (v is slightly superfluous).

Anyway, thanks for the comments. Like I said, I'm not an expert on the subject, so whilst I bring a relative-newbie perspective on things, I'm not going to get every detail right, and I certainly don't want to mislead people!

----------

## Bones

Nice article.  Suggestion: As a linux noob, I'd like to see something about how the kernel knows what modules to load when booting.

----------

## telex4

I'll include a little about that  :Smile:  If you want to know, just check back on this thread occasionally and I'll post a notice when the tutorial is, in my opinion, "complete", and ready for people to come along and read (as the first bits I write on the topic will no doubt include some problems!  :Rolling Eyes:  )

----------

## zhenlin

Address space: Each process gets its own address space, with certain addresses corresponding to the physical memory allocated to it by the kernel, and certain addresses corresponding to swapped/paged memory. On 32-bit architectures, each process gets 4GB of memory to play with, although not all of it is real memory. My private address space is my private address space, as are its contents. There is also shared memory for IPC (interprocess communication), which a set of processes can write to. Messaging is another way of implementing IPC, in which all the processes, including the kernel, have a message receiver to handle messages. Under Linux, it would appear that the only form of messaging supported is signals, such as SIGTERM, SIGKILL, SIGHUP.
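To make "separate address spaces" and "message passing" a bit more concrete, here is a small Python sketch of my own, assuming a Linux or other Unix system where `os.fork()` is available. The child process changes a variable, the parent never sees that change, and the only way the new value reaches the parent is as a message over a pipe. Pipes are one of several IPC mechanisms Linux offers besides signals. The function name is hypothetical.

```python
import os


def demo_separate_address_spaces():
    """Fork a child, let it change a variable, and report back over a pipe."""
    counter = 0
    read_fd, write_fd = os.pipe()  # one-way channel between the two processes
    pid = os.fork()
    if pid == 0:
        # Child: it works on its own copy of the parent's memory.
        os.close(read_fd)
        counter = 99  # changes only the child's copy
        os.write(write_fd, str(counter).encode())
        os.close(write_fd)
        os._exit(0)
    # Parent: its own `counter` is untouched by the child's assignment.
    os.close(write_fd)
    message = os.read(read_fd, 64).decode()  # the child's value arrives as a message
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child
    return counter, int(message)


if __name__ == "__main__":
    parent_value, child_value = demo_separate_address_spaces()
    print("parent still sees:", parent_value)
    print("child reported:", child_value)
```

The parent keeps its `0` while the child reports `99`: the two processes share no variables, so anything they want to tell each other has to be sent explicitly.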

----------

## telex4

OK, so Linux keeps all of the drivers in the same single address space, and so can communicate through that address space (?), whilst with a microkernel like HURD each driver will occupy a different address space and will communicate by passing messages to each other.

Now what are the advantages/disadvantages of these approaches? Why might one adopt one approach over the other? I'd make a guess but I'd probably be wrong  :Wink:  and it's easier to understand things like this when you get some idea of *why* they work in the way that they do.

----------

## rac

 *telex4 wrote:*   

> Now what are the advantages/disadvantages of these approaches?

 General microkernel advantage: microkernel is immune to driver bugs: they cannot crash the kernel.  In a monolithic kernel, drivers can crash the entire machine.  Most decent operating systems (sorry, I'm a MacOS refugee) use protected memory and the MMU (memory management unit) to enforce rules such that one process may not write to another process' memory.  Without memory protection, bugs in an application can bring down the entire system.  With memory protection, the damage is limited to the process in question.  Microkernel architectures apply a similar logic to device drivers.  They are walled off from the microkernel.  The major disadvantage of this approach is the overhead involved with passing messages from the drivers to the microkernel.  In proverbial terms, a microkernel architecture says "don't put all your eggs in one basket", and a monolithic kernel architecture says "put all your eggs in one basket, and watch that basket carefully".
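A rough way to see the "walled off" idea in action is to run a misbehaving piece of code in its own process and watch the caller survive. This Python sketch is only an analogy under stated assumptions: it needs a Unix system with `os.fork()`, and `buggy_driver` and `run_isolated` are hypothetical names, with `os.abort()` standing in for a fatal driver bug.

```python
import os
import signal


def buggy_driver():
    """Stand-in for a driver bug: abort whatever process is running it."""
    os.abort()


def run_isolated(driver):
    """Run `driver` in a child process; return the signal that killed it, or None."""
    pid = os.fork()
    if pid == 0:
        driver()      # a crash here only takes down the child
        os._exit(0)   # reached only if the driver returned normally
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status)
    return None


if __name__ == "__main__":
    sig = run_isolated(buggy_driver)
    print("driver died with", signal.Signals(sig).name)
    print("the caller is still running")
```

Calling `buggy_driver()` directly, in the same process, would take the whole program down with it, which is the "all your eggs in one basket" case. The price of the isolated version is the extra work of creating the process and collecting its status, a small-scale version of the message-passing overhead.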

----------

## lx

I'm pro microkernel. Although there's some overhead, I still think stability will be the key as processors etc. become faster. I only have experience with the Windows Driver Model, but I can say it's very easy to crash a Windows system, because there are no restrictions whatsoever and almost nothing gets checked. Errors are sometimes hard to find because the OS crashes in a very different place than where the error was caused. In short, driver writers are responsible for the stability of the system....

Another benefit is that you can unload / load and replace parts with a running system, with a minimum of down time. 

I'm really pro object-oriented, or rather modular, programming, and a microkernel in my view fits in with this approach. The only problem is that I'm not quite sure what the performance impact is on the system and its throughput. It sounds like a good idea to me, but whether it works in real life remains to be seen. I will try the Hurd kernel if it gets more mature.

Cya lX.

----------

## zhenlin

Now, could someone direct me to an up-to-date book/website about OS design, comparing and contrasting DOS, Windows NT (2000, XP) and {Insert Unix based kernel}?

For the record, I have used:

DOS 6.11 (or earlier) on a 186 (AT/XT) 1993/4

Windows 3.11 on a 486 1994/5

Windows 95 on a Pentium 150 1995/6/7

Windows 98 on a Pentium 150 1998/9/2000

Windows ME on a Pentium 150 2000/1/2

Windows 2000 on a Pentium 4 1.5 2001/2

Windows 2000 on a Pentium 3 900M 2001/2

Windows XP on a Pentium 4 1.5 2001/2

Redhat Linux 7 on Pentium 4 1.5 2002

Redhat Linux 7 on Pentium 150 2002

Gentoo Linux 1 rc17+ on VMware 2002

Gentoo Linux 1.2 on Pentium 4 1.5 2002

Gentoo Linux 1.4 on Pentium 4 1.5 2002

Gentoo Linux 1.4 LiveCD on Pentium 150 2002

Gentoo Linux 1.4 LiveCD on Pentium 3 900M 2002

Demolinux on Pentium 4 1.5 2002

Mac OS X on iMac Combo 2002

Mac OS 9 on iMac Combo 2002

Gentoo Linux 1.2 CD on iMac Combo 2002

I really want to know the core differences of all of the above, the superficial ones I already know.

----------

## lx

 *zhenlin wrote:*   

> I really want to know the core differences of all of the above, the superficial ones I already know.

 

Good luck trying to find out how Windows works. I tried when I was writing a binary-compatible WDM driver for Win98se/Me/2000 and XP. Although I can guess how it works, you can only get your hands on the APIs and functions you need, not on how the inside of the OS works. It's probably the same with MacOS; at least with GNU/Linux you have the source......

Besides, I think you're asking a lot. The info would be beneficial to many companies; still, I don't think the question can be answered when the inner workings of the OSes are hidden from the general public, and those who can view or know about the inner workings are restricted by EULAs etc.

Ps. Windows 98, 98se (+ early WDM model) and Me are all the same in general (WDM model extended / later version).

Windows NT, 2000 (WDM) and XP (WDM, DirectX etc.): the inner workings are probably still basically the same.

Red Hat, Gentoo etc. are all based on the GNU/Linux model and have the same core tools, so they don't differ, except in the selection of tools and integration (higher level).

Cya lX.

----------

## pilla

AFAIK, Win32 uses a microkernel but it allows direct access to system resources to speed things up (!). So, it is not a clean implementation of a microkernel. Nothing should be allowed to access system resources without using the message passing mechanism, and that's why win is so vulnerable. 

The main problem with microkernels is the overhead of switching from user mode to system mode, and back to user mode. This can really hurt performance.  I like the middle-way approach of linux, with a monolithic kernel but with modules you can load anytime. Maybe it could be even more modular, with the possibility of changing memory management modules on the fly, for example, but that would be v-e-r-y dangerous  :Cool: 

 *lx wrote:*   

> I'm pro microkernel. Although there's some overhead, I still think stability will be the key as processors etc. become faster. I only have experience with the Windows Driver Model, but I can say it's very easy to crash a Windows system, because there are no restrictions whatsoever and almost nothing gets checked. Errors are sometimes hard to find because the OS crashes in a very different place than where the error was caused. In short, driver writers are responsible for the stability of the system....
> 
> Another benefit is that you can unload / load and replace parts with a running system, with a minimum of down time. 
> 
> I'm really pro object-oriented, or rather modular, programming, and a microkernel in my view fits in with this approach. The only problem is that I'm not quite sure what the performance impact is on the system and its throughput. It sounds like a good idea to me, but whether it works in real life remains to be seen. I will try the Hurd kernel if it gets more mature.
> ...

 

----------

## telex4

Hello all,

Just thought I'd mention that the tutorial is now more or less finished  :Smile: 

Thanks for all the help.

----------

## BradB

I'd just like to make a small comment on the monolithic kernel approach, and lx's thought:

> I've only experience with the Windows Driver Model, but I can say it's very easy to crash a Windows system

I'd like to suggest that this is true, but only for closed source OSes.  Say I have a dodgy device driver that causes kernel panics, or windows hangs.  In the windows world, all you can do is complain bitterly to the (usually) proprietary driver writer, or not use that driver.  In the linux world, you can complain bitterly to the driver maintainer, not use the module, or in most cases (nVidia drivers being a notable exception) dig in there and fix it yourself.  My point is that faulty drivers that are critical to OS operation are certainly an itch most programmers would like to scratch, so they will get fixed by someone.

If I wrote a handy driver (say a USB driver for an unsupported scanner) that had a deliberate error in it that caused panics - I wonder how long it would be before I got a patch back.  Maybe that would be an interesting experiment.

Cheers

Brad

----------

## rac

One more thing I forgot to mention earlier: the discussion of tainted kernel modules is not entirely accurate.  It is the license of the module that determines whether or not it taints the kernel, not the method or place of compilation.  For details, see http://www.tux.org/lkml/#s1-18.
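To illustrate the idea that the declared license string is what decides, here is a small Python sketch. The list of accepted strings is my recollection of the GPL-compatibility check in more recent kernel sources, so treat it as an assumption, especially for the kernels current at the time of this thread. The function name `taints_kernel` is hypothetical, and license is only one of several things that can taint a kernel.

```python
# License strings the kernel treats as GPL-compatible. Assumption: taken from
# the check in recent kernel sources; older kernels may differ.
GPL_COMPATIBLE = {
    "GPL",
    "GPL v2",
    "GPL and additional rights",
    "Dual BSD/GPL",
    "Dual MIT/GPL",
    "Dual MPL/GPL",
}


def taints_kernel(module_license):
    """Return True if a module declaring this license would taint the kernel."""
    return module_license not in GPL_COMPATIBLE


if __name__ == "__main__":
    for declared in ("GPL", "Dual BSD/GPL", "Proprietary"):
        verdict = "taints" if taints_kernel(declared) else "does not taint"
        print(f"{declared!r} {verdict} the kernel")
```

Note that nothing in the check asks where or how the module was compiled; only the string the module declares is looked at.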

BTW, any chance that the forums could get a link back and a note as to how useful you found our input included somewhere?

----------

## telex4

rac, perhaps I should just expand on the point, because what you and I have written is essentially the same. It is because they are binary, meaning the kernel hackers cannot investigate problems, that they are marked as tainted. Of course being binary means that they are under a non-Free license, and that they weren't compiled with the kernel. The point probably just needs some clarification then.

And yes, that's a nice idea, I'll put an acknowledgements section at the bottom.

----------

## telex4

update: I've put in an acknowledgement for the Gentoo forums  :Smile: 

----------

## jjares

 *zhenlin wrote:*   

> Now, if someone could direct me to an up-to-date book/website about OS design, comparing and contrasting DOS, Windows NT (2000, XP) and {Insert Unix based kernel}

 

Operating Systems: Internals and Design Principles by Stallings, though hard to read, has a really good comparison between the design principles of different operating systems, as well as how the OS actually works. It includes Linux and Windows 2000. I don't think it talks about MacOS X, but any BSD will be close.

----------

## Kalin

 *Quote:*   

> * Gentoo - Gentoo keeps a list of the modules, one per line, in /etc/modules.autoload

 Not quite right. Or at least not current. At the moment there is a directory /etc/modules.autoload.d/ with files like kernel-2.4 and kernel-2.6. These list the modules to autoload.

Another story is the configuration (alias lines and the like) for modules that are not autoloaded, which goes into separate files in the /etc/modules.d/ directory. There is a command, modules-update, to generate /etc/modules.conf from /etc/modules.d/*. See

```
man modules{-update,.autoload}
```

 for the Gentoo specific part.
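As a rough picture of the "one module per line" format, here is a Python sketch that reads such a list from a string. The sample contents are hypothetical, and two details are my assumptions rather than anything stated above: that `#` starts a comment, and that anything after the module name on a line can be ignored. The man pages are the authority on the real format.

```python
def parse_autoload(text):
    """Return the module names from an autoload-style list, one per line."""
    modules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumed: '#' starts a comment
        if line:
            modules.append(line.split()[0])   # assumed: ignore anything after the name
    return modules


# Hypothetical contents of a file such as /etc/modules.autoload.d/kernel-2.4
SAMPLE = """\
# network card
3c59x

# sound
emu10k1
"""

if __name__ == "__main__":
    print(parse_autoload(SAMPLE))
```

For the sample above this gives `['3c59x', 'emu10k1']`, which is the list of modules the boot scripts would go on to load.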

AFAIK, RedHat had /etc/modules.conf and you write directly there (last used 7.3 long time ago).

----------

## telex4

Thanks for the update  :Smile: 

----------

## WebsterRF

Nice website, I like it.

----------

## telex4

Thanks, it's been an interesting couple of years' work between us. It also always needs improvements, additions and translations into other languages, if people want to volunteer or propose new ideas  :Wink: 

----------

