summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2008-07-21 14:55:23 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2008-07-21 14:55:23 -0700
commiteb4225b2da2b9f3c1ee43efe58ed1415cc1d4c47 (patch)
tree573ce3591679ffcdc179801ed86107e48e0e11ca /Documentation
parent807677f812639bdeeddf86abc66117e124eaedb2 (diff)
parent4cddb886a4d0e5cc7a790151740bfb87b568c97d (diff)
downloadlinux-eb4225b2da2b9f3c1ee43efe58ed1415cc1d4c47.tar.gz
linux-eb4225b2da2b9f3c1ee43efe58ed1415cc1d4c47.tar.bz2
linux-eb4225b2da2b9f3c1ee43efe58ed1415cc1d4c47.zip
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: (25 commits) mmtimer: Push BKL down into the ioctl handler [IA64] Remove experimental status of kdump [IA64] Update ia64 mmr list for SGI uv [IA64] Avoid overflowing ia64_cpu_to_sapicid in acpi_map_lsapic() [IA64] adding parameter check to module_free() [IA64] improper printk format in acpi-cpufreq [IA64] pv_ops: move some functions in ivt.S to avoid lack of space. [IA64] pvops: documentation on ia64/pv_ops [IA64] pvops: add to hooks, pv_time_ops, for steal time accounting. [IA64] pvops: add hooks, pv_irq_ops, to paravirtualized irq related operations. [IA64] pvops: add hooks, pv_iosapic_ops, to paravirtualize iosapic. [IA64] pvops: define initialization hooks, pv_init_ops, for paravirtualized environment. [IA64] pvops: paravirtualize NR_IRQS [IA64] pvops: paravirtualize entry.S [IA64] pvops: paravirtualize ivt.S [IA64] pvops: paravirtualize minstate.h. [IA64] pvops: define paravirtualized instructions for native. [IA64] pvops: preparation for paravirtulization of hand written assembly code. [IA64] pvops: introduce pv_cpu_ops to paravirtualize privileged instructions. [IA64] pvops: add an early setup hook for pv_ops. ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ia64/paravirt_ops.txt137
1 files changed, 137 insertions, 0 deletions
diff --git a/Documentation/ia64/paravirt_ops.txt b/Documentation/ia64/paravirt_ops.txt
new file mode 100644
index 000000000000..39ded02ec33f
--- /dev/null
+++ b/Documentation/ia64/paravirt_ops.txt
@@ -0,0 +1,137 @@
+Paravirt_ops on IA64
+====================
+ 21 May 2008, Isaku Yamahata <yamahata@valinux.co.jp>
+
+
+Introduction
+------------
+The aim of this documentation is to help with maintainability and/or to
+encourage people to use paravirt_ops/IA64.
+
+paravirt_ops (pv_ops in short) is a way for virtualization support of
+Linux kernel on x86. Several ways for virtualization support were
+proposed, paravirt_ops is the winner.
+On the other hand, now there are also several IA64 virtualization
+technologies like kvm/IA64, xen/IA64 and many other academic IA64
+hypervisors so that it is good to add generic virtualization
+infrastructure on Linux/IA64.
+
+
+What is paravirt_ops?
+---------------------
+It has been developed on x86 as virtualization support via API, not ABI.
+It allows each hypervisor to override operations which are important for
+hypervisors at API level. And it allows a single kernel binary to run on
+all supported execution environments including native machine.
+Essentially paravirt_ops is a set of function pointers which represent
+operations corresponding to low level sensitive instructions and high
+level functionalities in various area. But one significant difference
+from usual function pointer table is that it allows optimization with
+binary patch. It is because some of these operations are very
+performance sensitive and indirect call overhead is not negligible.
+With binary patch, indirect C function call can be transformed into
+direct C function call or in-place execution to eliminate the overhead.
+
+Thus, operations of paravirt_ops are classified into three categories.
+- simple indirect call
+ These operations correspond to high level functionality so that the
+ overhead of indirect call isn't very important.
+
+- indirect call which allows optimization with binary patch
+ Usually these operations correspond to low level instructions. They
+ are called frequently and performance critical. So the overhead is
+ very important.
+
+- a set of macros for hand written assembly code
+ Hand written assembly codes (.S files) also need paravirtualization
+ because they include sensitive instructions or some of code paths in
+ them are very performance critical.
+
+
+The relation to the IA64 machine vector
+---------------------------------------
+Linux/IA64 has the IA64 machine vector functionality which allows the
+kernel to switch implementations (e.g. initialization, ipi, dma api...)
+depending on executing platform.
+We can replace some implementations very easily defining a new machine
+vector. Thus another approach for virtualization support would be
+enhancing the machine vector functionality.
+But paravirt_ops approach was taken because
+- virtualization support needs wider support than machine vector does.
+ e.g. low level instruction paravirtualization. It must be
+ initialized very early before platform detection.
+
+- virtualization support needs more functionality like binary patch.
+ Probably the calling overhead might not be very large compared to the
+ emulation overhead of virtualization. However in the native case, the
+ overhead should be eliminated completely.
+ A single kernel binary should run on each environment including native,
+ and the overhead of paravirt_ops on native environment should be as
+ small as possible.
+
+- for full virtualization technology, e.g. KVM/IA64 or
+ Xen/IA64 HVM domain, the result would be
+ (the emulated platform machine vector. probably dig) + (pv_ops).
+ This means that the virtualization support layer should be under
+ the machine vector layer.
+
+Possibly it might be better to move some function pointers from
+paravirt_ops to machine vector. In fact, Xen domU case utilizes both
+pv_ops and machine vector.
+
+
+IA64 paravirt_ops
+-----------------
+In this section, the concrete paravirt_ops will be discussed.
+Because of the architecture difference between ia64 and x86, the
+resulting set of functions is very different from x86 pv_ops.
+
+- C function pointer tables
+They are not very performance critical so that simple C indirect
+function call is acceptable. The following structures are defined at
+this moment. For details see linux/include/asm-ia64/paravirt.h
+ - struct pv_info
+ This structure describes the execution environment.
+ - struct pv_init_ops
+ This structure describes the various initialization hooks.
+ - struct pv_iosapic_ops
+ This structure describes hooks to iosapic operations.
+ - struct pv_irq_ops
+ This structure describes hooks to irq related operations
+ - struct pv_time_op
+ This structure describes hooks to steal time accounting.
+
+- a set of indirect calls which need optimization
+Currently this class of functions correspond to a subset of IA64
+intrinsics. At this moment the optimization with binary patch isn't
+implemented yet.
+struct pv_cpu_op is defined. For details see
+linux/include/asm-ia64/paravirt_privop.h
+Mostly they correspond to ia64 intrinsics 1-to-1.
+Caveat: Now they are defined as C indirect function pointers, but in
+order to support binary patch optimization, they will be changed
+using GCC extended inline assembly code.
+
+- a set of macros for hand written assembly code (.S files)
+For maintenance purpose, the taken approach for .S files is single
+source code and compile multiple times with different macros definitions.
+Each pv_ops instance must define those macros to compile.
+The important thing here is that sensitive, but non-privileged
+instructions must be paravirtualized and that some privileged
+instructions also need paravirtualization for reasonable performance.
+Developers who modify .S files must be aware of that. At this moment
+an easy checker is implemented to detect paravirtualization breakage.
+But it doesn't cover all the cases.
+
+Sometimes this set of macros is called pv_cpu_asm_op. But there is no
+corresponding structure in the source code.
+Those macros mostly 1:1 correspond to a subset of privileged
+instructions. See linux/include/asm-ia64/native/inst.h.
+And some functions written in assembly also need to be overrided so
+that each pv_ops instance have to define some macros. Again see
+linux/include/asm-ia64/native/inst.h.
+
+
+Those structures must be initialized very early before start_kernel.
+Probably initialized in head.S using multi entry point or some other trick.
+For native case implementation see linux/arch/ia64/kernel/paravirt.c.