Remove syscall instructions at fixed addresses

From:  Andy Lutomirski <luto@MIT.EDU>
To:  Ingo Molnar
Subject:  [PATCH v2 00/10] Remove syscall instructions at fixed addresses
Date:  Sun, 29 May 2011 23:48:37 -0400
Cc:  Thomas Gleixner, Jesper Juhl, Borislav Petkov, Linus Torvalds, Andrew Morton, Arjan van de Ven, Jan Beulich, richard -rw- weinberger, Mikael Pettersson, Andy Lutomirski

This series is really three different parts.

The first part (patch 1/10) is just a bugfix from the last vdso series.
The bug should be harmless but it's pretty dumb.  This is almost
certainly 3.0 material.

The second part removes a bunch of syscall instructions in kernel space
at fixed addresses that user code can execute.  This is not all that
well tested or inspected at this point.

Several are data that isn't marked NX.  Patch 2/10 makes vvars NX and
5/10 makes the HPET NX.
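
A user-space analogue of "data that isn't marked NX", as a sketch only:
the helper below maps an anonymous page readable and writable but
without PROT_EXEC, which is the kind of permission the kernel-side
patches aim for on the vvar and HPET mappings.  The helper name is an
assumption made for this example; mmap() and its flags are the standard
POSIX/Linux interface.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `len` bytes of data memory with no execute
 * permission.  Returns NULL on failure. */
static void *map_data_noexec(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

On hardware and kernels that enforce NX, jumping into such a page
should fault rather than execute whatever bytes the data happens to
contain.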

The time() vsyscall contains an explicit syscall fallback.  Patch 3/10
removes it.

The last one is the gettimeofday fallback.  We need that, but it doesn't
have to be a real syscall.  Patch 4/10 adds int 0xcc (callable only from
the vsyscall page) that implements the gettimeofday fallback and nothing
else.

The third part is a more aggressive cleanup of the vsyscall page.  It
removes the code implementing the vsyscalls and replaces it with magic
int 0xcc incantations.  These incantations are specifically designed so
that jumping into them at funny offsets will either work fine or
generate some kind of fault.  Patch 8/10 is optional and might want to
be hidden away in CONFIG_EMBEDDED for a while.  This needs some more
testing in CONFIG_UNSAFE_VSYSCALLS=y mode and a lot of careful
inspection in CONFIG_UNSAFE_VSYSCALLS=n mode.

Patch 6/10 removes venosys.  It's been broken (crashes) for a couple
years and it doesn't do anything particularly useful anyway.

Patch 7/10 fills the vsyscall page with 0xcc instead of 0x00.  0xcc is
an explicit trap.
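
As a rough user-space illustration of the idea (not the kernel code
itself), the sketch below pads the unused tail of a buffer with 0xcc,
the one-byte x86 int3 opcode, so that a stray jump into the padding
traps instead of running leftover bytes.  The function name and the
buffer layout are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill everything after the first `used` bytes
 * of a `size`-byte buffer with 0xcc (x86 int3).  Returns the number
 * of padding bytes written, or 0 if there is nothing to pad. */
static size_t pad_with_int3(unsigned char *buf, size_t used, size_t size)
{
    if (buf == NULL || used >= size)
        return 0;
    memset(buf + used, 0xcc, size - used);
    return size - used;
}
```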

Patch 8/10 adds a config option to emulate the vsyscalls.  The int 0xcc
incantation intentionally depends on the config option -- it is not ABI.

Patch 9/10 randomizes the int 0xcc incantation at bootup.  It is pretty
much worthless for security (there are only three choices for the random
number and it's easy to figure out which one is in use) but it prevents
overly clever userspace programs from thinking that the incantation is
ABI.  One instrumentation tool author offered to hard-code special
handling for int 0xcc; I want to discourage this approach.
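
To make the "three choices" remark concrete, here is a small sketch of
picking one of a few magic values once at startup.  The values in the
table and the function name are invented for illustration; the real
values are defined by the patch and are deliberately not something
userspace should rely on.

```c
#include <stddef.h>

/* Hypothetical magic values; the real ones are defined by patch 9/10. */
static const unsigned char magic_choices[3] = { 0x11, 0x22, 0x33 };

/* Pick one table entry based on a boot-time random number. */
static unsigned char pick_magic(unsigned int boot_random)
{
    size_t n = sizeof magic_choices / sizeof magic_choices[0];
    return magic_choices[boot_random % n];
}
```

With only three possible outcomes, trying each in turn is trivial,
which is why this is an anti-ABI measure rather than a security
feature.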

Patch 10/10 adds some documentation for entry_64.S.  A lot of the magic
in there is far from obvious.

Changes from v1:
 - Patches 6-10 are new.
 - The int 0xcc code is much prettier and has lots of bugs fixed.
 - I've decided to let everyone compile turbostat on their own :)

Andy Lutomirski (10):
  x86-64: Fix alignment of jiffies variable
  x86-64: Give vvars their own page
  x86-64: Remove kernel.vsyscall64 sysctl
  x86-64: Replace vsyscall gettimeofday fallback with int 0xcc
  x86-64: Map the HPET NX
  x86-64: Remove vsyscall number 3 (venosys)
  x86-64: Fill unused parts of the vsyscall page with 0xcc
  x86-64: Emulate vsyscalls
  x86-64: Randomize int 0xcc magic al values at boot
  x86-64: Document some of entry_64.S

 Documentation/x86/entry_64.txt       |   95 +++++++++++
 arch/x86/Kconfig                     |   17 ++
 arch/x86/include/asm/fixmap.h        |    1 +
 arch/x86/include/asm/irq_vectors.h   |    6 +-
 arch/x86/include/asm/pgtable_types.h |    6 +-
 arch/x86/include/asm/traps.h         |    4 +
 arch/x86/include/asm/vgtod.h         |    1 -
 arch/x86/include/asm/vsyscall.h      |    6 +
 arch/x86/include/asm/vvar.h          |   24 ++--
 arch/x86/kernel/Makefile             |    3 +
 arch/x86/kernel/entry_64.S           |    4 +
 arch/x86/kernel/hpet.c               |    2 +-
 arch/x86/kernel/traps.c              |    4 +
 arch/x86/kernel/        |   47 +++---
 arch/x86/kernel/vsyscall_64.c        |  289 +++++++++++++++++++++++++++++-----
 arch/x86/kernel/vsyscall_emu_64.S    |   40 +++++
 arch/x86/vdso/vclock_gettime.c       |   55 +++----
 17 files changed, 486 insertions(+), 118 deletions(-)
 create mode 100644 Documentation/x86/entry_64.txt
 create mode 100644 arch/x86/kernel/vsyscall_emu_64.S

