Last time I tried it, if the kernel happened to start running a different process 98 us before your sleep was scheduled to complete, and the other process was getting a 100 us slice, it would wake you 2 us after the start of the millisecond, when your process can't do anything other than record the loss of a sample and go back to sleep for 998 us. nanosleep is fine for telling the kernel you can't do anything until a given time, but it doesn't tell the kernel that, in 999 us, there will be 1 us that you can use, followed by another 999 us during which your either done or too late to do anything.
The scales in my example are different from what I was actually doing at the time, but I was trying to sample an accelerometer attached to an i2c bus attached to a serial port at 20 Hz; I needed to send a few bytes at the right time, which would cause the accelerometer to take a sample then. (The accelerometer device didn't support automatic periodic sampling.) It turned out that the only way to get data that I could analyze was to sleep until 1 ms before the time I wanted to sample and busy-wait until the right time; that meant I was generally running by the sample time, and generally hadn't used up my time slice. On the other hand, I was burning ~2% of the CPU on a power-limited system busy-waiting.