Servo: Inside Mozilla's mission to reinvent the web browser (ZDNet)
Posted Feb 27, 2014 14:12 UTC (Thu) by sorokin (guest, #88478)
In reply to: Servo: Inside Mozilla's mission to reinvent the web browser (ZDNet) by roc
Parent article: Servo: Inside Mozilla's mission to reinvent the web browser (ZDNet)
> than Chrome code. So porting it to another browser would be awfully ugly. The worst part is, the Web already *has* standardized APIs for all of
> Pepper's functionality. That's another reason asm.js is preferable, it doesn't bring that duplication (and attack surface!).
I heard that Mozilla initially participated in the development of PPAPI, but then left the project. Perhaps if Mozilla had continued its participation, we would have a good cross-browser API now.

Also, could you specify which parts of the Pepper API you consider Chrome-specific and difficult to implement in other browsers? To an external user like me it doesn't look very tied to Chrome internals.
>> JS also lacks integers of different bitness and signedness
> Just a matter of using the right operations. Emscripten takes care of it.
I agree that with an extension of the API you can get relatively efficient code generation, although aesthetically it is awful. I assume you agree with me that if we were building a high-performance compiler, we wouldn't translate C to a dynamically typed language (discarding type information) and then perform optimizations on it.

As I understand it, your work relates to this area. I have a few questions about your vision of the implementation of asm.js.
1. Most compilers employ the notion of undefined behavior. UB allows the compiler to perform optimizations based on assertions that hold true but cannot be proven by the compiler. E.g. we have a pointer, we don't see the place where it is allocated, but the programmer told us the pointer is aligned, so we believe him and generate code using MOVAPS instead of MOVUPS. For specific cases these optimizations are crucial. A similarly important optimization is autovectorization in the presence of the aliasing problem (and use of the 'restrict' keyword). Are you planning to extend JS with a notion of undefined behavior? If not (I assume you are not), how are you going to perform these optimizations to match the performance of a regular compiler?
2. The second question is a subset of the first. As I understand it, Emscripten compiles pointers as indices into a HEAP array. As I understand it, JS has no undefined behavior when an access falls outside array bounds, so unless the compiler can statically prove that an index lies within bounds, it must perform a bounds check. Is it true that in asm.js a double dereference of a pointer (**p) leads to bounds checking? If so, how are you going to overcome this problem?
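To make the question concrete, here is a hedged sketch of the memory model being discussed: the C heap becomes a single typed array, pointers become byte offsets into it, and **p turns into two indexed loads. The offsets and the buffer size below are chosen purely for illustration; this is not real Emscripten output.

```javascript
// Sketch of the Emscripten-style memory model:
// the whole C heap is one typed array, and a C pointer is a byte offset.
const buffer = new ArrayBuffer(1 << 16);
const HEAP32 = new Int32Array(buffer);

// Simulate:  int x = 42;  int *p = &x;  int **pp = &p;  int y = **pp;
const xAddr = 4;            // &x  (byte offset, picked for the sketch)
const pAddr = 8;            // &p
HEAP32[xAddr >> 2] = 42;    // x = 42
HEAP32[pAddr >> 2] = xAddr; // p = &x

// **pp compiles to two indexed loads. Each JS array access is
// bounds-checked by the language semantics (an out-of-range read
// yields undefined rather than invoking UB), so the engine must
// either prove the checks away or cover them with tricks such as
// a large virtual-address reservation.
const y = HEAP32[HEAP32[pAddr >> 2] >> 2];
console.log(y); // 42
```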
>> As far I know (I can be wrong) Flash uses exactly the same API (PPAPI)
>> and exactly the same sandboxing mechanism as a regular NaCl programs use.
> FWIW that's incorrect. There is a bunch of Flash-specific PPAPI.
I was not aware of these details. Out of curiosity, which APIs did they need for Flash, and why can't they be provided to regular PPAPI clients?
> I understand why people feel LLVM IR is more aesthetically pleasing than JS, but JS is actually a much better code distribution format:
> -- It compresses to about the same size, and we're able to compile it to get almost the same performance (actually better startup performance, see above).
As I understand it, the better startup performance occurs not because JS is compiled faster, but because not the entire program is compiled. I believe if LLVM used similar techniques it could reach similar (or better) results.
> -- JS has a precise specification and multiple independent implementations. LLVM IR does not.
The OpenCL guys were able to standardize LLVM IR as a binary representation. Although I find their specification awful and complete crap, they have done it. So I believe the browser guys could do the same. LLVM IR is relatively clean, and I don't see big difficulties in writing a standard.
> -- LLVM IR was designed to be a compiler IR, not a code distribution format, which means it is not well suited to code distribution. For example it has
> undefined behaviors, which mean the same code can behave differently across architectures or even across different Chrome versions.
I don't see a problem with UB. The only thing the browser must enforce is that exploiting UB does not allow breaking the sandbox. Also, you could always redefine specific operations to remove UB (but you would lose some optimizations in that case).
Posted Feb 27, 2014 18:19 UTC (Thu)
by khim (subscriber, #9252)
[Link]
The initial version of Pepper is totally different from PPAPI as it exists today. Initially the Pepper interface was proposed as a kind of "modern cross-browser NPAPI", but when Mozilla rejected the offer the Chromium guys went off and developed a totally different API with a different (albeit similar) name. You cannot use Pepper with today's [P]NaCl. In particular, NPAPI plugins (and PPAPI Flash, of course) can directly access the HTML page's DOM, while a regular NaCl application cannot. Of course this makes PPAPI easier to implement, not harder (it's harder for developers to use, but that's another kettle of fish).
Posted Feb 28, 2014 2:00 UTC (Fri)
by roc (subscriber, #30627)
[Link]
> I heard that Mozilla initially participated in the development of PPAPI, but then left the project.

Google has suggested in their PR that we were involved, but we never really were. My feedback to them was to base their APIs on the already-standardized Web platform APIs (i.e. make PNaCl an execution environment which simply bound to some subset of the Web platform APIs), but at that point they'd already written some code and didn't want to change anything.
> Also, could you specify which parts of the Pepper API you consider Chrome-specific and difficult to implement in other browsers?
It's not that the API is Chrome-specific. It just doesn't map directly to Firefox internals, so we'd be writing a bunch of code to overcome the impedance mismatch. Anyway I'll concede this point since it's not so important.
> I assume you agree with me that if we were building a high-performance compiler, we wouldn't translate C to a dynamically typed language (discarding type information) and then perform optimizations on it.
Of course, but making a high performance compiler is not the end goal; we're trying to create a stable code distribution format for C and C++-like programs for consumption by Web browsers.
> Are you planning to extend JS with a notion of undefined behavior?
No.
> If not (I assume you are not), how are you going to perform these optimizations to match the performance of a regular compiler?
I'm not working on this directly, so I don't know all the details, but for unaligned SIMD operations I think we'd take an approach where JS SIMD operations always work but there's a performance penalty when unaligned. AVX already provides this in hardware, and I think that's the way all hardware will go. In the meantime, for architectures that fault on unaligned accesses, we could handle the fault, emulate the instructions, and reduce the performance penalty by recompiling the faulting code with dynamic alignment checks.
> Is it true that in asm.js a double dereference of a pointer (**p) leads to bounds checking? If so, how are you going to overcome this problem?
On 64-bit machines we allocate 4GB of virtual address space for the heap array and no bounds checks are required. On 32-bit machines we have to do bounds checks, but we use range analysis to avoid most of them.
> Out of curiosity, which APIs did they need for Flash, and why can't they be provided to regular PPAPI clients?

I don't know. My information comes from Brian Smith:
https://groups.google.com/forum/#!topic/mozilla.dev.platf...
> As I understand it, the better startup performance occurs not because JS is compiled faster, but because not the entire program is compiled.
That's not correct for Firefox. We compile asm.js code fully ahead-of-time. This reduces run-time overhead and makes execution speed more predictable.
> LLVM IR is relatively clean, and I don't see big difficulties in writing a standard.
OK, but it hasn't been done. Also you don't have a good spec until someone has built an independent reimplementation of it.
> I don't see a problem with UB. The only thing the browser must enforce is that exploiting UB does not allow breaking the sandbox.
It's quite a big problem that any Chrome update (i.e. every six weeks or less) is allowed to break any PNaCl application that accidentally relies on undefined behavior (which is to say, most large C and C++ applications).
> Also, you could always redefine specific operations to remove UB (but you would lose some optimizations in that case).

Google seems to disagree that it's fixable:

> PNaCl's goal is that a single pexe should work reliably in the same manner on all architectures, irrespective of runtime parameters and through Chrome updates. This goal is unfortunately not attainable;
Posted Feb 28, 2014 7:35 UTC (Fri)
by ibukanov (subscriber, #3942)
[Link] (2 responses)
asm.js operates at the assembler level, where there is no undefined behavior, since every instruction has a precise meaning. Nothing prevents a compiler that targets asm.js from taking advantage of undefined behavior in the original source to generate better asm.js.
As regards SSE instructions and vectorization, asm.js can still support them, perhaps by generating calls to new functions on Math that carry out the operations, in the same way that calls to Math.imul() and Math.fround() are handled today. For JavaScript engines that do not support the new functions, a pure-JS implementation would be provided.
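The pattern being described can be sketched as follows, assuming current Math.imul/Math.fround semantics; the polyfill shown is an illustrative fallback, not the one any engine actually shipped.

```javascript
// asm.js expresses integer and float32 operations through ordinary
// functions with exact semantics, so optimizing engines can compile
// them to single machine instructions while plain JS engines still
// run them correctly.

// 32-bit integer multiply with C-like wraparound:
console.log(Math.imul(0x7fffffff, 2)); // -2: wraps like int32, unlike 0x7fffffff * 2

// Round a double to the nearest float32, as a C float assignment would:
console.log(Math.fround(1.1) === 1.1); // false: 1.1 is not exactly representable in float32

// A pure-JS fallback for engines lacking Math.fround (sketch):
const froundPolyfill = (function () {
  const f32 = new Float32Array(1);     // round-trip through a float32 slot
  return function (x) { f32[0] = x; return f32[0]; };
})();
console.log(froundPolyfill(1.1) === Math.fround(1.1)); // true
```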
> Is it true that in asm.js a double dereference of a pointer (**p) leads to bounds checking?
Each memory dereference in asm.js leads to an explicit bounds check in the machine code unless the engine can optimize the check away or there is hardware support for doing it, like segment registers on x86. This is no different from PNaCl, which also generates extra code on platforms without CPU support for bounds checking. As I wrote above, PNaCl's particular way of enforcing bounds checking is slightly faster, but asm.js can be adjusted in a backward-compatible way if that advantage cannot be overcome by other means.
Posted Feb 28, 2014 12:15 UTC (Fri)
by sorokin (guest, #88478)
[Link] (1 responses)
That is true, but asm.js doesn't match assembler instructions very well. And sometimes information from the original source is very useful for sane instruction selection.
This is similar to generating code for one CPU architecture and then translating it to another. Suppose we are translating x86 (which has an EFLAGS register) to another architecture (which doesn't). Correctly evaluating EFLAGS is expensive, so every time we have a proof that EFLAGS is not used later, we omit its computation. But when we don't know where a call/jmp/ret jumps, we have to evaluate EFLAGS, so this evaluation happens before every indirect call/jmp and before every ret. You know that EFLAGS is almost never used as an argument or return value of a function, so with 99.9% probability we do this computation in vain, and this code is almost impossible to optimize out. If we generated code directly from C, we wouldn't have this problem. Are you sure that no similar problem occurs when translating asm.js to assembler?
Posted Mar 2, 2014 20:30 UTC (Sun)
by roc (subscriber, #30627)
[Link]
> asm.js doesn't match assembler instructions very well
Do you have examples in mind or are you just speculating?
asm.js has:
-- functions
-- heap array (i.e. memory)
-- primitive-typed local variables (i.e. registers)
-- primitive-typed loads and stores
-- primitive-typed arithmetic expressions
-- structured control flow with conditionals
-- FFI
-- a very small standard library
It looks like standard assembler instructions because it was designed to.
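Most of the items in that list show up in a small hand-written module in the asm.js style. This is an illustrative sketch, not compiler output; it may not satisfy a strict asm.js validator on every expression, but it runs as ordinary JavaScript either way.

```javascript
function MiniModule(stdlib, foreign, heap) {
  "use asm";                                 // opt into asm.js validation
  var HEAP32 = new stdlib.Int32Array(heap);  // the heap array (memory)
  var imul = stdlib.Math.imul;               // tiny standard library

  // Sum of squares of n int32 values starting at byte offset ptr.
  function sumSquares(ptr, n) {
    ptr = ptr | 0;                           // int32 parameter coercion
    n = n | 0;
    var i = 0, acc = 0, v = 0;               // primitive-typed locals (registers)
    for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {  // structured control flow
      v = HEAP32[(ptr + (i << 2)) >> 2] | 0; // primitive-typed load
      acc = (acc + imul(v, v)) | 0;          // primitive-typed arithmetic
    }
    return acc | 0;
  }
  return { sumSquares: sumSquares };         // exported functions
}

// Usage: fill the heap with [1, 2, 3] and sum the squares.
var heapBuf = new ArrayBuffer(1 << 16);
var mod = MiniModule(globalThis, {}, heapBuf);
new Int32Array(heapBuf).set([1, 2, 3]);
console.log(mod.sumSquares(0, 3)); // 14
```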