The 3.16 kernel has been released
      Posted Aug 4, 2014 15:34 UTC (Mon)
                               by kloczek (guest, #6391)
                              [Link] (98 responses)
       
Linux cgroups development started in 2007, and cgroups are still barely usable; what exists amounts to little more than a demonstration. 
 
     
    
      Posted Aug 4, 2014 19:22 UTC (Mon)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (97 responses)
       
And also, systemd is waaaaaaaaaaaaaay nicer than smf could ever hope to be. 
     
    
      Posted Aug 5, 2014 1:25 UTC (Tue)
                               by kloczek (guest, #6391)
                              [Link] (96 responses)
       
 
     
    
      Posted Aug 5, 2014 2:11 UTC (Tue)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (95 responses)
       
Also, it actually _provides_ output because it's integrated with journald. And then systemd unit files are actually not a crazy XML-based format. And then there's an uber-useful systemd-nspawn tool to quickly start new containers. And integration with network configuration. And autofs. And cgroups. 
SMF along with launchd was one of the inspirations for systemd, so it's no wonder that a lot of good features were taken out of SMF and made easier to use. 
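The nspawn bit really is about two commands to a working container, something like (paths and release are examples): 
$ debootstrap --arch=amd64 wheezy /srv/mycontainer 
$ systemd-nspawn -D /srv/mycontainer -b 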
     
    
      Posted Aug 5, 2014 4:10 UTC (Tue)
                               by kloczek (guest, #6391)
                              [Link] (94 responses)
       
What is maintained by systemd/SMF really does not matter; that is a detail of the particular setup on top of the base functionality. 
One of the good things about SMF is that it does not print any output if everything is OK. 
 
     
    
      Posted Aug 5, 2014 4:22 UTC (Tue)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (92 responses)
       
Yet when I'm debugging misbehaving services or developing new ones, I don't usually do it on 1000 hosts at a time. 
> One of the good things about SMF is that it does not print any output if everything is OK. 
What exactly does SMF have that cgroups+systemd can't do? For example, SMF does not allow you to meter block disk bandwidth - at least from what I remember of working with it 5 years ago. 
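For what it's worth, the metering side is one command these days (a sketch; the unit name is an example): 
$ systemctl set-property mysql.service BlockIOReadBandwidth="/dev/sda 10M" 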
     
    
      Posted Aug 5, 2014 12:19 UTC (Tue)
                               by clugstj (subscriber, #4020)
                              [Link] (91 responses)
       
     
    
      Posted Aug 5, 2014 14:12 UTC (Tue)
                               by zdzichu (subscriber, #17118)
                              [Link] (85 responses)
       
     
    
      Posted Aug 5, 2014 15:50 UTC (Tue)
                               by sjj (guest, #2020)
                              [Link] (84 responses)
       
While there certainly is NIH in Linux (it's made by /programmers/ after all), kloczek's "points" amount to "it's not an exact copy of Solaris, hence inferior, nyah nyah". And of course, if it were an exact copy of some obsolete OS, the charge would be that Linux people are just stealing ideas. We've *all* seen this play before, haven't we? 
If kloczek wanted to have an actual technical, fact-based discussion of the differences, pluses and minuses, like an adult, he could very easily do so. I certainly would welcome it; it would benefit us all and increase learning. 
     
    
      Posted Aug 6, 2014 1:24 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (83 responses)
       
Let's look at a few stories. 
* Sound subsystem 
* COW file system 
ZFS is an open source solution. OK, the latest versions are not, but the latest OpenSolaris code is far more advanced and stable than anything available under Linux. All that is necessary is to try to integrate OpenZFS into the regular kernel. There is already enough CDDL code around the existing Linux code that integrating OpenZFS would not make a difference. 
A few more thoughts about btrfs. 
The same failed approach can be pointed out in many other areas. Linux developers probably spend hundreds of man-years reinventing the wheel. 
Effectively, the only Linux success area is the number of supported hardware components, and the availability of the code when someone needs to implement something that will never be available publicly. And that palette of supported hardware is trashed again and again by rewrites of the KAPI. 
Linux has been pushed out of desktops and most embedded systems (OpenWRT). Android was initially based on Linux, but it is now developed by Google, and the number of functionalities backported into the Linux kernel code is limited. 
Linux from the beginning has been, and still is, a story of reinventing the wheel(s) (anything bigger than a single device driver was, and still is, a kind of rocket science). 
 
     
    
      Posted Aug 6, 2014 2:28 UTC (Wed)
                               by dlang (guest, #313)
                              [Link] (2 responses)
       
OSS was replaced because the company took it proprietary and so rather than forking people opted to replace it entirely. 
I'm sure that Google, Amazon and many other large companies would be surprised to hear that Linux doesn't scale and that they need to run Solaris. 
     
    
      Posted Aug 6, 2014 9:53 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (1 responses)
       
Aha... so no one gave/donated DTrace or ZFS to Linux :) 
> OSS was replaced because the company took it proprietary and so rather than forking people opted to replace it entirely. 
Sorry, but the API is public (it is not proprietary). The implementation was proprietary, that is true. However, you can use the specification of the API to implement your own audio subsystem, compatible with OSS at the API layer. 
I call this the "syndrome of construction workers running with empty barrels". 
     
    
      Posted Aug 7, 2014 19:19 UTC (Thu)
                               by cortana (subscriber, #24596)
                              [Link] 
       
Oracle have claimed the very opposite in court! 
     
      Posted Aug 6, 2014 3:10 UTC (Wed)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (31 responses)
       
2) ZFS can't be included in Linux because of a conflicting license. Btrfs uses a similar design and is maturing fast and right now is pretty stable. And no, it's nowhere near the level of crashitude that the first versions of ZFS displayed. 
3) DTrace couldn't be included in the kernel due to a conflicting license. The tracing implementation for Linux took a long time, but it's now pretty powerful, with less overhead than DTrace has on Solaris (see the example after this list). 
4) Solaris runs OK on Amazon EC2. Yet almost nobody uses it there - exactly because it's a piece of crap with overcomplicated user interface (no colored output from SMF!). 
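For a taste of what the stock tooling does today, a sketch (the block tracepoint is a standard one): 
$ perf record -e block:block_rq_issue -a sleep 10 
$ perf report 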
Linux indeed suffers from serious cases of NIH, but you're nowhere close to finding actual examples. You've missed the most prominent one: kqueue is much nicer than epoll (and evports, for that matter). 
So yeah, I'll stop feeding the troll now.  
     
    
      Posted Aug 6, 2014 10:05 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (27 responses)
       
Of course it is. I've been talking about ALSA. 
> ZFS can't be included in Linux because of a conflicting license. 
Try grepping across the kernel source tree for "CDDL". 
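The check is a one-liner from a checked-out tree, e.g.: 
$ git grep -il cddl | wc -l 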
> Solaris runs OK on Amazon EC2 
Sorry, but I completely don't understand why you mention this here. Running Solaris inside a KVM cage for anything more than a test/demonstration is pointless. Solaris running inside such a cage is not able to interact directly with the disks. Remember that ZFS can recognize the faster parts of the spindles and utilize those areas in a way Linux never could, and still cannot. 
> Linux indeed suffers from serious cases of NIH 
Thank you for agreeing with me. 
     
    
      Posted Aug 6, 2014 14:27 UTC (Wed)
                               by mjg59 (subscriber, #23239)
                              [Link] (16 responses)
       
Zero hits. 
     
    
      Posted Aug 6, 2014 18:02 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (15 responses)
       
     
    
      Posted Aug 6, 2014 18:07 UTC (Wed)
                               by mjg59 (subscriber, #23239)
                              [Link] (13 responses)
       
     
    
      Posted Aug 6, 2014 19:08 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (12 responses)
       
It was a kind of mental problem. Initially, many young Linux developers thought that because they had already done so much, doing more would not be a problem. And this was pure bollocks :-> 
The first principle of software development says: "if you want to slow down your code development, add more developers". 
 
     
    
      Posted Aug 6, 2014 19:20 UTC (Wed)
                               by mjg59 (subscriber, #23239)
                              [Link] (11 responses)
       
     
    
      Posted Aug 6, 2014 20:58 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (10 responses)
       
Even if the derivative work was done by the original code owner (Sun)? 
 
     
    
      Posted Aug 6, 2014 21:10 UTC (Wed)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (8 responses)
       
     
    
      Posted Aug 6, 2014 21:47 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (6 responses)
       
I really don't care about how many developers are working on Linux. 
PS. For example, many banking systems across the world are under national legal requirements to keep *each* system under external support. So banks must pay for Linux support even though it is *free to use*. 
> Good luck getting about 10000 Linux source code owners to agree. 
Is it really such a big problem to keep a record of all the Linux developers and spam them with some agreement to sign? Or has no one even tried to organize this? 
 
     
    
      Posted Aug 6, 2014 21:52 UTC (Wed)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] 
       
> PS. For example, many banking systems across the world are under national legal requirements to keep *each* system under external support. So banks must pay for Linux support even though it is *free to use*. 
> Is it really such a big problem to keep a record of all the Linux developers and spam them with some agreement to sign? Or has no one even tried to organize this? 
     
      Posted Aug 6, 2014 22:18 UTC (Wed)
                               by dlang (guest, #313)
                              [Link] (3 responses)
       
The license incompatibility has been publicly discussed since the day the CDDL was published. The fact that Sun and Oracle have opted not to fix this indicates that they don't want this code to be in the Linux kernel. 
Trying to get hundreds of thousands of other developers (or their estates, if they are dead), as well as all the developers of code that was written outside the kernel and included because it had a compatible license, to change the license of the kernel to something that's incompatible with the existing license is not going to go very far. 
RMS and the FSF weren't able to convince them to change to GPLv3; what makes you think they are going to agree to give Oracle more control over the license of the kernel than they were willing to give the FSF? 
Remember, Oracle is the company claiming that if you write a new implementation that uses the same API you are violating their copyright. 
     
    
      Posted Aug 7, 2014 3:32 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (2 responses)
       
Did you read the CDDL license? 
If you are worried that, should some Linux developers improve something in such code, Oracle would be able to integrate it into the Solaris kernel, have a look at the dispute about the problems with integrating some features developed in Illumos. IIRC, it looks like Oracle has a problem because it is not able to integrate such code (a new version of the compression) without first publishing its own current version of the code (with many new features, like 1MB block size or faster RAID-Z resilvering). 
 
     
    
      Posted Aug 7, 2014 3:53 UTC (Thu)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] 
       
> Any Covered Software that You distribute or otherwise make available in Executable form must also be made available in Source Code form and that Source Code form must be distributed only under the terms of this License. You must include a copy of this License with every copy of the Source Code form of the Covered Software You distribute or otherwise make available. You must inform recipients of any such Covered Software in Executable form as to how they can obtain such Covered Software in Source Code form in a reasonable manner on or through a medium customarily used for software exchange. 
 
     
      Posted Aug 7, 2014 4:18 UTC (Thu)
                               by dlang (guest, #313)
                              [Link] 
       
This isn't an accident; people at Sun have said that they explicitly based the CDDL on the MPL _because_ it wasn't compatible with the GPL. 
I suggest that you read up on it; a few minutes of Google searching will give you lots of info, or you can just hit Wikipedia: http://en.wikipedia.org/wiki/Common_Development_and_Distr... 
     
      Posted Aug 6, 2014 22:39 UTC (Wed)
                               by mjg59 (subscriber, #23239)
                              [Link] 
       
     
      Posted Aug 9, 2014 12:12 UTC (Sat)
                               by nix (subscriber, #2304)
                              [Link] 
       
     
      Posted Aug 6, 2014 21:56 UTC (Wed)
                               by rahulsundaram (subscriber, #21946)
                              [Link] 
       
Purely CDDL?  You are mistaken.  Dual license?  Sure; as long as one of the licenses is compatible, the other one doesn't matter. 
     
      Posted Aug 7, 2014 19:20 UTC (Thu)
                               by cortana (subscriber, #24596)
                              [Link] 
       
     
      Posted Aug 6, 2014 16:37 UTC (Wed)
                               by raven667 (subscriber, #5198)
                              [Link] (9 responses)
       
Also ask Oracle and Google how reimplementing the same interface works out ... 8-) 
In your example of ALSA, it did come with an OSS compatibility module, and the userspace components that sprang up, like ESD or PulseAudio, also provided compatibility.  You can certainly make an argument that the design and abstraction of ALSA could have been better layered in some other way, but it's hard to argue that they didn't care about OSS, and you must concede that the kernel implementation was going to be different in any case after OSS relicensed. 
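The compatibility path is still a modprobe away, if memory serves (standard ALSA emulation modules): 
$ modprobe snd-pcm-oss    # brings back /dev/dsp for legacy OSS apps 
$ modprobe snd-mixer-oss  # brings back /dev/mixer 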
     
    
      Posted Aug 6, 2014 18:57 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (8 responses)
       
When you buy Solaris for the first time, you will probably still receive, on paper, a binary API license which guarantees that at the *binary* layer some parts of the API will not be changed. 
If we are talking about porting something like, for example, DTrace, you need to keep the same level of compatibility, so that on a new version of your OS you can reuse D scripts using the same provider interfaces. 
If you are really worried about the consequences of changing APIs, what can you say about Linux? 
> In your example of ALSA, it did come with an OSS compatibility module, and the userspace components that sprang up, like ESD or PulseAudio, also provided compatibility 
The whole ESD/PulseAudio stack was a huge mistake by Linux developers, who kept refusing to accept software mixing of audio streams with different sampling rates in the kernel when hardware mixers were not available. The only argument against a kernel module doing this in kernel space was that it might potentially make the kernel unstable. The problem is that the OSS implementation proved it was possible to do this without hurting the kernel. 
If you have a look at, for example, ZFS, you can easily see that it is nothing more than a kernel-space application with direct access to many OS-layer resources. The ARC cache does not share memory with the page cache, for example, and ZFS runs dedup and compression in kernel space, and talks over the SCSI protocol to the disks to obtain data about zones with different latencies... 
Things are changing, and after careful development it now makes sense to move some parts directly into kernel space. Not too many... just a few, to keep the proper balance :) 
IIRC, the same story played out decades ago with network interfaces. In the first few Unices, the whole networking implementation was done in user space. Can you imagine that today? Now try to imagine not having the network drivers, ethernet layer and TCP stack in kernel space :) 
 
     
    
      Posted Aug 6, 2014 19:38 UTC (Wed)
                               by raven667 (subscriber, #5198)
                              [Link] (1 responses)
       
You mention changing in-kernel APIs on Linux (the userspace-facing API is not allowed to break; you can run a Linux 2.0 userspace on a modern kernel if you want). There is definitely a difference of development philosophy: instead of siloed components which make API guarantees to each other within the kernel, you have a modular system where any developer is empowered to make changes across the tree if they need to, and the system is treated as a unified whole. 
As for network stacks, they have never been well integrated into UNIX-like systems (why does networking have its own namespace and not exist in /dev, again??!), but there is research into userspace stacks again to reach the highest levels of performance (even though the Linux in-kernel stack is one of the highest-performing out there): https://code.google.com/p/netmap/ 
What's old is new again 8-) 
     
    
      Posted Aug 6, 2014 20:51 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] 
       
Yes, of course you are right, and exactly the same defensive line was used many times over the past years to refuse kernel-space mixing in Linux. 
 
     
      Posted Aug 6, 2014 21:19 UTC (Wed)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (2 responses)
       
> The same is with DTrace. DTrace uses a small in-kernel VM which executes the D code, and this VM keeps that code from doing something which might crash the kernel. This is why SystemTap is such a dangerous tool. 
     
    
      Posted Aug 6, 2014 22:16 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (1 responses)
       
Are you sure? I see many modules under a dual license. Some code is GPLv2-only. 
$ (for i in $(find /lib/modules/`uname -r` -name \*.ko); do modinfo $i | grep license; done) | sort | uniq -c 
What is the problem with keeping, for example, https://github.com/dtrace4linux/linux officially supported by Linus&co as a separate source tree? Can they not work, from time to time, on CDDL-only code? 
 
     
    
      Posted Aug 6, 2014 22:26 UTC (Wed)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] 
       
And contrary to your delusions, most developers don't want to have anything to do with Solaris.  
     
      Posted Aug 9, 2014 12:18 UTC (Sat)
                               by nix (subscriber, #2304)
                              [Link] (2 responses)
       
Further, actually getting providers to be OS-independent is really hard. Solaris DTrace tried, but there are numerous things in the allegedly-OS-independent DTrace Solaris providers that nobody else has implemented as anything but stubs on any other OS to date, because they turned out not to be that OS-independent after all, because the designers did not have perfect foresight.
 
Besides -- if DTrace hid everything about kernel-release-to-kernel-release changes it would be much less useful. Try writing the fbt provider without exposing the names of functions in the kernel...
      
           
     
    
      Posted Aug 12, 2014 0:40 UTC (Tue)
                               by kloczek (guest, #6391)
                              [Link] (1 responses)
       
DTrace providers are not translations. 
> Further, actually getting providers to be OS-independent is really hard. 
Tell that to the FreeBSD developers, who have already done this. 
> Solaris DTrace tried, but there are numerous things in the allegedly-OS-independent DTrace Solaris providers that nobody else has implemented as anything but stubs on any other OS to date, because they turned out not to be that OS-independent after all, because the designers did not have perfect foresight. 
Providers are not about abstracting OS-dependent bits. A good example of such parts is kernel-internal functions. 
> Besides -- if DTrace hid everything about kernel-release-to-kernel-release changes it would be much less useful 
No. DTrace does not hide such parts. 
Sorry to say it, but it seems you are trying to talk about something you have never used. 
 
     
    
      Posted Aug 12, 2014 10:46 UTC (Tue)
                               by nix (subscriber, #2304)
                              [Link] 
       
Your claim that FreeBSD's DTrace port proves that all providers are OS-independent is a vacuous non sequitur: one port proves nothing about the set of all possible ports nor even the set of all extant ones, and in any case the FreeBSD and Solaris kernels are related enough that you can extract things there that are quite hard to extract e.g. on Linux.
 
Your claim that 'providers are not about abstracting OS dependent bits' contradicts your own claim two posts above that 'whole idea of DTrace provider by definition allows to hide all future kernel layer changes'. Please make up your mind. In practice, it is possible to abstract away some OS-dependent features (a lot of Solaris providers tried to do this as an explicit design goal, with varying degrees of success) but some providers, like fbt, obviously can't abstract away anything or they'd be useless.
      
           
     
      Posted Aug 6, 2014 10:35 UTC (Wed)
                               by rleigh (guest, #14622)
                              [Link] (1 responses)
       
I've been trying to use Btrfs snapshots for running Debian package builds.  A brand new clean filesystem is reduced to unusability in under 48 hours.  Why?  The write load causes the filesystem to be "unbalanced" and in consequence all writes fail until you do a "rebalance", but this is only a temporary fix and it'll fail 48 hours later.  Repeat ad nauseam.  At all times the filesystem was never filled more than 10%, and was usually 1-2% full (this is being used as a scratch space for build chroots, with 8 parallel builds creating a transient snapshot and then deleting it).  Absolutely no excuse for becoming unbalanced and unusable since at all times 90% of the filesystem was free space. 
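(For anyone unfamiliar, the workaround in question is roughly this; the mount point is an example: 
$ btrfs balance start -dusage=5 /srv/build 
which rewrites nearly-empty data chunks so their space can be reallocated.) 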
This load is a bit more intensive than typical.  But it's not unheard of, and it's the filesystem's job to deal with it.  A filesystem should not fall over due to use!  The same setup on ext3 runs for years without any problems (using other snapshot/overlay mechanisms). 
You might be thinking that this won't affect you since you're not using the filesystem so intensively, but I would suspect that it's really just a question of *how long* it will take before it's unbalanced to unusability.  It could be a week, maybe several months.  But the *uncertainty* is unacceptable.  You can't even rely on a daily rebalance; an intensive thrashing could break it in under 24 hours.  This makes it unsuitable for production use: you can't guarantee it won't fail at some arbitrary point in time simply due to normal usage!  No other filesystem fails hard due to usage in this manner. 
I've not even got into the several severe dataloss incidents I've had while trialling Btrfs RAID and other features.  Those might be addressed now, but Btrfs has consistently failed in the primary function of a filesystem: data integrity.  In comparison, I've never had dataloss on ext* in 15 years, despite having failing/faulty hardware and kernel bugs to contend with. 
When comparing ZFS with Btrfs, and now having used both extensively (ZFS on FreeBSD), it's quite clear that Btrfs really doesn't match up, and won't for a long time, if ever.  It might eventually, but it's still years away and has been for so long now I've lost faith in it.  It'll take some time to regain my trust.  But not failing through usage and not losing my data are the bare minimum it would take for me to consider using it again.  "Pretty stable" just isn't good enough when it comes to filesystems; they need to be "absolutely solid". 
     
    
      Posted Aug 6, 2014 11:45 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] 
       
Such behavior will be observable for as long as btrfs uses allocation tables instead of a free list. 
This is the MAIN difference between allocating/deallocating in allocation structures vs using a free list. It is why all memory allocations are so fast (the memory allocator uses a free list). 
 
     
      Posted Aug 9, 2014 12:10 UTC (Sat)
                               by nix (subscriber, #2304)
                              [Link] 
       
     
      Posted Aug 6, 2014 11:02 UTC (Wed)
                               by intgr (subscriber, #39733)
                              [Link] (47 responses)
       
> FreeBSD uses the OSS KAPI/API. Solaris uses its own implementation of the OSS API. [...] On both systems sound just Works(tm) [...] On Linux the audio subsystem is still far from a similar state. 
You're saying that audio doesn't work on Linux with ALSA? 
> Btrfs is an FS with COW semantics, but it still does not share base concepts of ZFS like using a free list instead of allocation tables. For this reason alone btrfs will never gain the same level of scalability as ZFS. 
The other arguments about btrfs/ZFS can be hand-waved away, but this is a deeply technical claim. Please provide some solid technical evidence why one approach is inherently less scalable than the other. 
> Any larger-scale usage of Linux will sooner or later be "death by a thousand small cuts" on these foundations. 
Clearly large-scale users such as Google and Facebook are struggling and failing all the time because Linux is so bad. 
> Sooner or later even the last Linux bastion of wide-ranging hardware support will fall. This prediction is even more obvious on virtualised systems like AWS, VMware and others. A wide range of supported hardware means *NOTHING* there. 
When is your phone going to run a virtualized version of Solaris? And what will be the hypervisor, Linux? :) 
 
     
    
      Posted Aug 6, 2014 12:49 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (46 responses)
       
No. I'm saying that Linux lost its moment. The lack of a good enough working audio subsystem was one of the causes of Linux failing on the desktop. 
> Please provide some solid technical evidence why one approach is inherently less scalable than the other. 
Again: look at btrfs and ZFS and accept the fact that btrfs does not use one of the fundamental techniques of ZFS, which is the free list. 
In the past few years I remember only one crash of a system with ZFS that was not recoverable in rescue mode, where it was necessary to rebuild the system and restore the data from backup. To be honest, I don't remember exactly when it was, which is strange, because whoever knows me knows that I have a very good long-term memory. 
> Clearly large-scale users such as Google and Facebook are struggling and failing all the time because Linux is so bad. 
What do you mean by "large scale"? 
Maybe a few examples... 
You may think that computers with terabytes of memory and thousands of CPUs are a niche now. Please wait a few years... 
I don't have a crystal ball, and I'm not trying to say that I'm 100% sure the future will look like the above. 
> When is your phone going to run a virtualized version of Solaris?  
Is something wrong with iOS or Android? No. 
 
     
    
      Posted Aug 6, 2014 12:50 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] 
       
 
     
      Posted Aug 6, 2014 13:38 UTC (Wed)
                               by niner (subscriber, #26151)
                              [Link] (44 responses)
       
Meanwhile, if you need up to 64 terabytes of memory in a single-image system, you go to SGI and run an unmodified Red Hat Enterprise Linux or SUSE Linux Enterprise Server on their machines: https://www.sgi.com/products/servers/uv/ 
     
    
      Posted Aug 6, 2014 15:35 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (43 responses)
       
Did you know that this hardware is mainly used as a partitioned computer, where the applications mainly use the MPI API? 
Using more and more CPU sockets and memory banks adds huge latency compared to, for example, a two- to eight-socket T5 host, where each socket has 16 cores and each core has 8/16 threads. So in the maximum configuration you have 8x16x16=2048 CPUs. Still, only a limited number of applications and workloads are able to run across all these CPUs and all this memory, so partitioning is natural. However, in both cases we are talking about a whole service environment where the overhead is significant; running Google or FB services across the majority of such an ecosystem is not that case. 
In the SPARC T5 case, virtualization is done at the firmware level and supported by the hardware layer as well, so the virtual interconnect latency overhead will probably be hard to compare with that of the SGI machines. Or, if the interconnect latency is lower, the cost per interconnect is probably higher (power consumption as well). 
In the T5-8 case, such hardware takes 8U. 
In both cases we are talking about hardware that always runs with a hypervisor. 
Sometimes it is really cheaper to spend 0.5M bucks (per box) on a few such boxes instead of spending the same pile of money every year just on a few people's salaries to keep up and run a huge bunch of computers or hundreds of virtualized systems. 
 
     
    
      Posted Aug 6, 2014 16:27 UTC (Wed)
                               by niner (subscriber, #26151)
                              [Link] (41 responses)
       
I showed you that Linux is in use - today - on systems even larger than the ones you cited. 
So, since you lost completely and utterly on that argument, you just change the discussion to whose hardware is more efficient? And you do this by nothing but hand-waving? 
I think I'll just leave it at this perfect example of how you're just trolling and not at all interested in any real discussion. 
Even if Solaris might have some advantages somewhere, people like you keep me from giving it any thought at all. 
     
    
      Posted Aug 6, 2014 17:56 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (40 responses)
       
Yes, because the "death by a thousand cuts" syndrome will be like a few kilos of lead hanging between the legs. 
> So, since you lost completely and utterly on that argument, you just change the discussion to whose hardware is more efficient? And you do this by nothing but hand-waving? 
OK. Let's try to narrow the whole discussion down to the scale of a single host with a few tens of GiB of RAM (let's say up to 32GiB), two CPU sockets and a bunch of disks (from two up to 20-30 local disks), up to two 10Gb NICs and so on... 
Today I'm working on a few hosts, used under Linux up to now, which will be migrated to Solaris. Each host has a pair of 200GB SSDs in RAID1. I need more fast local storage, and for the time being we cannot buy new hardware. We are talking about really cheap and not even particularly fresh hardware. In this case, upgrading the SSDs to bigger ones (supported by the hardware vendor) would cost more than a Solaris license for a two-socket host. Buying new hardware would cost even more. 
I'm going to reinstall these small computers with Solaris because I found that the data used by the applications compresses on ZFS, with a 4KB block, at a compression ratio between 3 and 10 (the application with the bigger compression ratio will probably be used with the maximum 1MB block, so effectively the ratio will be even bigger). In all the Linux cases the CPU is never more than 40% saturated, and all this so-far-unused CPU power will instead be spent on compression/decompression. 
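The knobs involved are just ZFS properties (dataset names are examples): 
$ zfs set compression=on tank/data 
$ zfs get compressratio tank/data 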
I'm repeating *right now* the same thing my friend did at a scale a few orders of magnitude bigger (http://milek.blogspot.co.uk/2012/10/running-openafs-on-so...). I'm trying to save a few k quid; he saved a few million :) 
Really, sometimes Solaris, even with the license costs, can be cheaper than *free to use* Linux. 
Just try to check how many hosts in your DC have disk space problems, and how many of those hosts have low CPU utilisation. Try to imagine what you would be able to do on such boxes with a Java app/MySQL/PostgreSQL/SMTP/FooBar server if you could "transmute" CPU power into the pure gold of more disk space. 
PS. My biggest compression ratio on ZFS was a few years ago. It was about 26x, with dedup, on a 2U Dell with 6x300GB disks. The host was used to store the configurations of tens of thousands of network devices, which pushed their own config data to the box over TFTP every few hours or more often. Automatic snapshots every 10 minutes, with snapshots kept for one month (or longer). Every 10 minutes each new snapshot was pushed via "zfs send" over ssh to a second box, to keep a full standby copy of all the data. 
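That replication loop is a one-line pipe per snapshot (names and the standby host are examples): 
$ zfs send -i tank/conf@t1150 tank/conf@t1200 | ssh standby zfs recv -F tank/conf 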
     
    
      Posted Aug 6, 2014 18:18 UTC (Wed)
                               by intgr (subscriber, #39733)
                              [Link] 
       
Also, $proponent of $operating_system finds that it has lower TCO than Linux in some specific configuration. News at 11. 
 
     
      Posted Aug 6, 2014 18:19 UTC (Wed)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (12 responses)
       
In fact, no supercomputer from Top500 list uses Solaris. This alone speaks volumes. 
     
    
      Posted Aug 6, 2014 19:21 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (11 responses)
       
No... it is not the Top500 list of the biggest supercomputers. 
 
     
    
      Posted Aug 6, 2014 19:43 UTC (Wed)
                               by raven667 (subscriber, #5198)
                              [Link] 
       
     
      Posted Aug 6, 2014 21:20 UTC (Wed)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (9 responses)
       
     
    
      Posted Aug 7, 2014 0:14 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (8 responses)
       
> a CPU can be _completely_ assigned to a thread, without ANY kernelspace interrupts 
Yep, that is true, and it is true not only on Linux. 
If you expect that your computations will not be interconnect-intensive, you can build a supercomputer relatively cheaply. The problem is that in many cases you must deal with memory- or interconnect-intensive workloads. If your computations are in the interconnect-intensive area, you will definitely have many problems with locking and synchronization between threads, and here the OS may help. To diagnose such problems you need good instrumentation integrated with a profiler. Tools like DTrace can do many things here. BTW, on Linux there is still no cpc provider (CPU Performance Counter): https://wikis.oracle.com/display/DTrace/cpc+Provider 
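To give a flavour, a cpc one-liner along the lines of the documented examples (event name and sample count are from the docs): 
$ dtrace -n 'cpc:::PAPI_tot_ins-all-10000 { @[execname] = count(); }' 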
Interconnect-intensive workloads are not only an HPC problem. 
 
     
    
      Posted Aug 7, 2014 0:31 UTC (Thu)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (7 responses)
       
> However, if such a thread starts exchanging/sharing data with other threads, that workload enters an area where the bottleneck is not the CPU but the interconnect between cores/CPUs. 
> on Linux there is still no cpc provider (CPU Performance Counter): https://wikis.oracle.com/display/DTrace/cpc+Provider 
Please, at least familiarize yourself with the current state of Linux before speaking nonsense. 
     
    
      Posted Aug 7, 2014 1:51 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (6 responses)
       
Do you really think that reporting CPC register data is the same as what you can do in a few lines of a D script, correlating the CPC data with a few other things? 
Please don't take this personally, but it seems you are yet another person who does not fully understand the technological impact of the approach implemented in DTrace. 
 
     
    
      Posted Aug 7, 2014 2:20 UTC (Thu)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (5 responses)
       
> Please don't take this personally, but it seems you are yet another person who does not fully understand the technological impact of the approach implemented in DTrace. 
     
    
      Posted Aug 7, 2014 3:06 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (4 responses)
       
Did you watch The Big Bang Theory, where Sheldon explains to Penny what "just fine" means? :) 
https://www.youtube.com/watch?v=Yo-CWXQ8_1M 
Try to imagine that my reaction to the phrase "just fine" is like Penny's reaction :P 
 
     
    
      Posted Aug 7, 2014 3:16 UTC (Thu)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (3 responses)
       
     
    
      Posted Aug 7, 2014 4:15 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (2 responses)
       
Because you can do the base processing of a huge volume of tracing data in place, at the hook, using D code, instead of doing it offline. 
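The classic illustration: the counting happens in the kernel, and only the aggregated table crosses into userspace: 
$ dtrace -n 'syscall:::entry { @[probefunc] = count(); }' 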
 
     
    
      Posted Aug 7, 2014 4:18 UTC (Thu)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (1 responses)
       
     
    
      Posted Aug 9, 2014 12:13 UTC (Sat)
                               by kloczek (guest, #6391)
                              [Link] 
       
How many times did you crash a system using SystemTap? 
> And perf subsystem also supports filters. 
You don't understand where the problem is. 
Right now the PCI bus cannot handle more than about 300k ops/s, but new PCI protocols may push this to millions/s. Try to imagine how big the overhead may be after that, compared to the DTrace way of, for example, tracing IOs.  
     
      Posted Aug 6, 2014 19:18 UTC (Wed)
                               by raven667 (subscriber, #5198)
                              [Link] (25 responses)
       
     
    
      Posted Aug 6, 2014 20:31 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (15 responses)
       
You probably don't know about this, but in the last year a few of the biggest companies forbade their employees to use Linux on desktops. It was done *very* quietly. 
I'm not worried that Solaris is not as good on the desktop as Mac OS X or Windows. Really, I don't care about this. Desktops are being taken over by tablets; some number of "normal" desktops will still be used. 
> I will grant that ZFS and DTrace can be awesome but there is a lot more to an OS than the filesystem and debug framework such as performance, scheduling, IO, networking, drivers, power management, etc. 
OK, I see the glove thrown down... show me real-case scenarios. I'm not trying to say there are no such scenarios; I simply don't know of many such cases. 
If you know about any performance problem on Solaris where Linux does better, open an SR. It will be treated as a *serious* bug and the issue will be fixed in an acceptable time. 
BTW, power management: on these systems which I'm now reinstalling with Solaris, after the first reboot I had a warning that the kernel was unable to use the ACPI P-state objects. So the kernel was not able to adjust power consumption depending on load. We are talking about HP hardware with factory-default BIOS settings, so I'm assuming that probably 99% of HP hardware running under Linux is consuming more power than it could. The same probably holds for other hardware. 
Exact error line from logs on Solaris: 
Jul 24 19:08:08 solaris unix: [ID 928200 kern.info] NOTICE: SpeedStep support is being disabled due to errors parsing ACPI P-state objects exported by BIOS. 
After this, I repeated the reboot under Linux, before I changed the BIOS setting. No warnings at all: nothing strange or scary to point out that something is blocking PM under Linux. 
BTW, did you see PM in Solaris 11? Have a look at http://docs.oracle.com/cd/E23824_01/html/821-1451/gjwsz.html and please show me the same level of clarity about PM status under Linux. 
Are you sure that, if a hardware component has power management, it will be possible to change its PM settings using the same tools? 
And for the record: just have a look at the list of changes in Solaris 11.2: http://docs.oracle.com/cd/E36784_01/html/E52463/index.html 
> Work done for Android improves performance of S390 mainframes for example 
Do you mean to say that someone is still using an original S/390? 
     
    
      Posted Aug 6, 2014 22:36 UTC (Wed)
                               by mjg59 (subscriber, #23239)
                              [Link] (10 responses)
       
     
    
      Posted Aug 7, 2014 1:00 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (9 responses)
       
Of course it is able to, and that is exactly what I wrote. 
It is a matter of ergonomics, and of thinking at the development stage, not to report Everything(tm), but to report first only the crucial information, each at an exact severity level. 
Typical Linux dmesg output just after a reboot is incredibly long. Reporting everything at a high verbosity level, as Linux does, makes things like incorrect HW PM issues harder to find when you don't know what you are looking for. Initial kernel logs should be single lines per module or hardware component, with additional lines only in case of issues/errors/etc. 
$ lsmod | wc -l 
but... 
$ wc -l /var/log/dmesg 
And now dmesg on the same hardware under Solaris, where all the HW components are fully supported as well (just after a reboot): 
$ dmesg | wc -l 
Now, think about a test script fired automatically just after an OS (re)install, which should catch as many problems/issues as possible. On Solaris, in 99% of cases "dmesg | grep -i err" or "dmesg | grep -i warn" is enough. 
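A minimal sketch of such a check (the patterns are just the two greps above): 
#!/bin/sh 
# post-(re)install smoke test: fail if the kernel logged errors or warnings 
dmesg | egrep -i 'err|warn' && exit 1 
exit 0 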
The devil really does sit in the small details sometimes. 
 
     
    
      Posted Aug 7, 2014 1:35 UTC (Thu)
                               by raven667 (subscriber, #5198)
                              [Link] (7 responses)
       
*shrug*  It's probably too late now to really organize them, too much effort, possibility of breaking deployed parsing scripts for little gain. 
     
    
      Posted Aug 7, 2014 2:12 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (6 responses)
       
This is not even close to the truth. Really... :) 
Again, please have a look at the Solaris reality/practice/culture. 
What if Linus announced that at the end of 2015 a patch would be applied changing all the kernel initialization messages "en masse"? 
If people were informed long enough in advance that something was going to change, it would be possible to decide between sticking to some exact stable kernel line, or rewriting the auto-test scripts and following the latest kernel changes. Wouldn't it? 
Sometimes problems are not strictly technical, but are more about good enough coordination or planning. 
     
    
      Posted Aug 7, 2014 4:13 UTC (Thu)
                               by dlang (guest, #313)
                              [Link] (5 responses)
       
But you will probably find that it's a lot more work than you expected, just like everyone who made the claim before and hasn't followed through enough to get the changes in. 
     
    
      Posted Aug 7, 2014 4:30 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (4 responses)
       
Do you really think that, as an employee, someone pays me to be a full-time junior kernel developer? 
Serious development can be done for free only for a short period of time. 
Please don't expect that I'll be contacting all the kernel developers to agree on some few-line changes. 
     
    
      Posted Aug 7, 2014 4:43 UTC (Thu)
                               by dlang (guest, #313)
                              [Link] (2 responses)
       
     
    
      Posted Aug 7, 2014 4:53 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (1 responses)
       
 
     
    
      Posted Aug 7, 2014 6:06 UTC (Thu)
                               by dlang (guest, #313)
                              [Link] 
       
you pay your insurance company for the trust. 
companies do pay Linux developers to search for problems and fix them. 
nobody considers the kernel messages a bad enough problem to spend money on, even while agreeing that the current situation isn't ideal 
     
      Posted Aug 9, 2014 12:30 UTC (Sat)
                               by nix (subscriber, #2304)
                              [Link] 
       
Right. That's a consistent argument, that is. 
(And you don't need to 'contact all kernel developers', you just need to make the changes, post the patch to l-k, and wait for the flames. For something this bikesheddy, I can guarantee attention and flames.) 
 
     
      Posted Aug 7, 2014 5:51 UTC (Thu)
                               by mjg59 (subscriber, #23239)
                              [Link] 
       
     
      Posted Aug 8, 2014 4:59 UTC (Fri)
                               by fn77 (guest, #74068)
                              [Link] (3 responses)
       
This made my day, sorry. Speaking as one who worked as an external consultant for Sun Microsystems for ~8 years, but wearing their hat when facing customers. 
Their support was really good, and I mean it (remember explorer and such? kernel dumps?). It was really great till... exactly one year before the failed IBM deal. For those who do not know, that was before the Oracle deal, when the best of their people left. 
> BTW, power management: on these systems which I'm now reinstalling with Solaris, after the first reboot I had a warning that the kernel was unable to use the ACPI P-state objects. 
Solaris and power management? You mean the 6, yes, six connections to power supplies a SF10/15/20/25K needed? :-) 
Talking about logs, as I learnt from my friends at Sun... Read the damn logs. It's our job. Our job is complicated; that's why we get paid well. 
> We are talking about HP hardware with factory-default BIOS settings, so I'm assuming that probably 99% of HP hardware running under Linux is consuming more power than it could. The same probably holds for other hardware. 
Solaris on x86. Let's avoid this rare beast for now. 
Solaris and power efficiency. Cool; that reminds me of a time in a datacenter with malfunctioning air conditioning. 
To be fair, I see that you have difficulty expressing your thoughts in English, and for me it is the same. So, to be clear: I have nothing against you; I just want to have a nice exchange of opinions on a matter that interests me, and I guess you too. 
Frederk 
     
    
      Posted Aug 12, 2014 1:00 UTC (Tue)
                               by kloczek (guest, #6391)
                              [Link] (2 responses)
       
Sorry, but what are you talking about? 
> Solaris on x86. Let's avoid this rare beast for now. 
Why? IMO it is a good example showing that, at the moment, there is no gap here between Solaris and Linux. 
> Had to shut down ~ 4 full M9000 plus the t2k and other stuff. All done entering inside the data center like a diver and getting out without getting burned 
Again, you are talking about quite old hardware; Sun started selling the M9000 in April 2007. Try to compare this hardware with something equally old if you want to show something about PM. 
     
    
      Posted Aug 12, 2014 1:07 UTC (Tue)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (1 responses)
       
     
    
      Posted Aug 12, 2014 10:51 UTC (Tue)
                               by nix (subscriber, #2304)
                              [Link] 
       
This is the second time in a few days that kloczek has suggested that the Solaris guys had the gift of perfect foresight. I'm coming to the conclusion that kloczek speaks with great decisiveness on numerous subjects about which he(?) has very limited actual knowledge, bending all facts to the Truth that his preferred OS is the Greatest Ever. Clearly kloczek is either in management, or is a teenager. :P 
 
     
      Posted Aug 6, 2014 21:02 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] (8 responses)
       
One more time, about only this part. 
 
     
    
      Posted Aug 7, 2014 0:38 UTC (Thu)
                               by raven667 (subscriber, #5198)
                              [Link] (7 responses)
       
     
    
      Posted Aug 7, 2014 1:31 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (6 responses)
       
This is like real life. If someone breaks a leg, and the rehabilitation goes OK, they may even fully recover. Now try spending a huge part of your life walking with a few small stones inside your shoe which you are not going to remove "because you are so busy". 
In the case of btrfs, someone should really kick this FS out of the kernel tree. 
     
    
      Posted Aug 7, 2014 2:04 UTC (Thu)
                               by raven667 (subscriber, #5198)
                              [Link] (5 responses)
       
Hmm, what you are describing doesn't sound like the Linux development I read about on LWN at all. I'm not seeing a lot of makework or wasted motion in what is being applied to the mainline kernel, nor a lack of complex new functionality due to people needing to spend all their time on bugfixing. What I am seeing is a massive amount of parallel development: a lot of people running in all different directions, but each with a purpose and accomplishing some goal, like a million ants lifting and moving a city bus. 
     
    
      Posted Aug 7, 2014 2:18 UTC (Thu)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] 
       
     
      Posted Aug 7, 2014 2:50 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (3 responses)
       
Perfectly encircled :) 
About parallel development: it is not about time wasted on parallel development, but more about developing the more important things and fixing existing bugs and features (more than a decade after the first kernel patch, nfsstat is still not able to handle "nfsstat -z", which can be very frustrating sometimes). 
 
     
    
      Posted Aug 7, 2014 2:59 UTC (Thu)
                               by neilbrown (subscriber, #359)
                              [Link] (2 responses)
       
Linux nfsstat deliberately doesn't support -z.  It doesn't need to. 
Instead of running "nfsstat -z" you run "nfsstat > myfile". 
You could wrap this in a script which simulates "-z" if you like. 
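A sketch of such a wrapper, under the stated approach (the baseline path is an example): 
#!/bin/bash 
# emulate "nfsstat -z" by keeping a userspace baseline instead of zeroing kernel counters 
BASELINE=/var/tmp/nfsstat.baseline 
case "$1" in 
  -z) nfsstat > "$BASELINE" ;;         # record the "zero" point 
  *)  diff "$BASELINE" <(nfsstat) ;;   # show what changed since then 
esac 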
 
     
    
      Posted Aug 7, 2014 4:47 UTC (Thu)
                               by kloczek (guest, #6391)
                              [Link] (1 responses)
       
It is really funny, because the few-line kernel-space change needed to handle -z would probably be shorter than such a script. 
You know, sometimes it is all about trust. 
 
     
    
      Posted Aug 9, 2014 12:33 UTC (Sat)
                               by nix (subscriber, #2304)
                              [Link] 
       
 
     
      Posted Aug 6, 2014 16:30 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] 
       
Should be: 
 
     
      Posted Aug 5, 2014 15:31 UTC (Tue)
                               by Cyberax (✭ supporter ✭, #52523)
                              [Link] (4 responses)
       
     
    
      Posted Aug 5, 2014 15:40 UTC (Tue)
                               by anselm (subscriber, #2796)
                              [Link] (1 responses)
       
Presumably Jörg Schilling will be more than happy to supply a list. A very long list. Starting with »SCSI drivers«.
 
     
    
      Posted Aug 5, 2014 23:57 UTC (Tue)
                               by pizza (subscriber, #46)
                              [Link] 
       
I'm not sure if you're being serious or not... but Jörg's infamous rants were from the pre-SATA/SAS days, and a hell of a lot has changed, in both Linux and Solaris, since then. 
     
      Posted Aug 6, 2014 10:27 UTC (Wed)
                               by nye (subscriber, #51576)
                              [Link] (1 responses)
       
For my part, I really like the in-kernel CIFS server, which is dramatically higher performing than Samba, and in conjunction with ZFS offers better compatibility with Windows ACLs than I've been able to squeeze out of Samba after hours and hours reading docs, howtos, and tutorials. 
Since ZFS is natively supported it also seamlessly integrates snapshots with Previous Versions (the Windows equivalent) in a way that Samba can't do (Samba is capable of exposing snapshots as previous versions, but the implementation is so flawed that it's of little practical use). 
It's very easy to join a Solaris machine to your AD domain and have it set up in a few minutes such that Windows machines are perfectly happy to work with it as if it were one of their own. There are various tools on Linux that *attempt* to make it that quick and simple, but they have a tendency to be buggy and/or incomplete (I've not tried in about two years though; maybe the situation has improved). 
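From memory, the basic setup really is only a few commands (pool and domain names are examples): 
$ svcadm enable -r smb/server 
$ smbadm join -u Administrator mydomain.example.com 
$ zfs set sharesmb=on tank/shares 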
Finally, there are a number of other ways in which Solaris more readily integrates into a Windows domain (eg SMF services listed in Windows Computer Management console, for one). 
Overall I find Solaris to be buggy, ugly, poorly designed, filled with bizarre quirks and held barely together with string and staples, but it interoperates miles better than Linux, is easy to set up to do so out of the box, and makes a far better file server once you've got it set up and rarely need to interact with it directly. 
     
    
      Posted Aug 6, 2014 12:59 UTC (Wed)
                               by kloczek (guest, #6391)
                              [Link] 
       
Someone wise said that "in this world we only ever deal with insufficiently tested software". The above is proof that it is true. 
 
     
      Posted Aug 7, 2014 14:19 UTC (Thu)
                               by alan (subscriber, #4018)
                              [Link] 
       
     
    The 3.16 kernel has been released
      
In Solaris everything started with implementing tasks. After this, projects were implemented, as something keeping the processes of different users under one hood. With resource control this was a perfect platform, up to the moment when virtualization like Solaris Containers (aka Zones) with SMF appeared, where it was only necessary to add an additional layer of description and manageability for processes and resources. Each step was only the logical consequence of the previous one. It was an almost perfect evolutionary model...
Linux resource management and process management is still in the early stone age, where flint was only used to make some sparks, and not as a material for building more complicated tools.
Interesting: when will Linus&co "discover" copper and tin? :^)
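The progression is visible in the commands themselves (names here are examples, from memory):
$ projadd -U webuser webproj            # a project groups one or more users' processes
$ newtask -p webproj /usr/bin/workload  # a task is the unit of work inside a project
$ prstat -J                             # observe resource usage per project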
The 3.16 kernel has been released
      
That systemd provides colored output of what is started, and SMF has no output at all?
The 3.16 kernel has been released
      
I administer thousands of hosts. Sometimes tens of thousands.
Uhm, systemd does not print _anything_. It simply lets you display services' logs, even multiple log streams at the same time (quite a useful feature).
The 3.16 kernel has been released
      
The initial OSS interface was problematic because of licensing issues. At the same time the OSS API was a kind of "crystal box" definition, allowing the OSS interface to be ported to any OS.
The Linux approach was: let's throw everything in the bin and start working on ALSA. To provide some backward compatibility with the "obsolete" interface, an OSS layer was added on top of ALSA.
After this intermediate stage was introduced, many applications worked better using the OSS API over the wrapper than over the native ALSA interface. The ALSA API and KAPI were so complicated and so constantly changing that porting ALSA to *BSD was a total disaster. Linux ended up with its own API/KAPI, which is still very complicated. It provides, in theory, sophisticated functionality that is never used by anything beyond a dumb audio interface.
FreeBSD uses the OSS KAPI/API. Solaris uses its own implementation of the OSS API. In both cases the number of problems with providing even a quite complicated audio in/out interface is small. On both systems sound just Works(tm) (despite the fact that in both cases desktop usage of these OSes is shrinking, because "tablets and smartphones take it all").
The Linux audio subsystem is still far from a similar state.
A decade ago Solaris invented a new approach to the storage subsystem, integrating the block layer, volume management and the FS layer into one subsystem.
The Solaris ZFS source code was published. After that, ZFS was ported to the *BSDs and Mac OS X, enriching those systems. Base ZFS skills obtained on any of these OSes can be reused on the others. Over the last ten years ZFS has matured, and every year brings new functionality covering new needs.
How is it on Linux? btrfs development started in 2007. After seven years the stability of btrfs is not even at the level of the first version of ZFS available in Solaris Express 10. Btrfs is an FS with COW semantics, but it still does not share the base concepts of ZFS, like using a free list instead of allocation tables. For this reason alone btrfs will never reach the same level of scalability as ZFS. It will be fine with a few disks, but at the scale of hundreds of disks? Forget it. There is no equivalent of the ZFS ARC/L2ARC in btrfs. On Linux we have mbcache, which is for every FS; and if something is "for everything", in most cases it means "it is not for what I need Now(tm)".
If Red Hat or another big company does not do this, no one will, because it is quite a complicated task which cannot be accomplished by an amateur spending a few hours each week after working hours.
Why the h*l didn't btrfs copy the user-space tool syntax, replacing only the zfs/zpool command names with btrfs/btrpool? The syntax of the zfs and zpool commands is an excellent example of consistency and consequent evolution. Meanwhile, compared to the ZFS tools, the btrfs tools offer nothing but chaos.
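For illustration, the whole zfs/zpool administrative model is a handful of verbs applied to objects (disk and dataset names here are placeholders): 
$ zpool create tank mirror disk0 disk1      # a pool from two mirrored disks
$ zfs create -o compression=on tank/home    # a dataset, with a property set at creation
$ zfs snapshot tank/home@pre-upgrade        # a snapshot
$ zfs rollback tank/home@pre-upgrade        # and back again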
DTrace, its KAPI and user-space API plus the kstat interface, FMA, and many, many more.
About something like shadowfs, on Linux no one has even started thinking. We live in a time when operating on hundreds of terabytes, or petabytes, as a single dataset is more and more common, and migrating that amount of data just for a hardware upgrade is a real pain. Linux is not even touching this area, leaving large-scale needs entirely to systems like Solaris and a few others. Why? Because Linux has thousands of small problems in its own foundations. Any larger-scale use of Linux will sooner or later be a "death by a thousand small cuts" on those foundations.
The problem is that this shiny island is being eroded by waves of consolidation among hardware suppliers/vendors. Sooner or later even the last Linux bastion, its wide range of supported hardware, will fall.
This prediction is even more obvious for virtualised systems like AWS, VMware and others. A wide range of supported hardware means *NOTHING* there.
Today most customers need a database, a JVM, a search engine, a queuing subsystem or other bricks. What will it all run on? Who cares? Kernel tuning? OMG .. what is that?
On the horizon, for extreme needs, hybrid solutions using software plus hardware (ASICs or specialised hardware accelerators) are more and more successful.
In a few years the question of the OS will be more and more often irrelevant ..
Linux will never die, but the word "Linux" will be ever harder to find in a single sentence with the word "future" (https://www.google.com/trends/explore#q=linux%20future)
The 3.16 kernel has been released
      
The problem was that initially the ALSA developers said "we can do better", and because everyone was busy, no one was watching what these guys were doing. After a few years the rumors started: "Hmm .. ALSA still is not working", but because (again) everyone was busy, no one paid attention.
Let me explain. There is an old communist-era Polish joke about construction workers running very fast with empty wheelbarrows between an unfinished building and a pile of bricks. Someone walking past the construction site found it very bizarre and stopped one of the workers: "Guys, what are you doing here?". The answer: "Look, we are so busy that we have no time to load and unload the bricks".
This is exactly how Linux development has looked from the beginning. Everyone is so busy that there is no time to do proper research, plan, test the code, or even fix the bugs afterwards.
Things are changing a bit now that companies like Red Hat understand that such a methodology, or rather the lack of one, leads nowhere. But the critical mass of developers needed at RH is so big that the changes are still not fast enough.
The 3.16 kernel has been released
      
ALSA (not OSS) is a perfect NIH example. Isn't it?
> DTrace couldn't be included in the kernel due to a conflicting license.
DTrace is an open-source solution.
Who forbids (re)implementing DTrace as a general idea, with exactly the same functional user-space interface? That layer, just as with OSS, is not patented or licensed. Is it?
The 3.16 kernel has been released
      
Previously there was no problem keeping code under the CDDL license in the kernel tree.
The 3.16 kernel has been released
      
C'mon .. you could keep such code even in two different git repos on two continents. Couldn't you? :)
It is, and always was, only an ideological problem, not a technical or even a legal one.
Many things in this world do not scale linearly ..
The 3.16 kernel has been released
      
Hmm ..
I really do remember a few device drivers contributed to the Linux kernel tree by Sun employees under the CDDL or a dual license.
The 3.16 kernel has been released
      
Good luck getting about 10000 Linux source code owners to agree.
The 3.16 kernel has been released
      
Many people would even agree to pay some money to have something Working(tm).
The 3.16 kernel has been released
      
> Many people would even agree to pay some money to have something Working(tm).
There's SystemTap and perf. I've used both, they are quite nice.
Yes, and I worked in such an environment. We had to support Solaris crap because the company was paying millions to Sun every year. I'm glad this company went out of business not long before Oracle bought Sun.
Considering that some developers are dead, a couple are in prison and lots of others simply WOULD NOT sign any agreement? Yeah, it's a big problem.
The 3.16 kernel has been released
      
This license is far more liberal than the BSD license. I cannot imagine how integrating CDDL code into the Linux kernel could give Oracle *any* control here.
The 3.16 kernel has been released
      
> This license is far more liberal than the BSD license.
Did you?
The 3.16 kernel has been released
      
OK .. you are trying to tell me that "no one expects the Spanish Inquisition" .. so we should be prepared for it :)
In reality Linux still has NFS code in its kernel tree, and IIRC the patent on that protocol is still owned (now) by Oracle.
It is even better, because the whole idea of a DTrace provider by definition allows all future kernel-layer changes to be hidden :o)
So in this case, by definition, you don't need to worry .. about the "Spanish Inquisition" :)
I'm not going to add further comments here, because too many words would have to be replaced by "<censored>" :P
Because mixing could not be done in kernel space, the esd daemon kept a shared-memory block to which different apps could write, allowing esd to mix those streams. The problem was that this memory region was "naked" (no memory protection between processes). One process could trash another process's stream, and at the same time one process could quietly sniff data from a microphone initialized by another application, so it was a kind of security risk as well.
The natural consequence would have been to move software mixing into kernel space.
The problem was a kind of ideological barrier: IIRC even Linus refused the idea of mixing audio in kernel space.
Moving such mixing into kernel space would have solved the problem of processing audio data in real time.
Today esd is no longer a problem .. only because it is hard to buy a new audio card without enough hardware mixer channels. The whole user-space mixing code, however, is still part of esd :)
The same goes for DTrace. DTrace uses a small in-kernel VM which executes the D code, and this VM prevents that code from doing anything that might crash the kernel. This is why SystemTap is such a dangerous tool by comparison.
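For the flavour of it, a classic safe one-liner; the aggregation lives inside the kernel and only the summary crosses into user space when you hit Ctrl-C: 
$ dtrace -n 'syscall::read:entry { @[execname] = count(); }'   # count read() calls per program, in-kernel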
In-kernel compression can easily be turned into hardware compression.
Because it is done in kernel space, some NICs can offload many things to hardware.
The 3.16 kernel has been released
      
The problem is that in the Windows case an order of magnitude more developers were finding and fixing bugs, because from some point in desktop history sound was an inherent part of the desktop.
However, in the Windows case software mixing is/was not done using a memory block shared between applications .. and this was the funny part of such discussions on Linux forums in the past: Linux developers were unable even to copy the same way of mixing in user space, back when software mixing was still in use (which is hard to find these days).
The 3.16 kernel has been released
      
It's not a question of who _owns_ the code. It's a question of licensing: Linux is owned by about 10000 individuals and companies, yet all of it is licensed under the GPL.
I think there are plans to integrate BPF to do it. 
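Dynamic kernel probes already exist on the perf side, for what it's worth. A minimal sketch (do_sys_open is the stock example from the perf-probe docs; the typed argument needs kernel debuginfo): 
$ perf probe --add 'do_sys_open filename:string'   # place a dynamic probe on a kernel function
$ perf record -e probe:do_sys_open -aR sleep 5     # trace it system-wide for five seconds
$ perf script                                      # dump the recorded events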
The 3.16 kernel has been released
      
     99 license:        Dual BSD/GPL
     12 license:        Dual MPL/GPL
   1886 license:        GPL
     15 license:        GPL and additional rights
    130 license:        GPL v2
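For the record, counts like these can be reproduced with something along the lines of the following (the module path is an assumption): 
$ find /lib/modules/$(uname -r) -name '*.ko' -exec modinfo {} + | grep '^license:' | sort | uniq -c | sort -rn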
I'm sure that many of them have been working only on BSD code or code under the Apache license.
There is still a lot of work that needs to be done to let not only DTrace work better under Linux. If this work is not done, it will block probably every possible form of dynamic Linux instrumentation.
The 3.16 kernel has been released
      
> It is even better, because the whole idea of a DTrace provider by definition allows all future kernel-layer changes to be hidden :o)
No, it really doesn't. It allows a lot to be hidden by translators and by careful provider design, but by no means everything. (And writing translators is an exercise in pain.)
The 3.16 kernel has been released
      
Try it first, then criticize.
The 3.16 kernel has been released
      
> DTrace providers are not translators.
No. When did I ever say they were? Your arguments are so scattered and incohesive that it's hard to tell what point you're trying to make. Here, you appear to be so set on disagreeing with me that you're doing it even when I'm supporting your point. The fact remains: you can hide unavoidably-operating-system-dependent components of providers via translators. Stock DTrace on Solaris does just this.
The 3.16 kernel has been released
      
In ZFS each block of data carries a timestamp: its ctime (creation time). Consider deleting a file. In the classic model you must walk all of the blocks used by the deleted file and update each block's state in the metadata to "not allocated". In ZFS you just return the blocks to the free list; if they form a contiguous region of disk space, that can be done with a single free-list change.
Things get complicated when you must deal with snapshots. In the classic model, before you can flip a released block to "not allocated", you must check whether the block is still used by any previously created snapshot, and you do that by checking each snapshot's allocation tables. Those structures are big.
In an FS with a free list, when a block comes back all you need to do is compare the block's ctime with each snapshot's ctime. If the block was created after the most recent snapshot, no snapshot can reference it, and it can go straight to the free list: simple and easy to do. If not, you must keep it for the snapshots. Either way, no allocation table is touched. (Note the direction: a block born *before* a snapshot is exactly the one the snapshot still references.)
This is why ZFS sees no slowdown at the VFS layer as the number of snapshots underneath grows. Each snapshot's ctime is only a timestamp and can be fully cached in memory, so every snapshot operation or deallocation can be incredibly fast compared with models based on allocation tables.
Creating a snapshot? No problem. Allocate only the structure describing the current snapshot, in which one of the most important fields is the transaction timestamp.
Adding a new disk to the pool? No problem. Write a few label blocks in a few places on the new disk and add the entire disk to the free list. The disk can be 100MB or 100TB and the operation will take almost the same time.
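You can see the constant-time behaviour from the shell; both commands return almost instantly regardless of how much data sits below (pool/dataset names are placeholders): 
$ time zfs snapshot tank/data@checkpoint   # records little more than the current transaction time
$ time zfs destroy tank/data@checkpoint    # frees only the blocks this snapshot alone kept alive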
Switching from the classic approach to a free list is like switching from a steam engine to a jet engine for crossing the Atlantic. Icebergs? A few kilometers above sea level there are no icebergs .. we can go faster, straight from A to B.
This is why current btrfs development IS A PURE waste of time.
Quite. It was
> The problem is that this shiny island is being eroded by waves of consolidation among hardware suppliers/vendors. Sooner or later even the last Linux bastion, its wide range of supported hardware, will fall.
that had me realising that this guy lives in his own parallel reality with only the most tenuous connection to the one we know. 
      
          The 3.16 kernel has been released
      
How long did it take btrfs to implement block checksumming? In ZFS checksumming was there from the beginning, because the Sun guys, competing with NetApp and the storage-array vendors, understood that when you use cheap disks you must take care of data consistency. Linux has always worked the cheaper-hardware end of the market, yet never took seriously the consequences of doing business there (you must compensate with technology when doing business on cheaper, more error-prone hardware).
Using hundreds of thousands of small computers?
Please, be serious.
It is only a coincidence that the Google or FB type of business can mostly be run that way.
There are myriads of other businesses which cannot be done this way. Did you know about that?
A few months ago it was announced that Solaris now works on systems with >=32TB of memory, and a few bugs in the memory-management code related to that scale were fixed.
Again: 32TB (TERABYTES) of RAM.
We are not talking about testing. Oracle and Fujitsu are ready to deliver such hardware to your doorstep and support it.
Try to imagine the scale of the problems you must deal with when, for example, failing a service over to a new node on such hardware under a clustered hood (spending twice a huge pile of money just to have a standby may not be a good idea for everyone). Or how to reboot a system with that much memory while preserving all application and kernel-space caches (sometimes you must reboot the kernel .. sorry). This is what Oracle is working on now: how to do this in a matter of seconds, without losing speed to re-warming the application caches on the new node or on a system just after a full reboot.
Or how to add and remove an in-kernel DTrace probe when your system has thousands of CPUs: you cannot lock every CPU's activity to modify kernel code and add the hook (this problem, for example, is now fixed).
And when working with such hardware on a daily basis becomes reality, which system will be chosen? Linux, which by the time it gets there will still be dealing with the "death by a thousand cuts" in its foundations? Solaris? Or maybe something developed from scratch, because it only has to work on one or a few types of hardware? (Android is driving in a direction where no one needs to care about typical Linux problems.)
These problems are not imaginary. All of them are real .. *NOW*
So try asking: where is Linux in this picture? How is Linux trying to deal with the above?
Seriously? IMO no one is even trying to think about it, because "everyone is so busy running with empty wheelbarrows".
Did anyone say "next year will be the year of Linux on tablets or phones"? I know the answer. Do you?
The 3.16 kernel has been released
      
So this one big computer acts more like a bunch of smaller computers connected over very fast interconnects.
That is "a bit" different from running, across that many CPUs, *one* application with thousands of threads (without rewriting it to use MPI).
I don't know enough about the SGI UV, but in its case exchanging data over virtual interconnects may be no big deal, as long as we are talking about many instances of the same application attached to different sets of CPUs and memory banks, effectively exchanging data over the MPI API.
Please check how many rack units an SGI UV takes.
Try to compare prices and total machine power consumption, and calculate the CPU power per core.
However, with Solaris on a T5, if you really need to build a nest for a single application and you don't want to spend money on rewriting it to use MPI, and we are talking about processing data in parallel with a big overhead of exchanging data between threads, Solaris will be able to support this in a real single-image system.
It is yet another aspect of running a single application at big scale in a single box. If such an application additionally has huge needs at the file-system layer, probably the only answer at the moment is Solaris.
Observability and instrumentation of a system or application at such a scale on Linux? Forget about it ..
The 3.16 kernel has been released
      
"So try to ask where is the Linux on this picture? How Linux is trying to deal with above?
Seriously? IMO now one even is trying to think about this because "everyone is so busy running with empty barrels"."
The 3.16 kernel has been released
      
So let's talk about a typical "pizza box" or blade.
If we decide to use some OpenSolaris fork, the cost of the transition is only the cost of reinstalling with Solaris plus license costs. We are not talking about Oracle hardware here; the hardware is nevertheless on the official Solaris HCL.
In the Linux case I would be forced to push harder for a hardware upgrade.
The license costs, being the lower of the two, were accepted by management.
Effectively, in the end I will be working on the same hardware but with 3 to 10 times more local storage (600GB to 3TB of local SSD).
Please remember as well that the cost of bigger hardware does not grow linearly with size.
Without Solaris it would probably have been necessary to spend up to 10 times more cash on hardware alone. Full redundancy was implemented in a few lines of script -> no cost for clustering or the like. Backup costs -> the cost of a second host (snapshots on hosts + a secondary copy of all data on a standby box).
Someone may say the above could be done by a bunch of scripts compressing every new cfg file with gzip. The problem was that constantly traversing the whole directory structure almost killed this box at the IO layer when it was running Linux. Transparent FS compression can solve many problems, sometimes saving a lot of bucks/quids/candies.
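The ZFS side of that is two commands (the dataset name is a placeholder): 
$ zfs set compression=on tank/configs   # transparent compression from now on
$ zfs get compressratio tank/configs    # check how much it actually saves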
The 3.16 kernel has been released
      
Oh well, that's nearly as good as an admission of defeat.
The 3.16 kernel has been released
      
It is the top500 of HPC installations. And we are talking about a top500 of HPC installations doing calculations where raw CPU power is more important than the power of the memory subsystems or the interconnects.
If you look at the details of the equation used to calculate each installation's index, you will find that it would be *exactly* the same if all the computers were connected over RS232 serial lines.
Many of these installations compute myriads of quite small tasks. It is only the sheer number of these small tasks that sometimes makes it sensible to put everything into a straight line of rack cabinets.
I'm not saying that most such installations do this kind of thing. I'm saying that, looking only at the final index, you can say very little about where the RealPower(tm) is.
The 3.16 kernel has been released
      
The first workload will be really CPU-intensive. The second one may be very memory-intensive.
However, if such a thread starts exchanging/sharing data with other threads, the workload enters an area where the bottleneck is not the CPU but the interconnect between cores/CPUs.
Have a look at https://www.kernel.org/pub/linux/kernel/people/paulmck/pe... chapter 5.1, "Why isn't concurrent counting trivial?"
The 3.16 kernel has been released
      
Solaris and Windows use periodic ticks for scheduling. Linux can completely eliminate them.
Yes, and so? Linux supports various interconnects just fine.
Oh really? I guess I was in delirium when I read this: https://perf.wiki.kernel.org/index.php/Tutorial#Counting_...
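From that tutorial, counting is literally a one-liner (./myapp is a placeholder): 
$ perf stat -e cycles,instructions,cache-misses ./myapp   # hardware counters for one run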
The 3.16 kernel has been released
      
Please don't try to tell me that I can do the same using perl/awk/python, because it will be the same story as "Why is LTTng better than DTrace?" (just note that LTT/LTTng is dead, while next year DTrace will have its 10th birthday).
In the gawk info documentation you can find the sentence: "Documentation is like sex: when it is good, it is very, very good; and when it is bad, it is better than nothing."
perf is good, and I've been using it quite often in the last few months especially, but it is still only "better than nothing". Sorry ..
The 3.16 kernel has been released
      
SystemTap can do this just fine: https://github.com/fche/systemtap/blob/master/testsuite/s...
I've used DTrace. It's nice but not groundbreaking. And even before perf existed on Linux, I used oprofile and other similar tools to find bottlenecks in my apps.
The 3.16 kernel has been released
      
https://www.youtube.com/watch?v=_amwWlgS6LM
The 3.16 kernel has been released
      
Shuffling big volumes of data from kernel space to user space causes a kind of observability quantum effect (the observed object's state is disturbed by the observation).
DTrace is not like perf: DTrace is event-driven, with filtering and aggregation done in the kernel.
perf provides analysis tools for navigating large multi-GB traces. DTrace does not have that, because it is designed to produce concise traces. Put simply, perf is more about offline than online analysis.
So far neither SystemTap nor perf provides user-space providers.
DTrace on Linux is now able to use USDT providers.
Example: https://blogs.oracle.com/wim/entry/mysql_5_6_20_4
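With the MySQL USDT provider that looks roughly like this (probe name and argument per the MySQL probe documentation; treat it as a sketch): 
# dtrace -n 'mysql*:::query-start { printf("%s\n", copyinstr(arg0)); }'   # print every query as it starts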
The 3.16 kernel has been released
      
I have been in such situations hundreds of times.
This is why having the DTrace VM do the whole job is better.
If the instrumentation generates an event, that event must be queued, which already costs a few context switches. Taking the data off the queue adds more, and dequeuing events only to discard most of them is always a waste of CPU/time.
The 3.16 kernel has been released
      
It is like evolution: some species no longer dominate, but as long as their niche still exists they are still around, even after many millions of years.
Try raising a case in the RH bug tracker: "guys, I need something like ZFS. Can you do this for me under a standard support contract?".
How can Linux do better PM if there is no proper report, as a warning at boot time, that PM cannot be used?
After the above I raised a case for our OPS to check the BIOS settings on every next Linux reboot.
Under Solaris you can, for example, manipulate the PM of some RAID cards.
I've heard that at Oracle last year more developers were working on many a single kernel subsystem than Sun ever had working on the whole kernel. Looking at the progress of the last few years, I think it may be true.
Please .. stop kidding :)
The 3.16 kernel has been released
      
The problem is that within a few seconds of first logging in, running "dmesg | grep -i err" as my first command, I found that the factory-default BIOS settings were not optimal/wrong.
On Linux you will only see many different ACPI report lines. No errors or warnings.
It is very easy to overlook this on Linux. On Solaris it looks almost impossible to make a similar mistake.
OK, it is a small detail, but it is a very good example of how the development culture Linux lacks creates many of these "thousand cuts".
On Solaris the boot output is terse:
found A
found B
..
Let's have a look at Linux:
$ lspci | wc -l
38
$ wc -l /var/log/dmesg
770 /var/log/dmesg
On Linux it *is* really way harder.
On Solaris that is a good verbosity level. On Linux every module can report even a few pages of messages, only because *there is no standardisation* here (again the "running with empty wheelbarrows" syndrome) and no one considers that kernel messages might sometimes be useful if they followed some exact convention.
The 3.16 kernel has been released
      
If something needs to be reimplemented, the end of support for a feature is sometimes flagged long in advance.
Something like this could be done by a junior developer, as an introduction to real kernel-space development. As an exercise such a developer could even produce a good implementation of the test script. Voilà .. couldn't they?
The 3.16 kernel has been released
      
All major contributors to the kernel code are full-time kernel developers.
Code development is about money .. big money.
I don't need to wait for consistent kernel messages. I can use, for example, Solaris (a few other OSes do the same).
The 3.16 kernel has been released
      
You may pay for insurance for a long time, trusting that if something goes wrong you will be compensated.
A support fee quite often works like insurance.
The 3.16 kernel has been released
      
> So the kernel was not able to change power consumption depending on load.
BTW, remember Sun Cluster's logs? Remember the 1000 names for the same thing? Solstice DiskSuite, Solaris DiskSuite (argh .. yes, I have to deal with that sometimes even now in my job). Talking about ugly ;-)
> How can Linux do better PM if there is no proper report, as a warning at boot time, that PM cannot be used?
I had to shut down ~4 full M9000s plus the t2k and other stuff. All done going into the data center like a diver and getting out without getting burned.
BTW, what is the powertop equivalent called on Solaris? ;-)
The 3.16 kernel has been released
      
The SF10/15/20/25K have long been past EOL and EOS. IIRC none of that hardware is supported by Sol 11.
The rewritten PM is part of Sol 11.
Solaris now has base PM support implemented in a way that makes it very easy to extend across any possible type of hardware component, which is not the case with Linux.
The 3.16 kernel has been released
      
*Generally* you are right. An elephant is a very big animal, but if you try to cut its skin a thousand times, believe it or not, even an elephant can die.
The 3.16 kernel has been released
      
This too causes, in many cases, the "death by a thousand cuts" effect.
Why? Yet another metaphor:
If NASA announced a few days ago that they have started testing the EmDrive, then no one today should be thinking about using a steam engine to make Solar system exploration possible. Likewise, no one should be wasting time working on a new Linux FS if it will not use a free list and a few other new bits.
The 3.16 kernel has been released
      
> This too causes, in many cases, the "death by a thousand cuts" effect.
Admittedly, sometimes in multiple conflicting directions at once.
The 3.16 kernel has been released
      
The problem is that on Linux as a platform it is hard to find anything even close to DTrace, ZFS, FMA, zoning, or the way the whole network layer was rewritten in Solaris 10.
All these ants are not moving a big vehicle; they are more trying to borrow/collect/preserve some of the flying dust of features/ideas originally developed on other OSes. There is nothing wrong with such behavior. Sometimes an army of ants is exactly what you need.
Solaris needs some "ants" of its own as well, and it seems awareness of this fact is slowly growing again, now that Solaris is owned by Oracle.
It is more about keeping a good balance.
In the last few years I have been really frustrated by messy Linux development. Working in larger and larger environments has made me choose WhatIsWorking(tm) over what I like, and as a consequence I'm changing my mind .. starting to like WhatIsWorking(tm) :o)
Again: btrfs is the perfect example here.
The 3.16 kernel has been released
      
Then, to see the incremental statistics, use "nfsstat --since myfile".
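That is, snapshot the counters first and diff against the snapshot later. IIRC the file wants the raw /proc format, so roughly: 
$ cat /proc/net/rpc/nfs > myfile   # save the current raw client counters
$ nfsstat --since myfile           # later: print only what has changed since then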
The 3.16 kernel has been released
      
FreeBSD's netstat can do -z, Solaris can do it, AIX can do it, and Linux cannot .. total zonk =8-o
Developers trust that the clients will be able to pay for support.
How can I trust (as a client) that Linux can do something bigger if something so trivial cannot be done?
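Of course you can fake counter zeroing in user space, which is exactly the point of the complaint. A crude workaround, assuming bash: 
$ netstat -s > /tmp/baseline           # remember the counters at the "zero" point
$ diff /tmp/baseline <(netstat -s)     # later: show only the counters that moved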
The 3.16 kernel has been released
      
"However in both cases we are talking about whole service environment where overhead is significant"
"However in both cases we are talking about whole service environment where overhead on exchanging data or interconnects is significant."
The 3.16 kernel has been released
      
The problem is that talking about testing any new code, especially in the context of Linux and the Linux kernel, is a kind of taboo ("the fish rots from the head").