News and updates from DockerCon 2015

July 1, 2015

This article was contributed by Josh Berkus

DockerCon on June 22 and 23 was a much bigger affair than CoreOSFest or ContainerCamp. DockerCon rented out the San Francisco Marriott for the event; the keynote ballroom seats 2000. That's a pretty dramatic change from the first DockerCon last year, with roughly 500 attendees; it shows the huge growth of interest in Linux containers. Or maybe, given that it's Silicon Valley, what you're seeing is the magnetic power of $95 million in round-C funding.

The conference was also much more commercial than the first DockerCon or CoreOSFest, with dozens of presentations by sponsoring partners, and a second-day keynote devoted entirely to proprietary products. Most notable among these presentations was the appearance of Mark Russinovich, CTO of Microsoft Azure, there to announce that Azure, ASP.NET, and Microsoft Visual Studio all support Docker containers now. This year's DockerCon was more of a trade show than a technical conference, with little or no distinction made between open-source and proprietary software.

However, there were a few good technical sessions, and the conference as a whole allows us to catch up with Docker technology and tools. Docker Inc. staff announced milestones in Swarm, Machine, and Compose, the advent of Docker Network and Plugins, and some new security initiatives. There were also some great hacking sessions by Jessie Frazelle and Bryan Cantrill. But before we explore those, it's time for a bit more container-world politics.

(As with earlier articles, "Docker" refers to the container technology and the open source project, and "Docker, Inc." refers to the company.)

Burying the hatchet

Solomon Hykes, CTO of Docker, Inc., took the stage to announce the creation of a new standard and foundation to govern the Docker container format. He said that users had said that it wasn't good enough for Docker to be a de-facto standard; it needs to be a real standard. Hykes was oddly careful not to mention CoreOS in this. It was not the last time in the conference where Docker, Inc. would respond to pressure from that competitor without mentioning it by name.

Docker, Inc. separated the runC code that governs the container format from the rest of the Docker project. The engineers were surprised to find that it was only about 5% of the total code. This is distinct from the Docker Engine, which is the daemon that manages runC containers, and remains in the Docker project under the stewardship of the company.

According to Hykes, Docker, Inc. then asked the Linux Foundation to create a new non-profit as governance for the container standard in development, called the Open Container Project (OCP). It chose the Linux Foundation because, in Hykes's words, "The Linux developers are famous for doing what's right. There's no politics involved in developing Linux, as far as I know." OCP will take the de-facto standard of runC and work on developing the Open Container Format, "a universal intermediary format for operating system containers." This is specifically not just Linux containers; the new effort aims to incorporate Illumos, FreeBSD, and Windows as well.

Will this end the divisive rivalry between Docker, Inc. and CoreOS, Inc.? Hykes hopes so. "Standards wars are an ugly, terrible thing. They're also boring," he said. He invited CoreOS CEO Alex Polvi up onto stage to shake his hand. Hykes also said that all founding members of the appc specification have been given seats on the board of the OCP. Strangely, Hykes's statement contradicts what's in OCP FAQ, which says that only two appc maintainers are included in OCP. It's unclear whether this is a policy change or a misstatement.

Aside from all of these politics, some useful technology has already come out of separating runC from the Docker Engine. In the closing session of DockerCon, Michael Crosby, chief maintainer of Docker, demonstrated an experimental fork of runC that supports live copying of running containers between machines. He and a partner showed off the feature by playing Quake in a container that they then copied to data centers around the world—while continuing to play.

New From Docker

Hykes announced that Docker is now doing "experimental" releases. The experimental branch is experimental indeed: the majority of the on-stage demonstrations of the new technology crashed and had to be shown via a video. This branch includes a number of orchestration and management features from Docker that replace or supplement those offered by third-party projects.

Ben Firshman of Docker, Inc. created Fig, a tool for deploying multiple Docker images with links between them. Fig has now been incorporated into mainstream Docker as Compose, which Firshman demonstrated. Compose uses a declarative YAML syntax in order to define and start these containers, which can then be launched with "docker-compose up". Firshman demonstrated using Compose with Docker Machine to perform "auto-scaling". In the demo, he created a web application container backed by a MongoDB container, and then used the "scale" option for Compose to deploy 50 of each container to cloud host Digital Ocean.

The demo also relied on another new project: Docker Network. Since several containers run on each server or virtual server, and containers can be migrated between servers, container-based infrastructures require some form of software-defined networking or proxies to allow services to connect with each other. Currently, users fill this need with tools like Weave, Project Calico, and Kubernetes.

Thanks to the acquisition of networking startup SocketPlane, Docker Inc. is now offering its own virtual network overlay. In the process, it completely overhauled Docker Engine's issue-plagued container networking code. The goal is that networking should "just work" regardless of how many containers you have or where they are physically located in your cluster. There are also plans to implement security "micro-segmentation" including rules and firewalls between containers.

While Docker, Inc. has been working hard to replace third-party functionality in container management and orchestration using its own open-source tools, it is also making Docker Engine more open to third-party tools by implementing plugin APIs. Launched at DockerCon Europe in December, the plugins API currently supports four kinds of plugins: networking, volumes, schedulers, and service discovery. The company plans to add additional plugin APIs in the future. Docker, Inc. partner ClusterHQ has been a large part of the design of the API that allows its Flocker plugin to work with the officially supported Docker.

Docker's general plugin approach is intended to be as flexible as possible. The API relies on Unix sockets, permitting plugins to be loaded at runtime without restarting the Docker Engine. It also claims that multiple plugins of the same type can be loaded, with different plugins being used in different containers on the same system. All applications on Docker are supposed to be able to work with all plugins, but we will see how this actually works out in practice.

Image security and the GSA

Docker, Inc. has come under increasing criticism for the inherent insecurity of Docker Hub. Indeed, one of the chief reasons CoreOS gives to use its Quay.io instead of Docker Hub is its security features, including privacy and cryptographic signing for images. While Docker Inc. is building many security features into its proprietary Docker Trusted Registry product, one security-oriented project is being added to the open-source environment: Docker Notary.

Notary was demonstrated by Docker Security Lead Diego Monica. Interestingly, instead of just Docker images, Notary is a generic system designed to validate cryptographic signatures for any type of content, just by piping the content through it. Notary will be integrated with Docker Hub in order to enforce the verification of origin on images.

A second-day keynote talk from Nirmal Mehta of Booz Allen Hamilton explained why Notary is being implemented now. With assistance from Booz Allen Hamilton, Docker, Inc. has secured a contract to implement Docker-based development systems at the US General Services Administration (GSA), which is the agency that oversees much of the US government's entire contractor budget. The GSA has long believed in "security by provenance", so Docker Hub now needs to support it.

The GSA project is intended to consolidate massive numbers of inefficient, duplicative development stacks with slow development turnaround times and, by using Docker, streamline and modernize them. Mehta demonstrated the developer infrastructure that the GSA is already testing; he showed committing a change to an application, which then initiated an automated Docker image build, testing under Jenkins, and then automatically deploying the application to a large cloud of machines. This new infrastructure, dubbed Project Jellyfish, will deploy at the GSA in July 2015.

The GSA's adoption of Docker will be interesting. Unlike startups, government agencies are usually slow to adopt new technologies, and even slower to let go of them. This move could ensure that there are significant funds and jobs in the Docker ecosystem for years to come, even if it never catches on anywhere else, since the GSA has a huge yearly budget. Its goals are also different from web startups, as its main reason to want to speed up development is to have more time for security review, Mehta said.

Hackery: Docker desktops and debugging

Aside from the keynotes and product announcements, there were some fun hacking presentations at DockerCon. Two of the best were back-to-back, showing off Docker desktop hacks and large-scale application debugging using Docker.

Jessie Frazelle of Docker, Inc. demonstrated how to put every single application on your Linux desktop into containers. After making some jokes about closet organization and the Container Store, she launched her presentation, running on LibreOffice in a Docker container. Her base desktop consists only of what she called a "text user interface"; all graphical applications (or just about anything else) run in containers, usually one or more containers per application.

Frazelle demonstrated running Spotify in a container. Other applications, like Skype and VLC, were shown running in their own containers and connecting to another container running PulseAudio, for sound. More usefully, she showed Chrome running in a container that routed all internet traffic through another container running Tor, permitting secure, anonymous browsing, something Chrome doesn't normally support. The most difficult application she put in a container was Microsoft Visual Studio for Linux: "It didn't have any instructions. I had to strace it to figure out why it was failing."

All of this requires a lot of configuration and delving into how desktop programs interact. She has to make many Unix socket files, which are used for internal communication by these desktop programs, accessible from within the Docker containers. Frazelle also has extensive, heavily edited user and application configuration files (i.e. "dotfiles") to make this all work.

On the other end of the scale, Bryan Cantrill of Joyent explained how running containers allows debugging of failed applications at scale. He made a strong appeal for developers to try to debug crashes, saying: "Don't just restart the container. That's like the modern version of 'just reboot the PC'. You're an educated person, right? You need to understand the root cause of the failure."

Joyent's main tool to do this for failures that cause crashes is the core dump. A core dump from a containerized application is easier to analyze than one from a regular server or virtual machine, since the container runs only that application and it terminates when the application crashes. Cantrill showed how Triton, Joyent's cloud container environment, automatically sends core dumps of crashed containers to Manta, its large-scale object store. He used GDB to troubleshoot one such core dump from Joyent's live environment, tracing the crash to some bad Node.js code.

Conclusion

Of course, there were many other things covered at DockerCon, including announcements and product demonstrations by IBM, EMC, Amazon AWS, Microsoft Azure, and others. Several companies explained their Docker-based development pipelines, including Disney, PayPal, and Capital One. There were also hands-on tutorials on some of the new Docker tools, such as Docker, Inc.'s beta orchestration platform built with Swarm and Machine.

One topic that was almost absent from the agenda was discussion of how to handle persistent data in containers. Aside from the Flocker project and some proprietary products from EMC, nobody was presenting on how to handle database data or other aspects of the "persistent data problem". Nor was it mentioned as part of Docker, Inc.'s grand vision of where the Docker platform is going.

In any case, DockerCon made it clear that Docker and containers are going to be a substantial part of the application infrastructures of the future. Not only is the accelerated development of projects and tools in this space continuing, usage is spreading across the technology industry and around the world. At the end of DockerCon, the company announced the next DockerCon Europe in November in Barcelona, for which registration and proposals are now open.

Index entries for this article
GuestArticles	Berkus, Josh
Conference	DockerCon/2015

News and updates from DockerCon 2015

Posted Jul 2, 2015 1:40 UTC (Thu) by b7j0c (guest, #27559) [Link] (9 responses)

Just when I thought point-and-drool cloud services were going to render most ops people obsolete, here comes a dizzying array of infrastructure gizmos to allow the formerly-endangered sysadmin to reinvent themselves as a container engineer. I'm sure once the turf wars are over and the beta-quality code is finally polished, we'll be able to click our way to completely containing the horrors of our internal frameworks in a black box hosted (err, paywalled) on docker.com.

Does anyone even know what all these things do? I tried figuring out what something like Pivotal's CloudFoundry was and all I can determine was 50k lines of something that guarantees "operational agility". Scaffolding to prop up scaffolding...

News and updates from DockerCon 2015

Posted Jul 2, 2015 5:56 UTC (Thu) by kleptog (subscriber, #1183) [Link] (8 responses)

The whole containerisation front is moving *very* fast, it's hard to keep up. We were having debates with the sysadmins about how many Dockers do you want to run on a single VM. And we were thinking about how to deal with failovers and networking and boom! we get companies like VMware and Microsoft announcing they'll run containers directly. Just when you thought of using etcd to configure everything we get Docker Compose which solves 99% of the problem. And now some networking management.

This isn't going to put ops out of a job, but it is going to change the way they look at services. Rather than managing machines they'll be managing services directly, which I think will actually make everyone happier. In a sense we're removing a layer of indirection: the VM host.

We're not going to run stuff in the cloud though, there are limits. Our customers might though.

News and updates from DockerCon 2015

Posted Jul 2, 2015 14:03 UTC (Thu) by raven667 (subscriber, #5198) [Link] (7 responses)

> Rather than managing machines they'll be managing services directly, which I think will actually make everyone happier. In a sense we're removing a layer of indirection: the VM host.

It was never anyones goal to virtualize the machine such that you needed to run nested kernels, the whole point of the kernel and memory protection is to provide separation between applications, the problem is that the state of the art of process separation lagged far behind what was needed to actually run separate programs with shared libraries on the same hardware. Now that the decade-plus long effort of adding namespaces and separation within the kernel is bearing some fruit we can remove the layer of indirection so that you have one kernel which handles both the interface for applications and the interface for hardware and has all the available information to make the best decisions on how to service the applications requests.

News and updates from DockerCon 2015

Posted Jul 2, 2015 16:43 UTC (Thu) by rriggs (guest, #11598) [Link] (3 responses)

> It was never anyones goal to virtualize the machine such that you needed to run nested kernels

Huh? How would one run a Windows OS on an Apple laptop without nested kernels? It is certainly a reasonable goal to do that. And with VMWare, there is no nesting of kernels -- just a hypervisor and non-nested OS peers. With Docker, it seems that one gives up OS flexibility for a little hardware efficiency.

News and updates from DockerCon 2015

Posted Jul 2, 2015 16:47 UTC (Thu) by jberkus (guest, #55561) [Link] (1 responses)

It's more than a *little* hardware efficiency. As an example, I can easily run four to six containers on my ultralight laptop and still do a presentation with LibreOffice slides. Whereas, if I run *one* VirtualBox VM, that's pretty much all I can run. In production, this means running 4 to 40 containers per machine, instead of 1 to 4 VMs.

So to rephrase: "giving up some flexibility for order-of-magnitude better hardware efficiency," which seems like a reasonable tradeoff. Sometimes you need a full VM, but often you don't.

News and updates from DockerCon 2015

Posted Jul 7, 2015 4:17 UTC (Tue) by Gnep (guest, #102586) [Link]

Not necessary, check out www.hyper.sh. You can certainly run hundreds or even thousands of these ultra light VMs per server.

The flexibility tradeoff is not made by Docker. It is instead container VS hypervisor. For a public CaaS platform, BYOK (bring-your-own-kernel) is necessary. Read more: https://hyper.sh/blog/post/2015/06/29/docker-hyper-and-th...

News and updates from DockerCon 2015

Posted Jul 3, 2015 2:14 UTC (Fri) by raven667 (subscriber, #5198) [Link]

> How would one run a Windows OS on an Apple laptop without nested kernels?

I'm not sure how that's relevant to a discussion about Docker which largely about servers, especially servers running Linux where it solves a software deployment problem with lower overhead than full machine virtualization solves the same problem.

> And with VMWare, there is no nesting of kernels -- just a hypervisor and non-nested OS peers.

I don't think that's how it works, the vmkernel hypervisor kernel is the primary kernel, all of the other OS kernels are subordinate to it and nested inside the interface which is controlled and provided by the vmkernel. This is highly performant in that the interface is often provided directly by hardware which has the capability to segment itself, such as an IOMMU or VT instructions and a new layer of page tables, with that segmentation controlled by the vmkernel. The vmkernel is the only kernel with a full and complete view of the hardware, the OS kernels which run under it are the only ones privy to the userspace processes and syscall API state.

> With Docker, it seems that one gives up OS flexibility for a little hardware efficiency.

Docker is only targeting Linux, and allows you to migrate from a bunch of Linux VMs on Xen, KVM, or VMware (I guess HyperV too), to running the same software on bare metal using namespaces, changing one management framework for another and removing a layer of abstraction which gets a performance benefit.

News and updates from DockerCon 2015

Posted Jul 3, 2015 10:34 UTC (Fri) by niner (subscriber, #26151) [Link] (2 responses)

It took more than a decade to add the required namespaces to the kernel, because there are so many places that have to correctly handle them. How do you think does replacing the relatively small channel between host and guest kernel by this huge attack surface impact security?

News and updates from DockerCon 2015

Posted Jul 3, 2015 11:54 UTC (Fri) by kleptog (subscriber, #1183) [Link] (1 responses)

It clearly does impact security and there's clearly going to be a lot of work on this in the future.

But there are plenty of cases where this isn't really a huge consideration. For example: when you have a large number of applications that all need to communicate with each other and use the same data. The isolation of containers is more than enough here, stricter isolation is needed when you're dealing with multiple customers or different levels of data sensitivity.

I don't think anyone is proposing replacing every VM with a container, but there are lots of situations what containers are a huge improvement on what there is now.

News and updates from DockerCon 2015

Posted Jul 6, 2015 15:43 UTC (Mon) by drag (guest, #31333) [Link]

My personal approach is to combine them.

Containers vs VMs is not a either-or situation.

When I run something like docker on my desktop I run it in a VM. Previously I would have about a dozen VMs and would have to start and stop them individually because I could only run a half a dozen at the same time at most. When I ran into applications that need lots of ram or high disk I/O speed then that reduced the amount of things I could run even further.

Now I just kick off one (relatively) huge VM with a lot of ram and CPU and then just run containers in that. It has it's own dedicated drive (USB 3) and it's own network interface separate from the one I use on my desktop. That way when I run applicaitons that have unique I/O needs then I can run them at the same time as the rest of the software I want to run in the VM.

All in all this has resulted in a massive improvement in resource utilization and just day to day ease of use.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 2, 2015 3:46 UTC (Thu) by kreide (subscriber, #4708) [Link] (26 responses)

At a recent Docker meetup we (Medallia) spoke about how we solved the persistent data in containers problem using Ceph:

http://www.meetup.com/Docker-Palo-Alto/events/222829902/

Direct link to the slides:

http://files.meetup.com/10524692/Relocatable%20Docker%20C...

Medallia and Redhat Talk Docker+Ceph

Posted Jul 2, 2015 4:51 UTC (Thu) by b7j0c (guest, #27559) [Link] (19 responses)

Thanks for the slides. Did you weigh the costs of Ceph vs just dumping objects in S3 or a similar online object storage system? These don't excel at latency and are no replacement for a filesystem but are basically unbeatable for development and operational costs, simplicity etc.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 2, 2015 17:05 UTC (Thu) by krakensden (subscriber, #72039) [Link] (18 responses)

Object storage doesn't really solve the problem of "how do we improve the way we deal with our databases", though- and they don't really provide enough features to jettison MySQL/Postgres etc.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 3, 2015 18:00 UTC (Fri) by b7j0c (guest, #27559) [Link] (17 responses)

Those are good points, but even then, I would have sought out something hosted like Aurora or Redshift. Going through these slides I felt like asking "why are you even building this?".

Medallia and Redhat Talk Docker+Ceph

Posted Jul 4, 2015 0:08 UTC (Sat) by Lennie (subscriber, #49641) [Link] (16 responses)

Because if you choose to use Aurora or Redshift you kind of glued your application to AWS instead of making your application more portable.

Aurora and Redshift are very MySQL and PostgreSQL based, but still you are putting someone else in control of how you manage your data or what features are available.

Can you test it on your laptop with regular MySQL and PostgreSQL, does that mean it will run the same on AWS ?

What if you want to move to an other provider ?

Maybe not a cloud provider but a hosting provider (or your own datacenter) because you want to run it on some dedicated servers because that's actually cheaper.

No, long term the right way is probably to deploy some containers which manages itself (like an appliance):

https://flynn.io/docs/postgres#design

I personally think solving this problem at the storage layer is the wrong place.

It's better to deploy some containers on multi servers and re-configure the database-server to replicate between them.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 6, 2015 2:28 UTC (Mon) by b7j0c (guest, #27559) [Link] (15 responses)

Its true that hosted cloud services can be viewed as a "roach motel" so to speak, but no more so than Linux, Postgres or any other technical commitment you make, modulo the monthly fees. Access to source code really isn't an issue, most organizations are not going to be forking their own kernel or RDBMs for a variety of reasons.

In coming years, those who take the time to "grow their own" starting from bare metal will find that their target markets have been staked out already by time their architectures are ready to accept business....by those who simply bit the bullet and let a cloud provider run the commodity functionality so they could get to market faster. Being portable doesn't matter much when you're irrelevant.

Many companies that "roll their own" don't have a NOC, don't have failover, don't have 24x7 ops staff in all regions, don't have redundancy, etc etc. They end up with worse security too, because security is often reactive and without a security-focused NOC, they're vulnerable.

What do you do when your storage maxes out and the vendor can't ship new units to you in time? AWS will never tell you the database is full.

At this point I would only advocate roll-your-own to hobbyists or those in noncompetitive/solved markets that don't have realtime ops or scalability requirements. Docker in my mind is just a speed bump...it only isolates your complexity, it doesn't reduce it. Hosted services will make most of this tooling irrelevant in a few years as people realize that being able to turn on a infinitely-scalable DB in thirty seconds (vs growing a bare-metal architecture over months) far outweighs the issues of lock-in.

It would be interesting to see a real case study on the real costs and benefits of growing a bare-metal architecture vs hosted services. Factoring in time-to-market, ops obligations, opportunity costs of capital (everyone and every dollar committed to building a bare-metal architecture could have been instead focused on real business objectives), etc, I would assume a project like the one described in the slides is more expensive than a hosted to the order of millions.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 6, 2015 17:17 UTC (Mon) by drag (guest, #31333) [Link] (9 responses)

You are kinda contradicting yourself,

You say things like "cloud provider run the commodity functionality"... but proprietary databases are _not_ commodity. Things like MySQL or PostgreSQL are commodity functionality. Not until you can provide the same APIs for 'redshift' yourself using off the shelf open source software can it be considered commodity functionality.

If you don't mind tying all your containers into a proprietary solution then that is fine, but if you are actually talking about commodity stuff then that means APIs and software that _any_ cloud provider can provide.. including yourself.

> What do you do when your storage maxes out and the vendor can't ship new units to you in time? AWS will never tell you the database is full.

You can take that into account with a 'hybrid' approach.

Using a public cloud or private cloud or just a Linux server running containers at home... these things are not mutually exclusive things. None of this stuff is either-or. It only becomes either-or if you choose let your applications depend on proprietary features.

You use software, databases, and APIs that are available in both cloud providers and at home. Take 'S3', for example. You can get that at Amazon, but you can get that by using Swift. That means as long as your application sticks to the subset that Swift can provide then you can use _ANY_ cloud provider you want. You are not stuck using Amazon. If your external network dies then you don't have to send everybody home. It's all still there locally.

That way you can run your own servers locally. Keep them pretty much tapped out at 80-90% capacity and when you need more capacity you just rent it. Believe it or not you can do it cheaper locally if you are smart about it. You use whitebox servers and whitebox switches.. you can use the same stuff the 'big boys' use.. they don't have a exclusive lock on this stuff. You can't do it, of course, if you are running something like 'Cicso Blade Servers' or some such nonsense (of course.). That way you don't have the pay the profit overhead of having Amazon do it for you and you still don't have to pay for unused, redundant, and 'just in case' capacity yourself.

If you use a RDBS then keep a Amazon VPS or Rackspare or whatever active and have it replicate the stuff you have on site. If some other 'clouder provider' has some fire sale on capacity you can take advantage of it instantly without having to bother your programmers or change anything in your business. Just get a account and deploy.

If you tie yourself down to proprietary Amazon services that can't be replicated by other people or yourself using open source software then you are locked in.

Amazon and other people do have massive economy of scale, but so does open source (when it's done correctly). 'Distributed' software shouldn't just mean 'one cloud vendor'. It should really be distribute-able across _anything_ you want.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 6, 2015 21:38 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

> You use software, databases, and APIs that are available in both cloud providers and at home. Take 'S3', for example. You can get that at Amazon, but you can get that by using Swift. That means as long as your application sticks to the subset that Swift can provide then you can use _ANY_ cloud provider you want.
In theory. In practice it doesn't work this way.

All the object storages have some ...erm... peculiarities. For example, S3 is very 'eventually consistent' and if you think that you you'll be able to access an object immediately after you write it, then you'll get some nasty surprises. And they are especially nasty because they happen only during high load.

Then there's a question of performance. Individual S3 connections are fairly slow (2-10 megabytes per second, top) but you can have literally thousands of them and they scale almost indefinitely. You can have a cluster of Hadoop servers hammering the S3 storage and it will work just fine. If you try the same pattern locally - you're going to be surprised.

Then there's a question of metadata. While GETs and PUTs on Amazon S3 are insanely fast, metadata operations (LIST) are limited to about 100 per second before you are throttled. Again, that's not something that you'd expect from a single server with an SSD.

So making abstractions at the cloud API is a doomed enterprise.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 7, 2015 3:20 UTC (Tue) by b7j0c (guest, #27559) [Link]

No one does Lists on S3 as a practice, the pain of doing so is spelled out even in the introductory materials. You keep a reference to your objects in a database.

S3 on the other hand has many more '9's in its uptime history than hardware most people manage themselves, and it is insanely cheap, it is one of the few "no brainer" AWS services, which is why so many people use it.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 9, 2015 19:56 UTC (Thu) by drag (guest, #31333) [Link]

> For example, S3 is very 'eventually consistent'

yeah.

All distributed systems have their limits.

As they usually say, you have three choices (something like):

1. Speed

2. scalability/distributive-ness...

3. consistency

You get to pick two.

Things like S3 and Swift sacrifices consistency for the ability to distribute data globally and still have it reasonably fast. So they depend on 'best effort' or 'eventual' consistency. This is great for uploading images of kittens, or containers. javascript libraries for the web app you are going to release next week, but it's lousy if you want to be able to mount it and use it as a POSIX-like file system. On the upside they distribute stuff globally and can optimize things so that people fetching data will go to the closest/most economical servers first.

Which is why you still have a place for things like Ceph, that have a high degree of consistency. So much so that you can run a conventional POSIX file system on top of them. It's not something you can 'naturally' extend across multiple datacenters, but if you want a way to take advantage of a bunch of cheap JBOD arrays to share VMs across a few hundred nodes in one datacenter then it's going to work much better then most.

Ceph offers a S3-style API for itself, but I feel that it would be silly and expensive to try to use it beyond something small. You would get the the worst of both worlds. You wouldn't be able to take advantage of Ceph's consistency guarantees while at the same time you couldn't take advantage of the ability to distribute S3 cheaply.

So as long as you combine like-for-like and the API is good then application developers shouldn't run into much problems.

It's like: even though Ext3 and ZFS are radically different under the hood they still conform successfully to most POSIX conventions. At least enough that for most purposes it doesn't require any re-programming of applications.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 6, 2015 21:40 UTC (Mon) by Lennie (subscriber, #49641) [Link]

drag has a great reply, those are the kind of things I would have mentioned. But I just know what b7j0c is going to mention one part again: security

But I think that containers are a great way to have better security too, because of better componentization.

Why ? Because you are only running one process/application (in case of application containers like Docker and trends like microservices). This really reduces the attack surface over the network. Because these are single purpose applications we (community) have the potential to allow these processes to only do what they are supposed to do.

Remote shell ? There is no shell in your container, maybe even most syscalls aren't even allowed if we create good profiles.

Just like Docker-like application containers can make deploying easier it has the potential to make doing security easier.

An other reason is, if you have a Docker container with PostgreSQL (possibly replicated to multiple machines) how many people will have to make a security profile for running such a service ? Not many if we have a way to share them.

The Docker Security team also talked about this in I believe this video: https://www.youtube.com/watch?v=8mUm0x1uy7c

Those 2 came from Square and as part of the security team there they tried to make deploying security easy:

https://www.youtube.com/watch?v=lrGbK6fE7bI

One of the things they did was mutual TLS auth between all services and frequent key rollovers.

I guess it's their attempt at the OpenBSD motto: secure by default.

Something else I like is some one the persons working on the HP Openstack Cloud (including Public) said:
How do you mean log into a machine ? You can't. It just sends logs with errors to a logserver, there is no remote login.
These are all machines which are deployed and configured with automated tooling (like configuration management). At install time (actually they are using imaging) it pulls the machine configuration information over the network and configures itself.

If you look at for example RancherOS, they are running nothing on the host. It's all containers.

I think there is a lot of potential to get it right. Will we end up at that place... I don't know.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 7, 2015 3:16 UTC (Tue) by b7j0c (guest, #27559) [Link] (4 responses)

> You can take that into account with a 'hybrid' approach.
> Using a public cloud or private cloud or just a Linux server running containers at home... these things are not mutually
> exclusive things. None of this stuff is either-or. It only becomes either-or if you choose let your applications depend on
> proprietary features.

I am fully convinced that hybrid clouds are a loser solution. None of the "big three" (AWS, GCE, Azure) are emphasizing these for good reason: most people don't want them and they are both an architectural and business dead-end. RedHat is promoting hybrids partially because its all they *can* do...they simply don't have the resources to go build dozens of giant data centers.

Hybrid clouds just move your complexity around and mutates it by coupling it to new, immature tools (which Docker is one). What do you get? You've got the "lock in" you dread by using Openshift or whatever, yet you also have to provision a seven figure budget for ops and hardware, and you'll still probably do a crappier job than AWS.

I'd rather just roll-my-own than deal with something like OpenShift etc...I have zero interest in dealing with tens of thousands of lines of alpha quality code, open source or not.

Thats not to say AWS, GCE and Azure aren't without flaws, but the biggest danger so far is that they could gouge you once they have all your data and operations internalized. So far competition is keeping that from happening, and frankly all of these vendors realize that to abuse customers at this stage of the market would be suicide.

Indeed if you look at some of the stuff AWS is doing with Lambda and Kinesis...I don't even think it is possible for a single entity to roll something out like this...the only way you are going to get stuff like that is with a big bet.

> If you use a RDBS then keep a Amazon VPS or Rackspare or whatever active and have it replicate the stuff you have on site.
> If some other 'clouder provider' has some fire sale on capacity you can take advantage of it instantly without having to
> bother your programmers or change anything in your business. Just get a account and deploy.

Why? You can usually just pick up the phone and have your AWS rep start discounting you. Its easier than playing roulette with your business by hunting around for cheap VPS hosters.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 9, 2015 20:04 UTC (Thu) by drag (guest, #31333) [Link] (3 responses)

> Hybrid clouds just move your complexity around and mutates it by coupling it to new, immature tools (which Docker is one). What do you get? You've got the "lock in" you dread by using Openshift or whatever, yet you also have to provision a seven figure budget for ops and hardware, and you'll still probably do a crappier job than AWS.

If you need a seven figure budget for running your own cloud you'll need a seven figure budget for running AWS... if you are doing your job right. Amazon isn't doing anything magical. Their advantage is their experience and they are doing it at a massive scale. They still need to turn a profit and they are not going to be able to get access to any hardware or resources that you can't get yourself through other means.

It's not a slam-dunk either way. Going public cloud doesn't make expenses go to zero.

> Why? You can usually just pick up the phone and have your AWS rep start discounting you.

And when the AWS rep looks at your account and sees that your organization has coded your business around services that you can only get from Amazon... how well do you think negotiations are going to work out for you?

If he knows what he is doing any sort of discounts you get is going to be related directly to the expense of porting your applications to a different platform.

> Its easier than playing roulette with your business by hunting around for cheap VPS hosters.

Making a phone call is certainly easier, but your success with negotiations is going to be directly related to how much or how little risk you have by switching away.

What happens when you call your bluff?

You: "Hey AWS rep, so-and-so is offering the their entitlements at 1/2 the cost of yours. My boss was looking to move away, but I like you guys... what can you offer me?'

AWS: "Tell your boss: Good luck with that"

Medallia and Redhat Talk Docker+Ceph

Posted Jul 9, 2015 23:20 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

> If you need a seven figure budget for running your own cloud you'll need a seven figure budget for running AWS...
It's very hard to run reliable infrastructure cheaper than Amazon. If you need a VPS for a private website that you don't care about, then you can get something cheaper. If you need a single dedicated server for some important but not uptime-critical tasks, then you can easily do cheaper than AWS.

However, if you need a redundant infrastructure and/or multiple servers then AWS becomes hard to beat. If you add spot instances to the mix, it becomes downright the cheapest option.

> You: "Hey AWS rep, so-and-so is offering the their entitlements at 1/2 the cost of yours. My boss was looking to move away, but I like you guys... what can you offer me?'
Then it's quite possible that $OTHERCOMPANY is either going to take a loss to get the client. AWS sales reps are actually quite good at offering constructive solutions, like using EC2 Spot or buying reserved instances.

But in the worst case, if you have a seven-figure-per-month cluster then you probably can afford the cost of engineering time to move it to one of the competitors. Microsoft and Google have services that are comparable to Amazon's.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 9, 2015 23:45 UTC (Thu) by dlang (guest, #313) [Link] (1 responses)

> It's very hard to run reliable infrastructure cheaper than Amazon.

not really, you just need to have a lot of bandwidth usage to drive you aws bill through the roof.

Amazon is not using magic. There is some economy of scale involved in their purchasing, but it's really easy to spend a lot more money hosting things in aws than hosting them yourself.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 12, 2015 1:12 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link]

> not really, you just need to have a lot of bandwidth usage to drive you aws bill through the roof.
The only major use-case that you can't easily fix is the price of outgoing data. It's fairly expensive, while incoming data is free. Also, intra-zonal traffic is free in AWS, Google Compute and Microsoft Azure.

> Amazon is not using magic. There is some economy of scale involved in their purchasing, but it's really easy to spend a lot more money hosting things in aws than hosting them yourself.
It's easy to do it, especially if you are going stupid stuff. If you're using AWS smartly - it becomes VERY hard to beat.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 8, 2015 0:51 UTC (Wed) by Lennie (subscriber, #49641) [Link] (4 responses)

Something else I didn't say.

The 3 major cloud providers and some of the larger hosting providers are all from the US. And I'm not from the US. This makes me a foreigner.

You know what that means ? It turns out I have none of the rights you might have (my guess is you are a US citizen):

http://media.ccc.de/browse/congress/2014/31c3_-_6195_-_en...

Maybe it's the land of the free to you, but I don't just don't have the same rights.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 8, 2015 19:26 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

All of the hosting providers have datacenters around the world.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 8, 2015 19:32 UTC (Wed) by Lennie (subscriber, #49641) [Link] (2 responses)

Do you mean all the large US providers have datacenters around the world ?

Why do you think that matters ?

The only thing that matters is that they are US companies not where the data is kept.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 8, 2015 20:21 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

That is debatable. Microsoft is fighting the US government: http://www.zdnet.com/article/microsoft-fights-us-effort-t...

This case is still in the court system.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 8, 2015 22:34 UTC (Wed) by Lennie (subscriber, #49641) [Link]

I really doubt they'll win.

It's not completely the same situation as 'cloud computing' so some other laws are used but.

There are laws in the US that clearly state, these rights only apply to: US citizens in the US.

Basically saying: everyone else is outlawed.

How about a little clip straight out of congress ?:

https://www.youtube.com/watch?v=ijr0E6Lw4Nk#t=15m00s

Medallia and Redhat Talk Docker+Ceph

Posted Jul 2, 2015 16:52 UTC (Thu) by jberkus (guest, #55561) [Link] (5 responses)

Hey, thanks for the link.

There was a presentation on Ceph at CoreOSFest, too, but I was kinda disappointed in it. I'm hoping that you can answer a couple questions that speaker didn't have time to take:

* What about running Ceph itself in containers? Given the complicated setup of a new Ceph cluster, it seems like containerization would really make it easier to deploy.

* You mention databases in your slides. But clustered object stores, as a rule, really really suck for small frequent writes. How does Ceph handle this?

Of course, if you can do a San Francisco Postgres or Docker meetup, then I'd be glad to watch you present and answer my questions in person.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 3, 2015 2:14 UTC (Fri) by thorvald (guest, #103394) [Link] (4 responses)

We (also Medallia) have Ceph monitors in docker; we want the easy relocation without updating ceph configs. For the OSDs, we run them directly on bare metal; we have their installation provisioned as part of the installation automation.

We run PostgreSQL on top of the rbd block devices that Ceph exports, and Ceph does partial block updates really well; we're seeing "all-local-SSD-array" equivalent performance on database transaction benchmarks. Our Ceph cluster is 64 nodes, each with 8 SSDs though, so it's not an entirely fair comparison.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 4, 2015 6:05 UTC (Sat) by jberkus (guest, #55561) [Link] (3 responses)

Huh! Do you have any online numbers for that cluster?

And, if you can do a Postgres meetup in the Bay Area, I think we'd be interested ...

Medallia and Redhat Talk Docker+Ceph

Posted Jul 6, 2015 17:39 UTC (Mon) by drag (guest, #31333) [Link] (1 responses)

I suspect that unless you have 10GbE for your ceph storage network running any sort of RDBS on it is going to be pretty painful.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 7, 2015 20:38 UTC (Tue) by thorvald (guest, #103394) [Link]

We do 40GbE end-to-end, so we peek out around ~4.5 GByte/sec of "disk" I/O.

Medallia and Redhat Talk Docker+Ceph

Posted Jul 7, 2015 20:50 UTC (Tue) by thorvald (guest, #103394) [Link]

I'll go rerun our benchmarks, my notes here say "~6000 tps", but doesn't include the parameters for filesystem or versions, nor which transaction is actually benchmarked. My bad for not taking proper notes.

We'd love to present this at the PostgreSQL meetup; what's the best way to get that going?

News and updates from DockerCon 2015

Posted Jul 4, 2015 0:15 UTC (Sat) by Lennie (subscriber, #49641) [Link]

If people want to seen more, all the videos are on their YouTube-channel:

https://www.youtube.com/user/dockerrun/videos