GAMING WITH DATA SCIENCE: Docker, future of virtualization?

Calling Docker a virtualization platform does not do it justice. It is a kind of virtualization, but it covers a much wider spectrum, from microservices to software distribution. Docker units are not virtual machines (VMs) in the same sense as VirtualBox VMs; they are called containers. Unlike a VM, which has its own guest OS, a container is more like a package of software with its own filesystem. The guest OS is missing completely: the Docker engine runs the software in the container using the host's kernel.

Docker is an open source platform to develop, deploy and run distributed applications. It is extremely useful for developers, not just administrators. I have used virtualization platforms like VirtualBox and VMware for more than a decade, and quickly saw the possibilities in Docker.

Since I often have to develop various software solutions and analysis pathways, I need a quick and reliable way to run software on various platforms in such a way that components are isolated and contained. Sometimes I need an SQL or NoSQL database on an ad hoc basis for quick data import and analysis. Docker is an invaluable tool for scenarios where you have to develop and test something without the risk of messing up other active processes.
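
As a minimal sketch of the ad hoc database case, the Python snippet below assembles a `docker run` command for a throwaway PostgreSQL container. The choice of PostgreSQL, the container name, the password and the port are assumptions picked for illustration, and the helper `scratch_db_command` is hypothetical, not part of Docker.

```python
import shlex
import subprocess
import sys


def scratch_db_command(name="scratch-pg", password="example", host_port=5432):
    """Build the argument list for a disposable PostgreSQL container.

    Hypothetical helper; name, password and port are illustrative defaults.
    """
    return [
        "docker", "run",
        "--rm",                                  # remove the container when it stops
        "-d",                                    # run in the background
        "--name", name,
        "-e", f"POSTGRES_PASSWORD={password}",   # read by the official postgres image
        "-p", f"{host_port}:5432",               # host port -> container port
        "postgres",
    ]


if __name__ == "__main__":
    cmd = scratch_db_command()
    print(shlex.join(cmd))
    # Only start the container when asked to (needs a running Docker engine).
    if "--run" in sys.argv:
        subprocess.run(cmd, check=True)
```

When the analysis is done, `docker stop scratch-pg` stops the container, and because of `--rm` it is removed afterwards, so nothing is left behind on the host.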

Docker has several advantages. It is lightweight. Containers are isolated and secure, yet they use host memory and processing resources more efficiently than a VM. Perhaps the best part is that Docker is based on open standards, and the software itself is open source and free.

What makes it even better is portability and flexibility in matters of infrastructure. When properly packed, a Docker container should run in any environment that Docker supports. The software's dependencies are packaged together with it, so Docker takes care of them wherever the container runs.
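
As a rough sketch of what "properly packed" might look like: the dependencies are declared in a Dockerfile, built into an image once, and that image is then run wherever Docker is available. The Python snippet below only assembles the Dockerfile text and the two commands. The base image, the file names (`requirements.txt`, `analysis.py`) and the tag `my-analysis` are assumptions for illustration, and `build_and_run_commands` is a hypothetical helper.

```python
import shlex

# Illustrative Dockerfile for a small Python analysis script.
# Base image and file names are assumptions, not taken from a real project.
DOCKERFILE = """\
FROM python:3
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY analysis.py .
CMD ["python", "analysis.py"]
"""


def build_and_run_commands(tag="my-analysis", context="."):
    """Return the build and run commands for the image (hypothetical helper)."""
    build = ["docker", "build", "-t", tag, context]  # package code + dependencies
    run = ["docker", "run", "--rm", tag]             # run it on any Docker host
    return build, run


if __name__ == "__main__":
    print(DOCKERFILE)
    for cmd in build_and_run_commands():
        print(shlex.join(cmd))
```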

In my opinion, Docker should be part of any Data Scientist's toolbox.

Tuesday, 24 November 2015
