
Motivation for Docker:
docker matrix from hell diagram

Docker vs Virtual Machine (NERSC Shifter slide 13):
docker vs vm diagram

Docker architecture:
docker architecture diagram


Under the hood:
docker under the hood diagram
  1. Container isolation is provided by Linux kernel namespaces.
  2. Resource restriction (cpu, memory) is governed by cgroups (control groups).
  3. UnionFS provides a unified file system inside the container, even when multiple pieces are mounted in overlapping fashion. Several implementations exist, eg AUFS, btrfs, DeviceMapper, etc.
  4. A container format is a wrapper around the components above. Modern docker uses libcontainer. Older systems used LXC, libvirt, etc.
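
The first two building blocks can be observed on any Linux host, because the kernel exposes a process's namespaces and cgroup membership under /proc. The following is only a minimal sketch: the function names show_namespaces and show_cgroups are made up for this example, and it assumes a Linux /proc filesystem.

```shell
# List the namespaces a process belongs to (default: the current shell).
# Each entry is a symlink such as "pid -> pid:[4026531836]"; two processes
# in the same namespace show the same number.
show_namespaces() {
    pid="${1:-$$}"
    if [ -d "/proc/$pid/ns" ]; then
        ls -l "/proc/$pid/ns"
    else
        echo "no /proc/$pid/ns here (not a Linux host?)"
    fi
}

# Show which cgroups the same process is accounted under.
show_cgroups() {
    pid="${1:-$$}"
    if [ -r "/proc/$pid/cgroup" ]; then
        cat "/proc/$pid/cgroup"
    else
        echo "no /proc/$pid/cgroup here (not a Linux host?)"
    fi
}

show_namespaces
show_cgroups
```

Running show_namespaces as root against the host PID of a containerized process (obtainable with docker inspect -f '{{.State.Pid}}' ContainerName) should show namespace numbers that differ from those of the host shell.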


Docker traits


Ecosystem, Competitor




Docker Hub

Docker images can be placed in a central repository. The "app store" equivalent for docker is at hub.docker.com.
This Pocket Survival Guide web site is served by an apache container with all the necessary web content in https://hub.docker.com/r/tin6150/apache_psg3/

No account is needed to pull docker images from the hub.
An account IS needed to push images to the hub.

Docker and RHEL7

Docker and CoreOS

Config


HTTPS_PROXY			# the docker daemon is said to look at this env var for proxy settings
/etc/sysconfig/docker		# docker daemon config for RHEL/Fedora
				# the proxy may need to be configured here.
/etc/default/docker		# docker daemon config for ubuntu 

/var/log/docker

Installation


### installation in ubuntu
sudo apt-get install docker.io	# "docker" without .io is for some gui docklet applet.
sudo service docker start	# start docker daemon


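docker search httpd		# quick test that the client can reach the daemon.  to allow non root to run docker commands, add the user to the docker group (see the usermod example under amazon linux below).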

/etc/default/docker		# docker daemon config for ubuntu 


### installation on amazon linux  
### http://docs.aws.amazon.com/AmazonECS/latest/developerguide/docker-basics.html

sudo yum install -y docker
sudo service docker start		# start docker daemon
sudo usermod -a -G docker ec2-user	# to allow non root to run docker command, add the user to the docker group.
					# docker commands are often run under the os-level user rather than root.

### installation on rhel7

sudo yum install docker 		# (deps: docker-selinux device-mapper, lvm2).  There are other optional tools
sudo service docker start
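sudo usermod -a -G dockerroot bofh	# the RHEL7 package creates a group named dockerroot (all lowercase)
# other setup may be needed...  docker commands still don't run as bofh ... maybe selinux stuff?
# things to check: the new group membership only applies after bofh logs in again,
# and "ls -l /var/run/docker.sock" shows which user/group is actually allowed to talk to the daemon.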
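
When a usermod change does not seem to take effect, it helps to check what the group database says. The helper below is a small sketch, not part of the original notes; the function name in_group is made up. Note that id with a user argument reads the group database, while a shell that was started before the usermod keeps its old groups until the next login.

```shell
# in_group USER GROUP : succeed if USER is listed as a member of GROUP
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}

user="$(id -un)"
if in_group "$user" docker; then
    echo "$user is in the docker group"
else
    echo "$user is not in the docker group (yet)"
fi
```

On RHEL7 the same check can be run with dockerroot as the group name, eg in_group bofh dockerroot.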

sudo docker run httpd			# will download the httpd image if not already present


/etc/sysconfig/docker			# docker daemon config for RHEL/Fedora
					# may need to config proxy here.

Basic Commands

diagram of basic docker actions and commands

docker search http		# look for images in hub.docker.com
docker pull   httpd:latest	# download the httpd image from hub.docker.com.  get the "latest" version.
				# images are stored under /var/lib/docker by default (the daemon's data directory)
			
docker images

docker run -p 80:80   httpd	# run the httpd docker image, mapping port 80 on the host to port 80 in the container (format is host:container)
				# if the image has not been pulled yet, docker run will pull it automatically
docker run -p 8000:80 httpd &	# run the httpd docker image, mapping port 8000 on the host to port 80 in the container, putting docker in the background to get the prompt back.
				# two instances can run as above, resulting in services running in parallel 
				# netstat -an | grep LISTEN will show that both port 80 and 8000 are in LISTEN state
				# note that the httpd process is visible on the host.  it is NOT like a VM, which runs all its processes in a black box.


docker run -P -d training/webapp python app.py 	# this runs a demo web app called app.py
						# -P publishes all ports EXPOSEd by the image to the host, mapping them to high host ports starting from 32768
						# -d is detached mode, ie, run the container in the background.




docker ps			# list running containers: container id, name, port mappings, etc
docker ps -l			# list the last container that was created
docker ps -a			# list all containers, including stopped ones
docker logs  c7f46acc532b	# get console output of the specified container id
docker stop  c7f46acc532b	# stop the specified container id (container nickname can be used instead)
docker start c7f46acc532b	# start the specified stopped container again (assuming the container hasn't been removed)

docker exec -it c7f46acc532b bash	# drop into a bash shell inside a running container
docker exec     uranus_hertz df -h	# run the "df -h" command inside the named container

docker port c7f46acc532b	# show port mapping of a container to its host  (docker network ls, by contrast, lists the networks themselves)
docker top  uranus_hertz	# see the processes running inside the specified container 


docker pull rhel7:latest	# get a basic, off-the-shelf container for redhat 7 (eg from rh cert guide)
docker info			# display info on existing container, images, space util, etc.

docker inspect httpd		# get lots of details about the container. output in JSON 
				# eg data volumes, (namespace?) mappings,
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' uranus_hertz	# only retrieve specific "inspection" item




docker images -a		# list images (from docker pull).  Note that a container is an execution instance, reading info from an image.

docker ps -a			# list containers (process, -a include stopped containers) (note diff vs image, which is more like a source file)



docker rm  ContainerName		# remove container (process listed by docker ps     -a)
docker rmi ImageName 			# remove image     (files   listed by docker images -a)
					# image files come from docker pull.  containers are instances of an image.

docker commit clever_shockley rhel7box2	# save changes of a running container to a new named image called "rhel7box2" 
docker commit -m "msg desc"		# -m adds a message about the commit (description for the new image)
docker commit -a "author name"		# -a adds the author's name 

docker tag ...			# give an image a repository name and tag, eg username/imagename:tag, as needed for docker push to hub.docker.com

docker login			# login to hub.docker.com ...
docker info			# version, disk usage, metadata size, etc


man docker			# docker daemon and general instructions
man docker run			# specific man page for the run command of docker 
				# (the page itself is named docker-run; man joins the two words with a hyphen)

docker 	   --help
docker run --help		# help specific to the sub command.

Under the hood

Process Isolation

  1. Each container has its own process tree, with its "seed" process (eg httpd) as PID 1.
  2. Achieved using PID namespaces.
  3. The parent (host) is aware of all child processes, but a child knows nothing about the parent -- security.
  4. The parent sees the child processes, so each child process really has 2 PIDs: one inside the container and one on the host.
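
The two PIDs can be seen from the host on kernels that report an NSpid line in /proc/PID/status (Linux 4.1 or later, as far as I know). The sketch below is illustrative only: host_pid and ns_pids are made-up helper names, and host_pid needs a running docker daemon and an existing container.

```shell
# host_pid CONTAINER : PID of the container's main process as seen from the host
# (requires a running docker daemon and an existing container)
host_pid() {
    docker inspect -f '{{.State.Pid}}' "$1"
}

# ns_pids PID : print the NSpid line for PID, if the kernel provides it.
# The first number is the PID as seen by the reader, any following numbers
# are the PIDs of the same process inside each nested PID namespace.
ns_pids() {
    grep '^NSpid:' "/proc/$1/status" 2>/dev/null || echo "NSpid not available for PID $1"
}

# Safe to run anywhere: look at the current shell itself.
ns_pids $$
```

For a container, ns_pids "$(host_pid uranus_hertz)" run on the host should list the host PID first, followed by the PID the process has inside the container.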

Network Isolation

  1. Each container has its own network stack: NIC, ARP space, IP space, routing tables.
  2. Achieved using network namespaces (see ip netns).
  3. Plumbing is done with iptables, brctl, virtual switches.
  4. Docker provides network isolation, but no throttling.
  5. 3 network methods:
    • bridged (default)
      • Each container gets a virtual ethernet interface (veth) attached to a bridge on the host called docker0.
      • All containers on docker0 can communicate with one another by default (icc=true), ie no isolation between these containers.
    • host
      • The container uses the host network stack, thus no network isolation.
      • Allows access to host services such as D-Bus, which can lead to unexpected behavior.
      • Use is not really recommended.
    • shared
      • An existing container's network stack is shared with other containers, ie the network namespace is shared.
      • As with bridged, FS and process isolation remain.

File System Isolation

  1. File system inside container is isolated from the host
  2. No disk I/O throttle, no disk quota.
  3. Union FS: DeviceMapper in Fedora. AUFS in Ubuntu.

Docker vs LXC

  1. LXC is lightweight, but heavily embedded into Gentoo Linux, not portable.
  2. Docker started on Ubuntu. Application portability was the key goal. Gentoo was providing a container for security/isolation.
  3. Both rely on cgroups and a union file system.

docker run

Stock httpd container
# run a stock container (apache httpd) on amazon linux
# no changes to the content of the container

docker run -p 80:80   -i -t --rm --name ApacheA -v ~/htdocs4docker:/usr/local/apache2/htdocs/ httpd 
docker run -p 8000:80            --name ApacheB -v          "$PWD":/usr/local/apache2/htdocs/ httpd &
	# run a process (httpd) in a new container
	# each container has its own FS, network, isolated process tree.
	# most of the default params are defined in the IMAGE, but cli args will override the IMAGE conf.
	# 
	# -p = publish a port, format is host:container.  -p 8000:80 maps host port 8000 to port 80 in the container
	# 	if not specified, default maps NOTHING, 
	# 	so nothing listens on the host's ports and clients outside the host can't reach the app inside the container!
	# -i = interactive (keep stdin open), def=false.  with -t, ^P ^Q detaches from the container and leaves it running.  cannot use with &
	# -t = allocate pseudo-tty, def=false
	# --name foo = assign a name to the container.  if not specified, a random 2_words name will be assigned
	# --rm = remove container when it exits, def=false.  
	# 	if the container is not deleted, it lingers around.  
	# 	will see them with docker ps -a.  remove with docker rm ContainerName
	# 	Note that each time a container is created, the name must not match any existing container, running or stopped.
	#	--rm is good especially for testing, to avoid having to do a lot of clean up work.
	# -v "$PWD":/usr/local/apache2/htdocs/  bind mounts the host's current dir into the container's apache htdocs dir.
	# -v or --volume=... format is host : container


docker run -it httpd  /bin/bash
	# this runs a new apache httpd container, and starts a bash shell *inside* the container env (bash replaces the image's default httpd command).
	# /usr/local/apache2/htdocs inside this container is where the web pages are served from (if not mapped with -v)
	# changes to this dir persist inside this container instance, until the container is removed with docker rm ContainerName 


docker exec -it ApacheB /bin/bash	
	# this runs the bash command in an already running container called ApacheB
	# -it makes it interactive with a tty, thus leaving one with a shell inside the container
	# ^D or exit will terminate this exec.  the container will continue to run. it is like a logout.

docker exec ApacheB ps -ef
	# will run "ps -ef" inside the container, print output, and terminate the exec
	# think of "ssh remotehost ps -ef" to run command w/o interactive shell in a remote environment

Container of OSes in Amazon Linux
docker pull ubuntu
docker run -it ubuntu /bin/bash
	# starts a bash shell that runs inside the docker process tree env
	# the run command by default attaches stdin, stdout and stderr to the console,
	# so all keystrokes pass thru (which is why hitting ^Z does not put docker in the background).
	# this was done on backbox
	# apt-get works, no aptitude (even though the host does have this command)
	# ps -ef shows only the bash process!

 
docker pull rhel7
docker run -it rhel7 /bin/bash
	# this was done on backbox
	# uname shows ubuntu, since the container shares the host's kernel
	# /etc/redhat-release exists in the FS inside the container
	# yum works
	# df -hl is very different from the ubuntu container
	# mount shows the same list of mounts as in the ubuntu container

CoreOS
CoreOS relies on docker to provide packages: there is no rpm, dpkg, yum, or apt-get.
Every CoreOS install has two boot partitions, active/standby, and an OS upgrade is applied to the standby one.
docker pull httpd
docker run -p 80:80 httpd

docker pull pdevine/elinks2     
docker run -it --rm pdevine/elinks2 elinks http://www.yahoo.com		# will start interactive browser 
docker run -it --rm pdevine/elinks2 elinks http://172.31.28.159 	# use the host's eth0 IP to access the web server running on it

Building an image using dockerfile

One way to build an image is to commit a modified container as a new image (docker commit).
The other method is to build one from scratch using a Dockerfile.
ref: https://docs.docker.com/engine/userguide/dockerimages/
mkdir  myapp
cd     myapp
vi     Dockerfile
docker build -t bofh/myapp:v3 .		# . is the build context, docker looks for ./Dockerfile there.  -t names and tags the resulting image

# optional upload to docker hub, assuming user bofh has been registered
docker push     bofh/myapp 

docker rmi      myapp... 		# remove an image from the local host.

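Dockerfile ::
# use ubuntu as a base image to build this docker image/app
FROM ubuntu:14.04
MAINTAINER user <user@example.com>
RUN apt-get update && apt-get install -y ruby ruby-dev
RUN gem install sinatra
# always run the apt-get update and install commands in the same RUN line using &&,
# or a cached (stale) "apt-get update" layer may be reused and the install can fail or pull outdated packages.
# note: a Dockerfile comment must start at the beginning of a line; a trailing "# ..." after an instruction is passed to that instruction as arguments.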
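
The workflow above can be scripted end to end. The following is a minimal sketch under stated assumptions: the build directory under /tmp and the variable name build_dir are made up for this example, the Dockerfile content mirrors the one above with comments on their own lines, and the actual build command is left as a comment since it needs a running docker daemon.

```shell
# Create a build context and write the Dockerfile into it.
build_dir="${TMPDIR:-/tmp}/myapp-sketch"
mkdir -p "$build_dir"

cat > "$build_dir/Dockerfile" <<'EOF'
# base image
FROM ubuntu:14.04
# update and install in one RUN line, so both end up in the same cached layer
RUN apt-get update && apt-get install -y ruby ruby-dev
RUN gem install sinatra
EOF

echo "wrote $build_dir/Dockerfile"
# To build the image (needs a running docker daemon):
#   docker build -t bofh/myapp:v3 "$build_dir"
```

Each RUN line becomes one image layer that docker build may reuse from its cache on the next build, which is why the update and install steps are kept together.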

Example Docker HTTPD image + psg static web content
docker run    -p 80:80   -i -t --name ApachePsg -v ~/htdocs4docker:/mnt/htdocs httpd /bin/bash
		# /mnt/htdocs will be created inside the container by the startup process

		# note inside this container, it has very few commands.  no scp, no wget.
		# run this inside the container's bash :
		# cp -pR /mnt/htdocs/psg/* /usr/local/apache2/htdocs
		# httpd 						# this starts process and return to bash

# run the following on the host (ie, from a different window)
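docker commit ApachePsg		# create a new image from changes done to the existing container (no name given, so the image is untagged)
docker kill   ApachePsg

docker run    -p 80:80   -i -t --name ApachePsg2 -v ~/htdocs4docker:/mnt/htdocs ApachePsg /bin/bash
## above didn't work: ApachePsg is the name of a container, not of an image.
## the commit above produced an image that has only an id, so docker run needs that image id (see docker images -a below).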


docker start  ApachePsg		# re-start a container using its name
docker attach ApachePsg		# attach to the container (last start with bash, so get back to a prompt)

docker export 		# Export the contents of a container's filesystem as a tar archive
docker save		# Save an image(s) to a tar archive (streamed to STDOUT by default)
docker commit 		# Create a new image from a container's changes
docker commit  -m "commitMsg" RunningContainerName  	
			# without a name argument, the new image shows up in docker images with <none> as repository and tag,
			# so it is easy to miss, and docker run has to refer to it by image id.
			# certainly, after commit, psg content remained inside the container.
			# to name the image, add the name at commit time:  docker commit ContainerName username/imagename
			# or name it afterward:  docker tag ImageId username/imagename
			# -m commitMsg is very important for identifying the image using history
			# Even after commit is done, docker ps -a doesn't show a new container id,
			# because commit creates an image saved to disk, not a running instance!
			# the new image id is printed by docker commit and is listed by docker images -a
			# Management in Docker is said to be nonexistent!  Kubernetes??

docker images -a		# see that 8e300072616a is newest image created
docker history 8e300072616a	# 8e300072616a is the IMAGE ID from the above docker commit.  
				# this cmd will show the commitMsg
				# and some info on how the image was created

docker run    -p 80:80   -i -t --name ApachePsgII 8e300072616a  /usr/local/apache2/bin/httpd
				# confirm that the new image can start


docker rename		# rename a container
rename image: there is no rename command, but docker tag OldName NewName adds a second name to the same image, and docker rmi OldName then removes the old name.

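docker login			# login to docker hub.  credentials are saved in the user's docker config file (~/.docker/config.json on current versions), base64 encoded rather than encrypted unless a credential helper is configured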

docker commit -m "apache psgIII" edb6d60c97da tin6150/apache_psg3
				# edb6d60c97da is a container id (from docker ps -a)
				# tin6150/apache_psg3 is username/imagename

docker push tin6150/apache_psg3	# push the image to hub.docker.com
				# at this point, image is uploaded to the web and ready for use by others :)
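
The commit and push steps can be wrapped in a small helper. This is a sketch only: the names run and publish and the DRY_RUN variable are made up for this example, and by default the helper just prints the docker commands instead of executing them, so it can be tried on a machine without docker. Set DRY_RUN=0 to actually run them.

```shell
# run CMD... : print the command when DRY_RUN=1 (the default here), execute it otherwise
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# publish CONTAINER REPO [TAG] : commit a container to a named image and push it
publish() {
    container="$1"
    repo="$2"
    tag="${3:-latest}"
    run docker commit -m "snapshot of $container" "$container" "$repo:$tag" &&
    run docker push "$repo:$tag"
}

publish ApachePsg tin6150/apache_psg3 dev2
```

Naming the image at commit time avoids the untagged image that a bare docker commit leaves behind. Repository names must be lowercase, even though container names such as ApachePsg may use capitals.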


docker tag e4718e38a3b1  tin6150/apache_psg_3a:dev2	# was able to tag an existing image id and push it, no new commit needed
	## push uploads every layer in the chain that makes up the image, unless the registry already has that layer
	## so a lot of data gets pushed, just as the earlier pull downloaded a lot of layers
docker push              tin6150/apache_psg_3a:dev2	# in hub.docker.com, see 81MB image.

# docker tag ref https://docs.docker.com/mac/step_six/







## on a different machine, eg centos7, can pull the image to test it out

docker run -p 80:80 -i -t --name ApachePsg_C tin6150/apache_psg3 /bin/bash
     # docker run will pull the image, and create a container to run this service.
     # dropped into bash prompt, can check the htdocs dir.  
     # httpd will start the web server


docker run -p 80:80 -i -t --name ApachePsg_C tin6150/apache_psg_3a:dev2 httpd	# when no "latest" tag exists, the tag to pull must be specified


Other more advanced topics

Docker Data Volume

A container can mount a directory of the host onto a directory inside the container (the -v option of docker run).
Multiple containers can mount the same host directory as well, but the applications need to be careful with data access and locking, or corruption can result.

https://docs.docker.com/engine/userguide/dockervolumes/

Performance

Better than a VM, as there is no hypervisor overhead. However, the best hypervisors are said to add just 2% overhead over bare metal, so beyond VM boot up time, some claim there is little room for containers to improve on. For microservices, app startup time is important, thus containers still offer an advantage over VMs.

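docker run --cpu-shares=n
	# assign a relative weight for how much cpu this container gets
	# by default all containers share cpu equally (default weight is 1024).
	# the weights of all running containers are added up, and each container gets cpu time in the ratio of its weight to that total.
	# the weight only matters when containers compete for cpu; an otherwise idle host lets any container use all of it.
	
docker run --cpu-quota=n
	# hard cap on cpu time, in microseconds per --cpu-period (default period is 100000, ie 100 ms)
	# both --cpu-shares and --cpu-quota are enforced by the kernel via the cpu cgroup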
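
The arithmetic behind the two options is simple enough to check by hand. The helpers below are a sketch with made-up names (cpu_share_pct, cpu_quota_pct); they only do the ratio calculation and do not talk to docker.

```shell
# cpu_share_pct WEIGHT ALL_WEIGHTS... : percentage of cpu time a container with
# WEIGHT gets when every listed container is busy at the same time
cpu_share_pct() {
    mine="$1"
    shift
    total=0
    for w in "$@"; do
        total=$((total + w))
    done
    echo $((mine * 100 / total))
}

# cpu_quota_pct QUOTA PERIOD : percentage of one cpu allowed by --cpu-quota / --cpu-period
cpu_quota_pct() {
    echo $(($1 * 100 / $2))
}

# three busy containers with weights 1024, 512 and 512
cpu_share_pct 1024 1024 512 512    # → 50
cpu_share_pct 512  1024 512 512    # → 25
# --cpu-quota=50000 with the default --cpu-period=100000
cpu_quota_pct 50000 100000         # → 50
```

The difference in practice: the share percentage is only a floor under contention, while the quota percentage is a ceiling that applies even on an idle host.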

Dependencies

docker run will pull whatever is needed: the image and all the layers it is built on (think of yum auto-downloading dependencies).

For multi-container apps, one way is to view them as different pieces running on multiple hosts that rely on network communication.

Linking Containers

Linking allows containers to communicate with one another. vs dependencies ?? https://goldmann.pl/blog/2014/01/21/connecting-docker-containers-on-multiple-hosts/ (maybe add diagram)

Multi-container app

Example of running wordpress in one container, and a MySQL DB on a different one. Ref: Article by Aleksander Koko
docker run --name wordpressdb -e MYSQL_ROOT_PASSWORD=password -e MYSQL_DATABASE=wordpress -d mysql:5.7	
# starts MySQL container

docker run -e WORDPRESS_DB_PASSWORD=password --name wordpress --link wordpressdb:mysql -p 0.0.0.0:80:80 -d wordpress   
# starts wordpress container
# --link name:alias creates a private network connection to the named container
#   (--link also copies ENV variables of the linked container, exposed under the alias prefix as ALIAS_ENV_abc and ALIAS_PORT_nnn_TCP)
# if not told to bind to 0.0.0.0:80, wordpress defaults to some docker bridge IP.
# use "-i -t" instead of "-d" to see the web page access log in the console.  Can then detach from the session using ^P ^Q (it will NOT put the container in paused state)
# alternatively, ^C the process.  docker start wordpress will largely resume where it left off (wordpress saves most state info to disk).


# (tested to work on amazon linux.  centos7 didn't work, maybe SELinux... maybe cuz ran docker as root.)

The above just serves as a proof of concept. It is not secure. To really run a wordpress site this way, refer to the info in docker hub.

Also, there is a more complex setup with NGINX load balancer, Redis cache: docker wordpress (mostly done via Docker Compose)

Docker Compose

Docker Compose is a declarative way of managing containers. Think Puppet, Chef and other configuration management tools, applied to docker.
eg Both examples of wordpress setup above make reference to Docker Compose.
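
As an illustration of the declarative style, the two docker run commands from the wordpress example could be expressed as one Compose file. This is a sketch under assumptions, not a tested production setup: it assumes the current Compose file format (no version: key) and the docker compose plugin, the directory name is made up, and the WORDPRESS_DB_HOST, WORDPRESS_DB_USER and WORDPRESS_DB_NAME variables are taken from the official wordpress image's documentation as I understand it, so check the image's Docker Hub page. Like the original example, it uses a throwaway password and is not secure.

```shell
# Write a Compose file describing the two containers.
compose_dir="${TMPDIR:-/tmp}/wordpress-compose-sketch"
mkdir -p "$compose_dir"

cat > "$compose_dir/docker-compose.yml" <<'EOF'
services:
  wordpressdb:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: password
      MYSQL_DATABASE: wordpress
  wordpress:
    image: wordpress
    depends_on:
      - wordpressdb
    ports:
      - "80:80"
    environment:
      WORDPRESS_DB_HOST: wordpressdb
      WORDPRESS_DB_USER: root
      WORDPRESS_DB_PASSWORD: password
      WORDPRESS_DB_NAME: wordpress
EOF

echo "wrote $compose_dir/docker-compose.yml"
# To start both containers (needs docker with the compose plugin):
#   cd "$compose_dir" && docker compose up -d
```

Compose puts both services on a shared network where each is reachable by its service name, which is why wordpress can refer to the database simply as wordpressdb without --link.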

Reference



dilbert comic uranus-hertz (download via pinterest) (I am still waiting for docker to actually name my container Uranus_Hertz, but maybe it is just a matter of time?)


The Container Ecosystem


Container in HPC

Container in HPC is quite a large topic. For a feature comparison table of Shifter vs Singularity, and a list of relevant articles, see my blogger article Docker vs Singularity vs Shifter

LBNL/NERSC Shifter


LBNL/HPCS Singularity v 1.0


Container in UGE






Copyright info about this work

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. Pocket Sys Admin Survival Guide: for content that I wrote, (CC) some rights reserved. 2005,2012 Tin Ho [ tin6150 (at) gmail.com ]

Some contents are "cached" here for easy reference. Sources include man pages, vendor documents, online references, discussion groups, etc. Copyright of those are obviously those of the vendor and original authors. I am merely caching them here for quick reference and avoid broken URL problems.


