Know Your Binaries

It’s pretty hard to be clever if you’re an executable file. Understanding how your programming interpreter views the world can save you days of development time and release you from the perils of virtual environments.

After you hand your interpreter code, you will likely be including libraries. The PATH is what will be used to find available modules and code.

Let’s check it out in Python.

1
2
> \>> python -c "import sys; print(sys.path)"
['', '/usr/lib/python310.zip', '/usr/lib/python3.10', '/usr/lib/python3.10/lib-dynload', '/home/trevor/.local/lib/python3.10/site-packages', '/usr/local/lib/python3.10/dist-packages', '/usr/lib/python3/dist-packages', '/usr/lib/python3.10/dist-packages']

Some quick things to notice.

  • Python will crack open zip files and use them just like they are directories. Awesome.
  • /usr/lib/python3.10... python ships with some core libraries, and those get stored in a shared part of the file system, available to more than one user.
  • /home/trevor... libraries get installed in your user’s home dir – best choice yet.

Now let’s go custom.

1
PYTHONPATH=.:.venv

I like to live dangerously and include the present working directory, ., and if there is something in the folder named .venv it’s always something I will want to include. venv works too, but conflicts with python’s venv package setup.

Between these directories and my environment variables I can completely control the state of my environment. If I want to install a lib for all of my apps, go for it: python -m pip install {lib} – I never want to do that. Maybe I’ll do it begrudgingly if there is a command line executable I want available.

When I want to install a module for one project (which I always do, hard disk is cheap), I install to .venv.

For pip this is the -t argument.

1
python -m pip install -t .venv {lib}

And that’s pretty much my virtual environment. You don’t have to activate, or deactivate anything. There is no state being changed in my environment to confuse me. Every time I run python everything works exactly as I expect it to. If something was misbehaving I have full competence over my environment and would be able to debug it quickly.

Is it inconvenient to remember the -t flag, maybe. I place a Makefile in my project and I haven’t typed a pip install since. The Makefile I use is idempotent as well, so I can test, run, or serve and it will only install or update packages once and exactly when it needs to.

This doesn’t solve running multiple versions of your interpreter. I download mine individually at this point. It’s pretty hard to get organizations to keep bleeding-edge on interpreters, so this hasn’t been a problem for me. Once again, the binary I’m using gets checked into the Makefile. If I’m testing cross binaries, I use a framework, like tox. If I’m running something, I like being explicit with what I’m using.

On Developing Kubernetes Locally

Kubernetes has been a large force in the app deployment industry for some time now, and this article is by no means the authority on Kubernetes. I’ve personally have had some stop and go with adoption, and knowing a technology “enough” can lead to bad practices. This article will walk you through the basic logistics of developing for K8s on your machine without “experimenting in production.” The goal of this article is to note commands for new k8s developers and to demonstrate validating k8s manifests locally in order to cut down development time.

We’ll create a Kubernetes Cluster locally on our machines which will let us test our code more quickly, and give us a space to experiment and make bad decisions.

Creating a Local Kubernetes Cluster

We’ll use minikube to develop on Macintosh. MicroK8s may be a good option for linux. In both cases we’ll need to have docker installed as well.

If you’ve installed everything correctly you can run:

1
minikube start

Perhaps you’ll note your laptop’s fan trying to take off while you’re regaled with emojis. Hopefully we’ll see the following surfer:

1
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

This is important to note. kubectl is the command we’ll be doing work with, and they were thoughtful enough to reconfigure our environment to point to the minikube cluster to get started. You may eventually have many clusters you’ll want to connect to (~/.kube/config is a good place to save them) and you can point kubectl to different clusters using environment variables.

Connecting to minikube

You have your own kubernetes cluster! You’re free to use this as you will. kubectl cluster-info is a good sanity check.
You’ll probably use the following commands useful for running kubernetes manifests you’re working on.

1
2
3
4
5
6
7
8
9
10
11
# deploy your deployment/service/pod...
kubectl apply -f path/[file.yml]

# undo
kubectl delete -f path/[file.yml]

# or if you've made a mess, and just want to kill stuff.
kubectl delete [pod|deployment] name-to-delete

# process list
kubectl get [pods|deployment]

If you want to check the syntax of manifest files you’ve written, consider the following. Checking your syntax on a local cluster without bumping into all your coworkers is already a big win. You’ll have reliable quick results without having to a server.

1
2
# this checks the `manifests` folder
kubectl apply -f manifests --dry-run

docker images in minikube

You’re eventually going to want to use a docker image you have built. Unfortunately minikube is running in it’s own container and your local machine’s images won’t be available there. It will probably make sense to share a docker registry between the two hosts, but a quick solution is to just build the docker images where you want them (in minikube.) Your code and local files should all show up in the minikube container.

You can change your shell’s environment to point to the minikube docker:

1
2
3
eval $(minikube docker-env)

docker build . -t your-tag-name # this will now build to your minikube repo

NOTE: You can always ssh into the minikube instance with minikube ssh if you want to tinker.

image pull problems

If you’re having problems with pulling docker images and you already have them locally, you can use the imagePullPolicy configuration as IfNotPresent or Never.

Otherwise updating .docker/config.json with your private registry may help.

connecting to your services

minikube is going to require an extra step to provide service access to your host machine. AKA: If you want to connect to your web service from your browser, you’ll have to do this.
If you have set up a k8s service, you can run minikube service {service-name} and a tunneling process will start up (and maybe open your browser!)

If you are using LoadBalancer, minikube tunnel may work better for you.

conclusion

As technologists, we always need to be sharpening our tools. Hopefully the steps above have helped you improve your development environment and sped up your kubernetes development process.

MoSCoW Method of Software Proposals

MoSCoW method

Feature creep is a real thing. It’s pretty hard to present any technology proposal
to a group of engaged engineers without large “blue sky” converstations starting.

The MoSCoW method gives you four one word classifications that will make your intentions clear.

  • MUST - we will build this now
  • SHOULD - not committed to
  • COULD - not planning on it
  • WON'T - not doing

The wiki page is so good that I’m just going to link to the MoSCoW method.

On Organizing Machine Learning Code

Data Scientists breaking up their notebooks into standard methods (or modules) can improve their process without interrupting their workflow.

The following proposed organization of code would aid in:

  • faster deploy times
  • less code refactoring
  • interchangable functions
  • easier code hand offs
  • reproducable results

How do you run this? Try the makeup (pypi.org) framework designed just for this.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
"""
Code taken from: https://scikit-learn.org/stable/tutorial/statistical_inference/supervised_learning.html

It has been modified to encapsulate and label sections of code.
"""
import numpy as np
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier


def load(**kwargs):
"""
"Get the data" this sparse function allows a framework
to be able to swap out different data sets easily.
"""
iris_X, iris_y = datasets.load_iris(return_X_y=True)
return iris_X, iris_y


def features(X, y, **kwargs):
"""Optional method to transform data, add or remove columns"""
# if i had pandas here, I would do something!
return X, y


def split(X, y, **kwargs):
np.random.seed(0)
indices = np.random.permutation(len(X))
X_train = X[indices[:-10]]
y_train = y[indices[:-10]]
X_test = X[indices[-10:]]
y_test = y[indices[-10:]]

return (X_train, y_train),\
(X_test, y_test)


def train(X, y, **kwargs):
"""Create and fit a nearest-neighbor classifier"""
model = KNeighborsClassifier()
model.fit(X, y)

return model


def predict(model, X, **kwargs):
return model.predict(X)

tg.

ENV VARS (Environment Variables)

Environment Variables or “ENV vars” are under used and often misunderstood.
They are critical for separating configuration from any code you write. By
learning how to use this SHELL resource in your programs your code will
become more extensible and useful to a larger group of people and environments.

What is an ENV Var?

Environment Variables are variables for your Terminal Shell. They are passed to every
program you run and can be used by those programs. If you run commands in your IDE,
these are still in play (there is probably a field that you could define these in.)

Do I have ENV Vars? Yes. From a terminal, type:

1
env

That’s all of them! Ignore some of the gobblety-gook. Get these right and your
work environment will “just work” and feel magical. Some worth pointing out:

  • USER: Want to default to using your username in your code? How about defaulting to the username of whomever is using the code? Use this variable.
  • HOME: How many times do you have /Users/{your_username}/... pasted in your code? Use this variable. Or maybe you want to use…
  • PWD: Your Present Working Directory.
  • EDITOR: Choose what editor application will open when you need to edit a file. (Think git asking for commit message.) vim, nano are examples, but there is no reason you can’t use Sublime or any IDE!
  • PATH: A : separated list of where your terminal looks for programs to run. FUN!
  • PS1: Change the prompt in your terminal.

Using ENV Vars

ENV vars do everything you expect a variable to do in any programming environment. See this example in bash, and most shells.

1
2
3
4
5
6
7
8
9
# set it
export MYVAR=123

# read it
echo $MYVAR

# remove it!
unset MYVAR

These values are useful if you’re doing regular tasks in your terminal…

1
curl $MY_FREQUENTLY_USED_SERVICE

…and they also are a great way to inject configuration into your programs! After you “export” your variable 100% of the programs you run in that terminal will get the variable passed to it. Most programs may ignore it, but your program won’t

In python, as an example you can retrieve env vars with:

1
2
3
4
5
6
from os import environ

required_var = environ['REQUIRED_VAR'] # will throw a `KeyError` if doesn't exit.

SOME_DEFAULT_VALUE = 456
myvar = environ.get("MYVAR", SOME_DEFAULT_VALUE)

SOME_DEFAULT_VALUE is optional, it could just be excluded, but if you can do something useful with your program without it defined consider adding a useful default value. Maybe you need a folder to save data into, no good ideas to put it? $HOME may make sense.

This will look for a DATA_DIR variable that they could have set, or it will default reasonably to a directory called data in their home directory.

1
2
3
4
from os import environ

DEFAULT_DIR = environ.get("HOME") + "/data"
DATA_DIR = environ.get('DATA_DIR', DEFAULT_DIR)

Are you pretty sure about some constant value you’re setting, but you may change it in the future? Optional ENV Variable is a slam dunk. You don’t even have to export the variable, you can define it on the same line as your command.

1
DATA_DIR=/var/data/ python myscript.py

Usernames, Passwords, sensitive data? You got it, ENV Vars.

1
2
USER = environ.get("USER")         # defaults to the username on the computer
PASS = environ.get("MY_APP_PASS") # not saved into github anymore!!

You’ll get bored of setting these quickly, and miss the biggest strength of env vars if you don’t have your env vars set for you as you open your terminal. Look in your $HOME directory (ls -al $HOME) and you’ll find “rc” files. Maybe .zshrc or .bashrc. These configuration files are executed whenever you open a new terminal. You can put any code you’d like run before your sessions start in these files. export your ENV Vars here!

1
2
# ~/.zshrc
export SERVICE_HOST=whatever.com

Now you can always refer to $SERVICE_HOST in your terminals or programs! Save it once, use it always!

By separating the configuration of your app from your code you will be able to quickly reconfigure your code (am I pointing to my local machine, staging or production?) and more easily share config dependent code with others!

dsky

You’re a developer. You work on several projects. You use version control,
probably git, and you have stuff you need to get done.
You’re writing code, in multiple projects. Running, testing, and serving these projects.
And you probably even are stumbling through conjuring up some weird incantations
to get stuff done in the command line. dsky has some solutions for you.

The Concept of DSKY

The basic concept of Apollo 11’s computer, DSKY, was that you can specify any
unique task with only a NOUN and a VERB.

dsky is a NOUN VERB language for expressing what you would like to do in your
projects.

Read up about the history of DSKY on this anniversary year of Apollo 11!

NOUNs

Presently NOUNs are exclusively git repos. dsky is a language to express what you
would like to do in your projects.

VERBs

What would you like to do?

  • dsky $PROJECT isgo - PROJECT INITIALIZATION - in a microservice world, context switching is not trivial.
  • Execute anything.
    • compile, test, package, deploy projects
  • github - go to the project’s github page
  • (e)dit - open the project in your favorite $EDITOR

isgo

PROJECT INITIALIZING. This is core benefit to this system, and it involves several stages of
your dev cycle. Let’s say you decide to work on a project. It could be yet another time, or
your very first time working on it. Why do you have to distinguish? Just define the project,
which can usually be one name. This name is probably the project’s name in git. This project
is humane-api or probably trevorgrayson/humane-api to you, since you’re not working on it.

dsky trevorgrayson/humane-api isgo will:

  1. confirm the project is local on your computer. check it out if necessary.
  2. change to the directory of the project, as we will be working in it.
  3. fetch any updates to the code base, and print them on screen when possible.
    this is your “news” update for the project. what have your co-contributors been doing?
  4. display the status of the current checkout. what did your forget to check in last time?
    what weird branch are you on?
  5. command prompt. get to work.

isgo will be run automatically before any $VERB you specify.

See the install section if you want to personalize this a bit better.

github

1
dsky $PROJECT_NAME github

general status of project

1
dsky $PROJECT_NAME stat

List of open PRs.

1
dsky $PROJECT_NAME pr

checkout remote branch or tag

1
dsky $PROJECT_NAME checkout $TAG

Getting Started

Install dsky

dsky is a single file bash script. You will want to install it somewhere in
your $PATH. What’s your $PATH? This will work if you have write access
to the first directory in your $PATH.

Feel free to curl or wget it into a different folder.

You may take a look at the script here, or run the following to install it:

1
curl necessaryeval.com/dsky > `echo $PATH | cut -d: -f1`/dsky

env variables

Here are some good variables to have set for these projects, and in general.

1
2
3
4
5
export PROJECTS=$HOME/projects
export GIT_HOST=git@github.com
export GIT_USER=grevrtrayson

alias dsky=". dsky" # will let dsky change your directory for you.

export GIT_HOST into your ENV, or set it perminantly.

$REPO