In writing the most recent Hack This (“Scrape the Web with Beautiful Soup”) I again found myself trapped between the competing causes of blog-brevity and making sure everything is totally clear for non-programmers. It’s a tough spot! Recapping every little Python (the default language of Hack This) concept is tiring for everyone, but what’s the point in the first place if no one can follow what’s going on?
This post is intended, then, as a sort of in-between edition of Hack This, covering a handful of Python features that are going to recur in pretty much every programming tutorial that we do under the Hack This name. A nice thing about Python is that it makes many things much clearer than is possible in almost any other language.
A variable is a name that we can use to store data. The idea is different than the variables used in normal algebra, where variables are used most often as unknowns that we’d like to solve for or use as placeholders for solutions. In programming, we usually give the variable a value before we do the computation, which then uses the variable value to yield a solution. Just understand that a variable stands for something else: a number, some text, a true or false value, another variable.
In Python, we don’t have to give variables types. This is different from many languages, where we have to specify what a variable is meant to contain. In C++, for example, it’s illegal to assign some text to an integer variable. In Python, we don’t even have to specify that it’s a variable at all, which is very different from many other programming languages. Here’s a variable assignment in C++:
int x = 8;
And in JavaScript, which uses a variable designation but no type:
var x = 8;
In Python, which doesn’t require a type or a variable designation:
x = 8
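A minimal sketch of what that freedom looks like in practice: the same name can hold a number and then some text, with no declaration in between. The names first_type and second_type are only there for illustration.

```python
x = 8
first_type = type(x).__name__     # the name x currently refers to an integer

x = "eight"                       # no error: x now refers to text instead
second_type = type(x).__name__

print(first_type)
print(second_type)
```

The built-in type() reports what kind of value a name refers to at that moment, which is handy when a variable turns out to hold something unexpected.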
Here’s an example of how we might use a variable. We first assign a number to a name and then use that name in a simple addition operation. Then, we assign the result to a new variable and print the contents of both variables.
x = 8
x = x + 2
y = x
print(x)
print(y)
The print command takes whatever expression it is given, evaluates that expression, and then outputs the result to the screen. This evaluation step means that we can simply print values, as above, but we can also print the results of expressions, as below. (In Python 3, print is a function and needs the parentheses; older Python 2 code writes the same line as print 8 + 10.)
print(8 + 10)
Which results in “18” being output to the screen (rather than “8 + 10”).
Almost anything can be assigned to a variable in Python. This means simple data like numbers and text, but also reusable sections of code, as in functions or prewritten Python modules. If you’ll remember from the web-scraping Hack This, we referenced functions from the Beautiful Soup package via a single variable. This is very, very common.
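As a rough sketch of the idea, here a function and an entire module each get assigned to a new name. It uses Python's standard math module purely as a stand-in; the names root and m are made up for this example.

```python
import math

root = math.sqrt      # a function assigned to a new name (note: no parentheses, so it is not called yet)
m = math              # a whole module assigned to a new name

sixteen_root = root(16)   # calling through the new name works just like math.sqrt(16)
nine_root = m.sqrt(9)     # and the module's functions are reachable through m

print(sixteen_root)
print(nine_root)
```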
Python maintains a constant value called None. We can assign this to variables to indicate that the variable exists but that it does not contain a value.
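A small sketch of None in use, assuming a hypothetical variable called winner that starts out empty and gets a real value later:

```python
winner = None                    # the name exists, but holds no value yet
was_empty = winner is None       # "is None" is the usual way to check for it

winner = "Ada"                   # later on, a real value arrives
is_empty_now = winner is None

print(was_empty)
print(is_empty_now)
```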
A Python list is what would be called an array in most other programming languages. It is precisely what it sounds like: an indexed list of values. Anything that can be assigned to a variable can be an entry in a list. It’s the same thing, really.
If we want to create an empty list, we do this:
new_list = []
If we have a list that already has some stuff in it, we can overwrite an entry at a specific index like this.
new_list[0] = "hey list"
The number in brackets is the index corresponding to a location within the list that has a value. Indexes start at 0, so this is the first entry. This is how we put a value in, and also how we retrieve a value.
print(new_list[0])
This will print “hey list.”
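For a version that can be run from top to bottom, here is a sketch that starts with a list that already has some stuff in it. The starting entries are invented for the example.

```python
new_list = ["first", "second", "third"]   # a list with three entries, at indexes 0, 1 and 2

new_list[0] = "hey list"                  # overwrite whatever was at index 0
first_entry = new_list[0]                 # read the same slot back out

print(first_entry)
print(new_list)
```

Assigning to an index only works if that slot already exists; on an empty list, new_list[0] = "hey list" raises an IndexError.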
You will often see values added to lists with a function called append(). This adds a new value to the end of the list, which has the effect of increasing the size of the list. It’s also possible to add values to a list with insert(). The difference is that the latter puts a value into the list at a specified location (index, as in insert(5, "hey list")), while the former just puts the new value at the end.
If the list isn’t long enough to add an item at the index 5 position, Python will put the new item at the next available index. Note that inserting values like this doesn’t nuke whatever was originally in the specified index. Everything is just scooted upward.
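The three behaviors just described can be sketched like so, using a short made-up list:

```python
new_list = ["a", "b", "c"]

new_list.append("hey list")     # goes on the end; the list is now four entries long
new_list.insert(1, "early")     # lands at index 1; "b" and everything after it scoot up
new_list.insert(50, "late")     # index 50 doesn't exist, so this lands at the end instead

print(new_list)
```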
We can remove a value from a list with the remove() function. This requires us to give said function a value to look for. If we wanted to remove the value “hey list” from new_list, we could do this.
new_list.remove("hey list")
If there is no “hey list” value, Python will raise an error (a ValueError).
The pop() function is kind of the inverse of the append() function. It will remove the last item in the list and hand that item back.
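Here is a sketch of both removal functions together, again with an invented list. The try/except lines are just a way to catch the error so the script keeps running; error_seen is a hypothetical name used to record that it happened.

```python
new_list = ["a", "hey list", "b"]

new_list.remove("hey list")        # finds the first matching value and deletes it

try:
    new_list.remove("hey list")    # it's already gone, so this raises ValueError
    error_seen = False
except ValueError:
    error_seen = True

last = new_list.pop()              # removes the final entry ("b") and returns it

print(new_list)
print(last)
print(error_seen)
```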
A Python definition (introduced with the def keyword) is what’s usually called a function or method in other programming languages. It’s a piece of code that is written once and can then be called again and again by name. A Python definition will very often produce or return a value. It may also require parameters, or input values. These are supplied in between the parentheses following the definition’s name. Say we have a function that returns the greater of two numbers. We might call it like this:
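the_max = max(1,2)
print(the_max)
The number 2 will be printed (assuming that the function is actually defined somewhere that the current script can access; see below). As it happens, Python includes a built-in max() function, so this particular call works as-is.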
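To show what writing such a definition might look like, here is a minimal sketch. The name greater_of is hypothetical, chosen so it doesn't collide with Python's own max().

```python
def greater_of(a, b):
    """Return the greater of two numbers (or b if they are equal)."""
    if a > b:
        return a
    return b

the_max = greater_of(1, 2)   # the parameters a and b receive 1 and 2
print(the_max)
```

The indented lines under def are the body of the definition; they only run when the function is called by name.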
Imagine that you wrote a bunch of code that you’ll probably want to use again, or maybe that someone else would find useful in their own program or script. You could package it into a module, which is just another, separate Python file. Like, when you import a module in your script, you’re just pointing the Python interpreter to some code that lives elsewhere that the current script depends on.
For example, if you needed some random numbers in your program, you are surely not going to implement your own random number generator because that would be a real pain and you would also fuck it up. I would fuck it up. Instead, we import the random module, which comes with this functionality included.
import random
We now have access to the random module. What does that actually mean? Well, you might start by looking at the documentation, which will list all of the module’s functions and tell you how to use them. When in doubt, read the docs.
The general idea of our import is that we now have access to all of the random goodies via the name random. It’s like a portal or doorway … or a bartender.
If we want, say, a random decimal number between 0 and 1, we can just use the random() function that lives within the random module. It’s easy:
random_number = random.random()
Get it? We have access to the function through the module. We access all of the functions that are built into Python’s list structure in the same way.
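Put together as something runnable, a sketch might look like this. Besides random(), it borrows two other functions from the same module, randint() and choice(), to show that they are all reached through the module name in the same way.

```python
import random

random_number = random.random()            # a decimal number, at least 0.0 and less than 1.0
die_roll = random.randint(1, 6)            # a whole number from 1 to 6, both ends included
pick = random.choice(["a", "b", "c"])      # one entry chosen from a list

print(random_number)
print(die_roll)
print(pick)
```

The output changes on every run, which is rather the point.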
Packages are directories of modules. If we import a package into our script, we get access to all of its contained modules. We can also selectively import modules that are components of packages, so we don’t have to bring in a bunch of extra code that we don’t need. It would look like this:
from somePackage import someModule
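For a real-world instance of the same pattern, here is a sketch using the email package from Python's standard library, which contains a module called utils. The address being parsed is made up.

```python
from email import utils    # email is the package; utils is one module inside it

# parseaddr() splits a "Name <address>" string into its two parts
name, address = utils.parseaddr("Ada Lovelace <ada@example.com>")

print(name)
print(address)
```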
4.0) FOR LOOPS
It’s often the case that we want to repeat some operation in our code. Say, for example, that we have a list, and we want to process each list entry in the same way. We could use a for-loop. Perhaps we just want to print the contents of our list in order. We only need two lines of code:
for item in my_list:
    print(item)
The print statement here will execute once for every item in my_list. Every time the loop restarts, the next entry in the list will be assigned to the variable item. Note that the colon here is required syntax and trying to write a for-loop without it will cause an error. The indentation is likewise required for code that is contained within the body of the loop, i.e. code that will execute on every loop iteration.
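A runnable sketch of the same loop, with an invented my_list and a hypothetical counter to make the role of indentation visible:

```python
my_list = ["eggs", "milk", "bread"]
count = 0

for item in my_list:
    print(item)           # indented: runs once for every entry
    count = count + 1     # indented: also runs once for every entry

print("done")             # not indented: runs a single time, after the loop finishes
```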
We can write for loops even if we don’t have a list. Say we just want to add up all of the whole numbers from 0 up to, but not including, 10. We would do that like so:
total = 0
for i in range(0,10):
    total = i + total
Python here is setting up a sequence of numbers for us with the range() function, starting at 0 and stopping just before 10, so total ends up as 45. This can be extremely useful.
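One detail worth seeing in action is that range() stops just short of its second number. The sketch below runs the loop twice; total_inclusive is a hypothetical second variable showing what changes if 10 itself should be counted.

```python
total = 0
for i in range(0, 10):        # i takes the values 0, 1, 2, ... 9
    total = i + total

total_inclusive = 0
for i in range(0, 11):        # stopping at 11 brings 10 into the loop
    total_inclusive = i + total_inclusive

print(total)
print(total_inclusive)
```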
5.0) IF-THEN STATEMENTS
The conditional statement is a crucial piece of any programming language. It allows us to write code that will only be executed if some condition is met. It’s pretty simple:
if (2 > 1):
    print("duh")
The statement is true, so the script will print “duh.” We can elaborate on this by using an else clause, like so.
if (2 > 3):
    print("no way")
else:
    print("duh")
Easy enough. Note again that the colons and white space are required.
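Conditions get more interesting once they depend on a variable rather than on fixed numbers. A minimal sketch, with a made-up function name and messages:

```python
def compare_to_three(x):
    if x > 3:
        return "bigger than three"       # runs only when the condition is true
    else:
        return "not bigger than three"   # runs only when it is false

print(compare_to_three(5))
print(compare_to_three(2))
```

Exactly one of the two branches runs on each call, never both.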
Part of the Python attraction is that it’s a very pristine language. It’s not all junked up with “unnecessary” symbols, particularly curly braces. In many if not most languages, curly braces are used to group together lines of code. This grouping has all kinds of meanings in programming—for example, it may demarcate the section of code that is to be repeated in a loop, or the section of code that is to be executed or ignored in an if-statement. Our if-then from above would look like this in C++:
if (2 > 1) {
    cout << "duh";
}
Instead of curly braces, Python uses whitespace. Statements that are at the same level of indentation are taken to be grouped together. Consider an if-statement on x with one print indented beneath it, followed by a second print of “derp” that lines up with the if itself. No matter what, the script is going to print “derp” because it’s not part of the conditional. However, if that second print is also indented beneath the if and x is not true, nothing will be printed, because the second print statement is now part of the conditional.
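The two cases can be sketched side by side. To make the result easy to inspect, this version collects the messages in a list instead of printing them directly; the function names and the “herp” message are invented for the example.

```python
def derp_outside(x):
    out = []
    if x:
        out.append("herp")
    out.append("derp")        # lined up with the if: not part of the conditional
    return out

def derp_inside(x):
    out = []
    if x:
        out.append("herp")
        out.append("derp")    # indented under the if: part of the conditional
    return out

print(derp_outside(False))
print(derp_inside(False))
```

The only difference between the two functions is four spaces in front of one line, which is exactly why indentation mistakes in Python change what a program does rather than just how it looks.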
This is a fairly extreme crash course, but the aim is to clarify some things you’ll see elsewhere in Hack This and out in the programming world. For a deeper introduction, the Python documentation has you covered in its own tutorial. Learnpython.org, meanwhile, offers a beginner tutorial that adds interactivity. And then, of course, there is Learn Python the Hard Way. Recommended.