Python and Abstractions

A programming language is low level when its programs require attention to the irrelevant

The quote above is by Alan Perlis, from his 1982 article for ACM SIGPLAN titled "Epigrams on Programming". I came across the full list of epigrams from the tweet below and this particular epigram (epigram #6) caught my attention because it emphasises a mistake that I see beginner Python developers do over and over.

High level and low level languages

According to Perlis, a low level language is one in which you have to pay attention to the irrelevant. What does this mean? Let us take the example of making a function call.

If you want to make a function call in assembly language, there is a sequence of steps you need to follow

  1. Save the address where execution should continue after the function call
  2. Create a new stack frame and put it's address in the stack pointer register
  3. Next you need to push all the function parameters to the new stack frame
  4. Jump to the address of the function
  5. Within the function, you pop all the function parameters from the stack
  6. Then do the actual function computation with the parameters
  7. Put the return value into another register
  8. Jump back to the place where execution should continue after the function
  9. Access the return value from the register and continue execution

Out of all these steps, only step 6 is the relevant one from the perspective of the computation. All the other steps are things you need to do in order to keep track of ancillary information like saving the execution point, passing parameters, storing return values etc.

But since assembly language is low level, you have to pay attention to all these irrelevant details while you are programming. By irrelevant, we mean irrelevant to the problem being solved, it is still very relevant to the CPU because the function call cannot happen without those details.

By contrast this is how you make a function call in Python (or any other higher level language)

def add(a, b):
    return a + b
    
add(2, 3)

In a high level language like python, you just call the function and that's it. The language runtime will take care of doing whatever is necessary to make it work on the CPU. The programmer just needs to code the business logic, without worrying about stack frames and CPU registers. You don't need to pay attention to the irrelevant steps.

Coming to Python from C / Java

Compared to assembly one might call C or Java as high level languages. However, compared to Python, these two languages are still low level. There are a bunch of irrelevant details that you need to pay attention to in those languages which are abstracted away in Python.

A typical case is looping. In C or Java it is typical to use index based loops like this [1]

for(int i=0; i<list.size(); i++){
    print(list[i])
}

When developers come to Python with prior experience in Java, suddenly they realise that Python does not have index based looping. After a few moments of panic and some searches through Stack Overflow, they realise that they can do something similar with the code below and then all is well.

for i in range(len(list)):
    print(list[i])

All is not well

It is worth thinking about why python does not have index based looping to start with.

The whole idea that you need to have a variable to track the index, then increment it, then check when it reaches the end.... all those are irrelevant details that the programmer has to manage in the process of looping [2]. A language that makes you do all that would be considered low level, compared to how you should do it in python

for item in list:
    print(item)

The python code does not have an index variable in sight, no incremention, no conditions. You focus on writing code to solve the problem, rather than writing code to manage the loop.

Now this is pretty basic stuff, and developers quickly learn to write the python code above. But Python's looping abstractions go much, much deeper and most developers are not taking advantage of it.

Here is an example to show what I mean.

Divisible Sum Pairs Algorithm

Divisible Sum Pairs is a popular challenge on HackerRank. You can read the problem statement here but I will briefly summarise it below.

  • Given a list of numbers, eg: ar = [1, 2, 3, 4, 5, 6] and a divisor eg: k=5
  • Count how many pairs of numbers whose sum is divisible by k
  • In the example above, 2 + 3, 1 + 4 and 4 + 6 are the three pairs of numbers whose sum is divisible by k=5 so the final answer is 3

Do a quick search for the solution and you will find something like this

def divisibleSumPairs(n, k, ar):
    count = 0
    for i in range(n-1):
        j = i+1
        while j < n:
            if ((ar[i] + ar[j]) % k) == 0:
                count += 1
            j += 1
    return count

Now granted, HackerRank is not intended to focus on clean code, but even if you took an intermediate developer working in Python, you will often come across code like this.

You see, in this problem we don't want to just simply loop through the numbers in the list individually. We want every pair of numbers.

The for loop and while loop and two index variables i and j are all used to control the loops and give us pairs. The actual code to solve the problem is on lines 6 and 7, everything else is incidental to the problem.

Is there a better way to solve this problem?

Well, the itertools module has a handy function combinations. Using combinations(ar, 2) will automatically give us every pair of items from an iterable and then we can loop though the pairs (Read the docs for combinations here). The solution now becomes

def divisibleSumPairs(n, k, ar):
    totals = [sum(pair) for pair in combinations(ar, 2)]
    return len([total for total in totals if total % k == 0])

If you want, you can easily combine both into a single line

def divisibleSumPairs(n, k, ar):
    return len([pair for pair in combinations(ar, 2) if sum(pair) % k == 0])

Not a single index to be seen. The combinations function abstracts all the logic to get pairs and our code focuses on the essential part of the solution, excluding the irrelevant details.

So, the next time you want to loop in an unusual way and think you need to get into looping by index, first take a look at the itertools module. Chances are there is something in there to help you write code at a high level instead of messing around with indices and nested loops.


  1. Java has iterator based looping too, but it is limited. You can't use it to loop a stream for example ↩︎

  2. Besides, you can only use index based loops on data that is subscriptable ↩︎

Did you like this article?

If you liked this article, consider subscribing to this site. Subscribing is free.

Why subscribe? Here are three reasons:

  1. You will get every new article as an email in your inbox, so you never miss an article
  2. You will be able to comment on all the posts, ask questions, get answers etc
  3. Once in a while, I will be posting conference talk slides, longer form articles (such as this one), and other content as subscriber-only

Comments

Sign in or become a Playful Python member to join the conversation.
Just enter your email below to get a log in link.