Lists in Python

Basics of Lists

Lists in Python are one of the most used and convenient data types. Some programmers who come to Python with experience of other programming languages could confuse lists and arrays. In lists, unlike arrays, you can store heterogeneous data, you can store integers, floating-point numbers, strings, other lists, dictionaries, sets, whatever, and it's all mixed up. Let's take a quick look at creating lists, adding and removing items.

You can create lists by writing their elements in square brackets. Or through the service word list. By the way, it is strongly discouraged to create variables named list! If your imagination is terrible, it is better to add an underscore at the end and write list_, the same applies to other service words such as dict, set, tuple, sum and others.

 
lst = [1, 2.0, 'three', {4}]
lst.append({'five': 5})
print(lst)
>>[1, 2.0, 'three', {4}, {'five' : 5}]
 
lst = list((1, 2, 3))
lst.extend([1, 2, 3])
lst.remove(1)
print(lst)
>>[2, 3, 1, 2, 3 ]

Note that I used parentheses inside parentheses when creating the list with the list function. The point is that the function expects exactly one iterable variable as an argument. In this case, it turns out that we are passing a tuple as an argument. Let's try to create some more lists, but with only one element.


lst = list('a')
print(lst)
>>['a']

lst = list (10)
>> TypeError: 'int' object is not iterable

We got an error when trying to create a list with a single int element through the constructor. If we tried to create the same list with a float number element, then we would also fail. Let me remind you once again that the constructor expects only an iterable argument at the input! A string, even if it has only one character, is iterable. By the way, what do you think happens if we try to make a list from an empty string?


lst = list('')
print(lst)
>>[]

Right, we get an empty list! You probably already guessed what happens when you create a list from a string. It will be iterated over, and as a result, we don't get a list of one element with the original string (although we can make such a list by specifying through square brackets). Still, we get a list of elements (letters) from the original string.


lst = list("abc")
print(lst)
>>['a', 'b', 'c']

You can also pass a multi-string as an argument. Let me remind you that in Python you can specify multi-string in triple quotes.


st = '''Hello,
World
!!!!'''
print(st)
lst = list(st)
print(lst)
>>Hello,
>> World
>> !!!!
>>['H', 'e', 'l', 'l', 'o', ',', '\ n', 'W', 'o', 'r', 'l', 'd ',' \ n ','! ','! ','! ','! ']

As you can see, there are letters and newlines symbols in the individual list items.

You can access an element of a list by index. Indexes in Python, as in many other programming languages, start at 0. An interesting and useful feature of lists in Python is that you can also use negative indexes. Then they will be counted from the end. The item at index -1 is the last item in the list, the item at index -2 is the penultimate item, and so on.


lst = [1, 2, 3, 4]
print(lst[0])
>> 1
print(lst[-1])
>> 4
print(lst[-2])
>> 3

It is crucial that if you try to get a nonexistent index, it will raise an IndexError. Therefore, in the case when we do not know for sure whether this index exists in the list, it is necessary to arrange additional checks or use the try...except construction.

However, sometimes you can avoid these extra checks due to some peculiarity of the Python language. Below is a sample code that collects two lists item by item into tuples so that the items inside the tuple are sorted. If one of the lists is longer than the other, then the difference is discarded. The code below will raise an error.


lst1 = [1, 3, 7]
lst2 = [1, 2, 6, 8, 9]
res = []
i = 0
while True:
  if lst1[i] < lst2[i] and i < len(lst1) and i < len(lst2):
    res.append ((lst1 [i], lst2 [i]))
  elif lst1[i] > = lst2[i] and i < len(lst1) and j < len(lst2):
    res.append((lst2[i], lst1[i]))
  else:
    break
  i + = 1
print(res)
>> IndexError: list index out of range

But if you change the order of the conditions, the algorithm will work correctly.

 
lst1 = [1, 3, 7]
lst2 = [1, 2, 6, 8, 9]
res = []
i = 0
while True:
  if i < len(lst1) and i < len(lst2) and lst1[i] < lst2[i]:
    res.append ((lst1 [i], lst2 [i]))
    i + = 1
  elif i < len(lst1) and j < len(lst2) and lst1[i] >= lst2[i]:
    res.append((lst2[i], lst1[i]))
    i + = 1
  else:
    break
 
print(res)
>> [(1, 1), (2, 3), (6, 7)]

The point is that Python uses "lazy evaluation" of logical expressions. This means that the expression is evaluated from left to right, and if the result is already clear, then no further calculations are performed. What does it mean that "the result is already clear"? For example, you might not evaluate x in the expression False and x, because regardless of the value of x, the expression will already be False. Similar situation with True or x, it will always equal True.


False and expression = False
True or expression = True

<expression> does not have to be only one condition. It can be a very long computed expression. It turns out that Python saves resources on evaluating this expression, which is why such calculations are called lazy. In the example, the working code first calculates if the index has already gone beyond one of the list's limits. And if the index is beyond Python no longer tries to access the i-th element of the list, so the error does not occur.

Slicing

Often there is a need to access not one element of the list, but immediately to some part of it. For example, take the first few elements, or the last, or every second. It is convenient to use slicing for such tasks.

The slice syntax is similar to simple access to an element by index, but you can enter from one to three values in square brackets, separated by colons. These three values are sometimes called "three s": start, stop, step.

Pay particular attention to the fact that the element with the start index is included in the slice, and the element with the stop index is not included. You can write it like this: [<first element to include>: <first NOT included element>: <step>]


lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
lst[1:8:2]
>> [2, 4, 6, 8]

In use, you can omit some of the values, and sometimes even all of them. The default value of the step is 1 but start and stop default values depend on the direction of traversing the array (a sign of the step parameter). Let's look at some examples.

 
lst[:8:2]
>> [1, 3, 5, 7]
lst[1::2]
>> [2, 4, 6, 8, 10]
lst[1:8]
>> [2, 3 , 4, 5, 6, 7, 8]
lst[::2]
>> [1, 3, 5, 7, 9]

This is how you can create a copy of the list.


lst2 = lst[:]
lst2
>> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

If you use a negative value of step, the list will move in the opposite direction. You can easily generate an inverted list in this simple way.

 
lst[::-1]
>> [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

You can use negative indices, but you need to understand which element they refer to clearly.


lst[-5:-1]
>> [6, 7, 8, 9]
lst[-5:9]
>> [6, 7, 8, 9]
lst[-1:-5:-1]
>> [9, 8, 7]

If you can't reach from the start index to the stop index in the direction specified by the step parameter, then the slice will be empty.


lst[9:1]
>> []
lst[1:9: -1]
>> []

It's essential that slicing doesn't do anything directly with the original list. It just provides access to a set of elements. After all the above manipulations, the list remains the same. It still contains numbers from 1 to 10. But at the same time, you can assign values to the elements of the original list through its slice. For example, let's assign new values to the odd elements of the list.


lst[::2] = ['a', 'b', 'c', 'd', 'e']
lst
>> ['a', 2, 'b', 4, 'c', 6, 'd', 8, 'e', 10]

Even if the slice size does not match the size of the list you are trying to put there, Python will still try to figure out exactly what you mean.


lst = [1, 2, 3, 4]
lst[1:3] = ['a', 'b', 'c', 'd', 'e']
print(lst)
>>[1, 'a', 'b', 'c', 'd', 'e', 4]

Instead of elements with indices 2 and 3, we inserted a new list of length 5 here.

We can go even further. Let's insert the new list into an empty slice outside of our original list.


lst = [1, 2, 3, 4]
lst[7:] = ['a', 'b', 'c', 'd', 'e']
print(lst)
>>[1, 2, 3, 4, 'a', 'b', 'c', 'd', 'e']

A new list was added after the last item. We can even make an empty slice due to misdirection and insert a new list into it.


lst = [1, 2, 3, 4]
lst[3:1] = ['a', 'b', 'c', 'd', 'e']
print(lst)
>>[1, 2, 3, 'a', 'b', 'c', 'd', 'e', 4]

But all this concerns only the situation with the default step value, if we change the step value, then Python begins to strictly monitor the coincidence of the slice size and the size of the inserted list.


lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
lst[::2] = ['a', 'b', 'c', 'd']
>> ValueError: attempt to assign sequence of size 4 to extended slice of size 5

I would not recommend actively using this feature, because the result is not always obvious, and as a result, the clarity of the code suffers.

Copying and deep copying

A list in Python is mutable type as opposed to string type, for example. When we declare a variable of the list type, then, in fact, this variable is a reference to an object in memory. When you change the elements of the list, the link to it does not change. Therefore, you can change a list inside a function by passing it as an argument. It can also cause some confusion. For example, we are trying to create two empty lists A and B.


A.append(3)
A
>> [3]
B
>> [3] 

We add an element to list A, but this element was added to list B too. So it happens then that A and B are not exactly two variables, they are actually aliases referring to the same object in memory. Therefore, if we do not need an alias, but a copy of the list, then we cannot simply use =, for example, we can use a slice, as we showed above. Or use the copy module.


import copy
A = []
B = copy.copy(A)
A.append(3)
A
>> [3]
B
>> []

The copy function creates a shallow copy of the list. It copies only the first level elements. If we copy a list whose elements are other lists, then the links to the nested lists will be copied.


A = [1, 2, [3, 4]]
B = copy.copy(A)
A [2] .append(5)
A.append(6)
A
>> [1, 2, [3, 4, 5 ], 6]
B
>> [1, 2, [3, 4, 5]]

The element with index 2 is a nested list, adding an element to it affected the copied list as well. And adding new item 6 to list A did not affect the copy. If you want to commit a complete copy of a list along with all nested lists of any level, then you must use the deepcopy function. It iterates over all nested elements recursively and copies the values of each element into a new list.


A = [1, 2, [3, 4]]
B = copy.deepcopy(A)
A [2] .append(5)
A.append(6)
A
>> [1, 2, [3, 4, 5 ], 6]
B
>> [1, 2, [3, 4]]

As you can see, list B retains an exact nugget of list A at the moment of copying.

Due to the fact that a variable of the list type is reference, we can create a list with an element which is that list. The depth of such a list would be endless. Just take a look at this magic.


lst = [1, 2]
lst.append(lst)
lst.append(3)
print(lst)
>>[1, 2, [...], 3]
print(lst[2])
>>[1, 2, [...], 3]
print(lst[2][2][2][2])
>>[1, 2, [...], 3]
Image from knowyourmeme.com

Write in the comments if you know of how this can be used in practice.

There is an interesting feature when using compound operators over lists. You know operators like + =, - =, * =, and so on. Usually, an expression x += y is the same as an expression x = x + y. Let's take a look at a little puzzle below. Try to guess what will happen to the lists without peeking the solution.


lst = ['a']
lst1 = lst2 = lst
lst1 = lst1 + ['b']
lst2 += ['c']

Do you think all three lists are the same and contain ['a', 'b', 'c']? Now, let's print the lists and look.


print(lst)
>>['a', 'c']
print(lst1)
>>['a', 'b']
print(lst2)
>>['a', 'c']

As you can see, they are different. Let's print the id of these lists.


print(id(lst))
>>140610071661064
print(id(lst1))
>>140610071659784
print(id(lst2))
>>140610071661064

Id of the lst and id of the lst2 are expectedly same, but id of the lst1 is different from them. The fact is that at the time of execution of the line lst1=lst1+['b'] a new list (a concatenation of the lst1 and ['b']) was created and then a link to this new list was connected to the variable lst1.

Therefore, expressions x=x+y and x+=y are not equivalent for lists. The first creates a new list and the second works like x.extend(y) so it changes the current list.

Stack and Queue

It is straightforward to implement useful data types such as stack and queue with lists.

Stack representation from Wikipedia.org

A stack is a data structure that operates on the LIFO (last in, first out) principle. You can think about stack as a stack of documents on top of each other. However, We can put the document on the top (push operation) and take the document from the top (pop operation). However, we cannot take the bottom document from the pile because then everything will fall apart.

Let's consider a stack implementation in Python. It is trivial, for the push operation we will use an append method, for the pop operation - the built-in function pop with no parameters. The only caveat is that we use an inverted list to display the stack status on the screen, but this is just for clarity.


class Stack(list):
  def push(self, el):
    self.append(el)
  def pop(self):
    return super().pop()
  def __str__(self):  
    return 'Stack (↓): \ n' + '\ n'.join(map(str, self[::-1]))
 
s = Stack ()
s.push(1)
s.push(2)
s.push(3)
print(s)
>> Stack (↓):
>> 3
>> 2
>> 1
s.pop()
s.pop()
print(s)
>> Stack (↓):
>> 1

We put in stack 3 elements (we performed the push operation 3 times), we put element 1 at the very bottom of the stack, then - 2, and at the very top - 3. Then we began to pop elements from the stack (we performed the pop operation 2 times), respectively, we first removed element 3 , second 2, and 1 remained in the stack.

Queue representation from Wikipedia.org A

queue is also a data structure, but it operates on the FIFO principle (first in first out) ... This is similar to the queue at the supermarket, if you are the first to get in the queue, then you will be served before everyone else. Accordingly, the queue also supports two operations: enqueue and dequeue. And these operations are just as easy to implement on lists. The enqueue operation is just the append method, the dequeue operation is the pop method with parameter 0. list.pop () takes as a parameter the index of the element to be pulled out of the list, here we take the leftmost element, that is, the element with index 0.


class Queue(list):
  def enqueue(self, el):
    self.append(el)
  def dequeue(self):
    return super().pop(0)
  def __str__(self):  
    return 'Queue (←): \ n' + ''.join(map(str, self))
 
q = Queue()
q.enqueue(1)
q.enqueue(2)
q.enqueue(3)
print(q)
> > Queue (←):
>> 1 2 3
q.dequeue ()
print(q)
>>Queue (←):
>> 2 3

We also queued 3 items and served 1 of them.

Using Lists in Multithreaded Computing

Processors in modern computers and even mobile devices usually have multiple cores. This means that you can use multithreading operations on them. Calculate different parts of the code in parallel, and then combine the results. The task of synchronizing threads is a bit tricky, but Python has a threading module that makes it easier to work with threads.

Let's look at an example of multithreaded checking if numbers are prime. First, let's write a function to check if a prime number is.


def is_prime(n, idx, res):
    r = int(n ** 0.5)
    i = 2
    while i <= r:
        if n % i == 0:
            res[idx] = (n, False)
            print(res)
            return
        i + = 1
    res[idx] = (n, True)
    print(res)
    return

It writes a tuple of a number and a binary value to the res list at position idx. True if the number is prime, False if the number is composite. And then it prints out the current values of the list items res. We need this function just for illustration, it is unlikely that you will use it anywhere in this form.

Next, let's define a PrimeCalculationThread class based on the Thread class from the threading module.


import threading
 
class PrimeCalculationThread(threading.Thread):
    def __init__(self, x, idx, res):
        threading.Thread.__init__(self)
        self.x = x
        self.idx = idx
        self.res = res
    def run(self):
        is_prime(self.x, self.idx, self.res)

When creating a stream, we pass x - the number that we will check, idx - the index of the res array, into which we will write the result of the check.

The run method is built-in; we override it so that it runs our test function with the parameters specified during initialization.

Next, let's write the code of the main program.


lst = [999999000001, 433785907, 1328, 87178291199, 3994021, 3997457]
res = [()] * len(lst)
threads = []
 
for i, x in enumerate(lst):
    temp_thread = PrimeCalculationThread(x, i, res)
    temp_thread.start ()
    threads.append(temp_thread)
for thread in threads:
    thread.join ()

We create a list of numbers to check. We define the variable for the results as a list of empty tuples; the length of the result list is equal to the length of the list of numbers to check. Then, in a loop through the indices and elements of the checked list, we create streams, one for each checked number. And we execute the start method defined in the threading module. This method runs our overridden run function internally. And in the last loop we wait for the end of all threads and collect them into one, this is necessary before continuing the program execution. Let's look at the output of this program.


>>[(), (433785907, True), (), (), (), ()]
>>[(), (433785907, True), (1328, False), (), (), () ]
>>[(), (433785907, True), (1328, False), (), (3994021, True), ()]
>>[(), (433785907, True), (1328, False), ( ), (3994021, True), (3997457, True)]
>>[(), (433785907, True), (1328, False), (87178291199, True), (3994021, True), (3997457, True)]
>>[(999999000001, True), (433785907, True), (1328, False), (87178291199, True), (3994021, True), (3997457, True)]

We see how the resulting list is filled in steps. Note that the script records results as they are calculated, not according to the original order of the numbers tested. For example, we put the result of checking the number 999999000001 last in the list, although it is the very first element in the input list.

Conclusion

We have covered some of the features of lists in Python. We learned how to make slices of lists. We get how to change all the elements from the slice at once in one action. We had a look at what shallow copying and deep copying are and how you can use them in Python. We tried to compare the lists. Implemented data types such as stack and queue based on simple lists. And we wrote an example program that performs calculations in multithreaded mode.

Lists are a very convenient and common data type in Python if you have programmed at least a little in this language, then most likely you were already familiar with lists. Still, I hope that in this article you have found something new and useful for yourself.