Python Data Structures Limitations and Solutions

Python variables are good to store volatile information. This value may change over time. But, when you want to store a long list of information, which doesn’t change over time (non-volatile) or changes less frequently, variables are not a good option. For example, you want to store the names of the months of the year, phone book, home essential items list. What is the other option available in Python?
Yes! Data Structures are available in Python to overcome this scenario. Below are the different data structures are available:

  • Strings
  • Lists
  • Tuples
  • Sets
  • Dictionaries
  • The additional Data structure in the collection module
  • The additional Data structure in the heapq module

Strings:

The string data structure is an immutable sequence of characters. Strings can be declared using single and double-quotes.

Str1 = 'The surface of the circle can be calculated using predefined formula'
Str2 = "My Name is Python"

There is one more way to define strings by using triple quotes. The purpose of the triple quote string is to specify multi-line strings.

Str3 = """Triple quotes are used for writing comments
which spread over more than a single line"""

Str4 = '''My Docstring test'''

Other examples of the declaration of strings are given.

hello_1=30
print(hello_1)
Output : 30

Special character are not allowed while string identifier.

myname@='abc'
print(myname@)
Output : File "", line 17 print(myname@)SyntaxError: invalid syntax

There are many built in functions are available for manipulating strings:
zfill(),isdigit(),isalpha(),isupper(),islower(),istitle(),isspace(),len(),count(),strip(),replace(),translate(),partition()

Limitation of string:

String is a primitive data structure. It works well if a single item has to be assigned in a variable. But, when multiple items need to be handled in a single variable like months of years or students list of a class, strings are not useful. The solution is List and Tuple Data Structure which handles these requirements effectively.

List:

Lists are ordered collections of items like strings, integers, or even other lists. Each item has an index starting from zero. The list is a mutable data structure i.e. can be extended or reduced whenever required. The list can contain heterogeneous data types as items. The list is enclosed in square bracket. To declare a list:

lista=[1,2,3.4,'Five']
print(lista)
print(lista[0])                                                                                                                                                             print(lista[3])
Output :
[1, 2, 3.4, 'Five']
1
Five

There are many list manipulation functions are available in Python:
append(),extend(),index(),insert(),remove(),pop(),count(),sort(),reverse()
list[first index:last index:step]
Python provides one interesting feature slicing on list. Slicing is used to build new list from existing list.Now lets see slicing operation on list. First see pictorial representation of list items with index values.

list index
Lista Image

Now see below code how it slice the existing list lista based on index values. First parameter in square bracket is starting index and second parameter is stop value.List will be sliced till (stop value-1).

lista = [0, 1, 2, 3, 4, 5]
print(lista)
print(lista[2:])    #Slicing start point will be index 2 which is item “2”
print(lista[:2]     #Slicing will be from index 0 by default to index 1
print(lista[2:-1])  #Slicing will be from index 2 to index -2(Stop value-1)
Output:
[0, 1, 2, 3, 4, 5]
[2, 3, 4, 5]
[0, 1]
[2, 3, 4]

Limitation of List:

The list has the limitation that one can only append at the end. But, in real life, there are situations that a developer has to add items at the starting of the existing list which becomes difficult in the list. Sometimes the rotation of items within the list is also required which is also a limitation in the list. The solution is Deque Data Structure which provides flexibility to add items from both ends and rotate items within deque.

Tuples:

Tuples are ordered collections of items similar to the list data structure. The main difference between a tuple and List data structure is that tuples are immutable i.e. cannot be changed once declared. Therefore tuple manipulation is faster than the list. The tuple is enclosed in parenthesis.

tuplea =(1, 'meet', 28, 'abc')
print(tuplea)
print(tuplea[3])
tuplea[3] = 'def'  #This is immutable Hence TypeError
Output:
(1, 'meet', 28, 'abc')
abc
TypeError: 'tuple' object does not support item assignment

There are various methods available for tuple manipulation as well. Count(),index(),len(),max() are few examples.

The interesting feature of the tuple is that it can be used as a swap function as well.
Suppose you want to swap(interchange) value of 2 variables a and b, you can directly use tuple variable for this operation. Instead of using temporary variable logic which involves more commands, tuple swapping can be used which is one line command.

a = 2
b = 3
(b,a)=(a,b)
print("a=",a)
print("b=",b)
Output:
a= 3
b= 2

Observe how easy is to do swapping with a tuple data structure.

Limitation of Tuples:

Tuples cannot be changed once declared. But, in real life, there are situations where adding and updating items is required. Hence tuple does not fit into these scenarios. The solution is List Data Structure which provides flexibility to add, update at any point in time.

Sets:

Sets are mutable unordered collections of items. Sets are used to build a sequence of unique items as it does not allow duplicates. This data structure provides mathematical operations like union, intersect, and more.
Below is the code to declare set:

a = set([1, 2, 3,3,3, 4,4,5])   #Verify in output , it has removed duplicate values.
print(a)                        # Only unique values are displayed.
Output: {1, 2, 3, 4, 5}

The set is unordered hence it cannot be accessed by index method as list and tuples.Now see if you try to provide index in set, you will get error.

a = set([1, 2, 3,3,3, 4,4,5])
print(a[1])
Output:TypeError: 'set' object is not subscriptable

The operations for set data structure are as below.

a = set([1, 2, 3,3,3, 4,4,5])
b = set([3, 4, 5, 6])
print(a | b) # Union
print(a & b) # Intersection
print(a < b) # Subset
print(a - b) # Difference
print(a ^ b) # Symmetric Difference. Display items which are not common
Output:
{1, 2, 3, 4, 5, 6}
{3, 4, 5}
False
{1, 2}
{1, 2, 6}

Limitation of Sets:

Sets have a limitation that it can not be used as Key in Dictionary. But, in real life, there are situations that a developer needs to consider set as Key in Dictionary variable. There is one more limitation with the set is that set of sets is not possible. The solution is Frozenset Data Structure which overcomes both limitations of set data structure.

Dictionaries:

A dictionary is an unordered sequence of items. Each item has a key and a value. Items can be accessed using a key. Keys should be unique in the dictionary. A phone book is one example of a dictionary data structure. It holds Name as Key and contact detail as value.

DictionarySyntax
DictionarySyntax

Below is the example of few dictionary methods:

dict = {'first':'TestDict', 'second':[1,2]}
print(dict.keys())   # displays all the Keys
print(dict.values()) # displays all the Values
print(dict['first']) # displays the value of given Key ‘First’ . Value of this key is TestDict
print(dict.items())    # displays all the Keys and values
print(dict.get('first')) # displays the value of given Key ‘First’ . Value of this key is TestDict
print(dict.pop('first')) # displays the value of given Key ‘First’ and delete this item as well
print(dict.items())      # Now there is only one item in dict after pop()
Output:
dict_keys(['first', 'second'])
dict_values(['TestDict', [1, 2]])
TestDict
dict_items([('first', 'TestDict'), ('second', [1, 2])])
TestDict
TestDict
dict_items([('second', [1, 2])])

Limitation of Dictionaries:

The only limitation is that it is unordered data structured. Sometimes, item order needs to be maintained, in which items have been inserted. The Solution is the OrderedDict data structure that is available in the Collections module.

Frozensets:

Now there are few limitations with set data structure.

    1. 1.Sets are mutable, therefore cannot be used as keys in dictionaries.
    1. 2.Another problem is that sets themselves may only contain immutable values, and thus may not contain other sets.

But sets of sets (Nested Sets) often required, Hence Python provide frozenset data structure. Below code snippet shows how frozenset resolved the issues with sets:

a = set([1, 2, 3])
b = set([2, 3, 4])
print(a.add(b))
Output:  TypeError: unhashable type: 'set' . Sets cannot be added

Now replace the add() with frozenset as below. See how the issue of the set has been resolved.

a = set([1, 2, 3])
b = set([2, 3, 4])
#print(a.add(b))
a.add(frozenset(b))
print(a)
Output:   {1, 2, 3, frozenset({2, 3, 4})}

A set data structure cannot be used as a key in the dictionary so Frozenset is the solution.Lets try with set data structure and see what happens in the code:

dictsetkey = {set([1, 2]):1}
Output: TypeError: unhashable type: 'set'

Now replace set with frozenset and see the magic.Error has disappeared:

dictsetkey = {frozenset([1,2]): 'Frozenset as Key in dictionary'}
print(dictsetkey[ frozenset([1,2]) ])
Output: Frozenset as Key in dictionary

Collection and heapq modules provide additional data structure with robust features. We will see one example of the collection module.

Deque:

Deque stands for double-ended queues. You can add and remove items from both sides unlike lists. It can also be useful if items should be deleted in the same order as added. This is available in the collection module. Below is the code to create a deque data structure and to add an item at both ends.

from collections import deque
doubleque = deque(range(5))
print(doubleque)
doubleque.append(5)
print(doubleque)
doubleque.appendleft(6)
print(doubleque)
doubleque.pop()#item deleted from right
doubleque.popleft()#item deleted from left
Output:
deque([0, 1, 2, 3, 4])
deque([0, 1, 2, 3, 4, 5])
deque([6, 0, 1, 2, 3, 4, 5])
5
6

One interesting feature of the deque is rotating.you can rotate the last item as the first item.

print(doubleque)
q.rotate(1)
print(doubleque)
Output:
deque([0, 1, 2, 3, 4])
deque([4, 0, 1, 2, 3])

Heapq module allows you to add objects in arbitrary order and at any time (possibly in-between the adding) find (and possibly remove) the smallest element. These data structures are related to heap manipulation.
You can see that these additional data structures are robust. There are many other advanced data structures available in Python. For in-depth knowledge, you need to get your hands dirty in the coding. We will cover these additional powerful data structures in separate articles.

Pankaj Kumar
Pankaj Kumar
Articles: 207