5/10/19

Simplified Algorithm to find largest Color Blob in a Grid (in Python and C++)

I have taught how to use algorithms and data structures for years as a professor. And many students think that just learning about algorithm and data structure is about choosing the correct implementation of a queue, search, standard graph algorithm, etc. from the either a library or canned code they learned in their class.

The best students can use a deep knowledge of how each algorithm and data structure works and then design variations or combinations of the original structures and algorithms to solve a problem. This shows an real application of knowledge and the ability to think out of the box that interviewers look for.

This article will show how I created a custom algorithm that combines a 2d array, and linked list, and some other 'tricks' to solve the problem of given a 2d grid where the cells of the grid are colored(based in integer in grid). I believe this is a very efficient, and with a small amount of code to solve this problem.

I was inspired to develop this method after the article: Bet you can't solve this Google interview question! by Kevin Ghadyani and then also by the Geeksforgeeks article: Largest connected component on a grid .

The Problem:

Find the size of the largest contiguous color area (the number cells in the color blob). For example given this grid with 4 different colors

[[1, 4, 4, 4, 4, 3, 3, 1], 
 [2, 1, 1, 4, 3, 3, 1, 1], 
 [3, 2, 1, 1, 2, 3, 2, 1], 
 [3, 3, 2, 1, 2, 2, 2, 2], 
 [3, 1, 3, 1, 1, 4, 4, 4], 
 [1, 1, 3, 1, 1, 4, 4, 4]]

... which represents this colored grid:

You can see the biggest color group are the underlined 1's, there are 9 of them

The approach:

There are a lot of ways to approach this problem. Most articles will create a graph and do depth first search recursivly or bredth first search. You need to keep track of what cells have been visited and have a way to pick the next un-processed cell.

When every you are working a problem with a grid that is represented by a 2d array, you don't need to use a generalized graph alorithm such as a graph library that keeps a map of nodes and list of adjacent nodes. The indexes to the 2d array serve as a constant time lookup, and the adjacentcy list is just the four directions left, right, top or bottom limited just limited at the outer bounds of the gird. You do need an effencient way to keep track of visited and to pick a unvisited cell to continue. You can easily do this by having each cell(node) have composit data of not just the color number, but a visited flag.

Below is both a python 3.6 version and a C++ version:

Python 3.6:

C++:

Hope you enjoyed this. -- Professor Jenkins

Check out my youtube channel https://www.youtube.com/user/gjenkinslbcc

3/8/19

I have made avaliable teaching videos for Excel from my classes I taught.

These videos were created for concepts in an introduction to Excel class that covers topics that I found students were having trouble with. They may be for an older version of Excel, but not much has changed in Excel for these basic concepts.

See https://teklern.blogspot.com/p/excel-teaching-videos.html

2/27/19

New Python 3.8 Assignment Expression Feature

Video on using Python3.8 Assignment Expressions, a great addition to python3.8, available now in alpha release to try it.



1/12/19

Python f-strings. Formating faster and better

Formatting data into strings for the user or other consumption is now faster and easier with python f-strings since python 3.6. This will give you the basics.

1/8/19

Python: Using nested multiple assigment in your code



Go past simple multiple assignment taught in beginning python courses to understand unpacking nested sequences with these techniques. (also known as deep unpacking)



11/22/18

Python Tuples: detailed overview

Python tuple detailed

Contents

    Intro to tuples
    Creating a tuple
    Visual showing how immutable works
    A look at dir() of list vs. tuple
    Tuple operations
    What is packing/unpacking
    Packing/unpacking in function parameters
    * in assignment on right side is unpacking
    * used on left for packing
    Nested assignment by example
    Combining * and nesting
    Extra: python bytecodes for creating tuples

Intro to tuples (smaller, faster, safer than lists)

Tuples are a workhorse data structure in Python for both the programmer and internally. They are faster and smaller than lists. The are involved with function parameters, packing and unpacking, and the multiple assignment statement. This article will detail everything I could find that you should know about them from basics to advanced.

b = ( 1, 2, 'abc')  # tuple - immutable, can't change the items referenced
a = [ 1, 2, 'abc']  # list -- you can add, change, or remove items referenced

both tuples and lists can reference any type of data 
unlike languages like C++ or java arrays

A tuple is an array of object references and it is ordered and indexed from zero.
You cannot modify a tuple entry (immutable), but you can modify the object that is referenced by a tuple slot.

Note that all the basic data types in python that you can write as a literal: int, float, string, complex, boolean are also immutable.

Advantages of tuples over lists:

  • faster, and less memory than lists
  • immutable feature leads to less possibility of bugs
    since it can't be changed after it is created
  • tuples can be used for keys in dictionary

Disadvantage of tuples

  • you must add all the members of the tuple on creation
  • you cannot add, remove, insert, or rearrange the items in the tuple, you must create a new one

Creating a tuple

Simple tuples are a list of expressions in (), with a trailing , or not.

A trailing comma is only required if the list only has one item.

On the rare occation you need a empty tuple, you can use () or tuple().

t1 = (1, 1.4, 'hello', 22 + 5)  # using () with comma separated values/expressions
t2 = (1, 2, 4,) # you are allowed a trailing , in the list
t3 = () # creates empty tuple
t4 = tuple() # also creates an empty tuple
print(f'\nt4: {t1}\nt5: {t2}\nt6: {t3}\nt4: {t3}')

fig 1

This figure shows how each tuple slot actually is a reference to an object, and each object in a tuple and be different types unlike languages like C, C++, or java.

Below is code showing several ways to create a tuple, see the comments for descriptions of what is happening.

t2 = (55)  # This does not create a tuple but just the integer 55
print(f'\nt2 type: {type(t2)} value: {t2}')

t3 = (55,)  # trailing comma needed to create a tuple with one item
print(f't3 type: {type(t3)} value: {t3}')

# use the constructor with a generator, list, string or any other non infinite iterable
t4 = tuple(range(5))
t5 = tuple("hello") # -> ( 'h', 'e', 'l', 'l', 'o')
t6 = tuple(( i**2 for i in range(1,11))) # squares of 1 to 10
print(f'\nt4: {t4}\nt5: {t5}\nt6: {t6}')

tuple definitions allow for any multiline formating including variable indents inside the ( ):

Note: any data in (), [], {}, has this flexibility of 'escaping' fromm the indentation rules in python

The definition below shows an example of using this flexible indentation code formattiing:

foo_data = (
         10,
       30,  30,
    40,  50,   60,
  70,  80,  90, 100,)

Visual showing how immutable works:

If you use the link below, and step though the code line by line, you will see a demostration of what immutable means with a simple example.

http://www.pythontutor.com/visualize.html#mode=edit

list1 = ['a', 'b', 'c']
tuple1 = (1, 2, list1)
tuple1[2][0] = 43
tuple1[1] = 'x'  # will cause error

A look at dir() of list vs. tuple

Try this code to see a list of all the attributes that a list and a tuple have. The last group of code just shows you the attributes that a list has that is not shared with a tuple.

print('\n\n--list attributes:')
print('attributes:', dir(list()))
print('number:', len(dir(list())))
    
print('\n\n--tuple attributes:')
print('attributes:', dir(tuple()))
print('number:', len(dir(tuple())))
    
print('\n\n--attributes from list that are not in tuple: set(list - tuple):')
diff = set(dir(list())) - set(dir(tuple()))
print('attributes:', diff)
print('number:', len(diff))

Tuple operations:

These are all the basic operations and functions on a tuple, this is well covered in any python course, so I will not elaborate here.

operation code, t is tuple description
len len(t) number of elements in t
in 43 in t true if 43 in t
min min(t) smallest value in t
max max(t) largest value in t
count t.count(3) number of times 3 occurs in t
constructor tuple(range(4)) new tuple from iterable
index t[3] fourth item in tuple
slice t[start:end:step] normal slice and variations same as lists
add (1,2) + (8,9) concatenate two tuples into new tuple (1,2,8,9)
multiply (1,9) * 3 concatenate itself 3 times (1,9,1,9,1,9)

What is packing/unpacking (moving to more advanced concepts)

packing: placing a list of values into a tuple
unpacking: copying values from a tuple into individual variables

Examples:

person1 = ("joe","thomas", 45, 54_300.00) # packing, it is also just tuple creation
print(person1)

person1 = "joe","thomas", 45, 54_300.00   # note that () are not needed as shown here
print(person1)

(first, last, age, salary) = person1     # unpacking tuple into individual variables
first, last, age, salary = person1       # () are not needed here either
print( f"first: {first}, last: {last}, age: {age}, salary: {salary}")

# doing both at once is how multiple assignment works
first, last, age, salary = "mary", "shelly", 33, 88_500.00
print( f"first: {first}, last: {last}, age: {age}, salary: {salary}")

t1 = ((1,2), (1,3), (0,8), (9,9), (10,1))
for x, y in t1:  #  unpacking in for loop
    print(f'point@ ({x}, {y})')
    
def return_three():
    return 1,2,3 # packing multiple return values
    
x, y, z = return_three() # unpacking multiple values from function
print(x, y, z)

Packing/unpacking in function parameters

* used in front of a parameter in a call will unpack list or tuple

# first setup a function with three parameters
def print3(a, b, c): 
    print(a, b, c)

print3(1, 2, 3)

list1 = (1, 2, 3)
 
print3(*list1) # * causes unpacking of tuple or list into argument variables

* used in def of function before a parameter will pack parameters into tuple

def printn(*args): # packs parameters into tuple args
    print('len', len(args))
    print('args:', args)

printn(1, 2, 3)
printn(43)
printn(*list1, 4) # expand list1 into first three parameters and then 4 as fourth parameter

* in assignment on right side is unpacking

list1 = (1, 2)
list2 = (5, 6)
x = list1, list2
print(x)

x = *list1, *list2 # same as  list1 + list2
print(x)

x = list1 + list2
print(x)

* used on left for packing

first, *rest = (1, 2, 3, 4, 5)
print(f"first: {first}, rest: {rest}")

first, second, *rest = (1, 2, 3, 4, 5, 6)
print(f"first: {first}, second: {second}, rest: {rest}")

# *half1, *half2 = (1, 2, 3, 4, 5, 6) # can't do this

first, *mid, last = (1, 2, 3, 4, 5, 6)
print(f"first: {first}, mid: {mid}, last: {last}")

Nested assignment by example

Nested assignment is where the left side of the assignment has nested tuples with varible names that then recieve a corresponding nested structure on the right side.

point1 = (1, 2)
point2 = (5, 6)
color = "blue"
line1 = (point1, point2, color)
print(f"line1: {line1}")

p1, p2, c = line1
print('p1, p2, c', p1, p2, c)
print(f"p1: {p1}, p2: {p2}, c: {c}")

(x1, y1), p2, c = line1
print('x1, y1, p2, c2:', x1, y1, p2, c)
print(f"x1: {x1}, y1: {y1}, p2: {p2}, c: {c}")

(x1, y1), (x2, y2), c = line1
print(f"x1: {x1}, y1: {y1}, x2: {x2}, y2: {y2}, c: {c}")

Combining * and nesting

You can combine nesting and packing as shown below.

x = (1, 2, 3)
y = (7, 8, 9)
z = (x, y)  # ((1, 2, 3), (7, 8, 9))
(a, *b), (*c, d) = z
print(f"a: {a}, b: {b}, c: {c}, d: {d}")

Extra: python bytecodes for creating tuples

This shows you a dissasembly to python bytecodes for common tuple creation. It show the big differnce if the tuple is formed from literal basic values, or from variables or none literal values

from dis import dis
def dis_this(stmt):
    print(stmt,":")
    print(dis(stmt))
    
a = 123
b = 65

dis_this('x = (a, b)')  ## tuple from non literal values
dis_this('x = (1, 2)')  ## tuple from literal values

Output from these two lines show the difference between a all literal tuple, and one that has non-literal values:

x = (a, b) :
  1           0 LOAD_NAME                0 (a)     -- since a and b are not literals
              2 LOAD_NAME                1 (b)     -- they are loaded on stack
              4 BUILD_TUPLE              2         -- and the special BUILD_TUPLE creates
              6 STORE_NAME               2 (x)     -- a new tuple and assigns to x
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

x = (1, 2) :  -- when literal values are used, tuples are stored whole and just referenced
  1           0 LOAD_CONST               3 ((1, 2))
              2 STORE_NAME               0 (x)
              4 LOAD_CONST               2 (None)
              6 RETURN_VALUE

Try these and see the bytecodes generated:

dis_this('x = (1, True, "hello", 1.5, 1+3.5j)')
dis_this('x = tuple(range(3))')
dis_this("x = (1,2,('green','blue','red'))")
dis_this("x = ([1,2],[3,4])")

I hope you enjoyed this article. Please sign up to my Youtube Channel: YouTube Signup