Lecture 28 Python References

1. The '__name__' special variable:

You may have seen " if __name__ == '__main__' " in many Python programs, online or elsewhere, and wondered what it means and why people use it. Here we answer those questions with some very simple examples:

In the following window, the left half is the Python file called 'test_subFunc.py' and the right half is another Python file called 'test_main.py':



I move the cursor into the test_main.py file's window and type ' ! python %' to run the code, then I get this:



This means the '__name__' special variable in the current file is '__main__'. Keep in mind that the test_main.py file and the test_subFunc.py file are not connected to each other yet: I haven't invoked the test_subFunc.py file from the test_main.py file. They are two independent files.

Now you must be wondering what the '__name__' special variable looks like if I put it in the sub function file and just run that file. (I commented out the print(__name__) in the test_main.py file this time.)






So the conclusion is that the name of your file doesn't matter. As long as you run the current file directly as the 'Main Function', the '__name__' special variable will be '__main__'. It is telling you: I am now being executed as a 'Main Function', even though I am a 'sub function' for the real 'Main' function in the project.
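Here is a minimal sketch of the experiment above, assuming each file contains nothing but print(__name__). The helper run_directly is a hypothetical name introduced for illustration: it writes the file into a temporary folder and runs it with the same interpreter, which plays the role of typing ' ! python %' in the editor.

```python
import pathlib
import subprocess
import sys
import tempfile


def run_directly(filename):
    """Write a one-line script that prints __name__ and run it as a program."""
    with tempfile.TemporaryDirectory() as folder:
        script = pathlib.Path(folder) / filename
        script.write_text("print(__name__)\n")
        finished = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True, text=True, check=True,
        )
        return finished.stdout.strip()


name_in_main = run_directly("test_main.py")
name_in_sub = run_directly("test_subFunc.py")
print(name_in_main)  # __main__
print(name_in_sub)   # __main__
```

Both files report '__main__' because each one was started directly, regardless of its file name.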

Now something interesting will happen. If we call the sub function from the main function (that is, import test_subFunc.py into test_main.py) and print the '__name__' variable from the sub function, the '__name__' will be different:
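The following sketch reproduces this step under the assumption that test_main.py simply imports test_subFunc and that both files print their own __name__. The file contents are my own illustration, not the exact code from the screenshots.

```python
import pathlib
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as folder:
    root = pathlib.Path(folder)
    # The sub file prints its own __name__ whenever it is loaded.
    (root / "test_subFunc.py").write_text("print('sub file:', __name__)\n")
    # The main file imports the sub file, then prints its own __name__.
    (root / "test_main.py").write_text(
        "import test_subFunc\n"
        "print('main file:', __name__)\n"
    )
    finished = subprocess.run(
        [sys.executable, str(root / "test_main.py")],
        capture_output=True, text=True, check=True,
    )

lines = finished.stdout.splitlines()
print(lines)  # ['sub file: test_subFunc', 'main file: __main__']
```

The imported file sees its own module name, while the file that was started directly still sees '__main__'.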





So why does Python have this feature?

The '__name__' special variable actually works like a 'Flag' that shows whether the current file is being executed as a 'Main Function' or a 'Sub Function'.

It is very useful when you have a main function and many sub functions. Of course the main function will call the sub functions to complete the work. However, you may still have some 'testing code' in a sub function file that you don't want to be executed when the file is invoked by the Main Function as a Sub Function.

For example, this is the same code as the one above, but I put a condition checker " if __name__ == '__main__' " in front of the previous code:



Then I get the following errors when I run Python from the test_main.py file, because the special variable __name__ in the test_subFunc.py file is not equal to '__main__':



However, if I change it to: if __name__ == 'test_subFunc', and run it using the test_main.py Main Function, then I get this:





This shows that if the test_subFunc.py file is executed as a sub function of the Main Function, the __name__ variable in the test_subFunc.py file is the same as the file's name without the extension: 'test_subFunc'.

If I run the code inside the test_subFunc.py file, the __name__ variable will be '__main__', and the code will be executed:





Conclusion:

People commonly put " if __name__ == '__main__' " in sub functions or sub modules so that the testing code and function calls underneath it run only when the file is executed as a 'Main Function'. When the file is invoked by the real 'Main' function of the project, the code underneath " if __name__ == '__main__' " won't be executed.
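As a sketch of this conclusion, the example below puts some testing code under the guard in test_subFunc.py and then runs that file both ways. The function double and the helper run are hypothetical names chosen for illustration.

```python
import pathlib
import subprocess
import sys
import tempfile

SUB_FILE = '''\
def double(x):
    return 2 * x

if __name__ == '__main__':
    # Testing code: runs only when this file is executed directly.
    print('self-test:', double(21))
'''

MAIN_FILE = '''\
import test_subFunc
print('main got:', test_subFunc.double(5))
'''


def run(root, filename):
    """Run one of the files as a program and return its printed lines."""
    finished = subprocess.run(
        [sys.executable, str(root / filename)],
        capture_output=True, text=True, check=True,
    )
    return finished.stdout.splitlines()


with tempfile.TemporaryDirectory() as folder:
    root = pathlib.Path(folder)
    (root / "test_subFunc.py").write_text(SUB_FILE)
    (root / "test_main.py").write_text(MAIN_FILE)
    direct_output = run(root, "test_subFunc.py")
    imported_output = run(root, "test_main.py")

print(direct_output)    # ['self-test: 42']
print(imported_output)  # ['main got: 10']
```

When test_main.py imports the file, the self-test line never appears, but the function double is still available to the main program.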


2. Understanding Python variables as pointers

We know that pointers are something like this:



In Python, you do not need to use the '*' sign to define a pointer. However, in Python, a variable always stores a reference to the real value, which is stored somewhere else in memory. Understanding that variables in Python are pointers is essential to understanding the Linked List data structure, which will be covered in the near future.

Before we start looking at variables being used as pointers in Python, let's look at cases where variables seem not to be pointers but just names or aliases of the real value stored in memory (they appear under different names in your text editor but end up referring to the same thing). Strictly speaking these variables are references too; CPython simply reuses one object for small integers and for many identical short strings, so equal values end up at the same address.
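A small sketch of what "a variable stores a reference" means in practice. The variable names are my own; the point is that assignment does not copy the list, it only gives the same object a second name.

```python
scores = [90, 85, 70]
same_scores = scores          # no copy: both names refer to one list object

same_scores.append(100)       # change the object through the second name

print(scores)                          # [90, 85, 70, 100]
print(scores is same_scores)           # True
print(id(scores) == id(same_scores))   # True
```

In CPython, id() returns the memory address of the object, so comparing ids is the same as comparing addresses.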

- Integers
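A minimal sketch for the integer case. The helper address is a hypothetical stand-in for the memory check function (I am assuming it simply shows id() in hexadecimal), and the result relies on CPython reusing one object for small integers.

```python
def address(obj):
    """Hypothetical memory check helper: the object's id as a hex string."""
    return hex(id(obj))


var1 = 10
var2 = 10      # same value, written a second time
var3 = var1    # plain alias of var1

print(address(var1), address(var2), address(var3))
print(var1 is var2)  # True in CPython
print(var1 is var3)  # True
```

All three names show the same address, so here the variables behave like aliases of a single stored value.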



- Strings

Please keep in mind that the memory check function is defined at the top of the file and is not shown in the following code:
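A sketch of the string case, again with a hypothetical address helper standing in for the memory check function. It assumes CPython, which reuses one object for identical string literals that appear in the same file.

```python
def address(obj):
    """Hypothetical memory check helper: the object's id as a hex string."""
    return hex(id(obj))


var1 = "hello"
var2 = "hello"   # the same literal written again
var3 = var1      # plain alias of var1

print(address(var1), address(var2), address(var3))
print(address(var1) == address(var2))  # True in CPython
```

As with small integers, the two names end up at the same address even though the value was written twice.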



- Tuples

Things start getting different.



Apparently, var1 and var2 here are just pointers.

However, if I assign var1 to var2, then var2 is nothing but an 'alias' of the pointer var1, so var1 and var2 are the same thing and have the same address:
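The two tuple situations can be sketched together as below. One assumption to flag: writing the literal (1, 2, 3, 4) twice in one file may let the CPython compiler reuse a single constant, so this sketch builds the tuples from a list at run time to make sure they are separate objects.

```python
def address(obj):
    """Hypothetical memory check helper: the object's id as a hex string."""
    return hex(id(obj))


numbers = [1, 2, 3, 4]
var1 = tuple(numbers)   # built at run time
var2 = tuple(numbers)   # a second, separate tuple with equal contents
var3 = var1             # alias of var1

print(address(var1), address(var2), address(var3))
print(var1 == var2)  # True: same contents
print(var1 is var2)  # False: two different objects
print(var1 is var3)  # True: one object, two names
```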



More interestingly, I printed the tuple (1,2,3,4) in each of the two programs above and got two different addresses. So the memory assignment of tuples is dynamic.

Let's try one more thing: if I print the address of the same tuple multiple times, let's see what the addresses look like:



All the addresses are different.

- Lists

Let's do the same thing to lists as we did for tuples:



All these three addresses are different.

Of course, if I only create an 'alias' of var1 called 'var2', the addresses of var1 and var2 will be the same.
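A short sketch of both list cases, with variable names of my own choosing. Two lists written separately are always two objects, while an alias is just another name for the same one.

```python
var1 = [1, 2, 3, 4]
var2 = [1, 2, 3, 4]   # equal contents, but a separate list object
var3 = var1           # alias of var1

var3.append(5)        # modify the list through the alias

print(var1 is var2)   # False
print(var1 is var3)   # True
print(var1)           # [1, 2, 3, 4, 5]
print(var2)           # [1, 2, 3, 4]
```

The change made through var3 shows up in var1 but not in var2, which is exactly what the addresses predict.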



HOWEVER, if I print the address of the same list multiple times, all the addresses are the same:
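I am assuming here that "printing the same list multiple times" means asking for the address of a freshly written list each time. Under that assumption, the sketch below shows one likely explanation in CPython: a temporary list is released right after its address is taken, so the next temporary list can reuse the same memory. Lists that are kept alive at the same time cannot share an address.

```python
def address(obj):
    """Hypothetical memory check helper: the object's id as a hex string."""
    return hex(id(obj))


# Each list here is temporary: it is released as soon as address() returns.
temporary_addresses = [address([1, 2, 3, 4]) for _ in range(3)]

# These three lists are all kept alive in 'kept'.
kept = [[1, 2, 3, 4] for _ in range(3)]
kept_addresses = [address(item) for item in kept]

print(temporary_addresses)  # often three identical addresses in CPython, not guaranteed
print(kept_addresses)       # three different addresses
```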



This is very different from what we had with tuples.

Tuples and lists also differ in how they are stored, which is why creating a tuple is faster than creating a list:

In Python, lists are mutable objects and tuples are immutable objects. In CPython, a tuple is stored in a single block of memory. A list is allocated in two blocks: a fixed-size one with the Python object information (such as its type and length) and a variable-sized block for the data. This is the reason creating a tuple is faster than creating a list.
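If you want to check the speed claim on your own machine, a rough sketch with the standard timeit module could look like this. The element values and the repeat count are arbitrary choices of mine, and the timings will vary between computers and Python versions.

```python
import timeit

# Build both containers from variables so that neither one can be
# precomputed as a constant before the timing starts.
setup = "a, b, c, d = 1, 2, 3, 4"

tuple_seconds = timeit.timeit("(a, b, c, d)", setup=setup, number=200_000)
list_seconds = timeit.timeit("[a, b, c, d]", setup=setup, number=200_000)

print(f"tuple creation: {tuple_seconds:.4f} s")
print(f"list creation:  {list_seconds:.4f} s")
```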

- Dictionary

Again, first of all, these three addresses are all different:



If I create an 'alias' of var1 as var2, var1 and var2 are the same thing.
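The dictionary case can be sketched the same way as the list case. The keys and values below are placeholders of my own.

```python
var1 = {'a': 1, 'b': 2}
var2 = {'a': 1, 'b': 2}   # equal contents, but a separate dictionary
var3 = var1               # alias of var1

var3['c'] = 3             # modify the dictionary through the alias

print(var1 is var2)   # False
print(var1 is var3)   # True
print(var1)           # {'a': 1, 'b': 2, 'c': 3}
print(var2)           # {'a': 1, 'b': 2}
```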



If I print the address of the dictionary multiple times, all the addresses are the same. This behaves similarly to the list.




When you program with Python you cannot visualize these things, but there is a lot going on behind the scenes!


3. Now let's take a closer look at the list

If I make a shallow copy of lst1 called lst2, and lst1 has a sublist, then modifying only the sublist modifies both lst1 and lst2:
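A minimal sketch of this behavior, using copy.copy from the standard library for the shallow copy. The list contents are my own example values.

```python
import copy

lst1 = [1, 2, [3, 4]]
lst2 = copy.copy(lst1)     # shallow copy: only the outer list is new

lst2[2][0] = 'changed'     # modify the sublist through lst2

print(lst1)  # [1, 2, ['changed', 4]]
print(lst2)  # [1, 2, ['changed', 4]]
```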



Every time a list is created, a 'list' object is actually being instantiated from the 'List Class'. Every list object has a different address; they are different things in memory. It is like two Toyota Camry cars being assembled in the factory: they both belong to the Class 'Toyota Camry', but they are two different individual cars.

But why are the elements in both sublists changed at the same time?

The 'shallow copy' operation in this example creates a new list object that stores all the references/pointers pointing to the same data set:
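The addresses make this visible. In the sketch below (example values are mine, and list.copy() is used as another way to make a shallow copy), the two outer lists live at different addresses, but the sublist inside them is one and the same object.

```python
lst1 = [1, 2, [3, 4]]
lst2 = lst1.copy()          # list.copy() also makes a shallow copy

print(hex(id(lst1)), hex(id(lst2)))        # two different addresses
print(hex(id(lst1[2])), hex(id(lst2[2])))  # the same address twice

print(lst1 is lst2)         # False: two outer list objects
print(lst1[2] is lst2[2])   # True: one shared sublist
```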



If you modify an element of lst2 outside of the sublist (at the first level), the 'Red Pointer' in the figure above will be pointed to a different value only for lst2:
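A sketch of a first-level change after a shallow copy, with example values of my own. Only the reference stored in lst2 is replaced, so lst1 keeps its original element.

```python
import copy

lst1 = [1, 2, [3, 4]]
lst2 = copy.copy(lst1)

lst2[0] = 100      # replace a first-level element in lst2 only

print(lst1)  # [1, 2, [3, 4]]
print(lst2)  # [100, 2, [3, 4]]
```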





If the second level (the sublist) is modified through lst2, the 'Red Pointers' won't be modified but the 'Blue Pointer' will. However, both 'Red Pointers' are still pointing to the same sublist, so lst1 and lst2 will appear changed at the same time:



When the user is modifying lst2, they usually don't want to modify lst1 as well. So this is actually a problem! The way to solve this problem is to do a 'deep copy' operation:



Now, when you try to modify the sublist of lst2, only lst2 is modified.
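The deep copy can be sketched with copy.deepcopy from the standard library. The list contents are example values of my own; the important part is that the sublist is copied too, so the two lists no longer share anything.

```python
import copy

lst1 = [1, 2, [3, 4]]
lst2 = copy.deepcopy(lst1)   # copies the outer list and the sublist

lst2[2][0] = 'changed'       # modify the sublist through lst2

print(lst1)                  # [1, 2, [3, 4]]
print(lst2)                  # [1, 2, ['changed', 4]]
print(lst1[2] is lst2[2])    # False: each list has its own sublist
```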





Task 1:

1. In section 3, I showed you the sublist modification issues for the 'shallow copy' operation and solved the issue using the 'deep copy' operation.
Now, I have a dictionary that contains a sub-dictionary in it:

dict1={'Jack':{'Python':'Proficient', 'C++':'Fair'}, 'Mike':'C++'}

I am trying to create a new dictionary 'dict2' from 'dict1', and change the 'C++' entry's value of 'Jack' to 'Proficiency', and this only applies to dict2.

As you know from the tutorial, a shallow copy may not work and you'll need a 'deep copy' operation to do this.

The task is: Create a tutorial similar to what I did for the sublist issue, show the addresses, explain why a shallow copy won't work. Put your tutorial in a Microsoft Word file, include all the snapshots of the code, the addresses, and any figures that may help you to explain this issue.