
Automate the boring stuff with python review

Hi friends, I am starting to read the book Automate the Boring Stuff with Python, by Al Sweigart. In this blog post, I will keep updating my experience with the book. You can also read it online from here. I recently uploaded a Selenium-based YouTube automation bot to GitHub and shared the link on Reddit to get reviews of it. There, one user suggested that I read this book. So, we will start reading it soon. Hang on with me to get important notes and reviews on the book.

So, I read the introduction, and the author nicely explains how he came to see the need for automating, or programming solutions for, daily boring stuff like searching a CSV for a value. Obviously, this was good motivation to start reading the book. Although I am well past the very basic Python of chapters 1 and 2, I have decided to read each and every chapter to get a feeling of completion.
To support the author and help your eyes, you can also buy the book:

Chapter 1: Python basics:

Read chapter 1 with me!
This is a good first introduction to the language. It is clear that the author has kept the bar really low, so anyone with no programming background can start reading this book; it begins by testing out the interactive shell with 2 + 2, which gives 4.
The concepts discussed here explicitly and implicitly are:
(1) basic expressions: writing arithmetic expressions and small function calls in the interactive shell and testing them out.
(2) different types of errors: the author introduces these to make the reader comfortable around the IDE and Python, since errors come up often and can be frightening. Here, he has discussed some of the most basic errors, like
TypeError: raised when an operation is applied to a value of an unsuitable type. E.g. if I try to multiply "ram"*"Rahim", I get an error, as both operands are strings and the multiplication operator cannot work on two strings. Here he also points out one of Python's unusual string manipulations, where "python"*3 gives pythonpythonpython.

Syntax error: He has pointed out the syntax error. Syntax here is just like the grammar of spoken languages: special rules for how statements are written, including commas, string quotation marks and what not. Basically, a syntax error occurs whenever a line of code breaks these rules, so Python cannot parse it. Python reports it with a snippet of the offending line and a ^ sign under the place of the violation.
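To make the two error types concrete, here is a minimal sketch. The broken line passed to compile() is my own example, not one from the book; compile() is used only so that the syntax error can be caught instead of stopping the script.

```python
# String replication: a string times an integer repeats the string.
repeated = "python" * 3
print(repeated)

# Multiplying two strings is not defined, so Python raises TypeError.
try:
    "ram" * "Rahim"
except TypeError as error:
    type_error_message = str(error)
    print("TypeError:", type_error_message)

# A syntax error is found before the code runs. Here the closing quote
# is missing (an illustrative broken line, chosen for this sketch).
try:
    compile("print('hello)", "<example>", "exec")
    syntax_error_caught = False
except SyntaxError:
    syntax_error_caught = True
print("SyntaxError caught:", syntax_error_caught)
```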

The author has nicely started by explaining how the logic works in Python, with flow chart diagrams and all. He has described some of the most used functions, like len(), and the typecast functions that convert one type of data to another, like int(), float() and str(). Here, I think the functions could have been presented more formally, with their argument types and what they do, the way the official documentation pages present them. He could also have provided a link to a more formal description. But anyway, he has explained them quite well in a slightly loose way, so it is all good.
In the extra credit, though, he has suggested that the reader look up the built-in functions. As you can see, I have added a link to the official documentation for the same. If you are a beginner, it is good to note that you should always try reading the official documentation first if the function is not too hard. For complex and very detailed functions, it is okay to use tutorials and blogs.
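A small sketch of len() and the typecast functions mentioned above. The sample values are made up for illustration.

```python
name = "Alice"             # hypothetical sample value
name_length = len(name)    # number of characters in the string
print(name_length)         # 5

age_text = "29"            # text, e.g. as typed by a user
age = int(age_text)        # str -> int, so arithmetic works
print(age + 1)             # 30

pi_value = float("3.14")   # str -> float
truncated = int(7.9)       # float -> int drops the fractional part
label = str(age) + " years"  # int -> str, so it can be concatenated
print(pi_value, truncated, label)
```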

Next, in this chapter, we learn how to write a Python file, save it as filename.py and then run it by pressing F5 or by choosing Run --> Run Module from the menu. So, it is pretty cool.

As a surprise to me, the chapter ends with practice questions. I will not answer them in this post, but if any reader wants I can edit it later to do so. (that can be you too ;) )

Now, I am going to read chapter 2.

Chapter 2: Flow control

In the conclusion of chapter 1 itself, the author had said that chapter 2 would be about flow control.
I have now finished reading this chapter and have to say that it is really well written. For absolute beginners, who are supposed to read this book thoroughly, the flow charts introduce a formal way to design algorithms rather than just writing them out. This is good, as the visualization becomes easy.
The author has made this chapter really long, covering if, else, elif (a Python keyword; C/C++ spell it else if), while loops, for loops and other flow control statements. The trinket embedding is good, and my advice from my own experience is that if you are a beginner reading these chapters, it is best to try everything out thoroughly, command by command. It helps you remember the syntax, understand and practice programming, and it makes writing code easier later.
Here, our beloved author, since he is explaining the very basics, has skipped
(1) try and except
(2) mentioning that Python has no built-in switch/case statement (a match statement was added only in Python 3.10), but an elif chain helps to overcome that.
(3) mentioning one-liner conditionals, like print(3) if 1 == 2 else print(4).
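The three skipped items can be sketched as follows. The function names are hypothetical and chosen only for this illustration.

```python
def describe_number(n):
    """Classify n with an if/elif/else chain, standing in for switch/case."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"


def safe_divide(a, b):
    """Return a / b, or None when b is zero, using try and except."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


# One-liner conditional expression: value_if_true if condition else value_if_false
label = "equal" if 1 == 2 else "different"

print(describe_number(-5), describe_number(0), describe_number(7))
print(safe_divide(6, 3), safe_divide(1, 0))
print(label)
```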

He has explained the range function quite elaborately, though a good formal definition of the function would have been better, or a small animation explaining what the different arguments mean.
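In place of that animation, here is a quick sketch of what the arguments of range(start, stop, step) mean. The stop value is never included.

```python
# One argument: stop only, counting from 0.
up_to_five = list(range(5))
# Two arguments: start and stop.
two_to_five = list(range(2, 6))
# Three arguments: start, stop and step.
every_third = list(range(0, 10, 3))
# A negative step counts down.
countdown = list(range(5, 0, -1))

print(up_to_five)    # [0, 1, 2, 3, 4]
print(two_to_five)   # [2, 3, 4, 5]
print(every_third)   # [0, 3, 6, 9]
print(countdown)     # [5, 4, 3, 2, 1]
```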
There is a second part which talks about one of the most basic and needed things in Python: "import". He has nicely explained how to import and why to import. For all those people who were introduced to C first, this is exactly the analog of header files. The import statement is demonstrated by importing the random module and using one of the functions included in it, called randint. For importing, the general syntax is below:
import some_library as short_name_you_prefer
#use in codes

For example, if you want to import the popular module numpy, it is often done as below:
import numpy as np
variable = np.array([1, 2, 3])
Don't worry if you don't understand the above code; just understand that numpy is imported as np in your coding file/session/block, and that you have to put a dot between the short name and the function you want to use.
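The same pattern works with the standard library's random module that the chapter uses, so you can try it without installing anything. The alias rnd is just a name I picked.

```python
import random          # plain import: use random.randint(...)
import random as rnd   # import under a short alias: use rnd.randint(...)

# randint(a, b) returns a random integer N with a <= N <= b (both ends included).
dice_roll = random.randint(1, 6)
coin_flip = rnd.randint(0, 1)

print("dice:", dice_roll, "coin:", coin_flip)
```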

Also, I learned two new things. The first is the "continue" statement, which skips the rest of the current iteration and jumps straight to the next pass of the loop. The second: as I generally work in Jupyter, I never got the chance to know about the sys.exit() command, which ends the program; game programming always has something similar, for closing the window when the user hits the close (X) button. It was good to learn and may help in complex programs.
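A minimal sketch of both statements. The helper quit_if_requested is hypothetical; SystemExit is caught here only so that the demonstration keeps running after sys.exit() is called.

```python
import sys

# continue: skip the rest of this iteration and move to the next one.
odd_numbers = []
for n in range(1, 8):
    if n % 2 == 0:
        continue          # even numbers never reach the append below
    odd_numbers.append(n)
print(odd_numbers)        # [1, 3, 5, 7]


def quit_if_requested(command):
    """Hypothetical helper: end the program when the command is 'quit'."""
    if command == "quit":
        sys.exit()        # raises SystemExit, which normally ends the program
    return "continuing"


try:
    quit_if_requested("quit")
    exited = False
except SystemExit:
    exited = True
print("exit requested:", exited)
```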

Now, as chapter 2 is over, I will read chapter 3 tomorrow. See you guys tomorrow.
Chapter 3:

Read chapter 3 with me.
This chapter starts gently with the definition of a function and then delves into its different aspects. Mainly, it discusses,
(1)
  how to define a function in Python, what return means and what is returned when there is no return statement (the value None). Coming from more of a C background, I want to point out that, as everywhere in Python, you do not have to specify the return type or the argument types. For a variable number of arguments, there are *args and **kwargs. But those are deep concepts and I am not going into them. One can also search for lambda functions, which are basically small nameless functions used by experienced programmers to shorten long programs.
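Here is a short sketch of these points. All the function names are made up for illustration.

```python
def greet(name):
    """A function with an explicit return value; no types are declared."""
    return "Hello, " + name


def log(message):
    """A function without a return statement returns None."""
    print(message)


def total(*args, **kwargs):
    """*args collects extra positional arguments, **kwargs extra keyword ones."""
    scale = kwargs.get("scale", 1)
    return sum(args) * scale


# A lambda is a small nameless function; here it is bound to a name.
square = lambda x: x * x

greeting = greet("Al")
nothing = log("this function returns nothing")
print(greeting, nothing, total(1, 2, 3), total(1, 2, 3, scale=10), square(4))
```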

(2)
Al deeply explains the concept of global and local variables and the concept of scopes. This is one of the most important concepts in programming. You will probably have it already if Python is not your first language, but you must read these sections if you are a real beginner in coding. The main idea is that global variables are visible throughout the whole program, while local variables, the ones defined inside a function, are valid only inside that function. Note that, unlike in C, a loop or a conditional block does not create a new scope in Python; only functions (and modules and classes) do.
This portion is really important, as it is one of the points you will want to keep in mind while debugging. A lot of times you may mess up variables unknowingly: you are editing at the 500th line of your code, do not remember a variable name, and therefore reassign some already valid variable from the 20th line and get a TypeError because its type no longer matches. These things are therefore pretty important for writing long and broad code.
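The scope rules can be seen in a few lines. The variable and function names here are hypothetical.

```python
counter = 0  # global variable


def set_local_counter():
    counter = 100      # creates a new local variable; the global is untouched
    return counter


def increment_global_counter():
    global counter     # explicitly ask to modify the global variable
    counter += 1


local_result = set_local_counter()
print(local_result, counter)   # 100 0

increment_global_counter()
print(counter)                 # 1

# A loop does not create a scope: last_seen still exists after the loop.
for i in range(3):
    last_seen = i
print(last_seen)               # 2
```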

(3)
In the last two sections, the author describes error handling tactics, writes a function for a game and describes how it works.
Chapter 3 is all that. I will go to chapter 4 tomorrow. Once we get to the interesting portion, we will hit the speed.

Chapter 4:

read chapter 4 here with me.
This chapter deals with lists and tuples. Lists are objects in Python which work like arrays in C, but they have many more features.
The author has described very well the concepts of indexes, of the item types a list can hold (unlike a C array, a Python list can mix types), and of negative and positive indexes. He has described slicing very well too. Slicing uses the syntax L[a:b], which gives the sublist from index a to index b-1.
He has also described the len() function in this chapter. The syntax is len(list), and it returns the number of items in the list.
In the next sections, the author goes on to describe del, list assignment by index, etc. Looping over lists is described well in this chapter too. The general forms
"for item in list:" and "for i in range(len(list)):" are properly described in the looping section.
Other than this, different list methods such as sort(), remove() and append() are described.
The description of lists ends with an example program that puts lists to work.
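A compact sketch of the list features listed above. The sample list is made up.

```python
pets = ["cat", "bat", "rat", "elephant"]   # hypothetical sample data

first = pets[0]        # positive index counts from the start
last = pets[-1]        # negative index counts from the end
middle = pets[1:3]     # slice: indexes 1 and 2, index 3 is excluded
mixed = ["hello", 3.14, True, None]   # a list may mix types

# Two common ways to loop over a list.
for pet in pets:
    print(pet)
for i in range(len(pets)):
    print(i, pets[i])

pets.append("moose")   # add to the end
pets.remove("bat")     # remove the first matching value
del pets[0]            # delete by index (removes "cat")
pets.sort()            # sort in place
print(pets, len(pets))
```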

Then we move on to tuples in the next and last part of this chapter.

A tuple is nothing but an immutable version of a list, i.e. once the values are assigned they cannot be changed. Tuples are useful for storing important fixed values.
In the last few sections of this chapter, the author has described tuples and their general writing style, i.e.
tuple=(item1,item2, item3). Here one important thing to notice is that tuples, like lists, can hold items of different types.
The most important concept here is the reference, i.e. how variable assignment and argument passing work through names that refer to values. Read the book in detail to understand this concept. I would suggest that even advanced students read this part of the book.
Next, the copy module is mentioned, which provides the copy() and deepcopy() functions to copy lists and other mutable values. Then the chapter summary arrives and the chapter ends with practice questions.
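The ideas of immutability, references and copying can be sketched like this. The variable names are my own.

```python
import copy

point = (3, 4)
try:
    point[0] = 10              # tuples are immutable, so this fails
    tuple_changed = True
except TypeError:
    tuple_changed = False

original = [1, 2, [3, 4]]
alias = original               # no copy: both names refer to the same list
alias[0] = 99                  # the change is visible through both names

shallow = copy.copy(original)      # new outer list, inner list still shared
deep = copy.deepcopy(original)     # inner list is copied too
original[2].append(5)

print(tuple_changed)           # False
print(original[0])             # 99
print(shallow[2])              # [3, 4, 5]  (shared inner list)
print(deep[2])                 # [3, 4]     (independent copy)
```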

In this chapter, you can skip the list portion if you already use Python, but I suggest reading the tuple section for your knowledge and help.

Chapter 5: dictionaries and structuring the data:

Read chapter 5 with me, from this link.
Chapter 5 talks about the dictionary data type. It starts with a basic description of what a dictionary is and how to write one, i.e. by writing key1: value1 pairs inside curly braces { }, using commas to separate them. For dictionaries, you first have to properly learn three things:
(1) keys: they are like the indices in arrays or lists.
(2) values: these are what is stored for a specific key.
(3) you need a key to access its value.
In this chapter, the writer misses one simple but necessary fact: to initialize an empty dictionary you write
empty_dictionary = {}
There are other ways too, but this literal form is the quickest, so you should prefer it.
Now, he talks about a few good methods of the dictionary object. The methods you will need in order to use a dictionary in practical code are
(1) keys(): if you have a dictionary named my_dict, then my_dict.keys() gives you a view of its keys (wrap it in list() if you need an actual list).
Application: keys are needed when you want to
1. loop through the keys.
2. know the number of unique keys.
3. loop through the dictionary in linear time.

(2) values(): it similarly gives us the values.
One thing to note here is that a dictionary may have repeated values, but the keys never repeat.
(3) items(): this gives (key, value) tuples. Again, this is a good way to sort, search or loop through a dictionary based on conditions.
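A quick sketch of the three methods, with a made-up dictionary. The order shown relies on dictionaries keeping insertion order, which holds in Python 3.7 and later.

```python
ages = {"alice": 30, "bob": 25, "carol": 30}   # hypothetical sample data

key_list = list(ages.keys())       # keys are unique
value_list = list(ages.values())   # values may repeat (30 appears twice)
pairs = list(ages.items())         # (key, value) tuples
number_of_keys = len(ages)

# Loop over keys and values together, filtering on a condition.
thirty_year_olds = [name for name, age in ages.items() if age == 30]

print(key_list, value_list, pairs, number_of_keys, thirty_year_olds)
```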
Here, the author has done a very good job of explaining the related methods like get() and setdefault(), and some other things like pretty printing. After this, he continues with two practical examples where you get to know how to apply dictionaries to real stuff: a tic-tac-toe game and a chess game.
I really appreciate these two examples; beginners in Python should spend some time on these two and play with the ideas to get the hang of dictionaries.
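The get() and setdefault() methods and the pprint module mentioned above can be tried with a small character-counting sketch. The sample text and dictionary are my own.

```python
import pprint

message = "hello world"   # hypothetical sample text
counts = {}
for character in message:
    counts.setdefault(character, 0)   # insert the key with 0 only if it is missing
    counts[character] += 1

picnic_items = {"apples": 5}
cups = picnic_items.get("cups", 0)    # fall back to 0 instead of raising KeyError

pprint.pprint(counts)                 # prints the dictionary with sorted keys
print("cups:", cups)
```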
The part with the nested lists and dictionaries I would have left to figure out myself, but anyone who needs a more detailed description, or who is new to programming, should definitely grab the idea and maybe the code too. Nested objects are really useful and therefore should be learned and used.
So, I hope this is all for the chapter. One personal addition to the content, though, which gave me a lot of pain.
The concept is sorting a dictionary. If you are not interested, you can skip it, but I am afraid that if you use dictionaries, you will have to use it. I will put links to Stack Overflow discussions here for interested people.
First you need to know that dictionaries are not kept in sorted order. Unlike a list, a dictionary has no sort() method (since Python 3.7 it only remembers the order in which the keys were inserted). Therefore, when you want a dictionary in sorted order, you have to store or print the sorted result; the dictionary itself will not get sorted in place the way a list does with list.sort(). Now, I will tell you
How to sort a dictionary:
(1) by keys: sorted(my_dict) will give you the sorted keys of a dictionary named my_dict. You could also do that by
list_of_keys = list(my_dict.keys())
list_of_keys.sort()
(2) by values: this is where the printing as well as the doing gets a bit different. There is a module called operator, which provides function versions of Python's operators plus helpers such as itemgetter (don't get confused; you don't need to know or use the rest of it). So the code for sorting by values goes like this:
import operator
sorted(my_dict.items(), key=operator.itemgetter(1))
Now, the funny thing is that, by changing the itemgetter argument to 0, you can sort by key too: use sorted(my_dict.items(), key=operator.itemgetter(0)).
You can find the above from this link too.
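Put together as a runnable sketch, with a made-up dictionary of scores:

```python
import operator

scores = {"carol": 72, "alice": 95, "bob": 88}   # hypothetical sample data

sorted_keys = sorted(scores)                                   # keys only
by_value = sorted(scores.items(), key=operator.itemgetter(1))  # sort pairs by value
by_key = sorted(scores.items(), key=operator.itemgetter(0))    # sort pairs by key

# sorted() returns a list of (key, value) tuples; the original dict is unchanged.
# Rebuild a dict from it if you need one (keeps this order in Python 3.7+).
scores_by_value = dict(by_value)

print(sorted_keys)
print(by_value)
print(by_key)
print(scores_by_value)
```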
So, that's what dictionaries are about.
I will be back with chapter 6 soon. Do visit these links below for some different taste.

Chapter 6:
Read chapter 6 with me, here is the link for it.
This chapter is about string manipulation. The three parts of this chapter are:
(1) string writing tricks
(2) string functions for string manipulation
(3) string manipulation projects[not gonna write about these, check about them yourself]

In the first section, the author has gone through the ways to write Python strings. The ways mentioned are:
(1) writing within double quotes, so that the string can include single quotes.

(2) writing within triple quotes, i.e. ''' ''', which allows you to write text spanning multiple lines. This is generally used to provide documentation for Python functions, classes or files in general, as it allows long strings.

(3) writing a string to be taken exactly as typed, using the raw string feature. I did not personally know the exact description of this previously.
So, r'I am writing this, don\'t miss out final docs and subscribe today' will keep the whole string as it is, backslash included. Note that the r prefix changes only how backslashes are treated; a single quote with no backslash before it would still end the string and give a syntax error. Raw strings are generally useful for file paths on the computer, as these contain backslashes and cause errors if not given as raw strings, e.g.
open(r'c:\users\programs\anaconda') will be able to access the anaconda file properly, as here the file path is accepted as a raw string.

(4) writing strings using a backslash to denote escape characters. Here, a character that would otherwise have a special meaning, such as a quote that ends the string, is written with a backslash before it, so that Python reads it as an ordinary character; a backslash is also used to write special characters such as newline (\n) and tab (\t). E.g. \' will not mean the single quote that closes the string, but a quote character inside the string.
See the table 6.1 in the chapter to know properly about them.
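The four ways of writing strings can be compared side by side in a short sketch. The sample sentences are made up; the path is the one from the text above and is only stored, not opened.

```python
double_quoted = "That is Alice's cat."   # single quote inside double quotes
escaped = 'That is Alice\'s cat.'        # same text, using an escape character

multi_line = '''Dear Alice,
Your cat is fine.
Bob'''                                    # triple quotes span several lines

raw_path = r'c:\users\programs\anaconda'  # raw string: backslashes kept as typed
two_lines = "first\nsecond"               # \n is the newline escape character

print(double_quoted == escaped)   # True
print(multi_line)
print(raw_path)
print(two_lines)
```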

Now, in the next part, general functions for string manipulations are mentioned and described.
The first one is that you can treat any string much like a list. You can access the characters using indices and slice strings just as you would lists, i.e. 'string' behaves like ['s','t','r','i','n','g'] and therefore 'string'[0] gives 's'. Unlike a list, though, a string is immutable, so you cannot assign to an index.
You can use this if you feel like it, but for most tasks the string methods below are more convenient. Keep on reading to know more pythonic ways.
The second one discussed is one of the most important tools while using strings: the 'in' and 'not in' operators. The use is like below:
'hello' in 'hello debian!'
>True
So basically, in searches for the exact string on its left inside the string on its right and gives a Boolean output saying whether it is present or not.
This is generally used to search for a string inside another string and helps very much in real applications.
The 4 other important functions described next are:
lower()
upper()
islower()
isupper()
As is clear from the names, lower() returns the string in lowercase, upper() returns the string in uppercase, islower() checks whether the string is in lowercase and isupper() checks for uppercase.
These are methods of a string, and therefore they are applied in the way below:
string.function() --> string.upper()
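A small sketch of indexing, the in operator and the case methods, using the example string from above:

```python
greeting = 'hello debian!'

first_letter = greeting[0]          # index a string like a list
found = 'hello' in greeting         # substring search
missing = 'HELLO' not in greeting   # the search is case-sensitive

shouted = greeting.upper()          # returns a new string; greeting is unchanged
whispered = shouted.lower()

print(first_letter, found, missing)
print(shouted, shouted.isupper())
print(whispered, whispered.islower())
```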

Next in this chapter are the isX methods, such as
isalpha()
isnumeric()
and some other similar methods.
These methods check whether the string consists only of a specific kind of character, e.g. only letters or only numeric characters. Although he has covered them, I have not seen these methods much in real scripts, i.e. maybe they are not used that often.
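For completeness, here is a short sketch of how a few of the isX methods behave. The sample strings are made up.

```python
letters_only = "hello".isalpha()         # True: only letters
letters_mixed = "hello123".isalpha()     # False: contains digits
letters_or_digits = "hello123".isalnum() # True: letters and digits only
numeric = "123".isnumeric()              # True: only numeric characters
not_numeric = "12.5".isnumeric()         # False: the dot is not numeric
empty = "".isalpha()                     # False: empty strings never pass

print(letters_only, letters_mixed, letters_or_digits, numeric, not_numeric, empty)
```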

After this, the author has mentioned some important methods. These methods are
startswith()
endswith()
join()
split()
startswith() and endswith() are used in the following format: string_to_search_in.startswith('string_to_start_with'), and similarly for endswith().
These methods are used a lot in string manipulation and searches.
join() is a method to join the strings in a list into one single string, with a specific separator between them. The use is
'separator'.join(list_items)
Common separators are spaces, dots, commas and tabs. You can follow the examples in the book to better understand how it works.
split() is a similar method, and it is used as:
string_to_split.split('separator')
The same separators are used in this case too. This method is generally used when, let's say, you have strings or numbers with spaces in between them and you want them in a list. So, split() gives you a list, i.e. it is just like the reverse of join().
Some other methods are also mentioned after these, i.e. rjust(), ljust(), rstrip() and lstrip(), which deal with adding spaces to or stripping spaces from the right or left of a string. But I do not see much real use or need for them. That's why I am not describing them here. Readers can obviously read about these methods, and they should try finding use cases for them.
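A combined sketch of the methods from this part. The file name and word list are made up.

```python
filename = "report_2019.csv"                 # hypothetical sample value
is_report = filename.startswith("report")
is_csv = filename.endswith(".csv")

words = ["cats", "rats", "bats"]
joined = ", ".join(words)                    # list -> one string
parts = joined.split(", ")                   # string -> list, the reverse of join
numbers = [int(token) for token in "1 2 3".split()]  # no argument: split on whitespace

padded_right = "42".rjust(5)                 # pad on the left so the text sits right
padded_left = "42".ljust(5, "*")             # pad on the right with a fill character
stripped = "  hello  ".strip()               # remove whitespace on both sides
right_stripped = "  hello  ".rstrip()        # remove whitespace on the right only

print(is_report, is_csv, joined, parts, numbers)
print(repr(padded_right), repr(padded_left), repr(stripped), repr(right_stripped))
```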
The last part of the chapter contains multiple nice projects. If you are a beginner in Python, please go through these projects and try to complete them yourself before reading the solutions.

chapter 7

This chapter is on regular expressions. Unlike the others, I think this chapter doesn't need any comment. Most of it is used in practice and is very usable. This is why I highly recommend this chapter to anyone interested in working with text data, NLP or the like. We will move directly on to chapter 8.

Notice!

Meanwhile, the second edition has come out. I am going to write about chapters 8-20 from the 2nd edition, which is available directly from the site.

chapter 8

Chapter 8 is about input validation. In normal production-level code, an engineer always includes code to test whether the input abides by the assumptions of the application, e.g. if I ask for a social security number, then obviously the input is expected to be a 9-digit number. So, if you write your name or something else in that place, it does not make sense. In general, whenever I had to handle such a situation, I used to write the checks explicitly with if, elif and else statements. But the author, Al Sweigart, talks about a third-party module, pyinputplus, and details how to use it to serve this purpose. The chapter discusses the module's different functions and their respective parameters in detail. If you are an intermediate-level coder who can read up on new libraries and work with them on your own, I will not recommend reading the rest of this chapter, as it discusses this module only. But the chapter should always be kept in mind as a reference. If you are still interested, I will update the details of this chapter tomorrow.
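For comparison, here is a minimal sketch of the manual approach I described, written with plain if/elif/else checks. It does not use pyinputplus; the function names, the 9-digit rule and the handling of dashes are assumptions made for this illustration.

```python
def validate_ssn(text):
    """Return the 9 digits of the number, or raise ValueError (hypothetical rule)."""
    digits = text.replace("-", "").strip()
    if not digits.isdecimal():
        raise ValueError("expected digits only")
    elif len(digits) != 9:
        raise ValueError("expected exactly 9 digits")
    else:
        return digits


def ask_until_valid(ask=input):
    """Keep asking until the input passes validation; ask defaults to input()."""
    while True:
        try:
            return validate_ssn(ask("Social security number: "))
        except ValueError as error:
            print("Invalid input:", error)


# Simulated user answers, so the sketch runs without a keyboard.
answers = iter(["alice", "123", "123-45-6789"])
result = ask_until_valid(lambda prompt: next(answers))
print(result)
```

Passing the ask function as a parameter is just a design choice to make the loop easy to try out with canned answers; in a real script you would call ask_until_valid() and type the value.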

This post is not complete yet, so why don't you visit something interesting on this blog, like:
(1)https://shyambhu20.blogspot.com/2019/04/i-made-youtube-botselenium-driven-web.html
(2) https://shyambhu20.blogspot.com/2019/03/trees-problem-solving-from-geeksforgeeks.html
(3)https://shyambhu20.blogspot.com/2019/03/object-oriented-programming-in-python.html
(4)https://shyambhu20.blogspot.com/2019/03/a-introduction-to-knapsack-problem.html
