Python built-in functions: https://docs.python.org/3/library/functions.html#open
Template
With a docstring in place, you can use the help() function to get module information, for example:
1 | ## "words" is the script name: words.py |
Actually, help() works on every object.
This is a Python script named words.py that demonstrates docstrings:
1 | """Retrieve and print words from a URL. |
On Linux, if you add the shebang #!/usr/bin/env python3 at the top of the script, you can run it with:
1 | chmod +x words.py |
I will study Python scripting in more depth later; see my blog post <<Python3 Scripting>>.
Exception
try statements do not create a new scope! Variables assigned inside a try block are visible outside it:
1 | import sys |
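Only the first line of the snippet above survives, so here is a minimal sketch of the scoping point. The function name parse_number is made up for illustration.

```python
def parse_number(text):
    """Convert text to int; fall back to -1 when it is not a number."""
    try:
        value = int(text)   # bound inside the try block
    except ValueError:
        value = -1          # bound inside the except block
    # 'value' is still visible here: try/except does not open a new scope
    return value


print(parse_number("42"))
print(parse_number("hello"))
```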
Common exception types:
- IndexError: index out of bounds
- KeyError: key not found in a mapping
- TypeError: usually avoid checking for this; leaving the check out increases function usability
- ValueError: for example int("hello")
- OSError: for example when the os module fails to open a file
Modularity
Importing modules and attributes: get comfortable with the import syntax. A module is normally a single Python source file, e.g. hello.py. Once imported, it is represented by a module object.
1 | ## import custom module |
In the interactive Python console, use help to explore modules:
1 | # you can search for modules, keywords, symbols, topics |
You can check the attributes of an object:
1 | ## show all methods |
Commonly used modules:
- requests: simple HTTP library
- urllib: a package that collects several modules for working with URLs
- sys: access argv, stdin, stdout, etc.
- time: principally for working with Unix timestamps
- datetime: UTC datetime, Unix timestamp, timedelta
- pprint: pretty print
- os: interface to system services
- itertools: iteration processing
- contextlib: utilities for use with the with statement
- typing: type hints, part of the standard library since Python 3.5
- functools: functools.wraps() copies the original function's metadata
A few more modules I ran into on a recent project:
- threading: threading operations
- subprocess: spawn new processes
time vs datetime modules:
https://stackoverflow.com/questions/7479777/difference-between-python-datetime-vs-time-modules
The time module is principally for working with Unix timestamps, expressed as a floating point number taken to be seconds since the Unix epoch. The datetime module can support many of the same operations, but provides a more object-oriented set of types, and also has some limited support for time zones.
Function
Function names in Python use lowercase with _ as the delimiter. The def keyword binds a function to a name; a function in Python is treated as an object.
Extended arguments, for example: *args (acts as a tuple), **kwargs (acts as a dict). This is called parameter packing, and it applies to all types of callables, for example lambda.
The parameter type order must follow: regular positional -> *args -> keyword -> **kwargs, for example:
1 | def print_args(arg1, arg2, *args, kwarg1, kwarg2, **kwargs): |
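The signature above comes from the original snippet; the body below is my own guess at a demonstration, just to show where each argument ends up.

```python
def print_args(arg1, arg2, *args, kwarg1, kwarg2, **kwargs):
    """Return the arguments grouped the way Python packed them."""
    return {
        "positional": (arg1, arg2),
        "args": args,                      # extra positionals, packed as a tuple
        "keyword_only": (kwarg1, kwarg2),  # must be passed by name
        "kwargs": kwargs,                  # extra keywords, packed as a dict
    }


result = print_args(1, 2, 3, 4, kwarg1="a", kwarg2="b", extra=True)
print(result)
```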
Parameters after * must be passed by keyword:
1 | def function_name(para1, *, para2 = "hello", para3): |
Correspondingly, there is the extended call syntax, which unpacks arguments when calling a function: * is for a tuple or list, ** is for a dict.
Unpacking parameters:
1 | def fun(a, b, c, d): |
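A small sketch of the extended call syntax. The function signature matches the snippet above; the body and the sample values are assumptions for illustration.

```python
def fun(a, b, c, d):
    return a + b + c + d


numbers = (1, 2, 3, 4)
mapping = {"a": 10, "b": 20, "c": 30, "d": 40}

total_from_tuple = fun(*numbers)          # same as fun(1, 2, 3, 4)
total_from_dict = fun(**mapping)          # same as fun(a=10, b=20, c=30, d=40)
mixed = fun(*[1, 2], **{"c": 3, "d": 4})  # both forms in a single call

print(total_from_tuple, total_from_dict, mixed)
```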
Positional-only arguments:
1 | ## no kwarg can be used here |
If there is no return statement, the function implicitly returns None.
1 | def function_name(para1 = "hello", para2 = 34): |
Always use an immutable value as a default!! The default value is evaluated only once, when the def statement runs; later calls do not re-evaluate it, so the same object stays in memory and is shared across calls.
1 | def append_word(org=[]): |
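A sketch of the pitfall and the usual workaround. The signature follows the snippet above; the body and the append_word_safe helper are assumptions added for illustration.

```python
def append_word(org=[]):
    # The default list is created once, so every call shares it.
    org.append("word")
    return org


first = append_word()
second = append_word()
print(first is second)  # both names point at the same shared default list


def append_word_safe(org=None):
    # Use None as a sentinel and build a fresh list on each call.
    if org is None:
        org = []
    org.append("word")
    return org
```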
Another example:
1 | import time |
A function is also an object and can be passed as an argument:
1 | def print_card(words): |
*args and **kwargs can be used for argument forwarding:
1 | def trace(f, *args, **kwargs): |
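A minimal sketch of argument forwarding. Only the trace signature is from the original; the body and the power helper are assumptions.

```python
def trace(f, *args, **kwargs):
    """Log the call, then forward every argument to f unchanged."""
    print(f"calling {f.__name__} with args={args} kwargs={kwargs}")
    return f(*args, **kwargs)


def power(base, exp=2):
    return base ** exp


result = trace(power, 3, exp=3)
print(result)
```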
Special Functions
Detect whether a module is run as a script or imported as a module.
1 | # only execute function when it is run as a script |
Functional Programming
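The special method __call__: once it is defined, instances of the class can be called like functions. __call__ effectively defines a call interface, and combined with other data structures on the instance it can implement stateful behavior such as caching.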
You can use the timeit() function of the timeit module to measure execution time.
1 | ## resolver.py file |
Run in REPL:
1 | from resolver import Resolver |
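The Resolver code itself is not shown in full, so here is a stand-in sketch of the same idea: a callable object that caches results. The class name CachedSquare and its fake "expensive" computation are hypothetical.

```python
import timeit


class CachedSquare:
    """Callable object that remembers results between calls."""

    def __init__(self):
        self._cache = {}

    def __call__(self, n):
        if n not in self._cache:
            # stand-in for expensive work: adds n to itself n times
            self._cache[n] = sum(n for _ in range(n))
        return self._cache[n]


square = CachedSquare()
print(square(1000))      # computed on the first call
print(square(1000))      # served from the cache
print(callable(square))  # instances are callable because of __call__

elapsed = timeit.timeit(lambda: square(1000), number=1000)
print(f"1000 cached calls took {elapsed:.6f}s")
```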
To check whether an object is callable, use the callable() function.
Lambda
Create anonymous callable objects, syntax: lambda [args]: expr. The args are separated by commas or empty, and the body is a single expression.
1 | ## the key is assigned a callable function |
Functional-style Tools
- map(): maps a function over a sequence; lazy, returns an iterator.
- filter(): removes elements from a sequence that don't meet some criterion, lazily.
- functools.reduce(): applies a 2-argument function cumulatively over a sequence, reducing it to one result.
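A quick sketch of the three tools on a small list; the sample data is made up.

```python
import functools
import operator

numbers = [1, 2, 3, 4, 5]

squares = map(lambda x: x * x, numbers)          # lazy: nothing computed yet
evens = filter(lambda x: x % 2 == 0, numbers)    # lazy as well
total = functools.reduce(operator.add, numbers)  # ((((1 + 2) + 3) + 4) + 5)

squares_list = list(squares)  # consuming the iterator does the actual work
evens_list = list(evens)

print(squares_list, evens_list, total)
```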
Local Function
Functions defined inside another function.
1 | ## achieves much the same as the earlier lambda example |
Name resolution within a scope follows the LEGB rule: Local -> Enclosing (the containing function) -> Global -> Built-in:
1 | g = "global" |
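A minimal sketch that touches all four LEGB levels in one expression. Apart from g, the names are assumptions.

```python
g = "global"


def outer():
    e = "enclosing"

    def inner():
        loc = "local"
        # loc: Local, e: Enclosing, g: Global, len: Built-in
        return loc, e, g, len(g)

    return inner()


found = outer()
print(found)
```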
Local function usage cases:
- define one-off functions close to their use.
- code organization and readability.
- similar to lambda but more general, can have multiple expressions.
A local function can be returned, which relies on a closure (when the local function is returned, the objects it needs from the enclosing scope are retained so that they are not garbage collected):
1 | def enclosing(): |
Function factories return other functions; the returned function uses both its own arguments and the arguments passed to the factory.
1 | def raise_to(exp): |
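A sketch of a function factory along the lines of the snippet above. The inner function name and the usage are assumptions.

```python
def raise_to(exp):
    def raise_to_exp(x):
        # 'exp' comes from the factory call and is kept alive by the closure
        return x ** exp
    return raise_to_exp


square = raise_to(2)
cube = raise_to(3)

print(square(5), cube(2))
print(square.__closure__[0].cell_contents)  # the captured value of exp
```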
nonlocal is like the global keyword, but it binds a name in the enclosing function scope. It behaves a bit like a global variable for the local function, except that it is private to that particular local function.
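A minimal sketch of nonlocal with a counter; make_counter is a hypothetical name. Each returned function keeps its own count.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count  # rebind 'count' in make_counter's scope
        count += 1
        return count

    return increment


counter_a = make_counter()
counter_b = make_counter()

a1 = counter_a()
a2 = counter_a()
b1 = counter_b()  # independent of counter_a
print(a1, a2, b1)
```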
Function Decorators
Allow you to modify existing functions or methods without changing their definition (by wrapping extra context around the original function). Decorators can be:
- A local function combined with a closure.
- A class: the class must implement __call__(), and the attributes the class defines become available on the decorated function.
- An instance of a class: decorator behavior can be controlled via instance variables.
You can think of a decorator as a function accepting a function and returning a function (callable).
Here is an example of a local function used as a decorator; I have not needed the other kinds of decorator so far:
1 | ## f: the target decorated function |
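Since the original listing is cut off, here is a sketch of a local-function decorator. The names shout and greet are made up for illustration.

```python
def shout(f):
    # f: the target decorated function
    def wrap(*args, **kwargs):
        result = f(*args, **kwargs)
        return result.upper()
    return wrap  # the closure keeps f alive


@shout
def greet(name):
    return f"hello, {name}"


print(greet("bob"))
print(greet.__name__)  # 'wrap': the metadata of greet is hidden by the wrapper
```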
You can stack multiple decorators; they are applied from the bottom up, in order 3->2->1:
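A sketch of decorator stacking, assuming three decorators numbered from the top. The tag helper and the decorator names are hypothetical; each wrapper records its label so the order is visible.

```python
def tag(label):
    """Build a decorator that wraps the result in 'label(...)'."""
    def decorator(f):
        def wrap(*args, **kwargs):
            return f"{label}({f(*args, **kwargs)})"
        return wrap
    return decorator


decorator1, decorator2, decorator3 = tag("1"), tag("2"), tag("3")


@decorator1
@decorator2
@decorator3
def base():
    return "f"


# decorator3 wraps base first, then decorator2, then decorator1
print(base())
```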
To keep the original function's metadata, for example __name__ and __doc__, use functools.wraps():
1 | import functools |
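One use of a parameterized decorator is to validate the arguments passed to the original function; for example, here we check that the second argument is not negative: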
1 | def check_non_negative(index): |
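Only the first line of the listing is visible, so this is a sketch of how such a check could look. The inner names and the create_list example are assumptions; functools.wraps keeps the decorated function's metadata intact.

```python
import functools


def check_non_negative(index):
    """Build a decorator that validates the positional argument at `index`."""
    def validator(f):
        @functools.wraps(f)  # preserve __name__ and __doc__ of f
        def wrap(*args, **kwargs):
            if args[index] < 0:
                raise ValueError(f"Argument {index} must be non-negative.")
            return f(*args, **kwargs)
        return wrap
    return validator


@check_non_negative(1)
def create_list(value, size):
    """Return a list holding `value` repeated `size` times."""
    return [value] * size


print(create_list("a", 3))
print(create_list.__name__)
```

Note that this sketch only inspects positional arguments, so passing size by keyword would skip the check.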
Basic
Unlike many other programming languages, Python has no statement for declaring a variable; Python is dynamically typed.
1 | # explicit conversion |
Operators
Python will not perform implicit type conversion between strings and numbers, for example "123" + 56 raises a TypeError. The exception is in if and while conditions, where any object is implicitly converted to bool.
Note the difference between == and is when comparing strings: == compares values, whereas is compares identity. You can check an object's unique identity number with id(). Comparison by value can be controlled programmatically (by defining __eq__).
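The logical operators are and, or and not. Note that and and or return the last operand that was evaluated (not always returns a bool), and you can take advantage of this: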
1 | ## 999 |
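A few assumed examples of what the operators actually return:

```python
a = 0 or 999         # 0 is falsy, so the second operand is returned
b = "" or "default"  # a common idiom for fallback values
c = 5 and 7          # both truthy: the last operand is returned
d = 0 and 7          # stops at the first falsy operand
e = not 0            # not always returns a bool

print(a, b, c, d, e)
```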
Check whether an object is None using the is operator.
Function arguments and return values are passed by object reference.
Note that sequences of the same type also support comparison; just like string comparison, they are compared item by item from left to right:
1 | (3, 99, 5, 2) < (5, 7, 3) |
Control Flows
Python does not have a switch statement (the match statement only arrived in Python 3.10); there are several ways to mimic it, see: Python switch replacements
1 | ## condition |
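One common replacement is a dict lookup. This is a sketch under my own assumptions (the describe function and its labels are made up), not necessarily the approach from the linked article.

```python
def describe(code):
    """Map a status code to a label via dict lookup instead of a switch."""
    labels = {
        200: "ok",
        404: "not found",
        500: "server error",
    }
    # the fallback value plays the role of the 'default' branch
    return labels.get(code, "unknown")


print(describe(404))
print(describe(123))
```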
String
Strings are sequences of Unicode characters.
Python does not have a character data type; a single character is simply a string with a length of 1. Square brackets can be used to access elements of the string or to slice it.
Escape sequences work in both " and ' quoted strings.
As in Java, strings in Python are immutable.
1 | ## raw string, no escape |
Python f-strings are better and more concise than format():
1 | a = 23 |
Bytes
In Python 3, strings are sequences of Unicode code points, but textual data in Linux is a sequence of bytes, so we need encode() and decode() to convert a Python string to and from bytes.
An HTTP request gives you a bytes object, which needs to be decoded to str before you can use it as text.
1 | x = "hello world" |
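A short sketch of the round trip, assuming UTF-8 as the encoding:

```python
x = "hello world"

data = x.encode("utf-8")     # str -> bytes
text = data.decode("utf-8")  # bytes -> str

accented = "\u00e9"          # 'e' with an acute accent, a single character
encoded_len = len(accented.encode("utf-8"))  # takes two bytes in UTF-8

print(data, text, encoded_len)
```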
List
1 | ## can have comma at end |
Besides the list's own in-place sort() and reverse() methods, the out-of-place functions sorted() and reversed() can also be used: sorted() creates a new list, while reversed() returns a reverse iterator.
Dict
1 | d = dict() |
The copy of dict is shallow.
1 | d.copy() |
Merge dicts:
1 | ## if keys are overlapped, the value will be updated by the merged one |
Iterate over a dict with a for loop:
1 | for k in d.keys(): |
Use in and not in to check whether a key exists.
Set
A mutable, unordered collection of unique immutable (hashable) objects.
1 | ## s = () is tuple! |
Use in and not in to check whether an element exists.
The copy of a set is shallow.
1 | s.copy() |
Sets support many algebraic operations:
1 | s.union(t) |
Tuple
Tuples are unchangeable, also called immutable.
1 | ## useless, because immutable |
Other operations:
1 | t3 = ("hello", 10.23, 99) |
Range
1 | range(stop) |
Usually used as a loop counter:
1 | for i in range(10): |
Other usages, for example, generate a list:
1 | list(range(0, 10, 2)) |
Enumerate
If you want the index paired with each item, use enumerate():
1 | t = [3, 35, 546, 76, 123] |
Iteration and Iterables
Comprehensions with filtering
1 | ## list comprehension |
An iterable can be passed to iter() to produce an iterator.
An iterator can be passed to next() to get the next value in the sequence.
1 | iterable = [1, 2, 3, 4, 5, 6, 7] |
A generator function is stateful and lazy, producing values with yield:
1 | def gen(): |
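A minimal sketch of a generator function; the yielded values are made up. The generator resumes after the last yield on each next() call, which is where the state lives.

```python
def gen():
    yield 1
    yield 2
    yield 3


g = gen()
first = next(g)      # runs the body up to the first yield
rest = list(g)       # continues from where it stopped
again = list(g)      # the generator is exhausted now
fresh = list(gen())  # a new call gives a new generator

print(first, rest, again, fresh)
```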
A generator expression can use far less memory than a list comprehension:
1 | (expr(item) for item in iterable if filtering(item)) |
A generator is a single-use object: once it is exhausted it is gone, and you need to create a new one.
There are several aggregation functions: any() and all() for booleans, and zip() to iterate over any number of iterables in lockstep:
1 | for i, j in zip([1,2,3],[4,5,6]): |
Class
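Python does not have public, private, or protected keywords; everything is public.
Polymorphism is implemented by late binding in Python. Unlike Java, it is not achieved through inheritance: you can call any method on an object as long as the object has that method, and this is only checked at the point of use. Inheritance in Python is mainly a way to share common methods; it can also be used for polymorphism, but inheritance is not used much in Python.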
1 | class Parent: |
Parent.__doc__ can be used to access the docstring.
Note that the first parameter of an instance method, conventionally self, can have any other name; it is only a label, for example:
1 | class Test: |
File I/O and Resource Management
Check the default encoding; if no encoding is specified in open(), this default is used.
1 | import sys |
Open file with options, for example:
1 | ## mode can be any combination of `crwa`|`tb` |
Some useful methods once the file is open:
1 | ## read all return string |
For reading file, you can also use loop:
1 | for line in f: |
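Use a with-block to close the resource automatically (or you can use a finally block). This is not limited to files: network reads and writes work too, because their implementations all follow the same protocol, so they can be used in a with-block: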
1 | with open(...) as f: |
Note that the as clause can be omitted; for example, when using a lock from the threading module there is no as.
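A sketch of both forms, with and without as. The temporary file path and its contents are assumptions made so the example is self-contained.

```python
import os
import tempfile
import threading

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w", encoding="utf-8") as f:
    f.write("line 1\nline 2\n")
# f is closed here, even if the block had raised

with open(path, encoding="utf-8") as f:
    lines = [line.rstrip("\n") for line in f]

lock = threading.Lock()
with lock:                 # no 'as' clause: we only need acquire/release
    held = lock.locked()

print(lines, held, lock.locked())
```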
Data Structure
This section is mainly a comparison with Java, covering some commonly used data structures such as Stack, Queue, Deque, PriorityQueue, Map, etc.
Queue
https://www.geeksforgeeks.org/queue-in-python/
There are 3 ways to implement a queue in Python:
- list: quite slow with append() and pop(0), because pop(0) needs to shift all elements.
- deque from the collections module: can be used as a queue.
- Queue from the queue module: a maxsize of zero (0) means an infinite queue. (This is a synchronized queue for threaded programming.)
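A short sketch of the two options that are worth using for FIFO order; the sample items are made up.

```python
from collections import deque
from queue import Queue

# deque: O(1) append on the right and pop on the left
dq = deque()
dq.append("a")
dq.append("b")
first = dq.popleft()

# queue.Queue: thread-safe; maxsize=0 means no size limit
q = Queue(maxsize=0)
q.put("a")
q.put("b")
q_first = q.get()

print(first, q_first, len(dq), q.qsize())
```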
Deque
See the Queue section: deque from the collections module can be used as a queue.
Stack
https://www.geeksforgeeks.org/stack-in-python/
There are 3 ways to implement a stack in Python:
- list: but it has speed issues as it grows.
- deque from the collections module: can be used as a stack (use this one).
- LifoQueue from the queue module: a maxsize of zero (0) means an infinite stack.
Priority Queue
https://www.geeksforgeeks.org/priority-queue-in-python/
Priority queues are implemented with a heap data structure. Time complexity: insert O(log(n)), delete O(log(n)).
A min heap can be implemented with the heapq module:
https://www.geeksforgeeks.org/heap-queue-or-heapq-in-python/
A max heap is not implemented in heapq; the alternative is to invert the priority:
https://stackoverflow.com/questions/2501457/what-do-i-use-for-a-max-heap-implementation-in-python
PriorityQueue from the queue module is a min heap too:
https://www.geeksforgeeks.org/priority-queue-using-queue-and-heapdict-module-in-python
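A sketch of both heap flavours with heapq, using made-up priorities. The max heap is simulated by pushing negated values, as described above.

```python
import heapq

priorities = [5, 1, 8, 3]

# min heap: the smallest item comes out first
min_heap = []
for p in priorities:
    heapq.heappush(min_heap, p)
smallest = heapq.heappop(min_heap)

# max heap via inverted priorities: push -p, negate again on pop
max_heap = []
for p in priorities:
    heapq.heappush(max_heap, -p)
largest = -heapq.heappop(max_heap)

print(smallest, largest)
```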
Map
Use the built-in type dict.