1111

Double Eleven (Singles' Day)... time to blow the budget.

Python integer objects implementation

This one covers how Python implements integer objects. It is a look at how a language is designed, and I think there are two takeaways.

  1. When a block of integer objects is allocated by Python, the objects have no value assigned to them yet. We call them free integer objects ready to be used. A value will be assigned to the next free object when a new integer value is used in your program. No memory allocation will be required when a free integer object’s value is set so it will be fast.
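    Python requests memory for a block of integer objects before any of them are used. When a new integer value is needed, it is written straight into a free integer object, which is faster.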

  2. A specific structure is used to refer to small integers and share them so access is fast. It is an array of 262 pointers to integer objects. Those integer objects are allocated during initialization in a block of integer objects we saw above. The small integers range is from -5 to 256. Many Python programs spend a lot of time using integers in that range so this is a smart decision.
    The second thing Python does is allocate a set of small integers ahead of time so they can be accessed quickly.

1.
>>> a = 256
>>> b = 256
>>> a is b
True
2.
>>> a = 257
>>> b = 257
>>> a is b
False
3.
>>> a, b = 257, 257
>>> a is b
True
4.
>>> a = 257; b = 257
>>> a is b
False
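
The sessions above use literals, so what they show depends partly on how the compiler shares constants within one statement, and that can differ between Python versions. Here is a minimal sketch that takes the compiler out of the picture by building each integer at runtime from a string. The helper name `same_object` is my own, and the whole thing relies on a CPython implementation detail, not on anything the language guarantees.

```python
import sys


def same_object(value: int) -> bool:
    """Build the same integer twice at runtime and check object identity."""
    text = str(value)
    a = int(text)  # created at runtime, so no compile-time constant sharing
    b = int(text)
    return a is b


# The small-integer cache is specific to CPython, so only probe it there.
if sys.implementation.name == "cpython":
    for n in (-6, -5, 0, 256, 257):
        print(n, same_object(n))
```

On CPython, values inside the -5 to 256 range should come back as the same object, and values just outside it should not. For real comparisons, stick with `==`; `is` is only useful here as a probe.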

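There is also an article on Python memory management. I'll leave that as a to-do and read it when I have time. Filling in one more small gap: one of the problems in the graduate mathematical modeling contest a while back was about classification on manifolds, and today I noticed that scikit-learn has a module built for exactly this, Manifold Learning. I don't need it right now, so I'm just bookmarking it.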

multiprocessing

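The main point here is that Python programs tend to favor multiple processes over multiple threads, because CPython's GIL keeps threads from running Python bytecode in parallel. As for how to actually do it, this is the reference to come back to when I need it.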
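
As a rough illustration of what that looks like, here is a minimal sketch using `multiprocessing.Pool` from the standard library. The `count_primes` function and the list of limits are made up for the example and are not taken from the linked article; the worker count of 4 is also just an assumption.

```python
from multiprocessing import Pool


def count_primes(limit: int) -> int:
    """Count primes below `limit` by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [20_000, 30_000, 40_000, 50_000]
    # Each limit is handled in a separate worker process.
    with Pool(processes=4) as pool:
        results = pool.map(count_primes, limits)
    print(dict(zip(limits, results)))
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by spawning a fresh interpreter, since the module gets imported again in each child.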

A list of Python web-scraping tools, with GitHub download links

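All I can say is that this roundup is incredibly thorough: networking, crawler frameworks, HTML parsing, text processing, handling of specific formats, natural language processing, and more.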