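Download locations for the English tutorial:
This article is a set of study notes based on the English-language "Python Tutorial". The tutorial can be downloaded from the following locations:
Baidu Pan: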
http://pan.baidu.com/s/1c0eXSQG
DropBox:
Click here for the DropBox link
Google Drive:
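Click here for the Google Drive link
These are study notes, not a translation, so the content differs somewhat from the original English text. The notes below are mainly based on Chapter 8 of the tutorial. (Some links in this article may only be reachable through a proxy.)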
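Overview:
The earlier article "Python variable types" already introduced the numeric types, but only briefly. This article examines the numeric types and the related type-conversion functions in more depth, mainly by analyzing Python's C source code.
Before getting started, note that script compatibility problems exist not only between major Python versions (such as 2.x.x and 3.x.x) but also between minor versions, as shown below: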
[email protected]:~$ python2.6
Python 2.6.6 (r266:84292, Nov 27 2010, 19:47:39)
[GCC 4.5.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = {1,2,3,4,5}
File "<stdin>", line 1
a = {1,2,3,4,5}
^
SyntaxError: invalid syntax
>>> a = set([1,2,3,4,5])
>>> quit()
[email protected]:~$ python2.7
Python 2.7.8 (default, Feb 8 2015, 20:16:27)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = {1,2,3,4,5}
>>> quit()
[email protected]:~$
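As the output above shows, Python 2.6.6 raises an invalid syntax error when braces are used to write a set, whereas Python 2.7.8 accepts it, because set literals were only added in 2.7. When writing Python code, therefore, keep compatibility between versions in mind.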
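One way to stay compatible with both interpreters is to build the set with the set() constructor, which both sessions above accept, instead of the brace literal. The following is only a minimal sketch; the helper name make_set and the flag supports_set_literal are illustrative and do not come from the tutorial.

```python
import sys


def make_set(items):
    """Build a set without the brace literal, so the code also parses on 2.6."""
    return set(items)


# Brace literals such as {1, 2, 3} are only accepted from Python 2.7 onwards.
supports_set_literal = sys.version_info >= (2, 7)

a = make_set([1, 2, 3, 4, 5])
print(sorted(a))
print(supports_set_literal)
```

Checking sys.version_info at run time does not help with a syntax error, since the file fails to compile before any check runs; that is why the sketch avoids the literal altogether.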
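Integer type:
The following analyzes the integer type from the perspective of the Python source code. The earlier article "Installing and using Python" described how to build and install Python from source. To debug the Python source with gdb, Python must be rebuilt and installed with configure --with-pydebug, as described in the earlier article "Basic Python operators".
Once a debug build of Python is installed, gdb can be used to step through Python's C source code: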
[email protected]:~$ gdb python -q
...................................................
>>> a = 11345
...................................................
Breakpoint 1, PyInt_FromLong (ival=11345) at Objects/intobject.c:91
91 if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) {
(gdb) until 111
PyInt_FromLong (ival=11345) at Objects/intobject.c:111
111 v->ob_ival = ival;
(gdb) n
112 return (PyObject *) v;
(gdb) ptype v
type = struct {
struct _object *_ob_next;
struct _object *_ob_prev;
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
long int ob_ival;
} *
(gdb) p v->ob_ival
$1 = 11345
...................................................
[email protected]:~$
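In the Python source, the code for the integer type lives in Objects/intobject.c. When the script a = 11345 above runs, PyInt_FromLong is called to create an object for the integer 11345. The C structure of that object is: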
struct {
struct _object *_ob_next;
struct _object *_ob_prev;
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
long int ob_ival;
}
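The first four fields form the common object header. Every Python object has ob_refcnt and ob_type; _ob_next and _ob_prev are present only in debug builds such as the --with-pydebug build used here, where they link all live objects into a doubly linked list. ob_refcnt is the object's reference count; when it drops to 0, the object is reclaimed. ob_type identifies the object's type. The fifth field, ob_ival, is specific to integer objects and stores the actual integer value, such as 11345 in the example above.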
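The effect of the ob_refcnt field can be observed from Python itself. The sketch below assumes CPython, where sys.getrefcount is available; the function name refcount_delta is made up for illustration. It uses a fresh object() rather than an integer so that no other code holds references to it.

```python
import sys


def refcount_delta():
    """Return how much the reference count grows when a second name is bound."""
    obj = object()                  # a fresh object that nothing else refers to
    before = sys.getrefcount(obj)
    alias = obj                     # binding another name adds one reference
    after = sys.getrefcount(obj)
    return after - before


delta = refcount_delta()
type_name = type(11345).__name__    # what ob_type resolves to for this value
print(delta)
print(type_name)
```

The absolute numbers reported by sys.getrefcount include temporary references, so only the difference between the two readings is meaningful here.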
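Variables and values:
Everything in a Python script is an object; even a variable name is an object (a string object, to be precise). The file Python/ceval.c in the Python source contains the following C code (starting at line 1948 in version 2.7.8):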
case STORE_NAME:
w = GETITEM(names, oparg);
v = POP();
if ((x = f->f_locals) != NULL) {
if (PyDict_CheckExact(x))
err = PyDict_SetItem(x, w, v);
else
err = PyObject_SetItem(x, w, v);
Py_DECREF(v);
if (err == 0) continue;
break;
}
PyErr_Format(PyExc_SystemError,
"no locals found when storing %s",
PyObject_REPR(w));
break;
This C code runs when a Python variable is assigned at the interactive prompt or at module level:
[email protected]:~$ gdb python -q
...................................................
>>> a = 11345
Program received signal SIGINT, Interrupt. (press Ctrl+C to interrupt python and drop into gdb)
0xb7ee09f8 in ___newselect_nocancel () from /lib/libc.so.6
(gdb) b ceval.c:1949
Breakpoint 1 at 0x80f30ee: file Python/ceval.c, line 1949.
(gdb) c
Continuing.
(press Enter here so that the a = 11345 statement above runs and hits the breakpoint set in ceval.c)
Breakpoint 1, PyEval_EvalFrameEx (f=0xb7d8ce44, throwflag=0)
at Python/ceval.c:1949
1949 w = GETITEM(names, oparg);
(gdb) n
1950 v = POP();
(gdb) p * (PyStringObject *)w
$1 = {_ob_next = 0xb7dc1e34, _ob_prev = 0xb7dadcb8, ob_refcnt = 10,
ob_type = 0x81c9ca0, ob_size = 1, ob_shash = -468864544, ob_sstate = 1,
ob_sval = "a"}
(gdb) n
1951 if ((x = f->f_locals) != NULL) {
(gdb) p * (PyIntObject *)v
$2 = {_ob_next = 0xb7ca1114, _ob_prev = 0xb7ca1aa8, ob_refcnt = 3,
ob_type = 0x81c2d80, ob_ival = 11345}
(gdb) c
Continuing.
...................................................
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__',
'__doc__': None, 'a': 11345, '__package__': None}
...................................................
>>> quit()
...................................................
[email protected]:~$
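Above, w is a string object of type PyStringObject; the string "a" it holds is the name of the variable being set. v is the value for that variable, namely the PyIntObject integer object with value 11345. Each name and its value form a key-value pair that is added to Python's internal f_locals dict. At the Python prompt, the locals function shown above lists the variables and values that are currently set.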
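The same name-to-value mapping can be reproduced without gdb. As a rough sketch, the snippet below runs the assignment against an ordinary dict (the variable names namespace and name_keys are illustrative) and shows that the variable name ends up as a plain string key, much like w and v in the STORE_NAME code above.

```python
namespace = {}
exec("a = 11345", namespace)    # run the assignment with `namespace` as its scope

# exec also adds __builtins__, so filter out the double-underscore entries.
name_keys = [key for key in namespace if not key.startswith("__")]
print(name_keys)
print(namespace["a"])
```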
After a variable has been set, it can be removed with the del keyword:
[email protected]:~$ python
Python 2.7.8 (default, Feb 20 2015, 12:54:46)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 11345
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__',
'__doc__': None, 'a': 11345, '__package__': None}
>>> del a
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__',
'__doc__': None, '__package__': None}
>>> print a
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
>>> quit()
[email protected]:~$
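As the output shows, once del a removes the variable a, it is gone from the f_locals dict (whose contents can be inspected with the locals function). Accessing it again with print a then fails with name 'a' is not defined.
The syntax of the del statement is:
del var1[, var2[, var3[..., varN]]]
As the syntax shows, a single del statement can remove several variables at once: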
[email protected]:~$ python
Python 2.7.8 (default, Feb 20 2015, 12:54:46)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 11345
>>> b = 45678
>>> c = 33456
>>> locals()
{'a': 11345, 'c': 33456, 'b': 45678, '__builtins__': <module '__builtin__' (built-in)>, '__package__': None, '__name__': '__main__', '__doc__': None}
>>> del a, b, c
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__package__': None, '__name__': '__main__', '__doc__': None}
>>> quit()
[email protected]:~$
From the source-code point of view, a del statement removes the variable through the following C code in Python/ceval.c (starting at line 1965 in version 2.7.8):
case DELETE_NAME:
w = GETITEM(names, oparg);
if ((x = f->f_locals) != NULL) {
if ((err = PyObject_DelItem(x, w)) != 0)
format_exc_check_arg(PyExc_NameError,
NAME_ERROR_MSG,
w);
break;
}
PyErr_Format(PyExc_SystemError,
"no locals when deleting %s",
PyObject_REPR(w));
break;
Here w is a string object holding the variable name. The code calls PyObject_DelItem to remove the key-value pair for that name from the f_locals dict.
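The DELETE_NAME logic can be modelled in a few lines of Python. This is only a simplified sketch of the behaviour, not the interpreter's actual code path; delete_name and ns are hypothetical names.

```python
def delete_name(namespace, name):
    """Remove `name` from the dict; report a missing key as a NameError."""
    try:
        del namespace[name]
    except KeyError:
        raise NameError("name %r is not defined" % name)


ns = {"a": 11345, "b": 45678}
delete_name(ns, "a")
print(ns)

try:
    delete_name(ns, "a")          # already removed, so this fails
    second_delete_failed = False
except NameError:
    second_delete_failed = True
print(second_delete_failed)
```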
Long integer type:
The C source code for Python's long integer type is in Objects/longobject.c. The following example is used for the analysis:
[email protected]:~$ gdb python -q
..................................................
>>> a = 1232323232344566777886555L
Program received signal SIGINT, Interrupt.
0xb7ee09f8 in ___newselect_nocancel () from /lib/libc.so.6
(gdb) b PyLong_FromString
Breakpoint 1 at 0x8088445: file Objects/longobject.c, line 1717.
(gdb) c
Continuing.
(press Enter here so that the a = 123232.... statement above runs
and hits the breakpoint set in longobject.c)
Breakpoint 1, PyLong_FromString (str=0xb7ca0118 "1232323232344566777886555L",
pend=0x0, base=0) at Objects/longobject.c:1717
1717 int sign = 1;
(gdb) until 1977
PyLong_FromString (str=0xb7ca0132 "", pend=0x0, base=10)
at Objects/longobject.c:1977
1977 return (PyObject *) z;
(gdb) ptype z
type = struct _longobject {
struct _object *_ob_next;
struct _object *_ob_prev;
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
Py_ssize_t ob_size;
digit ob_digit[1];
} *
(gdb) p* (PyLongObject *)z
$1 = {_ob_next = 0xb7ca1114, _ob_prev = 0x81c5960, ob_refcnt = 1,
ob_type = 0x81c4440, ob_size = 6, ob_digit = {14171}}
(gdb) p z->ob_digit[0]
$2 = 14171
(gdb) p z->ob_digit[1]
$3 = 15083
(gdb) p z->ob_digit[2]
$4 = 1098
(gdb) p z->ob_digit[3]
$5 = 674
(gdb) p z->ob_digit[4]
$6 = 20294
(gdb) p z->ob_digit[5]
$7 = 32
(gdb) c
Continuing.
[40761 refs]
>>> 14171 * (32768**0) + 15083 * (32768**1) + 1098 * (32768**2) + 674 * (32768**3) + 20294 * (32768**4) + 32 * (32768**5)
1232323232344566777886555L
..................................................
[email protected]:~$
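When the script a = 1232323232344566777886555L runs, Python calls PyLong_FromString to convert the decimal number 1232323232344566777886555L to base 32768 and stores each base-32768 digit in the array behind the ob_digit field of the PyLongObject. In the example above, the first array element, 14171, is the least significant digit and the sixth element, 32, is the most significant; the number of elements is recorded in the ob_size field. In the C code, 32768 is defined as the PyLong_BASE macro in the header Include/longintrepr.h: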
..................................................
#define PyLong_SHIFT 15
..................................................
#define PyLong_BASE ((digit)1 << PyLong_SHIFT)
..................................................
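As shown, PyLong_BASE is 1 shifted left by 15 bits, that is, 2 to the power of 15, which is the 32768 mentioned above. This holds for the build used here; builds configured with 30-bit digits define PyLong_SHIFT as 30 instead.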
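The base-32768 decomposition seen in gdb can be reproduced in pure Python. The sketch below assumes the 15-bit digit size of the build above; to_digits and from_digits are illustrative helpers, not CPython functions.

```python
SHIFT = 15              # matches PyLong_SHIFT in the build shown above
BASE = 1 << SHIFT       # 32768, the value of PyLong_BASE


def to_digits(n):
    """Split a non-negative integer into base-32768 digits, lowest digit first."""
    digits = []
    while True:
        digits.append(n & (BASE - 1))
        n >>= SHIFT
        if n == 0:
            return digits


def from_digits(digits):
    """Rebuild the integer from its base-32768 digits."""
    return sum(d << (SHIFT * i) for i, d in enumerate(digits))


value = 1232323232344566777886555
digits = to_digits(value)
print(digits)           # [14171, 15083, 1098, 674, 20294, 32]
print(from_digits(digits) == value)
```

The list length, 6, corresponds to the ob_size value printed by gdb for this number.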
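When writing a long integer literal in Python, it is best to end it with an uppercase "L", since a lowercase "l" is easily confused with the digit 1.
Many forum posts claim that Python's long integer type has no size limit. The C source shows, however, that the number of base-32768 digits is bounded, as can be seen next to the _PyLong_New function in Objects/longobject.c: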
/* Allocate a new long int object with size digits.
Return NULL and set exception if we run out of memory. */
#define MAX_LONG_DIGITS \
((PY_SSIZE_T_MAX - offsetof(PyLongObject, ob_digit))/sizeof(digit))
PyLongObject *
_PyLong_New(Py_ssize_t size)
{
if (size > (Py_ssize_t)MAX_LONG_DIGITS) {
PyErr_SetString(PyExc_OverflowError,
"too many digits in integer");
return NULL;
}
/* coverity[ampersand_in_size] */
/* XXX(nnorwitz): PyObject_NEW_VAR / _PyObject_VAR_SIZE need to detect
overflow */
return PyObject_NEW_VAR(PyLongObject, &PyLong_Type, size);
}
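To get a feel for how large MAX_LONG_DIGITS is, the macro's arithmetic can be redone in Python. The concrete numbers are assumptions for a 32-bit debug build like the one in the gdb sessions: five 4-byte fields in front of ob_digit (20 bytes) and a 2-byte digit. Other builds will give other values, and the helper name max_long_digits is made up for this sketch.

```python
def max_long_digits(ssize_t_max, header_bytes, digit_bytes):
    """Mirror (PY_SSIZE_T_MAX - offsetof(PyLongObject, ob_digit)) / sizeof(digit)."""
    return (ssize_t_max - header_bytes) // digit_bytes


# Assumed 32-bit debug build: PY_SSIZE_T_MAX = 2**31 - 1, 20-byte header, 2-byte digit.
limit = max_long_digits(2**31 - 1, 20, 2)
bits = limit * 15       # each digit carries 15 bits of the value
print(limit)
print(bits)
```

Under these assumptions the bound is about a billion digits, so available memory would normally run out well before the check fires.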
When size (the number of base-32768 digits) exceeds the MAX_LONG_DIGITS limit, the overflow error "too many digits in integer" is raised.
Floating-point type:
The C source for Python's float objects is in Objects/floatobject.c, as the following example shows:
[email protected]:~$ gdb -q python
...................................................
>>> a = 123.456789
Program received signal SIGINT, Interrupt.
0xb7ee09f8 in ___newselect_nocancel () from /lib/libc.so.6
(gdb) b PyFloat_FromDouble
Breakpoint 1 at 0x807730b: file Objects/floatobject.c, line 145.
(gdb) c
Continuing.
(press Enter again here so that the a = 123.456789 statement above runs
and hits the breakpoint set in floatobject.c)
Breakpoint 1, PyFloat_FromDouble (fval=123.456789) at Objects/floatobject.c:145
145 if (free_list == NULL) {
(gdb) until 153
PyFloat_FromDouble (fval=123.456789) at Objects/floatobject.c:153
153 op->ob_fval = fval;
(gdb) n
154 return (PyObject *) op;
(gdb) ptype op
type = struct {
struct _object *_ob_next;
struct _object *_ob_prev;
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
double ob_fval;
} *
(gdb) p * (PyFloatObject *)op
$1 = {_ob_next = 0xb7ca1114, _ob_prev = 0x81c5960, ob_refcnt = 1,
ob_type = 0x81c2800, ob_fval = 123.456789}
...................................................
[email protected]:~$
When the script a = 123.456789 above runs, the C function PyFloat_FromDouble is called to create a float object of type PyFloatObject, whose ob_fval field stores the actual floating-point value.
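Because ob_fval is a C double, a Python float occupies 8 bytes of payload and survives a round trip through that representation unchanged. The following sketch uses the standard struct module to illustrate this; the variable names are illustrative.

```python
import struct

value = 123.456789
raw = struct.pack("<d", value)            # the 8 bytes of a little-endian C double
restored = struct.unpack("<d", raw)[0]    # decode the same bytes back to a float

print(len(raw))
print(restored == value)
```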
Complex type:
The C source for Python's complex numbers is in Objects/complexobject.c:
[email protected]:~$ gdb -q python
....................................................
>>> a = 12 + 34j
Program received signal SIGINT, Interrupt.
0xb7ee09f8 in ___newselect_nocancel () from /lib/libc.so.6
(gdb) b PyComplex_FromCComplex
Breakpoint 1 at 0x8161543: file Objects/complexobject.c, line 235.
(gdb) c
Continuing.
(press Enter here so that the a = 12 + 34j statement above runs
and hits the breakpoint set in complexobject.c)
Breakpoint 1, PyComplex_FromCComplex (cval=...) at Objects/complexobject.c:235
235 op = (PyComplexObject *) PyObject_MALLOC(sizeof(PyComplexObject));
(gdb) p cval (on the first call only imag, the imaginary part, is passed in)
$1 = {real = 0, imag = 34}
(gdb) c
Continuing.
Breakpoint 1, PyComplex_FromCComplex (cval=...) at Objects/complexobject.c:235
235 op = (PyComplexObject *) PyObject_MALLOC(sizeof(PyComplexObject));
(gdb) p cval (on the second call both real, the real part, and imag, the imaginary part, are passed in)
$2 = {real = 12, imag = 34}
(gdb) until 239
PyComplex_FromCComplex (cval=...) at Objects/complexobject.c:239
239 op->cval = cval;
(gdb) n
240 return (PyObject *) op;
(gdb) ptype op
type = struct {
struct _object *_ob_next;
struct _object *_ob_prev;
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
Py_complex cval;
} *
(gdb) p * (PyComplexObject *)op
$1 = {_ob_next = 0xb7ca1b18, _ob_prev = 0x81c5960, ob_refcnt = 1,
ob_type = 0x81ebd40, cval = {real = 12, imag = 34}}
....................................................
[email protected]:~$
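As the output shows, when a complex number is set, Python internally calls PyComplex_FromCComplex to allocate an object of type PyComplexObject, whose cval field stores the real (real part) and imag (imaginary part) of the number. Objects/complexobject.c also defines the arithmetic operations on complex numbers, such as addition, subtraction, multiplication and division: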
static PyObject *
complex_add(PyObject *v, PyObject *w)
{
Py_complex result;
Py_complex a, b;
TO_COMPLEX(v, a);
TO_COMPLEX(w, b);
PyFPE_START_PROTECT("complex_add", return 0)
result = c_sum(a, b);
PyFPE_END_PROTECT(result)
return PyComplex_FromCComplex(result);
}
static PyObject *
complex_sub(PyObject *v, PyObject *w)
{
................................................
}
static PyObject *
complex_mul(PyObject *v, PyObject *w)
{
................................................
}
static PyObject *
complex_div(PyObject *v, PyObject *w)
{
................................................
}
For background on complex numbers, see the Wikipedia article on complex numbers.
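The complex_add routine shown earlier delegates the arithmetic to c_sum. The sketch below is a Python rendering of what such a component-wise sum does, written from the mathematical definition rather than copied from the C source; it also mirrors the two calls seen in the gdb session, where 34j is built first and 12 + 34j second.

```python
def c_sum(a, b):
    """Add two complex numbers component by component."""
    return complex(a.real + b.real, a.imag + b.imag)


imag_only = complex(0, 34)                  # the literal 34j: real = 0, imag = 34
result = c_sum(complex(12, 0), imag_only)   # 12 + 34j
print(result.real, result.imag)
```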
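Numeric type conversion functions:
The numeric types described above can be converted to one another with Python functions, as listed below:
- int(x): converts x to an integer.
- long(x): converts x to a long integer.
- float(x): converts x to a floating-point number.
- complex(x): converts x to a complex number whose real part is x and whose imaginary part is 0.
- complex(x, y): converts x and y to a complex number whose real part is x and whose imaginary part is y.
Here is a simple example: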
[email protected]:~$ gdb -q python
....................................................
>>> int(123.456)
123
>>> float(123)
123.0
>>> long(123.5)
123L
>>> complex(12, 34)
(12+34j)
....................................................
[email protected]:~$
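The same conversions can be collected into a small script. Since long exists only in Python 2, the sketch below falls back to int when it is missing; the name long_type is an assumption made for this example. It also includes a negative input to show that int truncates toward zero.

```python
try:
    long_type = long        # Python 2: a separate long integer type
except NameError:
    long_type = int         # Python 3: long was merged into int

results = [
    int(123.456),           # fractional part is dropped
    int(-123.456),          # truncates toward zero, not downwards
    float(123),
    long_type(123.5),
    complex(12),            # imaginary part defaults to 0
    complex(12, 34),
]
print(results)
```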
Readers who want to know which C functions these calls reach inside Python can look at the following gdb session:
[email protected]:~$ gdb -q python
....................................................
>>> int(123.456)
Breakpoint 1, type_call (type=0x81c2d80, args=0xb7d45424, kwds=0x0)
at Objects/typeobject.c:729
729 obj = type->tp_new(type, args, kwds);
(gdb) s
int_new (type=0x81c2d80, args=0xb7d45424, kwds=0x0) at Objects/intobject.c:1067
1067 PyObject *x = NULL;
(gdb) c
Continuing.
123
[40762 refs]
>>> float(123)
Breakpoint 1, type_call (type=0x81c2800, args=0xb7d45424, kwds=0x0)
at Objects/typeobject.c:729
729 obj = type->tp_new(type, args, kwds);
(gdb) s
float_new (type=0x81c2800, args=0xb7d45424, kwds=0x0)
at Objects/floatobject.c:1804
1804 PyObject *x = Py_False; /* Integer zero */
(gdb) c
Continuing.
123.0
[40762 refs]
>>> long(123.5)
Breakpoint 1, type_call (type=0x81c4440, args=0xb7d45424, kwds=0x0)
at Objects/typeobject.c:729
729 obj = type->tp_new(type, args, kwds);
(gdb) s
long_new (type=0x81c4440, args=0xb7d45424, kwds=0x0)
at Objects/longobject.c:4005
4005 PyObject *x = NULL;
(gdb) c
Continuing.
123L
[40763 refs]
>>> complex(12, 34)
Breakpoint 1, type_call (type=0x81ebd40, args=0xb7d3a84c, kwds=0x0)
at Objects/typeobject.c:729
729 obj = type->tp_new(type, args, kwds);
(gdb) s
complex_new (type=0x81ebd40, args=0xb7d3a84c, kwds=0x0)
at Objects/complexobject.c:1134
1134 PyNumberMethods *nbr, *nbi = NULL;
(gdb) c
Continuing.
(12+34j)
....................................................
[email protected]:~$
They all go through type_call to their respective xxxx_new functions (C functions such as complex_new), which do the actual conversion.
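This dispatch can be imitated from Python, because each type's tp_new slot is exposed as its __new__ method. The function below is a deliberately simplified model: the real type_call also runs the type's initializer afterwards, which is omitted here, and the Python-level name type_call is reused only for illustration.

```python
def type_call(cls, *args):
    """Simplified model of type_call: forward the arguments to the type's __new__."""
    return cls.__new__(cls, *args)


as_int = type_call(int, 123.456)          # reaches int's constructor
as_float = type_call(float, 123)          # reaches float's constructor
as_complex = type_call(complex, 12, 34)   # reaches complex's constructor
print(as_int, as_float, as_complex)
```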
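For reasons of space, this chapter stops here. The next article covers the Python functions related to mathematical operations.
OK, time for a short break o(∩_∩)o~~
The most essential value in life is a person's independence.
(布迪曼)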