What's New in Python 2.3

Author

A.M. Kuchling

This article explains the new features in Python 2.3. Python 2.3 was released on July 29, 2003.

The main themes for Python 2.3 are polishing some of the features added in 2.2, adding various small but useful enhancements to the core language, and expanding the standard library. The new object model introduced in the previous version has benefited from 18 months of bugfixes and from optimization efforts that have improved the performance of new-style classes. A few new built-in functions have been added such as sum() and enumerate(). The in operator can now be used for substring searches (e.g. "ab" in "abc" returns True).

Some of the many new library features include Boolean, set, heap, and date/time data types, the ability to import modules from ZIP-format archives, metadata support for the long-awaited Python catalog, an updated version of IDLE, and modules for logging messages, wrapping text, parsing CSV files, processing command-line options, using BerkeleyDB databases… the list of new and enhanced modules is lengthy.

This article doesn’t attempt to provide a complete specification of the new features, but instead provides a convenient overview. For full details, you should refer to the documentation for Python 2.3, such as the Python Library Reference and the Python Reference Manual. If you want to understand the complete implementation and design rationale, refer to the PEP for a particular new feature.

PEP 218: A Standard Set Datatype

The new sets module contains an implementation of a set datatype. The Set class is for mutable sets, sets that can have members added and removed. The ImmutableSet class is for sets that can’t be modified, and instances of ImmutableSet can therefore be used as dictionary keys. Sets are built on top of dictionaries, so the elements within a set must be hashable.

Here's a simple example:

    >>> import sets
    >>> S = sets.Set([1,2,3])
    >>> S
    Set([1, 2, 3])
    >>> 1 in S
    True
    >>> 0 in S
    False
    >>> S.add(5)
    >>> S.remove(3)
    >>> S
    Set([1, 2, 5])
    >>>

The union and intersection of sets can be computed with the union() and intersection() methods; an alternative notation uses the bitwise operators & and |. Mutable sets also have in-place versions of these methods, union_update() and intersection_update().

 
 
 
 
    >>> S1 = sets.Set([1,2,3])
    >>> S2 = sets.Set([4,5,6])
    >>> S1.union(S2)
    Set([1, 2, 3, 4, 5, 6])
    >>> S1 | S2                  # Alternative notation
    Set([1, 2, 3, 4, 5, 6])
    >>> S1.intersection(S2)
    Set([])
    >>> S1 & S2                  # Alternative notation
    Set([])
    >>> S1.union_update(S2)
    >>> S1
    Set([1, 2, 3, 4, 5, 6])
    >>>

It’s also possible to take the symmetric difference of two sets. This is the set of all elements in the union that aren’t in the intersection. Another way of putting it is that the symmetric difference contains all elements that are in exactly one set. Again, there’s an alternative notation (^), and an in-place version with the ungainly name symmetric_difference_update().

 
 
 
 
    >>> S1 = sets.Set([1,2,3,4])
    >>> S2 = sets.Set([3,4,5,6])
    >>> S1.symmetric_difference(S2)
    Set([1, 2, 5, 6])
    >>> S1 ^ S2
    Set([1, 2, 5, 6])
    >>>

There are also issubset() and issuperset() methods for checking whether one set is a subset or superset of another:

    >>> S1 = sets.Set([1,2,3])
    >>> S2 = sets.Set([2,3])
    >>> S2.issubset(S1)
    True
    >>> S1.issubset(S2)
    False
    >>> S1.issuperset(S2)
    True
    >>>
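For readers trying these operations on a current interpreter, here is a minimal sketch. It assumes Python 3, where the sets module is gone and the built-in set and frozenset types play the roles of Set and ImmutableSet; the variable names are illustrative only.

```python
# Assumption: a modern Python 3 interpreter, where the built-in set and
# frozenset types stand in for sets.Set and sets.ImmutableSet.
s1 = {1, 2, 3}
s2 = {3, 4}

union = s1 | s2          # same as s1.union(s2)
common = s1 & s2         # same as s1.intersection(s2)
only_one = s1 ^ s2       # same as s1.symmetric_difference(s2)

# An immutable set is hashable, so it can serve as a dictionary key.
labels = {frozenset([1, 2]): "pair"}
label = labels[frozenset([2, 1])]   # element order does not matter
```

The dictionary lookup succeeds because two immutable sets with the same members compare equal and hash alike.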

See also

PEP 218 - Adding a Built-In Set Object Type

PEP written by Greg V. Wilson. Implemented by Greg V. Wilson, Alex Martelli, and GvR.

PEP 255: Simple Generators

In Python 2.2, generators were added as an optional feature, to be enabled by a from __future__ import generators directive. In 2.3 generators no longer need to be specially enabled, and are now always present; this means that yield is now always a keyword. The rest of this section is a copy of the description of generators from the “What’s New in Python 2.2” document; if you read it back when Python 2.2 came out, you can skip the rest of this section.

You’re doubtless familiar with how function calls work in Python or C. When you call a function, it gets a private namespace where its local variables are created. When the function reaches a return statement, the local variables are destroyed and the resulting value is returned to the caller. A later call to the same function will get a fresh new set of local variables. But, what if the local variables weren’t thrown away on exiting a function? What if you could later resume the function where it left off? This is what generators provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function:

    def generate_ints(N):
        for i in range(N):
            yield i

A new keyword, yield, was introduced for generators. Any function containing a yield statement is a generator function; this is detected by Python’s bytecode compiler which compiles the function specially as a result.

When you call a generator function, it doesn’t return a single value; instead it returns a generator object that supports the iterator protocol. On executing the yield statement, the generator outputs the value of i, similar to a return statement. The big difference between yield and a return statement is that on reaching a yield the generator’s state of execution is suspended and local variables are preserved. On the next call to the generator’s .next() method, the function will resume executing immediately after the yield statement. (For complicated reasons, the yield statement isn’t allowed inside the try block of a try...finally statement; read PEP 255 for a full explanation of the interaction between yield and exceptions.)

Here's a sample usage of the generate_ints() generator:

    >>> gen = generate_ints(3)
    >>> gen
    <generator object at 0x8117f90>
    >>> gen.next()
    0
    >>> gen.next()
    1
    >>> gen.next()
    2
    >>> gen.next()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "<stdin>", line 2, in generate_ints
    StopIteration

You could equally write for i in generate_ints(5), or a,b,c = generate_ints(3).

Inside a generator function, the return statement can only be used without a value, and signals the end of the procession of values; afterwards the generator cannot return any further values. return with a value, such as return 5, is a syntax error inside a generator function. The end of the generator’s results can also be indicated by raising StopIteration manually, or by just letting the flow of execution fall off the bottom of the function.

You could achieve the effect of generators manually by writing your own class and storing all the local variables of the generator as instance variables. For example, returning a list of integers could be done by setting self.count to 0, and having the next() method increment self.count and return it. However, for a moderately complicated generator, writing a corresponding class would be much messier. Lib/test/test_generators.py contains a number of more interesting examples. The simplest one implements an in-order traversal of a tree using generators recursively.

 
 
 
 
    # A recursive generator that generates Tree leaves in in-order.
    def inorder(t):
        if t:
            for x in inorder(t.left):
                yield x
            yield t.label
            for x in inorder(t.right):
                yield x
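To try the traversal, something has to supply the tree. The sketch below pairs the same inorder() generator with a hypothetical minimal Tree class; the class is an assumption made for illustration and is not the one in Lib/test/test_generators.py.

```python
class Tree:
    """Hypothetical minimal binary tree node, used only for this illustration."""
    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right

def inorder(t):
    # An empty subtree (None) is false, which ends the recursion.
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x

#       2
#      / \
#     1   3
root = Tree(2, Tree(1), Tree(3))
labels = list(inorder(root))
```

Each recursive call produces its own generator object, and the for loops pass the values up one level at a time.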

Two other examples in Lib/test/test_generators.py produce solutions for the N-Queens problem (placing N queens on an NxN chess board so that no queen threatens another) and the Knight’s Tour (a route that takes a knight to every square of an NxN chessboard without visiting any square twice).

The idea of generators comes from other programming languages, especially Icon (https://www.cs.arizona.edu/icon/), where the idea of generators is central. In Icon, every expression and function call behaves like a generator. One example from “An Overview of the Icon Programming Language” at https://www.cs.arizona.edu/icon/docs/ipd266.htm gives an idea of what this looks like:

 
 
 
 
    sentence := "Store it in the neighboring harbor"
    if (i := find("or", sentence)) > 5 then write(i)

In Icon the find() function returns the indexes at which the substring “or” is found: 3, 23, 33. In the if statement, i is first assigned a value of 3, but 3 is less than 5, so the comparison fails, and Icon retries it with the second value of 23. 23 is greater than 5, so the comparison now succeeds, and the code prints the value 23 to the screen.
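A rough Python analogue of the Icon snippet may help. This is only a sketch: the find() generator below is a hypothetical helper written for this illustration, it reports 1-based positions to match Icon's numbering, and Python has to ask for the retry explicitly where Icon does it implicitly.

```python
def find(sub, text):
    """Yield each 1-based position of sub in text, mimicking Icon's find()."""
    start = text.find(sub)
    while start != -1:
        yield start + 1
        start = text.find(sub, start + 1)

sentence = "Store it in the neighboring harbor"
positions = list(find("or", sentence))

# Icon retries the comparison automatically; here the generator expression
# keeps pulling values until one passes the test.
first_past_5 = next(i for i in find("or", sentence) if i > 5)
```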

Python doesn’t go nearly as far as Icon in adopting generators as a central concept. Generators are considered part of the core Python language, but learning or using them isn’t compulsory; if they don’t solve any problems that you have, feel free to ignore them. One novel feature of Python’s interface as compared to Icon’s is that a generator’s state is represented as a concrete object (the iterator) that can be passed around to other functions or stored in a data structure.

See also

PEP 255 - Simple Generators

Written by Neil Schemenauer, Tim Peters, Magnus Lie Hetland. Implemented mostly by Neil Schemenauer and Tim Peters, with other fixes from the Python Labs crew.

PEP 263: Source Code Encodings

Python source files can now be declared as being in different character set encodings. Encodings are declared by including a specially formatted comment in the first or second line of the source file. For example, a UTF-8 file can be declared with:

 
 
 
 
    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-

Without such an encoding declaration, the default encoding used is 7-bit ASCII. Executing or importing modules that contain string literals with 8-bit characters and have no encoding declaration will result in a DeprecationWarning being signalled by Python 2.3; in 2.4 this will be a syntax error.

The encoding declaration only affects Unicode string literals, which will be converted to Unicode using the specified encoding. Note that Python identifiers are still restricted to ASCII characters, so you can’t have variable names that use characters outside of the usual alphanumerics.

See also

PEP 263 - Defining Python Source Code Encodings

Written by Marc-André Lemburg and Martin von Löwis; implemented by Suzuki Hisao and Martin von Löwis.

PEP 273: Importing Modules from ZIP Archives

The new zipimport module adds support for importing modules from a ZIP-format archive. You don’t need to import the module explicitly; it will be automatically imported if a ZIP archive’s filename is added to sys.path. For example:

 
 
 
 
    amk@nyman:~/src/python$ unzip -l /tmp/example.zip
    Archive:  /tmp/example.zip
      Length     Date   Time    Name
     --------    ----   ----    ----
         8467  11-26-02 22:30   jwzthreading.py
     --------                   -------
         8467                   1 file
    amk@nyman:~/src/python$ ./python
    Python 2.3 (#1, Aug 1 2003, 19:54:32)
    >>> import sys
    >>> sys.path.insert(0, '/tmp/example.zip')  # Add .zip file to front of path
    >>> import jwzthreading
    >>> jwzthreading.__file__
    '/tmp/example.zip/jwzthreading.py'
    >>>

An entry in sys.path can now be the filename of a ZIP archive. The ZIP archive can contain any kind of files, but only files named *.py, *.pyc, or *.pyo can be imported. If an archive only contains *.py files, Python will not attempt to modify the archive by adding the corresponding *.pyc file, meaning that if a ZIP archive doesn’t contain *.pyc files, importing may be rather slow.

A path within the archive can also be specified to only import from a subdirectory; for example, the path /tmp/example.zip/lib/ would only import from the lib/ subdirectory within the archive.
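The whole round trip can be reproduced without an existing archive. The following sketch builds a small ZIP file with the zipfile module and imports from it; it assumes a current Python 3 interpreter, and the module name zipdemo_hello and its contents are hypothetical.

```python
import os
import sys
import tempfile
import zipfile

# Hypothetical module name and contents, chosen only for this illustration.
archive_dir = tempfile.mkdtemp()
archive_path = os.path.join(archive_dir, "example.zip")
with zipfile.ZipFile(archive_path, "w") as zf:
    zf.writestr("zipdemo_hello.py", "GREETING = 'hello from a ZIP archive'\n")

sys.path.insert(0, archive_path)   # add the .zip file to the front of the path
import zipdemo_hello               # found inside the archive, no explicit zipimport

greeting = zipdemo_hello.GREETING
origin = zipdemo_hello.__file__    # a path that runs through the archive
```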

See also

PEP 273 - Import Modules from Zip Archives

Written by James C. Ahlstrom, who also provided an implementation. Python 2.3 follows the specification in PEP 273, but uses an implementation written by Just van Rossum that uses the import hooks described in PEP 302. See section PEP 302: New Import Hooks for a description of the new import hooks.

PEP 277: Unicode file name support for Windows NT

On Windows NT, 2000, and XP, the system stores file names as Unicode strings. Traditionally, Python has represented file names as byte strings, which is inadequate because it renders some file names inaccessible.

Python now allows using arbitrary Unicode strings (within the limitations of the file system) for all functions that expect file names, most notably the open() built-in function. If a Unicode string is passed to os.listdir(), Python now returns a list of Unicode strings. A new function, os.getcwdu(), returns the current directory as a Unicode string.

Byte strings still work as file names, and on Windows Python will transparently convert them to Unicode using the mbcs encoding.

Other systems also allow Unicode strings as file names but convert them to byte strings before passing them to the system, which can cause a UnicodeError to be raised. Applications can test whether arbitrary Unicode strings are supported as file names by checking os.path.supports_unicode_filenames, a Boolean value.

Under MacOS, os.listdir() may now return Unicode filenames.

See also

PEP 277 - Unicode file name support for Windows NT

Written by Neil Hodgson; implemented by Neil Hodgson, Martin von Löwis, and Mark Hammond.

PEP 278: Universal Newline Support

The three major operating systems used today are Microsoft Windows, Apple’s Macintosh OS, and the various Unix derivatives. A minor irritation of cross-platform work is that these three platforms all use different characters to mark the ends of lines in text files. Unix uses the linefeed (ASCII character 10), MacOS uses the carriage return (ASCII character 13), and Windows uses a two-character sequence of a carriage return plus a newline.

Python’s file objects can now support end of line conventions other than the one followed by the platform on which Python is running. Opening a file with the mode 'U' or 'rU' will open a file for reading in universal newlines mode. All three line ending conventions will be translated to a '\n' in the strings returned by the various file methods such as read() and readline().
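The translation is easy to observe with a file that mixes all three conventions. This sketch assumes a current Python 3 interpreter, where the 'U' mode is no longer available and ordinary text mode performs the same translation by default; under Python 2.3 the equivalent call would be open(path, 'rU').

```python
import os
import tempfile

# Write one file containing all three line-ending conventions (binary mode,
# so nothing is translated on the way out).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"unix\nmac\rwindows\r\nend")

# Assumption: Python 3 text mode, which translates newlines by default;
# in Python 2.3 this would be open(path, 'rU').
with open(path, "r") as f:
    text = f.read()

os.remove(path)
lines = text.split("\n")
```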

Universal newline support is also used when importing modules and when executing a file with the execfile() function. This means that Python modules can be shared between all three operating systems without needing to convert the line-endings.

This feature can be disabled when compiling Python by specifying the --without-universal-newlines switch when running Python’s configure script.

See also

PEP 278 - Universal Newline Support

Written and implemented by Jack Jansen.

PEP 279: enumerate()

A new built-in function, enumerate(), will make certain loops a bit clearer. enumerate(thing), where thing is either an iterator or a sequence, returns an iterator that will return (0, thing[0]), (1, thing[1]), (2, thing[2]), and so forth.

A common idiom to change every element of a list looks like this:

    for i in range(len(L)):
        item = L[i]
        # ... compute some result based on item ...
        L[i] = result

This can be rewritten using enumerate() as:

    for i, item in enumerate(L):
        # ... compute some result based on item ...
        L[i] = result
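As a concrete instance of the loop above, here is a small sketch in which the elided computation is replaced by a hypothetical one (multiplying each item by ten).

```python
L = [1, 2, 3]
for i, item in enumerate(L):
    L[i] = item * 10        # hypothetical stand-in for the real computation

# enumerate() itself just pairs each item with its index.
pairs = list(enumerate(["a", "b"]))
```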

See also

PEP 279 - The enumerate() built-in function

Written and implemented by Raymond D. Hettinger.

PEP 282: The logging Package

A standard package for writing logs, logging, has been added to Python 2.3. It provides a powerful and flexible mechanism for generating logging output which can then be filtered and processed in various ways. A configuration file written in a standard format can be used to control the logging behavior of a program. Python includes handlers that will write log records to standard error or to a file or socket, send them to the system log, or even e-mail them to a particular address; of course, it’s also possible to write your own handler classes.

The Logger class is the primary class. Most application code will deal with one or more Logger objects, each one used by a particular subsystem of the application. Each Logger is identified by a name, and names are organized into a hierarchy using . as the component separator. For example, you might have Logger instances named server, server.auth and server.network. The latter two instances are below server in the hierarchy. This means that if you turn up the verbosity for server or direct server messages to a different handler, the changes will also apply to records logged to server.auth and server.network. There’s also a root Logger that’s the parent of all other loggers.

For simple uses, the logging package contains some convenience functions that always use the root log:

    import logging

    logging.debug('Debugging information')
    logging.info('Informational message')
    logging.warning('Warning:config file %s not found', 'server.conf')
    logging.error('Error occurred')
    logging.critical('Critical error -- shutting down')

This produces the following output:

    WARNING:root:Warning:config file server.conf not found
    ERROR:root:Error occurred
    CRITICAL:root:Critical error -- shutting down

In the default configuration, informational and debugging messages are suppressed and the output is sent to standard error. You can enable the display of informational and debugging messages by calling the setLevel() method on the root logger.

Notice the warning() call’s use of string formatting operators; all of the functions for logging messages take the arguments (msg, arg1, arg2, ...) and log the string resulting from msg % (arg1, arg2, ...).
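The two points above, raising the verbosity with setLevel() and deferring string formatting to the logging call, can be seen together in a short sketch. It assumes a current Python 3 interpreter and captures the records in an in-memory stream (an illustrative choice) instead of standard error.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

root = logging.getLogger()          # no name given, so this is the root logger
old_level = root.level
root.addHandler(handler)
root.setLevel(logging.DEBUG)        # let debugging and informational records through

logging.debug("Debugging information")
logging.warning("Warning:config file %s not found", "server.conf")

# Undo the changes so the sketch leaves the root logger as it found it.
root.removeHandler(handler)
root.setLevel(old_level)
output = stream.getvalue().splitlines()
```

The message and its arguments are combined only when a record is actually emitted, so suppressed messages cost very little.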

There’s also an exception() function that records the most recent traceback. Any of the other functions will also record the traceback if you specify a true value for the keyword argument exc_info.

 
 
 
 
    def f():
        try:    1/0
        except: logging.exception('Problem recorded')

    f()

This produces the following output:

    ERROR:root:Problem recorded
    Traceback (most recent call last):
      File "t.py", line 6, in f
        1/0
    ZeroDivisionError: integer division or modulo by zero

Slightly more advanced programs will use a logger other than the root logger. The getLogger(name) function is used to get a particular log, creating it if it doesn’t exist yet. getLogger(None) returns the root logger.

 
 
 
 
    log = logging.getLogger('server')
     ...
    log.info('Listening on port %i', port)
     ...
    log.critical('Disk full')
     ...

Log records are usually propagated up the hierarchy, so a message logged to server.auth is also seen by server and root, but a Logger can prevent this by setting its propagate attribute to False.
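Propagation is easier to follow with a handler attached to the parent only. In this sketch the logger names under the "demo" prefix and the capture() helper are hypothetical, and a current Python 3 interpreter is assumed.

```python
import io
import logging

def capture(logger):
    """Attach an in-memory handler to logger and return its stream."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s:%(message)s"))
    logger.addHandler(handler)
    return stream

# Hypothetical logger names under a private "demo" prefix.
parent = logging.getLogger("demo.server")
child = logging.getLogger("demo.server.auth")
parent.setLevel(logging.INFO)
parent_stream = capture(parent)

child.info("login ok")        # travels up to the handler on demo.server
child.propagate = False
child.info("login failed")    # stops at demo.server.auth, which has no handler

seen_by_parent = parent_stream.getvalue().splitlines()
```

Only the first record reaches the parent's handler; the second is dropped once propagation is switched off.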

There are more classes provided by the logging package that can be customized. When a Logger instance is told to log a message, it creates a LogRecord instance that is sent to any number of different Handler instances. Loggers and handlers can also have an attached list of filters, and each filter can cause the LogRecord to be ignored or can modify the record before passing it along. When they’re finally output, LogRecord instances are converted to text by a Formatter class. All of these classes can be replaced by your own specially written classes.

With all of these features the logging package should provide enough flexibility for even the most complicated applications. This is only an incomplete overview of its features, so please see the package’s reference documentation for all of the details. Reading PEP 282 will also be helpful.

See also

PEP 282 - A Logging System

Written by Vinay Sajip and Trent Mick; implemented by Vinay Sajip.

PEP 285: A Boolean Type

A Boolean type was added to Python 2.3. Two new constants were added to the __builtin__ module, True and False. (True and False constants were added to the built-ins in Python 2.2.1, but the 2.2.1 versions are simply set to integer values of 1 and 0 and aren’t a different type.)

The type object for this new type is named bool; the constructor for it takes any Python value and converts it to True or False.

 
 
 
 
    >>> bool(1)
    True
    >>> bool(0)
    False
    >>> bool([])
    False
    >>> bool( (1,) )
    True

Most of the standard library modules and built-in functions have been changed to return Booleans.

 
 
 
 
    >>> obj = []
    >>> hasattr(obj, 'append')
    True
    >>> isinstance(obj, list)
    True
    >>> isinstance(obj, tuple)
    False

Python’s Booleans were added with the primary goal of making code clearer. For example, if you’re reading a function and encounter the statement return 1, you might wonder whether the 1 represents a Boolean truth value, an index, or a coefficient that multiplies some other quantity. If the statement is return True, however, the meaning of the return value is quite clear.

Python’s Booleans were not added for the sake of strict type-checking. A very strict language such as Pascal would also prevent you performing arithmetic with Booleans, and would require that the expression in an if statement always evaluate to a Boolean result. Python is not this strict and never will be, as PEP 285 explicitly says. This means you can still use any expression in an if statement, even ones that evaluate to a list or tuple or some random object. The Boolean type is a subclass of the int class so that arithmetic using a Boolean still works.

 
 
 
 
    >>> True + 1
    2
    >>> False + 1
    1
    >>> False * 75
    0
    >>> True * 75
    75

To sum up True and False in a sentence: they’re alternative ways to spell the integer values 1 and 0, with the single difference that str() and repr() return the strings 'True' and 'False' instead of '1' and '0'.
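That summary can be checked directly. A minimal sketch follows; the variable names are illustrative.

```python
is_int_subclass = issubclass(bool, int)   # bool derives from int
equal_to_one = (True == 1)                # and compares equal to 1
as_text = (str(True), repr(False))        # only the spelling differs
total_true = sum([True, False, True])     # Booleans add up like 1 and 0
```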

See also

PEP 285 - Adding a bool type

Written and implemented by GvR.

PEP 293: Codec Error Handling Callbacks

When encoding a Unicode string into a byte string, unencodable characters may be encountered. So far, Python has allowed specifying the error processing as either “strict” (raising UnicodeError), “ignore” (skipping the character), or “replace” (using a question mark in the output string), with “strict” being the default behavior. It may be desirable to specify alternative processing of such errors, such as inserting an XML character reference or HTML entity reference into the converted string.

Python now has a flexible framework to add different processing strategies. New error handlers can be added with codecs.register_error(), and codecs then can access the error handler with codecs.lookup_error(). An equivalent C API has been added for codecs written in C. The error handler gets the necessary state information such as the string being converted, the position in the string where the error was detected, and the target encoding. The handler can then either raise an exception or return a replacement string.

Two additional error handlers have been implemented using this framework: “backslashreplace” uses Python backslash quoting to represent unencodable characters and “xmlcharrefreplace” emits XML character references.
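Here is a short sketch of both built-in handlers plus a custom one. It assumes a current Python 3 interpreter; the handler name "underscore" and the function underscore_errors are hypothetical, made up for this illustration.

```python
import codecs

text = "h\u00e9llo"   # the accented e cannot be encoded as ASCII

escaped = text.encode("ascii", "backslashreplace")
xml_ref = text.encode("ascii", "xmlcharrefreplace")

def underscore_errors(exc):
    """Hypothetical handler: replace each unencodable character with '_'."""
    if isinstance(exc, UnicodeEncodeError):
        # Return the replacement text and the position at which to resume.
        return ("_" * (exc.end - exc.start), exc.end)
    raise exc

codecs.register_error("underscore", underscore_errors)
custom = text.encode("ascii", "underscore")
same_handler = codecs.lookup_error("underscore") is underscore_errors
```

The exception object carries the state described above: the string being converted, the start and end of the offending span, and the target encoding.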

See also

PEP 293 - Codec Error Handling Callbacks

Written and implemented by Walter Dörwald.

PEP 301: Package Index and Metadata for Distutils

Support for the long-requested Python catalog makes its first appearance in 2.3.

The heart of the catalog is the new Distutils register command. Running python setup.py register will collect the metadata describing a package, such as its name, version, maintainer, description, &c., and send it to a central catalog server. The resulting catalog is available from https://pypi.org.

To make the catalog a bit more useful, a new optional classifiers keyword argument has been added to the Distutils setup() function. A list of Trove-style strings can be supplied to help classify the software.

Here’s an example setup.py with classifiers, written to be compatible with older versions of the Distutils:

    from distutils import core
    kw = {'name': "Quixote",
          'version': "0.5.1",
          'description': "A highly Pythonic Web application framework",
          # ...
          }

    # Only pass classifiers when this version of the Distutils accepts them.
    if (hasattr(core, 'setup_keywords') and
        'classifiers' in core.setup_keywords):
        kw['classifiers'] = \
            ['Topic :: Internet :: WWW/HTTP :: Dynamic Content',
             'Environment :: No Input/Output (Daemon)',
             'Intended Audience :: Developers']

    core.setup(**kw)

The full list of classifiers can be obtained by running python setup.py register --list-classifiers.

See also

PEP 301 - Package Index and Metadata for Distutils

Written and implemented by Richard Jones.

PEP 302: New Import Hooks

While it’s been possible to write custom import hooks ever since the ihooks module was introduced in Python 1.3, no one has ever been really happy with it because writing new import hooks is difficult and messy. There have been various proposed alternatives such as the imputil and iu modules, but none of them has ever gained much acceptance, and none of them were easily usable from C code.

PEP 302 borrows ideas from its predecessors, especially from Gordon McMillan’s iu module. Three new items are added to the sys module:

  • sys.path_hooks is a list of callable objects; most often they’ll be classes. Each callable takes a string containing a path and either returns an importer object that will handle imports from this path or raises an ImportError exception if it can’t handle this path.

  • sys.path_importer_cache caches importer objects for each path, so sys.path_hooks will only need to be traversed once for each path.

  • sys.meta_path is a list of importer objects that will be traversed before sys.path is checked. This list is initially empty, but user code can add objects to it. Additional built-in and frozen modules can be imported by an object added to this list.

Importer objects must have a single method, find_module(fullname, path=None). fullname will be a module or package name, e.g. string or distutils.core. find_module() must return a loader object that has a single method, load_module(fullname), that creates and returns the corresponding module object.

Pseudo-code for Python’s new import logic, therefore, looks something like this (simplified a bit; see PEP 302 for the full details):

 
 
 
 
    for mp in sys.meta_path:
        loader = mp.find_module(fullname)
        if loader is not None:
            module = loader.load_module(fullname)

    for path in sys.path:
        for hook in sys.path_hooks:
            try:
                importer = hook(path)
            except ImportError:
                # ImportError, so try the other path hooks
                pass
            else:
                loader = importer.find_module(fullname)
                module = loader.load_module(fullname)

    # Not found!
    raise ImportError
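The find_module()/load_module() protocol can be exercised on its own, without touching the real import machinery. The sketch below is a simulation under stated assumptions: DictImporter, import_via() and the module name "hello" are hypothetical, and the function only mirrors the sys.meta_path loop from the pseudo-code above.

```python
import types

class DictImporter:
    """Hypothetical importer that serves module source from a dictionary."""
    def __init__(self, sources):
        self.sources = sources

    def find_module(self, fullname, path=None):
        # An importer that can handle the name returns a loader (here, itself).
        if fullname in self.sources:
            return self
        return None

    def load_module(self, fullname):
        # Create the module object and run the stored source inside it.
        module = types.ModuleType(fullname)
        exec(self.sources[fullname], module.__dict__)
        return module

def import_via(meta_path, fullname):
    """Walk a list of importers the way the meta_path loop above does."""
    for importer in meta_path:
        loader = importer.find_module(fullname)
        if loader is not None:
            return loader.load_module(fullname)
    raise ImportError(fullname)

demo_meta_path = [DictImporter({"hello": "GREETING = 'hi'"})]
module = import_via(demo_meta_path, "hello")
```

Having the importer act as its own loader keeps the sketch short; the two roles can just as well be separate objects.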

See also

PEP 302 - New Import Hooks

Written by Just van Rossum and Paul Moore. Implemented by Just van Rossum.

PEP 305: Comma-separated Files

Comma-separated files are a format frequently used for exporting data from databases and spreadsheets. Python 2.3 adds a parser for comma-separated files.

Comma-separated format is deceptively simple at first glance:

    Costs,150,200,3.95

Read a line and call line.split(','): what could be simpler? But toss in string data that can contain commas, and things get more complicated:

 
 
 
 
    "Costs",150,200,3.95,"Includes taxes, shipping, and sundry items"

A big ugly regular expression can parse this, but using the new csv package is much simpler:

    import csv

    input = open('datafile', 'rb')
    reader = csv.reader(input)
    for line in reader:
        print line

The reader() function takes a number of different options. The field separator isn’t limited to the comma and can be changed to any character, and so can the quoting and line-ending characters.

Different dialects of comma-separated files can be defined and registered; currently there are two dialects, both used by Microsoft Excel. A separate csv.writer class will generate comma-separated files from a succession of tuples or lists, quoting strings that contain the delimiter.
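To see the quoting rules in both directions, here is a sketch that parses the sample line shown earlier and writes a row back out. It assumes a current Python 3 interpreter and uses in-memory text streams instead of a data file.

```python
import csv
import io

data = '"Costs",150,200,3.95,"Includes taxes, shipping, and sundry items"\n'
rows = list(csv.reader(io.StringIO(data)))   # every field comes back as a string

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["Costs", 150, "taxes, shipping"])
written = out.getvalue()   # only the field containing the delimiter is quoted
```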

See also

PEP 305 - CSV File API

Written and implemented by Kevin Altis, Dave Cole, Andrew McNamara, Skip Montanaro, Cliff Wells.

PEP 307: Pickle Enhancements

The pickle and cPickle modules received some attention during the 2.3 development cycle. In 2.2, new-style classes could be pickled without difficulty, but they weren’t pickled very compactly; PEP 307 quotes a trivial example where a new-style class results in a pickled string three times longer than that for a classic class.

The solution was to invent a new pickle protocol. The pickle.dumps() function has supported a text-or-binary flag for a long time. In 2.3, this flag is redefined from a Boolean to an integer: 0 is the old text-mode pickle format, 1 is the old binary format, and now 2 is a new 2.3-specific format. A new constant, pickle.HIGHEST_PROTOCOL, can be used to select the fanciest protocol available.
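Selecting a protocol looks like this in practice. The sketch assumes a current Python 3 interpreter, where HIGHEST_PROTOCOL is greater than 2; the sample data is arbitrary.

```python
import pickle

data = {"name": "example", "values": [1, 2, 3]}

text_pickle = pickle.dumps(data, 0)     # the old text-mode format
proto2_pickle = pickle.dumps(data, 2)   # the format introduced in 2.3
best_pickle = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)

# loads() detects the protocol on its own, so no flag is needed to read back.
restored = [pickle.loads(p) for p in (text_pickle, proto2_pickle, best_pickle)]
```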

Unpickling is no longer considered a safe operation. 2.2’s pickle provided hooks for trying to prevent unsafe classes from being unpickled (specifically, a __safe_for_unpickling__ attribute), but none of this code was ever audited and therefore it’s all been ripped out in 2.3. You should not unpickle untrusted data in any version of Python.

To reduce the pickling overhead for new-style classes, a new interface for customizing pickling was added using three special methods: __getstate__(), __setstate__(), and __getnewargs__(). Consult PEP 307 for the full semantics of these methods.

As a way to compress pickles yet further, it’s now possible to use integer codes instead of long strings to identify pickled classes. The Python Software Foundation will maintain a list of standardized codes; there’s also a range of codes for private use. Currently no codes have been specified.

See also

PEP 307 - Extensions to the pickle protocol

Written and implemented by Guido van Rossum and Tim Peters.

Extended Slices

Ever since Python 1.4, the slicing syntax has supported an optional third “step” or “stride” argument. For example, these are all legal Python syntax: L[1:10:2], L[:-1:1], L[::-1]. This was added to Python at the request of the developers of Numerical Python, which uses the third argument extensively. However, Python’s built-in list, tuple, and string sequence types have never supported this feature, raising a TypeError if you tried it. Michael Hudson contributed a patch to fix this shortcoming.

For example, you can now easily extract the elements of a list that have even indexes:

    >>> L = range(10)
    >>> L[::2]
    [0, 2, 4, 6, 8]

Negative values also work to make a copy of the same list in reverse order:

    >>> L[::-1]
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

This also works for tuples, arrays, and strings:

    >>> s='abcd'
    >>> s[::2]
    'ac'
    >>> s[::-1]
    'dcba'

If you have a mutable sequence such as a list or an array you can assign to or delete an extended slice, but there are some differences between assignment to extended and regular slices. Assignment to a regular slice can be used to change the length of the sequence:

 
 
 
 
    >>> a = range(3)
    >>> a
    [0, 1, 2]
    >>> a[1:3] = [4, 5, 6]
    >>> a
    [0, 4, 5, 6]

Extended slices aren’t this flexible. When assigning to an extended slice, the list on the right hand side of the statement must contain the same number of items as the slice it is replacing:

 
 
 
 
    >>> a = range(4)
    >>> a
    [0, 1, 2, 3]
    >>> a[::2]
    [0, 2]
    >>> a[::2] = [0, -1]
    >>> a
    [0, 1, -1, 3]
    >>> a[::2] = [0,1,2]
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: attempt to assign sequence of size 3 to extended slice of size 2

Deletion is more straightforward:

    >>> a = range(4)
    >>> a
    [0, 1, 2, 3]
    >>> a[::2]
    [0, 2]
    >>> del a[::2]
    >>> a
    [1, 3]

One can also now pass slice objects to the __getitem__() methods of the built-in sequences:

    >>> range(10).__getitem__(slice(0, 5, 2))
    [0, 2, 4]

Or use slice objects directly in subscripts:

    >>> range(10)[slice(0, 5, 2)]
    [0, 2, 4]

To simplify implementing sequences that support extended slicing, slice objects now have a method indices(length) which, given the length of a sequence, returns a (start, stop, step) tuple that can be passed directly to range(). indices() handles omitted and out-of-bounds indices in a manner consistent with regular slices (and this innocuous phrase hides a welter of confusing details!). The method is intended to be used like this:

 
 
 
 
    class FakeSeq:
        ...
        def calc_item(self, i):
            ...
        def __getitem__(self, item):
            if isinstance(item, slice):
                # Normalize the slice against this sequence's length.
                indices = item.indices(len(self))
                return FakeSeq([self.calc_item(i) for i in range(*indices)])
            else:
                # item is a plain index here.
                return self.calc_item(item)

From this example you can also see that the built-in slice object is now the type object for the slice type, and is no longer a function. This is consistent with Python 2.2, where int, str, etc., underwent the same change.
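A filled-in variant of the skeleton makes the behaviour of indices() visible. This is a sketch under stated assumptions: the constructor, __len__() and the squaring rule in calc_item() are hypothetical, and a slice returns a plain list here to keep the example short.

```python
class FakeSeq:
    """Hypothetical sequence whose item i is computed as i squared."""
    def __init__(self, length):
        self.length = length

    def __len__(self):
        return self.length

    def calc_item(self, i):
        if not 0 <= i < self.length:
            raise IndexError(i)
        return i * i

    def __getitem__(self, item):
        if isinstance(item, slice):
            # indices() clips the slice to this length and fills in defaults.
            return [self.calc_item(i) for i in range(*item.indices(len(self)))]
        return self.calc_item(item)

seq = FakeSeq(6)
evens = seq[::2]                 # items 0, 2, 4
reversed_tail = seq[:2:-1]       # items 5, 4, 3
clipped = slice(None, 100, 2).indices(6)   # out-of-bounds stop is clipped
```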

Other Language Changes

Here are all of the changes that Python 2.3 makes to the core Python language.

  • The yield statement is now always a keyword, as described in section PEP 255: Simple Generators of this document.

  • A new built-in function enumerate() was added, as described in section PEP 279: enumerate() of this document.

  • Two new constants, True and False were added along with the built-in bool type, as described in section PEP 285: A Boolean Type of this document.

  • The int() type constructor will now return a long integer instead of raising an OverflowError when a string or floating-point number is too large to fit into an integer. This can lead to the paradoxical result that isinstance(int(expression), int) is false, but that seems unlikely to cause problems in practice.

  • Built-in types now support the extended slicing syntax, as described in section Extended Slices of this document.

  • A new built-in function, sum(iterable, start=0), adds up the numeric items in the iterable object and returns their sum. sum() only accepts numbers, meaning that you can’t use it to concatenate a bunch of strings. (Contributed by Alex Martelli.)

  • list.insert(pos, value) used to insert value at the front of the list when pos was negative. The behaviour has now been changed to be consistent with slice indexing, so when pos is -1 the value will be inserted before the last element, and so forth.

  • list.index(value), which searches for value within the list and returns its index, now takes optional start and stop arguments to limit the search to only part of the list.
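Several of the changes listed above are quick to try out. The following sketch assumes a current Python 3 interpreter, where int and long have since been merged into a single type; the sample values are arbitrary.

```python
total = sum([1, 2, 3], 10)            # the optional start value is added in

letters = ["a", "b", "c"]
letters.insert(-1, "x")               # lands before the last element

values = [5, 7, 5, 7]
second_seven = values.index(7, 2)     # search from index 2 onward

big = int("1" + "0" * 30)             # too large for a machine word, yet no OverflowError
```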