``ftputil`` - a high-level FTP client library
=============================================

:Version:   2.0.2
:Date:      2004-04-18
:Summary:   high-level FTP client library for Python
:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem
:Author:    Stefan Schwarzer <sschwarzer@sschwarzer.net>

.. contents::

Introduction
------------

The ``ftputil`` module is a high-level interface to the ftplib_
module. The `FTPHost objects`_ generated from it allow many operations
similar to those of os_ and `os.path`_.

.. _ftplib: http://www.python.org/doc/current/lib/module-ftplib.html
.. _os: http://www.python.org/doc/current/lib/module-os.html
.. _`os.path`: http://www.python.org/doc/current/lib/module-os.path.html

Examples::

    import ftputil

    # download some files from the login directory
    host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
    names = host.listdir(host.curdir)
    for name in names:
        if host.path.isfile(name):
            host.download(name, name, 'b')  # remote, local, binary mode

    # make a new directory and copy a remote file into it
    host.mkdir('newdir')
    source = host.file('index.html', 'r')  # file-like object
    target = host.file('newdir/index.html', 'w')  # file-like object
    host.copyfileobj(source, target)  # similar to shutil.copyfileobj
    source.close()
    target.close()

Also, there are `FTPHost.lstat`_ and `FTPHost.stat`_ to request size and
modification time of a file. The latter can also follow links, similar
to `os.stat`_. Even `FTPHost.path.walk`_ works.

.. _`os.stat`: http://www.python.org/doc/current/lib/os-file-dir.html#l2h-1455

The distribution contains a custom ``UserTuple`` module to provide stat_
results with Python versions 2.0 and 2.1.

.. _stat: http://www.python.org/doc/current/lib/module-stat.html

Exception hierarchy
-------------------

The exceptions are in the namespace of the ``ftputil`` package (e. g.
``ftputil.TemporaryError``). They are organized as follows::

    FTPError
        FTPOSError(FTPError, OSError)
            TemporaryError(FTPOSError)
            PermanentError(FTPOSError)
            ParserError(FTPOSError)
        FTPIOError(FTPError, IOError)
        RootDirError(FTPError)
        TimeShiftError(FTPError)

and are described here:

- ``FTPError``

  is the root of the exception hierarchy of the module.

- ``FTPOSError``

  is derived from ``OSError``. This makes ``FTPHost`` objects behave
  similarly to the ``os`` module. Compare

  ::

    try:
        os.chdir('nonexisting_directory')
    except OSError:
        ...

  with

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    try:
        host.chdir('nonexisting_directory')
    except OSError:
        ...

  Imagine a function

  ::

    def func(path, file):
        ...

  which works on the local file system and catches ``OSError``
  exceptions. If you change the parameter list to

  ::

    def func(path, file, os=os):
        ...

  where ``os`` denotes the ``os`` module, you can also call the function as

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    func(path, file, os=host)

  to use the same code for a local and remote file system. Another
  similarity between ``OSError`` and ``FTPOSError`` is that the latter
  holds the FTP server return code in the ``errno`` attribute of the
  exception object and the error text in ``strerror``.

- ``TemporaryError``

  is raised for FTP return codes from the 4xx category. This
  corresponds to ``ftplib.error_temp`` (though ``TemporaryError`` and
  ``ftplib.error_temp`` are *not* identical).

- ``PermanentError``

  is raised for 5xx return codes from the FTP server
  (again, that's similar but *not* identical to ``ftplib.error_perm``).

- ``ParserError``

  is used for errors during the parsing of directory
  listings from the server. This exception is used by the ``FTPHost``
  methods ``stat``, ``lstat``, and ``listdir``.

- ``FTPIOError``

  denotes an I/O error on the remote host. This appears
  mainly with file-like objects which are retrieved by invoking
  ``FTPHost.file`` (``FTPHost.open`` is an alias). Compare

  ::

    >>> try:
    ...     f = open('notthere')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    2
    No such file or directory

  with

  ::

    >>> host = ftputil.FTPHost('host', 'user', 'password')
    >>> try:
    ...     f = host.open('notthere')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    550
    550 notthere: No such file or directory.

  As you can see, both code snippets are similar. (However, the error
  codes aren't the same.)

- ``RootDirError``

  is a special case. Due to the implementation of the ``lstat`` method,
  it is not possible to do a ``stat`` call on the root directory ``/``.
  If you know *any* way to do it, please let me know. :-)

- ``TimeShiftError``

  is used to denote errors which relate to setting the `time shift`_,
  e. g. trying to set a value which is not a multiple of a full hour.
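
The ``os=os`` idea from the description of ``FTPOSError`` can be tried
without a server. The following is only a sketch: ``count_entries`` and
``FakeHost`` are hypothetical names which are not part of ``ftputil``,
and ``FakeHost`` merely imitates the ``listdir`` method of an
``FTPHost`` object::

    import os

    def count_entries(path, os=os):
        """Return the number of entries in `path`, or -1 on failure."""
        try:
            return len(os.listdir(path))
        except OSError:
            # with an `FTPHost` object this also catches `FTPOSError`
            # and its subclasses, since they are derived from `OSError`
            return -1

    class FakeHost:
        """Hypothetical stand-in for an `FTPHost` object (no network)."""
        def listdir(self, path):
            if path == 'missing':
                raise OSError(550, '550 missing: No such file or directory.')
            return ['index.html', 'image']

    # the same function works on the local and the imitated remote side
    local_count = count_entries(os.curdir)
    remote_count = count_entries('htdocs', os=FakeHost())
    failed_count = count_entries('missing', os=FakeHost())

With a real connection, you would pass the ``FTPHost`` object instead of
the ``FakeHost`` instance.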


``FTPHost`` objects
-------------------

.. _`FTPHost construction`:

Construction
~~~~~~~~~~~~

``FTPHost`` instances may be generated with the following call::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

The first four parameters are strings with the same meaning as for the
FTP class in the ``ftplib`` module. The keyword argument
``session_factory`` may be used to generate FTP connections with other
factories than the default ``ftplib.FTP``. For example, the M2Crypto
distribution uses a secure FTP class which is derived from
``ftplib.FTP``.

In fact, all positional and keyword arguments other than
``session_factory`` are passed to the factory to generate a new background
session (which happens for every remote file that is opened; see
below).

This functionality of the constructor also makes it possible to wrap
``ftplib.FTP`` objects to do something that wouldn't be possible with
the ``ftplib.FTP`` constructor alone.

As an example, assume you want to connect to a port other than the
default one. ``ftplib.FTP`` offers this only by means of its
``connect`` method, not via its constructor. The solution is to
provide a wrapper class::

    import ftplib
    import ftputil

    EXAMPLE_PORT = 50001

    class MySession(ftplib.FTP):
        def __init__(self, host, userid, password, port):
            """Act like ftplib.FTP's constructor but connect to other port."""
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    # don't pass an instance, MySession(), as factory; pass the class itself
    host = ftputil.FTPHost(host, userid, password,
                           port=EXAMPLE_PORT, session_factory=MySession)
    # use `host` as usual

On login, the format of the directory listings (needed for stat'ing
files and directories) should be determined automatically. If not, you
may use the method `set_directory_format`_ to set the format
"manually".

``FTPHost`` attributes and methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Attributes
,,,,,,,,,,

- ``curdir``, ``pardir``, ``sep``

  are strings which denote the current and the parent directory on
  the remote server. ``sep`` identifies the path separator. Though `RFC
  959`_ (File Transfer Protocol) notes that these values may be server
  dependent, the Unix counterparts seem to work well in practice,
  even for non-Unix servers.

.. _`RFC 959`: `RFC 959 - File Transfer Protocol (FTP)`_

.. _`time shift`:

Time zone correction
,,,,,,,,,,,,,,,,,,,,

.. _`set_time_shift`:

- ``set_time_shift(time_shift)``

  sets the so-called time shift value (measured in seconds). The time
  shift is the difference between the local time of the server and the
  local time of the client at a given moment, i. e. by definition

  ::

    time_shift = server_time - client_time

  Setting this value is important if `upload_if_newer`_ and
  `download_if_newer`_ should work correctly even if the time zone of
  the FTP server differs from that of the client (where ``ftputil``
  runs). Note that the time shift value *can* be negative.

  If the time shift value is invalid, e. g. not a multiple of a full
  hour or larger than 24 hours in absolute value, a ``TimeShiftError``
  is raised.

  See also `synchronize_times`_ for a way to set the time shift with a
  simple method call.

- ``time_shift()``

  returns the currently set time shift value. See ``set_time_shift``
  (above) for its definition.

.. _`synchronize_times`:

- ``synchronize_times()``

  synchronizes the local times of the server and the client, so that
  `upload_if_newer`_ and `download_if_newer`_ work as expected, even
  if the client and the server are in different time zones. For this
  to work, *all* of the following conditions must be true:

  - The connection between server and client is established.

  - The client has write access to the directory that is current when
    ``synchronize_times`` is called.

  - That directory is *not* the root directory (i. e. ``/``) of the
    FTP server.

  If you can't fulfill these conditions, you can nevertheless set the
  time shift value manually with `set_time_shift`_. Trying to call
  ``synchronize_times`` if the above conditions aren't true results in
  a ``TimeShiftError`` exception.
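
If you have to determine a value for ``set_time_shift`` yourself, the
definition and the validity rules above can be sketched as follows.
The helper names are hypothetical and this is not the code ``ftputil``
uses internally. It assumes that you know both times as seconds since
the epoch and that small deviations stem from clock inaccuracies::

    HOUR = 60 * 60  # seconds

    def rounded_time_shift(server_time, client_time):
        """Return `server_time - client_time`, rounded to a full hour."""
        difference = server_time - client_time
        return int(round(difference / float(HOUR))) * HOUR

    def is_valid_time_shift(time_shift):
        """Check the two conditions described for `set_time_shift`."""
        return time_shift % HOUR == 0 and abs(time_shift) <= 24 * HOUR

    # assumed usage, with `server_time` obtained by some other means:
    # host.set_time_shift(rounded_time_shift(server_time, time.time()))

Rounding to a full hour keeps the value acceptable for
``set_time_shift`` even if the two clocks differ by a few seconds.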

Files and directories
,,,,,,,,,,,,,,,,,,,,,

- ``file(path, mode='r')``

  returns a file-like object that is connected to the path on the
  remote host. This path may be absolute or relative to the current
  directory on the remote host (this directory can be determined
  with the ``getcwd`` method). As with local file objects, the default
  mode is "r", i. e. reading text files. Valid modes are "r", "rb",
  "w", and "wb".

- ``open(path, mode='r')``

  is an alias for ``file`` (see above).

- ``copyfileobj(source, target, length=64*1024)``

  copies the contents from the file-like object source to the
  file-like object target. The only difference to
  ``shutil.copyfileobj`` is the default buffer size.

- ``close()``

  closes the connection to the remote host. After this, no more
  interaction with the FTP server is possible without using a new
  ``FTPHost`` object.

- ``getcwd()``

  returns the absolute current directory on the remote host. This
  method acts similarly to ``os.getcwd``.

- ``chdir(directory)``

  sets the current directory on the FTP server. This resembles
  ``os.chdir``, as you may have expected. :-)

- ``mkdir(path, [mode])``

  makes the given directory on the remote host. In the current
  implementation, this doesn't construct "intermediate" directories
  which don't already exist. The ``mode`` parameter is ignored. It
  exists for compatibility with ``os.mkdir`` if an ``FTPHost`` object is
  passed into a function instead of the ``os`` module (see the
  description of ``FTPOSError`` above for an explanation).

- ``rmdir(path)``

  removes the given remote directory.

  In previous versions of ``ftputil``, it depended on the remote
  server whether non-empty directories could be deleted. By default,
  ``ftputil`` version 2.0 and up allows deleting only empty
  directories.

  If you want to update to ``ftputil`` 2.0 with minimal changes
  to your source code, pass in an additional parameter
  ``_remove_only_empty=False``. Note that this is deprecated and
  will probably be unsupported in future ``ftputil`` versions.

- ``remove(path)``

  removes a file on the remote host (similar to ``os.remove``).

- ``unlink(path)``

  is an alias for ``remove``.

- ``rename(source, target)``

  renames the source file (or directory) on the FTP server.

- ``listdir(path)``

  returns a list containing the names of the files and directories
  in the given path; similar to ``os.listdir``.
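
Since ``mkdir`` doesn't construct intermediate directories, you may
want a small helper that does. The following sketch uses only
attributes described in this document (``sep``, ``curdir``, ``mkdir``,
``path.join`` and ``path.isdir``). The name ``make_dirs`` is
hypothetical, and the helper assumes Unix-style paths::

    def make_dirs(host, path):
        """Create `path` and all missing parent directories on `host`."""
        parts = [part for part in path.split(host.sep) if part]
        # keep a leading separator so that absolute paths stay absolute
        if path.startswith(host.sep):
            current = host.sep
        else:
            current = host.curdir
        for part in parts:
            current = host.path.join(current, part)
            # only create what isn't there yet
            if not host.path.isdir(current):
                host.mkdir(current)

    # assumed usage with an `FTPHost` object `host`:
    # make_dirs(host, 'newdir/sub/subsub')

Because the helper relies only on names that the ``os`` module offers
as well, ``make_dirs(os, 'newdir/sub')`` should work on a local Unix
file system, too.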

Uploading and downloading files
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

- ``upload(source, target, mode='')``

  copies a local source file (given by a filename, i. e. a string)
  to the remote host under the name target. Both source and target
  may be absolute paths or relative to their corresponding current
  directory (on the local or the remote host, respectively). The
  mode may be "" or "a" for ASCII uploads or "b" for binary uploads.
  ASCII mode is the default (again, similar to regular local file
  objects).

- ``download(source, target, mode='')``

  performs a download from the remote source to a target file. Both
  source and target are strings. Otherwise, the description of the
  ``upload`` method applies here, too.

.. _`upload_if_newer`:

- ``upload_if_newer(source, target, mode='')``

  is similar to the upload method. The only difference is that the
  upload is only invoked if the time of the last modification for
  the source file is more recent than that of the target file, or
  the target doesn't exist at all. If an upload actually happened,
  the return value is a true value, else a false value.

  Note that this method only checks the existence and/or the
  modification time of the source and target file; it can't recognize
  a change in the transfer mode, e. g.

  ::

    # transfer in ASCII mode
    host.upload_if_newer('source_file', 'target_file', 'a')
    # won't transfer the file again
    host.upload_if_newer('source_file', 'target_file', 'b')

  Similarly, if a transfer is interrupted, the remote file will have a
  newer modification time than the local file, and thus the transfer
  won't be repeated if ``upload_if_newer`` is used a second time.
  There are (at least) two possibilities after a failed upload:

  - use ``upload`` instead of ``upload_if_newer``, or

  - remove the incomplete target file with ``FTPHost.remove``, then
    use ``upload`` or ``upload_if_newer`` to transfer it again.

  If it seems that a file is uploaded unnecessarily, read the
  subsection on `time shift`_ settings.

.. _`download_if_newer`:

- ``download_if_newer(source, target, mode='')``

  corresponds to ``upload_if_newer`` but performs a download from the
  server to the local host. Read the descriptions of download and
  ``upload_if_newer`` for more. If a download actually happened, the
  return value is a true value, else a false value.

  If it seems that a file is downloaded unnecessarily, read the
  subsection on `time shift`_ settings.
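
The second possibility mentioned for a failed upload (remove the
incomplete target, then transfer again) might look like the following
sketch. ``upload_again`` is a hypothetical helper, not a method of
``FTPHost``. It uses only ``path.exists``, ``remove`` and
``upload_if_newer`` as described in this document::

    def upload_again(host, source, target, mode=''):
        """Remove an incomplete remote `target`, then upload `source`."""
        if host.path.exists(target):
            host.remove(target)
        # the target is gone now, so the upload should take place
        return host.upload_if_newer(source, target, mode)

    # assumed usage with an `FTPHost` object `host`:
    # upload_again(host, 'source_file', 'target_file', 'b')

Note that the helper can't tell whether the remote file is really
incomplete. Call it only when you know that the previous transfer
failed.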

Stat'ing files and directories
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

The methods ``lstat`` and ``stat`` (and others) rely on the
directory listing format used by the FTP server. When connecting
to a host, ``FTPHost``'s constructor tries to guess the right
format, which mostly succeeds. However, if you get strange results
(or exceptions), it may be necessary to set the directory format
"manually". This is done - immediately after connecting - with

.. _`set_directory_format`:

- ``set_directory_format(server_format)``

  ``server_format`` is one of the strings "unix" or "ms". To choose
  the correct format, you have to start a command line FTP client and
  request a directory listing (most clients do this with the ``DIR``
  command).

  If the resulting lines look like

  ::

      drwxr-sr-x   2 45854    200           512 Jul 30 17:14 image
      -rw-r--r--   1 45854    200          4604 Jan 19 23:11 index.html

  use "unix" as the argument.

  If the output looks like

  ::

      12-07-01  02:05PM       <DIR>          XPLaunch
      07-17-00  02:08PM             12266720 digidash.exe

  use "ms" as the argument string ``server_format``.

  If none of the above settings help, contact me. It would be very
  helpful if you could provide an example listing (``DIR``'s output).

If calling ``lstat`` or ``stat`` yields wrong modification dates
or times, look at the methods that deal with time zone differences
(`time shift`_).

.. _`FTPHost.lstat`:

- ``lstat(path)``

  returns an object similar to that from ``os.lstat`` (a "tuple" with
  additional attributes; see the documentation of the ``os`` module for
  details). However, due to the nature of the application, there are
  some important aspects to keep in mind:

  - The result is derived by parsing the output of a ``DIR`` command on
    the server. Therefore, the result from ``FTPHost.lstat`` can not
    contain more information than the received text. In particular:

  - User and group ids can only be determined as strings, not as
    numbers, and that only if the server supplies them. This is
    usually the case with Unix servers but may not be for other FTP
    server programs.

  - Values for the time of the last modification may be rough,
    depending on the information from the server. For timestamps
    older than a year, this usually means that the precision of the
    modification timestamp value is not better than days. For newer
    files, the information may be accurate to a minute.

  - Links can only be recognized on servers that provide this
    information in the ``DIR`` output.

  - Items that can't be determined at all are set to ``None``.

  - There's a special problem with stat'ing the root directory. In
    this case, a ``RootDirError`` is raised. This has to do with the
    algorithm used by ``(l)stat`` and I know of no approach which
    solves this problem.

.. update for other servers

..

  Currently, ``ftputil`` recognizes the MS Robin FTP server.
  Otherwise, a format commonly used by Unix servers is assumed. If you
  need to parse output from another server type, please contact me
  under the email address at the end of this text.

.. implement and document support for setting the directory parser
   "manually"

.. _`FTPHost.stat`:

- ``stat(path)``

  returns ``stat`` information also for files which are pointed to by a
  link. This method follows multiple links until a regular file or
  directory is found. If an infinite link chain is encountered, a
  ``PermanentError`` is raised.
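
Because items that can't be determined are set to ``None``, code that
uses the result of ``lstat`` or ``stat`` should be prepared for that.
The following sketch assumes that the result offers the attributes
``st_size`` and ``st_mtime`` like the result of ``os.lstat`` does. The
name ``describe_entry`` is hypothetical::

    import time

    def describe_entry(host, path):
        """Return a one-line description of `path` from `lstat`."""
        stat_result = host.lstat(path)
        size = stat_result.st_size
        mtime = stat_result.st_mtime
        if size is None:
            size_text = 'unknown size'
        else:
            size_text = '%d bytes' % size
        if mtime is None:
            mtime_text = 'unknown time'
        else:
            # may be rough, depending on the server's listing
            mtime_text = time.strftime('%Y-%m-%d %H:%M',
                                       time.localtime(mtime))
        return '%s: %s, modified %s' % (path, size_text, mtime_text)

    # assumed usage with an `FTPHost` object `host`:
    # print(describe_entry(host, 'index.html'))

Passing the ``os`` module instead of an ``FTPHost`` object gives the
corresponding description for a local file.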

``FTPHost.path``
~~~~~~~~~~~~~~~~

``FTPHost`` objects contain an attribute named ``path``, similar to
`os.path`_. The following methods can be applied to the remote host
with the same semantics as for ``os.path``:

.. _`FTPHost.path.walk`:

::

    abspath(path)
    basename(path)
    commonprefix(path_list)
    dirname(path)
    exists(path)
    getmtime(path)
    getsize(path)
    isabs(path)
    isdir(path)
    isfile(path)
    islink(path)
    join(path1, path2, ...)
    normcase(path)
    normpath(path)
    split(path)
    splitdrive(path)
    splitext(path)
    walk(path, func, arg)
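
As with ``os.path.walk``, the function given to ``walk`` is called as
``func(arg, dirname, names)`` for each visited directory. The
following callback is a sketch with a hypothetical name. It joins the
names with ``/``, assuming the Unix-style separator mentioned for the
``sep`` attribute::

    def collect_html(found, dirname, names):
        """Append the paths of all HTML files in `dirname` to `found`."""
        for name in names:
            if name.endswith('.html'):
                found.append(dirname + '/' + name)

    # the callback can be tried without a server
    found = []
    collect_html(found, 'docs', ['index.html', 'logo.png'])

    # assumed usage with an `FTPHost` object `host`:
    # found = []
    # host.path.walk(host.curdir, collect_html, found)

After the call to ``walk``, the list ``found`` should hold the paths
of all matching files below the start directory.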

``FTPFile`` objects
-------------------

``FTPFile`` objects as returned by a call to ``FTPHost.file`` (or
``FTPHost.open``) have the following methods - with the same arguments and
semantics as for local files::

    close()
    read([count])
    readline([count])
    readlines()
    write(data)
    writelines(string_sequence)
    xreadlines()

and the attribute ``closed``. For details, see the section `File objects`_
in the Library Reference.

.. _`file objects`:
   http://www.python.org/doc/current/lib/bltin-file-objects.html

Note that ``ftputil`` supports both binary mode and text mode with the
appropriate line ending conversions.
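
Since every opened remote file uses a background session of its own,
it is advisable to close ``FTPFile`` objects as soon as they are no
longer needed. The following sketch reads the first lines of a remote
text file and closes the file even if reading fails. The name
``first_lines`` is hypothetical::

    def first_lines(host, path, count):
        """Return up to `count` lines from the start of a text file."""
        fobj = host.file(path, 'r')
        lines = []
        try:
            for index in range(count):
                line = fobj.readline()
                if not line:
                    break  # end of file reached
                lines.append(line)
        finally:
            fobj.close()
        return lines

    # assumed usage with an `FTPHost` object `host`:
    # head = first_lines(host, 'index.html', 5)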

Tips and tricks / FAQ
---------------------

Connecting on another port
~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, an instantiated ``FTPHost`` object connects on the usual
FTP ports. If you have to use a different port, refer to the
section `FTPHost construction`_.

You can use the same approach to connect in active or passive mode, as
you like.

Using active or passive connections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please see the previous tip.
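
By analogy with the port example in the section on construction, a
session class for active mode might look like the following sketch.
The class name ``ActiveSession`` is hypothetical. ``set_pasv`` is the
``ftplib.FTP`` method that switches between passive and active
mode::

    import ftplib

    class ActiveSession(ftplib.FTP):
        def __init__(self, host, userid, password):
            """Act like ftplib.FTP's constructor but use active mode."""
            ftplib.FTP.__init__(self)
            self.connect(host)
            self.login(userid, password)
            # a false value selects active mode for later transfers
            self.set_pasv(False)

    # assumed usage; pass the class itself, not an instance:
    # host = ftputil.FTPHost(host, userid, password,
    #                        session_factory=ActiveSession)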

Conditional upload/download to/from a server in a different time zone
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You may find that ``ftputil`` uploads or downloads files
unnecessarily, or not when it should. This can happen when the FTP
server is in a different time zone than the client on which
``ftputil`` runs. Please see the section on setting the
`time shift`_. It may even be sufficient to call `synchronize_times`_.

Wrong dates or times when stat'ing on a server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please see the previous and the next tip.

File-related query methods (e. g. ``listdir``) return unexpected results
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If, for example, ``listdir`` or ``lstat`` return a wrong value or
raise an exception, it may be because of a wrongly determined
directory format. Please see the discussion of
`set_directory_format`_.

Is there a mailing list on ``ftputil``?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Yes, you can subscribe at http://codespeak.net/mailman/listinfo/ftputil
or read the archives at http://codespeak.net/pipermail/ftputil/ .

Where can I get the latest version?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See http://www.sschwarzer.net/python/python_software.html#ftputil .
Announcements will also be sent to the mailing list (see the question
above). Announcements on major updates will also be posted to the
newsgroup `comp.lang.python`_ .

.. _`comp.lang.python`: news:comp.lang.python

I don't find an answer to my problem in this document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please send me an email with your question, and I'll see what I can
do for you. :-) Probably a better way is to send your question to the
mailing list at ftputil@codespeak.net; potentially more people might
be able to help you.

Bugs and limitations
--------------------

- ``ftputil`` needs at least Python 2.0 to work.

- Due to the implementation of ``lstat`` it can not return a sensible
  value for the root directory ``/``. If you know an implementation that
  can do this, please let me know. The root directory is handled
  appropriately in ``FTPHost.path.exists/isfile/isdir/islink``, though.

- Timeouts of individual child sessions currently are not handled.
  This is only a problem if your ``FTPHost`` object or the generated
  ``FTPFile`` objects are inactive for about ten minutes or longer.

- Until now, I haven't paid attention to thread safety. In principle,
  at least, different ``FTPFile`` objects should be usable in different
  threads.

- ``FTPFile`` objects in text mode *may not* support charsets with more
  than one byte per character. Please email me your experiences
  (address below), if you work with multibyte text streams in FTP
  sessions.

- Currently, it is not possible to continue an interrupted upload or
  download. Contact me if you have problems with that.

- The ``UserTuple`` class, provided in ``UserTuple.py``, is not thoroughly
  tested. If you encounter problems, please notify me.

Files
-----

If not overridden via installation options, the ``ftputil`` files
reside in the ``ftputil`` package. The documentation (in
`reStructured Text`_ and in HTML format) is in the same directory.

.. _`reStructured Text`: http://docutils.sourceforge.net/rst.html

The files ``_test_*.py`` and ``_mock_ftplib.py`` are for unit-testing.
If you only *use* ``ftputil`` (i. e. *not* modifying it), you can
delete these files.

References
----------

- Mackinnon T, Freeman S, Craig P. 2000. `Endo-Testing:
  Unit Testing with Mock Objects`_.

- Postel J, Reynolds J. 1985. `RFC 959 - File Transfer Protocol (FTP)`_.

- Van Rossum G, Drake Jr FL. 2003. `Python Library Reference`_.

.. _`Endo-Testing: Unit Testing with Mock Objects`:
   http://www.connextra.com/aboutUs/mockobjects.pdf
.. _`RFC 959 - File Transfer Protocol (FTP)`: http://www.ietf.org/rfc/rfc959.txt
.. _`Python Library Reference`: http://www.python.org/doc/current/lib/lib.html

Author
------

``ftputil`` is written by Stefan Schwarzer <sschwarzer@sschwarzer.net>.

Feedback is appreciated. :-)

