One week ago, "JoePie91" wrote a blog post challenging the Python
community and the state of Python documentation, titled:
"The Python documentation is bad, and you should feel bad".
http://joepie91.wordpress.com/2013/02/19/the-python-documentation-is-bad-and-you-should-feel-bad/
It is valuable to contrast and compare the PHP and Python docs:
tl;dr? tb
I haven't used PHP or its documentation so I can't compare it
to Python's. I have used Python's documentation and can say
I agree with many of the criticisms made by JoePie91.
One of the problems with "fixing" the Python reference docs
(by which I mean primarily the Language and Library References)
is that there is no common agreement about what a "good"
reference should be. In the Python development community
that controls the overall structure and contents of the
Python documentation, there seems to be a strong minimalist
streak. It often seems like the documentation is the
product of a contest to find the minimum number of words to
describe something and still be able to defend it as correct.
Any documentation must be written with a target audience in
mind and IMO the audience for the Python reference docs should
be programmers familiar with one or two procedural or OO
languages at an intermediate level. (Obviously different
sections of documentation can modify this. Later documentation
will assume knowledge of basic concepts like Python objects,
argument passing and assignment semantics and so forth that
were presented earlier, and documentation for specialized
problem domain modules, e.g. an SMTP module, would assume some
knowledge of email, SMTP and networking.)
As JoePie91 pointed out, reference material should describe
its subject matter completely and accurately. Once documentation
has achieved that minimum bar of viability, its quality is
determined by how effectively it transfers that information
to the reader. I distinguish reference from tutorial material
in that the former is optimized for looking up information
and presenting it concisely, the latter for presenting (quite
possibly the same) information in a linear fashion with no
forward references and presenting it verbosely and experientially.
I distinguish a language reference from a language standard
in that the audience for the latter are language implementors
rather than users. I would describe a reference document as
a big cheat-sheet for those already competent with Python.
A frequent failing of the Python docs is just plain poor
writing. When explaining something, start with a description
of what the something is, does, etc, in a form understandable
by the target audience. Is there anyone who can understand
what the very useful collections.defaultdict does without
multiple rereadings? According to its docs, it "returns a
new dictionary-like object." That is underspecified -- many
things return dictionary-like objects. It continues "it
overrides one method and adds one writable instance variable."
OK, but WTF does it *do*?! It then goes on to describe its
use which one has to understand without an overarching context
and then reason backwards to eventually figure out that it is
a dict that provides for user-specified behavior when accessed
with a key that doesn't exist. [*1]
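To make that one-sentence description concrete, here is a minimal
sketch of the behavior. It is my own illustration, not text from the
docs, and the names word_groups and counts are made up for the
example.

```python
from collections import defaultdict

# The argument is the "default factory". It is called with no
# arguments when a missing key is looked up with [], and the
# result is stored under that key and returned.
word_groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    # No KeyError the first time "a" or "b" is seen: an empty
    # list is created on demand.
    word_groups[word[0]].append(word)

counts = defaultdict(int)   # int() returns 0
for ch in "hello":
    counts[ch] += 1

# Only [] lookups trigger the factory; .get() and "in" do not.
missing = counts.get("z")
```

Leading with the "what happens on a missing key" statement and
following with a short example like this is the order of
presentation I would expect from a reference entry.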
Important quality enablers are good tables of contents,
indexes, glossaries, cross references and examples.
Examples should be used to illustrate a textual description
and never used as a substitute for textual descriptions.
Cross references are particularly important in tying together
related material that is found in disparate doc locations.
For example, information on Python's "+" operator is found in:
Lang: 2.5. Operators
Lang: 3.3.7. Emulating numeric types
Lang: 6.5. Unary arithmetic and bitwise operations
Lang: 6.6. Binary arithmetic operations
Lang: 6.15. Summary (mislabeled, actually operator precedence)
Lib: 4.4. Numeric Types
Lib: 4.6.1. Common Sequence Operations
Lib: 10.3. operator
and probably other places I did not think to look.
The index is not much help in tying any of these together:
"add" -> Lib: 2.5
"+" -> Lib: 4.4
"plus" -> Lang: 6.5
There are also more obscure uses that should be findable, such
as in float hex strings (4.4.3. Additional Methods on Float).
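As a rough illustration of why these scattered sections all bear on
the same symbol, the sketch below uses "+" in each of the roles
listed above. The Money class is hypothetical and exists only to
show operator emulation.

```python
import operator


class Money:
    """Hypothetical class, here only to show __add__."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Called for: Money + Money
        return Money(self.cents + other.cents)


unary = +5                                # unary arithmetic operation
binary = 2 + 3                            # binary arithmetic operation
joined = [1, 2] + [3]                     # sequence concatenation
func = operator.add(2, 3)                 # function form in the operator module
total = (Money(150) + Money(250)).cents   # emulating numeric types
hexval = float.fromhex("0x1.8p+1")        # "+" as an exponent sign: 1.5 * 2**1
```

A reader who lands on any one of these uses should be able to reach
the others through cross references.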
Cross references to similar information can help cover for
failings in the index -- if you can find some similar function
or concept, there is (or should be) a good chance of a cross-
reference to what you really wanted.
Good documentation will anticipate the questions a reader
will have and answer them.
----
Rebuttals to common responses to criticism of Python docs:
Python docs are already good
* Criticisms of Python's docs pop up on the Python
mailing list and blogs with regularity.
* Many people confuse "usable", "I've learned to use
despite", "look impressive", etc with "good".
Google / blogs / stackoverflow / reddit, etc can provide better answers
* Even were it true, it is an argument that Python
doesn't need good documentation, rather than an argument
that Python's docs are good.
* They don't provide answers for infrequent questions.
* Answers can be conflicting, wrong, or out of date with
no way to correct.
* Even today, not everyone has access to the internet all the
time.
Try it in an interactive Python session
* This is useful practical advice but experiments do
not substitute for documentation because they tell you
only what Python version 3.3 on Redhat Linux 4.2 does
on a machine with 2GB of memory 3 days after the full
moon.
Documentation is the ultimate authority for what it
is *supposed* to do.
Read the source code
* Oh please! The purpose of documentation is to alleviate
the need to read source code.
* Those most in need of documentation are those without
the Python knowledge to read the source code.
* Some source code is very complex and difficult to understand
even for experts.
* The behavior of source code is often obscured by details
not directly related to the info being looked for: error
handling, options for alternate behavior, performance
optimizations, etc.
Don't complain, submit doc fixes.
* The people with motivation to fix the docs are often not
qualified to and the people qualified to have no motivation
because they already know it. (They may not even recognize
there is a problem.)
* There is a group of core developers who define (by accepting
or rejecting patches) the nature of the changes that can be
made. If the view of this group favors changes that continue
the status-quo, significant improvements via this route are
not possible.
* Small fixes can require orders of magnitude more effort to
submit and defend than the fix took to write.
Tutorials are the place to explain basics.
* Tutorials are great for some people but not everyone.
* They are not optimized for looking up and answering specific
questions.
* Their linear style builds on preceding info requiring
start-to-end reading.
* Since finding info in them is harder, there is an expectation
the reader will permanently commit the information to memory
as encountered. The best learning style for many is to
memorize the most frequently needed info by looking things up as
needed.
* They often introduce programming or general programming
language concepts already known to the reader from prior
experience.
* They are often bloated with exercises/examples that are
not needed by readers with a higher level of experience.
* They require an unreasonable time/effort commitment for
those without a preexisting commitment to using Python.
* They are an alternate format of, not a replacement for,
information that should be in reference manuals.
The high standards demanded are impossible
* There are other reference manuals that do achieve a high
standard so it is not impossible, for example Beazley's
Python Essential Reference [*2]. There are also examples
for other languages.
* But, it may be impractical for the Python community
to achieve such results due to various Python intra-
community factors.
Python docs are excellent compared to most free software docs
* The "most free software docs" bar is too low to be a good
metric. Most such docs vary between "sucks" and "non-existent".
Please compare Python docs to the best available docs (which is
why comparison to commercial books like Beazley's Essential
Reference is valid).
----
[*1] I am not an advanced Python user nor a good technical
writer so my defaultdict description may well be poor. That
does not mean that a better description than currently exists
can't or shouldn't be provided.
[*2] I am not holding up Beazley's book as a gold standard;
it has a number of its own problems. But it does provide
an example of reference material with better organization
and clarity than the python.org docs.