Development tools and practices for Pythonistas

Shawn Milochik

Depends on the project, but I'd start with git at the time I created the
first file in my project. If you're in the habit of committing then you
can easily roll back missteps. If you're in the habit of making branches
you can experiment without breaking the currently-working code.
 
Roy Smith

CM said:
While we're on the topic, when should a lone developer bother to start
using a VCS?

No need to use VCS at the very beginning of a project. You can easily
wait until you've written 10 or 20 lines of code :)

Should I bother to try a VCS?

Absolutely. Even if you don't need it for a small one-person project,
it's a good habit to get into.

If you haven't used any, my recommendation would be hg. Partly because
it's powerful, and partly because it's relatively easy to use. The
other popular choice these days would be git. Hg and git are pretty
similar, and between the two of them probably cover 90% of current
usage. Unless you've got a specific reason to try something else (e.g.
a project you're interested in uses something else), those seem like the
only two reasonable choices.
 
Hans Georg Schaathun

: While we're on the topic, when should a lone developer bother to start
: using a VCS? At what point in the complexity of a project (say a hobby
: project, but a somewhat seriousish one, around ~5-9k LOC) is the added
: complexity of bringing a VCS into it worth it?

You are asking the wrong question. It depends relatively little on the
number of lines, and much more on what you are likely to do with it.

One thing is certain. If you are ever going to want to use a VCS,
you might as well have started yesterday. A VCS is no extra hassle
to use, only an added hassle to get started with.

Personally I use the VCS as
1) My backup system; naturally doing incremental backups only.
2) A means to synchronise multiple boxes (not just my laptop and
my desktop, sometimes a Linux and a Mac system, and dedicated
number crunchers too), and merge changes made out of sync.
3) The possibility to make one branch to run a suite of jobs
which may take a week or more, and still continue development
independently on the main branch.

As you can see, the number of lines is irrelevant. Points 1 and 2 mean
that everything goes under version control, not only the code the VCS
is meant for. And point 3 depends, of course, on what kind of code it is.

I branch rather little. Programming is not my day job -- nor my
main hobby, and I simply have not got the capacity to keep track
of multiple branches. Even at 12-15kloc I have little use for
the VCS's intended purposes. If you take your development
project more seriously, you may do more of that within the first
500 loc...

But then, using VCS is not sufficient. You need to /think/ VC.

In other words, taking up a VCS when the system is large enough
to require it is too late. You need time to learn the thinking.
 
Martin Schöön

You guys are very code focused, which is natural given where we are.

Having absorbed what I have seen here, looked a little at Mercurial,
and read a little on the web about Fossil and Bazaar, I am starting to
think there is great merit in all this VCS stuff for other types of
projects.

At work my projects contain very little coding (some Python, some
matlab/scilab perhaps) but a fair amount of CAD/CAE, written
reports, presentations (OpenOffice and that other Office),
spreadsheets, etc. It is a mixture of ASCII files and various
proprietary formats, most of which are stored in binary form.
Some of the CAE work generates pretty big files stored
in dynamically created subdirectories.

Our computer environment is mostly based on Vista and SUSE Linux,
and I still have a Sun Solaris machine in my office, but probably
not for long.

Given this type of scenario, what VCS tools should I consider?
Still the Mercurial/Git/Bazaar/Fossil crowd? Is any one of those
ruled out, and if so, why?

/Martin
 
Tim Chase

For non-text blobs, it takes a little bit of insight to get the
most out of them. OpenDocument files (Open/Libre Office
documents) are zipped files containing text/XML which can be
diff'ed with more meaning. Usually there are custom filters for
git[1], Mercurial[2] and Bazaar[3] which will unpack the zipped
file contents before committing and give you more sensible diffs.
Likewise, for images (gif/jpg/tiff/raw/etc), there are
particular image-diff programs which make it easier to tell what
happened, as the textual diff of binary files is pretty useless.
However some images (such as .svg files) are XML/text inside,
and diff pretty nicely without extra effort.
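
To make the "unpack before diffing" idea concrete, here is a minimal Python sketch. It is not one of the filters linked below, only an illustration under some assumptions: the names `odf_text_lines` and `build_sample_package` are made up for this example, and the sample package is a toy zip that mimics an OpenDocument file by holding a `content.xml` member.

```python
import difflib
import io
import xml.etree.ElementTree as ET
import zipfile


def odf_text_lines(source):
    """Return the text found in the content.xml member of a zipped document.

    `source` may be a path or a file-like object.
    """
    with zipfile.ZipFile(source) as package:
        xml_bytes = package.read("content.xml")
    root = ET.fromstring(xml_bytes)
    # itertext() yields the text of every element in document order.
    return [chunk.strip() for chunk in root.itertext() if chunk.strip()]


def build_sample_package(paragraphs):
    """Hypothetical helper: build a tiny in-memory zip with a content.xml."""
    body = "".join(f"<p>{text}</p>" for text in paragraphs)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", f"<doc>{body}</doc>")
    buffer.seek(0)
    return buffer


# Compare two "revisions" as text instead of as opaque zip blobs.
old = odf_text_lines(build_sample_package(["Hello", "World"]))
new = odf_text_lines(build_sample_package(["Hello", "VCS"]))
diff = list(difflib.unified_diff(old, new, lineterm=""))
```

A real filter would hand the extracted text to the VCS's own diff machinery; the point here is only that the comparison happens on the unpacked XML text.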

I can't speak to CAD/CAE, but it would have to be addressed on a
per-format basis in your given VCS. That said, you *can* store
the binary blobs in each; it's just not as useful without
meaningful comparisons.

-tkc

[1] http://kerneltrap.org/mailarchive/git/2008/9/15/3305014
[2] http://mercurial.selenic.com/wiki/HandlingOpenDocumentFiles
[3] http://doc.bazaar.canonical.com/plugins/en/oodiff.html
 
Shawn Milochik

For what it's worth, the Python core developers have selected Mercurial.
I personally use git and love it. Most open-source people seem to use
one or the other of the two. They're pretty similar in most ways.

Look at the big two sites for open-source repositories -- github and
bitbucket. One's git, the other Mercurial. I don't think you can go
wrong picking either one.
 
Dietmar Schwertberger

On 01.05.2011 02:47, Shawn Milochik wrote:
Look at the big two sites for open-source repositories -- github and
bitbucket. One's git, the other Mercurial. I don't think you can go
wrong picking either one.

Can any of those be used from Python as a library, i.e. something like
import Hg
r = Hg.open(path)

When I had a look at Mercurial, which is implemented in Python,
it was implemented in a way that did not let me do that. It was built
as a rather monolithic program which could be used only through
os.system(...).
With a good API, I could easily have integrated it into my development
flow. I have a codebase which is shared between different projects and
there are many small changes on many different PCs.
In theory a distributed VCS is good at supporting that, but in practice
I went back to my lightweight synchronization scripts and file storage
again. With the API, I could have had the best of both worlds.
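
For comparison, the command-line route can be wrapped so that it at least looks like a library call. The following is only a sketch under assumptions: `run_vcs` and `VcsError` are hypothetical names, not part of Mercurial or git, and the Python interpreter stands in for the VCS executable so the example runs on a machine without either tool installed.

```python
import subprocess
import sys


class VcsError(RuntimeError):
    """Raised when the wrapped command exits with a non-zero status."""


def run_vcs(program, *args, cwd=None):
    """Run `program` with `args` in `cwd` and return its standard output."""
    result = subprocess.run(
        [program, *args],
        cwd=cwd,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise VcsError(result.stderr.strip())
    return result.stdout


# Stand-in for a real call such as run_vcs("hg", "status", cwd=repo_path).
output = run_vcs(sys.executable, "-c", "print('clean')")
```

Unlike os.system(...), this captures the output as a string and turns a failing exit status into an exception, which is about as far as a CLI wrapper can go towards the kind of API asked for above.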


Regards,

Dietmar
 
Martin Schöön

All very useful information. Thank you for that Tim.

/Martin
 
David Boddie

Dietmar Schwertberger wrote:
Can any of those be used from Python as a library, i.e. something like
import Hg
r = Hg.open(path)

When I had a look at Mercurial, which is implemented in Python,
it was implemented in a way that I could not do that. It was implemented
as a rather monolithic program which could be used from os.system(...)
only.

After noting the warnings it contains, see the following page for a
description of the Python API for Mercurial:

http://mercurial.selenic.com/wiki/MercurialApi

Git also has a Python API, which is fairly reasonable to use, though a bit
different to the Mercurial one:

http://www.samba.org/~jelmer/dulwich/

I've used both with some success.

David
 
Paul Rubin

Look at the big two sites for open-source repositories -- github and
bitbucket.

Note that there are three: Launchpad (backed by Bazaar) is the other
"big site" for free-software project hosting.

There is also patch-tag.com (using darcs) though it is smaller.
 
rusi

While we're on the topic, when should a lone developer bother to start
using a VCS? At what point in the complexity of a project (say a hobby
project, but a somewhat seriousish one, around ~5-9k LOC) is the added
complexity of bringing a VCS into it worth it?

When you hit your first bug?
Ok seriously, when you hit your first serious bug maybe?

I am a bit surprised that no one has mentioned rcs so far.
It is not an option if you are not on a *ix system, and not something
I am specifically recommending.
[I grew up on rcs 15 years ago but have not used it much of late]
You may want to look at rcs if you are in the space where you want:
-- something better than tarballs
-- no pretensions beyond single-user, single-machine, (almost)
single-file usage (i.e. small scale)
-- something that integrates nicely with emacs
 
rusi

I might have agreed ten years ago; compared to CVS or Subversion, RCS is
simpler to use and set up and had lower workflow overhead.

But today, Bazaar or Mercurial fill that role just as well: quick simple
set up, good tool support (yes, even in Emacs using VC mode), and easy
to use for easy things.

I really don't see any benefit to using RCS for even a lone hacker
tracking files; Bazaar or Mercurial fill that role just as well, and
continue to work well as your needs grow.

In a word: single files.
If you have a directory with a number of short unrelated scripts
(python, shell etc.), the philosophy that a VCS manages projects,
not files, is a nuisance, not a help.

That is why things like zit http://git.oblomov.eu/zit have
arisen: the need to go back from bzr/git/hg to (something like) rcs.
 
Algis Kabaila

But today, Bazaar or Mercurial fill that role just as well:
quick simple set up, good tool support (yes, even in Emacs
using VC mode), and easy to use for easy things.

Actually, Bazaar is more convenient than rcs for a single user,
as the repository can be the working directory (with a "hidden"
.bzr directory that stores diffs).

I had to use git, too, because some projects use git for their
version control (viz PySide, Nokia's tool to replace PyQt). IMHO
there is not much to pick between git and Bazaar, and hg is also
rather similar. The remaining doubts are between
Distributed Version Control and the more traditional Subversion,
which is also quite nice, even for a single user.

OldAl.
 
rusi

Actually, Bazaar is more convenient than rcs for a single user,
as the repository can be the working directory (with a "hidden"
.bzr directory that stores diffs).  

I don't exactly understand...
Is it that you want it specifically hidden?
Otherwise rcs will look inside an RCS directory just as bzr will
with .bzr, git will with .git, and so on.
 
Algis Kabaila

I don't exactly understand...
Is it that you want it specifically hidden?
Otherwise rcs will look inside an RCS directory just as bzr
will with .bzr, git will with .git, and so on.

Sorry for not being clear - "ls" will not show directories that
start with "." - in that sense these directories are "hidden".
They are not really really hidden, as "ls -l" will show them.
They simply are not in the way and keep the progressive versions
of the program (in form of diffs).

Does that make better sense?
 
jacek2v

Sorry for not being clear - "ls" will not show directories that
start with "." - in that sense these directories are "hidden".  
They are not really really hidden, as "ls  -l" will show them.  
They simply are not in the way and keep the progressive versions
of the program (in form of diffs).

"ls -l" will not show directories that start with ".".
"ls -a" will.
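
The same distinction can be checked from Python. This is just a small sketch: `visible_entries` and `all_entries` are names invented for the example, and the dot-prefix filter imitates what a plain "ls" does rather than calling it.

```python
import os
import tempfile


def visible_entries(path):
    """Names a plain `ls` would list: everything not starting with a dot."""
    return sorted(name for name in os.listdir(path) if not name.startswith("."))


def all_entries(path):
    """Names `ls -a` would list, minus the `.` and `..` pseudo-entries."""
    return sorted(os.listdir(path))


with tempfile.TemporaryDirectory() as workdir:
    os.mkdir(os.path.join(workdir, ".bzr"))  # dot-prefixed metadata directory
    os.mkdir(os.path.join(workdir, "RCS"))   # rcs-style directory, no dot
    shown = visible_entries(workdir)
    everything = all_entries(workdir)
```

Here `shown` contains only the RCS directory, while `everything` also contains .bzr, which is the sense in which the dot directory is "hidden".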

Regards
Jacek
 
Dietmar Schwertberger

On 02.05.2011 01:33, David Boddie wrote:
After noting the warnings it contains, see the following page for a
description of the Python API for Mercurial:

http://mercurial.selenic.com/wiki/MercurialApi

Ah, yes, no need to use os.system(), but all in all not much different
from doing so (and using the CLI is recommended over using the
API).
I had a look at this page in 2009 and came to the conclusion that it's
not a good idea to use Mercurial as a library. It seems that this still
holds true.

Git also has a Python API, which is fairly reasonable to use, though a bit
different to the Mercurial one:

http://www.samba.org/~jelmer/dulwich/

That looks more promising to me.
I think I will try this and Bazaar to find which fits my needs better.


Regards,

Dietmar
 
Anssi Saari

rusi said:
I am a bit surprised that no one has mentioned rcs so far
Not an option if you are not on a *ix system and not something I am
specifically recommending.

I actually use rcs on Windows. It needs a little setup, but works great,
from Emacs VC-mode too.
 
