Understanding other people's code

L O'Shea

Hi all,
I've been asked to take over a project from someone else and to extend the functionality of this. The project is written in Python which I haven't had any real experience with (although I do really like it) so I've spent the last week or two settling in, trying to get my head around Python and the way in which this code works.

The problem is the code was clearly written by someone who is exceptionally good and seems to inherit everything from everywhere else. It all seems very dynamic, nothing is written statically except in some configuration files.
Basically the problem is I am new to the language and this was clearly written by someone who at the moment is far better at it than I am!

I'm starting to get pretty worried about my lack of overall progress and so I wondered if anyone out there had some tips and techniques for understanding other people's code. There has to be 10/15 different scripts with at least 10 functions in each file I would say.

Literally any idea will help, pen and paper, printing off all the code and doing some sort of highlighting session - anything! I keep reading bits of code and thinking "well where the hell has that been defined and what does it mean" to find it was inherited from 3 modules up the chain. I really need to get a handle on how exactly all this slots together! Any techniques, tricks or methodologies that people find useful would be much appreciated.
 
Chris Angelico

I'm starting to get pretty worried about my lack of overall progress and so I wondered if anyone out there had some tips and techniques for understanding other people's code. There has to be 10/15 different scripts with at least 10 functions in each file I would say.


The first thing I'd recommend is getting yourself familiar with the
language itself, and (to some extent) the standard library. Then
you'll know that any unrecognized wotzit must have come from your own
project, so you'll be able to search up its definition. Then I'd
tackle source files one at a time, and look at the very beginning. If
the original coder was at all courteous, each file will start off with
a block of 'import' statements, looking something like this:

import re
import itertools

Or possibly like this:

from itertools import cycle, islice

Or, if you're unlucky, like this:

from tkinter import *

The first form is easy. You'll find references to "re.sub" or
"itertools.takewhile"; the second form at least names what it's
grabbing (so you'll find "cycle" or "islice" in the code), and the
third just dumps a whole lot of stuff into your namespace.
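
When a bare name shows up and the import block doesn't make its origin obvious, the interpreter can be asked directly. A minimal sketch, using only the standard library; the helper name where_defined is made up for illustration and is not part of any project:

```python
import inspect
import itertools
from itertools import cycle, islice


def where_defined(obj):
    """Return the name of the module that defines obj, or None if unknown."""
    module = inspect.getmodule(obj)
    return module.__name__ if module is not None else None


# A name used through its module is self-explanatory at the call site...
print(where_defined(itertools.takewhile))
# ...while a bare name such as "cycle" needs a lookup to trace it back.
print(where_defined(cycle))
print(where_defined(islice))
```

The same lookup works for names that arrived through a star import, which is where it tends to be most useful.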

Actually, if the programmer's been really nice, there'll be a block
comment or a docstring at the top of the file, which might even be
up-to-date and accurate. But I'm guessing you already know to look for
that. :)

The other thing I would STRONGLY recommend: Keep the interactive
interpreter handy. Any line of code you don't understand, paste it
into the interpreter. Chances are it won't wipe out your entire hard
drive :) But seriously, there is much to gain and nothing to lose by
keeping IDLE or the command-line interpreter handy.

ChrisA
 
Peter Otten

L said:
Hi all,
I've been asked to take over a project from someone else and to extend the
functionality of this. The project is written in Python which I haven't
had any real experience with (although I do really like it) so I've spent
the last week or two settling in, trying to get my head around Python and
the way in which this code works.

The problem is the code was clearly written by someone who is
exceptionally good and seems to inherit everything from everywhere else.
It all seems very dynamic, nothing is written statically except in some
configuration files. Basically the problem is I am new to the language and
this was clearly written by someone who at the moment is far better at it
than I am!

I'm starting to get pretty worried about my lack of overall progress and
so I wondered if anyone out there had some tips and techniques for
understanding other people's code. There has to be 10/15 different scripts
with at least 10 functions in each file I would say.

That sounds like the project is well-organised and not too big. If you take
one day per module you're there in two weeks...

Literally any idea will help, pen and paper, printing off all the code and
doing some sort of highlighting session - anything! I keep reading bits of
code and thinking "well where the hell has that been defined and what does
it mean" to find it was inherited from 3 modules up the chain.

As you put it here, the project is too complex. So now we have a mixed
message. Of course your impression may stem from lack of experience...

I really need to get a handle on how exactly all this slots together! Any
techniques, tricks or methodologies that people find useful would be much
appreciated.

Is there any documentation? Read that. Do the functions have docstrings?
Import the modules (start with the main entry point) in the interactive
interpreter and use help():
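
A minimal sketch of that, using the standard-library json module as a stand-in for the project's own entry-point module (an assumption, since the project itself isn't shown in this thread):

```python
import json  # stand-in for the project's main module (an assumption)
import pydoc

# In an interactive session you would simply type: help(json.dumps)
# pydoc.render_doc returns the same text as a string, which is handy in scripts.
text = pydoc.render_doc(json.dumps, renderer=pydoc.plaintext)
print(text.splitlines()[0])

# dir() lists the names a module exposes; filter out the private ones.
public_names = [name for name in dir(json) if not name.startswith("_")]
print(public_names)
```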

Or use

$ python -m pydoc -g

and hit "open browser" (the project directory has to be in PYTHONPATH).

See if you can talk to the author/previous maintainer. He may be willing to
give you the big picture or hints for the parts where he did "clever"
things.

Try to improve your Python by writing unrelated scripts.

Make little changes to the project (add print statements, invoke functions
from your own driver script, make a local variable global for further
inspection in the interactive interpreter using dir() -- whatever you can
think of).

The latter should of course be done in a test installation rather than the
production environment.

Rely on version control once you start making modifications for real -- but
I think you knew that already...
 
Eric S. Johansson

Literally any idea will help, pen and paper, printing off all the code
and doing some sort of highlighting session - anything! I keep reading
bits of code and thinking "well where the hell has that been defined and
what does it mean" to find it was inherited from 3 modules up the chain.
I really need to get a handle on how exactly all this slots together!
Any techniques, tricks or methodologies that people find useful would be
much appreciated.

Glad to hear you're having a WTF moment (what's that function). My suggestion
would be index cards, each containing notes on a class. Truly understand
what each parent class is and which methods are to be overridden. Then look
at one child and understand how it works. Work your way breadth first down the
inheritance tree.
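
The index-card walk can also be done by the interpreter. A rough sketch under stated assumptions: the classes Base, Worker and FastWorker are invented for illustration, and the helper names are not from any library. It visits parents before children and reports which class in the chain actually defines a given method:

```python
from collections import deque


class Base:  # hypothetical classes, for illustration only
    def run(self): ...
    def report(self): ...


class Worker(Base):
    def run(self): ...


class FastWorker(Worker):
    pass


def walk_breadth_first(root):
    """Yield (depth, class) pairs, parents before children."""
    queue = deque([(0, root)])
    while queue:
        depth, cls = queue.popleft()
        yield depth, cls
        for child in cls.__subclasses__():
            queue.append((depth + 1, child))


def defining_class(cls, name):
    """Return the first class in cls's MRO that defines `name` itself."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    return None


# One "index card" per class: its name and the methods it defines itself.
for depth, cls in walk_breadth_first(Base):
    own = sorted(n for n in vars(cls) if not n.startswith("_"))
    print("  " * depth + cls.__name__, own)

print(defining_class(FastWorker, "report").__name__)
```

Note that __subclasses__() only knows about classes whose modules have already been imported, and with multiple inheritance a class can be visited more than once.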
 
Terry Reedy

Hi all, I've been asked to take over a project from someone else and
to extend the functionality of this. The project is written in Python
which I haven't had any real experience with (although I do really
like it) so I've spent the last week or two settling in, trying to
get my head around Python and the way in which this code works.

If the functions are not documented in prose, is there a test suite that
you can dive into?
 
CM

Basically the problem is I am new to the language and this was clearly
written by someone who at the moment is far better at it than I am!

Sure, as a beginner, yes, but also it sounds like the programmer didn't document it much at all, and that doesn't help you. I bet s/he didn't always use very human readable names for objects/methods/classes, either, eh?

I'm starting to get pretty worried about my lack of overall progress and so I
wondered if anyone out there had some tips and techniques for understanding
other people's code. There has to be 10/15 different scripts with at least 10
functions in each file I would say.

Unless the programmer was really super spaghetti coding, I would think that there would be some method to the madness, and that the 10-15 scripts each have some specific kind of purpose. The first thing, I'd think (and having not seen your codebase), would be to sketch out what those scripts do, and familiarize yourself with their names.

Did the coder use this form for importing from modules?

from client_utils import *

If so, that's going to make your life much harder, because all of the names of the module will now be available to the script it was imported into, and yet they are not defined in that script. If s/he had written:

import client_utils

Then at least you would expect lines like this in the script you're looking at:

customer_name = client_utils.GetClient()

Or, if the naming is abstruse, at very least:

cn = client_utils.GC()

It's awful, but at least then you know that GC() is a function within the client_utils.py script and you don't have to go searching for it.

If s/he did use "from module import *", then maybe it'd be worth it to re-do all the imports in the "import module" style, which will break everything, but then force you to go through all the errors and make the names like module.FunctionName() instead of just FunctionName().
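
Before redoing any imports, it may help to find out how many star imports there actually are. A minimal sketch using the standard-library ast module; the function name find_star_imports and the sample source are assumptions made for illustration:

```python
import ast


def find_star_imports(source, filename="<string>"):
    """Return (line number, module name) for each 'from X import *' in source."""
    tree = ast.parse(source, filename=filename)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom):
            if any(alias.name == "*" for alias in node.names):
                hits.append((node.lineno, node.module))
    return hits


# Sample source text; in practice you would read each project file instead.
sample = "import re\nfrom client_utils import *\nfrom itertools import cycle\n"
print(find_star_imports(sample))
```

Because the source is only parsed and never executed, this is safe to run over code you don't understand yet.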

Some of that depends on how big this project is, of course.

Literally any idea will help, pen and paper, printing off all the code and
doing some sort of highlighting session - anything!

What tools are you using to work on this code? Do you have an IDE that has a "browse to" function that allows you to click on a name and see where in the code above it was defined? Or does it have UML or something like that?
 
Azureaus

Thanks for all the suggestions. I'm afraid I didn't get a chance to view them over the weekend, but I will get started with them this morning. I'm currently using Sublime 2 as my text editor and tried to create a UML diagram using Pylint to get a map overview of what's going on. Unfortunately it seemed to map the classes into groups such as StringIO, ThreadPool, GrabOut etc. rather than into the modules they belong to and how they fit together. Maybe this is just my inexperience showing through, or I'm using the program wrong. If anyone has any 'mapping' programs they use to help them visualise program flow, that would be a great bonus.

To be fair to whoever programmed it, most functions are commented and I can't complain about the messiness of the code; it's actually very tidy. (I suppose Python forcing its formatting is another reason it's an easily readable language!) Luckily no blanket import * was used, otherwise I really would be up the creek without a paddle.
Thanks!
 
CM

To be fair to whoever programmed it, most functions are commented and I can't
complain about the messiness of the code; it's actually very tidy. (I suppose
Python forcing its formatting is another reason it's an easily readable
language!) Luckily no blanket import * was used, otherwise I really would be
up the creek without a paddle.

Oh, good! OK, so then here is a simple strategy for getting clear without any fancy tools:

Learn what each module is for. In my own application programming, I don't just put random classes and functions in any old module--the modules have some order to them. So, for example, one module may represent one panel in the application, or all the database stuff, or all the graphing stuff, or some other set of logic, or whatever. One might be the main GUI frame, etc. So I'd get a notebook or file and make notes for yourself about what each module is for, and its name. Even tack a piece of paper above your workstation with the module names and a one-line note about what they do, like:

MODULES:

Map_panel: Displays a panel with the map of the city, with a few buttons.
Dbases: Has all utility functions relevant to the database.
Utils: Has a collection of utility functions to format time, i18n, etc.

Now, there's a cheat sheet. So, if you come across a line in your code like:

pretty_time = Utils.GetPrettyTime(datetime)

You can quickly look at Utils module and read more about that function.
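
A first draft of such a cheat sheet can be generated and then corrected by hand. A rough sketch, assuming the project is a directory of .py files; the function names module_summary and print_cheat_sheet are invented for illustration:

```python
import ast
from pathlib import Path


def module_summary(path):
    """Return the first docstring line of a source file, plus its top-level names."""
    tree = ast.parse(Path(path).read_text(encoding="utf-8"), filename=str(path))
    doc = ast.get_docstring(tree) or "(no module docstring)"
    names = [
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    return doc.splitlines()[0], names


def print_cheat_sheet(project_dir):
    """Print one line per .py file found under project_dir."""
    for path in sorted(Path(project_dir).rglob("*.py")):
        first_line, names = module_summary(path)
        print(f"{path.stem}: {first_line} [{', '.join(names)}]")
```

Calling print_cheat_sheet with the path to the project directory prints the list. Modules without a docstring show up as "(no module docstring)", which is a hint about where to write your own notes first.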

Does this approach make sense to at least clear the cobwebs?
 
asimjalis

Hi all,
I've been asked to take over a project from someone else and to extend the functionality of this. The project is written in Python which I haven't had any real experience with (although I do really like it) so I've spent the last week or two settling in, trying to get my head around Python and the way in which this code works.

Here are some techniques I use in these situations.

1. Do a superficial scan of the code looking at names of classes, functions, variables, and speculate where the modification that I have to make will go. Chances are you don't need to understand the entire system to make your change.

2. Build some hypotheses about how the system works and use print statements or some other debugging technique to run the program and see if you get the result you expect.

3. Insert your code into a separate class and function and see if you can inject a call to your new code from the existing code so that it now works with the new functionality.

If you have to understand the details of some code, one approach is to try to summarize blocks of code with a single comment to wrap your mind around it.
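
For step 2, a small decorator saves sprinkling print statements through a function body. A minimal sketch; trace is a made-up helper, and get_pretty_time is a hypothetical function standing in for something from the project:

```python
import functools


def trace(func):
    """Print each call to func with its arguments and its result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"CALL {func.__name__} args={args!r} kwargs={kwargs!r}")
        result = func(*args, **kwargs)
        print(f"RET  {func.__name__} -> {result!r}")
        return result
    return wrapper


@trace
def get_pretty_time(seconds):  # hypothetical function, not from the project
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m {secs:02d}s"


# Hypothesis: 125 seconds is shown as "2m 05s". Run it and check.
get_pretty_time(125)
```

Removing the @trace line afterwards restores the original behaviour, so the experiment leaves no residue in the code.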

Asim
 
David M Chess

Literally any idea will help, pen and paper, printing off all the code
and doing some sort of highlighting session - anything!
I keep reading bits of code and thinking "well where the hell has that
been defined and what does it mean" to find it was inherited from 3
modules up the chain.
I really need to get a handle on how exactly all this slots together!
Any techniques, tricks or methodologies that people find useful would be
much appreciated.

I'd highly recommend Eclipse with PyDev, unless you have some strong
reason not to. That's what I use, and it saves pretty much all of those
"what's this thing?" problems, as well as lots of others...

DC
 
David Hutto

Any program, to me, is just like speaking English. The class or function
name might not fully mesh with what your cognitive structure assumes it to
be. Read through the imports first, and see the classes and functions come
alive. With experience comes intuition of what it does, and the instances
that can be utilized with it. The term RTFM, and Google, always come to
mind as well.
 
Azureaus

Thank you to everyone who replied constructively; the various suggestions all helped a lot. I'd like to suggest to anyone who reads this in the future and is in a similar situation to do as David Chess suggested and install Eclipse with PyDev. Although I prefer to use Sublime to actually write code, Eclipse turned out to be invaluable in helping me jump around and understand the code (especially how things were passed around) and for debugging things over the last few days. Success!
Cheers everyone.
 
Albert van der Horst

To be fair to whoever programmed it, most functions are commented and I
can't complain about the messiness of the code; it's actually very tidy.
(I suppose Python forcing its formatting is another reason it's an
easily readable language!) Luckily no blanket import * was used,
otherwise I really would be up the creek without a paddle.

If the code is really tidy, it is possible to understand a function
using only the *documentation* (not the code itself) of any function
or data it uses. In oo you also need a context about what an object
is supposed to do. The next step is to prove to yourself that the
function exactly does what is promised in its own documentation.

And you get nowhere without domain knowledge. If you're in railways
and don't know the difference between a "normal" and an "English"
whathaveyou, then you're lost, plain and simple.

Don't treat the original comment as sacred. Any time it is unclear
rewrite it. You may get it wrong, but that's what source control
systems are for. If at all possible, if you add a statement about
a function, try to add a test that proves that statement.
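
A minimal sketch of pairing a documented statement with a test that proves it, using the standard-library unittest module. The function normalise_name is invented for illustration; in practice you would import the real function from the project instead:

```python
import unittest


def normalise_name(name):
    """Return name stripped of surrounding whitespace, in title case."""
    return name.strip().title()


class NormaliseNameTest(unittest.TestCase):
    # Each test pins down one claim made in the docstring above.
    def test_strips_whitespace_and_title_cases(self):
        self.assertEqual(normalise_name("  ada lovelace "), "Ada Lovelace")

    def test_empty_string_stays_empty(self):
        self.assertEqual(normalise_name(""), "")


# Run the tests without exiting the interpreter, so this also works interactively.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormaliseNameTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If a test fails, either the comment you wrote or the code is wrong, and both outcomes tell you something.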

Anytime you come across something that is insufficiently documented,
you document it tentatively yourself, keeping in mind that what
you write down may be wrong. This does no harm! Because you must
keep in mind that everything written by the original programmer
may be wrong, there is actually no difference! Now study the places
where it is called and check whether it makes sense.
This is an infinite process. After one round of improvements you
have to go through everything again. I've got pretty bad stuff under
control this way.

You'll find bugs this way. They may or may not let you fix them.

There is however not much point in "working in" by reading through
the code. Time is probably better spent by running and studying, maybe
creating test cases.

Trying to understand any substantial code body in detail is
a waste of time.
For example: I once had to change the call code of the gcc compiler
to be able to use a 68000 assembler library (regarding which registers
contain what data passed to the function). There is absolutely no
point in studying the gcc compiler. You must have an overview,
then zoom in on the relevant part. In the end maybe only a couple
of lines need to change. A couple of days, and a pretty hairy problem
was solved. (The assembler library was totally undocumented.
Nobody even tried to study it.)

There is an indication that the original programmer made it all very
easy and maybe you go about it not quite the right way.
If you have a tower of abstractions, then you must *not* go down
all the way to find out "exactly" what happens. You must pick
a level in the middle and understand it in terms of usage, then
understand what is on top of that in terms of that usage.
That is how good programmers build their programs. Once there is
a certain level they don't think about what's underneath, but
concentrate on how to use it. If it is done really well, each
source module can be understood on its own.

All this is of course general, not just for Python.
 
Chris Angelico

If the code is really tidy, it is possible to understand a function
using only the *documentation* (not the code itself) of any function
or data it uses.

I'd broaden that slightly to the function's signature, which consists
of the declaration line and any associated comments (which in Python
should be in the docstring). The docstring kinda violates this
concept, but what I generally try to explain is that you should be
able to understand a function without reading any of the indented
content.
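
To read a function that way in practice, the inspect module can print the signature and docstring while skipping the body. A minimal sketch; describe is a made-up helper and get_client is a hypothetical project function:

```python
import inspect


def describe(func):
    """Return the signature and docstring summary of func, not its body."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or "(no docstring)"
    return f"{func.__name__}{signature}: {doc.splitlines()[0]}"


def get_client(client_id, *, active_only=True):  # hypothetical project function
    """Look up a client record by its numeric id."""
    raise NotImplementedError  # the body is deliberately irrelevant here


print(describe(get_client))
```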

ChrisA
 
