do a sed / awk filter with python tools (at least as fast)

Mathieu Prevot · Jul 7, 2008

Hi,

I use the following filter in a Bourne shell script:

sed '/watch?v=/! d;s/.*v=//;s/\(.\{11\}\).*/\1/' \
| sort | uniq | awk 'ORS=" "{print $1}'

It gives me every 11-character string that follows the "watch?v="
pattern. I would like to do this in Python, on the stdout of a
subprocess.Popen instance, using Python tools rather than sed, awk, etc.
How can I do this? Can I expect it to be as fast?

Thanks,
Mathieu

Peter Otten · Jul 7, 2008

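You should either do it in Python, e.g.:

def process(lines):
    # Split each line at the first "/watch?v="; c is whatever follows it.
    candidates = (line.rstrip().partition("/watch?v=") for line in lines)
    # Keep the first 11 characters of every tail that is long enough.
    matches = (c[:11] for a, b, c in candidates if len(c) >= 11)
    # set() plus sorted() replaces "sort | uniq"; join() replaces the awk step.
    print(" ".join(sorted(set(matches))))

if __name__ == "__main__":
    import sys
    process(sys.stdin)

or invoke your shell script via subprocess.Popen(). Invoking a Python script
via subprocess doesn't make sense IMHO.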
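
For the subprocess.Popen case in the question, a minimal sketch of the Python route could look like the one below. It assumes the child command writes text to stdout, one candidate per line. The names extract_ids and ids_from_command, and the sample child command, are hypothetical and chosen only for illustration. No speed comparison with the sed pipeline is implied.

```python
import subprocess
import sys

MARKER = "/watch?v="
ID_LENGTH = 11


def extract_ids(lines):
    """Return the sorted, de-duplicated 11-character strings that follow MARKER."""
    found = set()
    for line in lines:
        # partition() returns (head, sep, tail); tail is "" if MARKER is absent.
        tail = line.rstrip().partition(MARKER)[2]
        if len(tail) >= ID_LENGTH:
            found.add(tail[:ID_LENGTH])
    return sorted(found)


def ids_from_command(argv):
    """Run argv and filter its stdout while iterating over the pipe."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, universal_newlines=True)
    try:
        ids = extract_ids(proc.stdout)
    finally:
        proc.stdout.close()
        proc.wait()
    return ids


if __name__ == "__main__":
    # Hypothetical stand-in for the real command: a child process that
    # prints three sample lines, two of which carry the same id.
    sample = (
        "print('<a href=\"/watch?v=abcdefghijk&feature=x\">one</a>');"
        "print('no match on this line');"
        "print('<a href=\"/watch?v=abcdefghijk\">duplicate</a>')"
    )
    print(" ".join(ids_from_command([sys.executable, "-c", sample])))
```

Passing proc.stdout straight to extract_ids means the output is consumed line by line instead of being read into memory first. Replace the sample argv with whatever command actually produces the lines.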

Peter

Mathieu Prevot · Jul 7, 2008

Thanks.
Mathieu
