At a loss how to sort this file

grocery_stocker

If I have a file in the following format:

lar ttyp2 216.106.179.129 Fri Nov 17 17:12 - 17:14
(00:01)
lar ttypa 216.106.179.129 Fri Nov 17 15:53 - 15:55
(00:01)
lar ttypp 216.106.179.129 Thu Nov 16 17:11 - 17:21
(00:09)
lar ttypk 216.106.179.129 Thu Nov 16 14:20 - 14:21
(00:01)
lar ttypn 216.106.179.129 Thu Nov 16 13:23 - 13:37
(00:13)
irongeek ttypi 216.106.179.129 Wed Nov 15 17:27 - 17:32
(00:04)
sabre ttyp5 216.106.179.129 Wed Nov 15 13:59 - 14:03
(00:04)
lar ttyp5 216.106.179.129 Wed Nov 15 13:57 - 13:59
(00:01)
sabre ttyp5 216.106.179.129 Wed Nov 15 13:28 - 13:57
(00:28)
sabre ttypc 216.106.179.129 Wed Nov 15 12:10 - 12:10
(00:00)

lar ttypd 71.57.146.22 Fri Nov 17 07:27 - 07:43
(00:16)
irongeek ttyp2 71.57.146.22 Thu Nov 16 07:49 - 07:56
(00:07)
sabre ttypg 71.57.146.22 Sat Nov 11 15:56 - 16:09
(00:12)

The stuff before the space is already sorted. After the space break,
the data is sorted again. I want
all the data in this file to be sorted in descending order. I tried
looking at bash sort, but I couldn't
find anything.

I thought maybe I could also do this:
-read in each line
-have the date as the first value, and the reference to the list(?) as
the second value.
-Make the first value the hash key and the reference to the list the
value of the hash key
-sort the hash keys and write back

However, for some reason, this seems a bit overcomplicated.

Ideas?
 
Jürgen Exner

grocery_stocker wrote:

Dear Grocery
If I have a file in the following format:

lar ttyp2 216.106.179.129 Fri Nov 17 17:12 - 17:14
(00:01)

I presume this (00:01) actually belongs at the end of the preceding line
and it's just your newsreader that wrapped that line?

lar ttypa 216.106.179.129 Fri Nov 17 15:53 - 15:55
(00:01)
lar ttypp 216.106.179.129 Thu Nov 16 17:11 - 17:21
(00:09)
lar ttypk 216.106.179.129 Thu Nov 16 14:20 - 14:21
(00:01)
lar ttypn 216.106.179.129 Thu Nov 16 13:23 - 13:37
(00:13)
irongeek ttypi 216.106.179.129 Wed Nov 15 17:27 - 17:32
(00:04)
sabre ttyp5 216.106.179.129 Wed Nov 15 13:59 - 14:03
(00:04)
lar ttyp5 216.106.179.129 Wed Nov 15 13:57 - 13:59
(00:01)
sabre ttyp5 216.106.179.129 Wed Nov 15 13:28 - 13:57
(00:28)
sabre ttypc 216.106.179.129 Wed Nov 15 12:10 - 12:10
(00:00)

lar ttypd 71.57.146.22 Fri Nov 17 07:27 - 07:43
(00:16)
irongeek ttyp2 71.57.146.22 Thu Nov 16 07:49 - 07:56
(00:07)
sabre ttypg 71.57.146.22 Sat Nov 11 15:56 - 16:09
(00:12)

The stuff before the space is already sorted.

You got me confused. Do you mean the list is already sorted by 'lar',
'sabre', and 'irongeek'?
Where? I mean, I can clearly see that each of those values appears several
times and totally intermixed with the others. To me that is not sorted at
all.
If you mean something else by "before the space is already sorted", then
please explain.

After the space break,
the data is sorted again.

Are you talking about the substrings that start with 'ttyp', either with or
without the rest of each line? Again, I do not recognize any sorting order
in those values, neither overall nor within each group of values for 'lar',
'sabre', and 'irongeek' in case you were talking about primary and secondary
sort keys.

I want
all the data in this file to be sorted in descending order.

Do you mean all the data (like in each column individually; I cannot think
of an application where this would make sense) or all the lines?

I tried
looking at bash sort, but I couldn't
find anything.

Well, I may not be up to speed, but I've never heard of any algorithm
called bash sort, either.
What's wrong with just using the built-in Perl sort()?
All you have to do is to define your custom comparison function and then you
can sort by whatever you like. There is even an FAQ "How do I sort an array
by (anything)?". I'd like to help, but from your description and data sample
it is impossible to figure out what your desired end result should look
like.

BTW: some people will probably suggest a Schwartzian Transform. That is
an interesting optimization technique if and when the data set becomes large
and performance becomes an issue. For a first, simple "make it work" solution
there is no need for it.
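
For reference, a Schwartzian Transform computes the sort key once per line instead of once per comparison. The sketch below is only an illustration under assumptions: the key (month, day, login time, newest first) is a guess at the wanted order, `sort_key` is a made-up helper, the wrapped "(00:01)" is taken to sit at the end of each line, and all records are assumed to fall within one year because the sample shows none.

```perl
use strict;
use warnings;

# Month abbreviations as printed in the sample, mapped to 1..12.
my %MONTH;
@MONTH{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 1 .. 12;

# Hypothetical key extractor: "Nov 17 17:12 -" becomes "11171712".
# Assumption: every record is from the same year.
sub sort_key {
    my ($line) = @_;
    return '' unless $line =~ /([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d\d):(\d\d)\s+-/;
    return sprintf '%02d%02d%02d%02d', $MONTH{$1} // 0, $2, $3, $4;
}

my @lines = (
    'sabre ttypc 216.106.179.129 Wed Nov 15 12:10 - 12:10 (00:00)',
    'lar ttypd 71.57.146.22 Fri Nov 17 07:27 - 07:43 (00:16)',
    'lar ttyp2 216.106.179.129 Fri Nov 17 17:12 - 17:14 (00:01)',
    'sabre ttypg 71.57.146.22 Sat Nov 11 15:56 - 16:09 (00:12)',
);

# Read bottom to top: pair each line with its key, sort the pairs on the
# key in descending order, then throw the keys away again.
my @sorted = map  { $_->[1] }
             sort { $b->[0] cmp $a->[0] }
             map  { [ sort_key($_), $_ ] }
             @lines;

print "$_\n" for @sorted;
```

The fixed-width key lets a plain string comparison (cmp) do the work; swapping $a and $b in the sort block would give ascending order instead.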

jue
 
grocery_stocker

Jürgen Exner said:
grocery_stocker wrote:

Dear Grocery


I presume this (00:01) actually belongs at the end of the preceding line
and it's just your newsreader that wrapped that line?

Yes, the (00:01) belongs at the end of the line. My newsreader likes to
wrap things at the most inane times.

You got me confused. Do you mean the list is already sorted by 'lar',
'sabre', and 'irongeek'?
Where? I mean, I can clearly see that each of those values appears several
times and totally intermixed with the others. To me that is not sorted at
all.
If you mean something else by "before the space is already sorted", then
please explain.

If you look at the 4th column (which has dates like Fri Nov 17), the
dates start to go backwards. That is how the list is ordered. If there
are multiple entries for the same date (like Thu Nov 16), the data is
then ordered by time (in the 5th column).

Are you talking about the substrings that start with 'ttyp', either with or
without the rest of each line? Again, I do not recognize any sorting order
in those values, neither overall nor within each group of values for 'lar',
'sabre', and 'irongeek' in case you were talking about primary and secondary
sort keys.


Do you mean all the data (like in each column individually; I cannot think
of an application where this would make sense) or all the lines?


Well, I may not be up to speed, but I've never heard of any algorithm
called bash sort, either.

My wording was sloppy on that line.

What's wrong with just using the built-in Perl sort()?
All you have to do is to define your custom comparison function and then you
can sort by whatever you like. There is even an FAQ "How do I sort an array
by (anything)?". I'd like to help, but from your description and data sample
it is impossible to figure out what your desired end result should look
like.

I forgot what else. I need to go to the store and get some alcohol.
However, I refuse to get nylons for my girlfriend.

BTW: some people will probably suggest a Schwartzian Transform. That is
an interesting optimization technique if and when the data set becomes large
and performance becomes an issue. For a first, simple "make it work" solution
there is no need for it.

I was thinking if I had a really large honkin data set, I would use
something like a k-way sort.
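
Since the two blocks in the file are each already in newest-first order, the whole job can also be treated as a merge, which is the core step of a k-way merge sort. What follows is a rough sketch under assumptions, not something run against the real file: `sort_key` and `merge_desc` are made-up names, the "(00:01)" part is taken to sit at the end of each line, and every record is assumed to be from the same year.

```perl
use strict;
use warnings;

# Month abbreviations as printed in the sample, mapped to 1..12.
my %MONTH;
@MONTH{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 1 .. 12;

# Hypothetical key extractor: "Nov 17 17:12 -" becomes "11171712".
sub sort_key {
    my ($line) = @_;
    return '' unless $line =~ /([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d\d):(\d\d)\s+-/;
    return sprintf '%02d%02d%02d%02d', $MONTH{$1} // 0, $2, $3, $4;
}

# Merge any number of array refs, each already sorted newest first,
# into one newest-first list. On equal keys the earlier list wins.
sub merge_desc {
    my @lists = map { [@$_] } @_;    # work on copies, leave the inputs alone
    my @out;
    while (1) {
        my $best;
        for my $list (@lists) {
            next unless @$list;      # this list is used up
            if (!defined($best)
                || sort_key($list->[0]) gt sort_key($best->[0])) {
                $best = $list;
            }
        }
        last unless defined $best;   # every list is empty
        push @out, shift @$best;
    }
    return @out;
}

my @block_a = (
    'lar ttyp2 216.106.179.129 Fri Nov 17 17:12 - 17:14 (00:01)',
    'lar ttypp 216.106.179.129 Thu Nov 16 17:11 - 17:21 (00:09)',
    'sabre ttypc 216.106.179.129 Wed Nov 15 12:10 - 12:10 (00:00)',
);
my @block_b = (
    'lar ttypd 71.57.146.22 Fri Nov 17 07:27 - 07:43 (00:16)',
    'irongeek ttyp2 71.57.146.22 Thu Nov 16 07:49 - 07:56 (00:07)',
    'sabre ttypg 71.57.146.22 Sat Nov 11 15:56 - 16:09 (00:12)',
);

my @merged = merge_desc(\@block_a, \@block_b);
print "$_\n" for @merged;
```

For a file this small a plain sort() is simpler; a merge only pays off when the pre-sorted blocks are large. The linear scan over the list heads is fine for a handful of blocks, while many blocks would call for a heap.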
 
Mumia W. (reading news)

grocery_stocker wrote:

If I have a file in the following format:

lar ttyp2 216.106.179.129 Fri Nov 17 17:12 - 17:14
(00:01)
lar ttypa 216.106.179.129 Fri Nov 17 15:53 - 15:55
(00:01)
lar ttypp 216.106.179.129 Thu Nov 16 17:11 - 17:21
(00:09)
lar ttypk 216.106.179.129 Thu Nov 16 14:20 - 14:21
(00:01)
lar ttypn 216.106.179.129 Thu Nov 16 13:23 - 13:37
(00:13)
irongeek ttypi 216.106.179.129 Wed Nov 15 17:27 - 17:32
(00:04)
sabre ttyp5 216.106.179.129 Wed Nov 15 13:59 - 14:03
(00:04)
lar ttyp5 216.106.179.129 Wed Nov 15 13:57 - 13:59
(00:01)
sabre ttyp5 216.106.179.129 Wed Nov 15 13:28 - 13:57
(00:28)
sabre ttypc 216.106.179.129 Wed Nov 15 12:10 - 12:10
(00:00)

lar ttypd 71.57.146.22 Fri Nov 17 07:27 - 07:43
(00:16)
irongeek ttyp2 71.57.146.22 Thu Nov 16 07:49 - 07:56
(00:07)
sabre ttypg 71.57.146.22 Sat Nov 11 15:56 - 16:09
(00:12)

The stuff before the space is already sorted. After the space break,
the data is sorted again. I want
all the data in this file to be sorted in descending order. I tried
looking at bash sort, but I couldn't
find anything.

I thought maybe I could also do this:
-read in each line
-have the date as the first value, and the reference to the list(?) as
the second value.
-Make the first value the hash key and the reference to the list the
value of the hash key
-sort the hash keys and write back

However, for some reason, this seems a bit overcomplicated.

Ideas?

That would be overly complicated. Just create a sorting (comparison)
function that extracts the dates from each string, converts them and
compares them.

Get the Date::Parse module from CPAN (or ActiveState or Debian) and use
that to easily convert those date strings to time values.
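
A comparison function along those lines might look like the sketch below. To keep it runnable with core Perl only, it uses a small month table in place of Date::Parse, so treat it as a stand-in rather than the module's API. `login_key` and `by_login_desc` are made-up names, the "(00:01)" is assumed to sit at the end of each line, and all records are assumed to be from the same year since the sample shows none.

```perl
use strict;
use warnings;

# Month abbreviations as printed in the sample, mapped to 1..12.
my %MONTH;
@MONTH{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 1 .. 12;

# Hypothetical helper: pull month, day and login time out of a line and
# pack them into a fixed-width string, e.g. "Nov 17 15:53 -" -> "11171553".
sub login_key {
    my ($line) = @_;
    return '' unless $line =~ /([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d\d):(\d\d)\s+-/;
    return sprintf '%02d%02d%02d%02d', $MONTH{$1} // 0, $2, $3, $4;
}

# Comparison function for sort(): $b before $a gives descending order.
sub by_login_desc { login_key($b) cmp login_key($a) }

my @lines = (
    'lar ttypa 216.106.179.129 Fri Nov 17 15:53 - 15:55 (00:01)',
    'sabre ttypc 216.106.179.129 Wed Nov 15 12:10 - 12:10 (00:00)',
    'lar ttypd 71.57.146.22 Fri Nov 17 07:27 - 07:43 (00:16)',
    'irongeek ttyp2 71.57.146.22 Thu Nov 16 07:49 - 07:56 (00:07)',
    'sabre ttypg 71.57.146.22 Sat Nov 11 15:56 - 16:09 (00:12)',
);

my @sorted = sort by_login_desc @lines;
print "$_\n" for @sorted;
```

When reading the real file, the blank separator line can be dropped first with something like grep { /\S/ } before sorting.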

Show some code, and we can help you better.
 
