Need Help with Split

A. Mcbomb

I have a list of records that need to be split between the address and
the city. Here is some of the data:

</a>-16 Bonner StreetHartford,-CT
</a>-450 Main StreetHartford,-CT
</a>-812 Farmington AvenueWest Hartford,-CT
</a>-25 Forest Street No. 18Stamford,-CT
</a>-25 Forest Street No. 6AStamford,-CT
</a>-1450 Main StreetBridgeport,-CT

As you can see, the address butts directly up against the city name, and
the only thing that is consistent is that the city always starts with a
capital letter (but can be more than one word).

If I could find a way to split where a lower case letter butts directly
against an Upper case letter, that might be a good start.

example: StreetHartford => if I could split between the lower case t and
the upper case H that are directly next to each other?

thanks

atomic
 
w_a_x_man

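DATA.each_line do |line|
  s = line.chomp
  # Search the reversed string: skip lazily past the comma, then stop at the
  # first capital that is followed there (preceded, in the original) by a
  # letter or digit. That capital is the first letter of the city.
  city = s.reverse[/\A.*?,.*?[[:upper:]](?=[\d[:alpha:]])/].reverse
  street = s[0, s.size - city.size]
  puts street
  puts city
end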



__END__
</a>-16 Bonner StreetHartford,-CT
</a>-450 Main StreetHartford,-CT
</a>-812 Farmington AvenueWest Hartford,-CT
</a>-25 Forest Street No. 18Stamford,-CT
</a>-25 Forest Street No. 6AStamford,-CT
</a>-1450 Main StreetBridgeport,-CT
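
Reversing the string lets a lazy match stop at the last capital letter that sits directly after a letter or digit. A minimal sketch of a similar idea without the reversal, relying on greedy backtracking instead, might look like the following. The name split_record is hypothetical and not from the thread, and the sketch assumes every record has a comma after the city, as in the sample data.

```ruby
# Hypothetical helper (not from the thread): greedy variant of the reversed match.
def split_record(record)
  # The greedy .* backtracks from the right, so the second group starts at the
  # last capital that follows a letter or digit and still has a comma after it.
  m = record.match(/\A(.*[[:alnum:]])([[:upper:]][^,]*,.*)\z/)
  m ? [m[1], m[2]] : nil
end

p split_record("</a>-812 Farmington AvenueWest Hartford,-CT")
```

Like the reversed version, this keeps the leading "</a>-" in the street part, and it returns nil rather than raising when a record does not fit the pattern.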
 
brabuhr

I have a list of records that need to be split between the address and
the city. Here is some of the data:

If I could find a way to split where a lower case letter butts directly
against an Upper case letter, that might be a good start.

/tmp$ cat i.rb
s = <<END
</a>-16 Bonner StreetHartford,-CT
</a>-450 Main StreetHartford,-CT
</a>-812 Farmington AvenueWest Hartford,-CT
</a>-25 Forest Street No. 18Stamford,-CT
</a>-25 Forest Street No. 6AStamford,-CT
</a>-1450 Main StreetBridgeport,-CT
END

require 'pp'

pp s.scan(/-(.*?[a-zA-Z\d])([A-Z][a-z].*)/)

/tmp$ ruby i.rb
[["16 Bonner Street", "Hartford,-CT"],
["450 Main Street", "Hartford,-CT"],
["812 Farmington Avenue", "West Hartford,-CT"],
["25 Forest Street No. 18", "Stamford,-CT"],
["25 Forest Street No. 6A", "Stamford,-CT"],
["1450 Main Street", "Bridgeport,-CT"]]
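
The original idea of splitting exactly where a lowercase letter meets an uppercase one can also be sketched with zero-width assertions, which match a position rather than any characters. This is only an illustration under assumptions: split_address is a hypothetical name, the leading "</a>-" is treated as debris to drop, and the city is assumed to begin with a capital followed by a lowercase letter.

```ruby
# Hypothetical helper (not from the thread): split at the first position where
# a letter or digit is immediately followed by a capitalized word.
def split_address(record)
  body = record.chomp.sub(%r{\A</a>-}, "")  # drop the leading "</a>-" debris
  # (?<=...) and (?=...) consume nothing, so no characters are lost;
  # the limit of 2 stops after the first boundary.
  body.split(/(?<=[[:alnum:]])(?=[[:upper:]][[:lower:]])/, 2)
end

p split_address("</a>-25 Forest Street No. 6AStamford,-CT")
```

A street name with an internal capital (for example one starting with "Mc") would be split too early by this pattern, so it only suits data shaped like the sample above.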
 
