Daniel Berger
Hi all,
Park Heesob and I came up with a custom implementation of
IO.readlines using scattered I/O that I thought would be fun to share. I
think I'm seeing a 2x performance increase, but page caching makes
it difficult to tell. Also, profiling shows that the main bottleneck is
the call to 'split' at the end, so you can remove that last bit of
logic if you want to see the speed without it.
What do folks think? Are you seeing a performance increase? You
probably won't see any noticeable difference unless the file is
larger than 25 MB or so, btw.
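For anyone who wants to compare numbers, one way to make the page-cache effect less confusing is to run each reader several times and compare the fastest run, which reflects warm-cache performance. This is only a rough sketch using Ruby's standard Benchmark module. `time_reader` is a hypothetical helper, the temp file is a stand-in for a real large file, and the lambda table is where a Windows user would plug in Win32::NIO.readlines:

```ruby
require 'benchmark'
require 'tempfile'

# Hypothetical helper: runs the block +runs+ times and returns the
# fastest wall-clock time in seconds. The first run warms the page
# cache, so the minimum approximates warm-cache performance.
def time_reader(runs = 3)
  (1..runs).map { Benchmark.realtime { yield } }.min
end

# Build a throwaway test file; raise line_count for a larger file.
line_count = 10_000
file = Tempfile.new('nio_bench')
line_count.times { |i| file.puts("line #{i}") }
file.close

readers = {
  'IO.readlines' => lambda { |path| IO.readlines(path) }
  # On Windows, add: 'Win32::NIO' => lambda { |path| Win32::NIO.readlines(path) }
}

results = {}
readers.each do |name, reader|
  results[name] = time_reader { reader.call(file.path) }
  puts format('%-14s %.4fs', name, results[name])
end

file.unlink
```

Timing with and without the final 'split' step in the same way should show how much of the total it accounts for.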
Link to ReadFileScatter() function definition:
http://msdn2.microsoft.com/en-us/library/aa365469.aspx
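ReadFileScatter() takes an array of 64-bit segment elements, one per page-sized buffer, ending with a NULL element. The sketch below shows only how that array is laid out and does no I/O. `segment_array` is a hypothetical helper, and the base address and 4096-byte page size are made-up values for illustration:

```ruby
# Build the packed segment array for a buffer that starts at
# +base_address+ and spans enough pages to hold +file_size+ bytes.
# Hypothetical helper for illustration; it performs no I/O.
def segment_array(base_address, file_size, page_size = 4096)
  # Integer ceiling division avoids Float rounding on very large files.
  num_pages = (file_size + page_size - 1) / page_size

  # One address per page, each exactly one page after the previous one.
  addresses = (0...num_pages).map { |i| base_address + page_size * i }

  # Each element is a 64-bit value ('Q'); a zero element terminates the list.
  (addresses + [0]).pack('Q*')
end

packed = segment_array(0x10000, 10_000)
puts packed.bytesize               # 32: three pages plus the terminator, 8 bytes each
puts packed.unpack('Q*').inspect   # [65536, 69632, 73728, 0]
```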
# nio.rb - requires the latest windows-pr gem
require 'windows/file'
require 'windows/handle'
require 'windows/error'
require 'windows/memory'
require 'windows/nio'
require 'windows/synchronize'
require 'windows/system_info'
require 'windows/msvcrt/io'
require 'windows/msvcrt/buffer'
require 'win32/event'
module Win32
  class NIO
    include Windows::File
    include Windows::Handle
    include Windows::Error
    include Windows::Synchronize
    include Windows::MSVCRT::IO
    include Windows::MSVCRT::Buffer
    include Windows::SystemInfo
    include Windows::Memory
    include Windows::NIO

    extend Windows::File
    extend Windows::Handle
    extend Windows::Error
    extend Windows::Synchronize
    extend Windows::MSVCRT::IO
    extend Windows::MSVCRT::Buffer
    extend Windows::SystemInfo
    extend Windows::Memory
    extend Windows::NIO

    class Error < StandardError; end

    # Reads the entire file specified by +file+ as individual lines, and
    # returns those lines in an array. Lines are separated by +sep+.
    #--
    # The semantics are the same as the MRI version but the implementation
    # is drastically different. We use a scattered IO read, which is about
    # as fast as the MRI version for small files, but much faster for very
    # large files.
    #
    def self.readlines(file, sep = $/)
      handle = CreateFile(
        file,
        GENERIC_READ,
        FILE_SHARE_READ,
        nil,
        OPEN_EXISTING,
        FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING,
        nil
      )

      if handle == INVALID_HANDLE_VALUE
        raise Error, get_last_error
      end

      # MRI's File.size cannot be trusted for files larger than 2gb, so
      # ask Windows for the size and use it for the page count as well.
      file_size = [0].pack('Q')
      GetFileSizeEx(handle, file_size)
      file_size = file_size.unpack('Q')[0]

      # Get your system's page size, probably 4k
      sysbuf = 0.chr * 40
      GetSystemInfo(sysbuf)
      page_size = sysbuf[4,4].unpack('L')[0]

      # Round up to a whole number of pages
      num_pages = (file_size + page_size - 1) / page_size

      base_address = VirtualAlloc(
        nil,
        page_size * num_pages,
        MEM_COMMIT,
        PAGE_READWRITE
      )

      # One buffer address per page
      buf_list = []

      for i in 0...num_pages
        buf_list.push(base_address + page_size * i)
      end

      # 64-bit segment elements, terminated by a null element
      seg_array = buf_list.pack('Q*') + 0.chr * 8

      # OVERLAPPED struct; the event handle lives at offset 16
      olap = 0.chr * 20
      olap[16,4] = [CreateEvent(nil, 1, 0, nil)].pack('L')

      bool = ReadFileScatter(
        handle,
        seg_array,
        page_size * num_pages,
        nil,
        olap
      )

      unless bool
        raise Error, get_last_error
      end

      WaitForSingleObject(olap[16,4].unpack('L')[0], INFINITE)

      unless CloseHandle(handle)
        raise Error, get_last_error
      end

      # Copy only the real file contents, not the padding in the last page
      buffer = 0.chr * file_size
      memcpy(buffer, buf_list[0], file_size)

      VirtualFree(base_address, 0, MEM_RELEASE)

      # TODO: Fix line ending issue (?)
      unless sep.nil?
        if sep.empty?
          buffer = buffer.split("\r\n\r\n")
        else
          buffer = buffer.split(sep)
        end
      end

      buffer
    end
  end
end
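On the line ending TODO: String#split throws the separator away, whereas IO.readlines keeps it at the end of each line. One possible approach, offered only as a sketch, is to use String#each_line instead. `split_like_readlines` is a hypothetical helper and not part of the code above. It passes an empty separator straight through, so paragraph mode follows MRI's own rules for String#each_line:

```ruby
# Hypothetical helper: splits +buffer+ the way IO.readlines would,
# keeping the separator at the end of each element.
def split_like_readlines(buffer, sep = $/)
  # With a nil separator, IO.readlines returns the whole file as one element.
  return [buffer] if sep.nil?

  buffer.each_line(sep).to_a
end

text = "one\ntwo\nthree"

p text.split("\n")                   # ["one", "two", "three"]
p split_like_readlines(text, "\n")   # ["one\n", "two\n", "three"]
```

Whether this is faster or slower than 'split' on a large buffer would need measuring.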